Open Source Search Engine Apache Lucene/Solr Gets Big Update

Today the Apache Software Foundation released a major update to Lucene and Solr, its open source tools for building search engines. Version 4.0 adds several new features aimed at making Solr easier to use, more scalable and more customizable.

Although they’re jointly developed, Lucene and Solr are actually two different things. Lucene is just a Java library, not a standalone search engine. Solr is a search engine server built with Lucene as its core.

Lucene was created in 1999 by Doug Cutting, better known as the creator of Apache Hadoop, and has been used by companies like AOL and LinkedIn to power search features. Solr was created by Yonik Seeley in 2004. It can be used as a custom search engine, or to power search for a separate application.

Scalability was the Solr/Lucene team’s biggest focus for today’s release, according to Search Engine Hub — particularly scaling out as opposed to scaling up.

Web companies like Google and Amazon.com have popularized scaling out in recent years. To oversimplify: when you scale up, you replace your existing servers with more powerful ones when you need more capacity. When you scale out, you add more servers to your environment instead. Scaling out is generally seen to provide more bang for the buck, but clusters of servers can be difficult to set up and manage, and distributing data across a cluster introduces a number of challenges.
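To make the data distribution problem concrete, here is a minimal Python sketch of one common way to spread documents across servers: hash each document ID and take the remainder by the number of servers. The function names `shard_for` and `distribute` are hypothetical, and this is an illustration of the general idea only, not a description of how Solr routes documents.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard for a document by hashing its ID (hypothetical helper)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then wrap to a shard.
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(doc_ids, num_shards):
    """Group document IDs by the shard they would be stored on."""
    shards = {i: [] for i in range(num_shards)}
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_shards)].append(doc_id)
    return shards


if __name__ == "__main__":
    ids = [f"doc-{n}" for n in range(1000)]
    before = distribute(ids, 4)
    # Scaling out from 4 to 5 servers changes where many documents belong.
    moved = sum(shard_for(d, 4) != shard_for(d, 5) for d in ids)
    print({shard: len(docs) for shard, docs in before.items()})
    print(f"{moved} of {len(ids)} documents would move after adding a server")
```

The last few lines hint at one of the challenges: with this naive scheme, adding a single server changes the home of many existing documents, which is part of why cluster management tooling matters.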

To address these issues, version 4.0 introduces a collection of tools designed to make it easier to build and manage Solr server clusters, including a new indexing system designed to deliver near real-time search results in a distributed environment.

These features will help Solr compete with ElasticSearch, an open source, Lucene-based search engine server that has long focused on distributed environments.

Other new features in 4.0 include a new web-based UI, a spell checker and better support for spatial data (which will be useful for anyone doing geographic searches). The new version will also give users more customization and control.

LucidWorks, a company co-founded by Seeley, offers commercial support for Solr.

Apache Solr 4.0 admin screenshot