Lucene vs Sphinx: What are the differences?
What is Lucene? A high-performance, full-featured text search engine library written entirely in Java. Lucene Core, the flagship sub-project of Apache Lucene, provides Java-based indexing and search technology, as well as spellchecking, hit highlighting, and advanced analysis/tokenization capabilities.
What is Sphinx? An open source full-text search server, designed from the ground up with performance, relevance (aka search quality), and integration simplicity in mind. Sphinx lets you batch index and search data stored in an SQL database, NoSQL storage, or plain files, or index and search data on the fly, working with Sphinx much as you would with a database server. A variety of text processing features let you fine-tune Sphinx for your particular application requirements, and a number of relevance functions let you tweak search quality as well.
Lucene and Sphinx belong to the "Search Engines" category of the tech stack.
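To make the "indexing and search" terminology above concrete, here is a minimal sketch of an inverted index, the general data structure that full-text search engines are built around. It is not Lucene's or Sphinx's actual API: the class `TinyIndex` and its methods are hypothetical names chosen for illustration, and the tokenizer is deliberately naive.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Lucene or Sphinx.
public class TinyIndex {
    // Maps each term to the sorted set of document ids that contain it.
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    private int nextId = 0;

    // Naive analysis step: lowercase, then split on non-alphanumeric characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Adds one document to the index and returns its id.
    public int add(String text) {
        int id = nextId++;
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
        return id;
    }

    // Returns ids of documents containing every term in the query (AND semantics).
    public List<Integer> search(String query) {
        TreeSet<Integer> result = null;
        for (String term : tokenize(query)) {
            TreeSet<Integer> ids = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(ids);
            } else {
                result.retainAll(ids);
            }
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("Lucene is a text search engine library");
        index.add("Sphinx is a full text search server");
        System.out.println(index.search("text search"));   // prints [0, 1]
        System.out.println(index.search("search server")); // prints [1]
    }
}
```

Real engines add much more on top of this idea, such as relevance ranking, on-disk index formats, and configurable analysis, which is where the two products differ.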
Some of the features offered by Lucene are:
- Indexing throughput of over 150GB/hour on modern hardware
- Small RAM requirements: only 1MB heap
- Incremental indexing as fast as batch indexing
On the other hand, Sphinx provides the following key features:
- Batch indexing and searching of data stored in an SQL database, NoSQL storage, or plain files
- On-the-fly indexing and searching, working with Sphinx much as with a database server
- Text processing features and relevance functions for tuning search quality to your application
Grooveshark, Ansible, and Webedia are some of the popular companies that use Sphinx, whereas Lucene is used by Evernote, Twitter, and Slack. Sphinx has broader adoption, being mentioned in 38 company stacks and 13 developer stacks, compared to Lucene, which is listed in 33 company stacks and 9 developer stacks.