Lucene vs Sphinx: What are the differences?

What is Lucene? A high-performance, full-featured text search engine library written entirely in Java. Lucene Core, the flagship sub-project of Apache Lucene, provides Java-based indexing and search technology, as well as spellchecking, hit highlighting, and advanced analysis/tokenization capabilities.

What is Sphinx? An open source full-text search server, designed from the ground up with performance, relevance (aka search quality), and integration simplicity in mind. Sphinx lets you either batch index and search data stored in an SQL database, NoSQL storage, or plain files quickly and easily, or index and search data on the fly, working with Sphinx much as you would with a database server. A variety of text processing features enable fine-tuning Sphinx for your particular application requirements, and a number of relevance functions ensure you can tweak search quality as well.
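
Both descriptions revolve around building an index and then querying it, either in batch or on the fly. As a rough illustration of that idea only, here is a toy inverted index in Python. It is a sketch under simplifying assumptions (whitespace tokenization, AND-only queries, no ranking) and does not use or mirror the Lucene or Sphinx APIs; the class name `TinyIndex` and its methods are hypothetical.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: maps each lowercase token to the ids of documents containing it.

    Hypothetical illustration only; not the Lucene or Sphinx API.
    """

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Incremental ("on the fly") indexing: one document at a time.
        for token in text.lower().split():
            self.postings[token].add(doc_id)

    def add_batch(self, docs):
        # Batch indexing: feed a whole collection, e.g. rows read from a database.
        for doc_id, text in docs.items():
            self.add(doc_id, text)

    def search(self, query):
        # AND query: return ids of documents that contain every query token.
        tokens = query.lower().split()
        if not tokens:
            return set()
        result = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.postings.get(token, set())
        return result


index = TinyIndex()
index.add_batch({1: "full text search server", 2: "text search library in Java"})
index.add(3, "database server")  # added after the batch, without rebuilding
print(sorted(index.search("text search")))
print(sorted(index.search("server")))
```

Real engines add analysis (stemming, stop words), relevance scoring, and on-disk index structures on top of this basic token-to-documents mapping.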

Lucene and Sphinx both belong to the "Search Engines" category of the tech stack.

Some of the features offered by Lucene are:

  • Indexing throughput of over 150GB/hour on modern hardware
  • Small RAM requirements: only 1MB of heap
  • Incremental indexing as fast as batch indexing

On the other hand, Sphinx provides the following key features:

  • Batch indexing and searching of data stored in an SQL database, NoSQL storage, or plain files
  • On-the-fly indexing and searching, working with Sphinx much as with a database server
  • Text processing features and relevance functions for tuning search quality to your application

Grooveshark, Ansible, and Webedia are some of the popular companies that use Sphinx, whereas Lucene is used by Evernote, Twitter, and Slack. Sphinx has broader adoption, being mentioned in 38 company stacks and 13 developer stacks, compared to Lucene, which is listed in 33 company stacks and 9 developer stacks.

Pros of Lucene

  • Fast (1 vote)
  • Small (1 vote)

Pros of Sphinx

  • Fast (16 votes)
  • Simple deployment (9 votes)
  • Open source (6 votes)
  • Lots of extensions (1 vote)

What are some alternatives to Lucene and Sphinx?
Solr
Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. Solr powers the search and navigation features of many of the world's largest internet sites.
Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats, and Logstash together make up the Elastic Stack (sometimes called the ELK Stack).
Apache Solr
It uses the tools you already use to make application building a snap. It is built on the battle-tested Apache ZooKeeper, which makes it easy to scale up and down.
Hadoop
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
MongoDB
MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.