What is Solr?
Who uses Solr?
Here are some stack decisions, common use cases and reviews by companies and developers who chose Solr in their tech stack.
I'm planning to create a web application and also a mobile application to provide a very good shopping experience to the end customers. Shortly, my application will be aggregate the product details from difference sources and giving a clear picture to the user that when and where to buy that product with best in Quality and cost.
I have planned to develop this in many milestones for adding N number of features and I have picked my first part to complete the core part (aggregate the product details from different sources).
As per my work experience and knowledge, I have chosen the followings stacks to this mission.
Service: I have planned to use Java as the main business layer language as I have 7+ years of experience on this I believe I can do better work using Java than other languages. In addition, I'm thinking to use the stacks Node.js.
Database and ORM: I'm gonna pick MySQL as DB and Hibernate as ORM since I have a piece of good knowledge and also work experience on this combination.
Search Engine: I need to deal with a large amount of product data and it's in-detailed info to provide enough details to end user at the same time I need to focus on the performance area too. so I have decided to use Solr as a search engine for product search and suggestions. In addition, I'm thinking to replace Solr by Elasticsearch once explored/reviewed enough about Elasticsearch.
Host: As of now, my plan to complete the application with decent features first and deploy it in a free hosting environment like Docker and Heroku and then once it is stable then I have planned to use the AWS products Amazon S3, EC2, Amazon RDS and Amazon Route 53. I'm not sure about Microsoft Azure that what is the specialty in it than Heroku and Amazon EC2 Container Service. Anyhow, I will do explore these once again and pick the best suite one for my requirement once I reached this level.
Build and Repositories: I have decided to choose Apache Maven and Git as these are my favorites and also so popular on respectively build and repositories.
Additional Utilities :) - I would like to choose Codacy for code review as their Startup plan will be very helpful to this application. I'm already experienced with Google CheckStyle and SonarQube even I'm looking something on Codacy.
Happy Coding! Suggestions are welcome! :)
One of the earliest public references to Slack’s stack comes from a Twitter conversation. The Slack account states that “the messaging server is java, the app is php, db is mysql and solr for search,” and that uploaded files are “Stored on S3, but private files require authentication so requests go through the app.”
"Slack provides two strategies for searching: Recent and Relevant. Recent search finds the messages that match all terms and presents them in reverse chronological order. If a user is trying to recall something that just happened, Recent is a useful presentation of the results.
Relevant search relaxes the age constraint and takes into account the Lucene score of the document — how well it matches the query terms (Solr powers search at Slack). Used about 17% of the time, Relevant search performed slightly worse than Recent according to the search quality metrics we measured: the number of clicks per search and the click-through rate of the search results in the top several positions. We recognized that Relevant search could benefit from using the user’s interaction history with channels and other users — their ‘work graph’."
elastic search 와 함께 유명한 검색 엔진 오픈 소스 중 하나 이다. 처음 설정할 것이 많은데, 어플리케이션의 이해가 없다면 잦은 수정이 필요하다. Solr Client 로 제어 할 수 없고 Server 에서 설정해 줘야하는 것들이 있어 서버 설정하는 부분이 중요하다. 서버 설정만 잘 되있다면, Client 쪽 소스는 별게 없다.
중요한 건 형태소 분석기.... Solr
Full text search is provided by a SOLR cluster. This is done on Master/Slave replication with Varnish as a cache. Solr
- Advanced Full-Text Search Capabilities
- Optimized for High Volume Web Traffic
- Standards Based Open Interfaces - XML, JSON and HTTP
- Comprehensive HTML Administration Interfaces
- Server statistics exposed over JMX for monitoring
- Linearly scalable, auto index replication, auto failover and recovery
- Near Real-time indexing
- Flexible and Adaptable with XML configuration
- Extensible Plugin Architecture