Posted by: Sochinda Tith
Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. Solr powers the search and navigation features of many of the world’s largest internet sites.
Solr is a standalone enterprise search server with a REST-like API. You put documents in it (called “indexing”) via XML, JSON, CSV or binary over HTTP. You query it via HTTP GET and receive XML, JSON, CSV or binary results.
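As a minimal sketch of that HTTP round trip, the Python below builds (but does not send) one indexing request and one query URL. The host, port, and core name `articles` are assumptions for illustration, and under Tomcat the port is often 8080 instead. The `/update` and `/select` paths and the `q`, `rows`, and `wt` parameters are Solr's standard handlers and query parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumed host, port and core name; adjust to your own installation.
SOLR_BASE = "http://localhost:8983/solr/articles"


def build_index_request(docs):
    """Build an HTTP POST that sends a list of documents to the JSON update handler."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        SOLR_BASE + "/update?commit=true",  # commit=true makes the documents searchable right away
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_query_url(text, rows=10):
    """Build an HTTP GET URL for the select handler, asking for JSON results."""
    params = urlencode({"q": text, "rows": rows, "wt": "json"})
    return SOLR_BASE + "/select?" + params


if __name__ == "__main__":
    req = build_index_request([{"id": "1", "title": "Hello Solr"}])
    print(req.get_method(), req.full_url)
    print(build_query_url("title:hello"))
```

To talk to a running server, pass the request or URL to `urllib.request.urlopen` and decode the JSON response.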
- Advanced Full-Text Search Capabilities
- Optimized for High Volume Web Traffic
- Standards Based Open Interfaces – XML, JSON and HTTP
- Comprehensive HTML Administration Interfaces
- Server statistics exposed over JMX for monitoring
- Linearly scalable, auto index replication, auto failover and recovery
- Near Real-time indexing
- Flexible and Adaptable with XML configuration
- Extensible Plugin Architecture
As the image above shows, my example system for preparing index data is split into four parts:
- Client Side: the part used by the end user, such as a website, web application, or mobile app.
- Web Service: a RESTful API web service. It is the most important part of the whole system because it is the middleware that mediates between the end user and the indexing server and DBMS.
- Indexing Server: Apache Solr, which stores all documents as index files on disk so that data can be searched quickly with full-text search.
- DBMS or Document: the data store, such as Microsoft SQL Server, MySQL, or any files whose content will be shown to the end user.
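The division of work between these four parts can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `FakeIndexServer` is a hypothetical in-memory stand-in for Solr, an in-memory SQLite database stands in for the DBMS, and the `articles` table and all class and method names are invented for the example. The point is the flow: the client only calls the web service, which asks the index server for matching ids and then loads the rows to display from the database.

```python
import sqlite3


class FakeIndexServer:
    """Hypothetical stand-in for Solr: a tiny inverted index (word -> document ids)."""

    def __init__(self):
        self.postings = {}

    def index(self, doc_id, text):
        for word in text.lower().split():
            self.postings.setdefault(word, set()).add(doc_id)

    def search(self, query):
        # Return the ids of documents that contain every word in the query.
        words = query.lower().split()
        if not words:
            return []
        ids = set.intersection(*(self.postings.get(w, set()) for w in words))
        return sorted(ids)


class SearchService:
    """The web-service layer: the only component the client side talks to."""

    def __init__(self, index, db):
        self.index = index
        self.db = db

    def add_article(self, title, body):
        # Store the full record in the DBMS, then hand its text to the index server.
        cur = self.db.execute(
            "INSERT INTO articles (title, body) VALUES (?, ?)", (title, body)
        )
        self.db.commit()
        self.index.index(cur.lastrowid, title + " " + body)
        return cur.lastrowid

    def search(self, query):
        # Ask the index server for matching ids, then load the rows to show the user.
        ids = self.index.search(query)
        if not ids:
            return []
        marks = ",".join("?" * len(ids))
        rows = self.db.execute(
            f"SELECT id, title FROM articles WHERE id IN ({marks}) ORDER BY id", ids
        ).fetchall()
        return [{"id": r[0], "title": r[1]} for r in rows]


def make_service():
    """Wire the service to an in-memory database (assumed schema, for the demo only)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
    return SearchService(FakeIndexServer(), db)


if __name__ == "__main__":
    service = make_service()
    service.add_article("Intro to Solr", "full text search server")
    service.add_article("MySQL tips", "relational database tuning")
    print(service.search("search server"))
```

In a real deployment the `FakeIndexServer` calls would become HTTP requests to Solr, and the service itself would be exposed over HTTP to the client side.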
How to build
- Apache Tomcat
- Apache Solr
- Solarium (Solr client library for PHP)