




Solr Instance
  • multiple instances can run ('multiple solr instances are running')
  • deploy webapp on multiple servers, each of which is an instance
Solr Core
  • each solr instance can have multiple cores
  • also referred to as Solr Index, or simply Core or Index
  • implemented in a database
  • generally, each core runs in isolation, but can configure some communication between cores via CoreContainer
  • 0..m documents live in a core
  • basic unit of information
  • 0..m fields live in a document
  • various types:  text, numeric, date, etc.
  • type tells solr how to interpret the field and how it can be queried
  • type: String stores a word/sentence as an exact string without performing tokenization etc. Commonly useful for storing exact matches, e.g., for faceting.

  • type: Text typically performs tokenization, and secondary processing (such as lower-casing etc.). Useful for all scenarios when we want to match part of a sentence.
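The difference between the two types can be sketched in plain Python. This is an illustration only, not Solr's analyzer chain: the function names `index_as_string`, `index_as_text`, and `matches` are hypothetical, and the regex tokenizer with lower-casing merely stands in for whatever tokenizer and filters a Text type is configured with.

```python
import re

def index_as_string(value):
    """String-like field: the whole value is kept as one exact term."""
    return [value]

def index_as_text(value):
    """Text-like field: split into word tokens, then lower-case each one."""
    return [token.lower() for token in re.findall(r"\w+", value)]

def matches(term, indexed_terms):
    """A query term matches only if it equals one of the indexed terms."""
    return term in indexed_terms

title = "Archery for Beginners"
print(matches("archery", index_as_text(title)))    # part of the sentence matches
print(matches("archery", index_as_string(title)))  # exact-string field does not match
print(matches(title, index_as_string(title)))      # the full exact value matches
```

This is why a String field suits facet values (each value is counted as a whole), while a Text field suits searching for words inside a sentence.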



Indexing Documents

  • index via...
    • Request Handlers & Update Handlers (via HTTP POST/PUT)
      • default:  XML, Binary, JSON, CSV, etc.
      • can define own handlers in config
    • Index Handlers
      • import from databases
    • Solr Cell framework (extracts and indexes rich documents such as PDF and Word files via Apache Tika)
    • custom Java application to ingest data through Solr's Java Client and other apps
  • update processors
    • signature
    • logging
    • indexing
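As a minimal sketch of indexing through an update handler, the snippet below builds the XML `<add>` message that would be sent via HTTP POST. The helper name `build_add_xml` and the sample field names (`id`, `title`) are assumptions for illustration; the fields must exist in your schema.

```python
import xml.etree.ElementTree as ET

def build_add_xml(doc):
    """Serialize one document (a dict of field name -> value) as a Solr <add> message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        # Each field becomes <field name="...">value</field>; text is XML-escaped.
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")

# Hypothetical document; id and title are assumed field names.
payload = build_add_xml({"id": "1", "title": "Archery for Beginners"})
print(payload)
```

The resulting string would typically be POSTed with an XML content type to the core's /update handler and followed by a commit so that the document becomes searchable.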

Request Handlers

Code Block
<!--  solr.SearchHandler  -->
<requestHandler name="standard" class="solr.SearchHandler">               <!-- /select -->
<requestHandler name="search" class="solr.SearchHandler" default="true">
<requestHandler name="permissions" class="solr.SearchHandler" >
<requestHandler name="document" class="solr.SearchHandler" >

<!--  solr.UpdateRequestHandler  -->
<requestHandler name="/update" class="solr.UpdateRequestHandler"  />

<!--  other handlers  -->
<requestHandler name="/replication" class="solr.ReplicationHandler" startup="lazy" />
<requestHandler name="/analysis/field" startup="lazy" class="solr.FieldAnalysisRequestHandler" />
<requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers" />
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">


To see what a requestHandler returns, change the value of qt from /select to the name of the handler in the Solr admin Query page. NOTE: You will need to change the host to your Solr admin host and may need to change the name of the core from development to the name of your core.



  • receive XML, JSON, CSV, or binary (via HTTP GET)
  • request handlers (via HTTP GET)
    • default:  /admin, /select, /spell
    • can define own handlers in config
  • search components
    • query
    • spelling
    • faceting
    • highlighting
    • statistics
    • debug
    • clustering
  • search process  (see Common Query Parameters)

    • qt: selects the Request Handler for a query using /select
      • default: DisMaxRequestHandler
    • defType: selects a Query Parser for the query
      • default: parser configured in the Request Handler
    • q: what to search for, written as field_name:field_value with * as wildcard
      • default: *:*
      • example: q=title:*Archery*
    • fq: filters the query by applying an additional query to the initial query's results; caches the results (same syntax as q)
      • default: *:*
      • example: fq=popularity:[10 TO *]&fq=section:0
    • sort: sort field
      • default: score desc
    • start: an offset into the query results where the returned response should begin
      • default: 0
      • example: start=0
    • rows: the number of rows to be displayed at one time
      • default: 10
      • example: rows=20
    • fl: fields to return in the result
      • default: all
      • example: fl=id,name
    • df: default field to search when the query does not name one
      • default: all indexed fields
      • example: df=description
    • wt: selects a Response Writer for formatting the query response
      • values: xml | json
      • example: wt=json
    • qf: list of fields and the "boosts" to associate with each of them when building DisjunctionMaxQueries (see also SOLR df and qf explanation)
      • default: all indexed fields are required (???)
      • example: qf=title^20 description^10
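The query parameters above combine into a single /select URL. The following is a minimal sketch of assembling one with Python's standard library; the helper name `build_select_url`, the host `localhost:8983`, and the core name `development` are assumptions, so substitute your own.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_select_url(base_url, core, params):
    """Return a /select URL with the given query parameters percent-encoded."""
    # doseq=True lets a parameter such as fq appear more than once.
    return f"{base_url}/{core}/select?{urlencode(params, doseq=True)}"

url = build_select_url(
    "http://localhost:8983/solr",   # assumed host
    "development",                  # assumed core name
    {
        "q": "title:*Archery*",
        "fq": ["popularity:[10 TO *]", "section:0"],
        "sort": "score desc",
        "start": 0,
        "rows": 20,
        "fl": "id,name",
        "wt": "json",
    },
)
print(url)
```

Encoding the parameters this way avoids problems with spaces and brackets in range queries such as `[10 TO *]`, which are not valid in a raw URL.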



  • High Level
    • Advanced Full-Text Search
    • Optimized for High Volume Web Traffic
    • Standards Based Open Interfaces - XML, JSON, HTTP
    • Comprehensive HTML Admin Interfaces
    • Service statistics exposed over JMX for monitoring
    • Near Real-time indexing and Adaptable with XML configuration
    • Linearly scalable, auto index replication, auto failover and recovery, extensible plugin architecture
  • Specific Features
    • faceting
    • highlighting
    • spell checking
    • query re-ranking
    • transforming
    • suggesters
    • more like this
    • pagination
    • grouping & clustering
    • spatial search
    • components
    • real time (get & update)
    • labs



  • schema.xml
    • field types
    • etc.
  • solrconfig.xml
    • register Request Handlers for querying the index
    • register Update Handlers for indexing documents
    • register Event Handlers for searcher events (e.g. queries to execute to warm new searchers)
    • activate version-dependent features in Lucene
    • lib directives indicate where Solr can find JAR files for extensions
    • Index management settings
    • Enable JMX instrumentation of Solr MBeans
    • Cache-management settings
  • solr.xml


Defined in schema.xml

Hydra Types: 

defined by <types><fieldType>...</></>


NOTE: The letter is the suffix code that sets the type for Hydra dynamic fields.  Ex. in name_tsi, the t means that name has type="text".

Hydra Field Def Parameters:

defined by <fields><dynamicField>...</></>


NOTE: The letter is the suffix code that sets that parameter to true for Hydra dynamic fields.  Ex. name_tsi means that name has stored="true" and indexed="true".
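The suffix convention can be sketched as a small decoder. This is an illustration under loud assumptions: only the three letters evidenced by the name_tsi example (t, s, i) are mapped, the "first letter is the type, remaining letters are flags" layout is inferred from that one example, and the names `TYPE_CODES`, `FLAG_CODES`, and `decode_dynamic_field` are hypothetical. The full Hydra tables define more codes than are shown here.

```python
# Assumed, partial mappings: only the letters seen in the name_tsi example.
TYPE_CODES = {"t": "text"}
FLAG_CODES = {"s": "stored", "i": "indexed"}

def decode_dynamic_field(field_name):
    """Split a name such as 'name_tsi' into base name, field type, and enabled flags."""
    base, _, suffix = field_name.rpartition("_")
    if not base or not suffix:
        raise ValueError(f"no dynamic-field suffix in {field_name!r}")
    field_type = TYPE_CODES[suffix[0]]                  # assumed: first letter is the type
    flags = {FLAG_CODES[c]: True for c in suffix[1:]}   # assumed: remaining letters are flags
    return base, field_type, flags

print(decode_dynamic_field("name_tsi"))
```

A suffix letter that is not in these partial mappings raises a `KeyError` rather than being guessed at.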

Examples for values of stored and indexed:


stored="true" indexed="false"

  • destination URL
  • file system path
  • time stamp
  • icon image
  • sort string - pair a name field that is tokenized text (stored="false" indexed="true") with this field, which holds the exact string used for sorting



indexed="false" stored="false"

  • Use this when you want to ignore fields. For example, the following will ignore unknown fields that don't match a defined field rather than throwing an error by default.

    Code Block
    <fieldType name="ignored" stored="false" indexed="false" multiValued="true" class="solr.StrField" />
    <dynamicField name="*" type="ignored" />


Solr Cloud Features

  • horizontal scaling (for sharding and replication)
  • elastic scaling
  • high availability
  • distributed indexing
  • distributed searching
  • central configuration for entire cluster
  • automatic load balancing
  • automatic failover for queries
  • zookeeper integration for coordination & configurations




Return all results with search term = "book"


Code Block: Query for search term



NOTE: Examples use stream.body to show how to do this through a URL.  Usually done via HTTP POST.


  • In the Solr UI, select the core to affect from the selection box on the left side menu
  • select Documents on the left side menu
  • set Document Type = XML
  • set Document(s) text area to `<delete><query>*:*</query></delete>`
  • leave Commit Within and Overwrite as defaults
  • Submit
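The same delete-all request can be expressed as a URL using stream.body, as the NOTE mentions. This is a hedged sketch: the helper name `build_delete_all_url`, the host, and the core name `development` are assumptions, and depending on the Solr version and configuration stream.body may be disabled, in which case the same XML body is sent via HTTP POST instead.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_delete_all_url(base_url, core):
    """Return an /update URL that carries a delete-all message in stream.body."""
    params = {
        "stream.body": "<delete><query>*:*</query></delete>",
        "commit": "true",  # commit so the deletion becomes visible
    }
    return f"{base_url}/{core}/update?{urlencode(params)}"

# Assumed host and core name; this removes every document in the core.
url = build_delete_all_url("http://localhost:8983/solr", "development")
print(url)
```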


  • More Query Examples

Search for a specific field, category, containing a search term, book

Code Block: Query for search term in a specific field

Search for price between 0 and 400, inclusive

Code Block: Search for range of values
http://localhost:8983/solr/development/select?q=price:[0 TO 400]

Limit search results to return only fields id, name, and price.

Code Block: Query for search term & limit fields returned

Return facets for a specific field, category, with counts for each value of category based on the search results.

Code Block: Query for search term & limit fields returned & include facets
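A minimal sketch of such a request, assuming the host `localhost:8983`, the core name `development`, and that the fields id, name, price, and category exist in the schema. It searches for book, limits the returned fields with fl, and asks for facet counts on category using the facet and facet.field parameters.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

params = {
    "q": "book",                 # search term
    "fl": "id,name,price",       # return only these fields
    "facet": "true",             # turn faceting on
    "facet.field": "category",   # count results per value of category
}
# Assumed host and core name.
url = "http://localhost:8983/solr/development/select?" + urlencode(params)
print(url)
```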


Code Block
<lst name="facet_counts">
  <lst name="facet_queries" />
  <lst name="facet_fields">
    <lst name="category">
      <int name="book">10</int>
      <int name="video">2</int>
      <int name="audio">2</int>
    </lst>
  </lst>
  <lst name="facet_dates"/>
</lst>
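A client usually turns the facet section of an XML response into a lookup table. The sketch below parses a well-formed copy of the fragment above with Python's standard library; the function name `parse_facet_fields` is hypothetical, and a real response wraps this fragment in a larger `<response>` element.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<lst name="facet_counts">
  <lst name="facet_queries" />
  <lst name="facet_fields">
    <lst name="category">
      <int name="book">10</int>
      <int name="video">2</int>
      <int name="audio">2</int>
    </lst>
  </lst>
  <lst name="facet_dates"/>
</lst>
"""

def parse_facet_fields(xml_text):
    """Return {field name: {facet value: count}} from a facet_counts fragment."""
    root = ET.fromstring(xml_text.strip())
    facets = {}
    # Each child <lst> of facet_fields is one faceted field.
    for field in root.findall("./lst[@name='facet_fields']/lst"):
        facets[field.get("name")] = {
            item.get("name"): int(item.text) for item in field.findall("int")
        }
    return facets

counts = parse_facet_fields(SAMPLE)
print(counts["category"]["book"])
```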

Return facets for a specific field, category, restricted to a specific value of category, book, with counts based on the search results.

Code Block: Query for search term & limit fields returned & include facets
