Thursday, April 23, 2015

The Apache Solr Query Architecture

Overview

Fig 1: Solr Query Architecture

The diagram above demonstrates a document that contains the text: "the greenhouse gas effect".

A user query for "climate change" will result in a scored relevancy match against this document.

The user query against the Solr server is very simple:
q=climate change
Solr will augment this query in two stages:

  1. The Request Handler adds metadata about how the query should be executed.  Notions of relevance, the number of rows returned, whether highlighting is used, and the fields to query and return are all specified here.
    1. The query then becomes:
      q=climate change&defType=edismax&wt=xml&fl=id title text&qf=title^2 text&rows=10&pf=title^2 text&ps=5&echoParams=all&hl=true&hl.fl=title text&debug=true
  2. The Query Analyzer will perform a linguistic analysis of the user query.  Tokenization, pattern filtering, stemming, synonyms, etc. are all specified here.
    1. The query then becomes:
      q=(+((DisjunctionMaxQuery((text:climate | title:climate^2.0 | speaker:climate)) DisjunctionMaxQuery((text:change | title:change^2.0 | speaker:change)))~2) DisjunctionMaxQuery((title:"(greenhouse ghg climate climate deforestation pollution greenhouse carbon co2 methane nitrous n2o hydroflurocarbons hfcs perfluorocarbons pfcs sulfur sf6) (gas change shift gasses dioxide oxide hexafluoride)"~5^2.0 | text:"(greenhouse ghg climate climate deforestation pollution greenhouse carbon co2 methane nitrous n2o hydroflurocarbons hfcs perfluorocarbons pfcs sulfur sf6) (gas change shift gasses dioxide oxide hexafluoride)"~5)))/no_coord&defType=edismax&wt=xml&fl=id title text&qf=title^2 text&rows=10&pf=title^2 text&ps=5&echoParams=all&hl=true&hl.fl=title text&debug=true

This is a powerful design technique for abstracting complexity away from the user query while creating very complex and specific queries to find relevant documents.
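To give a feel for how a two-word query fans out across fields, here is a toy Python sketch. It is not Solr's parser; the function names are hypothetical, and it only mimics the shape of the per-term disjunctions in the debug output above, assuming the fields and boosts from qf=title^2 text.

```python
def dismax_clause(term, field_boosts):
    """Render one term as a disjunction across fields, in the style of the debug output."""
    parts = []
    for field, boost in field_boosts.items():
        # A boost of 1.0 is the default, so it is not printed.
        if boost == 1.0:
            parts.append(f"{field}:{term}")
        else:
            parts.append(f"{field}:{term}^{boost}")
    return "(" + " | ".join(parts) + ")"


def expand_query(user_query, field_boosts):
    """Split the query on whitespace and emit one clause per term."""
    return " ".join(dismax_clause(term, field_boosts) for term in user_query.split())


# Fields and boosts taken from qf=title^2 text
qf = {"text": 1.0, "title": 2.0}
expanded = expand_query("climate change", qf)
print(expanded)
# (text:climate | title:climate^2.0) (text:change | title:change^2.0)
```

The real query parser also applies the analyzer chain, phrase boosts (pf, ps) and synonym expansion, none of which this sketch attempts.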


Request Handler


The first augmentation stage is controlled by the request handler.

Request handlers are defined within solrconfig.xml:
<requestHandler name="/docQuery" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="wt">xml</str>
    <str name="fl">id author abstract heading text</str>
    <str name="qf">title^4 abstract^2 text</str>
    <str name="rows">10</str>
    <str name="pf">title^4 abstract^2 text</str>
    <str name="ps">5</str>
    <str name="echoParams">all</str>
    <str name="mm">3&lt;-1 5&lt;-2 6&lt;-40%</str> 
    <str name="hl">true</str>
    <str name="hl.fl">title abstract text</str>
    <str name="debug">true</str>
    <str name="explain">true</str>
  </lst>
</requestHandler> 

This request handler produces the node labeled "Augmented User Query 1" in the diagram.  This is already a major benefit of the configuration files: the user (or the application) doesn't have to append all this information to the query string.  It's applied by default to each query.
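A minimal Python sketch of that merging behavior follows. It is not Solr code: HANDLER_DEFAULTS and effective_params are hypothetical names, and only a subset of the defaults is copied. It assumes the usual Solr behavior that values in a handler's "defaults" list apply only when the request does not supply the same parameter.

```python
from urllib.parse import urlencode

# A subset of the "defaults" from the /docQuery handler shown above.
HANDLER_DEFAULTS = {
    "defType": "edismax",
    "wt": "xml",
    "qf": "title^4 abstract^2 text",
    "rows": "10",
    "hl": "true",
}


def effective_params(request_params, defaults=HANDLER_DEFAULTS):
    """Return the handler defaults overlaid with whatever the client sent."""
    merged = dict(defaults)
    merged.update(request_params)  # values from the request win over defaults
    return merged


# The client only sends q; everything else comes from the handler.
params = effective_params({"q": "climate change"})
print(urlencode(params))

# A client can still override a default for a single request.
override = effective_params({"q": "climate change", "rows": "50"})
print(override["rows"])
```

The point of the design is that the client stays simple: it sends q, and the relevance tuning lives in one place on the server.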


Query Analyzer


The query is next augmented by each field that is being searched.

Within the schema.xml file, I have a field type defined for text:
    <fieldType name="text_doc" class="solr.TextField" positionIncrementGap="100">

      <!-- Indexer -->
      <analyzer type="index">
        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([a-zA-Z])\1+" replacement="$1$1" />
        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([a-zA-Z])(/)([a-zA-Z])" replacement="$1 or $3" />
        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(\()(.)+(\))" replacement="" />
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1"
          splitOnCaseChange="1"
          splitOnNumerics="1"
          stemEnglishPossessive="1"
          preserveOriginal="1"
          catenateWords="1"
          generateNumberParts="1"
          catenateNumbers="1"
          catenateAll="1"
          types="wdfftypes.txt" />
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <filter class="solr.EnglishPossessiveFilterFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.ASCIIFoldingFilterFactory" />
        <filter class="solr.StemmerOverrideFilterFactory" dictionary="stemdict.txt" />
        <filter class="solr.KStemFilterFactory" />
      </analyzer>

      <!-- Query Analyzer -->
      <analyzer type="query">
        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([a-zA-Z])\1+" replacement="$1$1" />
        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([a-zA-Z])(/)([a-zA-Z])" replacement="$1 or $3" />
        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(\()(.)+(\))" replacement="$2" />
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1"
          splitOnCaseChange="1"
          splitOnNumerics="1"
          stemEnglishPossessive="1"
          preserveOriginal="1"
          catenateWords="1"
          generateNumberParts="1"
          catenateNumbers="1"
          catenateAll="1"
          types="wdfftypes.txt" />
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <filter class="solr.EnglishPossessiveFilterFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.ASCIIFoldingFilterFactory" />
        <filter class="solr.StemmerOverrideFilterFactory" dictionary="stemdict.txt" />
        <filter class="solr.KStemFilterFactory" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
      </analyzer>

    </fieldType>


Note that in the configuration file above, there are two analyzers specified for the field type called "text_doc".  One analyzer is for the document text being indexed during the ingestion phase.  The other analyzer is for the user query text that triggers the search (for the indexed text).  In both cases, the configuration is largely identical, except for the use of synonyms.

This is an important concept to grasp.  If the indexed content and the user query are both treated by (nearly) identical analyzers, it's going to be a lot easier to find relevant text.  As a counterexample, imagine having to design a tokenization pipeline for user queries against content indexed by multiple, unknown configurations.  If you use aggressive stemming and wildcards to boost recall, this will come at the expense of precision.
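The effect of running the same normalization on both sides can be sketched in a few lines of Python. This is only an approximation: the three patterns are the index-time char filters from the field type above, rewritten for Python's re module (Java's $1 becomes \1), while the whitespace split and lowercasing are stand-ins for the tokenizer and LowerCaseFilter. Word delimiting, stop words, stemming and synonyms are left out, and the names CHAR_FILTERS and normalize are hypothetical.

```python
import re

# Index-time PatternReplaceCharFilterFactory rules, as (pattern, replacement) pairs.
CHAR_FILTERS = [
    # A letter repeated two or more times is reduced to exactly two ("coool" -> "cool").
    (re.compile(r"([a-zA-Z])\1+"), r"\1\1"),
    # A slash between two letters becomes " or " ("a/b" -> "a or b").
    (re.compile(r"([a-zA-Z])(/)([a-zA-Z])"), r"\1 or \3"),
    # Parenthesized text is removed, from the first "(" to the last ")".
    (re.compile(r"(\()(.)+(\))"), ""),
]


def normalize(text):
    """Apply the char filters, then lowercase and split on whitespace."""
    for pattern, replacement in CHAR_FILTERS:
        text = pattern.sub(replacement, text)
    return text.lower().split()


indexed = normalize("Greenhouse/Gas effect (GHG)")
query = normalize("GREENHOUSE gas")
print(indexed)  # ['greenhouse', 'or', 'gas', 'effect']
print(query)    # ['greenhouse', 'gas']

# Because both sides went through the same steps, every query token is found.
print(set(query) <= set(indexed))  # True
```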


References

  1. [YouTube, 5:54] Apache Solr: Complex Query Format

Tuesday, April 14, 2015

Apache Solr and Docker (for Beginners)

Introduction


It is possible to find instructions for installing Solr directly on your OS. But there's almost no reason these days not to use docker. There are plenty of docker containers for Solr, and the operation (as I will demonstrate in this article) is almost trivial. Launching a docker container from a trusted image not only saves time and effort, but leverages best-practices installation and configuration techniques, at least where official or highly trusted Dockerfiles are concerned.

In this tutorial, I'll walk through launching a docker container for Solr, attaching an external data volume, and demonstrating the successful GET and POST of data to Solr between container lifecycles.

Using the docker search command, I find that these images have already been built:
craig@devenv:~$ sudo docker search solr
NAME                              DESCRIPTION                                     STARS     OFFICIAL   AUTOMATED
makuk66/docker-solr               Solr is the popular, blazing-fast, open so...   37                   [OK]
guywithnose/solr                                                                  6                    [OK]
raycoding/piggybank-solr-tomcat   SolrCloud on Docker                             4                    [OK]
cygri/solr                        A custom build of Solr for use with CKAN        3                    [OK]
pointslope/solr                   This is a lightweight Apache Solr installa...   2                    [OK]
yoshz/solr                        A docker image running SOLR  on top of Ubu...   1                    [OK]
geoblacklight/solr                Solr for GeoBlacklight                          0                    [OK]
infotechsoft/solr                 SOLR installed on CentOS using openjdk7         0                    [OK]
reinblau/solr3                    Apache Solr , prepared for Search Api Solr      0                    [OK]
kyberna/solr                                                                      0                    [OK]
encoflife/solr                                                                    0                    [OK]
lphoward/vehicleforge-solr        Dockerfile and resources for Apache Solr o...   0                    [OK]
cpilsworth/solr                   Solr 4.10.2 on Jetty 9.2.5 using Oracle Ja...   0                    [OK]
holmes/solr                                                                       0                    [OK]
writl/solr-typo3                  Apache Solr configured for Typo3 solr exte...   0                    [OK]
hiroara/solr                                                                      0                    [OK]
blinkreaction/drupal-solr                                                         0                    [OK]
quirky/solr                                                                       0                    [OK]
eccenca/ckan-solr                                                                 0                    [OK]
obi12341/solr-typo3               Please use writl/solr-typo3. This repo wil...   0                    [OK]
pmoust/solr                       Solr container image - Ubuntu Trusty (LTS)...   0                    [OK]
pataquets/solr                                                                    0                    [OK]
anapsix/docker-solr               SOLR Java8 / Ubuntu 14.04. Includes JDBC f...   0                    [OK]
manycore/solr                     Solr is the popular open source enterprise...   0                    [OK]
dhorbach/solr                                                                     0                    [OK]
craig@devenv:~$


At the time of this article, the first docker image is the most popular, and has a Dockerfile with a configuration that I agree with. The image is built on top of java:8, which in turn uses dockerfile/ubuntu as the base image. I like this design approach rather than pulling the OS directly. If I ever want to extend this Dockerfile, I'm generally happier to work with an underlying Debian image as well, although this isn't generally much of an issue with Docker.

The image is also well-supported; an updated configuration for the Solr 5.1 release was published today.


Building the Image


This is an optional section, but as I've mentioned elsewhere, I like to piggyback on Dockerfiles that I'm planning to use on a project.

Given a local docker registry and multiple team members, it becomes easier to point members to the registry and a standard naming scheme for images. In this case, I'm not adding any extensions to the Dockerfile at the moment, but that could always come later.

My copy looks like:
FROM    makuk66/docker-solr
MAINTAINER  Craig Trim "craigtrim@gmail.com"

and I build this using
$ sudo docker build -t craig/solr .

from within the directory containing my Dockerfile.


Launching a Container


I can launch a container from the image directly by typing
$ sudo docker run -d -p 8983:8983 craig/solr
691f2adee14563904996cbc465c513861860dd654a011063dc31f89660def543

This launches the container and exposes the Solr installation's port 8983 on the Host port 8983.  I could use any Host port I want, and if I launch multiple containers from this image, I will indeed need to select a different Host port for each one.
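If several containers need to be started from a script, the port mapping can be generated rather than typed. The Python sketch below only builds the argument list, using the same -d and -p flags as the command above; docker_run_args is a hypothetical helper, and host port 8984 is just an example value.

```python
def docker_run_args(image, host_port, container_port=8983, detached=True):
    """Build the argv for `docker run`, mapping host_port to the container's Solr port."""
    args = ["docker", "run"]
    if detached:
        args.append("-d")
    args += ["-p", f"{host_port}:{container_port}", image]
    return args


# Two containers from the same image need distinct host ports.
first = docker_run_args("craig/solr", 8983)
second = docker_run_args("craig/solr", 8984)
print(" ".join(first))   # docker run -d -p 8983:8983 craig/solr
print(" ".join(second))  # docker run -d -p 8984:8983 craig/solr
```

Either list could be handed to subprocess.run (prefixed with sudo if your setup requires it) to actually start the container.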


The installation can be verified via a web browser:
Fig 1: http://192.168.x.y:8983/solr
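The same check can be scripted instead of done in a browser. This is a sketch under assumptions: it uses Solr's CoreAdmin STATUS action with a JSON response, the host name "localhost" is a placeholder for wherever your container is reachable, and the helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def core_status_url(host, port=8983):
    """URL for the CoreAdmin STATUS action, which reports the cores Solr has loaded."""
    return f"http://{host}:{port}/solr/admin/cores?action=STATUS&wt=json"


def loaded_cores(host, port=8983, timeout=5):
    """Return the names of the loaded cores; raises an error if Solr is unreachable."""
    with urlopen(core_status_url(host, port), timeout=timeout) as response:
        body = json.load(response)
    # The "status" entry is keyed by core name.
    return sorted(body.get("status", {}))


print(core_status_url("localhost"))
# With a running container: print(loaded_cores("localhost"))
```

On a freshly launched container with no cores configured, I would expect loaded_cores to return an empty list.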

A slight problem however: if I click on "Core Admin" in the dashboard and try to add a core, I get an error:

Fig 2: SolrCore Initialization Error
And at any rate, even if I were able to add a core, what would this prove?  Given the ephemeral nature of Docker containers, any data or configuration changes I perform will be lost when the container is removed.  Granted, I could always commit my changes to the local image, but then I seem to lose much of the value of Docker.  My Dockerfile is no longer quite as relevant, and it becomes necessary to share a local image with the rest of my team (and even between my own devices), rather than simply pointing to a script.


Attaching a Volume

If you're familiar with fundamental Java concepts (particularly within the Spring Framework) the concept of "dependency injection" may be familiar.  Logic is crafted with reliance on a component, but without the requirement of managing that component's lifecycle.  And if you're not familiar with this design pattern, that's fine.  The docker concepts are simple enough as they stand.

I'm tempted to call the attachment of data volumes to a docker container "directory injection".  The pattern is clear enough: the container relies on a given directory, or many given directories.  Now, whether these directories exist (and are configured) within the container, or on the host that the container is running on, isn't that important.

Again, given the ephemeral nature of a docker container, it is often necessary to use a data volume on the host, and make the container dependent on this.

In the case of Solr, we'll want to inject a directory into our container that holds a configuration for a Solr core.

We do that like this:
$ sudo docker run -p 8983:8983 -v /home/craig/solr_data/:/opt/solr/server/solr/books craig/solr

In the -v argument, the host directory comes before the colon and the container directory after it.
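Before mounting the host directory, it can help to confirm that it actually looks like a Solr core. The Python sketch below is an assumption-laden check, not part of Solr or Docker: it assumes Solr 5 core discovery (a core.properties file in the core directory) and a classic conf/schema.xml next to conf/solrconfig.xml. The name missing_core_files is hypothetical, and the demonstration runs against a temporary directory rather than my real data.

```python
import tempfile
from pathlib import Path

# Files this sketch expects to find in a core directory (assumed layout).
REQUIRED = ["core.properties", "conf/solrconfig.xml", "conf/schema.xml"]


def missing_core_files(core_dir):
    """Return the expected files that are absent from core_dir."""
    root = Path(core_dir)
    return [name for name in REQUIRED if not (root / name).is_file()]


# Demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    before = missing_core_files(tmp)  # nothing has been created yet
    (Path(tmp) / "conf").mkdir()
    for name in REQUIRED:
        (Path(tmp) / name).write_text("")
    after = missing_core_files(tmp)   # everything is now in place

print(before)  # ['core.properties', 'conf/solrconfig.xml', 'conf/schema.xml']
print(after)   # []
```

In my case the call would be missing_core_files("/home/craig/solr_data"), and an empty list means the directory is ready to be mounted.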

In some cases (particularly when debugging, or just getting started), I avoid running a container in detached mode, and focus on the immediate log output.

As in this case:
$ sudo docker run -p 8983:8983 -v /home/craig/solr_data/:/opt/solr/server/solr/books craig/solr

Starting Solr on port 8983 from /opt/solr/server

0    [main] INFO  org.eclipse.jetty.server.Server  ? jetty-8.1.10.v20130312
48   [main] INFO  org.eclipse.jetty.deploy.providers.ScanningAppProvider  ? Deployment monitor /opt/solr-5.0.0/server/contexts at interval 0
55   [main] INFO  org.eclipse.jetty.deploy.DeploymentManager  ? Deployable added: /opt/solr-5.0.0/server/contexts/solr-jetty-context.xml
122  [main] INFO  org.eclipse.jetty.webapp.WebInfConfiguration  ? Extract jar:file:/opt/solr-5.0.0/server/webapps/solr.war!/ to /opt/solr-5.0.0/server/solr-webapp/webapp
1541 [main] INFO  org.eclipse.jetty.webapp.StandardDescriptorProcessor  ? NO JSP Support for /solr, did not find org.apache.jasper.servlet.JspServlet
1598 [main] INFO  org.apache.solr.servlet.SolrDispatchFilter  ? SolrDispatchFilter.init()WebAppClassLoader=2028371466@78e67e0a
1622 [main] INFO  org.apache.solr.core.SolrResourceLoader  ? JNDI not configured for solr (NoInitialContextEx)
1623 [main] INFO  org.apache.solr.core.SolrResourceLoader  ? using system property solr.solr.home: /opt/solr/server/solr
1624 [main] INFO  org.apache.solr.core.SolrResourceLoader  ? new SolrResourceLoader for directory: '/opt/solr/server/solr/'
1776 [main] INFO  org.apache.solr.core.ConfigSolr  ? Loading container configuration from /opt/solr/server/solr/solr.xml
1869 [main] INFO  org.apache.solr.core.CoresLocator  ? Config-defined core root directory: /opt/solr/server/solr
1878 [main] INFO  org.apache.solr.core.CoreContainer  ? New CoreContainer 1048855692
1879 [main] INFO  org.apache.solr.core.CoreContainer  ? Loading cores into CoreContainer [instanceDir=/opt/solr/server/solr/]
1892 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  ? Setting socketTimeout to: 600000
1892 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  ? Setting urlScheme to: null
1896 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  ? Setting connTimeout to: 60000
1899 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  ? Setting maxConnectionsPerHost to: 20
1900 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  ? Setting maxConnections to: 10000
1900 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  ? Setting corePoolSize to: 0
1900 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  ? Setting maximumPoolSize to: 2147483647
1900 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  ? Setting maxThreadIdleTime to: 5
1901 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  ? Setting sizeOfQueue to: -1
1901 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  ? Setting fairnessPolicy to: false
1906 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory  ? Setting useRetries to: false
2037 [main] INFO  org.apache.solr.update.UpdateShardHandler  ? Creating UpdateShardHandler HTTP client with params: socketTimeout=600000&connTimeout=60000&retry=true
2040 [main] INFO  org.apache.solr.logging.LogWatcher  ? SLF4J impl is org.slf4j.impl.Log4jLoggerFactory
2041 [main] INFO  org.apache.solr.logging.LogWatcher  ? Registering Log Listener [Log4j (org.slf4j.impl.Log4jLoggerFactory)]
2044 [main] INFO  org.apache.solr.core.CoreContainer  ? Host Name:
2079 [main] INFO  org.apache.solr.core.CoresLocator  ? Looking for core definitions underneath /opt/solr/server/solr
2093 [main] INFO  org.apache.solr.core.CoresLocator  ? Found core books in /opt/solr/server/solr/books/
2101 [main] INFO  org.apache.solr.core.CoresLocator  ? Found 1 core definitions
2104 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? new SolrResourceLoader for directory: '/opt/solr/server/solr/books/'
2188 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrConfig  ? current version of requestparams : -1
2194 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrConfig  ? Adding specified lib dirs to ClassLoader
2196 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/icu4j-54.1.jar' to classloader
2196 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/poi-3.11.jar' to classloader
2197 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/aspectjrt-1.8.0.jar' to classloader
2197 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/xmpcore-5.1.2.jar' to classloader
2197 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/poi-ooxml-3.11.jar' to classloader
2198 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/xz-1.5.jar' to classloader
2198 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' to classloader
2200 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/boilerpipe-1.1.0.jar' to classloader
2200 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/xmlbeans-2.6.0.jar' to classloader
2200 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/pdfbox-1.8.8.jar' to classloader
2200 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/isoparser-1.0.2.jar' to classloader
2200 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/vorbis-java-tika-0.6.jar' to classloader
2201 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/juniversalchardet-1.0.3.jar' to classloader
2201 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/tika-xmp-1.7.jar' to classloader
2201 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/tika-core-1.7.jar' to classloader
2202 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/bcmail-jdk15-1.45.jar' to classloader
2202 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/rome-1.0.jar' to classloader
2203 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/tika-java7-1.7.jar' to classloader
2203 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/tika-parsers-1.7.jar' to classloader
2203 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/vorbis-java-core-0.6.jar' to classloader
2203 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/poi-scratchpad-3.11.jar' to classloader
2203 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/metadata-extractor-2.6.2.jar' to classloader
2204 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' to classloader
2204 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/jhighlight-1.0.jar' to classloader
2205 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/poi-ooxml-schemas-3.11.jar' to classloader
2207 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/java-libpst-0.8.1.jar' to classloader
2207 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/jdom-1.0.jar' to classloader
2207 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/jmatio-1.0.jar' to classloader
2208 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/commons-compress-1.8.1.jar' to classloader
2208 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/xercesImpl-2.9.1.jar' to classloader
2208 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/bcprov-jdk15-1.45.jar' to classloader
2208 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/jempbox-1.8.8.jar' to classloader
2209 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/tagsoup-1.2.1.jar' to classloader
2211 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/extraction/lib/fontbox-1.8.8.jar' to classloader
2212 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/dist/solr-cell-5.0.0.jar' to classloader
2213 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/clustering/lib/mahout-collections-1.0.jar' to classloader
2213 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/clustering/lib/mahout-math-0.6.jar' to classloader
2213 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/clustering/lib/carrot2-mini-3.9.0.jar' to classloader
2216 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/clustering/lib/attributes-binder-1.2.1.jar' to classloader
2216 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/clustering/lib/jackson-mapper-asl-1.9.13.jar' to classloader
2217 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/clustering/lib/simple-xml-2.7.jar' to classloader
2219 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/clustering/lib/jackson-core-asl-1.9.13.jar' to classloader
2219 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/clustering/lib/hppc-0.5.2.jar' to classloader
2219 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/dist/solr-clustering-5.0.0.jar' to classloader
2220 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/langid/lib/jsonic-1.2.7.jar' to classloader
2220 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/langid/lib/langdetect-1.1-20120112.jar' to classloader
2221 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/dist/solr-langid-5.0.0.jar' to classloader
2224 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/velocity/lib/velocity-1.7.jar' to classloader
2224 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/velocity/lib/commons-beanutils-1.8.3.jar' to classloader
2224 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/velocity/lib/velocity-tools-2.0.jar' to classloader
2225 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/contrib/velocity/lib/commons-collections-3.2.1.jar' to classloader
2225 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrResourceLoader  ? Adding 'file:/opt/solr/dist/solr-velocity-5.0.0.jar' to classloader
2290 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.update.SolrIndexConfig  ? IndexWriter infoStream solr logging is enabled
2297 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrConfig  ? Using Lucene MatchVersion: 4.7.0
2401 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.Config  ? Loaded SolrConfig: solrconfig.xml
2410 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.schema.IndexSchema  ? Reading Solr Schema from /opt/solr/server/solr/books/conf/schema.xml
2417 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.schema.IndexSchema  ? [books] Schema name=books
2473 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.schema.IndexSchema  ? unique key field: id
2567 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.CoreContainer  ? Creating SolrCore 'books' using configuration from instancedir /opt/solr/server/solr/books/
2586 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? solr.NRTCachingDirectoryFactory
2593 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? [books] Opening new SolrCore at /opt/solr/server/solr/books/, dataDir=/opt/solr-5.0.0/server/solr/books/data/
2598 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.JmxMonitoredMap  ? No JMX servers found, not exposing Solr information with JMX.
2603 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? [books] Added SolrEventListener for newSearcher: org.apache.solr.core.QuerySenderListener{queries=[]}
2603 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? [books] Added SolrEventListener for firstSearcher: org.apache.solr.core.QuerySenderListener{queries=[{q=static firstSearcher warming in solrconfig.xml}]}
2624 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.CachingDirectoryFactory  ? return new directory for /opt/solr-5.0.0/server/solr/books/data
2627 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? New index directory detected: old=null new=/opt/solr-5.0.0/server/solr/books/data/index/
2631 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.CachingDirectoryFactory  ? return new directory for /opt/solr-5.0.0/server/solr/books/data/index
2653 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? created json: solr.JSONResponseWriter
2654 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? adding lazy queryResponseWriter: solr.VelocityResponseWriter
2655 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? created velocity: solr.VelocityResponseWriter
2662 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? created xslt: solr.XSLTResponseWriter
2662 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.response.XSLTResponseWriter  ? xsltCacheLifetimeSeconds=5
2773 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? no updateRequestProcessorChain defined as default, creating implicit default
2785 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /update/json/docs: org.apache.solr.handler.UpdateRequestHandler
2786 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /config: org.apache.solr.handler.SolrConfigHandler
2788 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /schema: org.apache.solr.handler.SchemaHandler
2790 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /admin/luke: org.apache.solr.handler.admin.LukeRequestHandler
2793 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /admin/system: org.apache.solr.handler.admin.SystemInfoHandler
2794 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /admin/mbeans: org.apache.solr.handler.admin.SolrInfoMBeanHandler
2796 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /admin/plugins: org.apache.solr.handler.admin.PluginInfoHandler
2796 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /admin/threads: org.apache.solr.handler.admin.ThreadDumpHandler
2796 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /admin/properties: org.apache.solr.handler.admin.PropertiesRequestHandler
2796 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /admin/logging: org.apache.solr.handler.admin.LoggingHandler
2799 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /admin/file: org.apache.solr.handler.admin.ShowFileRequestHandler
2804 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /select: solr.SearchHandler
2804 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /query: solr.SearchHandler
2807 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /get: solr.RealTimeGetHandler
2808 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /browse: solr.SearchHandler
2809 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /update: solr.UpdateRequestHandler
2809 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /update/json: solr.UpdateRequestHandler
2809 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /update/csv: solr.UpdateRequestHandler
2811 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? adding lazy requestHandler: solr.extraction.ExtractingRequestHandler
2813 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /update/extract: solr.extraction.ExtractingRequestHandler
2815 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? adding lazy requestHandler: solr.FieldAnalysisRequestHandler
2815 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /analysis/field: solr.FieldAnalysisRequestHandler
2816 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? adding lazy requestHandler: solr.DocumentAnalysisRequestHandler
2816 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /analysis/document: solr.DocumentAnalysisRequestHandler
2831 [coreLoadExecutor-5-thread-1] WARN  org.apache.solr.core.SolrResourceLoader  ? Solr loaded a deprecated plugin/analysis class [solr.admin.AdminHandlers]. Please consult documentation how to replace it accordingly.
2832 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /admin/: solr.admin.AdminHandlers
2835 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /admin/ping: solr.PingRequestHandler
2837 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /debug/dump: solr.DumpRequestHandler
2853 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /replication: solr.ReplicationHandler
2853 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? adding lazy requestHandler: solr.SearchHandler
2854 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /spell: solr.SearchHandler
2854 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? adding lazy requestHandler: solr.SearchHandler
2855 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /suggest: solr.SearchHandler
2855 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? adding lazy requestHandler: solr.SearchHandler
2855 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /tvrh: solr.SearchHandler
2855 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? adding lazy requestHandler: solr.SearchHandler
2855 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.RequestHandlers  ? created /terms: solr.SearchHandler
2880 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.handler.loader.XMLLoader  ? xsltCacheLifetimeSeconds=60
2884 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.handler.loader.XMLLoader  ? xsltCacheLifetimeSeconds=60
2887 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.handler.loader.XMLLoader  ? xsltCacheLifetimeSeconds=60
2888 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.handler.loader.XMLLoader  ? xsltCacheLifetimeSeconds=60
2891 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? Using default statsCache cache: org.apache.solr.search.stats.LocalStatsCache
2924 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? Hard AutoCommit: if uncommited for 15000ms;
2925 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? Soft AutoCommit: disabled
2976 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? SolrDeletionPolicy.onInit: commits: num=1
        commit{dir=NRTCachingDirectory(MMapDirectory@/opt/solr-5.0.0/server/solr/books/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@362b8706; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_2,generation=2}
2977 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  ? newest commit generation = 2
3018 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.search.SolrIndexSearcher  ? Opening Searcher@5e80903c[books] main
3027 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.rest.ManagedResourceStorage  ? File-based storage initialized to use dir: /opt/solr/server/solr/books/conf
3028 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.rest.RestManager  ? Initializing RestManager with initArgs: {storageDir=/opt/solr/server/solr/books/conf}
3038 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.rest.ManagedResourceStorage  ? Reading _rest_managed.json using file:dir=/opt/solr/server/solr/books/conf
3039 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.rest.RestManager  ? Initializing 0 registered ManagedResources
3039 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.handler.component.SpellCheckComponent  ? Initializing spell checkers
3049 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.spelling.DirectSolrSpellChecker  ? init: {name=default,field=text,classname=solr.DirectSolrSpellChecker,distanceMeasure=internal,accuracy=0.5,maxEdits=2,minPrefix=1,maxInspections=5,minQueryLength=4,maxQueryFrequency=0.01}
3056 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.handler.component.SpellCheckComponent  ? No queryConverter defined, using default converter
3058 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.handler.component.SuggestComponent  ? Initializing SuggestComponent
3060 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.spelling.suggest.SolrSuggester  ? init: {name=mySuggester,lookupImpl=FuzzyLookupFactory,dictionaryImpl=DocumentDictionaryFactory,field=cat,weightField=price,suggestAnalyzerFieldType=string}
3078 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.spelling.suggest.SolrSuggester  ? Dictionary loaded with params: {name=mySuggester,lookupImpl=FuzzyLookupFactory,dictionaryImpl=DocumentDictionaryFactory,field=cat,weightField=price,suggestAnalyzerFieldType=string}
3094 [coreLoadExecutor-5-thread-1] WARN  org.apache.solr.handler.admin.AdminHandlers  ? <requestHandler name="/admin/"
 class="solr.admin.AdminHandlers" /> is deprecated . It is not required anymore
3094 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.handler.ReplicationHandler  ? Commits will be reserved for  10000
3096 [coreLoadExecutor-5-thread-1] INFO  org.apache.solr.core.CoreContainer  ? registering core: books
3096 [searcherExecutor-6-thread-1] INFO  org.apache.solr.core.SolrCore  ? QuerySenderListener sending requests to Searcher@5e80903c[books] main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_0(5.0.0):C1)))}
3105 [main] INFO  org.apache.solr.servlet.SolrDispatchFilter  ? user.dir=/opt/solr-5.0.0/server
3105 [main] INFO  org.apache.solr.servlet.SolrDispatchFilter  ? SolrDispatchFilter.init() done
3140 [main] INFO  org.eclipse.jetty.server.AbstractConnector  ? Started SocketConnector@0.0.0.0:8983
3146 [searcherExecutor-6-thread-1] ERROR org.apache.solr.core.SolrCore  ? org.apache.solr.common.SolrException: undefined field text
        at org.apache.solr.schema.IndexSchema.getDynamicFieldType(IndexSchema.java:1291)
        at org.apache.solr.schema.IndexSchema$SolrQueryAnalyzer.getWrappedAnalyzer(IndexSchema.java:444)
        at org.apache.lucene.analysis.DelegatingAnalyzerWrapper$DelegatingReuseStrategy.getReusableComponents(DelegatingAnalyzerWrapper.java:74)
        at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:172)
        at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:205)
        at org.apache.solr.parser.SolrQueryParserBase.newFieldQuery(SolrQueryParserBase.java:373)
        at org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.java:741)
        at org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:540)
        at org.apache.solr.parser.QueryParser.Term(QueryParser.java:299)
        at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:185)
        at org.apache.solr.parser.QueryParser.Query(QueryParser.java:107)
        at org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:96)
        at org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:150)
        at org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:50)
        at org.apache.solr.search.QParser.getQuery(QParser.java:141)
        at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:156)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:201)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2006)
        at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64)
        at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1778)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

3149 [searcherExecutor-6-thread-1] INFO  org.apache.solr.core.SolrCore  ? [books] webapp=null path=null params={q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false&event=firstSearcher} status=400 QTime=48
3151 [searcherExecutor-6-thread-1] INFO  org.apache.solr.core.SolrCore  ? QuerySenderListener done.
3151 [searcherExecutor-6-thread-1] INFO  org.apache.solr.handler.component.SpellCheckComponent  ? Loading spell index for spellchecker: default
3152 [searcherExecutor-6-thread-1] INFO  org.apache.solr.handler.component.SpellCheckComponent  ? Loading spell index for spellchecker: wordbreak
3152 [searcherExecutor-6-thread-1] INFO  org.apache.solr.handler.component.SuggestComponent  ? Loading suggester index for: mySuggester
3152 [searcherExecutor-6-thread-1] INFO  org.apache.solr.spelling.suggest.SolrSuggester  ? reload()
3152 [searcherExecutor-6-thread-1] INFO  org.apache.solr.spelling.suggest.SolrSuggester  ? build()
3169 [searcherExecutor-6-thread-1] INFO  org.apache.solr.core.SolrCore  ? [books] Registered new searcher Searcher@5e80903c[books] main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_0(5.0.0):C1)))}



The Solr Core


During startup, Solr will scan sub-directories looking for specific files named core.properties.
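
The discovery step can be approximated in a few lines of Java. This is only a sketch of the idea, not Solr's own implementation: the class name CoreDiscoverySketch and the method findCoreDirs are hypothetical, and the walk simply reports every directory that holds a core.properties file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only (not part of Solr).
final class CoreDiscoverySketch {

    // Returns every directory under solrHome that contains a core.properties file.
    static List<Path> findCoreDirs(Path solrHome) throws IOException {
        try (Stream<Path> paths = Files.walk(solrHome)) {
            return paths
                    .filter(p -> p.getFileName() != null
                            && p.getFileName().toString().equals("core.properties"))
                    .map(Path::getParent)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String... args) throws IOException {
        // Assumption: the directory to scan is passed as the first argument.
        Path solrHome = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path coreDir : findCoreDirs(solrHome)) {
            System.out.println("core candidate: " + coreDir);
        }
    }
}
```

Run against the parent of the solr_data directory shown below, a sketch like this should report solr_data as the only candidate.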

The solr_data directory that I injected into my container in the last section contains, at a minimum, these contents:
$ tree
.
+-- solr_data
    +-- conf
    ¦   +-- schema.xml
    ¦   +-- solrconfig.xml
    +-- core.properties

2 directories, 3 files


The contents of each file are hyperlinked in place above.

When we launch the Docker container with this directory and its contents injected, both the log file and the dashboard show a more successful configuration.

Last line of the log file:
3169 [searcherExecutor-6-thread-1] INFO  org.apache.solr.core.SolrCore  ? [books] Registered new searcher Searcher@5e80903c[books] main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_0(5.0.0):C1)))}


The dashboard:
Fig 3: Core Configuration working properly

If I choose the core from the "Core Selector" drop-down list (at the bottom of the left-hand nav pane), an overview pane appears, from which I can select "Query".  At the bottom of the query pane is an "execute" button.

Clicking this button will perform a default query against the configured Solr core.

Fig 4: Query returns no results
As expected, when we run this query, there is no data.
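
The same query can be issued outside the dashboard as a plain HTTP request. The sketch below only builds the request URL. It assumes the dashboard's default query is the match-all query *:*, that the core is served by the /select handler listed in the log above, and that the host address is the one used by the Java client later in this post. SelectUrlSketch and selectUrl are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
final class SelectUrlSketch {

    // Builds the URL for a query against the core's /select handler.
    static String selectUrl(String coreUrl, String query, int rows)
            throws UnsupportedEncodingException {
        return coreUrl + "/select"
                + "?q=" + URLEncoder.encode(query, "UTF-8")
                + "&rows=" + rows
                + "&wt=json";
    }

    public static void main(String... args) throws Exception {
        // Assumption: adjust the host and core name for your own setup.
        String coreUrl = "http://192.168.1.34:8983/solr/books";
        // prints http://192.168.1.34:8983/solr/books/select?q=*%3A*&rows=10&wt=json
        System.out.println(selectUrl(coreUrl, "*:*", 10));
    }
}
```

The printed URL can be opened in a browser or passed to any HTTP client. Against an empty core the response should contain no documents.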


Java Client


I'm going to use a Java client to add data:
package com.trimc.blogger.solr.minimalist;

import static org.junit.Assert.assertNotNull;

import java.util.ArrayList;
import java.util.Collection;

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrInputDocument;

@SuppressWarnings("deprecation")
public final class PostData {

 public static void main(String... args) throws Throwable {

  String url = "http://192.168.1.34:8983/solr/books";

  HttpSolrServer server = new HttpSolrServer(url);
  assertNotNull(server);

  server.setMaxRetries(1); // defaults to 0.  > 1 not recommended.
  server.setConnectionTimeout(5000); // 5 seconds to establish TCP
  // Setting the XML response parser is only required for cross
  // version compatibility and only when one side is 1.4.1 or
  // earlier and the other side is 3.1 or later.
  server.setParser(new XMLResponseParser()); // binary parser is used by default
  // The following settings are provided here for completeness.
  // They will not normally be required, and should only be used 
  // after consulting javadocs to know whether they are truly required.
  server.setSoTimeout(1000); // socket read timeout
  server.setDefaultMaxConnectionsPerHost(100);
  server.setMaxTotalConnections(100);
  server.setFollowRedirects(false); // defaults to false
  // allowCompression defaults to false.
  // Server side must support gzip or deflate for this to have any effect.
  server.setAllowCompression(true);

  SolrInputDocument doc1 = new SolrInputDocument();
  doc1.addField("id", "id2", 1.0f);
  doc1.addField("title", "The Yellow Admiral", 1.0f);
  doc1.addField("speaker", "Patrick O'Brian", 1.0f);
  doc1.addField("page", 56, 1.0f);
  doc1.addField("url", "http://en.wikipedia.org/wiki/The_Yellow_Admiral", 1.0f);
  doc1.addField("line", "the dark of the moon", 1.0f);

  Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
  docs.add(doc1);

  UpdateRequest req = new UpdateRequest();
  req.setAction(UpdateRequest.ACTION.COMMIT, false, false);
  req.add(docs);

  UpdateResponse rsp = req.process(server);
  assertNotNull(rsp);
 }
}


I run the client locally, and then query the Solr core via the dashboard:
Fig 5: Query returns results



Restarting the Container


To verify that the data persists across container restarts, first stop the running instance.

If we examine the solr_data directory, we find this:
$ tree
.
+-- conf
¦   +-- schema.xml
¦   +-- solrconfig.xml
+-- core.properties
+-- data
    +-- index
    ¦   +-- _2.fdt
    ¦   +-- _2.fdx
    ¦   +-- _2.fnm
    ¦   +-- _2_Lucene50_0.doc
    ¦   +-- _2_Lucene50_0.pos
    ¦   +-- _2_Lucene50_0.tim
    ¦   +-- _2_Lucene50_0.tip
    ¦   +-- _2.si
    ¦   +-- segments_4
    ¦   +-- write.lock
    +-- tlog
        +-- tlog.0000000000000000000
        +-- tlog.0000000000000000001
        +-- tlog.0000000000000000002

4 directories, 16 files


Note the creation of the sub-directory named data. If you've worked directly with Lucene before, you'll recognize the contents of this directory.
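
One quick way to confirm that a commit reached the disk is to look for the segments_N file in data/index. The sketch below reads the newest commit generation from those file names, relying on Lucene's convention of writing the generation as a base-36 suffix. IndexCommitSketch and latestGeneration are hypothetical names, and the default path is an assumption based on the listing above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only (not part of Lucene or Solr).
final class IndexCommitSketch {

    private static final String PREFIX = "segments_";

    // Returns the highest commit generation found in indexDir, or -1 if there is none.
    static long latestGeneration(Path indexDir) throws IOException {
        long latest = -1;
        // The glob requires at least one character after the prefix.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir, PREFIX + "?*")) {
            for (Path file : files) {
                String suffix = file.getFileName().toString().substring(PREFIX.length());
                // The generation suffix is written in base 36 (Character.MAX_RADIX).
                latest = Math.max(latest, Long.parseLong(suffix, Character.MAX_RADIX));
            }
        }
        return latest;
    }

    public static void main(String... args) throws IOException {
        // Assumption: run from inside the solr_data directory unless a path is given.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "data/index");
        System.out.println("newest commit generation = " + latestGeneration(indexDir));
    }
}
```

For the listing above, which contains segments_4, this should report generation 4.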

I'm going to go ahead and start my instance in detached mode:
$ sudo docker run -d -p 8983:8983 -v /home/craig/solr_data/:/opt/solr/server/solr/books craig/solr
ad59d3fbd9d8b13cde5906f6192397c5a4556077957ef57d17f8a50bcb67f9c9

If I revisit the dashboard in my web browser, the data is still present.  Since we were able to view the Lucene index on our host system, this is no surprise.


References

  1. Resources:
    1. [GitHub] solr-minimalist
      1. Java/Maven project containing the client code to post data to Solr.
    2. [GitHub] Solr Dockerfile
  2. [StackOverflow] Solr Collections vs Cores
  3. [DigitalOcean] Manual installation of Solr 4.7.x on Ubuntu 14.04
      1. Leaves out a couple of key steps (for example, it doesn't explain why the Jetty install is necessary), and some links have changed over time. Be sure to read the comments section thoroughly.