It is possible to find instructions for installing Solr directly on your OS, but these days there is little reason not to use Docker instead. There are plenty of Docker images for Solr, and the operation (as I will demonstrate in this article) is almost trivial. Launching a container from a trusted image not only saves time and effort, but leverages best-practice installation and configuration techniques, at least where official or highly trusted Dockerfiles are concerned.
In this tutorial, I'll walk through launching a Docker container for Solr, attaching an external data volume, and demonstrating a successful GET and POST of data to Solr across container lifecycles.
Using the docker search command, I find that these images have already been built:
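That lookup is a single command (the results and popularity rankings will have drifted since this article was written):

```shell
# Query Docker Hub for published Solr images; ordering reflects
# star counts at the time the command is run.
docker search solr
```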
At the time of this article, the first image is the most popular and has a Dockerfile whose configuration I agree with. The image is built on top of java:8, which in turn uses dockerfile/ubuntu as its base image. I prefer this layered approach to pulling the OS image directly. If I ever want to extend this Dockerfile, I'm happier working against an underlying Debian image as well, although with Docker this is rarely much of an issue.
The image is also well-maintained, with an updated configuration for the Solr 5.1 release published just today.
Building the Image
This is an optional section, but as I've mentioned here, I like to piggyback on Dockerfiles that I'm planning to use on a project.
Given a local Docker registry and multiple team members, it becomes easier to point everyone to the registry and to a standard naming scheme for images. In this case I'm not adding any extensions to the Dockerfile at the moment, but that could always come later.
My copy looks like:
and I build this using
from within the directory containing my Dockerfile.
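Those steps can be sketched roughly as follows. The Dockerfile URL, registry host, and image tag here are all placeholders, not the article's actual values:

```shell
# Copy the upstream Dockerfile into a working directory, then build,
# tag, and push it to a local registry. The URL, registry host
# (localregistry:5000), and tag are illustrative assumptions.
mkdir solr-image && cd solr-image
curl -sSLO https://example.com/path/to/Dockerfile
docker build -t localregistry:5000/solr:5.1 .
docker push localregistry:5000/solr:5.1
```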
Launching a Container
I can launch a container from the image directly by typing
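Something along these lines, where the image name and tag are assumptions carried over from the build step:

```shell
# Run Solr detached (-d), publishing the container's port 8983 on
# host port 8983; the container name "solr" is a convenience.
docker run -d -p 8983:8983 --name solr localregistry:5000/solr:5.1
```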
This launches the container and exposes the Solr installation's port 8983 on host port 8983. I could use any host port I want, and if I launch multiple containers from this image I will indeed need to select a different one for each.
The installation can be verified via a web browser:
|Fig 1: http://192.168.x.y:8983/solr|
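The same check can be scripted from the command line; the host address here mirrors the placeholder in the figure caption:

```shell
# A healthy Solr admin UI answers with HTTP 200 on the first line.
curl -sI http://192.168.x.y:8983/solr/ | head -n 1
```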
A slight problem however: if I click on "Core Admin" in the dashboard and try to add a core, I get an error:
|Fig 2: SolrCore Initialization Error|
Attaching a Volume
If you're familiar with fundamental Java concepts (particularly within the Spring Framework), the idea of "dependency injection" may be familiar: logic is written to rely on a component without having to manage that component's lifecycle. And if you're not familiar with this design pattern, that's fine; the Docker concepts are simple enough as they stand.
I'm tempted to call the attachment of data volumes to a Docker container "directory injection". The pattern is clear enough: the container relies on one or more given directories, and whether those directories exist (and are configured) within the container or on the host the container runs on isn't that important.
Again, given the ephemeral nature of a Docker container, it is often necessary to keep a data volume on the host and make the container depend on it.
In the case of Solr, we'll want to inject a directory into our container that holds a configuration for a Solr core.
We do that like this:
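A hedged sketch of that run command; the host path, the container path (which assumes the image keeps Solr's home under /opt/solr/server/solr), and the image name are all assumptions:

```shell
# -v host_path:container_path mounts the host directory into the
# container; Solr then discovers its cores in the injected directory.
docker run -d -p 8983:8983 --name solr \
  -v ~/solr_data:/opt/solr/server/solr \
  localregistry:5000/solr:5.1
```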
In the volume argument, the host path appears before the colon and the container path after it.
In some cases (particularly when debugging, or just getting started), I avoid running a container in detached mode, and focus on the immediate log output.
As in this case:
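Dropping the -d flag keeps the container in the foreground (paths and image name are the same assumptions as before):

```shell
# Without -d, Solr's log output streams to the terminal and Ctrl-C
# stops the container.
docker run -p 8983:8983 \
  -v ~/solr_data:/opt/solr/server/solr \
  localregistry:5000/solr:5.1
```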
The Solr Core
During startup, Solr scans the sub-directories of its home directory looking for files named core.properties.
The solr_data directory that I injected into the container in the last section contains, at a minimum, the following:
The contents of each file are hyperlinked in place above.
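That minimal layout can be created like this. The core name "mycore" is an assumption, and the schema/solrconfig files here are empty placeholders; the real contents come from the linked project files:

```shell
# Minimal on-disk layout for Solr core auto-discovery: a
# core.properties marker file plus a conf/ directory holding the
# schema and solrconfig (placeholders here).
mkdir -p solr_data/mycore/conf
echo "name=mycore" > solr_data/mycore/core.properties
touch solr_data/mycore/conf/schema.xml solr_data/mycore/conf/solrconfig.xml
find solr_data -type f | sort
```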
When we launch the Docker container now, with this directory and its contents injected, both the log file and the dashboard show a more successful configuration.
Last line of the log file:
|Fig 3: Core Configuration working properly|
If I click on the "Core Selector" drop-down list (at the bottom of the left-hand nav pane), an overview pane appears, from which I can select "Query". At the bottom of the query pane is an "execute" button.
Clicking this button will perform a default query against the configured Solr core.
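The dashboard's default query matches all documents; the equivalent HTTP request, with assumed host and core name, looks like:

```shell
# q=*:* is the match-all query; wt=json requests a JSON response.
curl "http://192.168.x.y:8983/solr/mycore/select?q=*:*&wt=json"
```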
|Fig 4: Query returns no results|
I'm going to use a Java client to add data:
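The client itself lives in the solr-minimalist project linked below; for a quick test without Java, here is a curl sketch that posts one document (host, core name, and document fields are assumptions):

```shell
# POST a JSON array of documents to the update handler; commit=true
# makes the new document immediately visible to queries.
curl -X POST -H 'Content-Type: application/json' \
  'http://192.168.x.y:8983/solr/mycore/update?commit=true' \
  --data-binary '[{"id":"1","title":"hello from docker"}]'
```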
I run the client locally, and then query the Solr core via the dashboard:
|Fig 5: Query returns results|
Restarting the Container
To prove that the data persists between container lifecycles, first stop the running instance.
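For example (the container name is an assumption from the earlier run command):

```shell
# List running containers, then stop the Solr instance by name.
docker ps
docker stop solr
```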
If we examine the solr_data directory, we find this:
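One way to inspect it; the directory names reflect what Solr 5.x typically creates, and the exact index files will vary:

```shell
# Show the directory tree under the injected core; after a first run,
# Solr adds a data/ sub-directory holding the Lucene index.
find solr_data -maxdepth 3 -type d | sort
```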
Note the creation of the sub-directory named data. If you've worked directly with Lucene before, you'll recognize the contents of the index directory.
I'm going to go ahead and start my instance in detached mode:
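Using the same (assumed) paths and image name as before:

```shell
# Relaunch detached, reattaching the same host directory so the core
# and its existing Lucene index are rediscovered on startup.
docker run -d -p 8983:8983 \
  -v ~/solr_data:/opt/solr/server/solr \
  localregistry:5000/solr:5.1
```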
and if I revisit the dashboard via my web browser, the data is still present. Since we were able to view the Lucene index on our host system, this is no surprise.
References

- [GitHub] solr-minimalist
  - Java/Maven project containing the client code to post data to Solr.
- [GitHub] Solr Dockerfile
- [StackOverflow] Solr Collections vs Cores
- [DigitalOcean] Manual installation of Solr 4.7.x on Ubuntu 14.04
  - Leaves a couple of key steps out (for example, it doesn't explain why the Jetty install is necessary), and some links have changed over time. Be sure to read the comments section thoroughly.