Apache Solr

PDS Registry requires Apache Solr. Solr must be installed and started in cloud mode before you run PDS Registry. PDS Registry has been tested with Solr version 8.4.1.

Solr can run in two modes:

  • SolrCloud - this is the only mode supported by the PDS EN Registry tools.
  • Standalone - this can be used for your own Solr testing purposes.

SolrCloud Overview

SolrCloud is a cluster of Solr servers that provides fault tolerance and high availability. It uses Apache ZooKeeper for cluster configuration and coordination. The number of nodes you need usually depends on the amount of data, the query load, and your fault tolerance and high availability requirements.

Multinode Cluster

Below is a simplified diagram of a production setup with a 3-node ZooKeeper cluster and a multinode Solr cluster. Both the ZooKeeper and Solr clusters can be expanded by adding more nodes. This setup provides fault tolerance and high availability.

Single Node Cluster with Embedded ZooKeeper

This is the simplest SolrCloud configuration. A single ZooKeeper instance and a single Solr server run in the same JVM. This setup provides no fault tolerance or high availability and cannot be expanded. It is recommended for development or testing and can be easily installed on a laptop or desktop.

Single Node Cluster with Separate ZooKeeper

This configuration has a single-node ZooKeeper and a single-node Solr server running in different JVMs. This setup can be easily expanded by adding more ZooKeeper or Solr nodes.
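
As a minimal sketch of how a Solr node is pointed at a separate ZooKeeper: the bin/solr script accepts a -z option that takes a ZooKeeper connection string. The host names below (zk1.example.org and so on) are hypothetical placeholders, and 2181 is ZooKeeper's default client port; adjust both for your environment. The helper only builds the connection string. The start command is shown as a comment because it needs a running ZooKeeper.

```shell
#!/bin/sh
# Build a ZooKeeper connection string ("host:port,host:port,...") from a port
# and a list of host names. The host names used below are placeholders.
zk_connect_string() {
    port="$1"; shift
    result=""
    for host in "$@"; do
        if [ -z "$result" ]; then
            result="${host}:${port}"
        else
            result="${result},${host}:${port}"
        fi
    done
    printf '%s\n' "$result"
}

# Single-node ZooKeeper running on the same machine as Solr.
ZK_SINGLE=$(zk_connect_string 2181 localhost)

# Hypothetical three-node ZooKeeper cluster.
ZK_CLUSTER=$(zk_connect_string 2181 zk1.example.org zk2.example.org zk3.example.org)

echo "single:  $ZK_SINGLE"
echo "cluster: $ZK_CLUSTER"

# To start Solr against an external ZooKeeper, run this from the bin
# directory of the Solr installation:
#   ./solr start -cloud -z "$ZK_SINGLE"
```

Expanding the setup later means adding hosts to the connection string and starting additional Solr nodes with the same -z value.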

Installation

Single Node Cluster with Embedded ZooKeeper

  • Download the Solr 8.4.1 ZIP or TAR package.
  • Unzip it to any directory, for example, /opt/solr-8.4.1. We will call this directory SOLR_HOME throughout the rest of this documentation.
  • Go to SOLR_HOME/bin and run solr.cmd on Windows or solr on Unix or Mac.
  • Pass the start and -cloud parameters to start Solr in cloud mode.
    # Unix or Mac
    ./solr start -cloud
    
    # Windows
    solr.cmd start -cloud
        
  • Check that the Solr server started in cloud mode. Depending on where you installed the software, there are two ways to verify this:
    • Open the Solr Admin UI in a browser (with the default port, http://localhost:8983/solr/ on the machine running Solr):


      You should see a page similar to this.

      Check that the Cloud menu item is available on the left. Click it. You should see a page similar to this.

    • If you have trouble accessing the Admin UI, run the following from SOLR_HOME/bin to verify that the server started as expected:
      $ ./solr status
      
      Found 1 Solr nodes:
      
      Solr process 55672 running on port 8983
      {
      "solr_home":"/opt/solr-8.4.1/server/solr",
      "version":"8.4.1 832bf13dd9187095831caf69783179d41059d013 - ishan - 2020-01-10 13:40:28",
      "startTime":"2020-04-11T01:48:15.071Z",
      "uptime":"0 days, 0 hours, 2 minutes, 23 seconds",
      "memory":"91.7 MB (%17.9) of 512 MB",
      "cloud":{
      "ZooKeeper":"localhost:9983",
      "liveNodes":"1",
      "collections":"0"}}
              
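
The status check above can also be scripted. The following is a minimal sketch that assumes the output format shown in the example, where a "cloud" section appears when Solr runs in cloud mode. The function names are illustrative, not part of Solr or the PDS tools.

```shell
#!/bin/sh
# Return 0 if the text on stdin (the output of "./solr status") contains a
# "cloud" section, as in the sample output above.
status_is_cloud() {
    grep -q '"cloud"'
}

# Print the liveNodes count found in the same output.
status_live_nodes() {
    sed -n 's/.*"liveNodes":"\([0-9][0-9]*\)".*/\1/p'
}

# Usage from SOLR_HOME/bin. The guard makes the script a no-op elsewhere.
if [ -x ./solr ]; then
    STATUS_OUTPUT=$(./solr status)
    if printf '%s\n' "$STATUS_OUTPUT" | status_is_cloud; then
        NODES=$(printf '%s\n' "$STATUS_OUTPUT" | status_live_nodes)
        echo "Solr is running in cloud mode, live nodes: $NODES"
    else
        echo "Solr is not running in cloud mode"
    fi
fi
```

Parsing text with grep and sed keeps the sketch dependency-free; it will need adjusting if the status output format differs in your Solr version.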

Using Docker

You can run Solr in Docker or Podman. Use the official Solr image from Docker Hub (solr:8.4). Map both the ZooKeeper (9983) and Solr (8983) ports. You need access to ZooKeeper to upload the PDS Registry collection configset. Remember to pass the -cloud parameter to start Solr in cloud mode.

The following example is for CentOS 8 / RHEL 8, which usually have Podman preinstalled (it is available in AppStream). If your system has Docker, use the docker command with the same parameters.

podman run -p 8983:8983 -p 9983:9983 -d --name solr -t solr:8.4 -cloud
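
Because podman and docker take the same parameters here, a small wrapper can pick whichever is installed. This is a sketch only; the function names are illustrative, and the script prints the command instead of running it so that it can be reviewed first.

```shell
#!/bin/sh
# Print the first container runtime found on PATH (podman preferred).
# Returns non-zero if neither is installed.
container_runtime() {
    for candidate in podman docker; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Build the run command described above: both ports mapped, cloud mode on.
solr_run_command() {
    printf '%s run -p 8983:8983 -p 9983:9983 -d --name solr -t solr:8.4 -cloud\n' "$1"
}

if RUNTIME=$(container_runtime); then
    solr_run_command "$RUNTIME"
else
    echo "Neither podman nor docker was found on PATH" >&2
fi
```

To start the container directly, pass the printed line to the shell, or replace the printf in solr_run_command with the command itself.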

Security

We are currently defining a more comprehensive security component for the Registry. In the meantime, we recommend installing this service on servers that allow internal access only.

Set Up HTTP Proxy for Solr Admin UI

To view the Solr Admin UI on a remote server, ask your system administrator to set up a proxy similar to the following (Apache HTTP Server syntax):

ProxyPass /solr/ http://localhost:8983/solr/
ProxyPassReverse /solr/ http://localhost:8983/solr/
ProxyPass /solr http://localhost:8983/solr
ProxyPassReverse /solr http://localhost:8983/solr
<Location /solr>
  <Limit PUT POST DELETE>
    Order Deny,Allow
    Deny from All
    Allow from 127.0.0.1
  </Limit>
</Location>
  

Next Steps

Your Solr server is ready. You should now deploy the tools to harvest metadata and manage your registry.

Having trouble?

If you are having trouble getting Solr to start up or accessing the Admin UI, here are a few things to check:

  • Make sure port 8983 is open or a proxy is set up - this is necessary to reach Solr from another machine.
  • Try replacing localhost with the hostname of your machine - depending on your server setup, localhost may not be accessible.
  • Check whether anything else is running on that port - make sure you don't have another Solr instance or a Docker container running. If you do, shut it down or remove the Docker container.
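
The first two checks can be sketched as a short script that probes Solr over HTTP, first on localhost and then on the machine's host name. It assumes curl is installed and uses Solr's system information endpoint (/solr/admin/info/system); the function names are illustrative.

```shell
#!/bin/sh
# Build the Solr system-info URL for a host and port.
solr_info_url() {
    printf 'http://%s:%s/solr/admin/info/system?wt=json\n' "$1" "$2"
}

# Return 0 if Solr answers on the given host and port within 5 seconds.
solr_responds() {
    curl --silent --fail --max-time 5 --output /dev/null "$(solr_info_url "$1" "$2")"
}

# Try localhost first, then this machine's host name. Port defaults to 8983.
check_solr() {
    port="${1:-8983}"
    for host in localhost "$(hostname)"; do
        if solr_responds "$host" "$port"; then
            echo "Solr responded at $host:$port"
            return 0
        fi
        echo "No response from $host:$port"
    done
    return 1
}

# If lsof is installed, "lsof -i :8983" lists the process using the port.
```

After loading the functions into your shell, run check_solr (or check_solr followed by a different port number). A response on the host name but not on localhost, or the reverse, points to a network binding or proxy issue rather than a Solr startup failure.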