Registry Collection Overview
PDS Registry stores its data in an Apache Solr collection called "registry". Registry Manager comes with a default "configset" consisting of two files, managed-schema and solrconfig.xml, located in the REGISTRY_MANAGER_HOME/solr/collections/registry folder. REGISTRY_MANAGER_HOME is the directory where you installed Registry Manager, for example /home/pds/registry.
The default managed-schema file defines a few common fields, such as lid, vid, lidvid, title, product_class, internal references, and basic file information (file name, type, size, and MD5 hash). These are the fields that Harvest extracts from PDS4 product labels by default.
The lidvid field is the primary key. If you load the same Harvest-generated "intermediate" data file multiple times, existing Solr documents are replaced by the new documents with the same lidvid.
<field name="lidvid" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<uniqueKey>lidvid</uniqueKey>
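Because lidvid is the unique key, a query on a single lidvid should match at most one document, no matter how many times the data file was loaded. The following is a minimal sketch of such a check using the standard Solr select handler. The Solr address, the SOLR_URL variable, and the function names are assumptions made for illustration; they are not part of Registry Manager.

```shell
# Assumed Solr location; adjust to your installation.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Build a Solr query that matches exactly one lidvid.
# The value is quoted because lidvids contain colons, which are
# special characters in the Solr query syntax.
lidvid_query() {
    printf 'lidvid:"%s"' "$1"
}

# Ask Solr for the number of documents with the given lidvid.
# rows=0 returns only the count (numFound), not the documents.
# curl --data-urlencode takes care of URL-encoding the query.
count_lidvid() {
    curl -s --get "$SOLR_URL/registry/select" \
        --data-urlencode "q=$(lidvid_query "$1")" \
        --data-urlencode "rows=0"
}
```

For example, count_lidvid 'urn:nasa:pds:example_bundle::1.0' (a made-up lidvid) prints a JSON response; after any number of reloads of the same file, the numFound value in it should be 1.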
When you load data, unknown (undefined) fields are ignored:
<dynamicField name="*" type="ignored" />
<fieldType name="ignored" stored="false" indexed="false" multiValued="true" class="solr.StrField" />
Adding More Fields
Detailed information about Solr fields and schema design is available on the Solr website.
You can define new fields in the managed-schema file of the registry configset. The following XML fragment adds the start_date_time and stop_date_time fields to the registry collection.
<field name="start_date_time" type="pdate" indexed="true" stored="true" multiValued="false"/>
<field name="stop_date_time" type="pdate" indexed="true" stored="true" multiValued="false"/>
To apply the changes, you have to delete the registry collection and all of its data!
registry-manager delete-registry
Then recreate the collection:
registry-manager create-registry
If you copied the default configset from REGISTRY_MANAGER_HOME/solr/collections/registry to another directory, for example /tmp/reg, pass the -configDir flag to Registry Manager.
registry-manager create-registry -configDir /tmp/reg
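The copy step mentioned above can be sketched as a small shell function. This is only an illustration: the function name copy_configset is hypothetical, and the sketch assumes the configset files sit directly in the solr/collections/registry folder, as described earlier.

```shell
# Copy the default configset to a working directory so that it can be
# edited without touching the installed files.
#   $1: Registry Manager home directory (REGISTRY_MANAGER_HOME)
#   $2: destination directory for the copy
copy_configset() {
    src="$1/solr/collections/registry"
    dst="$2"
    # Fail early if the default configset is not where we expect it.
    [ -d "$src" ] || { echo "configset not found: $src" >&2; return 1; }
    # "src/." copies the contents of the folder, not the folder itself.
    mkdir -p "$dst" && cp -R "$src/." "$dst/"
}
```

With the example paths used in this section, the call would be copy_configset /home/pds/registry /tmp/reg. After editing the copied managed-schema file, run registry-manager delete-registry followed by registry-manager create-registry -configDir /tmp/reg.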
You can also add and delete fields dynamically by calling the Solr Schema API. For example,
curl http://localhost:8983/solr/registry/schema -X POST \
  -H 'content-type:application/json' \
  --data-binary '{
    "add-field": {
      "name": "start_date_time",
      "type": "pdate",
      "indexed": true,
      "stored": true,
      "multiValued": false
    }
  }'
When you add fields dynamically, you keep your existing data, but documents loaded earlier will not have the new fields.
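The example above shows only the add case. The sketch below covers the other two operations this section touches on: deleting a field with the Schema API "delete-field" command, and counting the older documents that have no value for a newly added field. The Solr address, the SOLR_URL variable, and all function names are assumptions made for illustration.

```shell
# Assumed Solr location; adjust to your installation.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Build the Schema API payload that removes one field.
delete_field_payload() {
    printf '{ "delete-field": { "name": "%s" } }' "$1"
}

# Build a query matching documents that have no value for the field.
# "*:*" selects every document; the negated open-ended range then
# excludes the documents that do have a value.
missing_field_query() {
    printf '*:* -%s:[* TO *]' "$1"
}

# Remove a field from the schema of the registry collection.
delete_field() {
    curl -s "$SOLR_URL/registry/schema" -X POST \
        -H 'Content-type: application/json' \
        --data-binary "$(delete_field_payload "$1")"
}

# Count documents without a value for the field (see numFound in the
# JSON response); rows=0 suppresses the documents themselves.
count_missing() {
    curl -s --get "$SOLR_URL/registry/select" \
        --data-urlencode "q=$(missing_field_query "$1")" \
        --data-urlencode "rows=0"
}
```

For example, count_missing start_date_time reports how many documents were loaded before the field was added, and delete_field start_date_time removes the field definition again.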
Editing the managed-schema file is the recommended approach, because it simplifies deployment and keeps a record of your changes.