Supplementer

Overview

Supplementer is a command-line tool for loading supplemental metadata into the PDS Registry.

To see a list of Supplementer commands, run supplementer without any parameters.

To print command-specific usage information, pass the -help parameter after any command. For example, to print usage information for the load-labels command, run:

supplementer load-labels -help 
Usage: supplementer load-labels <options>

Load supplemental data from Product_Metadata_Supplemental labels into registry index

Required parameters:
  -file <path>     Either Product_Metadata_Supplemental label file (.xml)
                   or a text file (.txt) with the list of label files
                   (one file path per line).

Optional parameters:
  -auth <file>     Authentication config file
  -es <url>        Elasticsearch URL. Default is http://localhost:9200
  -index <name>    Elasticsearch index name. Default is 'registry'

Supplementer Commands

Load Labels

To load supplemental data from Product_Metadata_Supplemental labels into the Elasticsearch registry index, run the load-labels command.

Usage: supplementer load-labels <options>

Load supplemental data from Product_Metadata_Supplemental labels into registry index

Required parameters:
  -file <path>     Either Product_Metadata_Supplemental label file (.xml)
                   or a text file (.txt) with the list of label files
                   (one file path per line).

Optional parameters:
  -auth <file>     Authentication config file
  -es <url>        Elasticsearch URL. Default is http://localhost:9200
  -index <name>    Elasticsearch index name. Default is 'registry'

Example 1: Load supplemental data from a Product_Metadata_Supplemental label into the local registry (Elasticsearch)

supplementer load-labels -file /home/user1/data/prod_supplemental_1.xml

Example 2: Load supplemental data from a list of Product_Metadata_Supplemental labels generated by Harvest

Harvest writes a supplemental.txt file to its output directory. The file lists Product_Metadata_Supplemental label files, one path per line, and its content might look like this:

/home/user1/data/my_supplemental_label.xml
/home/user1/data/another_supplemental_product.xml

To load this list into the local registry, run the following command:

supplementer load-labels -file /tmp/harvest/out/supplemental.txt

Elasticsearch Field Names

Fields extracted from supplemental data tables are named according to the following convention: ops:Supplemental/<Elasticsearch data type>:<data table field name>. Data table field names are converted to lower case, and spaces are replaced with underscores. For example:

ops:Supplemental/keyword:ir_sampling_mode_id
ops:Supplemental/double:center_latitude
ops:Supplemental/double:declination
ops:Supplemental/integer:swath_width

PDS to Elasticsearch Data Type Mapping

Supplemental data tables use PDS data types, such as ASCII_String, ASCII_Date_Time_YMD, and ASCII_Real. These are mapped to Elasticsearch data types, such as keyword, date, and double. Default mappings are stored in the $SUPPLEMENTER_HOME/elastic/data-dic-types.cfg configuration file. You can edit this file to add more mappings or to change the default ones.

The file has the following format:

<PDS data type> = <Elasticsearch data type>

For example,

pds.ASCII_Integer = integer
pds.ASCII_Boolean = boolean
pds.ASCII_Real = double
pds.ASCII_Short_String_Collapsed = keyword
...
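
To check which Elasticsearch type a given PDS type resolves to, the mapping lines can be read with a few lines of Python. This is a minimal sketch that assumes only the "<PDS data type> = <Elasticsearch data type>" format shown above. The function name parse_type_mappings is hypothetical, and the sketch is not how Supplementer itself reads the file. Lines without an equals sign are skipped, because the file's comment syntax is not described here.

```python
def parse_type_mappings(text: str) -> dict:
    """Parse '<PDS data type> = <Elasticsearch data type>' lines into a dict."""
    mappings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and anything that is not a 'key = value' pair.
        if not line or "=" not in line:
            continue
        pds_type, es_type = line.split("=", 1)
        mappings[pds_type.strip()] = es_type.strip()
    return mappings


# Sample content copied from the example mappings above.
sample = """\
pds.ASCII_Integer = integer
pds.ASCII_Boolean = boolean
pds.ASCII_Real = double
pds.ASCII_Short_String_Collapsed = keyword
"""

mappings = parse_type_mappings(sample)
print(mappings["pds.ASCII_Real"])
```

For the sample content, the lookup prints double. To inspect a real installation, the text of data-dic-types.cfg can be passed in instead of the sample string.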

Limitations

Arrays / multi-valued fields are not supported.

For example, the supplemental data file for Cassini VIMS cubes has the following field definition:

<Field_Character>
  <name>SC_PLANET_POSITION_VECTOR</name>
  <field_number>44</field_number>
  <field_location unit="byte">738</field_location>
  <data_type>ASCII_Real</data_type>
  <field_length unit="byte">44</field_length>
  <description>
    3-valued array. X,Y,Z components of the position
    vector from spacecraft to primary planet center,
    corrected for light-travel time and stellar
    aberration.
  </description>
</Field_Character>

The corresponding data contains a 3-valued array:

817081698.271,-147434490.181,-90288017.7382

To load this data into Elasticsearch, use the ASCII_String data type instead of ASCII_Real. The 3-valued array is then stored as a single string value.
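
Because the array is stored as one string, a client that reads the field back has to split it into numbers itself. The following Python sketch shows one way to do that, assuming the stored value keeps the comma-separated form shown above. The helper parse_vector is hypothetical and not part of Supplementer.

```python
def parse_vector(raw: str) -> list:
    """Split a comma-separated string into a list of floats."""
    # Each component is stripped so stray spaces around commas are tolerated.
    return [float(part.strip()) for part in raw.split(",")]


# Value copied from the Cassini VIMS example above.
vector = parse_vector("817081698.271,-147434490.181,-90288017.7382")
print(len(vector))
```

For the example value, the resulting list has three components (X, Y, Z).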