Class JulieXMLTools


  • public class JulieXMLTools
    extends Object
    Utility class offering convenience methods.
    Author:
    faessler
    • Constructor Detail

      • JulieXMLTools

        public JulieXMLTools()
    • Method Detail

      • constructRowIterator

        public static Iterator<Map<String,​Object>> constructRowIterator​(String fileName,
                                                                              int bufferSize,
                                                                              String forEachXpath,
                                                                              List<Map<String,​String>> fields,
                                                                              boolean largeFileSize)
        Convenience method for quick construction of a row iterator over an XML document.

        The fileName determines the location of the XML file to return data records from. For more detailed information see constructRowIterator(VTDNav, String, List, String).

        Parameters:
        fileName - XML file to return data rows from.
        bufferSize - Size of buffers while reading the file at fileName.
        forEachXpath - An XPath expression determining the XML elements to retrieve data records from.
        fields - List of attribute-value pairs determining the record fields returned by the iterator.
        Returns:
        An iterator over all rows extracted from the XML document pointed to by fileName.
      • constructRowIterator

        public static Iterator<Map<String,​Object>> constructRowIterator​(byte[] data,
                                                                              int bufferSize,
                                                                              String forEachXpath,
                                                                              List<Map<String,​String>> fields,
                                                                              String identifier)
        Convenience method for quick construction of a row iterator over an XML document. data contains the XML data to return data records from. For more detailed information see constructRowIterator(VTDNav, String, List, String).
        Parameters:
        data - Byte array containing an XML document.
        bufferSize - Size of the buffers used while reading the XML data.
        forEachXpath - An XPath expression determining the XML elements to retrieve data records from.
        fields - List of attribute-value pairs determining the record fields returned by the iterator.
        identifier - A string identifying the XML document in data, needed for error messages.
        Returns:
        An iterator over all rows extracted from the XML document contained in data.
      • constructRowIterator

        public static Iterator<Map<String,​Object>> constructRowIterator​(com.ximpleware.VTDNav vn,
                                                                              String forEachXpath,
                                                                              List<Map<String,​String>> fields,
                                                                              String identifier)

        The VTDNav vn is a VTD navigator over the XML document to return data records from. For each element matched by the forEachXpath expression, one data row is created.

        Such a row consists of the fields given by the list fields. The list contains Maps of attribute-value pairs. Every field requires a JulieXMLConstants.XPATH attribute, which specifies the XPath pointing to the information to retrieve from the XML document. Likewise, a JulieXMLConstants.NAME attribute is required; it determines the name of the field in the resulting row that holds the information retrieved via the field's JulieXMLConstants.XPATH attribute.

        Example:

        A field with the following attribute-value pairs

        <field name="pmid" xpath="/MedlineCitationSet/MedlineCitation/PMID" >

        will create one field named "pmid" in each returned data row; its value will be the character data at the XPath "/MedlineCitationSet/MedlineCitation/PMID".

        Parameters:
        vn - The VTDNav object which navigates over the XML document to retrieve records from.
        forEachXpath - An XPath expression determining the XML elements for each of which one row should be created.
        fields - The fields to be returned with each data row.
        identifier - A string identifying the XML document, needed for error messages.
        Returns:
        An iterator over all rows extracted from the XML document navigated by vn.
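
        The row semantics described above (one row per forEachXpath match, one entry per field) can be illustrated with a minimal sketch. It deliberately uses the JDK's javax.xml.xpath API instead of VTD-XML so that it runs without extra dependencies, and it returns a List rather than an Iterator. The class RowExtractionSketch, the method extractRows, and the key names "name" and "xpath" (standing in for JulieXMLConstants.NAME and JulieXMLConstants.XPATH) are assumptions made for illustration, as is the choice to evaluate each field XPath relative to the matched element. It is not the library's implementation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

final class RowExtractionSketch {
    // Assumed key names, standing in for JulieXMLConstants.NAME and JulieXMLConstants.XPATH.
    static final String NAME = "name";
    static final String XPATH = "xpath";

    /** Builds one row per node matched by forEachXpath; each field XPath is evaluated relative to that node. */
    static List<Map<String, Object>> extractRows(byte[] xml, String forEachXpath,
                                                 List<Map<String, String>> fields) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList records = (NodeList) xp.evaluate(forEachXpath, doc, XPathConstants.NODESET);
            List<Map<String, Object>> rows = new ArrayList<>();
            for (int i = 0; i < records.getLength(); i++) {
                Map<String, Object> row = new LinkedHashMap<>();
                for (Map<String, String> field : fields) {
                    // The field name becomes the row key, the XPath result its value.
                    row.put(field.get(NAME), xp.evaluate(field.get(XPATH), records.item(i)));
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException("Could not extract rows", e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = ("<MedlineCitationSet>"
                + "<MedlineCitation><PMID>1</PMID></MedlineCitation>"
                + "<MedlineCitation><PMID>2</PMID></MedlineCitation>"
                + "</MedlineCitationSet>").getBytes(StandardCharsets.UTF_8);
        List<Map<String, String>> fields = List.of(Map.of(NAME, "pmid", XPATH, "PMID"));
        System.out.println(extractRows(xml, "/MedlineCitationSet/MedlineCitation", fields));
    }
}
```

        In this sketch the field XPath is relative ("PMID") because the JDK evaluates an absolute path from the document root regardless of the context node; how constructRowIterator resolves absolute field paths is not stated here.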
      • declareNamespaces

        public static void declareNamespaces​(com.ximpleware.AutoPilot ap,
                                             Map<String,​String> namespaceMap)
        Declares the given namespaces to the passed auto pilot. The namespaceMap can automatically be derived from an XML document by calling buildNamespaceMap(VTDNav).
        Parameters:
        ap - The AutoPilot to declare the namespaces to.
        namespaceMap - A map from namespace prefixes to their URIs.
      • buildNamespaceMap

        public static Map<String,​String> buildNamespaceMap​(com.ximpleware.VTDNav vn)
                                                          throws com.ximpleware.VTDException
        Reads the namespace axis of the XML document associated with vn and returns a map from namespace prefixes to their URIs. This map can be passed to declareNamespaces(AutoPilot, Map) to declare all namespaces of the document to an AutoPilot.
        Parameters:
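        vn - The VTDNav object navigating the XML document to read the namespaces from.
        Returns:
        A map from namespace prefixes to their URIs.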
        Throws:
        com.ximpleware.VTDException
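
        To make the shape of the returned prefix-to-URI map concrete, here is a minimal sketch that collects namespace declarations with the JDK's StAX reader instead of VTD-XML. The class NamespaceMapSketch and the method collectNamespaces are hypothetical names. Storing the default namespace under the empty string and keeping the first URI seen for a prefix are choices of this sketch, not documented behavior of buildNamespaceMap.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class NamespaceMapSketch {
    /** Collects all namespace declarations of a document into a prefix-to-URI map. */
    static Map<String, String> collectNamespaces(byte[] xml) {
        Map<String, String> namespaces = new LinkedHashMap<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    // Namespace declarations are reported on the element that carries them.
                    for (int i = 0; i < reader.getNamespaceCount(); i++) {
                        String prefix = reader.getNamespacePrefix(i);
                        // Sketch choice: default namespace under "", first declaration wins.
                        namespaces.putIfAbsent(prefix == null ? "" : prefix,
                                reader.getNamespaceURI(i));
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read namespaces", e);
        }
        return namespaces;
    }

    public static void main(String[] args) {
        byte[] xml = "<root xmlns:a=\"urn:a\"><a:child xmlns:b=\"urn:b\"/></root>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(collectNamespaces(xml));
    }
}
```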
      • readStream

        public static byte[] readStream​(InputStream is,
                                        int bufferSize)
                                 throws IOException
        Reads an InputStream buffer-wise, concatenates all buffers and returns a single byte[] whose length exactly matches the amount of data read.
        Parameters:
        is - InputStream to read.
        bufferSize - Maximum number of bytes to read with a single is.read() call.
        Returns:
        A byte[] containing all the data of the InputStream.
        Throws:
        IOException
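
        The buffer-wise reading described above follows a common JDK pattern, sketched below. ReadStreamSketch and readAll are hypothetical names, and collecting the chunks in a ByteArrayOutputStream is an assumption about one reasonable way to do it, not necessarily how readStream is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ReadStreamSketch {
    /** Reads the stream in chunks of at most bufferSize bytes and returns all bytes read. */
    static byte[] readAll(InputStream is, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n;
        // read() returns the number of bytes actually read, or -1 at end of stream.
        while ((n = is.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        // toByteArray() returns an array of exactly the number of bytes written.
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] input = "<doc>hello</doc>".getBytes(StandardCharsets.UTF_8);
        byte[] result = readAll(new ByteArrayInputStream(input), 4);
        System.out.println(new String(result, StandardCharsets.UTF_8));
    }
}
```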
      • gzipData

        public static byte[] gzipData​(byte[] data)
      • getSolrServerURL

        public static URL getSolrServerURL​(String urlStr,
                                           boolean calledByCLI,
                                           org.slf4j.Logger LOG)
      • getElementText

        public static String getElementText​(com.ximpleware.VTDNav vn)
                                     throws com.ximpleware.NavException
        Throws:
        com.ximpleware.NavException
      • getFragment

        public static String getFragment​(com.ximpleware.VTDNav vn,
                                         int fragmentType,
                                         boolean returnRawString)
                                  throws com.ximpleware.NavException
        Returns the fragment of XML, where vn currently points to, as a string.
        Parameters:
        vn - The XML navigator.
        fragmentType - Either ELEMENT_FRAGMENT or CONTENT_FRAGMENT. Determines which corresponding method on vn is called. The former returns the whole element, including its start and end tags; the latter omits the element's tags and returns only its enclosed content.
        returnRawString - Whether to return a raw string, i.e. the pure XML fragment without resolving XML entities, or a "readable" string which then possibly cannot be used for further XML parsing.
        Returns:
        The XML fragment of the current element vn points to.
        Throws:
        com.ximpleware.NavException
      • setElementText

        public static int setElementText​(com.ximpleware.VTDNav vn,
                                         com.ximpleware.AutoPilot ap,
                                         com.ximpleware.XMLModifier xm,
                                         String xpath,
                                         String text)
                                  throws com.ximpleware.VTDException,
                                         UnsupportedEncodingException
        Sets the text content of an XML element pointed to by xpath to text.

        The cursor of vn is moved to the element determined by xpath.

        Parameters:
        vn - VTDNav object navigating the XML document to modify.
        ap - AutoPilot object bound to vn.
        xm - XMLModifier object bound to vn.
        xpath - An XPath expression pointing to the XML element whose text should be set.
        text - The text which is to be set to the XML element pointed to by xpath.
        Returns:
        The VTD index of the changed element, or -1 if no element was changed.
        Throws:
        com.ximpleware.VTDException - If navigation or modification of the XML document fails.
        UnsupportedEncodingException
      • expandArrayEntries

        public static <T> String[] expandArrayEntries​(T[] array,
                                                      String fmtStr)
      • expandArrayEntries

        public static <T> String[] expandArrayEntries​(List<T> list,
                                                      String fmtStr)
      • expandArrayEntries

        public static <T> String[] expandArrayEntries​(T[] array,
                                                      String[] fmtStrs)
      • getXpathValue

        public static String getXpathValue​(String xpath,
                                           com.ximpleware.AutoPilot ap)
                                    throws com.ximpleware.XPathParseException
        Throws:
        com.ximpleware.XPathParseException
      • getXpathValue

        public static String getXpathValue​(String xpath,
                                           com.ximpleware.VTDNav vn)
                                    throws com.ximpleware.XPathParseException
        Throws:
        com.ximpleware.XPathParseException
      • getXpathValue

        public static String getXpathValue​(String xpath,
                                           InputStream is)
                                    throws IOException,
                                           com.ximpleware.XPathParseException,
                                           com.ximpleware.ParseException
        Throws:
        IOException
        com.ximpleware.XPathParseException
        com.ximpleware.ParseException