Class JulieXMLTools
- java.lang.Object
-
- de.julielab.xml.JulieXMLTools
-
public class JulieXMLTools extends java.lang.Object
Utility class offering convenience methods.
- Author:
- faessler
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
static int CONTENT_FRAGMENT
static int ELEMENT_FRAGMENT
-
Constructor Summary
Constructors (Constructor, Description):
JulieXMLTools()
-
Method Summary
Methods (Modifier and Type, Method, Description):
static java.util.Map<java.lang.String,java.lang.String> buildNamespaceMap(com.ximpleware.VTDNav vn)
  Reads the namespace axis of the XML document associated with vn and returns a map connecting the namespace prefixes with their URI.
static java.util.Iterator<java.util.Map<java.lang.String,java.lang.Object>> constructRowIterator(byte[] data, int bufferSize, java.lang.String forEachXpath, java.util.List<java.util.Map<java.lang.String,java.lang.String>> fields, java.lang.String identifier)
  Convenience method for quick construction of a row iterator over an XML document.
static java.util.Iterator<java.util.Map<java.lang.String,java.lang.Object>> constructRowIterator(com.ximpleware.VTDNav vn, java.lang.String forEachXpath, java.util.List<java.util.Map<java.lang.String,java.lang.String>> fields, java.lang.String identifier)
  The VTDNav vn is a VTD navigator over the XML file to return data records from.
static java.util.Iterator<java.util.Map<java.lang.String,java.lang.Object>> constructRowIterator(java.lang.String fileName, int bufferSize, java.lang.String forEachXpath, java.util.List<java.util.Map<java.lang.String,java.lang.String>> fields, boolean largeFileSize)
  Convenience method for quick construction of a row iterator over an XML document.
static java.util.Map<java.lang.String,java.lang.String> createField(java.lang.String... configuration)
static void declareNamespaces(com.ximpleware.AutoPilot ap, java.util.Map<java.lang.String,java.lang.String> namespaceMap)
  Declares the given namespaces to the passed auto pilot.
static <T> java.lang.String[] expandArrayEntries(java.util.List<T> list, java.lang.String fmtStr)
static <T> java.lang.String[] expandArrayEntries(T[] array, java.lang.String fmtStr)
static <T> java.lang.String[] expandArrayEntries(T[] array, java.lang.String[] fmtStrs)
static java.lang.String getElementText(com.ximpleware.VTDNav vn)
static java.lang.String getFragment(com.ximpleware.VTDNav vn, int fragmentType, boolean returnRawString)
  Returns the fragment of XML that vn currently points to as a string.
static java.net.URL getSolrServerURL(java.lang.String urlStr, boolean calledByCLI, org.slf4j.Logger LOG)
static com.ximpleware.VTDNav getVTDNav(java.io.InputStream is, int bufferSize)
static java.lang.String getXpathValue(java.lang.String xpath, com.ximpleware.AutoPilot ap)
static java.lang.String getXpathValue(java.lang.String xpath, com.ximpleware.VTDNav vn)
static java.lang.String getXpathValue(java.lang.String xpath, java.io.InputStream is)
static byte[] gzipData(byte[] data)
static byte[] readStream(java.io.InputStream is, int bufferSize)
  Reads an InputStream buffer-wise, concatenates all buffers and returns one byte[] of the exact length of the read data.
static int setElementText(com.ximpleware.VTDNav vn, com.ximpleware.AutoPilot ap, com.ximpleware.XMLModifier xm, java.lang.String xpath, java.lang.String text)
  Sets the text content of an XML element pointed to by xpath to text.
static byte[] unGzipData(byte[] gzipData)
-
-
-
Field Detail
-
ELEMENT_FRAGMENT
public static final int ELEMENT_FRAGMENT
- See Also:
- Constant Field Values
-
CONTENT_FRAGMENT
public static final int CONTENT_FRAGMENT
- See Also:
- Constant Field Values
-
-
Method Detail
-
constructRowIterator
public static java.util.Iterator<java.util.Map<java.lang.String,java.lang.Object>> constructRowIterator(java.lang.String fileName, int bufferSize, java.lang.String forEachXpath, java.util.List<java.util.Map<java.lang.String,java.lang.String>> fields, boolean largeFileSize)
Convenience method for quick construction of a row iterator over an XML document.
The fileName determines the location of the XML file to return data records from. For more detailed information see constructRowIterator(VTDNav, String, List, String).
- Parameters:
fileName - XML file to return data rows from.
bufferSize - Size of buffers while reading the file at fileName.
forEachXpath - An XPath expression determining the XML elements to retrieve data records from.
fields - List of attribute-value pairs determining the record fields returned by the iterator.
- Returns:
- An iterator over all rows extracted from the XML document pointed to by fileName.
-
constructRowIterator
public static java.util.Iterator<java.util.Map<java.lang.String,java.lang.Object>> constructRowIterator(byte[] data, int bufferSize, java.lang.String forEachXpath, java.util.List<java.util.Map<java.lang.String,java.lang.String>> fields, java.lang.String identifier)
Convenience method for quick construction of a row iterator over an XML document.
data contains the XML data to return data records from. For more detailed information see constructRowIterator(VTDNav, String, List, String).
- Parameters:
data - Byte array containing an XML document.
bufferSize - Size of buffers used while reading the XML data.
forEachXpath - An XPath expression determining the XML elements to retrieve data records from.
fields - List of attribute-value pairs determining the record fields returned by the iterator.
identifier - A string identifying the XML document in data, needed for error messages.
- Returns:
- An iterator over all rows extracted from the XML document contained in data.
-
constructRowIterator
public static java.util.Iterator<java.util.Map<java.lang.String,java.lang.Object>> constructRowIterator(com.ximpleware.VTDNav vn, java.lang.String forEachXpath, java.util.List<java.util.Map<java.lang.String,java.lang.String>> fields, java.lang.String identifier)
The VTDNav vn is a VTD navigator over the XML file to return data records from. For each evaluation of the forEachXpath expression, one data row is created.
Such a row consists of the fields given by the list fields. The list contains Maps of attribute-value pairs. All fields are required to have a JulieXMLConstants.XPATH attribute which specifies the XPath pointing to the information in the XML document to retrieve. Likewise, a JulieXMLConstants.NAME attribute is required. This attribute determines the name of the field in the resulting row containing the information retrieved by the field's Constants.XPATH attribute.
Example:
A field with the following attribute-value pairs
<field name="pmid" xpath="/MedlineCitationSet/MedlineCitation/PMID" >
will create one field named "pmid" in each returned data row, and its value will be the character data at the XPath "/MedlineCitationSet/MedlineCitation/PMID".
- Parameters:
vn - The VTDNav object which navigates over the XML document to retrieve records from.
forEachXpath - An XPath expression determining the XML elements for each of which one row should be created.
fields - The fields to be returned with each data row.
- Returns:
- An iterator over all rows extracted from the XML document navigated by vn.
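The row-building idea described for this method (one row per match of the for-each XPath, one entry per field map) can be sketched with the JDK's own javax.xml.xpath instead of VTD-XML. This is not the library's implementation. The class RowIteratorSketch, the keys "name" and "xpath" (stand-ins for JulieXMLConstants.NAME and JulieXMLConstants.XPATH), and the choice to evaluate each field XPath relative to the matched record element are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class RowIteratorSketch {

    // Hypothetical stand-ins for JulieXMLConstants.NAME and JulieXMLConstants.XPATH.
    static final String NAME = "name";
    static final String XPATH = "xpath";

    /**
     * Builds one row per node matched by forEachXpath. Each field's XPath is
     * evaluated with the matched node as context (an assumption of this sketch).
     */
    static List<Map<String, Object>> rows(byte[] xml, String forEachXpath,
                                          List<Map<String, String>> fields) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList records = (NodeList) xp.evaluate(forEachXpath, doc, XPathConstants.NODESET);
            List<Map<String, Object>> rows = new ArrayList<>();
            for (int i = 0; i < records.getLength(); i++) {
                Node record = records.item(i);
                Map<String, Object> row = new LinkedHashMap<>();
                for (Map<String, String> field : fields) {
                    // The field name becomes the key, the XPath's string value the value.
                    row.put(field.get(NAME), xp.evaluate(field.get(XPATH), record));
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not extract rows", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<MedlineCitationSet>"
                + "<MedlineCitation><PMID>1</PMID></MedlineCitation>"
                + "<MedlineCitation><PMID>2</PMID></MedlineCitation>"
                + "</MedlineCitationSet>";
        Map<String, String> pmidField = new LinkedHashMap<>();
        pmidField.put(NAME, "pmid");
        pmidField.put(XPATH, "PMID"); // relative to each MedlineCitation element
        List<Map<String, String>> fields = new ArrayList<>();
        fields.add(pmidField);
        for (Map<String, Object> row : rows(xml.getBytes(StandardCharsets.UTF_8),
                "/MedlineCitationSet/MedlineCitation", fields)) {
            System.out.println(row);
        }
    }
}
```

The sketch collects all rows eagerly into a list, whereas the documented method returns an Iterator. A lazy variant would move the loop body into Iterator.next().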
-
declareNamespaces
public static void declareNamespaces(com.ximpleware.AutoPilot ap, java.util.Map<java.lang.String,java.lang.String> namespaceMap)
Declares the given namespaces to the passed auto pilot. The namespaceMap can automatically be derived from an XML document by calling buildNamespaceMap(VTDNav).
- Parameters:
ap -
namespaceMap -
-
buildNamespaceMap
public static java.util.Map<java.lang.String,java.lang.String> buildNamespaceMap(com.ximpleware.VTDNav vn) throws com.ximpleware.VTDException
Reads the namespace axis of the XML document associated with vn and returns a map connecting the namespace prefixes with their URI. This map can be passed to declareNamespaces(AutoPilot, Map) to declare all the namespaces of the document to an AutoPilot.
- Parameters:
vn -
- Returns:
- Throws:
com.ximpleware.VTDException
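The interplay of building a prefix-to-URI map and then declaring it to an XPath evaluator can be illustrated with JDK classes only. The sketch below is not how JulieXMLTools works internally (the documented method reads the namespace axis through VTD-XML). Instead it scans xmlns:prefix attributes in a DOM tree and registers the result on a javax.xml.xpath.XPath via a NamespaceContext. The class name NamespaceMapSketch and all of its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class NamespaceMapSketch {

    /** Parses XML bytes into a namespace-aware DOM document. */
    static Document parse(byte[] xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Collects prefix -> URI for every xmlns:prefix declaration; the first declaration of a prefix wins. */
    static Map<String, String> buildNamespaceMap(Document doc) {
        Map<String, String> map = new LinkedHashMap<>();
        NodeList elements = doc.getElementsByTagName("*"); // "*" matches all elements
        for (int i = 0; i < elements.getLength(); i++) {
            NamedNodeMap attrs = elements.item(i).getAttributes();
            for (int j = 0; j < attrs.getLength(); j++) {
                Node attr = attrs.item(j);
                String name = attr.getNodeName();
                if (name.startsWith("xmlns:")) {
                    map.putIfAbsent(name.substring("xmlns:".length()), attr.getNodeValue());
                }
            }
        }
        return map;
    }

    /** Creates an XPath evaluator that resolves prefixes through the given map. */
    static XPath xpathWith(Map<String, String> namespaces) {
        XPath xp = XPathFactory.newInstance().newXPath();
        xp.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return namespaces.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
            }

            @Override
            public Iterator<String> getPrefixes(String uri) {
                List<String> prefixes = new ArrayList<>();
                for (Map.Entry<String, String> e : namespaces.entrySet()) {
                    if (e.getValue().equals(uri)) {
                        prefixes.add(e.getKey());
                    }
                }
                return prefixes.iterator();
            }

            @Override
            public String getPrefix(String uri) {
                Iterator<String> it = getPrefixes(uri);
                return it.hasNext() ? it.next() : null;
            }
        });
        return xp;
    }
}
```

With such a map in place, a prefixed expression like /root/a:item can be evaluated without hard-coding the namespace URIs, which is the convenience the two documented methods provide for an AutoPilot.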
-
getVTDNav
public static com.ximpleware.VTDNav getVTDNav(java.io.InputStream is, int bufferSize) throws com.ximpleware.ParseException, FileTooBigException
- Throws:
com.ximpleware.ParseException
FileTooBigException
-
readStream
public static byte[] readStream(java.io.InputStream is, int bufferSize) throws java.io.IOException
Reads an InputStream buffer-wise, concatenates all buffers and returns one byte[] of the exact length of the read data.
- Parameters:
is - InputStream to read.
bufferSize - Maximum number of bytes to read per is.read() call.
- Returns:
- A byte[] containing all the data of the InputStream.
- Throws:
java.io.IOException
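Buffer-wise reading as described for readStream can be sketched with plain JDK streams. The following is an illustration of the technique, not the library's source. The class ReadStreamSketch and the method name readAll are hypothetical, and the rejection of non-positive buffer sizes is an assumption of the sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadStreamSketch {

    /** Reads the stream in chunks of at most bufferSize bytes and returns exactly the bytes read. */
    static byte[] readAll(InputStream is, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            // A zero-length buffer would make read() return 0 forever.
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        int n;
        // read() returns -1 at end of stream and may fill less than the whole buffer.
        while ((n = is.read(buffer)) != -1) {
            out.write(buffer, 0, n); // copy only the bytes actually read
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = readAll(new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5}), 2);
        System.out.println(data.length);
    }
}
```

Writing only the first n bytes of each chunk is what keeps the result at the exact length of the input, even when the last read fills the buffer only partially.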
-
gzipData
public static byte[] gzipData(byte[] data)
-
unGzipData
public static byte[] unGzipData(byte[] gzipData) throws java.io.IOException
- Throws:
java.io.IOException
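Neither gzipData nor unGzipData carries a description here, so the following is only a sketch of what an in-memory GZIP round trip typically looks like with java.util.zip. It assumes the two methods compress and decompress whole byte arrays, as their names suggest. The class GzipSketch and its method names are hypothetical, and unlike the documented gzipData signature, this gzip method declares IOException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipSketch {

    /** Compresses data into GZIP format in memory. */
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so close before reading the bytes.
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(data);
        }
        return bytes.toByteArray();
    }

    /** Decompresses GZIP data in memory. */
    static byte[] gunzip(byte[] gzipped) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            return gz.readAllBytes(); // requires Java 9 or later
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "some XML payload".getBytes(StandardCharsets.UTF_8);
        byte[] restored = gunzip(gzip(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```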
-
getSolrServerURL
public static java.net.URL getSolrServerURL(java.lang.String urlStr, boolean calledByCLI, org.slf4j.Logger LOG)
-
getElementText
public static java.lang.String getElementText(com.ximpleware.VTDNav vn) throws com.ximpleware.NavException
- Throws:
com.ximpleware.NavException
-
getFragment
public static java.lang.String getFragment(com.ximpleware.VTDNav vn, int fragmentType, boolean returnRawString) throws com.ximpleware.NavException
Returns the fragment of XML that vn currently points to as a string.
- Parameters:
vn - The XML navigator.
fragmentType - Either ELEMENT_FRAGMENT or CONTENT_FRAGMENT. Determines which respective method on vn is called. The first returns the whole element, including start and end tag; the latter omits the tags of the element and returns only its enclosed contents.
returnRawString - Whether to return a raw string, i.e. the pure XML fragment without resolving XML entities, or a "readable" string which then possibly cannot be used for further XML parsing.
- Returns:
- The XML fragment of the current element vn points to.
- Throws:
com.ximpleware.NavException
-
createField
public static java.util.Map<java.lang.String,java.lang.String> createField(java.lang.String... configuration)
-
setElementText
public static int setElementText(com.ximpleware.VTDNav vn, com.ximpleware.AutoPilot ap, com.ximpleware.XMLModifier xm, java.lang.String xpath, java.lang.String text) throws com.ximpleware.VTDException, java.io.UnsupportedEncodingException
Sets the text content of an XML element pointed to by xpath to text.
The cursor of vn is moved to the element determined by xpath.
- Parameters:
vn - VTDNav object navigating the XML document to modify.
ap - AutoPilot object bound to vn.
xm - XMLModifier object bound to vn.
xpath - An XPath expression pointing to the XML element whose text should be set.
text - The text which is to be set to the XML element pointed to by xpath.
- Returns:
- The VTD index of the changed element, -1 otherwise.
- Throws:
com.ximpleware.VTDException - If something with navigation or modification of the XML document goes wrong.
java.io.UnsupportedEncodingException
-
expandArrayEntries
public static <T> java.lang.String[] expandArrayEntries(T[] array, java.lang.String fmtStr)
-
expandArrayEntries
public static <T> java.lang.String[] expandArrayEntries(java.util.List<T> list, java.lang.String fmtStr)
-
expandArrayEntries
public static <T> java.lang.String[] expandArrayEntries(T[] array, java.lang.String[] fmtStrs)
-
getXpathValue
public static java.lang.String getXpathValue(java.lang.String xpath, com.ximpleware.AutoPilot ap) throws com.ximpleware.XPathParseException
- Throws:
com.ximpleware.XPathParseException
-
getXpathValue
public static java.lang.String getXpathValue(java.lang.String xpath, com.ximpleware.VTDNav vn) throws com.ximpleware.XPathParseException
- Throws:
com.ximpleware.XPathParseException
-
getXpathValue
public static java.lang.String getXpathValue(java.lang.String xpath, java.io.InputStream is) throws java.io.IOException, com.ximpleware.XPathParseException, com.ximpleware.ParseException
- Throws:
java.io.IOException
com.ximpleware.XPathParseException
com.ximpleware.ParseException
-
-