Package de.unirostock.sems.xmlutils.ds
Class TreeDocument
- java.lang.Object
-
- de.unirostock.sems.xmlutils.ds.TreeDocument
-
public class TreeDocument extends Object
The class TreeDocument represents hierarchically structured content.
- Author:
- Martin Scharm
-
-
Constructor Summary
Constructor: Description
- TreeDocument(TreeDocument td): Instantiates a new tree document as a copy of another tree document.
- TreeDocument(org.jdom2.Document d, Weighter w, URI baseUri): Instantiates a new tree document.
- TreeDocument(org.jdom2.Document d, Weighter w, URI baseUri, boolean ordered): Instantiates a new tree document.
- TreeDocument(org.jdom2.Document d, URI baseUri): Instantiates a new tree document.
- TreeDocument(org.jdom2.Document d, URI baseUri, boolean ordered): Instantiates a new tree document.
-
Method Summary
Modifier and Type, Method: Description
- String dump(): Dump mechanism for debugging purposes.
- boolean equals(Object anObject)
- URI getBaseUri(): Gets the base URI.
- DocumentNode getNodeById(String id): Gets the node by ID.
- TreeNode getNodeByPath(String path): Gets the node by XPath expression.
- List<TreeNode> getNodesByHash(String hash): Gets the nodes by hash.
- List<DocumentNode> getNodesByTag(String tag): Gets the nodes sharing a certain tag name.
- HashMap<String,Integer> getNodeStats(): Gets the node statistics as a map `tag name` => `nodes sharing this tag`.
- int getNumNodes(): Gets the number of nodes in this document.
- Set<String> getOccurringHashes(): Get all known hashes.
- Set<String> getOccurringIds(): Get all known identifiers.
- Set<String> getOccurringTags(): Get all known tag names.
- Set<String> getOccurringXPaths(): Get all known XPaths.
- DocumentNode getRoot(): Gets the root node.
- TreeNode[] getSubtreesBySize(): Gets the subtrees ordered by size, biggest first.
- List<TextNode> getTextNodes(): Gets all text nodes.
- double getTreeWeight(): Gets the tree weight.
- void integrate(TreeNode node, boolean recursively): Integrates a node into this tree.
- void resetAllModifications(): Resets all modifications.
- void resortSubtrees(): Deprecated. We are now using a sorted set, so resorting is no longer necessary.
- void separate(TreeNode node, boolean recursively): Extracts a node from this tree.
- String toString()
- boolean uniqueIds(): Are the occurring IDs unique?
-
-
-
Constructor Detail
-
TreeDocument
public TreeDocument(org.jdom2.Document d, URI baseUri) throws XmlDocumentParseException
Instantiates a new tree document.
- Parameters:
d - the document
baseUri - the base URI (needed to resolve relative imports)
- Throws:
XmlDocumentParseException - the XML document parse exception
-
TreeDocument
public TreeDocument(org.jdom2.Document d, Weighter w, URI baseUri) throws XmlDocumentParseException
Instantiates a new tree document.
- Parameters:
d - the document
w - the weighter to weight the nodes and subtrees
baseUri - the base URI (needed to resolve relative imports)
- Throws:
XmlDocumentParseException - the XML document parse exception
-
TreeDocument
public TreeDocument(org.jdom2.Document d, URI baseUri, boolean ordered) throws XmlDocumentParseException
Instantiates a new tree document.
- Parameters:
d - the document
baseUri - the base URI (needed to resolve relative imports)
ordered - the ordered flag; if true, this tree is considered to be ordered
- Throws:
XmlDocumentParseException - the XML document parse exception
-
TreeDocument
public TreeDocument(org.jdom2.Document d, Weighter w, URI baseUri, boolean ordered) throws XmlDocumentParseException
Instantiates a new tree document.
- Parameters:
d - the document
w - the weighter to weight the nodes and subtrees
baseUri - the base URI (needed to resolve relative imports)
ordered - the ordered flag; if true, this tree is considered to be ordered
- Throws:
XmlDocumentParseException - the XML document parse exception
-
TreeDocument
public TreeDocument(TreeDocument td)
Instantiates a new tree document as a copy of another tree document.
- Parameters:
td - the tree document to copy
-
-
Method Detail
-
resortSubtrees
@Deprecated public void resortSubtrees()
Deprecated. We are now using a sorted set, so resorting is no longer necessary. This method no longer does anything.
Resort subtrees.
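The deprecation note says a sorted set made resorting unnecessary. The following is a minimal sketch of that idea, not the library's implementation: `Subtree`, `BIGGEST_FIRST`, and `labelsBySize` are hypothetical stand-ins, and the tie-breaking rule is an assumption made only so that equally sized entries are not dropped by the set.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class SortedSubtreesSketch {
    /** Hypothetical stand-in for a subtree: a label plus its size. */
    static final class Subtree {
        final String label;
        final int size;

        Subtree(String label, int size) {
            this.label = label;
            this.size = size;
        }
    }

    /** Biggest first; the label breaks ties so equally sized subtrees are both kept. */
    static final Comparator<Subtree> BIGGEST_FIRST = (a, b) -> {
        int bySize = Integer.compare(b.size, a.size);
        return bySize != 0 ? bySize : a.label.compareTo(b.label);
    };

    /** Returns the labels in the iteration order of the sorted set. */
    static List<String> labelsBySize(List<Subtree> subtrees) {
        TreeSet<Subtree> sorted = new TreeSet<>(BIGGEST_FIRST);
        sorted.addAll(subtrees); // each element is placed at its sorted position on insertion
        List<String> labels = new ArrayList<>();
        for (Subtree s : sorted) {
            labels.add(s.label);
        }
        return labels;
    }

    public static void main(String[] args) {
        List<Subtree> input = new ArrayList<>();
        input.add(new Subtree("leaf", 1));
        input.add(new Subtree("model", 5));
        input.add(new Subtree("listOfSpecies", 3));
        System.out.println(labelsBySize(input));
    }
}
```

Because the set keeps its elements ordered as they are added, there is no separate sorting step for a caller to trigger.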
-
integrate
public void integrate(TreeNode node, boolean recursively)
Integrates a node into this tree. This will update the hash, ID, and tag mappers, etc.
- Parameters:
node - the node to integrate
recursively - recursively integrate the node's children
-
separate
public void separate(TreeNode node, boolean recursively)
Extracts a node from this tree. This will delete its hash, ID, XPath, etc. from the corresponding mappers.
- Parameters:
node - the node to separate
recursively - recursively separate the node's children
-
getBaseUri
public URI getBaseUri()
Gets the base URI.
- Returns:
- the base URI
-
uniqueIds
public boolean uniqueIds()
Are the occurring IDs unique?
- Returns:
- true, if all IDs are unique
-
resetAllModifications
public void resetAllModifications()
Resets all modifications.
-
getRoot
public DocumentNode getRoot()
Gets the root node.
- Returns:
- the root node
-
getNumNodes
public int getNumNodes()
Gets the number of nodes in this document.
- Returns:
- the number of nodes
-
getTreeWeight
public double getTreeWeight()
Gets the tree weight (equal to the weight of the root node).
- Returns:
- the tree weight
-
getNodesByTag
public List<DocumentNode> getNodesByTag(String tag)
Gets the nodes sharing a certain tag name. May return null if there is no such tag.
- Parameters:
tag - the tag name to search for
- Returns:
- the nodes sharing this tag name
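Since this lookup may return null, callers need a guard before iterating. The sketch below shows one way to do that under stated assumptions: `TagLookupSketch` is a hypothetical stand-in that maps tag names to plain string labels instead of DocumentNode objects, and `orEmpty` is an illustrative helper, not part of the library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TagLookupSketch {
    // Hypothetical stand-in for a tag mapper: tag name -> labels of nodes with that tag.
    private final Map<String, List<String>> byTag = new HashMap<>();

    void add(String tag, String nodeLabel) {
        byTag.computeIfAbsent(tag, k -> new ArrayList<>()).add(nodeLabel);
    }

    /** Mirrors the documented contract: returns null for an unknown tag. */
    List<String> getNodesByTag(String tag) {
        return byTag.get(tag);
    }

    /** Caller-side guard: turns a null result into an empty list. */
    static List<String> orEmpty(List<String> nodes) {
        return nodes == null ? Collections.<String>emptyList() : nodes;
    }

    public static void main(String[] args) {
        TagLookupSketch index = new TagLookupSketch();
        index.add("species", "s1");
        index.add("species", "s2");
        // Safe to iterate even though "reaction" was never added.
        for (String label : orEmpty(index.getNodesByTag("reaction"))) {
            System.out.println(label);
        }
        System.out.println(orEmpty(index.getNodesByTag("species")).size());
    }
}
```

The same guard applies to any other lookup documented as possibly returning null.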
-
getSubtreesBySize
public TreeNode[] getSubtreesBySize()
Gets the subtrees ordered by size, biggest first.
- Returns:
- the subtrees by size
-
getNodesByHash
public List<TreeNode> getNodesByHash(String hash)
Gets the nodes by hash. May return null if there is no such hash.
- Parameters:
hash - the hash
- Returns:
- the nodes having this hash value
-
getNodeById
public DocumentNode getNodeById(String id)
Gets the node by ID. May return null if there is no such ID or if the IDs in this document aren't unique.
- Parameters:
id - the ID
- Returns:
- the node having this id value
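The description gives two reasons for a null result: an unknown identifier, or identifiers that are not unique within the document. The sketch below is one possible reading of that contract, not the library's code. `IdLookupSketch`, its fields, and the use of string labels in place of DocumentNode objects are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class IdLookupSketch {
    // Hypothetical stand-in for an ID mapper: identifier -> node label.
    private final Map<String, String> byId = new HashMap<>();
    // Identifiers that were registered more than once.
    private final Set<String> duplicates = new HashSet<>();

    void add(String id, String nodeLabel) {
        // putIfAbsent returns the existing label if the id was already present.
        if (byId.putIfAbsent(id, nodeLabel) != null) {
            duplicates.add(id);
        }
    }

    /** True if no identifier was registered twice. */
    boolean uniqueIds() {
        return duplicates.isEmpty();
    }

    /** Returns null for an unknown id, or for any id once duplicates exist. */
    String getNodeById(String id) {
        return uniqueIds() ? byId.get(id) : null;
    }

    public static void main(String[] args) {
        IdLookupSketch ids = new IdLookupSketch();
        ids.add("s1", "species one");
        System.out.println(ids.getNodeById("s1"));
        ids.add("s1", "another node");
        System.out.println(ids.uniqueIds());
        System.out.println(ids.getNodeById("s1"));
    }
}
```

Under this reading, checking uniqueIds() first tells a caller whether a null result can only mean that the identifier is unknown.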
-
getNodeByPath
public TreeNode getNodeByPath(String path)
Gets the node by XPath expression. Currently only XPath expressions computed by us are supported. A common use case is, for example,
docB.getNodeByPath (nodeFromA.getXPath ());
to search for a node at the same path in another document.
- Parameters:
path - the path
- Returns:
- the node by path
-
getOccurringXPaths
public Set<String> getOccurringXPaths()
Get all known XPaths.
- Returns:
- the occurring XPaths
-
getOccurringIds
public Set<String> getOccurringIds()
Get all known identifiers.
- Returns:
- the occurring identifiers
-
getOccurringTags
public Set<String> getOccurringTags()
Get all known tag names.
- Returns:
- the occurring tags
-
getOccurringHashes
public Set<String> getOccurringHashes()
Get all known hashes.
- Returns:
- the occurring hashes
-
getNodeStats
public HashMap<String,Integer> getNodeStats()
Gets the node statistics as a map `tag name` => `number of nodes sharing this tag`.
- Returns:
- the node stats
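To illustrate the shape of such a statistics map, here is a minimal sketch that counts tag names from a plain list. `NodeStatsSketch` and `countTags` are hypothetical, and reading the Integer values as per-tag counts is an assumption based on the return type; the sketch does not show how the library collects its statistics.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

final class NodeStatsSketch {
    /** Counts how often each tag name occurs in the given list. */
    static HashMap<String, Integer> countTags(List<String> tags) {
        HashMap<String, Integer> stats = new HashMap<>();
        for (String tag : tags) {
            // Start at 1 for a new tag, otherwise add 1 to the existing count.
            stats.merge(tag, 1, Integer::sum);
        }
        return stats;
    }

    public static void main(String[] args) {
        List<String> tags = Arrays.asList("species", "reaction", "species");
        System.out.println(countTags(tags).get("species"));
    }
}
```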
-
dump
public String dump()
Dump mechanism for debugging purposes.
- Returns:
- the string to debug this object
-
-