Class DocumentNode


  • public class DocumentNode
    extends TreeNode
    The class DocumentNode, representing a node in an XML tree.
    Author:
    Martin Scharm
    • Field Detail

      • ID_ATTR

        public static String ID_ATTR
        The name of the attribute that serves as a node's id.
    • Constructor Detail

      • DocumentNode

        public DocumentNode​(org.jdom2.Element element,
                            DocumentNode parent,
                            TreeDocument doc,
                            Weighter w,
                            int numChild,
                            int level)
        Instantiates a new document node.
        Parameters:
        element - the corresponding element
        parent - the parent node
        doc - the corresponding document
        w - the weighter
        numChild - the number among its siblings
        level - the level in the tree
    • Method Detail

      • extract

        public DocumentNode extract()
        Extracts this subtree. Creates a copy of the subtree rooted in this node and returns a DocumentNode that has no parent, e.g. to transfer it to another document.
        Returns:
        the copy of this DocumentNode
      • getSubTreeHash

        public String getSubTreeHash()
        Gets the calculated hash of the subtree rooted in this node.
        Specified by:
        getSubTreeHash in class TreeNode
        Returns:
        the hash
      • getOwnHash

        public String getOwnHash()
        Gets the calculated hash of this single element (ignoring subtree).
        Specified by:
        getOwnHash in class TreeNode
        Returns:
        the hash
      • getSizeSubtree

        public int getSizeSubtree()
        Gets the size of this subtree (number of nodes under the current node, current node excluded).
        Returns:
        the size of the subtree
      • getNumLeaves

        public int getNumLeaves()
        Gets the number of leaves in the subtree rooted in this node. If this node is a leaf, it returns 1.
        Returns:
        the number of leaves
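
The two counting rules above (subtree size excludes the current node, a leaf reports one leaf) can be illustrated with a minimal sketch. `SubtreeSketchNode` is a hypothetical stand-in written for this example, not the library's DocumentNode, and the recursion is only one plausible way to compute these values.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a tree node; not the library's DocumentNode. */
final class SubtreeSketchNode {
    final String tag;
    final List<SubtreeSketchNode> children = new ArrayList<>();

    SubtreeSketchNode(String tag) {
        this.tag = tag;
    }

    /** Appends a child and returns this node to allow chaining. */
    SubtreeSketchNode add(SubtreeSketchNode child) {
        children.add(child);
        return this;
    }

    /** Number of nodes below this node, this node excluded. */
    int sizeSubtree() {
        int size = 0;
        for (SubtreeSketchNode c : children) {
            size += 1 + c.sizeSubtree();
        }
        return size;
    }

    /** Number of leaves in the subtree rooted in this node; a leaf counts itself. */
    int numLeaves() {
        if (children.isEmpty()) {
            return 1;
        }
        int leaves = 0;
        for (SubtreeSketchNode c : children) {
            leaves += c.numLeaves();
        }
        return leaves;
    }

    public static void main(String[] args) {
        SubtreeSketchNode root = new SubtreeSketchNode("model")
            .add(new SubtreeSketchNode("species"))
            .add(new SubtreeSketchNode("reaction").add(new SubtreeSketchNode("kineticLaw")));
        System.out.println(root.sizeSubtree()); // prints 3
        System.out.println(root.numLeaves());   // prints 2
    }
}
```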
      • setIdAttr

        public static final void setIdAttr​(String id)
        Sets the id attribute. You may want to use something like the metaid instead of the id as the identifier.
        Parameters:
        id - the new id attribute
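
Because setIdAttr is static, the chosen attribute name applies to every node. The sketch below illustrates that effect under assumptions: `IdAttrSketch` is a hypothetical class, the initial value `"id"` is assumed, and the lookup in `getId()` is a guess at the behavior, not the library's implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of a class-wide id attribute name. */
final class IdAttrSketch {
    /** Name of the attribute treated as id; the initial value "id" is an assumption. */
    static String idAttr = "id";

    final Map<String, String> attributes = new HashMap<>();

    /** Switches the identifying attribute for all nodes at once. */
    static void setIdAttr(String id) {
        idAttr = id;
    }

    /** Looks up the value of whichever attribute currently serves as id. */
    String getId() {
        return attributes.get(idAttr);
    }

    public static void main(String[] args) {
        IdAttrSketch node = new IdAttrSketch();
        node.attributes.put("id", "s1");
        node.attributes.put("metaid", "meta_s1");
        System.out.println(node.getId()); // prints s1
        setIdAttr("metaid");
        System.out.println(node.getId()); // prints meta_s1
    }
}
```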
      • getTagName

        public String getTagName()
        Description copied from class: TreeNode
        Gets the tag name. For document nodes it's the actual tag name; for text nodes you'll receive TreeNode.TEXT_TAG.
        Specified by:
        getTagName in class TreeNode
        Returns:
        the tag name
      • getId

        public String getId()
        Gets the value of the id attribute.
        Returns:
        the id
      • addChild

        public void addChild​(DocumentNode toAdd)
        Adds a child to this node.
        Parameters:
        toAdd - the new child
      • getAttributeValue

        public String getAttributeValue​(String attr)
        Gets the value of an attribute. Don't use it to get the id; use getId() instead!
        Parameters:
        attr - the name of the attribute
        Returns:
        the value of the attribute
      • getAttribute

        public org.jdom2.Attribute getAttribute​(String attr)
        Gets an attribute.
        Parameters:
        attr - the name of the attribute
        Returns:
        the attribute
      • getAttributeValue

        public String getAttributeValue​(String attr,
                                        String nsContains)
        Gets the value of an attribute with a matching namespace. Don't use it to get the id; use getId() instead!
        Parameters:
        attr - the name of the attribute
        nsContains - a string that the attribute's namespace must contain
        Returns:
        the value of the attribute
      • setAttribute

        public void setAttribute​(org.jdom2.Attribute attr)
        Overrides an attribute.
        Parameters:
        attr - the attribute
      • setAttribute

        public void setAttribute​(String attr,
                                 String value)
        Overrides an attribute.
        Parameters:
        attr - the name of the attribute
        value - the new value
      • rmAttribute

        public void rmAttribute​(String attr)
        Removes an attribute.
        Parameters:
        attr - the attribute's name
      • getAttributes

        public Set<String> getAttributes()
        Gets the names of the attributes that are set on this node.
        Returns:
        the attribute names
      • isBelow

        public boolean isBelow​(DocumentNode parent)
        Checks if this node is a descendant of some other node (at any depth). Both nodes have to be from the same origin document and the XPath of the current node has to start with the parent's XPath.
        Parameters:
        parent - the parent in question
        Returns:
        true, if this node is a descendant of parent
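
The check described above (same origin document, XPath prefix) might look like the following sketch. `IsBelowSketch` and its `Node` type are hypothetical, and the XPaths are assumed to carry positional predicates such as `[1]`; whether the library treats a node as being below itself is not stated, so the sketch applies the prefix rule literally.

```java
/** Hypothetical illustration of an XPath-prefix ancestry check. */
final class IsBelowSketch {
    /** Minimal node: a handle on its origin document plus its XPath. */
    static final class Node {
        final Object doc;
        final String xpath;

        Node(Object doc, String xpath) {
            this.doc = doc;
            this.xpath = xpath;
        }
    }

    /** True if both nodes share a document and node's XPath starts with parent's XPath. */
    static boolean isBelow(Node node, Node parent) {
        if (node.doc != parent.doc) {
            return false; // nodes from different documents are never related
        }
        return node.xpath.startsWith(parent.xpath);
    }

    public static void main(String[] args) {
        Object doc = new Object();
        Node list = new Node(doc, "/model[1]/listOfSpecies[1]");
        Node species = new Node(doc, "/model[1]/listOfSpecies[1]/species[2]");
        System.out.println(isBelow(species, list)); // prints true
        System.out.println(isBelow(list, species)); // prints false
    }
}
```

With positional predicates the closing bracket keeps `b[1]` from being mistaken for a prefix of `b[10]`; XPaths without predicates would need a stricter comparison.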
      • getNumChildren

        public int getNumChildren()
        Gets the number of children in this node.
        Returns:
        the number of children
      • getChildren

        public List<TreeNode> getChildren()
        Gets the children.
        Returns:
        the children
      • getChildrenWithTag

        public List<TreeNode> getChildrenWithTag​(String tag)
        Gets the children sharing a certain tag.
        Parameters:
        tag - the tag
        Returns:
        the children having tag as tag name or an empty list if there are no such children
      • getChildrenTagMap

        public HashMap<String,​List<TreeNode>> getChildrenTagMap()
        Gets the children tag map.
        Returns:
        the children tag map
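
One way to picture the relationship between the children tag map and getChildrenWithTag is a grouping of children by tag name. The sketch below uses a hypothetical `Child` type and helper names; it shows only the idea of the data structure, including the empty-list result for an unknown tag, and is not the library's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;

/** Hypothetical illustration of grouping children by tag name. */
final class TagMapSketch {
    /** Minimal child: a tag name plus a label to tell siblings apart. */
    static final class Child {
        final String tag;
        final String label;

        Child(String tag, String label) {
            this.tag = tag;
            this.label = label;
        }
    }

    /** Groups children by tag name, keeping sibling order within each group. */
    static HashMap<String, List<Child>> childrenTagMap(List<Child> children) {
        HashMap<String, List<Child>> byTag = new HashMap<>();
        for (Child c : children) {
            byTag.computeIfAbsent(c.tag, t -> new ArrayList<>()).add(c);
        }
        return byTag;
    }

    /** Returns the children with the given tag, or an empty list if there are none. */
    static List<Child> childrenWithTag(HashMap<String, List<Child>> byTag, String tag) {
        return byTag.getOrDefault(tag, Collections.emptyList());
    }

    public static void main(String[] args) {
        List<Child> kids = Arrays.asList(
            new Child("species", "s1"),
            new Child("reaction", "r1"),
            new Child("species", "s2"));
        HashMap<String, List<Child>> byTag = childrenTagMap(kids);
        System.out.println(childrenWithTag(byTag, "species").size());   // prints 2
        System.out.println(childrenWithTag(byTag, "parameter").size()); // prints 0
    }
}
```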
      • getNoOfChild

        public int getNoOfChild​(TreeNode kid)
        Gets the child number of a child. Returns 1 for the first child and getNumChildren() for the last child. If there is no such child, it returns -1.
        Parameters:
        kid - the kid
        Returns:
        the 1-based child number of kid, or -1 if kid is not a child of this node
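
The numbering convention is 1-based, which differs from Java's 0-based list indices. A minimal sketch of that convention, using a hypothetical helper over a plain list (the comparison via `equals` is an assumption):

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of 1-based child numbering. */
final class ChildNumberSketch {
    /** Returns 1 for the first child, children.size() for the last, and -1 if kid is absent. */
    static <T> int noOfChild(List<T> children, T kid) {
        int index = children.indexOf(kid); // 0-based, or -1 if not found
        return index < 0 ? -1 : index + 1;
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("first", "second", "third");
        System.out.println(noOfChild(children, "first"));   // prints 1
        System.out.println(noOfChild(children, "third"));   // prints 3
        System.out.println(noOfChild(children, "missing")); // prints -1
    }
}
```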
      • getAttributeDistance

        public double getAttributeDistance​(DocumentNode cmp)
        Calculates the distance of attributes. Calls getAttributeDistance(DocumentNode, boolean, boolean, boolean) so that different ids are allowed and names are cared about, but not handled strictly. Returns a double in [0,1]: if all attributes match, the distance is 0; if none of the attributes match, the distance is 1.
        Parameters:
        cmp - the node to compare
        Returns:
        the attribute distance in [0,1]
      • getAttributeDistance

        public double getAttributeDistance​(DocumentNode cmp,
                                           boolean allowDifferentIds,
                                           boolean careAboutNames,
                                           boolean stricterNames)
        Calculates the distance of attributes. Returns a double in [0,1].
        • If all attributes match the distance will be 0, if none of the attributes match the distance will be 1.
        • If allowDifferentIds is set to false, the distance will always be 1 if the two nodes do not share the same id.
        • If careAboutNames is set to true, name attributes are treated differently: the difference between Glucose5Phosphate and Glucose6Phosphate (which might just be a typo) is rated lower, while the difference between Glucose3Phosphate and MAPKK2 is rated much higher. The difference between names is calculated using `Michaelis–Menten kinetics` (:-)): vmax=6; km=min(length(name1),length(name2))/4; [S]=levenshteinDistance(name1,name2)
        • If both careAboutNames and stricterNames are set to true, a difference in the name is treated very strictly. Choose this option if you are sure that your names are very similar: vmax=12; km=min(length(name1),length(name2))/6; [S]=levenshteinDistance(name1,name2)
        Parameters:
        cmp - the node to compare
        allowDifferentIds - are different ids allowed?
        careAboutNames - should we care about names?
        stricterNames - should we handle names very strictly?
        Returns:
        the attribute distance in [0,1]
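
The name-handling rules of the four-argument getAttributeDistance can be made concrete with a small calculation. The sketch below plugs the documented parameters into the standard Michaelis–Menten form v = vmax·[S] / (km + [S]). Everything else is an assumption: `NameDistanceSketch` and its methods are hypothetical, km is computed with floating-point division, and how the library folds the resulting value into the final [0,1] distance is not stated here and is not modeled.

```java
/** Hypothetical illustration of the name rating; not the library's implementation. */
final class NameDistanceSketch {
    /** Classic dynamic-programming Levenshtein distance using two rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Rates a name difference: vmax * [S] / (km + [S]) with [S] = Levenshtein distance. */
    static double namePenalty(String name1, String name2, boolean stricterNames) {
        double s = levenshtein(name1, name2);
        if (s == 0) {
            return 0.0; // identical names; also avoids 0/0 for two empty names
        }
        double vmax = stricterNames ? 12.0 : 6.0;
        double divisor = stricterNames ? 6.0 : 4.0;
        double km = Math.min(name1.length(), name2.length()) / divisor; // assumed: no integer division
        return vmax * s / (km + s);
    }

    public static void main(String[] args) {
        System.out.println(namePenalty("Glucose5Phosphate", "Glucose6Phosphate", false));
        System.out.println(namePenalty("Glucose3Phosphate", "MAPKK2", false));
        System.out.println(namePenalty("Glucose5Phosphate", "Glucose6Phosphate", true));
    }
}
```

For the one-character difference between Glucose5Phosphate and Glucose6Phosphate, km is 17/4 = 4.25 and [S] is 1, so the non-strict rating is 6/5.25, roughly 1.14. The stricter parameters raise the rating for the same pair, and the unrelated pair rates much higher in either mode.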
      • getWeighter

        public Weighter getWeighter()
        Gets the weighter used to compute the weight of this document.
        Returns:
        the weighter
      • getNameSpaceUri

        public String getNameSpaceUri()
        Gets the name space uri.
        Returns:
        the name space uri
      • getNameSpacePrefix

        public String getNameSpacePrefix()
        Gets the name space prefix.
        Returns:
        the name space prefix
      • getSubDoc

        public org.jdom2.Element getSubDoc​(org.jdom2.Element parent)
        Description copied from class: TreeNode
        Attaches the subtree rooted in this node to the node parent and recursively attaches its children. Fails if parent == null && this.getType() == TreeNode.TEXT_NODE, which means a text node cannot become the root.
        Specified by:
        getSubDoc in class TreeNode
        Parameters:
        parent - the parent element which will root this node. If null, this node will be root in the document
        Returns:
        the sub doc
      • getNodeStats

        public void getNodeStats​(HashMap<String,​Integer> map)
        Description copied from class: TreeNode
        Gets the node statistics of the subtree rooted in this node: tag name => number of nodes having this tag name.
        Specified by:
        getNodeStats in class TreeNode
        Parameters:
        map - the map to write our statistics to
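
Since getNodeStats writes into a caller-supplied map instead of returning one, a single map can accumulate counts across a whole recursive walk. A minimal sketch of that pattern, with a hypothetical `NodeStatsSketch` node type and the assumption that the node the method is called on is counted too:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

/** Hypothetical illustration of collecting tag statistics into a shared map. */
final class NodeStatsSketch {
    final String tag;
    final List<NodeStatsSketch> children = new ArrayList<>();

    NodeStatsSketch(String tag) {
        this.tag = tag;
    }

    /** Appends a child and returns this node to allow chaining. */
    NodeStatsSketch add(NodeStatsSketch child) {
        children.add(child);
        return this;
    }

    /** Adds one count for this node's tag (assumed), then recurses into the children. */
    void nodeStats(HashMap<String, Integer> map) {
        map.merge(tag, 1, Integer::sum);
        for (NodeStatsSketch c : children) {
            c.nodeStats(map);
        }
    }

    public static void main(String[] args) {
        NodeStatsSketch root = new NodeStatsSketch("listOfSpecies")
            .add(new NodeStatsSketch("species"))
            .add(new NodeStatsSketch("species").add(new NodeStatsSketch("notes")));
        HashMap<String, Integer> stats = new HashMap<>();
        root.nodeStats(stats);
        System.out.println(stats.get("species")); // prints 2
    }
}
```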
      • evaluate

        public boolean evaluate​(ConnectionManager conMgmr)
        Description copied from class: TreeNode
        Evaluates the modifications of this node. Only useful for tree comparisons.
        Specified by:
        evaluate in class TreeNode
        Parameters:
        conMgmr - the connection manager
        Returns:
        true, if node was changed
      • getWeight

        public double getWeight()
        Description copied from class: TreeNode
        Gets the weight of this node.
        Specified by:
        getWeight in class TreeNode
        Returns:
        the weight
      • reSetupStructureDown

        protected void reSetupStructureDown​(TreeDocument doc,
                                            int numChild)
        Description copied from class: TreeNode
        Sets up the document structure again, working downwards (e.g. recomputes XPaths).
        Specified by:
        reSetupStructureDown in class TreeNode
        Parameters:
        doc - the document this node corresponds to
        numChild - the child number of this node
      • reSetupStructureUp

        protected void reSetupStructureUp()
        Description copied from class: TreeNode
        Sets up the document structure again, working upwards (e.g. recomputes hashes).
        Specified by:
        reSetupStructureUp in class TreeNode
      • dump

        public String dump​(String prefix)
        Description copied from class: TreeNode
        Dumps this node. Just for debugging purposes.
        Specified by:
        dump in class TreeNode
        Parameters:
        prefix - the prefix for each line (indentation)
        Returns:
        the produced dump
      • getNameSpace

        public org.jdom2.Namespace getNameSpace()
        Gets the name space associated with this node.
        Returns:
        the name space