Class MaxEntScorerPairExtractor


  • public class MaxEntScorerPairExtractor
    extends java.lang.Object
    • Method Summary

      Modifier and Type Method Description
      boolean addPair​(java.lang.String first, java.lang.String second)
      Simple pair rule: the terms must differ and must share at least one token.
      boolean addPair​(java.lang.String first, java.lang.String second, double overlapRatio, int maxSynLength)
      The overlap must be at least overlapRatio in both terms, and neither term may be longer than maxSynLength tokens.
      boolean addPairSpecialRules​(java.lang.String first, java.lang.String second, double overlapRatio, int maxSynLength)
      Like addPair, but the pair is allowed only if (1) the difference between the terms is not just a number or a single character, and (2) the overlap is not just a number or a single character.
      java.lang.String[][] compareStrings​(java.lang.String S1, java.lang.String S2)
      TODO: comment!
      void showPairs​(java.util.ArrayList<java.lang.String[]> pairs)  
      void storePairs​(java.util.ArrayList<java.lang.String[]> pairs, java.io.File filename)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • MaxEntScorerPairExtractor

        public MaxEntScorerPairExtractor()
    • Method Detail

      • showPairs

        public void showPairs​(java.util.ArrayList<java.lang.String[]> pairs)
      • storePairs

        public void storePairs​(java.util.ArrayList<java.lang.String[]> pairs,
                               java.io.File filename)
                        throws java.io.IOException
        Throws:
        java.io.IOException
      • addPair

        public boolean addPair​(java.lang.String first,
                               java.lang.String second)
        Simple pair rule: the terms must differ and must share at least one token. Works on normalized terms.
        Parameters:
        first - normalized term
        second - normalized term
        Returns:
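        The rule described above can be sketched as follows. This is a minimal illustration, not the class's actual implementation: the class name SimplePairRuleSketch, the helper tokens, and the method acceptPair are hypothetical, and the sketch assumes that tokens in a normalized term are separated by whitespace.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the simple pair rule; not the library's code.
class SimplePairRuleSketch {

    // Assumption: a normalized term is a whitespace-separated token sequence.
    static Set<String> tokens(String term) {
        String trimmed = term.trim();
        if (trimmed.isEmpty()) {
            return new HashSet<>();
        }
        return new HashSet<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Accepts the pair if the terms differ and share at least one token.
    static boolean acceptPair(String first, String second) {
        if (first.equals(second)) {
            return false;
        }
        Set<String> common = tokens(first);
        common.retainAll(tokens(second));
        return !common.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(acceptPair("heart attack", "acute heart attack")); // true
        System.out.println(acceptPair("heart attack", "heart attack"));       // false
        System.out.println(acceptPair("heart attack", "kidney stone"));       // false
    }
}
```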
      • addPair

        public boolean addPair​(java.lang.String first,
                               java.lang.String second,
                               double overlapRatio,
                               int maxSynLength)
        The overlap must be at least overlapRatio in both terms, and neither term may be longer than maxSynLength tokens.
        Parameters:
        first - normalized term
        second - normalized term
        overlapRatio - intersection-size / term-length
        maxSynLength - maximal synonym length, in tokens
        Returns:
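        One way to read the overlap and length checks is sketched below. The names OverlapPairRuleSketch, tokenize, and acceptPair are hypothetical. The sketch assumes whitespace tokenization, takes the term length as the number of tokens, and takes the intersection size as the number of distinct shared tokens; the actual implementation may count differently.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the overlap-ratio rule; not the library's code.
class OverlapPairRuleSketch {

    // Assumption: tokens are separated by whitespace.
    static List<String> tokenize(String term) {
        String trimmed = term.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    static boolean acceptPair(String first, String second,
                              double overlapRatio, int maxSynLength) {
        List<String> a = tokenize(first);
        List<String> b = tokenize(second);
        if (a.isEmpty() || b.isEmpty()) {
            return false;
        }
        // Neither term may be longer than maxSynLength tokens.
        if (a.size() > maxSynLength || b.size() > maxSynLength) {
            return false;
        }
        // Intersection size: number of distinct tokens shared by both terms.
        Set<String> common = new HashSet<>(a);
        common.retainAll(new HashSet<>(b));
        // overlapRatio = intersection-size / term-length, checked for both terms.
        double ratioFirst = (double) common.size() / a.size();
        double ratioSecond = (double) common.size() / b.size();
        return ratioFirst >= overlapRatio && ratioSecond >= overlapRatio;
    }
}
```

        With these assumptions, "heart attack" and "acute heart attack" share two tokens, giving ratios of 2/2 and 2/3, so the pair passes at overlapRatio 0.5 but fails at 0.7.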
      • addPairSpecialRules

        public boolean addPairSpecialRules​(java.lang.String first,
                                           java.lang.String second,
                                           double overlapRatio,
                                           int maxSynLength)
        Like addPair, but the pair is allowed only if (1) the difference between the terms is not just a number or a single character, and (2) the overlap is not just a number or a single character.
        Parameters:
        first -
        second -
        overlapRatio -
        maxSynLength -
        Returns:
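        The two extra conditions might be checked as in the sketch below, which covers only the special rules and leaves out the overlap-ratio and length checks of addPair. All names (SpecialRulesSketch, tokens, isOnlyNumbersOrSingleChars, passesSpecialRules) are hypothetical. The sketch assumes whitespace tokenization, treats the "difference" as the tokens that occur in exactly one of the two terms, and treats a token set as "only a number or a single character" when every token in it is a digit string or one character long. The real method may interpret these conditions differently.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the special-rule filters; not the library's code.
class SpecialRulesSketch {

    // Assumption: tokens are separated by whitespace.
    static Set<String> tokens(String term) {
        String trimmed = term.trim();
        if (trimmed.isEmpty()) {
            return new HashSet<>();
        }
        return new HashSet<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // True if the set is non-empty and every token is a number or a single character.
    static boolean isOnlyNumbersOrSingleChars(Set<String> tokenSet) {
        if (tokenSet.isEmpty()) {
            return false;
        }
        for (String token : tokenSet) {
            boolean trivial = token.length() == 1 || token.matches("\\d+");
            if (!trivial) {
                return false;
            }
        }
        return true;
    }

    static boolean passesSpecialRules(String first, String second) {
        Set<String> a = tokens(first);
        Set<String> b = tokens(second);

        // Overlap: tokens present in both terms.
        Set<String> overlap = new HashSet<>(a);
        overlap.retainAll(b);

        // Difference: tokens present in exactly one of the terms.
        Set<String> difference = new HashSet<>(a);
        difference.addAll(b);
        difference.removeAll(overlap);

        return !isOnlyNumbersOrSingleChars(overlap)
            && !isOnlyNumbersOrSingleChars(difference);
    }
}
```

        Under this reading, "protein kinase 1" and "protein kinase 2" are rejected because they differ only in a number, while "heart attack" and "acute heart attack" pass both filters.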
      • compareStrings

        public java.lang.String[][] compareStrings​(java.lang.String S1,
                                                   java.lang.String S2)
        TODO: comment!
        Parameters:
        S1 -
        S2 -
        Returns: