Class CandidateFilter


  • public class CandidateFilter
    extends java.lang.Object
    • Constructor Summary

      Constructors 
      Constructor Description
      CandidateFilter()  
    • Method Summary

      Modifier and Type Method Description
      static java.lang.String expendGreek​(java.lang.String s)
      Looks for single letters that could be interpreted as the short form of a Greek character, such as a -> alpha or b -> beta, and returns a string with the Greek letters expanded.
      boolean filterOut​(java.lang.String searchTerm, java.lang.String foundTerm)
      Filters out hits according to a set of rules. Rule 1: the overlap between the two terms consists only of numbers.
      static com.google.common.collect.Multiset<java.lang.String> getContentTokens​(java.lang.String[] tokens)  
      static com.google.common.collect.Multiset<java.lang.String> getNumberOfCommonTokens​(java.lang.String normalizedMention, java.lang.String synonym)  
      static com.google.common.collect.Multiset<java.lang.String> getNumbers​(java.lang.String[] tokens)  
      static com.google.common.collect.Multiset<java.lang.String> getSingleSymbols​(java.lang.String[] tokens)
      Single characters, numbers, and Greek characters.
      boolean hasContradictingGreek​(java.lang.String s1, java.lang.String s2)  
      void initPreModifiers()  
      void initUnspecifieds()  
      boolean isNonDescriptive​(java.lang.String word)  
      static boolean isNumberCompatible​(java.lang.String normalizedMention, java.lang.String synonym)  
      boolean isUnspecified​(java.lang.String word)  
      static void main​(java.lang.String[] args)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • GREEK

        public static final java.lang.String[] GREEK
      • LAT_NUM

        public static final java.lang.String[] LAT_NUM
      • GREEK_REGEX

        public static java.lang.String GREEK_REGEX
      • LAT_NUM_REGEX

        public static java.lang.String LAT_NUM_REGEX
      • greekAbbrMap

        public static final java.util.Map<java.lang.String,​java.lang.String> greekAbbrMap
      • MODIFIER

        public static java.lang.String MODIFIER
      • NON_DESCRIPTIVE

        public static java.lang.String NON_DESCRIPTIVE
      • AMINO_ACIDS

        public static java.lang.String AMINO_ACIDS
      • NON_DESC

        public java.lang.String NON_DESC
      • patternNonDesc

        public java.util.regex.Pattern patternNonDesc
      • matcherNonDesc

        public java.util.regex.Matcher matcherNonDesc
      • DOMAIN_FAMILIES

        public java.lang.String DOMAIN_FAMILIES
      • patternDomainFamilies

        public java.util.regex.Pattern patternDomainFamilies
      • UNSPECIFIEDS

        public java.lang.String UNSPECIFIEDS
      • patternUnspecifieds

        public java.util.regex.Pattern patternUnspecifieds
      • matcherUnspecifieds

        public java.util.regex.Matcher matcherUnspecifieds
      • PREMODS

        public java.lang.String PREMODS
      • patternPreMods

        public java.util.regex.Pattern patternPreMods
    • Constructor Detail

      • CandidateFilter

        public CandidateFilter()
                        throws java.io.IOException
        Throws:
        java.io.IOException
    • Method Detail

      • main

        public static void main​(java.lang.String[] args)
                         throws java.io.IOException
        Throws:
        java.io.IOException
      • filterOut

        public boolean filterOut​(java.lang.String searchTerm,
                                 java.lang.String foundTerm)
        Filters out hits according to a set of rules. Rule 1: the overlap between the two terms consists only of numbers.
        Returns:
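
Rule 1 can be illustrated with a minimal sketch. This is not the implementation of filterOut: the class name NumberOnlyOverlapSketch, the method overlapIsOnlyNumbers, the whitespace tokenization, and the choice to return false when the terms share no token are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class NumberOnlyOverlapSketch {

    /**
     * Hypothetical helper: returns true if the two terms share at least one
     * token and every shared token consists only of digits.
     */
    public static boolean overlapIsOnlyNumbers(String searchTerm, String foundTerm) {
        // Assumption: terms are tokenized on whitespace.
        Set<String> shared = new HashSet<>(Arrays.asList(searchTerm.trim().split("\\s+")));
        shared.retainAll(Arrays.asList(foundTerm.trim().split("\\s+")));
        shared.remove(""); // splitting an empty term yields one empty token

        if (shared.isEmpty()) {
            return false; // no overlap at all, so the rule does not apply in this sketch
        }
        for (String token : shared) {
            if (!token.matches("\\d+")) {
                return false; // a non-numeric token is shared
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(overlapIsOnlyNumbers("interleukin 2", "cyclin 2"));               // true
        System.out.println(overlapIsOnlyNumbers("interleukin 2", "interleukin 2 receptor")); // false
    }
}
```

In this sketch a hit such as "cyclin 2" for the search term "interleukin 2" would be a candidate for filtering, because the only shared token is the number.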
      • initUnspecifieds

        public void initUnspecifieds()
                              throws java.io.IOException
        Throws:
        java.io.IOException
      • initPreModifiers

        public void initPreModifiers()
                              throws java.io.IOException
        Throws:
        java.io.IOException
      • hasContradictingGreek

        public boolean hasContradictingGreek​(java.lang.String s1,
                                             java.lang.String s2)
      • expendGreek

        public static java.lang.String expendGreek​(java.lang.String s)
        Looks for single letters that could be interpreted as the short form of a Greek character, such as a -> alpha or b -> beta, and returns a string with the Greek letters expanded. When several Greek letters share the same abbreviation, the first one is always used, i.e. e -> epsilon. Thus, eta will never be returned.
        Parameters:
        s - The string in which to expand Greek abbreviation characters.
        Returns:
        The string with the Greek letter abbreviations expanded.
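
The expansion and its collision rule can be sketched as follows. This is an illustration under stated assumptions, not the code behind expendGreek: the class GreekExpansionSketch, the method expand, the contents of the abbreviation table, and the whitespace tokenization are hypothetical, and the real greekAbbrMap may hold different entries.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class GreekExpansionSketch {

    // Hypothetical abbreviation table; the real greekAbbrMap may differ.
    private static final Map<String, String> ABBREVIATIONS = new LinkedHashMap<>();
    static {
        ABBREVIATIONS.put("a", "alpha");
        ABBREVIATIONS.put("b", "beta");
        ABBREVIATIONS.put("g", "gamma");
        ABBREVIATIONS.put("d", "delta");
        ABBREVIATIONS.put("e", "epsilon");
        // putIfAbsent keeps the first mapping, so "e" never expands to "eta".
        ABBREVIATIONS.putIfAbsent("e", "eta");
    }

    /** Expands whitespace-separated single-letter tokens to Greek letter names. */
    public static String expand(String s) {
        StringJoiner out = new StringJoiner(" ");
        for (String token : s.trim().split("\\s+")) {
            // Tokens without an entry in the table are passed through unchanged.
            out.add(ABBREVIATIONS.getOrDefault(token, token));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("il 2 receptor b")); // il 2 receptor beta
        System.out.println(expand("subunit e"));       // subunit epsilon
    }
}
```

Registering the colliding letter with putIfAbsent is one simple way to get the "first Greek letter wins" behavior described above.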
      • isNumberCompatible

        public static boolean isNumberCompatible​(java.lang.String normalizedMention,
                                                 java.lang.String synonym)
      • getNumbers

        public static com.google.common.collect.Multiset<java.lang.String> getNumbers​(java.lang.String[] tokens)
      • getSingleSymbols

        public static com.google.common.collect.Multiset<java.lang.String> getSingleSymbols​(java.lang.String[] tokens)
        Single characters, numbers, and Greek characters.
        Parameters:
        tokens -
        Returns:
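
A minimal sketch of collecting such symbols from a token array is shown below. It is not the implementation of getSingleSymbols: the class SingleSymbolsSketch, the method singleSymbols, and the list of Greek names are hypothetical, and a plain Map of counts stands in for the Guava Multiset so that the example has no external dependency.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class SingleSymbolsSketch {

    // Hypothetical subset of Greek letter names; the GREEK field may list others.
    private static final Set<String> GREEK_NAMES =
            new HashSet<>(Arrays.asList("alpha", "beta", "gamma", "delta", "epsilon"));

    /** Counts tokens that are single characters, numbers, or Greek letter names. */
    public static Map<String, Integer> singleSymbols(String[] tokens) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : tokens) {
            boolean singleChar = token.length() == 1;
            boolean number = token.matches("\\d+");
            boolean greek = GREEK_NAMES.contains(token);
            if (singleChar || number || greek) {
                counts.merge(token, 1, Integer::sum); // increment the occurrence count
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] tokens = {"il", "2", "receptor", "beta", "2", "a"};
        System.out.println(singleSymbols(tokens)); // {2=2, a=1, beta=1}
    }
}
```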
      • getContentTokens

        public static com.google.common.collect.Multiset<java.lang.String> getContentTokens​(java.lang.String[] tokens)
      • getNumberOfCommonTokens

        public static com.google.common.collect.Multiset<java.lang.String> getNumberOfCommonTokens​(java.lang.String normalizedMention,
                                                                                                   java.lang.String synonym)
      • isUnspecified

        public boolean isUnspecified​(java.lang.String word)
      • isNonDescriptive

        public boolean isNonDescriptive​(java.lang.String word)