Class GeneMapping


  • public class GeneMapping
    extends java.lang.Object
    • Method Detail

      • mapTopN

        public java.util.ArrayList<SynHit> mapTopN​(java.lang.String searchTerm,
                                                   int topN)
                                            throws java.io.IOException,
                                                   GeneCandidateRetrievalException
        Returns a list of SynHits without performing any semantic disambiguation. The topN hits with the highest Lucene scores are returned. This method is not needed for the actual mapping; it is used to generate training material for MaxEntScorer.
        Parameters:
        searchTerm - the term to be mapped
        topN - number of hits to be returned
        Throws:
        GeneCandidateRetrievalException
        java.io.IOException
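
        The selection step described for mapTopN (keep the topN highest-scoring hits) can be sketched as follows. This is only an illustration under stated assumptions: ScoredHit is a hypothetical stand-in for SynHit, and the scores are plain doubles rather than values produced by a Lucene search. It does not reproduce the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class TopNSketch {

    // Hypothetical stand-in for SynHit: just a synonym and a retrieval score.
    static final class ScoredHit {
        final String synonym;
        final double score;

        ScoredHit(String synonym, double score) {
            this.synonym = synonym;
            this.score = score;
        }
    }

    // Returns at most topN hits, ordered by descending score.
    // The input list is left unmodified.
    static List<ScoredHit> topN(List<ScoredHit> hits, int topN) {
        List<ScoredHit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Double.compare(b.score, a.score));
        int limit = Math.min(Math.max(topN, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, limit));
    }
}
```

        Because no disambiguation is involved, the ranking in this sketch depends on the score alone.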
      • map

        public java.util.List<SynHit> map​(java.lang.String searchTerm,
                                          org.apache.lucene.search.BooleanQuery contextQuery)
                                   throws GeneMappingException
        A wrapper around the main mapping function. It does not require an organism to be specified and therefore performs a completely organism-agnostic search (currently used mainly for backward compatibility with the BC evaluation).
        Parameters:
        searchTerm -
        contextQuery -
        Returns:
        the SynHits that apply to the given searchTerm
        Throws:
        java.lang.Exception
        GeneMappingException
      • map

        public MentionMappingResult map​(GeneMention searchTerm,
                                        org.apache.lucene.search.BooleanQuery contextQuery,
                                        java.lang.String documentContext)
                                 throws GeneMappingException
        Actual mapping method. This mapping function also performs semantic disambiguation. First it checks for general, organism-specific hits (getCandidates). If organisms are given (i.e. the list is neither null nor empty), semantic disambiguation is performed with this organism list.
        Parameters:
        searchTerm - the term to do the mapping for
        contextQuery - the term's context (i.e. the document/abstract in which it was found)
        documentContext -
        Returns:
        the MentionMappingResult holding the SynHits for the search term
        Throws:
        java.lang.Exception
        GeneMappingException
      • removeModifiers

        public static java.lang.String removeModifiers​(java.lang.String normalizedSearchTerm)
        Parameters:
        normalizedSearchTerm -
        Returns:
        the normalizedSearchTerm with all modifiers removed
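
        The removal methods in this class all take a normalized search term and return a String. One plausible way to implement such a step is to filter whitespace-separated tokens against a word list. The sketch below assumes exactly that; the word list, the class name and the method name removeTokens are hypothetical, and the real modifier vocabulary is not documented on this page.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

final class ModifierRemovalSketch {

    // Hypothetical word list, chosen only for illustration.
    private static final Set<String> MODIFIERS = Set.of("putative", "novel", "probable");

    // Drops every token that appears in MODIFIERS and rejoins the
    // remaining tokens with single spaces.
    static String removeTokens(String normalizedSearchTerm) {
        return Arrays.stream(normalizedSearchTerm.trim().split("\\s+"))
                .filter(token -> !token.isEmpty() && !MODIFIERS.contains(token))
                .collect(Collectors.joining(" "));
    }
}
```

        Note that in this sketch a term consisting only of listed tokens yields an empty string, so a caller would need to decide how to handle that case.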
      • removeUnspecifieds

        public static java.lang.String removeUnspecifieds​(java.lang.String normalizedSearchTerm)
      • removeNondescriptives

        public static java.lang.String removeNondescriptives​(java.lang.String normalizedSearchTerm)
      • removeDomainFamilies

        public static java.lang.String removeDomainFamilies​(java.lang.String normalizedSearchTerm)
      • removePremodifiers

        public static java.lang.String removePremodifiers​(java.lang.String normalizedSearchTerm)
      • setMappingCore

        public void setMappingCore​(MappingCore mappingCore)
      • map

        public MentionMappingResult map​(java.lang.String term,
                                        org.apache.lucene.search.BooleanQuery contextQuery,
                                        java.lang.String documentContext)
                                 throws GeneMappingException
        Convenience method mostly used for tests. The term is wrapped into a GeneMention; no offset information or other data about the original gene mention is available.
        Parameters:
        term -
        contextQuery -
        documentContext -
        Returns:
        Throws:
        java.lang.Exception
        GeneMappingException
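
        The wrapping described for this convenience overload can be illustrated with a small sketch. Mention here is a hypothetical stand-in for GeneMention, and the use of -1 to mark unknown offsets is an assumption made for the example, not something this page states.

```java
final class MentionWrapSketch {

    // Hypothetical stand-in for GeneMention: text plus character offsets.
    static final class Mention {
        final String text;
        final int begin;
        final int end;

        Mention(String text, int begin, int end) {
            this.text = text;
            this.begin = begin;
            this.end = end;
        }

        // True only if the mention carries usable offset information.
        boolean hasOffsets() {
            return begin >= 0 && end >= begin;
        }
    }

    // Wraps a bare term; offsets are unknown, so they are set to -1 (assumed sentinel).
    static Mention wrap(String term) {
        return new Mention(term, -1, -1);
    }
}
```

        Code that relies on offsets would have to check for their absence when a mention was created this way.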