Class JCoReMapAnnotationIndex<K extends java.lang.Comparable<K>,​T extends org.apache.uima.jcas.tcas.Annotation>

  • Type Parameters:
    T - The annotation type the index is over.
    K - The key type used to index the annotations.
    C - The collection type (e.g. ArrayList) used to store annotations in the index.
    U - The collection type (e.g. TreeSet) used to return search results.
    All Implemented Interfaces:
    JCoReAnnotationIndex<T>
    Direct Known Subclasses:
    JCoReHashMapAnnotationIndex, JCoReTreeMapAnnotationIndex

    public class JCoReMapAnnotationIndex<K extends java.lang.Comparable<K>,​T extends org.apache.uima.jcas.tcas.Annotation>
    extends java.lang.Object
    implements JCoReAnnotationIndex<T>

    Use when: You want to access annotations by some arbitrary key that should be computed once and then only used for access. Also use it when a single annotation might or should be associated with multiple keys.

    This class builds a map from arbitrary keys to collections of annotations. For convenient access, the class takes generators for the index and search terms as well as suppliers for the actual collection implementations that should be used within the index and for search results. Thus, it is just a kind of convenience framework around a map.
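
    The idea described above can be sketched with plain java.util types. This is a minimal illustration under stated assumptions, not the JCoRe implementation: MapIndexSketch is a hypothetical stand-in, a java.util.function.Function takes the place of IndexTermGenerator, and strings take the place of UIMA annotations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in, NOT the JCoRe class: it only mirrors the idea of
// "term generator + suppliers wrapped around a map".
final class MapIndexSketch<K extends Comparable<K>, T> {
    private final Map<K, Collection<T>> index;
    private final Function<T, Stream<K>> indexTermGenerator;
    private final Supplier<Collection<T>> storageSupplier;

    MapIndexSketch(Supplier<Map<K, Collection<T>>> indexMapSupplier,
                   Function<T, Stream<K>> indexTermGenerator,
                   Supplier<Collection<T>> storageSupplier) {
        this.index = indexMapSupplier.get();
        this.indexTermGenerator = indexTermGenerator;
        this.storageSupplier = storageSupplier;
    }

    // Associates the item with every term the generator produces for it,
    // so a single item may end up under several keys.
    void index(T item) {
        indexTermGenerator.apply(item).forEach(
                term -> index.computeIfAbsent(term, k -> storageSupplier.get()).add(item));
    }

    // Looks up one term; an unknown term yields an empty stream.
    Stream<T> search(K term) {
        Collection<T> hits = index.get(term);
        return hits == null ? Stream.empty() : hits.stream();
    }

    // Assumption for the demo: phrases are indexed under each lower-cased word.
    static List<String> demo(String term) {
        MapIndexSketch<String, String> idx = new MapIndexSketch<String, String>(
                HashMap::new,
                phrase -> Arrays.stream(phrase.toLowerCase(Locale.ROOT).split(" ")),
                ArrayList::new);
        idx.index("Aspirin tablet");
        idx.index("aspirin overdose");
        idx.index("Ibuprofen tablet");
        return idx.search(term).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo("aspirin"));
        System.out.println(demo("tablet"));
    }
}
```

    In this sketch, "Aspirin tablet" is stored under two keys, which corresponds to the case where a single annotation is associated with multiple keys.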

    Author:
    faessler
    • Method Summary

      Modifier and Type Method Description
      void add​(T a)  
      T get​(K searchTerm)  
      T get​(org.apache.uima.jcas.tcas.Annotation a)  
      T getFirst​(K searchTerm)  
      T getFirst​(org.apache.uima.jcas.tcas.Annotation a)  
      java.util.Map<K,​java.util.Collection<T>> getIndex()  
      void index​(org.apache.uima.jcas.JCas jCas, int type)
      Indexes the whole contents of the CAS annotation index of type type.
      void index​(org.apache.uima.jcas.JCas jCas, org.apache.uima.cas.Type type)
      Indexes the whole contents of the CAS annotation index of type type.
      void index​(T a)
      Indexes the annotation a.
      java.util.stream.Stream<T> search​(java.util.stream.Stream<K> searchTerms)
      Searches for the provided search terms in the index.
      java.util.stream.Stream<T> search​(K searchTerm)
      Searches for annotations in the index by the provided search term.
      java.util.stream.Stream<T> search​(org.apache.uima.jcas.tcas.Annotation a)
      Generates search terms from a via the searchTermGenerator.
      void setIndexAnnotationStorageSupplier​(java.util.function.Supplier<java.util.Collection<T>> supplier)
      Allows changing the supplier for the internal storage of annotations, which are the values of the index map.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • index

        protected final java.util.Map<K extends java.lang.Comparable<K>,​java.util.Collection<T extends org.apache.uima.jcas.tcas.Annotation>> index
      • indexTermGenerator

        protected final IndexTermGenerator<K extends java.lang.Comparable<K>> indexTermGenerator
      • searchTermGenerator

        protected final IndexTermGenerator<K extends java.lang.Comparable<K>> searchTermGenerator
      • indexAnnotationStorageSupplier

        protected java.util.function.Supplier<java.util.Collection<T extends org.apache.uima.jcas.tcas.Annotation>> indexAnnotationStorageSupplier
    • Constructor Detail

      • JCoReMapAnnotationIndex

        public JCoReMapAnnotationIndex​(java.util.function.Supplier<java.util.Map<K,​java.util.Collection<T>>> indexMapSupplier,
                                       IndexTermGenerator<K> indexTermGenerator,
                                       IndexTermGenerator<K> searchTermGenerator,
                                       org.apache.uima.jcas.JCas jCas,
                                       org.apache.uima.cas.Type type)
        This is the full constructor of the map index. It takes parameters for virtually every aspect of the index. For a quicker start, you might want to refer to one of its subclasses, e.g. JCoReHashMapAnnotationIndex. Using this constructor immediately builds the index from the given jCas for the annotation type given by type.
        Parameters:
        indexMapSupplier - A supplier for the map that should be used as the index.
        indexTermGenerator - Generates index terms of generic parameter type K. Those index terms will be extracted from indexed annotations.
        searchTermGenerator - Generates search terms of generic parameter type K. The index will extract all IndexEntry items in the index matching one of the generated terms. This may be the very same term generator passed for indexTermGenerator.
        indexAnnotationCollectionSupplier - A supplier for the collection data structure used to store annotations in the index. If a search yields a single index hit, this data structure is returned directly to save time, provided it is compatible with the return type specified by the generic type parameter U (search result return type). This way the desired output structure can be specified (e.g. a TreeSet with a specific comparator).
        resultCollectionSupplier - If searchTermGenerator produces multiple search terms for a search, the index annotation collection is not returned directly; instead, this supplier is used to create a new collection that holds the search results for all search terms. Thus, when searchTermGenerator is expected to generate multiple search terms often, the indexAnnotationCollectionSupplier should create a collection that is efficient for adding and iterating, and the resultCollectionSupplier should create a collection that reflects the desired output format.
        jCas - A JCas containing annotations that should be indexed.
        type - The UIMA type system type, belonging to jCas, that should be indexed.
      • JCoReMapAnnotationIndex

        public JCoReMapAnnotationIndex​(java.util.function.Supplier<java.util.Map<K,​java.util.Collection<T>>> indexMapSupplier,
                                       IndexTermGenerator<K> indexTermGenerator,
                                       IndexTermGenerator<K> searchTermGenerator,
                                       org.apache.uima.jcas.JCas jCas,
                                       int type)
        This is the full constructor of the map index. It takes parameters for virtually every aspect of the index. For a quicker start, you might want to refer to one of its subclasses, e.g. JCoReHashMapAnnotationIndex. Using this constructor immediately builds the index from the given jCas for the annotation type given by type.
        Parameters:
        indexMapSupplier - A supplier for the map that should be used as the index.
        indexTermGenerator - Generates index terms of generic parameter type K. Those index terms will be extracted from indexed annotations.
        searchTermGenerator - Generates search terms of generic parameter type K. The index will extract all IndexEntry items in the index matching one of the generated terms. This may be the very same term generator passed for indexTermGenerator.
        indexAnnotationCollectionSupplier - A supplier for the collection data structure used to store annotations in the index. If a search yields a single index hit, this data structure is returned directly to save time, provided it is compatible with the return type specified by the generic type parameter U (search result return type). This way the desired output structure can be specified (e.g. a TreeSet with a specific comparator).
        resultCollectionSupplier - If searchTermGenerator produces multiple search terms for a search, the index annotation collection is not returned directly; instead, this supplier is used to create a new collection that holds the search results for all search terms. Thus, when searchTermGenerator is expected to generate multiple search terms often, the indexAnnotationCollectionSupplier should create a collection that is efficient for adding and iterating, and the resultCollectionSupplier should create a collection that reflects the desired output format.
        jCas - A JCas containing annotations that should be indexed.
        type - The UIMA type system type, belonging to jCas, that should be indexed.
      • JCoReMapAnnotationIndex

        public JCoReMapAnnotationIndex​(java.util.function.Supplier<java.util.Map<K,​java.util.Collection<T>>> indexMapSupplier,
                                       IndexTermGenerator<K> indexTermGenerator,
                                       IndexTermGenerator<K> searchTermGenerator)
        Parameters:
        indexMapSupplier - A supplier for the map that should be used as the index.
        indexTermGenerator - Generates index terms of generic parameter type K. Those index terms will be extracted from indexed annotations.
        searchTermGenerator - Generates search terms of generic parameter type K. The index will extract all IndexEntry items in the index matching one of the generated terms. This may be the very same term generator passed for indexTermGenerator.
        indexAnnotationCollectionSupplier - A supplier for the collection data structure used to store annotations in the index. If a search yields a single index hit, this data structure is returned directly to save time, provided it is compatible with the return type specified by the generic type parameter U (search result return type). This way the desired output structure can be specified (e.g. a TreeSet with a specific comparator).
        resultCollectionSupplier - If searchTermGenerator produces multiple search terms for a search, the index annotation collection is not returned directly; instead, this supplier is used to create a new collection that holds the search results for all search terms. Thus, when searchTermGenerator is expected to generate multiple search terms often, the indexAnnotationCollectionSupplier should create a collection that is efficient for adding and iterating, and the resultCollectionSupplier should create a collection that reflects the desired output format.
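
        The effect of the indexMapSupplier parameter can be illustrated with plain java.util maps. The following is only a sketch under assumptions: IndexMapSupplierSketch, build and sortedKeys are hypothetical names, strings stand in for annotations, and the string length stands in for a computed key. It shows that a supplier returning a TreeMap yields keys in their natural order, which the Comparable bound on K makes possible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical illustration, not part of JCoRe.
final class IndexMapSupplierSketch {

    // Builds a key -> items map; the caller decides the map implementation.
    static <K extends Comparable<K>> Map<K, Collection<String>> build(
            Supplier<Map<K, Collection<String>>> indexMapSupplier,
            Function<String, K> keyOf,
            List<String> items) {
        Map<K, Collection<String>> index = indexMapSupplier.get();
        for (String item : items) {
            index.computeIfAbsent(keyOf.apply(item), k -> new ArrayList<>()).add(item);
        }
        return index;
    }

    // With a TreeMap supplier the key set is iterated in ascending order.
    static List<Integer> sortedKeys() {
        Map<Integer, Collection<String>> byLength = IndexMapSupplierSketch.<Integer>build(
                TreeMap::new, String::length, Arrays.asList("gene", "p53", "protein", "DNA"));
        return new ArrayList<>(byLength.keySet());
    }

    public static void main(String[] args) {
        System.out.println(sortedKeys());
    }
}
```

        Passing HashMap::new instead would give constant-time lookups on average but no guaranteed key order; which trade-off fits depends on how the index is read.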
    • Method Detail

      • index

        public void index​(org.apache.uima.jcas.JCas jCas,
                          int type)
        Indexes the whole contents of the CAS annotation index of type type. For each annotation, the indexTermGenerator is used to create terms with which the annotation will be associated in the index and can be retrieved by a search method.
        Specified by:
        index in interface JCoReAnnotationIndex<K extends java.lang.Comparable<K>>
        Parameters:
        jCas - A CAS instance.
        type - The annotation type to index.
      • index

        public void index​(org.apache.uima.jcas.JCas jCas,
                          org.apache.uima.cas.Type type)
        Indexes the whole contents of the CAS annotation index of type type. For each annotation, the indexTermGenerator is used to create terms with which the annotation will be associated in the index and can be retrieved by a search method.
        Specified by:
        index in interface JCoReAnnotationIndex<K extends java.lang.Comparable<K>>
        Parameters:
        jCas - A CAS instance.
        type - The annotation type to index.
      • index

        public void index​(T a)
        Indexes the annotation a. The indexTermGenerator is used to create terms with which the annotation will be associated in the index and can be retrieved by a search method.
        Specified by:
        index in interface JCoReAnnotationIndex<K extends java.lang.Comparable<K>>
        Parameters:
        a - The annotation to index.
      • search

        public java.util.stream.Stream<T> search​(org.apache.uima.jcas.tcas.Annotation a)

        Generates search terms from a via the searchTermGenerator. These terms are then used to look up annotations in the index, and the matching annotations are returned.

        It is perfectly valid, and actually a frequent use case, to search for annotations which are not themselves part of the index. When searching, for example, for (parts of) the covered text, one can search for an entity and retrieve tokens matching the entity's name.

        Parameters:
        a - The annotation that provides search terms to search for.
        Returns:
        The found annotations.
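
        The entity-to-token lookup mentioned above might look like the following sketch. It makes several assumptions: Span is a hypothetical stand-in for a UIMA annotation, both term generators are plain java.util.function.Function instances rather than IndexTermGenerator, and the index is a bare HashMap instead of this class. Tokens are indexed by their lower-cased covered text, and the entity, which is not in the index, is split into words that serve as search terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration, not part of JCoRe.
final class SearchByOtherAnnotationSketch {

    // Minimal stand-in for an annotation: offsets plus covered text.
    static final class Span {
        final int begin;
        final int end;
        final String text;

        Span(int begin, int end, String text) {
            this.begin = begin;
            this.end = end;
            this.text = text;
        }

        @Override
        public String toString() {
            return text + "@" + begin;
        }
    }

    static List<String> demo() {
        // Index terms: one term per token, its lower-cased covered text.
        Function<Span, Stream<String>> indexTerms =
                token -> Stream.of(token.text.toLowerCase(Locale.ROOT));
        // Search terms: every whitespace-separated word of the covered text.
        Function<Span, Stream<String>> searchTerms =
                entity -> Arrays.stream(entity.text.toLowerCase(Locale.ROOT).split("\\s+"));

        // Tokens of the text "Tumor protein p53 binds DNA".
        List<Span> tokens = Arrays.asList(
                new Span(0, 5, "Tumor"), new Span(6, 13, "protein"),
                new Span(14, 17, "p53"), new Span(18, 23, "binds"),
                new Span(24, 27, "DNA"));

        Map<String, List<Span>> tokenIndex = new HashMap<>();
        for (Span token : tokens) {
            indexTerms.apply(token).forEach(
                    term -> tokenIndex.computeIfAbsent(term, k -> new ArrayList<>()).add(token));
        }

        // The entity itself was never indexed; it only provides search terms.
        Span entity = new Span(0, 17, "Tumor protein p53");
        return searchTerms.apply(entity)
                .flatMap(term -> tokenIndex.getOrDefault(term, Collections.emptyList()).stream())
                .map(Span::toString)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```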
      • search

        public java.util.stream.Stream<T> search​(java.util.stream.Stream<K> searchTerms)
        Searches for the provided search terms in the index.
        Parameters:
        searchTerms - The terms used to look up annotations.
        Returns:
        The found annotations.
      • search

        public java.util.stream.Stream<T> search​(K searchTerm)
        Searches for annotations in the index by the provided search term.
        Parameters:
        searchTerm - The term to search for.
        Returns:
        The found annotations.
      • getFirst

        public T getFirst​(K searchTerm)
      • getFirst

        public T getFirst​(org.apache.uima.jcas.tcas.Annotation a)
      • get

        public T get​(K searchTerm)
      • get

        public T get​(org.apache.uima.jcas.tcas.Annotation a)
      • getIndex

        public java.util.Map<K,​java.util.Collection<T>> getIndex()
      • setIndexAnnotationStorageSupplier

        public void setIndexAnnotationStorageSupplier​(java.util.function.Supplier<java.util.Collection<T>> supplier)
        Allows changing the supplier for the internal storage of annotations, which are the values of the index map. That might be helpful when one wants to control the storage strategy in order to be able to predict the shape of search results.
        Parameters:
        supplier - A supplier that will be used to create the internal storage for indexed annotations (i.e. the values of the index map).
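
        How the storage strategy shapes search results can be seen in a small sketch that uses only java.util collections. The names StorageSupplierSketch and storedOffsets are hypothetical, and integer begin offsets stand in for annotations. An ArrayList keeps insertion order and duplicates, whereas a TreeSet returns sorted, de-duplicated values.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.function.Supplier;

// Hypothetical illustration, not part of JCoRe.
final class StorageSupplierSketch {

    // Stores the same offsets under one key; the supplier decides the
    // collection type used as the map value.
    static List<Integer> storedOffsets(Supplier<Collection<Integer>> storageSupplier) {
        Map<String, Collection<Integer>> index = new HashMap<>();
        // Offsets arrive out of order and one of them twice.
        for (int begin : new int[] {40, 7, 40, 19}) {
            index.computeIfAbsent("p53", k -> storageSupplier.get()).add(begin);
        }
        return new ArrayList<>(index.get("p53"));
    }

    public static void main(String[] args) {
        System.out.println(storedOffsets(ArrayList::new));
        System.out.println(storedOffsets(TreeSet::new));
    }
}
```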