Class JCoReMapAnnotationIndex<K extends Comparable<K>,​T extends org.apache.uima.jcas.tcas.Annotation>

  • Type Parameters:
    T - The annotation type the index is over.
    K - The key type used to index the annotations.
    C - The collection type (e.g. ArrayList) used to store annotations in the index.
    U - The collection type (e.g. TreeSet) used to return search results.
    All Implemented Interfaces:
    JCoReAnnotationIndex<T>
    Direct Known Subclasses:
    JCoReHashMapAnnotationIndex, JCoReTreeMapAnnotationIndex

    public class JCoReMapAnnotationIndex<K extends Comparable<K>,​T extends org.apache.uima.jcas.tcas.Annotation>
    extends Object
    implements JCoReAnnotationIndex<T>

    Use when: You want to access annotations by some arbitrary key that should be computed once and then only used for access. Also use it when single annotations might or should be associated with multiple keys.

    This class builds a map from arbitrary keys to collections of annotations. For convenient access, the class takes generators for index and search terms as well as suppliers for the actual collection implementations that should be used within the index and for search results. Thus, it is essentially a convenience framework around a map.
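
The idea described above can be illustrated with a minimal sketch. It is not the JCoRe implementation: MapIndexSketch, its constructor, and the use of java.util.function.Function in place of IndexTermGenerator are assumptions made for illustration, and plain strings stand in for annotations so that the example runs without UIMA.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical, simplified stand-in for the concept; NOT the JCoRe class.
class MapIndexSketch<K extends Comparable<K>, T> {
    // The index itself: each key maps to a collection of indexed items.
    private final Map<K, Collection<T>> index;
    // Assumption: a plain Function plays the role of the index term generator.
    private final Function<T, Stream<K>> indexTermGenerator;
    // Supplier for the collections stored as map values.
    private final Supplier<Collection<T>> storageSupplier = ArrayList::new;

    MapIndexSketch(Supplier<Map<K, Collection<T>>> indexMapSupplier,
                   Function<T, Stream<K>> indexTermGenerator) {
        this.index = indexMapSupplier.get();
        this.indexTermGenerator = indexTermGenerator;
    }

    // Associates the item with every key generated for it.
    void index(T item) {
        indexTermGenerator.apply(item)
                .forEach(key -> index.computeIfAbsent(key, k -> storageSupplier.get()).add(item));
    }

    // Looks up the items stored under a single key.
    Stream<T> search(K searchTerm) {
        Collection<T> hits = index.get(searchTerm);
        return hits == null ? Stream.<T>empty() : hits.stream();
    }

    // Builds a small index over strings and runs one search.
    static List<String> demo(String searchTerm) {
        MapIndexSketch<String, String> idx = new MapIndexSketch<String, String>(
                HashMap::new,
                // Two keys per item: the lowercased string and its lowercased first letter.
                s -> Stream.of(s.toLowerCase(), s.substring(0, 1).toLowerCase()));
        Stream.of("Aspirin", "Albumin", "Insulin").forEach(idx::index);
        return idx.search(searchTerm).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo("a"));       // [Aspirin, Albumin]
        System.out.println(demo("insulin")); // [Insulin]
        System.out.println(demo("x"));       // []
    }
}
```

Each item is stored under two keys here, which mirrors the case where a single annotation is associated with multiple keys.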

    Author:
    faessler
    • Constructor Detail

      • JCoReMapAnnotationIndex

        public JCoReMapAnnotationIndex​(Supplier<Map<K,​Collection<T>>> indexMapSupplier,
                                       IndexTermGenerator<K> indexTermGenerator,
                                       IndexTermGenerator<K> searchTermGenerator,
                                       org.apache.uima.jcas.JCas jCas,
                                       org.apache.uima.cas.Type type)
        This is the full constructor of the map index. It takes parameters for virtually every aspect of the index. For a quicker start, you might want to refer to one of its subclasses, e.g. JCoReHashMapAnnotationIndex. Using this constructor immediately builds the index from the given jCas for the annotation type given by type.
        Parameters:
        indexMapSupplier - A supplier for the map that should be used as the index.
        indexTermGenerator - Generates index terms of generic parameter type K. Those index terms will be extracted from indexed annotations.
        searchTermGenerator - Generates search terms of generic parameter type K. The index will retrieve all IndexEntry items in the index that match one of the generated terms. This may be the very same term generator passed for indexTermGenerator.
        indexAnnotationCollectionSupplier - A supplier for the collection data structure used to store annotations in the index. When a search produces a single index hit, this data structure is returned directly to save time, provided it is compatible with the return type specified by the generic type parameter U (search result return type). This way the desired output structure can be specified (e.g. a TreeSet with a specific comparator).
        resultCollectionSupplier - If searchTermGenerator generates multiple search terms for a search, the index annotation collection is not returned directly; instead, this supplier is used to create a new collection that holds the search results for all search terms. Thus, when searchTermGenerator is expected to often generate multiple search terms, the indexAnnotationCollectionSupplier should create a collection that is efficient for adding and iterating, and the resultCollectionSupplier should create a collection that reflects the desired output format.
        jCas - A JCas containing annotations that should be indexed.
        type - The UIMA type system type, belonging to jCas, that should be indexed.
      • JCoReMapAnnotationIndex

        public JCoReMapAnnotationIndex​(Supplier<Map<K,​Collection<T>>> indexMapSupplier,
                                       IndexTermGenerator<K> indexTermGenerator,
                                       IndexTermGenerator<K> searchTermGenerator,
                                       org.apache.uima.jcas.JCas jCas,
                                       int type)
        This is the full constructor of the map index. It takes parameters for virtually every aspect of the index. For a quicker start, you might want to refer to one of its subclasses, e.g. JCoReHashMapAnnotationIndex. Using this constructor immediately builds the index from the given jCas for the annotation type given by type.
        Parameters:
        indexMapSupplier - A supplier for the map that should be used as the index.
        indexTermGenerator - Generates index terms of generic parameter type K. Those index terms will be extracted from indexed annotations.
        searchTermGenerator - Generates search terms of generic parameter type K. The index will retrieve all IndexEntry items in the index that match one of the generated terms. This may be the very same term generator passed for indexTermGenerator.
        indexAnnotationCollectionSupplier - A supplier for the collection data structure used to store annotations in the index. When a search produces a single index hit, this data structure is returned directly to save time, provided it is compatible with the return type specified by the generic type parameter U (search result return type). This way the desired output structure can be specified (e.g. a TreeSet with a specific comparator).
        resultCollectionSupplier - If searchTermGenerator generates multiple search terms for a search, the index annotation collection is not returned directly; instead, this supplier is used to create a new collection that holds the search results for all search terms. Thus, when searchTermGenerator is expected to often generate multiple search terms, the indexAnnotationCollectionSupplier should create a collection that is efficient for adding and iterating, and the resultCollectionSupplier should create a collection that reflects the desired output format.
        jCas - A JCas containing annotations that should be indexed.
        type - The UIMA type system type, belonging to jCas, that should be indexed.
      • JCoReMapAnnotationIndex

        public JCoReMapAnnotationIndex​(Supplier<Map<K,​Collection<T>>> indexMapSupplier,
                                       IndexTermGenerator<K> indexTermGenerator,
                                       IndexTermGenerator<K> searchTermGenerator)
        Parameters:
        indexMapSupplier - A supplier for the map that should be used as the index.
        indexTermGenerator - Generates index terms of generic parameter type K. Those index terms will be extracted from indexed annotations.
        searchTermGenerator - Generates search terms of generic parameter type K. The index will retrieve all IndexEntry items in the index that match one of the generated terms. This may be the very same term generator passed for indexTermGenerator.
        indexAnnotationCollectionSupplier - A supplier for the collection data structure used to store annotations in the index. When a search produces a single index hit, this data structure is returned directly to save time, provided it is compatible with the return type specified by the generic type parameter U (search result return type). This way the desired output structure can be specified (e.g. a TreeSet with a specific comparator).
        resultCollectionSupplier - If searchTermGenerator generates multiple search terms for a search, the index annotation collection is not returned directly; instead, this supplier is used to create a new collection that holds the search results for all search terms. Thus, when searchTermGenerator is expected to often generate multiple search terms, the indexAnnotationCollectionSupplier should create a collection that is efficient for adding and iterating, and the resultCollectionSupplier should create a collection that reflects the desired output format.
    • Method Detail

      • index

        public void index​(org.apache.uima.jcas.JCas jCas,
                          int type)
        Indexes the whole contents of the CAS annotation index of type type. For each annotation, the indexTermGenerator is used to create the terms with which the annotation is associated in the index and by which it can be retrieved through a search method.
        Specified by:
        index in interface JCoReAnnotationIndex<K extends Comparable<K>>
        Parameters:
        jCas - A CAS instance.
        type - The annotation type to index.
      • index

        public void index​(org.apache.uima.jcas.JCas jCas,
                          org.apache.uima.cas.Type type)
        Indexes the whole contents of the CAS annotation index of type type. For each annotation, the indexTermGenerator is used to create the terms with which the annotation is associated in the index and by which it can be retrieved through a search method.
        Specified by:
        index in interface JCoReAnnotationIndex<K extends Comparable<K>>
        Parameters:
        jCas - A CAS instance.
        type - The annotation type to index.
      • index

        public void index​(T a)
        Indexes the annotation a. The indexTermGenerator is used to create the terms with which the annotation is associated in the index and by which it can be retrieved through a search method.
        Specified by:
        index in interface JCoReAnnotationIndex<K extends Comparable<K>>
        Parameters:
        a - The annotation to index.
      • search

        public Stream<T> search​(org.apache.uima.jcas.tcas.Annotation a)

        Generates search terms from a via the searchTermGenerator. These terms are then used to look up annotations in the index, and the matching annotations are returned.

        It is perfectly valid, and actually a frequent use case, to search with annotations that are not themselves part of the index. When searching, for example, by (parts of) the covered text, one can search with an entity and retrieve the tokens matching the entity's name.
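
The entity-to-token lookup mentioned above can be sketched as follows. This is a self-contained illustration under stated assumptions, not the library's code: Span is a hypothetical stand-in for a UIMA annotation, the tokens are indexed by their covered text, and the search terms are produced by splitting the entity's covered text on whitespace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CrossSearchSketch {
    // Hypothetical stand-in for an annotation: a begin/end span over a document text.
    static final class Span {
        final String doc;
        final int begin;
        final int end;

        Span(String doc, int begin, int end) {
            this.doc = doc;
            this.begin = begin;
            this.end = end;
        }

        String coveredText() {
            return doc.substring(begin, end);
        }
    }

    static List<String> demo() {
        String doc = "Interleukin 2 binds receptors";
        // Only the tokens are indexed.
        List<Span> tokens = Arrays.asList(
                new Span(doc, 0, 11), new Span(doc, 12, 13),
                new Span(doc, 14, 19), new Span(doc, 20, 29));
        Map<String, List<Span>> index = new HashMap<>();
        for (Span token : tokens) {
            // Index term: the token's covered text.
            index.computeIfAbsent(token.coveredText(), k -> new ArrayList<>()).add(token);
        }
        // The entity is NOT in the index; it only provides the search terms.
        Span entity = new Span(doc, 0, 13);
        return Arrays.stream(entity.coveredText().split("\\s+"))
                .flatMap(term -> index.getOrDefault(term, new ArrayList<>()).stream())
                .map(span -> span.begin + "-" + span.end + ":" + span.coveredText())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [0-11:Interleukin, 12-13:2]
    }
}
```

The index terms and the search terms are generated differently here (whole covered text versus whitespace-separated parts), which is the situation in which separate index and search term generators are useful.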

        Parameters:
        a - The annotation that provides search terms to search for.
        Returns:
        The found annotations.
      • search

        public Stream<T> search​(Stream<K> searchTerms)
        Searches for the provided search terms in the index.
        Parameters:
        searchTerms - The terms used to look up annotations.
        Returns:
        The found annotations.
      • search

        public Stream<T> search​(K searchTerm)
        Searches for annotations in the index by the provided search term.
        Parameters:
        searchTerm - The term to search for.
        Returns:
        The found annotations.
      • getFirst

        public T getFirst​(K searchTerm)
      • getFirst

        public T getFirst​(org.apache.uima.jcas.tcas.Annotation a)
      • get

        public T get​(K searchTerm)
      • get

        public T get​(org.apache.uima.jcas.tcas.Annotation a)
      • setIndexAnnotationStorageSupplier

        public void setIndexAnnotationStorageSupplier​(Supplier<Collection<T>> supplier)
        Allows changing the supplier for the internal storage of annotations, i.e. the collections that are the values of the index map. This can be helpful when one wants to control the storage strategy in order to predict the shape of search results.
        Parameters:
        supplier - A supplier that will be used to create the internal storage for indexed annotations (i.e. the values of the index map).
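
To illustrate why the storage supplier affects the shape of search results, here is a small sketch using only java.util collections. StorageSupplierSketch and indexWith are hypothetical names, and integers (think of annotation begin offsets) stand in for annotations; the sketch does not call the JCoRe API.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.function.Supplier;

class StorageSupplierSketch {
    // Stores the same values under one key, using the given storage strategy,
    // and returns them in the iteration order of the stored collection.
    static List<Integer> indexWith(Supplier<Collection<Integer>> storageSupplier) {
        Map<String, Collection<Integer>> index = new HashMap<>();
        for (int value : new int[] {30, 10, 20, 10}) {
            index.computeIfAbsent("key", k -> storageSupplier.get()).add(value);
        }
        return new ArrayList<>(index.get("key"));
    }

    public static void main(String[] args) {
        // ArrayList keeps insertion order and duplicates.
        System.out.println(indexWith(ArrayList::new)); // [30, 10, 20, 10]
        // TreeSet sorts the values and drops duplicates.
        System.out.println(indexWith(TreeSet::new));   // [10, 20, 30]
    }
}
```

With a list-like storage, hits come back in indexing order and may repeat; with a sorted set, they come back ordered and unique.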