Package de.jungblut.datastructure
Class InvertedIndex<DOCUMENT_TYPE,KEY_TYPE>
- java.lang.Object
- de.jungblut.datastructure.InvertedIndex<DOCUMENT_TYPE,KEY_TYPE>
- Type Parameters:
DOCUMENT_TYPE - the type of document one wants to retrieve.
KEY_TYPE - the type of key that is extracted from documents and is searchable (needs hashCode and equals implementations).
public final class InvertedIndex<DOCUMENT_TYPE,KEY_TYPE> extends java.lang.Object
Inverted index, mainly developed for sparse vectors to speed up dimension lookups for fast distance measurement and search space reduction. It can also be used as a full-text index to find relevant documents by their textual representation.
- Author:
- thomas.jungblut
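The idea described above, mapping each non-zero dimension to the vectors that contain it so that a query only has to look at vectors sharing a dimension, can be sketched in plain Java. This is a minimal illustration and not this class's implementation: `SparseIndexSketch`, its methods, and the use of `Map<Integer, Double>` as a stand-in for a sparse vector are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not part of de.jungblut.datastructure.
final class SparseIndexSketch {

    // dimension -> ids of the vectors that are non-zero in that dimension
    private final Map<Integer, List<Integer>> postings = new HashMap<>();

    // Each sparse vector is modeled as a map from dimension to non-zero value;
    // a vector's id is its position in the input list.
    void build(List<Map<Integer, Double>> vectors) {
        for (int id = 0; id < vectors.size(); id++) {
            for (Integer dimension : vectors.get(id).keySet()) {
                postings.computeIfAbsent(dimension, k -> new ArrayList<>()).add(id);
            }
        }
    }

    // Ids of all indexed vectors that share at least one non-zero dimension
    // with the query. Only these candidates need a distance computation.
    Set<Integer> candidates(Map<Integer, Double> query) {
        Set<Integer> result = new LinkedHashSet<>();
        for (Integer dimension : query.keySet()) {
            result.addAll(postings.getOrDefault(dimension, List.of()));
        }
        return result;
    }
}
```

Vectors that have no dimension in common with the query never appear in the candidate set, which is where the search space reduction comes from.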
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description):
- static interface InvertedIndex.DocumentDistanceMeasurer<DOCUMENT_TYPE,KEY_TYPE>: Measurer that measures the distance between two documents.
- static interface InvertedIndex.DocumentMapper<DOCUMENT_TYPE,KEY_TYPE>: Mapper that maps a document to its keys.
-
Method Summary
Methods (Modifier and Type, Method, Description):
- void build(java.util.List<DOCUMENT_TYPE> items): Builds this inverted index.
- static <KEY_TYPE,DOCUMENT_TYPE> InvertedIndex<DOCUMENT_TYPE,KEY_TYPE> create(InvertedIndex.DocumentMapper<DOCUMENT_TYPE,KEY_TYPE> mapper, InvertedIndex.DocumentDistanceMeasurer<DOCUMENT_TYPE,KEY_TYPE> measurer): Creates an inverted index out of two mapping interfaces: a mapper that maps a document to its key parts and a distance measurer that measures the distance between two documents.
- static InvertedIndex<de.jungblut.math.DoubleVector,java.lang.Integer> createVectorIndex(DistanceMeasurer measurer): Creates an inverted index for vectors (usually sparse vectors) that maps dimensions to the corresponding vectors if they are non-zero.
- java.util.List<DistanceResult<DOCUMENT_TYPE>> query(DOCUMENT_TYPE document): Queries this inverted index.
- java.util.List<DistanceResult<DOCUMENT_TYPE>> query(DOCUMENT_TYPE document, double minDistance): Queries this inverted index.
- java.util.List<DistanceResult<DOCUMENT_TYPE>> query(DOCUMENT_TYPE document, int maxResults, double minDistance): Queries this inverted index.
-
Method Detail
-
build
public void build(java.util.List<DOCUMENT_TYPE> items)
Builds this inverted index.
- Parameters:
items - the items that need to be indexed.
-
query
public java.util.List<DistanceResult<DOCUMENT_TYPE>> query(DOCUMENT_TYPE document)
Queries this inverted index. The result is not bounded, so you get all items.
- Parameters:
document - the document to query with.
- Returns:
- a list of results sorted in descending order, so the best matching item is at the first index.
-
query
public java.util.List<DistanceResult<DOCUMENT_TYPE>> query(DOCUMENT_TYPE document, double minDistance)
Queries this inverted index. The number of results is not bounded, so you get all items that have at least minDistance.
- Parameters:
document - the document to query with.
minDistance - the minimum (lower than: <=) distance the items should have.
- Returns:
- a list of results sorted in descending order, so the best matching item is at the first index.
-
query
public java.util.List<DistanceResult<DOCUMENT_TYPE>> query(DOCUMENT_TYPE document, int maxResults, double minDistance)
Queries this inverted index.
- Parameters:
document - the document to query with.
maxResults - the maximum number of results to obtain.
minDistance - the minimum (lower than: <=) distance the items should have.
- Returns:
- an array list of results sorted in descending order, so the best matching item is at the first index.
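How a distance threshold and a result cap can combine is sketched below. This is an illustration under stated assumptions, not the method's actual code: `QueryFilterSketch` and `Scored` are hypothetical names, the threshold is applied with `<=` as the parameter description suggests, and "best match first" is taken to mean smallest distance first, which the documentation does not spell out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not part of de.jungblut.datastructure.
final class QueryFilterSketch {

    // Hypothetical result holder: a document id and its distance to the query.
    static final class Scored {
        final int id;
        final double distance;

        Scored(int id, double distance) {
            this.id = id;
            this.distance = distance;
        }
    }

    // Keeps results whose distance is <= threshold, orders them so the
    // smallest distance comes first (an assumption about "best match"),
    // and returns at most maxResults of them. maxResults must be >= 0.
    static List<Scored> filter(List<Scored> scored, int maxResults, double threshold) {
        List<Scored> kept = new ArrayList<>();
        for (Scored s : scored) {
            if (s.distance <= threshold) {
                kept.add(s);
            }
        }
        kept.sort(Comparator.comparingDouble(s -> s.distance));
        return new ArrayList<>(kept.subList(0, Math.min(maxResults, kept.size())));
    }
}
```

With a similarity score instead of a distance, the comparison and the sort order would both flip, so it is worth checking the measurer in use before relying on either direction.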
-
create
public static <KEY_TYPE,DOCUMENT_TYPE> InvertedIndex<DOCUMENT_TYPE,KEY_TYPE> create(InvertedIndex.DocumentMapper<DOCUMENT_TYPE,KEY_TYPE> mapper, InvertedIndex.DocumentDistanceMeasurer<DOCUMENT_TYPE,KEY_TYPE> measurer)
Creates an inverted index out of two mapping interfaces: a mapper that maps a document to its key parts and a distance measurer that measures the distance between two documents.
- Parameters:
mapper - the InvertedIndex.DocumentMapper.
measurer - the InvertedIndex.DocumentDistanceMeasurer.
- Returns:
- a brand new inverted index.
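For the full-text use case, the two roles could look roughly like the following. The method signatures of InvertedIndex.DocumentMapper and InvertedIndex.DocumentDistanceMeasurer are not shown on this page, so the sketch uses plain static methods with hypothetical names (`mapToKeys`, `jaccardDistance`) rather than implementing those interfaces; the choice of whitespace tokenization and Jaccard distance is likewise only an assumption for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not part of de.jungblut.datastructure.
final class TextMappingSketch {

    // Mapper role: split a text into its distinct lower-case words,
    // which then serve as the searchable keys of the document.
    static Set<String> mapToKeys(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().trim().split("\\s+")));
    }

    // Measurer role: Jaccard distance between two key sets,
    // 0.0 for identical sets and 1.0 for disjoint ones.
    static double jaccardDistance(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return 1.0 - (double) intersection.size() / unionSize;
    }
}
```

Because the keys are stored in hash-based collections, the key type needs consistent hashCode and equals implementations, which String already provides.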
-
createVectorIndex
public static InvertedIndex<de.jungblut.math.DoubleVector,java.lang.Integer> createVectorIndex(DistanceMeasurer measurer)
Creates an inverted index for vectors (usually sparse vectors) that maps dimensions to the corresponding vectors if they are non-zero.
- Parameters:
measurer - the distance measurer on two vectors.
- Returns:
- a brand new inverted index.
-