public interface Extractor extends Serializable
| Modifier and Type | Method and Description |
|---|---|
| Extraction | extract(Object o): Performs extraction given a raw object. |
| Extraction | extract(PipeInputIterator source): Performs extraction on a set of raw documents. |
| Extraction | extract(Tokenization toks): Performs extraction from an object that has already been tokenized. |
| Pipe | getFeaturePipe(): Returns the feature pipe used by this extractor. |
| Alphabet | getInputAlphabet(): Returns an alphabet of the features used by the extractor. |
| LabelAlphabet | getTargetAlphabet(): Returns an alphabet of the labels used by the extractor. |
| Pipe | getTokenizationPipe(): Returns the pipe used by this extractor to tokenize the input. |
| void | setTokenizationPipe(Pipe pipe): Sets the pipe used by this extractor for tokenization. |
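
The three `extract` overloads accept input at different stages of preparation: a raw object, an already tokenized document, or a source of several raw documents. The sketch below shows one plausible way they could relate, with the raw-object overload tokenizing and then delegating to the tokenized overload. It is not the library's implementation. `Toks`, `Result`, `MiniExtractor`, and `CapitalizedWordExtractor` are hypothetical stand-ins for `Tokenization`, `Extraction`, `Extractor`, and a concrete extractor, and a plain `Iterator` stands in for `PipeInputIterator`, so that the example compiles without the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins only: none of these types are the library's classes.
class ExtractorSketch {

    /** Stand-in for Tokenization: a document already split into tokens. */
    static final class Toks {
        final List<String> tokens;
        Toks(List<String> tokens) { this.tokens = tokens; }
    }

    /** Stand-in for Extraction: here simply the list of extracted strings. */
    static final class Result {
        final List<String> fields = new ArrayList<>();
    }

    /** Cut-down mirror of the three extract overloads. */
    interface MiniExtractor {
        Result extract(Object o);            // raw document
        Result extract(Toks toks);           // tokenized document
        Result extract(Iterator<?> source);  // several raw documents
    }

    /** Toy extractor that treats capitalized tokens as the extracted fields. */
    static final class CapitalizedWordExtractor implements MiniExtractor {
        @Override
        public Result extract(Object o) {
            // Assumption: raw input is tokenized on whitespace, then delegated.
            String text = String.valueOf(o).trim();
            List<String> tokens = text.isEmpty()
                    ? new ArrayList<>()
                    : Arrays.asList(text.split("\\s+"));
            return extract(new Toks(tokens));
        }

        @Override
        public Result extract(Toks toks) {
            Result r = new Result();
            for (String t : toks.tokens) {
                if (!t.isEmpty() && Character.isUpperCase(t.charAt(0))) {
                    r.fields.add(t);
                }
            }
            return r;
        }

        @Override
        public Result extract(Iterator<?> source) {
            // Each element is a raw document, so the Object overload is used.
            Result all = new Result();
            while (source.hasNext()) {
                all.fields.addAll(extract(source.next()).fields);
            }
            return all;
        }
    }

    public static void main(String[] args) {
        MiniExtractor ex = new CapitalizedWordExtractor();
        System.out.println(ex.extract("Alice met Bob in Paris").fields);
        List<String> docs = Arrays.asList("Carol left", "then Dave arrived");
        System.out.println(ex.extract(docs.iterator()).fields);
    }
}
```

Routing every overload through the tokenized one keeps the extraction logic in a single place; whether a given extractor is organized this way is an assumption of the sketch.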
Extraction extract(Object o)
o - The document to extract from (often a String).
Extraction extract(Tokenization toks)
toks - A tokenized document.
Extraction extract(PipeInputIterator source)
source - A source of raw documents.
Pipe getFeaturePipe()
Pipe getTokenizationPipe()
void setTokenizationPipe(Pipe pipe)
The pipe edu.umass.cs.mallet.base.pipe.CharSequence2TokenSequence is an example of a pipe that could be used here.
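
To illustrate what replacing the tokenization pipe changes, here is a minimal sketch that does not use the library's classes. `TokenPipe`, `ToyExtractor`, `WHITESPACE`, and `regex` are hypothetical names; `TokenPipe` is a simplified stand-in for a `Pipe` that turns raw text into tokens, loosely comparable in role to a character-sequence-to-token-sequence pipe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-ins only: none of these types are the library's classes.
class TokenizationPipeSketch {

    /** Stand-in for a tokenization Pipe: raw text in, list of tokens out. */
    interface TokenPipe extends Function<String, List<String>> {}

    /** Splits on runs of whitespace and keeps punctuation attached. */
    static final TokenPipe WHITESPACE =
            text -> Arrays.asList(text.trim().split("\\s+"));

    /** Emits every match of the given regular expression as one token. */
    static TokenPipe regex(String pattern) {
        Pattern p = Pattern.compile(pattern);
        return text -> {
            List<String> out = new ArrayList<>();
            Matcher m = p.matcher(text);
            while (m.find()) {
                out.add(m.group());
            }
            return out;
        };
    }

    /** Toy object exposing the getter/setter pair for the tokenization pipe. */
    static final class ToyExtractor {
        private TokenPipe tokenizationPipe = WHITESPACE;

        TokenPipe getTokenizationPipe() { return tokenizationPipe; }

        void setTokenizationPipe(TokenPipe pipe) { this.tokenizationPipe = pipe; }

        /** Shows which tokens the extraction step would receive. */
        List<String> tokensFor(String doc) { return tokenizationPipe.apply(doc); }
    }

    public static void main(String[] args) {
        ToyExtractor ex = new ToyExtractor();
        System.out.println(ex.tokensFor("Hello, world!")); // [Hello,, world!]

        // Swap in a tokenizer that keeps only runs of letters.
        ex.setTokenizationPipe(regex("\\p{L}+"));
        System.out.println(ex.tokensFor("Hello, world!")); // [Hello, world]
    }
}
```

The same document yields different tokens after the setter is called, which is why the choice of tokenization pipe affects everything the extractor sees downstream.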
Alphabet getInputAlphabet()
LabelAlphabet getTargetAlphabet()
Copyright © 2019 JULIE Lab, Germany. All rights reserved.