public class ChunkerFeatureExtractor extends Object implements FeatureExtractor<CharSequence>, Serializable
ChunkerFeatureExtractor implements a feature extractor
for character sequences based on a specified chunker. Feature
names are derived from the chunk types optionally concatenated to
the phrase making up the chunk. Feature values are the count of
their occurrences.
For instance, if a chunker were to return a chunk of type PER spanning the phrase John and a chunk of type LOC spanning the phrase New York, then the features will
be PER:1, LOC:1 if the phrases are not included and
PER_John:1, LOC_New York:1 if they are included. If the phrase John
had shown up three times as a PER chunk, the value for PER_John would
be 3 (assuming phrases are included).
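The naming and counting rule described above can be sketched in plain Java. This is an illustration only, not LingPipe's implementation: the class `ChunkFeatureSketch`, the helper type `SimpleChunk`, and the static method `features` are hypothetical stand-ins, and the chunks are supplied by hand rather than produced by a real Chunker. The `_` separator follows the PER_John example in the description.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ChunkFeatureSketch {

    // Hypothetical stand-in for a chunk: a type plus character offsets
    // (start inclusive, end exclusive) into the input sequence.
    public static final class SimpleChunk {
        final String type;
        final int start;
        final int end;

        public SimpleChunk(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    // Counts one feature per chunk. The feature name is the chunk type,
    // optionally followed by "_" and the phrase the chunk spans.
    public static Map<String, Integer> features(CharSequence in,
                                                List<SimpleChunk> chunks,
                                                boolean includePhrase) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (SimpleChunk c : chunks) {
            String name = includePhrase
                    ? c.type + "_" + in.subSequence(c.start, c.end)
                    : c.type;
            counts.merge(name, 1, Integer::sum); // increment, starting at 1
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "John flew to New York.";
        List<SimpleChunk> chunks = Arrays.asList(
                new SimpleChunk("PER", 0, 4),    // "John"
                new SimpleChunk("LOC", 13, 21)); // "New York"

        System.out.println(features(text, chunks, false)); // {PER=1, LOC=1}
        System.out.println(features(text, chunks, true));  // {PER_John=1, LOC_New York=1}
    }
}
```

With the real class, the chunks would come from the Chunker passed to the constructor, and the returned map is typed as Map<String,? extends Number> rather than Map<String,Integer>.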
| Constructor and Description |
|---|
| ChunkerFeatureExtractor(Chunker chunker, boolean includePhrase): Construct a new chunker feature extractor based on the specified chunker, including the phrases extracted if the specified flag is true. |
| Modifier and Type | Method and Description |
|---|---|
| Map<String,? extends Number> | features(CharSequence in): Return the feature vector for the specified input. |
public ChunkerFeatureExtractor(Chunker chunker, boolean includePhrase)
Parameters:
chunker - Base chunker for the extractor.
includePhrase - Set to true to append the phrase derived from the chunk to the feature name.

public Map<String,? extends Number> features(CharSequence in)
Specified by: features in interface FeatureExtractor<CharSequence>
Parameters:
in - Input object.

Copyright © 2016 Alias-i, Inc. All rights reserved.