public class HierarchicalTokenizationFilter extends Object implements TokenizationFilter
For example, the label sequence

A A|B A|B|C A|B|C A|B A A

over the tokens

w1 w2 w3 w4 w5 w6 w7

will result in LabeledSpans like <A>w1 <B>w2 <C>w3 w4</C> w5</B> w6 w7</A>

Also, labels of the form <B-field> will force a new instance of the field to begin, even if it is already active. Prefixes of I- are ignored, so you can use BIO labeling.

Created: Nov 12, 2004
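The nesting behaviour described above can be illustrated with a small standalone sketch. It does not use the library classes and is not the filter's actual implementation: `NestedSpanSketch`, `render`, and the `"O"` background label are hypothetical names chosen for illustration, and the output is inline markup rather than a `LabeledSpans` object. It assumes that each label is a `|`-separated path from the outermost to the innermost field, that a `B-` prefix forces the field at that depth to be closed and reopened, and that an `I-` prefix is simply dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, independent of the library: renders hierarchical
// labels as nested inline tags to show how spans open and close.
final class NestedSpanSketch {

    static String render(String[] labels, String[] tokens, String background) {
        if (labels.length != tokens.length) {
            throw new IllegalArgumentException("labels and tokens must have equal length");
        }
        List<String> open = new ArrayList<>();   // currently open fields, outermost first
        StringBuilder out = new StringBuilder();

        for (int i = 0; i < tokens.length; i++) {
            // Parse the label into a path of field names.
            List<String> path = new ArrayList<>();
            int forcedAt = Integer.MAX_VALUE;    // shallowest depth carrying a B- prefix
            if (!labels[i].equals(background)) {
                for (String part : labels[i].split("\\|")) {
                    if (part.startsWith("B-")) {
                        forcedAt = Math.min(forcedAt, path.size());
                        part = part.substring(2);
                    } else if (part.startsWith("I-")) {
                        part = part.substring(2); // I- carries no extra meaning here
                    }
                    path.add(part);
                }
            }

            // Keep the open fields that the new path shares, stopping at a forced restart.
            int keep = 0;
            while (keep < open.size() && keep < path.size() && keep < forcedAt
                    && open.get(keep).equals(path.get(keep))) {
                keep++;
            }

            // Close everything deeper than the shared prefix, innermost first.
            for (int j = open.size() - 1; j >= keep; j--) {
                out.append("</").append(open.remove(j)).append('>');
            }
            if (i > 0) {
                out.append(' ');
            }
            // Open the fields that are new at this token.
            for (int j = keep; j < path.size(); j++) {
                open.add(path.get(j));
                out.append('<').append(path.get(j)).append('>');
            }
            out.append(tokens[i]);
        }

        // Close whatever is still open at the end of the sequence.
        for (int j = open.size() - 1; j >= 0; j--) {
            out.append("</").append(open.remove(j)).append('>');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] labels = {"A", "A|B", "A|B|C", "A|B|C", "A|B", "A", "A"};
        String[] tokens = {"w1", "w2", "w3", "w4", "w5", "w6", "w7"};
        // prints: <A>w1 <B>w2 <C>w3 w4</C> w5</B> w6 w7</A>
        System.out.println(render(labels, tokens, "O"));
    }
}
```

Under the same assumptions, the labels `A`, `B-A` over two tokens `x y` render as `<A>x</A> <A>y</A>`, while `B-A`, `I-A` render as a single span `<A>x y</A>`.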
| Constructor and Description |
|---|
| HierarchicalTokenizationFilter() |
| HierarchicalTokenizationFilter(Pattern ignorePattern) |
| Modifier and Type | Method and Description |
|---|---|
| LabeledSpans | constructLabeledSpans(LabelAlphabet dict, Object document, Label backgroundTag, Tokenization input, Sequence seq) Converts the sequence of labels into a set of labeled spans. |
public HierarchicalTokenizationFilter()
public HierarchicalTokenizationFilter(Pattern ignorePattern)
public LabeledSpans constructLabeledSpans(LabelAlphabet dict, Object document, Label backgroundTag, Tokenization input, Sequence seq)
Specified by: constructLabeledSpans in interface TokenizationFilter

Copyright © 2019 JULIE Lab, Germany. All rights reserved.