Package opennlp.tools.namefind
Class NameFinderEventStream
All Implemented Interfaces:
AutoCloseable, ObjectStream<Event>
Class for creating an event stream out of data files for training a
TokenNameFinder.
Constructor Summary
Constructors:
NameFinderEventStream(ObjectStream<NameSample> dataStream, String type, NameContextGenerator contextGenerator, SequenceCodec<String> codec)
Method Summary
Modifier and Type, Method, Description:
static String[][] additionalContext(String[] tokens, Map<String, String> prevMap)
Generates previous decision features for each token based on the contents of the specified prevMap.
static List<Event> generateEvents(String[] sentence, String[] outcomes, NameContextGenerator cg)
Generates events for each token in a sentence with the specified outcomes using the specified NameContextGenerator.
static String[] generateOutcomes(Span[] names, String type, int length)
Deprecated.
Methods inherited from class opennlp.tools.util.AbstractEventStream:
close, read, reset
Constructor Details
NameFinderEventStream
public NameFinderEventStream(ObjectStream<NameSample> dataStream, String type, NameContextGenerator contextGenerator, SequenceCodec<String> codec)
Parameters:
dataStream - The data stream of NameSample objects from which events are generated.
type - null, or a type that overrides the type parameter in the provided samples.
contextGenerator - The NameContextGenerator used to generate features for the event stream.
codec - The SequenceCodec to use.
Method Details
generateOutcomes
public static String[] generateOutcomes(Span[] names, String type, int length)
Deprecated. Use the BioCodec implementation of the SequenceValidator instead!
Generates the name tag outcomes (start, continue, other) for each token in a sentence with the specified length using the specified names.
Parameters:
names - Token spans for each of the names.
type - null, or a type that overrides the type parameter in the provided samples.
length - The length of the sentence.
Returns:
An array of start, continue, other outcomes based on the specified names and sentence length.
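The start/continue/other labelling described above can be illustrated with a small standalone sketch. It does not use the OpenNLP classes: SpanSketch is a hypothetical stand-in for Span, and the literal label strings, the "type-" prefix format, and the rule that a non-null type argument replaces the span's own type are assumptions drawn from the parameter descriptions, not confirmed library behavior.

```java
import java.util.Arrays;

// Hypothetical stand-in for a token span: covers tokens [start, end) and carries a type.
final class SpanSketch {
    final int start;
    final int end;
    final String type;

    SpanSketch(int start, int end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }
}

class OutcomeSketch {
    // Label strings are assumptions based on the description (start, continue, other).
    static final String START = "start";
    static final String CONTINUE = "continue";
    static final String OTHER = "other";

    static String[] generateOutcomes(SpanSketch[] names, String type, int length) {
        String[] outcomes = new String[length];
        // Every token is "other" unless a name span covers it.
        Arrays.fill(outcomes, OTHER);
        for (SpanSketch name : names) {
            // Assumption: a non-null type argument overrides the span's own type.
            String effectiveType = (type != null) ? type : name.type;
            String prefix = (effectiveType == null) ? "" : effectiveType + "-";
            // First token of the span starts the name, the rest continue it.
            outcomes[name.start] = prefix + START;
            for (int i = name.start + 1; i < name.end; i++) {
                outcomes[i] = prefix + CONTINUE;
            }
        }
        return outcomes;
    }

    public static void main(String[] args) {
        // "Pierre Vinken joined the board": tokens 0 and 1 form a person name.
        SpanSketch[] names = { new SpanSketch(0, 2, "person") };
        System.out.println(Arrays.toString(generateOutcomes(names, null, 5)));
        // prints [person-start, person-continue, other, other, other]
    }
}
```

Because the method is deprecated, new code should obtain outcomes through a SequenceCodec such as BioCodec; the sketch is only meant to show the shape of the returned array.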
generateEvents
public static List<Event> generateEvents(String[] sentence, String[] outcomes, NameContextGenerator cg)
Generates events for each token in a sentence with the specified outcomes using the specified NameContextGenerator.
Parameters:
sentence - Tokens representing a sentence.
outcomes - An array of outcomes.
cg - The NameContextGenerator to use.
Returns:
A list of events generated.
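As a rough illustration of the token-by-token pairing described above, the following standalone sketch builds one event per token from an outcome and a feature array. ContextGen and EventSketch are hypothetical stand-ins for NameContextGenerator and Event, and the toy feature names ("w=", "prev=") are invented for the example; the real context generator takes different arguments and produces different features.

```java
import java.util.ArrayList;
import java.util.List;

class EventSketchDemo {
    // Hypothetical, simplified stand-in for a context generator.
    interface ContextGen {
        String[] getContext(int index, String[] tokens, String[] outcomes);
    }

    // Hypothetical stand-in for an event: one outcome plus its context features.
    static final class EventSketch {
        final String outcome;
        final String[] context;

        EventSketch(String outcome, String[] context) {
            this.outcome = outcome;
            this.context = context;
        }
    }

    static List<EventSketch> generateEvents(String[] sentence, String[] outcomes, ContextGen cg) {
        if (sentence.length != outcomes.length) {
            throw new IllegalArgumentException("one outcome per token is required");
        }
        List<EventSketch> events = new ArrayList<>(outcomes.length);
        // One event per token: the token's outcome paired with its features.
        for (int i = 0; i < outcomes.length; i++) {
            events.add(new EventSketch(outcomes[i], cg.getContext(i, sentence, outcomes)));
        }
        return events;
    }

    public static void main(String[] args) {
        String[] sentence = { "Pierre", "Vinken", "joined" };
        String[] outcomes = { "person-start", "person-continue", "other" };
        // Toy features: the lowercased token and the previous token's outcome.
        ContextGen toy = (i, toks, outs) -> new String[] {
            "w=" + toks[i].toLowerCase(),
            "prev=" + (i == 0 ? "none" : outs[i - 1])
        };
        for (EventSketch e : generateEvents(sentence, outcomes, toy)) {
            System.out.println(e.outcome + " " + String.join(" ", e.context));
        }
    }
}
```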
additionalContext
public static String[][] additionalContext(String[] tokens, Map<String, String> prevMap)
Generates previous decision features for each token based on the contents of the specified prevMap.
Parameters:
tokens - The tokens for which the context is generated.
prevMap - A mapping of tokens to their previous decisions.
Returns:
A 2-dimensional array with additional context features for each token.
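A minimal standalone sketch of the lookup described above follows. It assumes one feature per token, formed by prefixing the token's previous decision with "pd="; both the prefix and the single-feature layout are assumptions for illustration, and a token without an entry in prevMap simply yields a feature built from null.

```java
import java.util.HashMap;
import java.util.Map;

class AdditionalContextSketch {
    static String[][] additionalContext(String[] tokens, Map<String, String> prevMap) {
        // Outer index: token position. Inner array: that token's extra features.
        String[][] ac = new String[tokens.length][1];
        for (int i = 0; i < tokens.length; i++) {
            // Assumed feature format: "pd=" followed by the previous decision (or null).
            ac[i][0] = "pd=" + prevMap.get(tokens[i]);
        }
        return ac;
    }

    public static void main(String[] args) {
        Map<String, String> prevMap = new HashMap<>();
        prevMap.put("Vinken", "person-start");
        String[][] ac = additionalContext(new String[] { "Vinken", "joined" }, prevMap);
        System.out.println(ac[0][0]); // prints pd=person-start
        System.out.println(ac[1][0]); // prints pd=null
    }
}
```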