Package opennlp.tools.formats.ad
Class ADNameSampleStream
java.lang.Object
opennlp.tools.formats.ad.ADNameSampleStream
- All Implemented Interfaces:
AutoCloseable, ObjectStream<NameSample>
Parser for the Floresta Sintá(c)tica Árvores Deitadas corpus, producing output for
Portuguese NER training.
The data contains ten named entity types: Person, Organization, Group,
Place, Event, ArtProd, Abstract, Thing, Time and Numeric.
Data can be found on this web site.
Information about the format:
Susana Afonso.
"Árvores deitadas: Descrição do formato e das opções de análise na Floresta Sintáctica".
12 de Fevereiro de 2006.
Detailed info about the NER tagset.
Note: Do not use this class, internal use only!
-
Constructor Summary
Constructors
ADNameSampleStream(InputStreamFactory in, String charsetName, boolean splitHyphenatedTokens)
    Deprecated.
ADNameSampleStream(ObjectStream<String> lineStream, boolean splitHyphenatedTokens)
    Initializes a new ADNameSampleStream from an ObjectStream<String>, which could be a PlainTextByLineStream object.
Method Summary
Modifier and Type, Method, Description
void close()
    Closes the ObjectStream and releases all allocated resources.
NameSample read()
    Returns the next ObjectStream object.
void reset()
    Repositions the stream at the beginning so that the previously seen object sequence is repeated exactly.
-
Constructor Details
-
ADNameSampleStream
Initializes a new ADNameSampleStream from an ObjectStream<String>, which could be a PlainTextByLineStream object.
- Parameters:
lineStream - An ObjectStream<String> as input.
splitHyphenatedTokens - If true, hyphenated tokens will be separated: "carros-monstro" > "carros" "-" "monstro".
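The effect of splitHyphenatedTokens can be illustrated with a small standalone sketch. The class HyphenSplitSketch and the method splitHyphenated are hypothetical names made up for this illustration; they are not part of OpenNLP, and the library's own splitting logic may treat edge cases differently.

```java
import java.util.ArrayList;
import java.util.List;

final class HyphenSplitSketch {

    // Hypothetical helper (not an OpenNLP API): splits a token at every hyphen
    // and keeps each hyphen as a token of its own.
    static List<String> splitHyphenated(String token) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : token.toCharArray()) {
            if (c == '-') {
                // Flush the text collected so far, then emit the hyphen itself.
                if (current.length() > 0) {
                    parts.add(current.toString());
                    current.setLength(0);
                }
                parts.add("-");
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        // prints [carros, -, monstro]
        System.out.println(splitHyphenated("carros-monstro"));
    }
}
```

A token without a hyphen comes back unchanged as a single-element list, which corresponds to leaving splitHyphenatedTokens set to false for that token.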
-
ADNameSampleStream
@Deprecated
public ADNameSampleStream(InputStreamFactory in, String charsetName, boolean splitHyphenatedTokens) throws IOException
Deprecated.
Initializes a new ADNameSampleStream from an InputStreamFactory.
- Parameters:
in - The corpus InputStreamFactory.
charsetName - The charset to use for reading the corpus.
splitHyphenatedTokens - If true, hyphenated tokens will be separated: "carros-monstro" > "carros" "-" "monstro".
- Throws:
IOException
-
-
Method Details
-
read
Description copied from interface: ObjectStream
Returns the next ObjectStream object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.
- Specified by:
read in interface ObjectStream<NameSample>
- Returns:
The next object or null to signal that the stream is exhausted.
- Throws:
IOException - Thrown if there is an error during reading.
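The read-until-null contract described above can be sketched without the library. SimpleStream, fromList and drain are hypothetical stand-ins invented for this example; only the null-terminated read() behaviour is taken from the description, and the choice to wrap IOException in UncheckedIOException is an assumption made to keep the helper easy to call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class ReadLoopSketch {

    // Hypothetical stand-in for the read() part of the ObjectStream contract.
    interface SimpleStream<T> {
        T read() throws IOException;
    }

    // Builds a stream over a list; read() returns null once the list is exhausted.
    static <T> SimpleStream<T> fromList(List<T> items) {
        Iterator<T> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    // Calls read() until it returns null and collects every object exactly once.
    static <T> List<T> drain(SimpleStream<T> stream) {
        List<T> seen = new ArrayList<>();
        try {
            T item;
            while ((item = stream.read()) != null) {
                seen.add(item);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        // prints [sample 1, sample 2]
        System.out.println(drain(fromList(List.of("sample 1", "sample 2"))));
    }
}
```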
-
reset
Description copied from interface: ObjectStream
Repositions the stream at the beginning so that the previously seen object sequence is repeated exactly. This method can be used to re-read the stream if multiple passes over the objects are required.
The implementation of this method is optional.
- Specified by:
reset in interface ObjectStream<NameSample>
- Throws:
IOException - Thrown if there is an error during resetting the stream.
UnsupportedOperationException - Thrown if reset() is not supported. By default, this is the case.
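Because reset() is optional, a caller that needs several passes may want to guard against UnsupportedOperationException. The following is a minimal sketch of that pattern; SimpleStream, ListStream and tryReset are hypothetical names for illustration and do not exist in OpenNLP.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

final class ResetSketch {

    // Hypothetical stand-in: reset() is optional and unsupported by default.
    interface SimpleStream<T> {
        T read() throws IOException;

        default void reset() throws IOException {
            throw new UnsupportedOperationException("reset is not supported");
        }
    }

    // A list-backed stream that does support rewinding.
    static final class ListStream<T> implements SimpleStream<T> {
        private final List<T> items;
        private int position = 0;

        ListStream(List<T> items) {
            this.items = items;
        }

        @Override
        public T read() {
            return position < items.size() ? items.get(position++) : null;
        }

        @Override
        public void reset() {
            position = 0;
        }
    }

    // Tries to rewind the stream; returns false if the stream does not support it.
    static <T> boolean tryReset(SimpleStream<T> stream) {
        try {
            stream.reset();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ListStream<String> stream = new ListStream<>(List.of("a", "b"));
        while (stream.read() != null) {
            // first pass
        }
        // prints true, after which a second pass yields "a" and "b" again
        System.out.println(tryReset(stream));
    }
}
```

When tryReset returns false, a caller would have to reopen the underlying source instead of rewinding it.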
-
close
Description copied from interface: ObjectStream
Closes the ObjectStream and releases all allocated resources. After close() has been called, calling ObjectStream.read() or ObjectStream.reset() is not allowed.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface ObjectStream<NameSample>
- Throws:
IOException - Thrown if there is an error during closing the stream.
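Since the stream implements AutoCloseable, it fits a try-with-resources block. The sketch below uses a hypothetical GuardedStream class to show the pattern; rejecting read() after close() with an IllegalStateException is an assumption of this example, as the description above only states that such calls are not allowed.

```java
import java.util.Iterator;
import java.util.List;

final class CloseSketch {

    // Hypothetical closeable stream (not an OpenNLP class).
    static final class GuardedStream implements AutoCloseable {
        private final Iterator<String> lines;
        private boolean closed = false;

        GuardedStream(List<String> lines) {
            this.lines = lines.iterator();
        }

        String read() {
            // Assumption: reading after close() is rejected with an exception.
            if (closed) {
                throw new IllegalStateException("stream is closed");
            }
            return lines.hasNext() ? lines.next() : null;
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        GuardedStream stream = new GuardedStream(List.of("line 1", "line 2"));
        // try-with-resources calls close() even if the loop body throws.
        try (GuardedStream s = stream) {
            while (s.read() != null) {
                // process each object here
            }
        }
        // prints true
        System.out.println(stream.isClosed());
    }
}
```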
-