public class EvalitaNameSampleStream extends Object implements ObjectStream<NameSample>
The data does not contain article boundaries, so adaptive data is cleared for every sentence.
Named Entities are annotated in the IOB2 format (as used in the CoNLL 2002 shared task).
The Named Entity tag consists of two parts:
1. The IOB2 tag: 'B' (for 'begin') denotes the first token of a Named Entity, 'I' (for 'inside') is used for all other tokens in a Named Entity, and 'O' (for 'outside') is used for all tokens that are not part of a Named Entity.
2. The Entity type tag: PER (for Person), ORG (for Organization), GPE (for Geo-Political Entity), or LOC (for Location).
Each file consists of four columns separated by a blank, containing respectively the token, the Elsnet PoS-tag, the Adige news story to which the token belongs, and the Named Entity tag.
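The column layout and IOB2 tagging described above can be illustrated with a small standalone sketch. It does not use this class or any OpenNLP API; the class name `Iob2ColumnsSketch`, the helper `toSpans`, and the sample lines (including the PoS tags and story identifiers) are made up for illustration. It splits each line into the four columns and groups consecutive 'B'/'I' tags into entity spans over token indices:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Iob2ColumnsSketch {

    /**
     * Groups the tokens of one sentence into entity spans.
     * Each span is rendered as TYPE[start,end) over token indices (end exclusive).
     */
    static List<String> toSpans(List<String> lines) {
        List<String> spans = new ArrayList<>();
        int start = -1;      // index of the first token of the open entity, or -1 if none
        String type = null;  // entity type of the open entity

        for (int i = 0; i < lines.size(); i++) {
            // Columns: token, PoS tag, news story id, Named Entity tag.
            String[] cols = lines.get(i).split(" ");
            if (cols.length != 4) {
                throw new IllegalArgumentException("Expected 4 columns: " + lines.get(i));
            }
            String tag = cols[3];

            if (tag.startsWith("B-")) {
                // A 'B' tag closes any open entity and starts a new one.
                if (start != -1) {
                    spans.add(type + "[" + start + "," + i + ")");
                }
                start = i;
                type = tag.substring(2);
            } else if (tag.startsWith("I-")) {
                // An 'I' tag must continue an open entity of the same type.
                if (start == -1 || !type.equals(tag.substring(2))) {
                    throw new IllegalArgumentException("I tag without matching B tag: " + lines.get(i));
                }
            } else if (tag.equals("O")) {
                // An 'O' tag closes any open entity.
                if (start != -1) {
                    spans.add(type + "[" + start + "," + i + ")");
                    start = -1;
                    type = null;
                }
            } else {
                throw new IllegalArgumentException("Unknown tag: " + tag);
            }
        }
        // Close an entity that runs to the end of the sentence.
        if (start != -1) {
            spans.add(type + "[" + start + "," + lines.size() + ")");
        }
        return spans;
    }

    public static void main(String[] args) {
        // Made-up sample lines; PoS tags and story ids are placeholders.
        List<String> sentence = Arrays.asList(
            "Mario SPN story1 B-PER",
            "Rossi SPN story1 I-PER",
            "vive VI story1 O",
            "a E story1 O",
            "Trento SPN story1 B-GPE");
        System.out.println(toSpans(sentence)); // prints [PER[0,2), GPE[4,5)]
    }
}
```

The sketch rejects an 'I' tag that does not follow a matching 'B' tag, which is one possible reading of strict IOB2; how the actual parser handles malformed input is not stated on this page.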
Data can be found on this web site:
http://www.evalita.it
Note: Do not use this class, internal use only!
| Modifier and Type | Class and Description |
|---|---|
| static class | EvalitaNameSampleStream.LANGUAGE |
| Modifier and Type | Field and Description |
|---|---|
| static String | DOCSTART |
| static int | GENERATE_GPE_ENTITIES |
| static int | GENERATE_LOCATION_ENTITIES |
| static int | GENERATE_ORGANIZATION_ENTITIES |
| static int | GENERATE_PERSON_ENTITIES |
| Constructor and Description |
|---|
| EvalitaNameSampleStream(EvalitaNameSampleStream.LANGUAGE lang, InputStreamFactory in, int types) |
| EvalitaNameSampleStream(EvalitaNameSampleStream.LANGUAGE lang, InputStream in, int types): Deprecated. |
| EvalitaNameSampleStream(EvalitaNameSampleStream.LANGUAGE lang, ObjectStream<String> lineStream, int types) |
| Modifier and Type | Method and Description |
|---|---|
| void | close(): Closes the ObjectStream and releases all allocated resources. |
| NameSample | read(): Returns the next object. |
| void | reset(): Repositions the stream at the beginning and the previously seen object sequence will be repeated exactly. |
public static final int GENERATE_PERSON_ENTITIES
public static final int GENERATE_ORGANIZATION_ENTITIES
public static final int GENERATE_LOCATION_ENTITIES
public static final int GENERATE_GPE_ENTITIES
public static final String DOCSTART
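The GENERATE_* fields are int constants and every constructor takes a single int types argument, which suggests the constants are meant to be combined as bit flags. The following is a minimal sketch of that pattern, assuming one bit per entity type. The class name `EntityTypeFlagsSketch`, the constant values, and the `isSelected` helper are hypothetical and are not taken from this page; the actual values are defined by EvalitaNameSampleStream.

```java
class EntityTypeFlagsSketch {

    // Hypothetical values, one distinct bit per entity type (an assumption for illustration).
    static final int PERSON = 1;
    static final int ORGANIZATION = 1 << 1;
    static final int LOCATION = 1 << 2;
    static final int GPE = 1 << 3;

    /** Returns true if the given flag is set in the combined types value. */
    static boolean isSelected(int types, int flag) {
        return (types & flag) != 0;
    }

    public static void main(String[] args) {
        // Select person and geo-political entities with a bitwise OR.
        int types = PERSON | GPE;

        System.out.println(isSelected(types, PERSON));       // prints true
        System.out.println(isSelected(types, ORGANIZATION)); // prints false
        System.out.println(isSelected(types, GPE));          // prints true
    }
}
```

Under that assumption, a caller would pass something like GENERATE_PERSON_ENTITIES | GENERATE_GPE_ENTITIES as the types argument to restrict the stream to those two entity types.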
public EvalitaNameSampleStream(EvalitaNameSampleStream.LANGUAGE lang, ObjectStream<String> lineStream, int types)

public EvalitaNameSampleStream(EvalitaNameSampleStream.LANGUAGE lang, InputStreamFactory in, int types) throws IOException
Throws:
IOException

@Deprecated public EvalitaNameSampleStream(EvalitaNameSampleStream.LANGUAGE lang, InputStream in, int types)
Parameters:
lang - the language of the Evalita data file
in - an InputStream to read data from
types - the types of the entities which are included in the NameSample stream

public NameSample read() throws IOException
Returns the next object.
Specified by:
read in interface ObjectStream<NameSample>
Throws:
IOException - if there is an error during reading

public void reset() throws IOException, UnsupportedOperationException
Repositions the stream at the beginning and the previously seen object sequence will be repeated exactly.
Specified by:
reset in interface ObjectStream<NameSample>
Throws:
IOException - if there is an error during resetting the stream
UnsupportedOperationException

public void close() throws IOException
Closes the ObjectStream and releases all allocated resources. After close has been called, it is not allowed to call read or reset.
Specified by:
close in interface AutoCloseable
close in interface ObjectStream<NameSample>
Throws:
IOException - if there is an error during closing the stream

Copyright © 2015 The Apache Software Foundation. All rights reserved.