public class MultiSegReader extends RawTextDatasetReader
| Modifier and Type | Field and Description |
|---|---|
| protected static org.slf4j.Logger | log |

Fields inherited from class RawTextDatasetReader: generateUIDs, isTokenized, useFirstSentenceAsTitle

Fields inherited from class DirectoryDatasetReader: limit, randomizeDocuments

| Constructor and Description |
|---|
| MultiSegReader() |

| Modifier and Type | Method and Description |
|---|---|
| Dataset | read(Resource path) |
| Document | readDocumentFromFile(Resource file): Read a single Document from file. |
| protected TreeSet[] | readSectionsFromLabel(Resource file, int docNum): Return a set of lines where a new section starts. |
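
The description of readSectionsFromLabel says it returns sets of lines where a new section starts. As a rough illustration of how such a sorted set of start lines could be used to cut a document into sections, here is a minimal, self-contained sketch. It is not the MultiSegReader implementation: the class SectionSplitSketch, the method splitAtStarts, and the assumption that line indices are 0-based are all hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; not part of the documented API.
final class SectionSplitSketch {

    // Splits lines into sections, opening a new section at every index in sectionStarts.
    // Assumption: indices are 0-based line numbers. Indices outside the document never match
    // and are therefore ignored. Lines before the first start index form the first section.
    static List<List<String>> splitAtStarts(List<String> lines, TreeSet<Integer> sectionStarts) {
        List<List<String>> sections = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            // Close the running section when a new one starts here.
            if (sectionStarts.contains(i) && !current.isEmpty()) {
                sections.add(current);
                current = new ArrayList<>();
            }
            current.add(lines.get(i));
        }
        if (!current.isEmpty()) {
            sections.add(current);
        }
        return sections;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Intro a", "Intro b", "Methods a", "Results a", "Results b");
        TreeSet<Integer> starts = new TreeSet<>(Arrays.asList(0, 2, 3));
        for (List<String> section : splitAtStarts(lines, starts)) {
            System.out.println(section);
        }
    }
}
```

A TreeSet keeps the start lines sorted and free of duplicates, which is convenient when the boundaries are read in arbitrary order. What each element of the TreeSet[] array returned by readSectionsFromLabel represents is not stated on this page, so the sketch works with a single set.
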

Methods inherited from class RawTextDatasetReader: withFirstSentenceAsTitle, withGeneratedUIDs, withTokenizedInput

Methods inherited from class DirectoryDatasetReader: readDatasetFromDirectory, readDatasetFromDirectory, stream, streamDocumentsFromDirectory, tryReadDocumentsFromFile, withLimitNumberOfDocuments, withRandomizedDocuments

public Dataset read(Resource path) throws IOException
read in interface DatasetReader
read in class DirectoryDatasetReader<RawTextDatasetReader>
Throws: IOException

public Document readDocumentFromFile(Resource file)
readDocumentFromFile in class RawTextDatasetReader

protected TreeSet[] readSectionsFromLabel(Resource file, int docNum) throws IOException
Throws: IOException

Copyright © 2019. All rights reserved.