public class SentenceSplitter extends Object
| Constructor and Description |
|---|
| SentenceSplitter() |
| Modifier and Type | Method and Description |
|---|---|
| ArrayList<String> | getLabelsFromLabelSequence(cc.mallet.types.LabelSequence ls) |
| cc.mallet.fst.CRF | getModel() |
| cc.mallet.types.Instance | makePredictionData(ArrayList<String> lines, cc.mallet.pipe.Pipe myPipe): Creates a single instance from the given list of lines and the given pipe. |
| cc.mallet.types.InstanceList | makePredictionData(File[] predictFiles, cc.mallet.pipe.Pipe myPipe): Creates a list of instances from the given array of files and the given pipe. |
| cc.mallet.types.Instance | makePredictionData(File predictFile, cc.mallet.pipe.Pipe myPipe): Creates a single instance from the given file and the given pipe. |
| cc.mallet.types.InstanceList | makeTrainingData(File[] trainFiles, boolean useTokenOffset, boolean splitUnitsAfterPunctuation) |
| List<Unit> | predict(cc.mallet.types.Instance inst, String filterName): Predicts a single Instance. |
| List<Unit> | predict(List<String> lines, String postprocessingFilter): Predicts a couple of lines. |
| ArrayList<String> | readFile(File myFile) |
| void | readModel(File file): Loads a previously trained FeatureSubsetModel (CRF4+Properties) that was stored on disk as a serialized object. |
| void | readModel(InputStream is) |
| void | train(cc.mallet.types.InstanceList instList, cc.mallet.pipe.Pipe dataPipe) |
| void | writeModel(String filename): Saves the learned model to disk. |
public cc.mallet.types.Instance makePredictionData(ArrayList<String> lines, cc.mallet.pipe.Pipe myPipe)
public cc.mallet.types.Instance makePredictionData(File predictFile, cc.mallet.pipe.Pipe myPipe)
public cc.mallet.types.InstanceList makePredictionData(File[] predictFiles, cc.mallet.pipe.Pipe myPipe)
public cc.mallet.types.InstanceList makeTrainingData(File[] trainFiles, boolean useTokenOffset, boolean splitUnitsAfterPunctuation)
Parameters:
trainFiles -
useTokenOffset - if true, the token's offset rather than its string representation is stored in the instance source
public void train(cc.mallet.types.InstanceList instList, cc.mallet.pipe.Pipe dataPipe)
public List<Unit> predict(List<String> lines, String postprocessingFilter)
Parameters:
lines -
postprocessingFilter -
public List<Unit> predict(cc.mallet.types.Instance inst, String filterName)
Parameters:
inst -
filterName -
public ArrayList<String> getLabelsFromLabelSequence(cc.mallet.types.LabelSequence ls)
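The predict(List<String> lines, String postprocessingFilter) overload takes its input as a list of lines. The following is a minimal sketch of how such a list might be prepared with the standard library alone. The class LineInputSketch and its method readLines are hypothetical names introduced for illustration; this is not the implementation of this class's own readFile method, whose behavior is not described here.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.ArrayList;
import java.util.stream.Collectors;

// Hypothetical helper, not part of SentenceSplitter.
class LineInputSketch {

    /**
     * Collects every line from the reader into an ArrayList, in order.
     * Line terminators are stripped; blank lines are kept as empty strings.
     * The caller remains responsible for closing the reader.
     */
    static ArrayList<String> readLines(Reader reader) {
        BufferedReader buffered = new BufferedReader(reader);
        // BufferedReader.lines() wraps read failures in UncheckedIOException.
        return buffered.lines()
                .collect(Collectors.toCollection(ArrayList::new));
    }
}
```

For a file on disk, a reader from java.nio.file.Files.newBufferedReader(path) could be passed in, and the resulting list handed to predict together with a postprocessing filter name. Which filter names are accepted is not documented on this page.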
public void writeModel(String filename)
Parameters:
filename - where to write it (full path!)
public void readModel(File file) throws IOException, FileNotFoundException, ClassNotFoundException
Parameters:
file - where to find the serialized featureSubsetModel (full path!)
Throws:
IOException
FileNotFoundException
ClassNotFoundException
public void readModel(InputStream is) throws IOException, ClassNotFoundException
Throws:
IOException
ClassNotFoundException
public cc.mallet.fst.CRF getModel()
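The description of readModel says the model is stored on disk as a serialized object, and the declared IOException and ClassNotFoundException are the checked exceptions of standard Java object serialization. Assuming that mechanism is what is used, the write/read round trip can be sketched as follows. ToyModel and ModelIoSketch are hypothetical stand-ins introduced for illustration; the real model is a cc.mallet.fst.CRF, and this is not the library's actual code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a trained model; not the real CRF.
class ToyModel implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final double[] weights;

    ToyModel(String name, double[] weights) {
        this.name = name;
        this.weights = weights;
    }
}

// Hypothetical helper showing the serialization round trip.
class ModelIoSketch {

    /** Writes the model to the stream using Java object serialization. */
    static void write(ToyModel model, OutputStream out) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(model);
        }
    }

    /** Reads a model back; fails with ClassNotFoundException if the class is missing. */
    static ToyModel read(InputStream in) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(in)) {
            return (ToyModel) ois.readObject();
        }
    }

    /** In-memory round trip, so the sketch runs without touching the file system. */
    static ToyModel roundTrip(ToyModel model) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            write(model, buffer);
            return read(new ByteArrayInputStream(buffer.toByteArray()));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

With files, the same pattern would wrap a FileOutputStream and a FileInputStream, which is presumably why both a File and an InputStream overload of readModel exist.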
Copyright © 2018 JULIE Lab Jena, Germany. All rights reserved.