Package banner.tagging
Class CRFTagger
- java.lang.Object
  - banner.tagging.CRFTagger
- All Implemented Interfaces:
Tagger
- Direct Known Subclasses:
NBestCRFTagger
public class CRFTagger extends Object implements Tagger
-
-
Field Summary
Modifier and Type              Field    Description
protected cc.mallet.fst.CRF    model
-
Constructor Summary
Modifier     Constructor                                                             Description
protected    CRFTagger(cc.mallet.fst.CRF model, FeatureSet featureSet, int order)
-
Method Summary
Modifier and Type                   Method / Description
void                                describe(String fileName)
Set<String>                         getFeatureNames()
List<List<String>>                  getFeatureRepresentation(Sentence sentence)
protected cc.mallet.types.Instance  getInstance(Sentence sentence)
Map<String,Double>                  getMaxWeights()
Map<String,Double>                  getMinWeights()
int                                 getOrder()
protected static List<String>       getTagList(cc.mallet.types.Sequence<Object> tags)
static CRFTagger                    load(InputStream f, dragon.nlp.tool.Lemmatiser lemmatiser, dragon.nlp.tool.Tagger posTagger, Tagger preTagger)
                                        Loads a CRFTagger from the specified file.
void                                tag(Sentence sentence)
static CRFTagger                    train(Set<Sentence> sentences, int order, TagFormat format, FeatureSet featureSet)
void                                write(File f)
                                        Serializes and writes this CRFTagger to the specified file
-
Constructor Detail
-
CRFTagger
protected CRFTagger(cc.mallet.fst.CRF model, FeatureSet featureSet, int order)
-
-
Method Detail
-
load
public static CRFTagger load(InputStream f, dragon.nlp.tool.Lemmatiser lemmatiser, dragon.nlp.tool.Tagger posTagger, Tagger preTagger) throws IOException
Loads a CRFTagger from the specified input stream. Because the lemmatiser and part-of-speech tagger both depend on external data, they cannot be written to disk with the tagger and must be passed in again when loading.
- Parameters:
f - The input stream to load the CRFTagger from, containing data as written by the write() method.
lemmatiser - The Lemmatiser to use
posTagger - The part-of-speech Tagger to use
- Returns:
- A new instance of the CRFTagger contained in the specified file
- Throws:
IOException
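The write/load contract described above (persist the model, pass data-dependent helpers in again) can be illustrated with plain Java serialization. This is a minimal sketch of the general pattern only, not BANNER's implementation: the class `ReloadSketch`, its `lemma` method, and the use of a `UnaryOperator<String>` as a stand-in lemmatiser are all hypothetical, and streams are used for both directions so the example runs in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a tagger; NOT the BANNER CRFTagger.
public class ReloadSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int order;                            // stored with the model
    private transient UnaryOperator<String> lemmatiser; // skipped by serialization

    ReloadSketch(int order, UnaryOperator<String> lemmatiser) {
        this.order = order;
        this.lemmatiser = lemmatiser;
    }

    int getOrder() {
        return order;
    }

    String lemma(String token) {
        return lemmatiser.apply(token);
    }

    // Writes only the serializable state; the transient helper is left out.
    void write(OutputStream out) throws IOException {
        ObjectOutputStream oos = new ObjectOutputStream(out);
        oos.writeObject(this);
        oos.flush();
    }

    // Reads the stored state, then attaches the helper supplied by the caller.
    static ReloadSketch load(InputStream in, UnaryOperator<String> lemmatiser) throws IOException {
        try {
            ObjectInputStream ois = new ObjectInputStream(in);
            ReloadSketch loaded = (ReloadSketch) ois.readObject();
            loaded.lemmatiser = lemmatiser;
            return loaded;
        } catch (ClassNotFoundException e) {
            throw new IOException("Stored class not found", e);
        }
    }

    public static void main(String[] args) throws IOException {
        ReloadSketch original = new ReloadSketch(2, String::toLowerCase);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(buffer);

        ReloadSketch restored =
                load(new ByteArrayInputStream(buffer.toByteArray()), String::toLowerCase);
        System.out.println(restored.getOrder());      // prints 2
        System.out.println(restored.lemma("Genes"));  // prints genes
    }
}
```

Marking the helper `transient` is one common way to keep a resource-backed object out of the serialized form; whether CRFTagger does this internally is not stated here.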
-
train
public static CRFTagger train(Set<Sentence> sentences, int order, TagFormat format, FeatureSet featureSet)
-
write
public void write(File f)
Serializes and writes this CRFTagger to the specified file.
- Parameters:
f - The file to write this CRFTagger to
-
getInstance
protected cc.mallet.types.Instance getInstance(Sentence sentence)
-
getOrder
public int getOrder()
- Returns:
- The CRF order used by this tagger. Order 1 means that the last state is used and order 2 means that the last 2 states are used.
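To make the meaning of the order concrete, the sketch below lists which earlier states would be available at a given position for order 1 and order 2. It assumes "last state" means the state immediately before the current position; the class `OrderSketch`, the method `history`, and the example labels are hypothetical and are not part of BANNER or MALLET.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of CRF order as "how many previous states are visible".
public class OrderSketch {

    // Returns up to `order` states immediately before `position`, oldest first.
    static List<String> history(List<String> states, int position, int order) {
        int start = Math.max(0, position - order);
        return new ArrayList<>(states.subList(start, position));
    }

    public static void main(String[] args) {
        // Example label sequence, chosen only for illustration.
        List<String> states = Arrays.asList("O", "B", "I", "O");

        System.out.println(history(states, 3, 1)); // prints [I]
        System.out.println(history(states, 3, 2)); // prints [B, I]
    }
}
```

Near the start of a sequence fewer states exist than the order asks for, so the sketch simply returns what is available.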
-
describe
public void describe(String fileName) throws IOException
- Throws:
IOException
-
-