Class DensityRulesClassifier
- java.lang.Object
  - de.l3s.boilerpipe.filters.english.DensityRulesClassifier
All Implemented Interfaces:
BoilerpipeFilter
public class DensityRulesClassifier extends java.lang.Object implements BoilerpipeFilter
Classifies TextBlocks as content or not-content using rules that were determined with the C4.8 machine learning algorithm, as described in the paper "Boilerplate Detection using Shallow Text Features". The rules rely mainly on text densities and link densities.
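The idea of classifying each block from its own densities and those of its neighbours can be sketched as follows. This is a minimal, self-contained illustration and not the library's implementation: the `Block` class is a hypothetical stand-in for TextBlock, the two thresholds are made-up placeholders rather than the learned rules, and the density definitions in the comments are assumptions.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only. Nothing here is part of boilerpipe.
final class DensityRuleSketch {

    /** Hypothetical stand-in for a text block, holding only the two features used. */
    static final class Block {
        final double textDensity; // assumed: roughly words per wrapped line
        final double linkDensity; // assumed: fraction of words inside link text
        boolean isContent;        // starts as false (not content)

        Block(double textDensity, double linkDensity) {
            this.textDensity = textDensity;
            this.linkDensity = linkDensity;
        }
    }

    // Placeholder thresholds chosen for the example, NOT the learned rules.
    static final double MAX_LINK_DENSITY = 0.33;
    static final double MIN_TEXT_DENSITY = 10.0;

    /** Labels curr using its own features and its neighbours; returns true if the label changed. */
    static boolean classify(Block prev, Block curr, Block next) {
        boolean content;
        if (curr.linkDensity > MAX_LINK_DENSITY) {
            content = false; // link-heavy blocks look like navigation
        } else if (curr.textDensity >= MIN_TEXT_DENSITY) {
            content = true;  // dense running text
        } else {
            // Sparse block: treat as content only when a neighbour is dense.
            content = prev.textDensity >= MIN_TEXT_DENSITY
                    || next.textDensity >= MIN_TEXT_DENSITY;
        }
        boolean changed = curr.isContent != content;
        curr.isContent = content;
        return changed;
    }

    /** Walks all blocks with a sliding prev/curr/next window; returns true if any label changed. */
    static boolean process(List<Block> blocks) {
        Block empty = new Block(0.0, 0.0); // padding for the first and last block
        boolean changed = false;
        for (int i = 0; i < blocks.size(); i++) {
            Block prev = (i == 0) ? empty : blocks.get(i - 1);
            Block next = (i == blocks.size() - 1) ? empty : blocks.get(i + 1);
            changed |= classify(prev, blocks.get(i), next);
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Block> blocks = Arrays.asList(
                new Block(2.0, 0.9),   // short, link-heavy block
                new Block(15.0, 0.05), // long paragraph
                new Block(3.0, 0.0));  // short line after the paragraph
        System.out.println("changed: " + process(blocks));
        for (Block b : blocks) {
            System.out.println(b.isContent);
        }
    }
}
```

The sketch mirrors the shape of the documented API: a three-argument `classify` step and a `process` step that reports whether any block label changed.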
Field Summary
Fields
Modifier and Type                  Field       Description
static DensityRulesClassifier      INSTANCE
Constructor Summary
Constructors
Constructor                  Description
DensityRulesClassifier()
Method Summary
Modifier and Type                  Method                                                      Description
protected boolean                  classify(TextBlock prev, TextBlock curr, TextBlock next)
static DensityRulesClassifier      getInstance()                                               Returns the singleton instance of DensityRulesClassifier.
boolean                            process(TextDocument doc)                                   Processes the given document doc.
Field Detail
INSTANCE
public static final DensityRulesClassifier INSTANCE
Method Detail
getInstance
public static DensityRulesClassifier getInstance()
Returns the singleton instance of DensityRulesClassifier.
process
public boolean process(TextDocument doc) throws BoilerpipeProcessingException
Description copied from interface: BoilerpipeFilter
Processes the given document doc.
Specified by:
process in interface BoilerpipeFilter
Parameters:
doc - The TextDocument that is to be processed.
Returns:
true if changes have been made to the TextDocument.
Throws:
BoilerpipeProcessingException