Class KeepLargestBlockFilter

  • All Implemented Interfaces:
    BoilerpipeFilter

    public final class KeepLargestBlockFilter
    extends java.lang.Object
    implements BoilerpipeFilter
    Keeps only the largest TextBlock, measured by number of words. If several blocks have the same number of words, the first one is chosen. All discarded blocks are marked "not content" and flagged as DefaultLabels.MIGHT_BE_CONTENT. Note that, by default, only TextBlocks marked as "content" are considered when looking for the largest block.
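
    The selection rule described above can be illustrated with a small, self-contained sketch. It does not use the library's classes and is not the library's implementation: Block, keepLargest, and the mightBeContent field are hypothetical stand-ins for TextBlock, the filter's processing step, and the DefaultLabels.MIGHT_BE_CONTENT label. The sketch also assumes that every block other than the chosen one counts as "discarded".

```java
import java.util.ArrayList;
import java.util.List;

public class KeepLargestBlockSketch {

    /** Hypothetical stand-in for a TextBlock: word count, content flag, one label. */
    static final class Block {
        final int numWords;
        boolean isContent;
        boolean mightBeContent; // models the MIGHT_BE_CONTENT label

        Block(int numWords, boolean isContent) {
            this.numWords = numWords;
            this.isContent = isContent;
        }
    }

    /** Keeps only the largest content block; all other blocks are discarded and labeled. */
    static void keepLargest(List<Block> blocks) {
        Block largest = null;
        for (Block b : blocks) {
            // Only blocks already marked as content are candidates.
            // Strict '>' means the first block wins a tie on word count.
            if (b.isContent && (largest == null || b.numWords > largest.numWords)) {
                largest = b;
            }
        }
        for (Block b : blocks) {
            if (b != largest) {
                // Assumption: every block other than the chosen one is "discarded".
                b.isContent = false;
                b.mightBeContent = true;
            }
        }
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block(5, true));
        blocks.add(new Block(12, true));
        blocks.add(new Block(12, true));  // same size as the previous block
        blocks.add(new Block(40, false)); // largest overall, but not content

        keepLargest(blocks);

        for (int i = 0; i < blocks.size(); i++) {
            if (blocks.get(i).isContent) {
                System.out.println("kept block index: " + i); // prints "kept block index: 1"
            }
        }
    }
}
```

    In this sketch the 40-word block is ignored because it was not marked as content beforehand, and the first of the two 12-word blocks is kept because the comparison is strict.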