Package de.l3s.boilerpipe.document
Class TextDocument
- java.lang.Object
  - de.l3s.boilerpipe.document.TextDocument
-
public class TextDocument extends java.lang.Object
A text document, consisting of one or more TextBlocks.
-
Constructor Summary
Constructor
    Description
TextDocument(java.lang.String title, java.util.List<TextBlock> textBlocks)
    Creates a new TextDocument with the given TextBlocks and the given title.
TextDocument(java.util.List<TextBlock> textBlocks)
    Creates a new TextDocument with the given TextBlocks and no title.
-
Method Summary
Modifier and Type, Method
    Description
java.lang.String debugString()
    Returns detailed debugging information about the contained TextBlocks.
java.lang.String getContent()
    Returns the TextDocument's content.
java.lang.String getText(boolean includeContent, boolean includeNonContent)
    Returns the TextDocument's content, non-content, or both.
java.util.List<TextBlock> getTextBlocks()
    Returns the TextBlocks of this document.
java.lang.String getTitle()
    Returns the "main" title for this document, or null if no such title has been set.
void setTitle(java.lang.String title)
    Updates the "main" title for this document.
-
Constructor Detail
-
TextDocument
public TextDocument(java.util.List<TextBlock> textBlocks)
Creates a new TextDocument with the given TextBlocks and no title.
Parameters:
    textBlocks - The text blocks of this document.
-
TextDocument
public TextDocument(java.lang.String title, java.util.List<TextBlock> textBlocks)
Creates a new TextDocument with the given TextBlocks and the given title.
Parameters:
    title - The "main" title for this text document.
    textBlocks - The text blocks of this document.
-
Method Detail
-
getTextBlocks
public java.util.List<TextBlock> getTextBlocks()
Returns the TextBlocks of this document.
Returns:
    A list of TextBlocks, in sequential order of appearance.
-
getTitle
public java.lang.String getTitle()
Returns the "main" title for this document, or null if no such title has been set.
Returns:
    The "main" title, or null if none has been set.
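Because getTitle() may return null, callers usually need a fallback before displaying or concatenating the title. The following is a minimal sketch of one way to do that. The class name TitleSketch, the helper titleOrDefault, and the "(untitled)" fallback are hypothetical and not part of boilerpipe; the helper takes a plain String so that the example runs without the library on the classpath.

```java
import java.util.Optional;

// Hypothetical helper, not part of boilerpipe.
final class TitleSketch {

    // Returns the given title, or the fallback when the title is null.
    // In real code the first argument would be the result of getTitle().
    static String titleOrDefault(String title, String fallback) {
        return Optional.ofNullable(title).orElse(fallback);
    }

    public static void main(String[] args) {
        // A document created without a title reports null.
        System.out.println(titleOrDefault(null, "(untitled)"));
        // A document with a title reports it unchanged.
        System.out.println(titleOrDefault("Breaking News", "(untitled)"));
    }
}
```

With a real document, the call would look like titleOrDefault(doc.getTitle(), "(untitled)"), where doc is a TextDocument instance.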
-
setTitle
public void setTitle(java.lang.String title)
Updates the "main" title for this document.
Parameters:
    title
-
getContent
public java.lang.String getContent()
Returns the TextDocument's content.
Returns:
    The content text.
-
getText
public java.lang.String getText(boolean includeContent, boolean includeNonContent)
Returns the TextDocument's content, non-content, or both.
Parameters:
    includeContent - Whether to include TextBlocks marked as "content".
    includeNonContent - Whether to include TextBlocks marked as "non-content".
Returns:
    The text.
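The way the two flags combine can be illustrated with a small self-contained sketch. It is not boilerpipe's implementation: the classes GetTextSketch and Block are hypothetical stand-ins for TextDocument and TextBlock, and the choice to end each selected block with a newline is an assumption made for the example.

```java
import java.util.List;

// Hypothetical stand-in that mimics the documented flag semantics of getText.
// It does not use or reproduce the boilerpipe classes.
final class GetTextSketch {

    // Minimal stand-in for a text block: its text plus a content flag.
    static final class Block {
        final String text;
        final boolean isContent;

        Block(String text, boolean isContent) {
            this.text = text;
            this.isContent = isContent;
        }
    }

    // Appends each block whose kind was requested, in list order.
    // Assumption: every selected block is followed by a newline.
    static String getText(List<Block> blocks, boolean includeContent, boolean includeNonContent) {
        StringBuilder sb = new StringBuilder();
        for (Block b : blocks) {
            boolean wanted = b.isContent ? includeContent : includeNonContent;
            if (wanted) {
                sb.append(b.text).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Block> blocks = List.of(
                new Block("Home | About", false),
                new Block("Article body.", true));
        // Only the block marked as content is selected here.
        System.out.print(getText(blocks, true, false));
    }
}
```

Under these assumptions, passing (true, true) yields every block, (false, true) yields only the non-content blocks, and (false, false) yields an empty string.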
-
debugString
public java.lang.String debugString()
Returns detailed debugging information about the contained TextBlocks.
Returns:
    Debug information.