public class AltoPassageFormatter extends OcrPassageFormatter
Fields inherited from class OcrPassageFormatter: absoluteHighlights, endHlTag, startHlTag

| Modifier | Constructor and Description |
|---|---|
| protected | AltoPassageFormatter(String startHlTag, String endHlTag, boolean absoluteHighlights) |

| Modifier and Type | Method and Description |
|---|---|
| String | determineStartPage(String ocrFragment, int startOffset, IterableCharSequence content): Determine the id of the page an OCR fragment resides on. |
| protected String | getTextFromXml(String altoFragment): Helper method to get plain text from XML/HTML-like fragments. |
| protected List<OcrBox> | parseWords(String ocrFragment, String startPage): Parse word boxes from an OCR fragment. |
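
The three methods summarized above can be illustrated with a small, self-contained sketch. It is not the library's implementation: `AltoSketch`, `WordBox` (a stand-in for `OcrBox`), `plainText`, `parseWords` and `lastPageIdBefore` are hypothetical names, and the regex-based parsing assumes well-formed ALTO in which each word is a `<String .../>` element carrying `CONTENT`, `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` attributes, with no unescaped `>` inside attribute values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; names and behavior are assumptions, not the library's API. */
class AltoSketch {
  /** Hypothetical stand-in for the library's OcrBox type. */
  static final class WordBox {
    final String text;
    final String pageId;
    final int x, y, width, height;

    WordBox(String text, String pageId, int x, int y, int width, int height) {
      this.text = text;
      this.pageId = pageId;
      this.x = x;
      this.y = y;
      this.width = width;
      this.height = height;
    }
  }

  // One ALTO <String .../> element; group 1 is its raw attribute list.
  private static final Pattern STRING_ELEM = Pattern.compile("<String\\s([^>]*?)/?>");
  // Opening <Page ...> tag; group 1 is the value of its ID attribute.
  private static final Pattern PAGE_ID = Pattern.compile("<Page\\s[^>]*?\\bID=\"([^\"]*)\"");

  /** Returns the value of a named attribute, or null if it is absent. */
  private static String attr(String attrs, String name) {
    Matcher m = Pattern.compile("(?:^|\\s)" + name + "=\"([^\"]*)\"").matcher(attrs);
    return m.find() ? m.group(1) : null;
  }

  /** Parses a numeric attribute, rounding to int; missing attributes become 0. */
  private static int num(String attrs, String name) {
    String v = attr(attrs, name);
    return v == null ? 0 : (int) Math.round(Double.parseDouble(v));
  }

  /** Resolves the five predefined XML entities; &amp; is handled last. */
  private static String unescape(String s) {
    return s.replace("&lt;", "<").replace("&gt;", ">")
        .replace("&quot;", "\"").replace("&apos;", "'").replace("&amp;", "&");
  }

  /** Collects one box per <String> element that has a CONTENT attribute. */
  static List<WordBox> parseWords(String altoFragment, String startPage) {
    List<WordBox> boxes = new ArrayList<>();
    Matcher m = STRING_ELEM.matcher(altoFragment);
    while (m.find()) {
      String attrs = m.group(1);
      String content = attr(attrs, "CONTENT");
      if (content == null) {
        continue;
      }
      boxes.add(new WordBox(unescape(content), startPage,
          num(attrs, "HPOS"), num(attrs, "VPOS"), num(attrs, "WIDTH"), num(attrs, "HEIGHT")));
    }
    return boxes;
  }

  /** Joins the word contents with single spaces to obtain plain text. */
  static String plainText(String altoFragment) {
    StringBuilder sb = new StringBuilder();
    for (WordBox box : parseWords(altoFragment, null)) {
      if (sb.length() > 0) {
        sb.append(' ');
      }
      sb.append(box.text);
    }
    return sb.toString();
  }

  /** Returns the ID of the last <Page> opened before offset, or null if there is none. */
  static String lastPageIdBefore(String content, int offset) {
    int end = Math.max(0, Math.min(offset, content.length()));
    Matcher m = PAGE_ID.matcher(content).region(0, end);
    String id = null;
    while (m.find()) {
      id = m.group(1);
    }
    return id;
  }
}
```

In this sketch, a caller would first look up the page with `lastPageIdBefore(content, startOffset)` and then pass the result as `startPage` to `parseWords`, mirroring the parameter names in the table. A production parser would more likely use a streaming XML reader than regular expressions.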

Methods inherited from class OcrPassageFormatter: addHighlightsToSnippet, format, format, mergeBoxes, parseFragment

public String determineStartPage(String ocrFragment, int startOffset, IterableCharSequence content)
determineStartPage in class OcrPassageFormatter

protected String getTextFromXml(String altoFragment)
getTextFromXml in class OcrPassageFormatter

protected List<OcrBox> parseWords(String ocrFragment, String startPage)
parseWords in class OcrPassageFormatter

Copyright © 2019. All rights reserved.