public class MiniOcrPassageFormatter extends OcrPassageFormatter

Fields inherited from class OcrPassageFormatter: absoluteHighlights, endHlTag, startHlTag

| Constructor and Description |
|---|
| MiniOcrPassageFormatter(String startHlTag, String endHlTag, boolean absoluteHighlights) |

| Modifier and Type | Method and Description |
|---|---|
| protected void | addHighlightsToSnippet(List<List<OcrBox>> hlBoxes, OcrSnippet snippet) |
| String | determineStartPage(String xmlFragment, int startOffset, IterableCharSequence content) Determine the id of the page an OCR fragment resides on. |
| Object | format(org.apache.lucene.search.uhighlight.Passage[] passages, String content) Convenience implementation to format document text that is available as a String. |
| protected List<OcrBox> | parseWords(String ocrFragment, String startPage) Parse word boxes from an OCR fragment. |
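
The format(Passage[], String) method is documented as wrapping the String in an IterableCharSequence implementation and then calling the IterableCharSequence overload. The following is a minimal sketch of that delegation pattern only, not the library's code. Span and SketchFormatter are hypothetical stand-ins for Passage and the formatter, and java.nio.CharBuffer is assumed here as a stand-in for an IterableCharSequence implementation.

```java
import java.nio.CharBuffer;
import java.util.ArrayList;
import java.util.List;

public class FormatDelegationSketch {
  /** Hypothetical stand-in for a passage: character offsets into the content. */
  static final class Span {
    final int start;
    final int end;

    Span(int start, int end) {
      this.start = start;
      this.end = end;
    }
  }

  /** Hypothetical formatter with a general overload and a String convenience overload. */
  static class SketchFormatter {
    /** General overload: extracts the text of each span from any CharSequence. */
    List<String> format(Span[] spans, CharSequence content) {
      List<String> snippets = new ArrayList<>();
      for (Span span : spans) {
        snippets.add(content.subSequence(span.start, span.end).toString());
      }
      return snippets;
    }

    /** Convenience overload: wraps the String and delegates to the general overload. */
    List<String> format(Span[] spans, String content) {
      // Declaring the variable as CharSequence selects the general overload below.
      CharSequence wrapped = CharBuffer.wrap(content);
      return format(spans, wrapped);
    }
  }

  public static void main(String[] args) {
    SketchFormatter formatter = new SketchFormatter();
    Span[] spans = {new Span(0, 5), new Span(6, 11)};
    System.out.println(formatter.format(spans, "hello world"));
  }
}
```

Keeping the String overload as a thin wrapper means the formatting logic lives in one place, so callers that already hold the text in memory and callers that stream it through a character sequence get the same result.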

Methods inherited from class OcrPassageFormatter: format, getTextFromXml, mergeBoxes, parseFragment

public String determineStartPage(String xmlFragment, int startOffset, IterableCharSequence content)
Determine the id of the page an OCR fragment resides on.
Superclass method: determineStartPage in class OcrPassageFormatter

protected void addHighlightsToSnippet(List<List<OcrBox>> hlBoxes, OcrSnippet snippet)
Superclass method: addHighlightsToSnippet in class OcrPassageFormatter

protected List<OcrBox> parseWords(String ocrFragment, String startPage)
Parse word boxes from an OCR fragment.
Superclass method: parseWords in class OcrPassageFormatter

public Object format(org.apache.lucene.search.uhighlight.Passage[] passages, String content)
Convenience implementation to format document text that is available as a String. Wraps the String in an IterableCharSequence implementation and calls OcrPassageFormatter.format(Passage[], IterableCharSequence).
Superclass method: format in class OcrPassageFormatter
Parameters:
passages - the passages in the document text that contain highlighted text
content - the content of the OCR field, given here as a String (the superclass overload takes an IterableCharSequence)

Copyright © 2019. All rights reserved.