| Constructor and Description |
|---|
| HocrFormat() |
| Modifier and Type | Method and Description |
|---|---|
| BreakIterator | getBreakIterator(OcrBlock breakBlock, OcrBlock limitBlock, int contextSize): Get a BreakIterator that splits the content according to the break parameters |
| OcrPassageFormatter | getPassageFormatter(String prehHighlightTag, String postHighlightTag, boolean absoluteHighlights): Get a PassageFormatter that builds OCR snippets from passages |
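
The parameters of the two methods above can be illustrated with a small, self-contained sketch. It does not call the library: `OcrSnippetSketch`, `Box`, `wrapHighlight`, `reportedBox`, and `passageRange` are hypothetical names invented for this example. It also assumes that `contextSize` counts break blocks on each side of a hit, which the documentation does not state explicitly.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names belong to the library.
final class OcrSnippetSketch {

    /** Axis-aligned box in pixels: upper-left corner plus width and height. */
    static final class Box {
        final int x, y, w, h;
        Box(int x, int y, int w, int h) { this.x = x; this.y = y; this.w = w; this.h = h; }
        @Override public String toString() { return "Box(" + x + ", " + y + ", " + w + ", " + h + ")"; }
    }

    /** Wraps text[start, end) in the given pre/post highlight tags. */
    static String wrapHighlight(String text, int start, int end, String preTag, String postTag) {
        return text.substring(0, start) + preTag + text.substring(start, end)
                + postTag + text.substring(end);
    }

    /**
     * Returns the highlight box as it would be reported: unchanged (relative to the page)
     * when absolute is true, otherwise shifted so the snippet's corner is the origin.
     */
    static Box reportedBox(Box onPage, Box snippet, boolean absolute) {
        if (absolute) return onPage;
        return new Box(onPage.x - snippet.x, onPage.y - snippet.y, onPage.w, onPage.h);
    }

    /**
     * Assumption: a passage spans up to contextSize break blocks on each side of the hit
     * and is clipped so that it never leaves the hit's limit block.
     * limitIds[i] is the id of the limit block (e.g. the page) containing break block i.
     * Returns {firstIndex, lastIndex}, both inclusive.
     */
    static int[] passageRange(int[] limitIds, int hit, int contextSize) {
        int first = hit;
        int last = hit;
        while (first > 0 && hit - first < contextSize
                && limitIds[first - 1] == limitIds[hit]) {
            first--;
        }
        while (last < limitIds.length - 1 && last - hit < contextSize
                && limitIds[last + 1] == limitIds[hit]) {
            last++;
        }
        return new int[] {first, last};
    }

    public static void main(String[] args) {
        // Highlight the word "quick" with <em>...</em>.
        System.out.println(wrapHighlight("the quick fox", 4, 9, "<em>", "</em>"));

        Box snippet = new Box(100, 300, 400, 90);
        Box hit = new Box(120, 340, 60, 18);
        System.out.println(reportedBox(hit, snippet, true));   // page-relative
        System.out.println(reportedBox(hit, snippet, false));  // snippet-relative

        // Seven lines: three on page 0, four on page 1; the hit is line 4.
        int[] pages = {0, 0, 0, 1, 1, 1, 1};
        System.out.println(Arrays.toString(passageRange(pages, 4, 2)));
    }
}
```

In this sketch the passage for the hit on line 4 stops at line 3 on the left, because line 2 belongs to a different page, which is the role the documentation assigns to `limitBlock`.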
public BreakIterator getBreakIterator(OcrBlock breakBlock, OcrBlock limitBlock, int contextSize)

Specified by: getBreakIterator in interface OcrFormat

Parameters:

- breakBlock: the type of OcrBlock that the input document is split on to build passages
- limitBlock: the type of OcrBlock that a passage may not cross
- contextSize: the number of break blocks in a context that forms a highlighting passage

public OcrPassageFormatter getPassageFormatter(String prehHighlightTag, String postHighlightTag, boolean absoluteHighlights)

Specified by: getPassageFormatter in interface OcrFormat

Parameters:

- prehHighlightTag: the tag to put in the snippet text before a highlighted region, e.g. <em>
- postHighlightTag: the tag to put in the snippet text after a highlighted region, e.g. </em>
- absoluteHighlights: whether the coordinates for highlights should be absolute, i.e. relative to the page and not the containing snippet

Copyright © 2019. All rights reserved.