PhraseHelper to add support for byte offsets from payloads
About 80% of this code is copied straight from the original class.
OffsetsEnum to load the byte offset from payloads.
BreakIterators and aggregates their breaks to form larger contexts.
sequence.
FieldOffsetStrategy to load byte offsets from payloads.
CharacterIterator and CharSequence since all indices are byte offsets into the underlying file, not character indices.
OcrSnippet instances
String.
ExternalFieldLoader.loadField(Map, String)
true if bytes is a well-formed UTF-8 byte sequence according to Unicode 6.0.
isWellFormed(byte[]).
CharSequence and CharacterIterator.
NoOpOffsetStrategy for byte offsets from payloads
FieldHighlighter to support lazy-loaded field values and byte offsets from payloads.
BreakIterator and OcrPassageFormatter instances.
OcrHighlighter, with support for loading byte offsets from payloads.
UnifiedHighlighter variant to support lazy-loading field values from arbitrary storage and using byte offsets from term payloads for highlighting instead of character offsets.
OcrSnippet instances.
OcrSnippet from an OCR fragment.
BreakIterator that splits an XML-like document on a specific opening or closing tag.
NamedList that is used by Solr to populate the response.
Copyright © 2019. All rights reserved.
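
Several entries in this index stress that offsets are byte offsets into the underlying file rather than character indices. The following is a minimal sketch of how the two can differ for UTF-8 text. It is not taken from this library: the class `ByteOffsetDemo` and both helper methods are hypothetical names introduced for illustration, and the sketch assumes that every byte offset falls on a character boundary.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names belong to the documented API.
class ByteOffsetDemo {

    /** Number of UTF-8 bytes that precede the given char index of the text. */
    public static int charIndexToByteOffset(String text, int charIndex) {
        return text.substring(0, charIndex).getBytes(StandardCharsets.UTF_8).length;
    }

    /**
     * Char index that corresponds to a UTF-8 byte offset.
     * Assumes the offset lies on a character boundary.
     */
    public static int byteOffsetToCharIndex(byte[] utf8, int byteOffset) {
        // Decoding only the prefix yields exactly the chars before the offset.
        return new String(utf8, 0, byteOffset, StandardCharsets.UTF_8).length();
    }

    public static void main(String[] args) {
        // "\u00fc" (u with diaeresis) takes one char but two bytes in UTF-8.
        String text = "K\u00fcste und Meer";
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        int charIndex = text.indexOf("Meer");
        int byteOffset = charIndexToByteOffset(text, charIndex);

        System.out.println("char index:  " + charIndex);
        System.out.println("byte offset: " + byteOffset);
        System.out.println("round trip:  " + byteOffsetToCharIndex(utf8, byteOffset));
    }
}
```

In this sketch the two values drift apart by one for each two-byte character before the match, which is one reason a component working on byte offsets cannot simply reuse character-based `CharSequence` indices.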