
A

AbstractChunking - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
This class specifies the base class for file chunking
AbstractChunking(byte[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.AbstractChunking
Initializes a new instance of the AbstractChunking class.
AbstractListManager - Class in org.apache.tika.parser.microsoft
 
AbstractListManager() - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager
 
AbstractListManager.LevelTuple - Class in org.apache.tika.parser.microsoft
 
AbstractListManager.ParagraphLevelCounter - Class in org.apache.tika.parser.microsoft
 
AbstractOfficeParser - Class in org.apache.tika.parser.microsoft
Intermediate layer to set OfficeParserConfig uniformly.
AbstractOfficeParser() - Constructor for class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
AbstractOOXMLExtractor - Class in org.apache.tika.parser.microsoft.ooxml
Base class for all Tika OOXML extractors.
AbstractOOXMLExtractor(ParseContext, POIXMLTextExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
AbstractXML2003Parser - Class in org.apache.tika.parser.microsoft.xml
 
AbstractXML2003Parser() - Constructor for class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
 
AccessChecker - Class in org.apache.tika.parser.pdf
Checks whether or not a document allows extraction generally or extraction for accessibility only.
AccessChecker() - Constructor for class org.apache.tika.parser.pdf.AccessChecker
This constructs an AccessChecker that will not perform any checking and will always return without throwing an exception.
AccessChecker(boolean) - Constructor for class org.apache.tika.parser.pdf.AccessChecker
This constructs an AccessChecker that will check whether content should be extracted from a document.
Activator - Class in org.apache.tika.parser.internal
 
Activator() - Constructor for class org.apache.tika.parser.internal.Activator
 
AdapterHelper - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
AdapterHelper() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AdapterHelper
 
add(UByte) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
add(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
add(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
add(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
add(ULong) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
add(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
add(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
add(UShort) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
add(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
addAlternative(GeoTag) - Method in class org.apache.tika.parser.geo.topic.GeoTag
 
addDrawingHyperLinks(PackagePart) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
addEvenIfNull(Property, String, Metadata) - Static method in class org.apache.tika.parser.microsoft.OutlookExtractor
 
addMetadata(Mp4Directory) - Method in class org.apache.tika.parser.mp4.boxes.TikaUserDataBox
 
addMetadata(String) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
 
addMetadata(String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
 
addMetadata(String) - Method in class org.apache.tika.parser.xml.MetadataHandler
Deprecated.
 
addMulti(Metadata, Property, String) - Static method in class org.apache.tika.parser.microsoft.SummaryExtractor
 
addOtherTesseractConfig(String, String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Add a key-value pair to pass to Tesseract using its -c command line option.
addPersonAndEmail(String, Property, Property, Metadata) - Static method in class org.apache.tika.parser.mail.MailUtil
This tries to split a "from" or "to" value into a person field and an email field.
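As a rough illustration of the kind of split described for addPersonAndEmail, the standalone sketch below separates a "Name <address>" value into a person part and an email part. It does not call Tika. The class name PersonEmailSplit and its regular expression are hypothetical, and MailUtil may well handle more input shapes than this.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika: splits values such as
// "Jane Doe <jane@example.org>" into a person part and an email part.
final class PersonEmailSplit {
    // Assumed shape: optional (possibly quoted) display name, then an address in angle brackets.
    private static final Pattern NAME_ADDR =
            Pattern.compile("^\\s*\"?([^\"<]*?)\"?\\s*<([^<>\\s]+@[^<>\\s]+)>\\s*$");

    /** Returns {person, email}; either element may be null if it cannot be found. */
    static String[] split(String value) {
        if (value == null) {
            return new String[] {null, null};
        }
        Matcher m = NAME_ADDR.matcher(value);
        if (m.matches()) {
            String person = m.group(1).trim();
            return new String[] {person.isEmpty() ? null : person, m.group(2)};
        }
        String trimmed = value.trim();
        // A bare address has no person part; anything else is treated as a name only.
        if (trimmed.matches("[^\\s@]+@[^\\s@]+")) {
            return new String[] {null, trimmed};
        }
        return new String[] {trimmed, null};
    }

    public static void main(String[] args) {
        String[] parts = split("Jane Doe <jane@example.org>");
        System.out.println(parts[0] + " | " + parts[1]);
    }
}
```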
AdobeFontMetricParser - Class in org.apache.tika.parser.font
Parser for AFM Font Files
AdobeFontMetricParser() - Constructor for class org.apache.tika.parser.font.AdobeFontMetricParser
 
ALIGNED_OFFSET - Static variable in class org.apache.tika.parser.chm.core.ChmCommons
 
alignedLenTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
alignedTreeTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
AlternativePackaging - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
AlternativePackaging() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
analyzeStorageIndexDataElement(List<DataElement>, ExGuid, AtomicReference<ExGuid>, AtomicReference<HashMap<CellID, ExGuid>>, AtomicReference<HashMap<ExGuid, ExGuid>>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to analyze the storage index data element to get all the mappings.
apiBaseUri - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
apiUri - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
appendByteArrayToListOfByte(List<Byte>, byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.ByteUtil
 
appendGUID(UUID) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
Append a specified GUID value into the buffer.
appendInit32(int, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
Append a specified Int32 type value into the buffer with the specified bit length.
appendUInit32(int, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
Append a specified UInt32 type value into the buffer with the specified bit length.
appendUInt64(long, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
Append a specified UInt64 type value into the buffer with the specified bit length.
AppleSingleFileParser - Class in org.apache.tika.parser.apple
Parser that strips the header off of AppleSingle and AppleDouble files.
AppleSingleFileParser() - Constructor for class org.apache.tika.parser.apple.AppleSingleFileParser
 
ARCHITECTURE_BITS - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
ArrayNumber - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
The class is used to represent the number of the array.
ArrayNumber() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.ArrayNumber
 
asBytes(UUID) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.UuidUtils
 
assertByteArrayNotNull(byte[]) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks if byte[] is not null
assertByteArrayNotNull(byte[]) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
 
assertChmAccessorNotNull(ChmAccessor<?>) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks that ChmAccessor is not null; throws an exception if it is null.
assertChmAccessorParameters(byte[], ChmAccessor<?>, int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks validity of ChmAccessor parameters
assertChmBlockSegment(byte[], ChmLzxcResetTable, int, int, int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks the validity of the chmBlockSegment parameters
assertCopyingDataIndex(int, int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
 
assertDirectoryListingEntry(int, String, ChmCommons.EntryType, int, int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks the validity of the DirectoryListingEntry's parameters; throws an exception if any parameter is invalid.
assertInputStreamNotNull(InputStream) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks if InputStream is not null
assertPositiveInt(int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks that the int param is greater than zero; throws an exception if param <= 0.
asUuid(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.UuidUtils
 
AttributeDependantMetadataHandler - Class in org.apache.tika.parser.xml
This adds a Metadata entry for a given node.
AttributeDependantMetadataHandler(Metadata, String, String) - Constructor for class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
 
AttributeMetadataHandler - Class in org.apache.tika.parser.xml
SAX event handler that maps the contents of an XML attribute into a metadata field.
AttributeMetadataHandler(String, String, Metadata, String) - Constructor for class org.apache.tika.parser.xml.AttributeMetadataHandler
 
AttributeMetadataHandler(String, String, Metadata, Property) - Constructor for class org.apache.tika.parser.xml.AttributeMetadataHandler
 
AudioFrame - Class in org.apache.tika.parser.mp3
An Audio Frame in an MP3 file.
AudioFrame(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
Deprecated.
Use the constructor which is passed all values directly.
AudioFrame(int, int, int, int, InputStream) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
Deprecated.
Use the constructor which is passed all values directly.
AudioFrame(int, int, int, int, int, int, float) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
Creates a new instance of AudioFrame and initializes all properties.
AudioParser - Class in org.apache.tika.parser.audio
 
AudioParser() - Constructor for class org.apache.tika.parser.audio.AudioParser
 
available - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 

B

baseRevisionID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
 
BasicObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
Base object for FSSHTTPB.
BasicObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
 
BIG - Static variable in class org.apache.tika.parser.executable.MachineMetadata.Endian
 
BinaryItem - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
BinaryItem() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
Initializes a new instance of the BinaryItem class.
BinaryItem(Collection<Byte>) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
Initializes a new instance of the BinaryItem class with the specified content.
Bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
This class is used to read and set bit values in a byte array.
Bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.Bit
 
BitConverter - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
 
BitConverter() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
BitReader - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
This class is used to extract values across byte boundaries at arbitrary bit positions.
BitReader(byte[], int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
Initializes a new instance of the BitReader class with the specified byte buffer and start position in bytes.
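To make "values across byte boundaries with arbitrary bit positions" concrete, here is a minimal standalone sketch of a bit reader. It is not Tika's BitReader: the class name SimpleBitReader and the method readBits are hypothetical, and the bit order (least significant bit of each byte first) is an assumption made for the example.

```java
// Hypothetical stand-in for a bit reader; not Tika's BitReader.
// Assumption: bits are consumed least-significant-bit first within each byte.
final class SimpleBitReader {
    private final byte[] buffer;
    private long bitPos; // absolute bit offset into the buffer

    SimpleBitReader(byte[] buffer, int startByte) {
        this.buffer = buffer;
        this.bitPos = (long) startByte * 8;
    }

    /** Reads 'length' bits (1..32) and returns them as an unsigned value. */
    long readBits(int length) {
        if (length < 1 || length > 32) {
            throw new IllegalArgumentException("length must be in 1..32");
        }
        if (bitPos + length > (long) buffer.length * 8) {
            throw new IndexOutOfBoundsException("not enough bits left in the buffer");
        }
        long value = 0;
        for (int i = 0; i < length; i++, bitPos++) {
            // Pick the byte that holds this bit, then the bit inside that byte.
            int bit = (buffer[(int) (bitPos >>> 3)] >> (int) (bitPos & 7)) & 1;
            value |= (long) bit << i;
        }
        return value;
    }

    public static void main(String[] args) {
        SimpleBitReader reader = new SimpleBitReader(new byte[] {(byte) 0xB5, 0x03}, 0);
        System.out.println(reader.readBits(4)); // low nibble of the first byte
        System.out.println(reader.readBits(8)); // spans the first and second byte
    }
}
```

The second read in main starts at bit 4 of the first byte and ends at bit 3 of the second, which is the cross-boundary case the class description refers to.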
BitWriter - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
 
BitWriter(int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
Initializes a new instance of the BitWriter class with the specified buffer size in bytes.
blobExtendedGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
 
body - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
 
body - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
 
body - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
 
body - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
 
BoilerpipeContentHandler - Class in org.apache.tika.parser.html
Uses the boilerpipe library to automatically extract the main content from a web page.
BoilerpipeContentHandler(ContentHandler) - Constructor for class org.apache.tika.parser.html.BoilerpipeContentHandler
Creates a new boilerpipe-based content extractor, using the DefaultExtractor extraction rules and "delegate" as the content handler.
BoilerpipeContentHandler(Writer) - Constructor for class org.apache.tika.parser.html.BoilerpipeContentHandler
Creates a content handler that writes XHTML body character events to the given writer.
BoilerpipeContentHandler(ContentHandler, BoilerpipeExtractor) - Constructor for class org.apache.tika.parser.html.BoilerpipeContentHandler
Creates a new boilerpipe-based content extractor, using the given extraction rules.
boolValue - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
 
BouncyCastleDigester - Class in org.apache.tika.parser.utils
Digester that relies on BouncyCastle for MessageDigest implementations.
BouncyCastleDigester(int, String) - Constructor for class org.apache.tika.parser.utils.BouncyCastleDigester
Include a string representing the comma-separated algorithms to run: e.g.
BPGParser - Class in org.apache.tika.parser.image
Parser for the Better Portable Graphics (BPG) File Format.
BPGParser() - Constructor for class org.apache.tika.parser.image.BPGParser
 
BPListDetector - Class in org.apache.tika.parser.apple
Detector for BPList with utility functions for PList.
BPListDetector() - Constructor for class org.apache.tika.parser.apple.BPListDetector
 
Build(byte[]) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject.RootNodeObjectBuilder
This method is used to build a root node object from a byte array
Build(List<ObjectGroupDataElementData>, ObjectGroupObjectData, ExGuid) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject.IntermediateNodeObjectBuilder
This method is used to build an intermediate node object from a list of object group data elements
Build(byte[], SignatureObject) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject.IntermediateNodeObjectBuilder
This method is used to build an intermediate node object from a byte array with a signature
build(NodeObject) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData.Builder
This method is used to build a list of DataElement from a node object
buildDataElements(byte[], AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to build a list of data elements to represent a file.
Builder() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData.Builder
 
buildParagraphTagAndStyle(String, boolean) - Static method in class org.apache.tika.parser.microsoft.WordExtractor
Given a style name, return what tag should be used, and what style should be applied to it.
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
Populates the XHTMLContentHandler object received as parameter.
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.POIXMLTextExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator
 
BYTE_ARRAY_LENGHT - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
ByteUtil - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
 
ByteUtil() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.ByteUtil
 

C

canRun() - Static method in class org.apache.tika.parser.journal.GrobidRESTParser
 
CaptionObject - Class in org.apache.tika.parser.captioning
A model for caption objects from graphics and texts; it typically includes a human-readable sentence, the language of the sentence, and a confidence score.
CaptionObject(String, String, double) - Constructor for class org.apache.tika.parser.captioning.CaptionObject
 
cb - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
 
Cell - Interface in org.apache.tika.parser.microsoft
Cell of content.
cell(String, String, XSSFComment) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
 
CellDecorator - Class in org.apache.tika.parser.microsoft
Cell decorator.
CellDecorator(Cell) - Constructor for class org.apache.tika.parser.microsoft.CellDecorator
 
CellID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
CellID(ExGuid, ExGuid) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
Initializes a new instance of the CellID class with specified ExGuids.
CellID(CellID) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
Initializes a new instance of the CellID class; this is the copy constructor.
CellID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
Initializes a new instance of the CellID class; this is the default constructor.
cellID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
 
cellID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
 
CellIDArray - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
CellIDArray(long, List<CellID>) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
Initializes a new instance of the CellIDArray class.
CellIDArray(CellIDArray) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
Initializes a new instance of the CellIDArray class; this is the copy constructor.
CellIDArray() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
Initializes a new instance of the CellIDArray class; this is the default constructor.
cellIDArray - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
 
cellIDArray - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
 
CellManifestCurrentRevision - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
CellManifestCurrentRevision() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
Initializes a new instance of the CellManifestCurrentRevision class.
cellManifestCurrentRevision - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
 
cellManifestCurrentRevisionExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
 
CellManifestDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Cell manifest data element
CellManifestDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
Initializes a new instance of the CellManifestDataElementData class.
cellManifests - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
cellMappingExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
 
cellMappingSerialNumber - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
 
cellReferencesCount - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
 
cellReferencesCount - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
 
CellSecondExGuid - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
 
characters(char[], int, int) - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.dif.DIFContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
characters(char[], int, int) - Method in class org.apache.tika.parser.mif.MIFContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.xml.MetadataHandler
Deprecated.
 
CharsetDetector - Class in org.apache.tika.parser.txt
CharsetDetector provides a facility for detecting the charset or encoding of character data in an unknown format.
CharsetDetector() - Constructor for class org.apache.tika.parser.txt.CharsetDetector
Constructor
CharsetDetector(int) - Constructor for class org.apache.tika.parser.txt.CharsetDetector
 
CharsetMatch - Class in org.apache.tika.parser.txt
This class represents a charset that has been identified by a CharsetDetector as a possible encoding for a set of input data.
check(Metadata) - Method in class org.apache.tika.parser.pdf.AccessChecker
Checks to see if a document's content should be extracted based on metadata values and the value of AccessChecker.allowAccessibility in the constructor.
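The AccessChecker entries in this index describe three behaviours: no checking at all (no-argument constructor), a general extraction check, and an allowance for accessibility-only extraction. The sketch below mirrors that decision logic in a self-contained form. It is not Tika code: the class AccessCheckSketch, the two metadata key strings, the use of a plain Map in place of Metadata, and the SecurityException are all stand-ins chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the decision described above; not Tika's AccessChecker.
final class AccessCheckSketch {
    // Assumed metadata keys, invented for this sketch.
    static final String EXTRACT_CONTENT = "extract_content";
    static final String EXTRACT_FOR_ACCESSIBILITY = "extract_for_accessibility";

    private final boolean needToCheck;
    private final boolean allowAccessibility;

    /** Never checks anything and always returns without throwing. */
    AccessCheckSketch() {
        this.needToCheck = false;
        this.allowAccessibility = true;
    }

    /** Checks permissions; optionally accepts accessibility-only extraction. */
    AccessCheckSketch(boolean allowAccessibility) {
        this.needToCheck = true;
        this.allowAccessibility = allowAccessibility;
    }

    void check(Map<String, String> metadata) {
        if (!needToCheck) {
            return;
        }
        // General extraction is treated as allowed unless explicitly set to "false".
        if (!"false".equals(metadata.get(EXTRACT_CONTENT))) {
            return;
        }
        if (allowAccessibility && "true".equals(metadata.get(EXTRACT_FOR_ACCESSIBILITY))) {
            return;
        }
        throw new SecurityException("Content extraction is not allowed");
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put(EXTRACT_CONTENT, "false");
        metadata.put(EXTRACT_FOR_ACCESSIBILITY, "true");
        new AccessCheckSketch(true).check(metadata); // passes: accessibility extraction is accepted
        System.out.println("accessibility-only extraction accepted");
    }
}
```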
checkAvail() - Method in class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
Ping lucene-geo-gazetteer API
checkBit(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.pdf.PDFParser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
 
CHM_ITSF_V2_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_ITSF_V3_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_ITSP_V1_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_LZXC_MIN_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_LZXC_RESETTABLE_V1_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_LZXC_V2_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_PMGI_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_PMGI_MARKER - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_PMGL_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_SIGNATURE_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_VER_1 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_VER_2 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_VER_3 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
CHM_WINDOW_SIZE_BLOCK - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
ChmAccessor<T> - Interface in org.apache.tika.parser.chm.accessor
Defines an accessor interface
ChmAssert - Class in org.apache.tika.parser.chm.assertion
Contains chm extractor assertions
ChmAssert() - Constructor for class org.apache.tika.parser.chm.assertion.ChmAssert
 
ChmBlockInfo - Class in org.apache.tika.parser.chm.lzx
A container that contains chm block information such as: i.
ChmCommons - Class in org.apache.tika.parser.chm.core
 
ChmCommons.EntryType - Enum in org.apache.tika.parser.chm.core
Represents entry types: uncompressed, compressed
ChmCommons.IntelState - Enum in org.apache.tika.parser.chm.core
Represents intel file states during decompression
ChmCommons.LzxState - Enum in org.apache.tika.parser.chm.core
Represents lzx states: started decoding, not started decoding
ChmConstants - Class in org.apache.tika.parser.chm.core
 
ChmDirectoryListingSet - Class in org.apache.tika.parser.chm.accessor
Holds chm listing entries
ChmDirectoryListingSet(byte[], ChmItsfHeader, ChmItspHeader) - Constructor for class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Constructs chm directory listing set
ChmExtractor - Class in org.apache.tika.parser.chm.core
Extracts text from chm file.
ChmExtractor(InputStream) - Constructor for class org.apache.tika.parser.chm.core.ChmExtractor
 
ChmItsfHeader - Class in org.apache.tika.parser.chm.accessor
The Header:
  0000: char[4] 'ITSF'
  0004: DWORD 3 (Version number)
  0008: DWORD Total header length, including header section table and following data.
ChmItsfHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmItsfHeader
 
ChmItspHeader - Class in org.apache.tika.parser.chm.accessor
Directory header. The directory starts with a header; its format is as follows:
  0000: char[4] 'ITSP'
  0004: DWORD Version number 1
  0008: DWORD Length of the directory header
  000C: DWORD $0a (unknown)
  0010: DWORD $1000 Directory chunk size
  0014: DWORD "Density" of quickref section, usually 2
  0018: DWORD Depth of the index tree: 1 if there is no index, 2 if there is one level of PMGI chunks
  001C: DWORD Chunk number of root index chunk, -1 if there is none (though at least one file has 0 despite there being no index chunk, probably a bug)
  0020: DWORD Chunk number of first PMGL (listing) chunk
  0024: DWORD Chunk number of last PMGL (listing) chunk
  0028: DWORD -1 (unknown)
  002C: DWORD Number of directory chunks (total)
  0030: DWORD Windows language ID
  0034: GUID {5D02926A-212E-11D0-9DF9-00A0C922E6EC}
  0044: DWORD $54 (This is the length again)
  0048: DWORD -1 (unknown)
  004C: DWORD -1 (unknown)
  0050: DWORD -1 (unknown)
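The ITSP offsets listed for ChmItspHeader can be read with a little-endian ByteBuffer. The sketch below pulls out a handful of the fields; it is a standalone illustration rather than Tika's ChmItspHeader, the class name ItspHeaderSketch and its field names are hypothetical, and it assumes each DWORD is a 32-bit little-endian integer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical reader for a few ITSP fields; not Tika's ChmItspHeader.
// Assumption: DWORD fields are 32-bit little-endian integers.
final class ItspHeaderSketch {
    static final int HEADER_LENGTH = 0x54; // last DWORD starts at 0x50

    final String signature; // 0000: char[4] 'ITSP'
    final int version;      // 0004: version number
    final int headerLength; // 0008: length of the directory header
    final int chunkSize;    // 0010: directory chunk size
    final int indexDepth;   // 0018: depth of the index tree
    final int rootIndex;    // 001C: chunk number of the root index chunk, -1 if none
    final int totalChunks;  // 002C: number of directory chunks

    ItspHeaderSketch(byte[] data) {
        if (data == null || data.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("ITSP header needs at least 0x54 bytes");
        }
        signature = new String(data, 0, 4, StandardCharsets.US_ASCII);
        if (!"ITSP".equals(signature)) {
            throw new IllegalArgumentException("missing ITSP signature");
        }
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        // Absolute reads: each offset matches the layout listed above.
        version = buf.getInt(0x04);
        headerLength = buf.getInt(0x08);
        chunkSize = buf.getInt(0x10);
        indexDepth = buf.getInt(0x18);
        rootIndex = buf.getInt(0x1C);
        totalChunks = buf.getInt(0x2C);
    }
}
```

Absolute getInt(offset) calls keep each read tied to its documented offset, so fields can be added or skipped without tracking a moving position.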
ChmItspHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmItspHeader
 
ChmLzxBlock - Class in org.apache.tika.parser.chm.lzx
Decompresses a chm block.
ChmLzxBlock(int, byte[], long, ChmLzxBlock) - Constructor for class org.apache.tika.parser.chm.lzx.ChmLzxBlock
 
ChmLzxcControlData - Class in org.apache.tika.parser.chm.accessor
::DataSpace/Storage//ControlData This file contains $20 bytes of information on the compression.
ChmLzxcControlData() - Constructor for class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
 
ChmLzxcResetTable - Class in org.apache.tika.parser.chm.accessor
LZXC reset table For ensuring a decompression.
ChmLzxcResetTable() - Constructor for class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
 
ChmLzxState - Class in org.apache.tika.parser.chm.lzx
 
ChmLzxState(int) - Constructor for class org.apache.tika.parser.chm.lzx.ChmLzxState
 
ChmParser - Class in org.apache.tika.parser.chm
 
ChmParser() - Constructor for class org.apache.tika.parser.chm.ChmParser
 
ChmParsingException - Exception in org.apache.tika.parser.chm.exception
 
ChmParsingException(String) - Constructor for exception org.apache.tika.parser.chm.exception.ChmParsingException
 
ChmPmgiHeader - Class in org.apache.tika.parser.chm.accessor
Description. Note: an index chunk does not always exist. An index chunk has the following format:
  0000: char[4] 'PMGI'
  0004: DWORD Length of quickref/free area at end of directory chunk
  0008: Directory index entries (to quickref/free area)
The quickref area in a PMGI is the same as in a PMGL.
The format of a directory index entry is as follows:
  BYTE: length of name
  BYTEs: name (UTF-8 encoded)
  ENCINT: directory listing chunk which starts with name
Encoded Integers, aka ENCINT: an ENCINT is a variable-length integer.
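The ChmPmgiHeader entry stops at "An ENCINT is a variable-length integer" without giving the encoding. The sketch below assumes the scheme given in commonly circulated CHM format notes: seven payload bits per byte, most significant group first, with the high bit of a byte set when another byte follows. Both that scheme and the class name EncIntSketch are assumptions for this example, not something this index states.

```java
// Hypothetical ENCINT decoder; not taken from Tika.
// Assumption: 7 payload bits per byte, most significant group first,
// high bit set on every byte except the last.
final class EncIntSketch {
    /** Decodes one ENCINT that starts at 'offset' and returns its value. */
    static long decode(byte[] data, int offset) {
        long value = 0;
        for (int i = offset; i < data.length; i++) {
            int b = data[i] & 0xFF;
            value = (value << 7) | (b & 0x7F); // append the next 7 bits
            if ((b & 0x80) == 0) {
                return value;                  // high bit clear: last byte
            }
        }
        throw new IllegalArgumentException("truncated ENCINT");
    }

    public static void main(String[] args) {
        // 0xEA 0x15 -> ((0xEA & 0x7F) << 7) | 0x15
        System.out.println(Long.toHexString(decode(new byte[] {(byte) 0xEA, 0x15}, 0)));
    }
}
```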
ChmPmgiHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
 
ChmPmglHeader - Class in org.apache.tika.parser.chm.accessor
Description There are two types of directory chunks -- index chunks, and listing chunks.
ChmPmglHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
ChmSection - Class in org.apache.tika.parser.chm.lzx
 
ChmSection(byte[]) - Constructor for class org.apache.tika.parser.chm.lzx.ChmSection
 
ChmSection(byte[], byte[]) - Constructor for class org.apache.tika.parser.chm.lzx.ChmSection
 
ChmWrapper - Class in org.apache.tika.parser.chm.core
 
ChmWrapper() - Constructor for class org.apache.tika.parser.chm.core.ChmWrapper
 
chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.AbstractChunking
This method is used to chunk the file data.
chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.RDCAnalysisChunking
This method is used to chunk the file data.
chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.SimpleChunking
This method is used to chunk the file data.
chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ZipFilesChunking
This method is used to chunk the file data.
ChunkingFactory - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
This class is used to create instance of AbstractChunking.
ChunkingMethod - Enum in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
 
ClassParser - Class in org.apache.tika.parser.asm
Parser for Java .class files.
ClassParser() - Constructor for class org.apache.tika.parser.asm.ClassParser
 
clearBit(byte[], long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.Bit
Set a bit value to "Off" in the specified byte array with the specified bit position.
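A minimal sketch of clearing one bit in a byte array follows. It assumes that bit position 0 is the least significant bit of the first byte; the bit ordering used by the actual Bit class may differ, and the class name here is hypothetical.

```java
// Hypothetical sketch, not the Tika Bit class.
// Assumption: bit 0 is the least significant bit of array[0].
final class BitSketch {

    static void clearBit(byte[] array, long bit) {
        int byteIndex = (int) (bit / 8);
        int bitInByte = (int) (bit % 8);
        // AND with a mask that has every bit set except the target bit.
        array[byteIndex] &= (byte) ~(1 << bitInByte);
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFF};
        clearBit(data, 3);
        System.out.println(String.format("0x%02X", data[0] & 0xFF));
    }
}
```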
clone() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
closeStyleTags(XHTMLContentHandler, Deque<FormattingUtils.Tag>) - Static method in class org.apache.tika.parser.microsoft.FormattingUtils
Closes all formatting tags.
CommonsDigester - Class in org.apache.tika.parser.utils
Implementation of DigestingParser.Digester that relies on commons.codec.digest.DigestUtils to calculate digest hashes.
CommonsDigester(int, String) - Constructor for class org.apache.tika.parser.utils.CommonsDigester
Include a string representing the comma-separated algorithms to run.
CommonsDigester(int, CommonsDigester.DigestAlgorithm...) - Constructor for class org.apache.tika.parser.utils.CommonsDigester
CommonsDigester.DigestAlgorithm - Enum in org.apache.tika.parser.utils
 
COMP_OBJ - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Some other kind of embedded document, in a CompObj container within another OLE2 document
Compact64bitInt - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
A 9-byte encoding of values in the range 0x0002000000000000 through 0xFFFFFFFFFFFFFFFF
Compact64bitInt(long) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Initializes a new instance of the Compact64bitInt class with specified value.
Compact64bitInt() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Initializes a new instance of the Compact64bitInt class, this is the default constructor.
CompactID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
This class is used to represent the CompactID structure.
CompactID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
 
CompactUint14bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specifies the type value for the compact 14-bit uint type.
CompactUint21bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specifies the type value for the compact 21-bit uint type.
CompactUint28bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specifies the type value for the compact 28-bit uint type.
CompactUint35bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specifies the type value for the compact 35-bit uint type.
CompactUint42bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specifies the type value for the compact 42-bit uint type.
CompactUint49bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specifies the type value for the compact 49-bit uint type.
CompactUint64bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specifies the type value for the compact 64-bit uint type.
CompactUint7bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specifies the type value for the compact 7-bit uint type.
CompactUintNullType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specifies the type value for the compact uint zero type.
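The type constants above correspond to the width classes of the compact unsigned 64-bit integer. The sketch below shows one way such an encoder could look. It is based on an assumed reading of the [MS-FSSHTTPB] layout: a single zero byte for 0; for 7 to 49 bits, an n-byte little-endian value whose lowest n bits are a marker with only bit n-1 set; and for larger values, a 0x80 marker byte followed by the 8-byte little-endian value. The class name is hypothetical and this is not the Compact64bitInt implementation.

```java
// Hypothetical sketch, not the Tika Compact64bitInt class.
// Assumption: little-endian byte order and the marker layout
// described in the text above.
final class CompactUintSketch {

    static byte[] encode(long value) {
        if (value == 0) {
            return new byte[] {0x00};
        }
        // Number of significant bits, treating the long as unsigned.
        int bits = 64 - Long.numberOfLeadingZeros(value);
        if (bits > 49) {
            // 9-byte form: marker byte 0x80, then the raw 64-bit value.
            byte[] out = new byte[9];
            out[0] = (byte) 0x80;
            for (int i = 0; i < 8; i++) {
                out[1 + i] = (byte) (value >>> (8 * i));
            }
            return out;
        }
        // n bytes carry 7 * n payload bits plus an n-bit marker.
        int n = (bits + 6) / 7;
        long packed = (value << n) | (1L << (n - 1));
        byte[] out = new byte[n];
        for (int i = 0; i < n; i++) {
            out[i] = (byte) (packed >>> (8 * i));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(encode(0x7FL).length);                // 1
        System.out.println(encode(0x80L).length);                // 2
        System.out.println(encode(0x0002000000000000L).length);  // 9
    }
}
```

Under these assumptions the 9-byte form starts exactly at 0x0002000000000000, which agrees with the range stated for Compact64bitInt.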
compare(long, long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
compareTo(CSVResult) - Method in class org.apache.tika.parser.csv.CSVResult
Sorts in descending order of confidence
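A descending-confidence ordering of this kind can be sketched by reversing the operands of the comparison. The class below is a hypothetical stand-in for illustration, not CSVResult itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a scored result; not the Tika CSVResult class.
final class ScoredGuess implements Comparable<ScoredGuess> {
    final String label;
    final double confidence;

    ScoredGuess(String label, double confidence) {
        this.label = label;
        this.confidence = confidence;
    }

    @Override
    public int compareTo(ScoredGuess other) {
        // Operands are swapped so that higher confidence sorts first.
        return Double.compare(other.confidence, this.confidence);
    }

    public static void main(String[] args) {
        List<ScoredGuess> guesses = new ArrayList<>();
        guesses.add(new ScoredGuess("tsv", 0.2));
        guesses.add(new ScoredGuess("csv", 0.9));
        Collections.sort(guesses);
        System.out.println(guesses.get(0).label); // csv
    }
}
```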
compareTo(ExtendedGUID) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
compareTo(UByte) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
compareTo(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
compareTo(ULong) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
compareTo(UShort) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
compareTo(GUID) - Method in class org.apache.tika.parser.microsoft.onenote.GUID
 
compareTo(CharsetMatch) - Method in class org.apache.tika.parser.txt.CharsetMatch
Compare to other CharsetMatch objects.
CompositeTagHandler - Class in org.apache.tika.parser.mp3
Takes an array of ID3Tags in preference order, and when asked for a given tag, will return it from the first ID3Tags that has it.
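The first-match lookup described here can be sketched as a loop over sources in preference order. The TagSource interface and its getTitle method are hypothetical simplifications for illustration, not the ID3Tags API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of "first source that has the tag wins".
final class FirstMatchSketch {

    /** Simplified stand-in for a tag source; returns null when the tag is absent. */
    interface TagSource {
        String getTitle();
    }

    static String firstTitle(List<TagSource> sourcesInPreferenceOrder) {
        for (TagSource source : sourcesInPreferenceOrder) {
            String title = source.getTitle();
            if (title != null) {
                return title;
            }
        }
        return null; // no source had the tag
    }

    public static void main(String[] args) {
        TagSource preferred = () -> null;          // preferred source lacks the tag
        TagSource fallback = () -> "Song title";   // fallback source has it
        System.out.println(firstTitle(Arrays.asList(preferred, fallback)));
    }
}
```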
CompositeTagHandler(ID3Tags[]) - Constructor for class org.apache.tika.parser.mp3.CompositeTagHandler
 
compound - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
Gets or sets a value that, if set, specifies that a compound parse type is needed and MUST be ended with either an 8-bit stream object header end or a 16-bit stream object header end.
CompressorParser - Class in org.apache.tika.parser.pkg
Parser for various compression formats.
CompressorParser() - Constructor for class org.apache.tika.parser.pkg.CompressorParser
 
CompressorParserOptions - Interface in org.apache.tika.parser.pkg
Interface for setting options for the CompressorParser by passing via the ParseContext.
confidence - Variable in class org.apache.tika.parser.recognition.RecognisedObject
Confidence score
config - Variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
configure(ParseContext) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
Checks to see if the user has specified an OfficeParserConfig.
configure(PDF2XHTML) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Configures the given pdf2XHTML.
configureExtractor(POIXMLTextExtractor, Locale) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
 
configureExtractor(POIXMLTextExtractor, Locale) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
 
contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
 
containsEmail(String) - Static method in class org.apache.tika.parser.mail.MailUtil
Checks whether the chunk looks like it contains an email.
CONTENT - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
content - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
 
content - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
 
content - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
Gets or sets an extended GUID array
contextIDs - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
 
CONTROL_DATA - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
converttoInt(byte[]) - Static method in class org.apache.tika.parser.image.ICNSType
 
convertToJSONArray(JSONObject, String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
Converts JSON Object to JSON Array
convertToJSONObject(String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
Parses a JSON String and converts it to a JSON Object
copyOfRange(byte[], int, int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
 
CoreNLPNERecogniser - Class in org.apache.tika.parser.ner.corenlp
This class offers an implementation of NERecogniser based on CRF classifiers from Stanford CoreNLP.
CoreNLPNERecogniser() - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
 
CoreNLPNERecogniser(String) - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
Creates a NERecogniser by loading model from given path
count - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
 
count - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
 
count - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
 
cProperties - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
 
cProperties - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
 
createCellMainifestDataElement(ExGuid, Map<CellID, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to create the cell manifest data element.
createChunkingInstance(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingFactory
This method is used to create the instance of AbstractChunking.
createChunkingInstance(IntermediateNodeObject) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingFactory
This method is used to create the instance of AbstractChunking.
createChunkingInstance(byte[], ChunkingMethod) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingFactory
This method is used to create the instance of AbstractChunking.
createDecryptStream(InputStream, Key) - Method in class org.apache.tika.parser.hwp.HwpTextExtractorV5
 
createFrameIfPresent(InputStream) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
Returns the next ID3v2 Frame in the file, or null if the next batch of data doesn't correspond to an ID3v2 header.
createInstance(ObjectGroupDataElementData) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.HeaderCell
Create the instance of Header Cell.
createInstance(ExGuid, ObjectGroupDataElementData, boolean) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
 
createObjectGroupDataElement(byte[], AtomicReference<ExGuid>, List<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to create object group data/blob element list.
createOneNoteDocumentFromDirectFileResource(OneNoteDirectFileResource) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
Create a OneNoteDocument object.
createRevisionManifestDataElement(ExGuid, ExGuid, List<ExGuid>, Map<ExGuid, ExGuid>, AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to create the revision manifest data element.
createStorageIndexDataElement(ExGuid, Map<CellID, ExGuid>, Map<ExGuid, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to create the storage index data element.
createStorageManifestDataElement(Map<CellID, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to create the storage manifest data element.
CSVParams - Class in org.apache.tika.parser.csv
 
CSVResult - Class in org.apache.tika.parser.csv
 
CSVResult(double, MediaType, Character) - Constructor for class org.apache.tika.parser.csv.CSVResult
 
CTAKES_META_PREFIX - Static variable in class org.apache.tika.parser.ctakes.CTAKESContentHandler
 
CTAKESAnnotationProperty - Enum in org.apache.tika.parser.ctakes
This enumeration includes the properties that an IdentifiedAnnotation object can provide.
CTAKESConfig - Class in org.apache.tika.parser.ctakes
Configuration for CTAKESContentHandler.
CTAKESConfig() - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
Default constructor.
CTAKESConfig(InputStream) - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
Loads properties from InputStream and then tries to close InputStream.
CTAKESContentHandler - Class in org.apache.tika.parser.ctakes
Class used to extract biomedical information while parsing.
CTAKESContentHandler(ContentHandler, Metadata, CTAKESConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
Creates a new CTAKESContentHandler for the given ContentHandler and Metadata objects.
CTAKESContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
Creates a new CTAKESContentHandler for the given ContentHandler and Metadata objects.
CTAKESContentHandler() - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
Default constructor.
CTAKESParser - Class in org.apache.tika.parser.ctakes
CTAKESParser decorates a Parser and leverages on CTAKESContentHandler to extract biomedical information from clinical text using Apache cTAKES.
CTAKESParser() - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
Wraps the default Parser
CTAKESParser(TikaConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
Wraps the default Parser for this Config
CTAKESParser(Parser) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
Wraps the specified Parser
CTAKESSerializer - Enum in org.apache.tika.parser.ctakes
Enumeration for types of cTAKES (UIMA) CAS serializer supported by cTAKES.
CTAKESUtils - Class in org.apache.tika.parser.ctakes
This class provides methods to extract biomedical information from plain text using CTAKESContentHandler that relies on Apache cTAKES.
CTAKESUtils() - Constructor for class org.apache.tika.parser.ctakes.CTAKESUtils
 

D

data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
Gets or sets a binary item as specified in [MS-FSSHTTPB] section 2.2.1.3 that specifies a value that is unique to the file data represented by this root node object.
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
 
data - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
 
DataElement - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
DataElement(DataElementType, DataElementData) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
Initializes a new instance of the DataElement class.
DataElement() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
Initializes a new instance of the DataElement class.
DataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Base class of data element
DataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementData
 
dataElementExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
 
DataElementHash - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Specifies a data element hash stream object
DataElementHash() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
Initializes a new instance of the DataElementHash class.
dataElementHash - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
 
dataElementHashData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
 
dataElementHashScheme - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
 
dataElementPackage - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
DataElementPackage - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
DataElementPackage() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
Initializes a new instance of the DataElementPackage class.
DataElementParseErrorException - Exception in org.apache.tika.parser.microsoft.onenote.fsshttpb.exception
 
DataElementParseErrorException(int, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.exception.DataElementParseErrorException
 
DataElementParseErrorException(int, String, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.exception.DataElementParseErrorException
 
dataElements - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
 
DataElementType - Enum in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
The enumeration of the data element type
dataElementType - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
 
DataElementUtils - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
 
DataElementUtils() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
 
dataHash - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
 
DataHashObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
DataHashObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
Initializes a new instance of the DataHashObject class.
DataNodeObjectData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
Data Node Object data
DataNodeObjectData(byte[], int, int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataNodeObjectData
Initializes a new instance of the DataNodeObjectData class.
dataNodeObjectData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
 
dataRoot - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
dataSize - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
 
dataSize - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
 
DataSizeObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Data Size Object
DataSizeObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
Initializes a new instance of the DataSizeObject class.
DataURIScheme - Class in org.apache.tika.parser.utils
 
DataURISchemeParseException - Exception in org.apache.tika.parser.utils
 
DataURISchemeParseException(String) - Constructor for exception org.apache.tika.parser.utils.DataURISchemeParseException
 
DataURISchemeUtil - Class in org.apache.tika.parser.utils
Not thread safe.
DataURISchemeUtil() - Constructor for class org.apache.tika.parser.utils.DataURISchemeUtil
 
DATE - Static variable in interface org.apache.tika.parser.ner.NERecogniser
 
DATE_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
DBFParser - Class in org.apache.tika.parser.dbf
This is a Tika wrapper around the DBFReader.
DBFParser() - Constructor for class org.apache.tika.parser.dbf.DBFParser
 
DcXMLParser - Class in org.apache.tika.parser.xml
Dublin Core metadata parser
DcXMLParser() - Constructor for class org.apache.tika.parser.xml.DcXMLParser
 
decompressConcatenated(Metadata) - Method in interface org.apache.tika.parser.pkg.CompressorParserOptions
 
DEF_MODEL - Static variable in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
 
DEFAULT_CHARSET - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
DEFAULT_MODEL_PATH - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
default Model path
DEFAULT_MODELS - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
DEFAULT_NER_IMPL - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
 
DefaultHtmlMapper - Class in org.apache.tika.parser.html
The default HTML mapping rules in Tika.
DefaultHtmlMapper() - Constructor for class org.apache.tika.parser.html.DefaultHtmlMapper
 
DELIMITER_PROPERTY - Static variable in class org.apache.tika.parser.csv.TextAndCSVParser
 
deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
Used to return the length of this element.
deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementData
De-serialize data element data from byte array.
deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
Used to return the length of this element.
deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
Used to return the length of this element.
deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
Used to de-serialize the data element.
deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
Used to de-serialize data element.
deserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
Used to return the length of this element.
deserializeFromByteArray(StreamObjectHeaderStart, byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
Used to return the length of this element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadata
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadataDeclarations
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
Used to de-serialize the items.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
Used to de-serialize the items.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
Used to de-serialize the items.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
Used to de-serialize the items.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
Used to de-serialize the items.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
De-serialize items from byte array.
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.apple.BPListDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
 
detect(ZipFile) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
 
detect(ZipFile) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
 
detect(Set<String>) - Static method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Deprecated.
Use POIFSContainerDetector.detect(Set, DirectoryEntry) and pass the root entry of the filesystem whose type is to be detected, as a second argument.
detect(Set<String>, DirectoryEntry) - Static method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Internal detection of the specific kind of OLE2 document, based on the names of the top-level streams within the file.
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.pkg.StreamingZipContainerDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.pkg.ZipContainerDetector
 
detect() - Method in class org.apache.tika.parser.txt.CharsetDetector
Return the charset that best matches the supplied input data.
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
 
detectAll() - Method in class org.apache.tika.parser.txt.CharsetDetector
Return an array of all charsets that appear to be plausible matches with the input data.
detectIfPossible(ZipEntry) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
 
detectIfPossible(ZipEntry) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
 
detectOfficeOpenXML(OPCPackage) - Static method in class org.apache.tika.parser.pkg.ZipContainerDetector
Detects the type of an OfficeOpenXML (OOXML) file from opened Package
detectType(ZipArchiveEntry, ZipFile) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
 
detectType(ZipArchiveEntry, ZipArchiveInputStream) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
 
detectType(InputStream) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
 
detectType(POIFSFileSystem) - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
 
detectType(DirectoryEntry) - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
 
DIFContentHandler - Class in org.apache.tika.parser.dif
 
DIFContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.dif.DIFContentHandler
 
DIFParser - Class in org.apache.tika.parser.dif
 
DIFParser() - Constructor for class org.apache.tika.parser.dif.DIFParser
 
DirectoryListingEntry - Class in org.apache.tika.parser.chm.accessor
The format of a directory listing entry is as follows:
BYTE: length of name
BYTEs: name (UTF-8 encoded)
ENCINT: content section
ENCINT: offset
ENCINT: length
The offset is from the beginning of the content section the file is in, after the section has been decompressed (if appropriate).
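The entry layout above can be read field by field, as in the following sketch. It follows the layout exactly as stated here (a one-byte name length, the UTF-8 name, then three ENCINTs) and assumes the usual CHM ENCINT encoding of seven payload bits per byte with the high bit as a continuation flag. The class is hypothetical and is not the Tika DirectoryListingEntry.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical reader for the listing-entry layout described above.
final class ListingEntrySketch {
    final String name;
    final long section;
    final long offset;
    final long length;

    ListingEntrySketch(String name, long section, long offset, long length) {
        this.name = name;
        this.section = section;
        this.offset = offset;
        this.length = length;
    }

    // Assumption: 7 payload bits per byte, most significant group first,
    // high bit set while more bytes follow.
    static long readEncint(ByteBuffer buf) {
        long value = 0;
        int b;
        do {
            b = buf.get() & 0xFF;
            value = (value << 7) | (b & 0x7F);
        } while ((b & 0x80) != 0);
        return value;
    }

    static ListingEntrySketch read(ByteBuffer buf) {
        int nameLength = buf.get() & 0xFF;      // BYTE: length of name
        byte[] nameBytes = new byte[nameLength];
        buf.get(nameBytes);                     // BYTEs: name (UTF-8 encoded)
        String name = new String(nameBytes, StandardCharsets.UTF_8);
        long section = readEncint(buf);         // ENCINT: content section
        long offset = readEncint(buf);          // ENCINT: offset
        long length = readEncint(buf);          // ENCINT: length
        return new ListingEntrySketch(name, section, offset, length);
    }

    public static void main(String[] args) {
        byte[] raw = {2, '/', 'a', 0x00, (byte) 0x81, 0x00, 0x05};
        ListingEntrySketch e = read(ByteBuffer.wrap(raw));
        System.out.println(e.name + " section=" + e.section
                + " offset=" + e.offset + " length=" + e.length);
    }
}
```

A truncated entry makes ByteBuffer.get throw BufferUnderflowException, which a caller would probably want to translate into a format-specific parse error.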
DirectoryListingEntry() - Constructor for class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
 
DirectoryListingEntry(int, String, ChmCommons.EntryType, int, int) - Constructor for class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
Constructs a DirectoryListingEntry.
dispose() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
Assign the internal read buffer to null.
DOC - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Microsoft Word
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.ArrayNumber
This method is used to deserialize the ArrayNumber from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
This method is used to deserialize the EightBytesOfData from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
This method is used to deserialize the FourBytesOfData from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in interface org.apache.tika.parser.microsoft.onenote.fsshttpb.property.IProperty
This method is used to deserialize the property from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.NoData
This method is used to deserialize the NoData from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
This method is used to deserialize the OneByteOfData from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
This method is used to deserialize the prtArrayOfPropertyValues from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
This method is used to deserialize the prtFourBytesOfLengthFollowedByData from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
This method is used to deserialize the TwoBytesOfData from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
This method is used to deserialize the Alternative Packaging object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
Used to return the length of this element.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
This method is used to deserialize the BinaryItem basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
This method is used to deserialize the CellID basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
This method is used to deserialize the CellIDArray basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
This method is used to deserialize the Compact64bitInt basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
This method is used to deserialize the CompactID object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
This method is used to deserialize the ExGuid basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
This method is used to deserialize the ExGUIDArray basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
This method is used to deserialize the JCID object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
This method is used to deserialize the PropertyID object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
This method is used to deserialize the SerialNumber basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
This method is used to deserialize the PropertySet from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
This method is used to deserialize the ObjectSpaceObjectPropSet from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
This method is used to deserialize the ObjectSpaceObjectStreamHeader object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
This method is used to deserialize the ObjectSpaceObjectStreamOfContextIDs object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
This method is used to deserialize the ObjectSpaceObjectStreamOfOIDs object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
This method is used to deserialize the ObjectSpaceObjectStreamOfOSIDs object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
This method is used to deserialize the StreamObjectHeaderEnd16bit basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
This method is used to deserialize the StreamObjectHeaderEnd8bit basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
This method is used to deserialize the StreamObjectHeaderStart16bit basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
This method is used to deserialize the StreamObjectHeaderStart32bit basic object from the specified byte array and start index.
doubleByte - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.TextEncoding
 
doubleToInt64Bits(double) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
drawingHyperlinks - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
DWGParser - Class in org.apache.tika.parser.dwg
DWG (CAD Drawing) parser.
DWGParser() - Constructor for class org.apache.tika.parser.dwg.DWGParser
 

E

EightBytesOfData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
This class is used to represent a property that contains 8 bytes of data in the PropertySet.rgData stream field.
EightBytesOfData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
 
ElementMetadataHandler - Class in org.apache.tika.parser.xml
SAX event handler that maps the contents of an XML element into a metadata field.
ElementMetadataHandler(String, String, Metadata, String) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
Constructor for string metadata keys.
ElementMetadataHandler(String, String, Metadata, String, boolean, boolean) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
Constructor for string metadata keys which allows change of behavior for duplicate and empty entry values.
ElementMetadataHandler(String, String, Metadata, Property) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
Constructor for Property metadata keys.
ElementMetadataHandler(String, String, Metadata, Property, boolean, boolean) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
Constructor for Property metadata keys which allows change of behavior for duplicate and empty entry values.
EMBEDDED_RELATIONSHIPS - Static variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
embeddedOLERef(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
embeddedOLERef(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
embeddedPicRef(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
embeddedPicRef(String, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
EMFParser - Class in org.apache.tika.parser.microsoft
Extracts files embedded in EMF and offers a very rough capability to extract text if there is text stored in the EMF.
EMFParser() - Constructor for class org.apache.tika.parser.microsoft.EMFParser
 
EMPTY_LIST - Static variable in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
Empty singleton to be used when there is no list manager.
EMPTY_STYLES - Static variable in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFStylesShim
Empty singleton to be used when there is no style info.
emptyGuid() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.GuidUtil
 
enableInputFilter(boolean) - Method in class org.apache.tika.parser.txt.CharsetDetector
Enable filtering of input text.
encoding - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.TextEncoding
 
encodings - Static variable in class org.apache.tika.parser.mp3.ID3v2Frame
 
encryptionObjects - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
 
endBookmark(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endBookmark(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
endDocument() - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
 
endDocument() - Method in class org.apache.tika.parser.dif.DIFContentHandler
 
endDocument() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
 
endDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
endDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
endDocument() - Method in class org.apache.tika.parser.mif.MIFContentHandler
 
endDocument() - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
 
endEditedSection() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endEditedSection() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.dif.DIFContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
endElement(String, String, String) - Method in class org.apache.tika.parser.mif.MIFContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.xml.MetadataHandler
Deprecated.
 
ENDIAN - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
endnoteReference(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endnoteReference(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
endParagraph() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endParagraph() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
endPrefixMapping(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
endPrefixMapping(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
endRow(int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
 
endSDT() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endSDT() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
endTable() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endTable() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
endTableCell() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endTableCell() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
endTableRow() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endTableRow() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
ensureFormattingState(XHTMLContentHandler, EnumSet<FormattingUtils.Tag>, Deque<FormattingUtils.Tag>) - Static method in class org.apache.tika.parser.microsoft.FormattingUtils
Closes all tags until currentState contains only tags from desired set, then open all required tags to reach desired state.
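The close-then-open behaviour described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the FormattingUtils implementation: the Tag enum, the ensureState method, and the string output are hypothetical stand-ins for FormattingUtils.Tag and the XHTMLContentHandler calls, and the deque is assumed to hold the innermost open tag at its head.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.EnumSet;
import java.util.Locale;

// Hypothetical sketch; it emits markup into a String instead of a content handler.
public class FormattingStateSketch {
    enum Tag { B, I, U }

    static String ensureState(EnumSet<Tag> desired, Deque<Tag> current) {
        StringBuilder out = new StringBuilder();
        // Close from the innermost tag outward until only desired tags remain open.
        while (!desired.containsAll(current)) {
            Tag closed = current.pop();
            out.append("</").append(closed.name().toLowerCase(Locale.ROOT)).append('>');
        }
        // Open every desired tag that is not open yet.
        for (Tag tag : desired) {
            if (!current.contains(tag)) {
                current.push(tag);
                out.append('<').append(tag.name().toLowerCase(Locale.ROOT)).append('>');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Deque<Tag> open = new ArrayDeque<>();
        open.push(Tag.B);
        open.push(Tag.I); // <b><i> are open, <i> is innermost
        System.out.println(ensureState(EnumSet.of(Tag.B, Tag.U), open));
        // prints: </i><u>
    }
}
```

Closing from the head of the deque keeps the emitted markup properly nested: a tag that is still wanted but sits inside an unwanted one is closed and then reopened.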
ensureSkip(long) - Method in class org.apache.tika.parser.hwp.HwpStreamReader
Ensures that n bytes are skipped.
ENTITY_LOCAL_NAMES - Static variable in class org.apache.tika.parser.xml.XMLProfiler
 
ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
 
ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
 
ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
 
ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
Some common entities identified by NLTK.
ENTITY_URIS - Static variable in class org.apache.tika.parser.xml.XMLProfiler
 
entityTypes - Variable in class org.apache.tika.parser.ner.regex.RegexNERecogniser
 
enumerateChm() - Method in class org.apache.tika.parser.chm.core.ChmExtractor
Enumerates CHM entities.
ENVI_MIME_TYPE - Static variable in class org.apache.tika.parser.envi.EnviHeaderParser
 
EnviHeaderParser - Class in org.apache.tika.parser.envi
 
EnviHeaderParser() - Constructor for class org.apache.tika.parser.envi.EnviHeaderParser
 
EnviHeaderParser(EncodingDetector) - Constructor for class org.apache.tika.parser.envi.EnviHeaderParser
 
EpubContentParser - Class in org.apache.tika.parser.epub
Parser for EPUB OPS *.html files.
EpubContentParser() - Constructor for class org.apache.tika.parser.epub.EpubContentParser
 
EpubParser - Class in org.apache.tika.parser.epub
EPUB parser.
EpubParser() - Constructor for class org.apache.tika.parser.epub.EpubParser
 
equals(Object) - Method in class org.apache.tika.parser.csv.CSVResult
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
Overrides the equals method.
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Overrides the equals method.
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.GUID
 
equals(Object) - Method in class org.apache.tika.parser.pdf.AccessChecker
 
equals(Object) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
equals(Object) - Method in class org.apache.tika.parser.txt.CharsetMatch
Compares this CharsetMatch to another based on confidence value.
equals(Object) - Method in class org.apache.tika.parser.utils.DataURIScheme
 
Error - Enum in org.apache.tika.parser.microsoft.onenote
 
ExcelExtractor - Class in org.apache.tika.parser.microsoft
Excel parser implementation which uses POI's Event API to handle the contents of a Workbook.
ExcelExtractor(ParseContext, Metadata) - Constructor for class org.apache.tika.parser.microsoft.ExcelExtractor
 
ExecutableParser - Class in org.apache.tika.parser.executable
Parser for executable files.
ExecutableParser() - Constructor for class org.apache.tika.parser.executable.ExecutableParser
 
exGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataNodeObjectData
 
ExGuid - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
ExGuid(int, UUID) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Initializes a new instance of the ExGuid class with specified value.
ExGuid(ExGuid) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Initializes a new instance of the ExGuid class, this is the copy constructor.
ExGuid() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Initializes a new instance of the ExGuid class, this is a default constructor.
exGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
 
ExGUIDArray - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
ExGUIDArray(List<ExGuid>) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
Initializes a new instance of the ExGUIDArray class with specified value.
ExGUIDArray(ExGUIDArray) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
Initializes a new instance of the ExGUIDArray class, this is the copy constructor.
ExGUIDArray() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
Initializes a new instance of the ExGUIDArray class, this is the default constructor.
ExtendedGUID - Class in org.apache.tika.parser.microsoft.onenote
 
ExtendedGUID() - Constructor for class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
ExtendedGUID(GUID, long) - Constructor for class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
ExtendedGUID10BitUintType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Specify the extended GUID 10 Bit int type value.
ExtendedGUID17BitUintType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Specify the extended GUID 17 Bit int type value.
ExtendedGUID32BitUintType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Specify the extended GUID 32 Bit int type value.
ExtendedGUID5BitUintType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Specify the extended GUID 5 Bit int type value.
ExtendedGUIDNullType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Specify the extended GUID null type value.
extendedStreamsPresent - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
 
extendGUID1 - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
 
extendGUID2 - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
 
EXTENSION_TAG_EXIF - Static variable in class org.apache.tika.parser.image.BPGParser
 
EXTENSION_TAG_ICC_PROFILE - Static variable in class org.apache.tika.parser.image.BPGParser
 
EXTENSION_TAG_THUMBNAIL - Static variable in class org.apache.tika.parser.image.BPGParser
 
EXTENSION_TAG_XMP - Static variable in class org.apache.tika.parser.image.BPGParser
 
EXTRA_BITS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
extract(InputStream, Metadata, XHTMLContentHandler) - Method in class org.apache.tika.parser.hwp.HwpTextExtractorV5
Extracts text from an HWP stream.
extract(Metadata) - Method in class org.apache.tika.parser.microsoft.ooxml.MetadataExtractor
 
extract(String) - Method in class org.apache.tika.parser.utils.DataURISchemeUtil
Extracts DataURISchemes from free text, as in JavaScript.
extractChmEntry(DirectoryListingEntry) - Method in class org.apache.tika.parser.chm.core.ChmExtractor
Decompresses a chm entry
extractDublinCore(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
Tries to extract Dublin Core schema from XMP.
extractDublinCoreSchema(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.indesign.xmp.XMPMetadataExtractor
Extracts Dublin Core.
extractGenre(String) - Static method in class org.apache.tika.parser.mp3.ID3v22Handler
 
extractHeaderFooter(String, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
 
extractHeaderFooter(String, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
extractHyperLinks(PackagePart, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
extractMacros(POIFSFileSystem, ContentHandler, EmbeddedDocumentExtractor) - Static method in class org.apache.tika.parser.microsoft.OfficeParser
Helper to extract macros from an NPOIFS/vbaProject.bin. As of POI-3.15-final, there are still some bugs in VBAMacroReader.
extractor - Variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
extractXMPBasicSchema(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.indesign.xmp.XMPMetadataExtractor
Extracts basic schema metadata from XMP.
extractXMPMM(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
Extracts Media Management metadata from XMP.

F

FeedParser - Class in org.apache.tika.parser.feed
Feed parser.
FeedParser() - Constructor for class org.apache.tika.parser.feed.FeedParser
 
FictionBookParser - Class in org.apache.tika.parser.xml
 
FictionBookParser() - Constructor for class org.apache.tika.parser.xml.FictionBookParser
 
FileConfig - Class in org.apache.tika.parser.strings
Configuration for the "file" (or file-alternative) command.
FileConfig() - Constructor for class org.apache.tika.parser.strings.FileConfig
Default constructor.
fileContent - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.AbstractChunking
 
fileDataObject - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
 
findIconType(byte[]) - Static method in class org.apache.tika.parser.image.ICNSType
 
findMatches(String, Pattern) - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
Finds matching subgroups in text.
findNames(String[]) - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
Finds names in the given array of tokens.
findStorageIndexCellMapping(CellID) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
This method is used to find the Storage Index Cell Mapping that matches the Cell ID.
findStorageIndexRevisionMapping(ExGuid) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
This method is used to find the Storage Index Revision Mapping that matches the Revision Mapping Extended GUID.
flag - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
 
FlatOpenDocumentParser - Class in org.apache.tika.parser.odf
 
FlatOpenDocumentParser() - Constructor for class org.apache.tika.parser.odf.FlatOpenDocumentParser
 
floatValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
floatValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
floatValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
floatValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
FLVParser - Class in org.apache.tika.parser.video
Parser for metadata contained in Flash Videos (.flv).
FLVParser() - Constructor for class org.apache.tika.parser.video.FLVParser
 
footers - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
 
footnoteReference(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
footnoteReference(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
format(Object, StringBuffer, FieldPosition) - Method in class org.apache.tika.parser.microsoft.TikaExcelGeneralFormat
 
formatRawCellContents(double, int, String, boolean) - Method in class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
 
formatter - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
FormattingUtils - Class in org.apache.tika.parser.microsoft
 
FormattingUtils.Tag - Enum in org.apache.tika.parser.microsoft
 
FourBytesOfData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
This class is used to represent a property that contains 4 bytes of data in the PropertySet.rgData stream field.
FourBytesOfData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
 
fromCurlyBraceUTF16Bytes(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.GUID
Converts a GUID of format: {AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE} (in bytes) to a GUID object.
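A rough sketch of the conversion described above, under stated assumptions: the bytes are taken to be UTF-16LE text, and java.util.UUID stands in for Tika's own GUID class. The class CurlyBraceGuidSketch and its method are hypothetical and do not reflect the actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical sketch; Tika's GUID class is not java.util.UUID.
public class CurlyBraceGuidSketch {

    /** Decodes "{AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE}" stored as UTF-16LE bytes (assumed). */
    static UUID fromCurlyBraceUtf16(byte[] bytes) {
        // trim() also drops any trailing NUL terminator characters
        String text = new String(bytes, StandardCharsets.UTF_16LE).trim();
        if (text.length() != 38 || text.charAt(0) != '{' || text.charAt(37) != '}') {
            throw new IllegalArgumentException("Not a curly-brace GUID: " + text);
        }
        // Strip the braces and let UUID parse the 36-character body
        return UUID.fromString(text.substring(1, 37));
    }

    public static void main(String[] args) {
        byte[] raw = "{01234567-89AB-CDEF-0123-456789ABCDEF}"
                .getBytes(StandardCharsets.UTF_16LE);
        System.out.println(fromCurlyBraceUtf16(raw));
    }
}
```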
fromIntVal(int) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
 
fromIntVal(int) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
 
fromIntVal(int) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
 
fromIntVal(int) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
 

G

GDALParser - Class in org.apache.tika.parser.gdal
Wraps execution of the Geospatial Data Abstraction Library (GDAL) gdalinfo tool used to extract geospatial information out of hundreds of geo file formats.
GDALParser() - Constructor for class org.apache.tika.parser.gdal.GDALParser
 
GENERAL_EMBEDDED - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
General embedded document type within an OLE2 container
GENRES - Static variable in interface org.apache.tika.parser.mp3.ID3Tags
List of predefined genres.
GeoGazetteerClient - Class in org.apache.tika.parser.geo.topic.gazetteer
 
GeoGazetteerClient(String) - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
Pass URL on which lucene-geo-gazetteer is available - eg.
GeoGazetteerClient(GeoParserConfig) - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
 
GeographicInformationParser - Class in org.apache.tika.parser.geoinfo
 
GeographicInformationParser() - Constructor for class org.apache.tika.parser.geoinfo.GeographicInformationParser
 
geoInfoType - Static variable in class org.apache.tika.parser.geoinfo.GeographicInformationParser
 
GeoParser - Class in org.apache.tika.parser.geo.topic
 
GeoParser() - Constructor for class org.apache.tika.parser.geo.topic.GeoParser
 
GeoParserConfig - Class in org.apache.tika.parser.geo.topic
 
GeoParserConfig() - Constructor for class org.apache.tika.parser.geo.topic.GeoParserConfig
 
GeoTag - Class in org.apache.tika.parser.geo.topic
 
GeoTag() - Constructor for class org.apache.tika.parser.geo.topic.GeoTag
 
get() - Method in enum org.apache.tika.parser.strings.StringsEncoding
 
get7BitsInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
AKA a Synchsafe integer.
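For readers unfamiliar with the term, a synchsafe integer as used in ID3v2 headers keeps the top bit of every byte clear, so only 7 bits per byte carry data. The following is a minimal standalone sketch of decoding a 4-byte value; SynchsafeSketch and decode are hypothetical names, not the ID3v2Frame code.

```java
// Hypothetical sketch of synchsafe decoding; not Tika's implementation.
public class SynchsafeSketch {

    /** Decodes 4 bytes at offset: 7 payload bits per byte, most significant byte first. */
    static int decode(byte[] data, int offset) {
        int value = 0;
        for (int i = 0; i < 4; i++) {
            // Mask off the top bit, then append the remaining 7 bits
            value = (value << 7) | (data[offset + i] & 0x7F);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] size = {0x00, 0x00, 0x02, 0x01};
        System.out.println(decode(size, 0)); // prints 257 (2 * 128 + 1)
    }
}
```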
getAccessChecker() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getAdmin1Code() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
 
getAdmin2Code() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
 
getAeDescriptorPath() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns the path to XML descriptor for AnalysisEngine.
getAlbum() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getAlbum() - Method in interface org.apache.tika.parser.mp3.ID3Tags
 
getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
 
getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getAlbumArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
The Artist for the overall album / compilation of albums
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1 doesn't have album-wide artists, so returns null.
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getAlignedLenTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getAlignedTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getAllDetectableCharsets() - Static method in class org.apache.tika.parser.txt.CharsetDetector
Get the names of all charsets supported by CharsetDetector class.
getAllNameEntitiesfromInput(InputStream) - Method in class org.apache.tika.parser.geo.topic.NameEntityExtractor
 
getAllTagHandlers(InputStream, ContentHandler) - Static method in class org.apache.tika.parser.mp3.Mp3Parser
Scans the MP3 frames for ID3 tags, and creates ID3Tag Handlers for each supported set of tags.
getAnalysisEngine(String, String, String) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
Returns a new UIMA Analysis Engine (AE).
getAnnotationProperty(IdentifiedAnnotation, CTAKESAnnotationProperty) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
Returns the annotation value based on the given annotation type.
getAnnotationProps() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns an array of CTAKESAnnotationProperty values that will be included in cTAKES metadata.
getAnnotationPropsAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns a string containing a comma-separated list of CTAKESAnnotationProperty names that will be included in cTAKES metadata.
getApiUri(Metadata) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
 
getApiUri(Metadata) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
getApiUri(Metadata) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTVideoRecogniser
 
getApplyRotation() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
The Artist for the track
getArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
 
getArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getAverageCharTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getBestNameEntity() - Method in class org.apache.tika.parser.geo.topic.NameEntityExtractor
 
getBigInteger(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
getBitRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
Get the bit rate in bits per second.
getBitsPerPixel() - Method in class org.apache.tika.parser.image.ICNSType
 
getBlock_len() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns the block length
getBlockAddress() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Returns block addresses
getBlockCount() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Gets a block count
getBlockidx_intvl() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns block index interval
getBlockLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Gets a block length
getBlockLength() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getBlockNext() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
getBlockNumber() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
 
getBlockPrev() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
getBlockRemaining() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getBlockType() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getBody() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
 
getByte() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
getByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
 
getBytes(boolean) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(char) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(double) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(float) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
Gets a copy of the byte array containing the bytes written so far.
getBytes(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
Returns the specified 64-bit unsigned integer value as an array of bytes.
getBytes(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
Returns the specified 32-bit unsigned integer value as an array of bytes.
getCatchIntermediateIOExceptions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getCellManifestDataElementData(List<DataElement>, StorageManifestDataElementData, HashMap<CellID, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
Gets the cell manifest data element from a list of data elements.
getCenter() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
 
getChannels() - Method in class org.apache.tika.parser.mp3.AudioFrame
Get the number of channels (1=mono, 2=stereo)
getCharset() - Method in class org.apache.tika.parser.csv.CSVParams
 
getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
Deprecated.
getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData, ChmBlockInfo) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
 
getChmBlockSegment(byte[], ChmLzxcResetTable, int, int, int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
 
getChmDirList() - Method in class org.apache.tika.parser.chm.core.ChmExtractor
 
getChmDirList() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getChmItsfHeader() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getChmItspHeader() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getChmLzxcControlData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getChmLzxcResetTable() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getClassName() - Method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
 
getColorspace() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getCommand() - Method in class org.apache.tika.parser.gdal.GDALParser
 
getComment(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
Builds up the ID3 comment, by parsing and extracting the comment string parts from the given data.
getComments() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getComments() - Method in interface org.apache.tika.parser.mp3.ID3Tags
Retrieves the comments, if any.
getComments() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
 
getComments() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getComments() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getComments() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getCompilation() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getCompilation() - Method in interface org.apache.tika.parser.mp3.ID3Tags
 
getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1 doesn't have compilations, so returns null.
getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
ID3v22 doesn't have compilations, so returns null.
getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getComposer() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getComposer() - Method in interface org.apache.tika.parser.mp3.ID3Tags
 
getComposer() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1 doesn't have composers, so returns null.
getComposer() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getComposer() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getComposer() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getCompoundTypes() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
Gets the StreamObjectTypeHeaderStart
getCompressedLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Gets compressed length
getConcatenatePhoneticRuns() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getConfidence() - Method in class org.apache.tika.parser.csv.CSVResult
 
getConfidence() - Method in class org.apache.tika.parser.recognition.RecognisedObject
 
getConfidence() - Method in class org.apache.tika.parser.txt.CharsetMatch
Get an indication of the confidence in the charset detected.
getContent() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
 
getContent(int, int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
 
getContent(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
 
getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
 
getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
Get all the content which is represented by the root node object.
getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
Get all the content which is represented by the intermediate node object.
getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
Get all the content which is represented by the node object.
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
 
getContentHandler(ContentHandler, Metadata) - Method in class org.apache.tika.parser.mif.MIFParser
Get the content handler to use.
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentMetaParser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.DcXMLParser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
 
getContentLength() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
 
getContentParser() - Method in class org.apache.tika.parser.epub.EpubParser
 
getContentParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
getContextIDs() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
 
getControlDataIndex() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Returns the index of the control data within the list
getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
 
getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
getCountryCode() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
 
getCurrent(byte[], AtomicInteger, Class<T>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
Get current stream object.
getCurrent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
 
getCurrentFSSHTTPBSubRequestID() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
Gets the current sub-request ID and atomically increments the token by 1.
GetCurrentSerialNumber() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
Gets the current serial number and atomically increments the token by 1.
getCurrentToken() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
Gets the current token value and atomically increments the token by 1.
getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
 
getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
getData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getData() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
getData(Class<T>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
Used to get data.
getData() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getDataObjectDataElementData(List<DataElement>, ExGuid, AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
Gets the list of object group data elements from a list of data elements.
getDataObjectDataElementData(List<DataElement>, RevisionManifestDataElementData, AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
Gets a list of object group data elements from a list of data elements.
getDataOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Returns data offset
getDataOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns data offset
getDateFormatOverride() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getDecodedValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
 
getDecorationName() - Method in class org.apache.tika.parser.ctakes.CTAKESParser
 
getDefaultConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
getDelimiter() - Method in class org.apache.tika.parser.csv.CSVParams
 
getDelimiter() - Method in class org.apache.tika.parser.csv.CSVResult
 
getDensity() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getDepth() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getDescription() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
Gets the description, if present
getDetectableCharsets() - Method in class org.apache.tika.parser.txt.CharsetDetector
Deprecated.
This API is ICU internal only.
getDetectAngles() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getDir_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns directory uuid
getDirectoryListingEntryList() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Returns chm directory listing entry list
getDirLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns directory length
getDirOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns directory offset
getDisc() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getDisc() - Method in interface org.apache.tika.parser.mp3.ID3Tags
The number of the disc this belongs to, within the set
getDisc() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1 doesn't have disc numbers, so returns null.
getDisc() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getDisc() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getDisc() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
getDocument() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
Returns the opened document.
getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
 
getDropThreshold() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getDuration() - Method in class org.apache.tika.parser.mp3.AudioFrame
Returns the duration in milliseconds.
getEnableAutoSpace() - Method in class org.apache.tika.parser.pdf.PDFParser
 
getEnableAutoSpace() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getEncint() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
getEncoding() - Method in class org.apache.tika.parser.strings.StringsConfig
Returns the character encoding of the strings that are to be found.
getEndBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
Returns the end block index
getEndOffset() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
Returns the end offset index
getEntityTypes() - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
Gets set of entity types recognised by this recogniser
getEntityTypes() - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
Gets set of entity types recognised by this recogniser
getEntityTypes() - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
Gets set of entity types recognised by this recogniser
getEntityTypes() - Method in interface org.apache.tika.parser.ner.NERecogniser
Gets a set of entity types whose names are recognisable by this recogniser
getEntityTypes() - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
Gets set of entity types recognised by this recogniser
getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
 
getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
getEntityTypes() - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
 
getEntriesToCopy() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
 
getEntryType() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
Returns ChmCommons.EntryType (COMPRESSED or UNCOMPRESSED)
getExtendedGuidString() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
getExtendedHeader() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
 
getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
getExtension() - Method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
 
getExtractAcroFormContent() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getExtractActions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getExtractAllAlternativesFromMSG() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
getExtractAllAlternativesFromMSG() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getExtractAnnotationText() - Method in class org.apache.tika.parser.pdf.PDFParser
getExtractAnnotationText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getExtractBookmarksText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getExtractFontNames() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getExtractInlineImages() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getExtractMacros() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
getExtractMacros() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getExtractMarkedContent() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getExtractScripts() - Method in class org.apache.tika.parser.html.HtmlParser
 
getExtractUniqueInlineImagesOnly() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getFilePath() - Method in class org.apache.tika.parser.strings.FileConfig
Returns the "file" installation folder.
getFileProg() - Static method in class org.apache.tika.parser.strings.StringsParser
 
getFilter() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getFlags() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getFormattedNumber(Paragraph) - Method in class org.apache.tika.parser.microsoft.ListManager
Get the formatted number for a given paragraph

getFormattedNumber(XWPFParagraph) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
 
getFormattedNumber(BigInteger, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
 
getFramesRead() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getFreeSpace() - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
Returns pmgi free space
getFreeSpace() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
getGazetteerRestEndpoint() - Method in class org.apache.tika.parser.geo.topic.GeoParser
 
getGazetteerRestEndpoint() - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
 
getGenre() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getGenre() - Method in interface org.apache.tika.parser.mp3.ID3Tags
 
getGenre() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
 
getGenre() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getGenre() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getGenre() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
 
getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
 
getGuidString() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
 
getHadStarted() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getHeader_len() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns header length
getHeaderLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns itsf header length
getHeight() - Method in class org.apache.tika.parser.image.ICNSType
 
getId() - Method in class org.apache.tika.parser.recognition.RecognisedObject
 
getIfXFAExtractOnlyXFA() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getIlvl() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
 
getImageMagickPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getIncludeDeletedContent() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
getIncludeDeletedContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getIncludeDeletedText() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
getIncludeDeletedText() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
getIncludeHeadersAndFooters() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getIncludeMissingRows() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getIncludeMoveFromContent() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
getIncludeMoveFromContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getIncludeMoveFromText() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
getIncludeMoveFromText() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
getIncludeShapeBasedContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getIncludeSlideMasterContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getIncludeSlideNotes() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getIndex() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
 
getIndex_depth() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns an index depth
getIndex_head() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns an index head
getIndex_root() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns index root
getIndexCopyFromStart() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
 
getIndexCopyToStart() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
 
getIndexOfContent() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getIndexOfResetData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getIndexOfResetTable() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getIniBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
Returns an initial block index
getInlineBool(OneNotePropertyEnum) - Static method in enum org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
 
getInputStream() - Method in class org.apache.tika.parser.utils.DataURIScheme
 
getInstance() - Static method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
 
getInt(byte[]) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getInt2(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getInt3(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getIntelCurrentPossition() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getIntelFileSize() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getIntelState() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getIntVal() - Method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
 
getIntVal() - Method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
 
getIntVal() - Method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
 
getIntVal() - Method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
 
getIntVal() - Method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
 
getJCas(AnalysisEngine) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
Returns a new JCas appropriate for the given Analysis Engine.
getJustFileName(String) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
getLabel() - Method in class org.apache.tika.parser.recognition.RecognisedObject
 
getLabelLang() - Method in class org.apache.tika.parser.recognition.RecognisedObject
 
getLang_id() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns language id
getLangId() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns language ID
getLanguage(long) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
Returns textual representation of LangID
getLanguage() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
Gets the language, if present
getLanguage() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getLanguage() - Method in class org.apache.tika.parser.txt.CharsetMatch
Get the ISO code for the language of the detected charset.
getLastModified() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns last modified date of the chm file
getLatitude() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
 
getLayer() - Method in class org.apache.tika.parser.mp3.AudioFrame
Get the audio layer code.
getLeft() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
getLeft() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
 
getLength() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
 
getLength() - Method in class org.apache.tika.parser.mp3.AudioFrame
Returns the frame length in bytes.
getLength() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getLengthTreeLengtsTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getLengthTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getLinearizedDictionary(PDDocument) - Static method in class org.apache.tika.parser.pdf.PDFPreflightParser
Deprecated.
Copied verbatim from PDFBox. According to the PDF Reference, a linearized PDF contains a dictionary as its first object (the linearized dictionary), and only this one in the first section.
getLocations(List<String>) - Method in class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
Calls the lucene-geo-gazetteer API to search for location names in the gazetteer.
getLongitude() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
 
getLzxBlockLength() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getLzxBlockOffset() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getLzxBlocksCache() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
Return a list of the main parts of the document, used when searching for embedded resources.
getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.POIXMLTextExtractorDecorator
 
getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
In PowerPoint files, slides have things embedded in them, and slide drawings which have the images
getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
This returns all items that might contain embedded objects: main document, headers, footers, comments, etc.
getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
 
getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator
In PowerPoint files, slides have things embedded in them, and slide drawings which have the images
getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
In Excel files, sheets have things embedded in them, and sheet drawings which have the images
getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator
Include main body and anything else that can have an attachment/embedded object
getMainTreeElements() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getMainTreeLengtsTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getMainTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getMajorVersion() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getMarkLimit() - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
 
getMarkLimit() - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
 
getMarkLimit() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
 
getMarkLimit() - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
 
getMaxBytesForEmbeddedObject() - Static method in class org.apache.tika.parser.rtf.RTFParser
Deprecated.
getMaxFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getMaxMainMemoryBytes() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
The maximum amount of memory to use when loading a pdf into a PDDocument.
getMaxXMPMMHistory() - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
 
getMediaType() - Method in class org.apache.tika.parser.csv.CSVParams
 
getMediaType() - Method in class org.apache.tika.parser.csv.CSVResult
 
getMediaType() - Method in class org.apache.tika.parser.utils.DataURIScheme
 
getMessageClass(String) - Static method in class org.apache.tika.parser.microsoft.OutlookExtractor
 
getMetadata() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns an array of metadata whose values will be analyzed using cTAKES.
getMetadata() - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
Returns metadata that includes cTAKES annotations.
getMetadataAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns a string containing a comma-separated list of metadata whose values will be analyzed using cTAKES.
getMetadataExtractor() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
getMetadataExtractor() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
POIXMLTextExtractor.getMetadataTextExtractor() not yet supported for OOXML by POI.
getMetaParser() - Method in class org.apache.tika.parser.epub.EpubParser
 
getMetaParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
getMinFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getMinLength() - Method in class org.apache.tika.parser.strings.StringsConfig
Returns the minimum sequence length (characters) to print.
getMinorVersion() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getMinSize() - Method in class org.apache.tika.parser.strings.Latin1StringsParser
Returns the minimum size of a character sequence to be extracted.
getMSB() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
 
getN() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
getName() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
Returns an entry name
getName() - Method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
 
getName() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
 
getName() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
 
getName() - Method in class org.apache.tika.parser.txt.CharsetMatch
Get the name of the detected charset.
getNameLength() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
Returns an entry name length
getNamespace() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
 
getNerModelUrl() - Method in class org.apache.tika.parser.geo.topic.GeoParser
 
getNerModelUrl() - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
 
getNum_blocks() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns number of blocks
getNumberOfLevels() - Method in class org.apache.tika.parser.microsoft.AbstractListManager.ParagraphLevelCounter
 
getNumId() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
 
getOcrDPI() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Dots per inch used to render the page image for OCR
getOcrImageFormatName() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
String representation of the image format used to render the page image for OCR (examples: png, tiff, jpeg)
getOcrImageQuality() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Image quality used to render the page image for OCR.
getOcrImageScale() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Deprecated.
as of Tika 1.23, this is no longer used in rendering page images; use PDFParserConfig.setOcrDPI(int)
getOcrImageType() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Image type used to render the page image for OCR.
getOcrStrategy() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getOffset() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
 
getOids() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
 
getOsids() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
 
getOtherTesseractConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getOutputStream() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns an OutputStream object used to write the CAS.
getOutputType() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
 
getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
getPageSegMode() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getPageSeparator() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getPart() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
 
getPDDocument(InputStream, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
 
getPDDocument(Path, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
 
getPDDocument(InputStream, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFPreflightParser
Deprecated.
 
getPDDocument(Path, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFPreflightParser
Deprecated.
 
getPDFParserConfig() - Method in class org.apache.tika.parser.pdf.PDFParser
 
getPreserveInterwordSpacing() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getPrevContent() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
getR0() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getR1() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getR2() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getReader(InputStream, String) - Method in class org.apache.tika.parser.txt.CharsetDetector
Autodetect the charset of an inputStream, and return a Java Reader to access the converted input data.
getReader() - Method in class org.apache.tika.parser.txt.CharsetMatch
Create a java.io.Reader for reading the Unicode character data corresponding to the original byte data supplied to the Charset detect operation.
getResetInterval() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns reset interval
getResetTableIndex() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Return index of reset table
getResize() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getRevisionManifestDataElementData(List<DataElement>, CellManifestDataElementData, HashMap<ExGuid, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to get the revision manifest data element from a list of data elements.
getRight() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
 
getSampleRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
Get the sampling rate, in Hz
getSeparatorChar() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns the separator character used for annotation properties.
getSerializerType() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns the type of cTAKES (UIMA) serializer used to write the CAS.
getSetKCMS() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns a signature of itsf header
getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns a signature of the header
getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns a signature of control data block
getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
Returns pmgi signature if exists
getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
getSize() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns a size of control data
getSize() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
 
getSortByPosition() - Method in class org.apache.tika.parser.pdf.PDFParser
getSortByPosition() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getSpacingTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getStartBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
Returns the start block index
getStartIndex() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
getStartOffset() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
Returns the start offset index
getState() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
 
getStorageManifestDataElementData(List<DataElement>, ExGuid) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to get the storage manifest data element from a list of data elements.
getStream_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns stream uuid
getStreamObjectTypeMapping() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
Gets the StreamObjectTypeMapping
getString(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
Returns the String at the given offset and length.
getString(byte[], String) - Method in class org.apache.tika.parser.txt.CharsetDetector
Autodetect the charset of an inputStream, and return a String containing the converted input data.
getString() - Method in class org.apache.tika.parser.txt.CharsetMatch
Create a Java String from Unicode character data corresponding to the original byte data supplied to the Charset detect operation.
getString(int) - Method in class org.apache.tika.parser.txt.CharsetMatch
Create a Java String from Unicode character data corresponding to the original byte data supplied to the Charset detect operation.
getStringsPath() - Method in class org.apache.tika.parser.strings.StringsConfig
Returns the "strings" installation folder.
getStringsProg() - Static method in class org.apache.tika.parser.strings.StringsParser
 
getStripMarkup() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
 
getStyleClass() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
 
getStyleID() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
 
getStyleName(String) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFStylesShim
 
getSuffix(InputStream, int) - Static method in class org.apache.tika.parser.mp3.LyricsHandler
Reads and returns the last length bytes from the given stream.
getSupportedMimes() - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
 
getSupportedMimes() - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
The mimes supported by this recogniser
getSupportedMimes() - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
 
getSupportedMimes() - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.apple.AppleSingleFileParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.apple.PListParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.asm.ClassParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.audio.AudioParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.audio.MidiParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.chm.ChmParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.code.SourceCodeParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.crypto.Pkcs7Parser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.crypto.TSDParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.csv.TextAndCSVParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dbf.DBFParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dwg.DWGParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.envi.EnviHeaderParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.epub.EpubContentParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.epub.EpubParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.executable.ExecutableParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.feed.FeedParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.font.AdobeFontMetricParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.font.TrueTypeParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.gdal.GDALParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geo.topic.GeoParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geoinfo.GeographicInformationParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.grib.GribParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.hdf.HDFParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.html.HtmlParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.hwp.HwpV5Parser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.BPGParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.HeifParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.ICNSParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.ImageParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.PSDParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.TiffParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.WebPParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.indesign.IDMLParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iptc.IptcAnpaParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.isatab.ISArchiveParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork18PackageParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.IWorkPackageParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.journal.JournalParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.jpeg.JpegParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mail.RFC822Parser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mat.MatParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mbox.MboxParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mbox.OutlookPSTParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.EMFParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.JackcessParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.MSOwnerFileParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.OfficeParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.OldExcelParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.TNEFParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.WMFParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mif.MIFParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp3.Mp3Parser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp4.MP4Parser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp4.NoakesMP4Parser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ner.NamedEntityParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.netcdf.NetCDFParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.FlatOpenDocumentParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentContentParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.CompressorParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.PackageParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.RarParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pot.PooledTimeSeriesParser
Returns the set of media types supported by this parser when used with the given parse context.
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.prt.PRTParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.rtf.RTFParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.sas.SAS7BDATParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
Returns the types supported
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.strings.StringsParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.txt.TXTParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.video.FLVParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wordperfect.QuattroProParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xliff.XLIFF12Parser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xliff.XLZParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
 
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.XMLProfiler
 
getSuppressDuplicateOverlappingText() - Method in class org.apache.tika.parser.pdf.PDFParser
getSuppressDuplicateOverlappingText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getSwath() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
getSyncBits(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
getSystem_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns system uuid
getTableOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Gets a table offset
getTag() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
 
getTagsPresent() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getTagsPresent() - Method in interface org.apache.tika.parser.mp3.ID3Tags
Does the file contain this kind of tag?
getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
 
getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getTagString(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
Returns the (possibly null padded) String at the given offset and length.
getTessdataPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getTesseractPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getText() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
 
getText() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
getText() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
getText() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
Gets the text, if present
getTextDocument() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
Retrieves the built TextDocument
getTimeout() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getTimeout() - Method in class org.apache.tika.parser.strings.StringsConfig
Returns the maximum time (in seconds) to wait for the "strings" command to terminate.
getTitle() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getTitle() - Method in interface org.apache.tika.parser.mp3.ID3Tags
 
getTitle() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
 
getTitle() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getTitle() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getTitle() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getTotal() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
getTrackingMetadata() - Method in class org.apache.tika.parser.mbox.MboxParser
 
getTrackNumber() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getTrackNumber() - Method in interface org.apache.tika.parser.mp3.ID3Tags
The number of the track within the album / recording
getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
 
getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getType() - Method in class org.apache.tika.parser.image.ICNSType
 
getType() - Method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
 
getType() - Method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
 
getType() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
 
getType() - Method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
 
getType() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
 
getType(OneNotePropertyEnum) - Static method in enum org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
 
getTypeFromVal(int) - Static method in enum org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
 
getUMLSPass() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns the UMLS password.
getUMLSUser() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns the UMLS username.
getUncompressedLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Gets uncompressed length
getUnderline() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
 
getUnknown() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Gets unknown
getUnknown0008() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
getUnknown_000c() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns unknown_000c value
getUnknown_000c() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns 000c unknown bytes
getUnknown_0024() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns 0024 unknown bytes
getUnknown_002c() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns 002c unknown bytes
getUnknown_0044() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns 0044 unknown bytes
getUnknown_18() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns unknown 18 bytes
getUnknownLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns unknown length
getUnknownOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns unknown offset
getUseSAXDocxExtractor() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
getUseSAXDocxExtractor() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getUseSAXPptxExtractor() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getUtf16PropertiesToPrint() - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
Print file node data in UTF-16 format when they match these props.
getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns itsf header version
getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns version of itsp header
getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns a version of control data block
getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Returns the version
getVersion() - Method in class org.apache.tika.parser.mp3.AudioFrame
 
getVersionCode() - Method in class org.apache.tika.parser.mp3.AudioFrame
Get the version code.
getWidth() - Method in class org.apache.tika.parser.image.ICNSType
 
getWindow() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getWindowPosition() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getWindowSize() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns a window size
getWindowSize(int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
LZX supports window sizes of 2^15 (32 KB) through 2^21 (2 MB). Returns X, i.e., 2^X.
getWindowSize() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
getWindowsPerReset() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns windows per reset
getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
getXHTML(ContentHandler, Metadata, ParseContext) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
Parses the document into a sequence of XHTML SAX events sent to the given content handler.
getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
 
getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
getYear() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getYear() - Method in interface org.apache.tika.parser.mp3.ID3Tags
 
getYear() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
 
getYear() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getYear() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getYear() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
GlobalIdTableEntry3FNDX - Class in org.apache.tika.parser.microsoft.onenote
 
GlobalIdTableEntry3FNDX() - Constructor for class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
 
GlobalIdTableEntryFNDX - Class in org.apache.tika.parser.microsoft.onenote
 
GlobalIdTableEntryFNDX() - Constructor for class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
 
GRIB_MIME_TYPE - Static variable in class org.apache.tika.parser.grib.GribParser
 
GribParser - Class in org.apache.tika.parser.grib
 
GribParser() - Constructor for class org.apache.tika.parser.grib.GribParser
 
GrobidNERecogniser - Class in org.apache.tika.parser.ner.grobid
 
GrobidNERecogniser() - Constructor for class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
 
GrobidRESTParser - Class in org.apache.tika.parser.journal
 
GrobidRESTParser() - Constructor for class org.apache.tika.parser.journal.GrobidRESTParser
 
guid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
 
guid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
 
guid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
 
GUID - Class in org.apache.tika.parser.microsoft.onenote
 
GUID(int[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.GUID
 
guidCellSchemaId - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
guidFile - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
guidFileFormat - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
guidFileType - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
guidIndex - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
 
guidLegacyFileVersion - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
GuidUtil - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
 
GuidUtil() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.GuidUtil
 

H

handle(Metadata) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
Copies extracted tags to Tika metadata using registered handlers.
handle(Iterator<Directory>) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
Copies extracted tags to Tika metadata using registered handlers.
handleEmbeddedFile(PackagePart, ContentHandler, String) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
Handles an embedded file in the document
handleEntryMetadata(String, Date, Date, Long, XHTMLContentHandler) - Static method in class org.apache.tika.parser.pkg.PackageParser
 
handleXMP(InputStream, int, ImageMetadataExtractor) - Method in class org.apache.tika.parser.image.BPGParser
 
hashCode() - Method in class org.apache.tika.parser.csv.CSVResult
 
hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
Overrides hashCode().
hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Overrides hashCode().
hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
 
hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
 
hashCode() - Method in class org.apache.tika.parser.pdf.AccessChecker
 
hashCode() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
hashCode() - Method in class org.apache.tika.parser.txt.CharsetMatch
Generates a hashCode based on the confidence value.
hashCode() - Method in class org.apache.tika.parser.utils.DataURIScheme
 
hasID3v1() - Method in class org.apache.tika.parser.mp3.LyricsHandler
 
hasLyrics() - Method in class org.apache.tika.parser.mp3.LyricsHandler
 
hasMask() - Method in class org.apache.tika.parser.image.ICNSType
 
hasNext() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTagIterator
 
hasRetinaDisplay() - Method in class org.apache.tika.parser.image.ICNSType
 
hasSkip(DirectoryListingEntry) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
Checks skippable patterns
hasTesseract(TesseractOCRConfig) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
hasWarned() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
HDFParser - Class in org.apache.tika.parser.hdf
Since the NetCDFParser depends on the NetCDF-Java API, we are able to use it to parse HDF files as well.
HDFParser() - Constructor for class org.apache.tika.parser.hdf.HDFParser
 
header - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
 
header - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
 
header - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
 
headerCell - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
HeaderCell - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
HeaderCell() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.HeaderCell
 
headerCellCellManifest - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
headerCellRevisionManifest - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
headerFooter(String, boolean, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
 
HeaderFooterFromString(String) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
 
headers - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
 
headerType - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
Gets or sets the type of the stream object.
healthUri - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
HeifParser - Class in org.apache.tika.parser.image
 
HeifParser() - Constructor for class org.apache.tika.parser.image.HeifParser
 
hfHelper - Static variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
Allows access to headers/footers from raw xml strings
HSLFExtractor - Class in org.apache.tika.parser.microsoft
 
HSLFExtractor(ParseContext, Metadata) - Constructor for class org.apache.tika.parser.microsoft.HSLFExtractor
 
HtmlEncodingDetector - Class in org.apache.tika.parser.html
Character encoding detector for determining the character encoding of an HTML document based on the potential charset parameter found in a Content-Type http-equiv meta tag somewhere near the beginning.
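The detection idea described above can be sketched roughly as follows. This is an illustration only, not Tika's implementation: the class name MetaCharsetSketch, the 8192-byte prefix limit, and the regular expression are all assumptions made for the example, and the pattern only handles the case where http-equiv appears before the content attribute.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and limits are assumptions, not Tika's code.
final class MetaCharsetSketch {
    // Assumed limit for "somewhere near the beginning" of the document.
    static final int PREFIX_LENGTH = 8192;

    // Matches <meta http-equiv="Content-Type" ... charset=NAME ...>.
    private static final Pattern META_CHARSET = Pattern.compile(
            "(?is)<meta\\s+[^>]*http-equiv\\s*=\\s*[\"']?content-type[\"']?"
                    + "[^>]*charset\\s*=\\s*([\\w.:-]+)");

    private MetaCharsetSketch() {
    }

    // Returns the declared charset, or null if none is declared or supported.
    static Charset detect(byte[] html) {
        int n = Math.min(html.length, PREFIX_LENGTH);
        // The meta tag itself is ASCII, so decoding the prefix as ASCII is enough.
        String head = new String(html, 0, n, StandardCharsets.US_ASCII);
        Matcher m = META_CHARSET.matcher(head);
        if (!m.find()) {
            return null;
        }
        String name = m.group(1);
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : null;
        } catch (IllegalArgumentException e) {
            // Illegal charset name in the document: treat as "not detected".
            return null;
        }
    }
}
```

Calling MetaCharsetSketch.detect(bytes) on a page without such a meta tag returns null, which a caller could take as a signal to fall back to another detector.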
HtmlEncodingDetector() - Constructor for class org.apache.tika.parser.html.HtmlEncodingDetector
 
HtmlMapper - Interface in org.apache.tika.parser.html
HTML mapper used to make incoming HTML documents easier to handle by Tika clients.
HtmlParser - Class in org.apache.tika.parser.html
HTML parser.
HtmlParser() - Constructor for class org.apache.tika.parser.html.HtmlParser
 
HtmlParser(EncodingDetector) - Constructor for class org.apache.tika.parser.html.HtmlParser
 
HWP - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Hangul Word Processor (Korean)
HWP_MIME_TYPE - Static variable in class org.apache.tika.parser.hwp.HwpV5Parser
 
HwpStreamReader - Class in org.apache.tika.parser.hwp
 
HwpStreamReader(InputStream) - Constructor for class org.apache.tika.parser.hwp.HwpStreamReader
 
HwpTextExtractorV5 - Class in org.apache.tika.parser.hwp
 
HwpTextExtractorV5() - Constructor for class org.apache.tika.parser.hwp.HwpTextExtractorV5
 
HwpV5Parser - Class in org.apache.tika.parser.hwp
 
HwpV5Parser() - Constructor for class org.apache.tika.parser.hwp.HwpV5Parser
 
hyperlinkEnd() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
hyperlinkEnd() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
hyperlinkStart(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
hyperlinkStart(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 

I

ICNS_1024x1024_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_128x128_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_128x128_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_128x128_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_128x128_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_16x12_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_16x12_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_16x12_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_16x16_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_16x16_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_16x16_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_16x16_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_16x16_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_16x16_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_16x16_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_256x256_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_256x256_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_32x32_1BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_32x32_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_32x32_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_32x32_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_32x32_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_32x32_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_32x32_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_32x32_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_48x48_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_48x48_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_48x48_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_48x48_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_48x48_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_512x512_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_64x64_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
 
ICNS_MIME_TYPE - Static variable in class org.apache.tika.parser.image.ICNSParser
 
ICNSParser - Class in org.apache.tika.parser.image
A basic parser class for Apple ICNS icon files
ICNSParser() - Constructor for class org.apache.tika.parser.image.ICNSParser
 
ICNSType - Class in org.apache.tika.parser.image
Holds details on Apple ICNS icons
Icu4jEncodingDetector - Class in org.apache.tika.parser.txt
 
Icu4jEncodingDetector() - Constructor for class org.apache.tika.parser.txt.Icu4jEncodingDetector
 
id - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
 
id - Variable in class org.apache.tika.parser.recognition.RecognisedObject
Identifier for this object
id - Variable in class org.apache.tika.parser.rtf.ListDescriptor
 
ID3Comment(String) - Constructor for class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
Creates an ID3 v1 style comment tag
ID3Comment(String, String, String) - Constructor for class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
Creates an ID3 v2 style comment tag
ID3Tags - Interface in org.apache.tika.parser.mp3
Interface that defines the common interface for ID3 tag parsers, such as ID3v1 and ID3v2.3.
ID3Tags.ID3Comment - Class in org.apache.tika.parser.mp3
Represents a comment in ID3 (especially ID3 v2), which is made up of several parts
ID3TagsAndAudio() - Constructor for class org.apache.tika.parser.mp3.Mp3Parser.ID3TagsAndAudio
 
ID3v1Handler - Class in org.apache.tika.parser.mp3
This is used to parse ID3 Version 1 Tag information from an MP3 file, if available.
ID3v1Handler(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.ID3v1Handler
 
ID3v1Handler(byte[]) - Constructor for class org.apache.tika.parser.mp3.ID3v1Handler
Creates from the last 128 bytes of a stream.
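As a rough illustration of why the last 128 bytes are enough: an ID3v1 tag is a fixed 128-byte trailer that starts with the ASCII marker "TAG", followed by fixed-width title, artist, album and year fields. The sketch below is not the ID3v1Handler code; the class name Id3v1Sketch and its helpers are hypothetical, and decoding the fields as ISO-8859-1 is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of reading fixed-width ID3v1 fields; not Tika's code.
final class Id3v1Sketch {
    static final int TAG_SIZE = 128;

    private Id3v1Sketch() {
    }

    // True if the buffer is exactly one tag long and starts with "TAG".
    static boolean hasTag(byte[] tail) {
        return tail.length == TAG_SIZE
                && tail[0] == 'T' && tail[1] == 'A' && tail[2] == 'G';
    }

    // Reads a field of at most `length` bytes, stopping at the first NUL.
    static String field(byte[] tail, int offset, int length) {
        int end = offset;
        while (end < offset + length && tail[end] != 0) {
            end++;
        }
        // Assumption: fields are decoded as ISO-8859-1 and space padding is trimmed.
        return new String(tail, offset, end - offset, StandardCharsets.ISO_8859_1).trim();
    }

    static String title(byte[] tail) {
        return field(tail, 3, 30);
    }

    static String artist(byte[] tail) {
        return field(tail, 33, 30);
    }

    static String album(byte[] tail) {
        return field(tail, 63, 30);
    }

    static String year(byte[] tail) {
        return field(tail, 93, 4);
    }
}
```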
ID3v22Handler - Class in org.apache.tika.parser.mp3
This is used to parse ID3 Version 2.2 Tag information from an MP3 file, if available.
ID3v22Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v22Handler
 
ID3v23Handler - Class in org.apache.tika.parser.mp3
This is used to parse ID3 Version 2.3 Tag information from an MP3 file, if available.
ID3v23Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v23Handler
 
ID3v24Handler - Class in org.apache.tika.parser.mp3
This is used to parse ID3 Version 2.4 Tag information from an MP3 file, if available.
ID3v24Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v24Handler
 
ID3v2Frame - Class in org.apache.tika.parser.mp3
A frame of ID3v2 data, which is then passed to a handler to be turned into useful data.
ID3v2Frame.RawTag - Class in org.apache.tika.parser.mp3
 
ID3v2Frame.RawTagIterator - Class in org.apache.tika.parser.mp3
Iterates over ID3v2 raw tags.
ID3v2Frame.TextEncoding - Class in org.apache.tika.parser.mp3
 
IdentityHtmlMapper - Class in org.apache.tika.parser.html
Alternative HTML mapping rules that pass the input HTML as-is without any modifications.
IdentityHtmlMapper() - Constructor for class org.apache.tika.parser.html.IdentityHtmlMapper
 
IDMLParser - Class in org.apache.tika.parser.indesign
Adobe InDesign IDML Parser.
IDMLParser() - Constructor for class org.apache.tika.parser.indesign.IDMLParser
 
IFSSHTTPBSerializable - Interface in org.apache.tika.parser.microsoft.onenote.fsshttpb
FSSHTTPB Serialize interface.
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.dif.DIFContentHandler
 
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
 
ImageMetadataExtractor - Class in org.apache.tika.parser.image
Uses the Metadata Extractor library to read EXIF and IPTC image metadata and map to Tika fields.
ImageMetadataExtractor(Metadata) - Constructor for class org.apache.tika.parser.image.ImageMetadataExtractor
 
ImageMetadataExtractor(Metadata, ImageMetadataExtractor.DirectoryHandler...) - Constructor for class org.apache.tika.parser.image.ImageMetadataExtractor
 
ImageParser - Class in org.apache.tika.parser.image
 
ImageParser() - Constructor for class org.apache.tika.parser.image.ImageParser
 
inclusiveOr(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
inclusiveOr(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
inclusiveOr(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
increaseFramesRead() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
incrementLevel(int, AbstractListManager.LevelTuple[]) - Method in class org.apache.tika.parser.microsoft.AbstractListManager.ParagraphLevelCounter
Apply this to every numbered paragraph in order.
index - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
 
index - Variable in exception org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectParseErrorException
 
indexOf(byte[], byte[]) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
Searches for a pattern in a byte[]
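For readers unfamiliar with byte-pattern search, a minimal sketch follows. It uses a plain left-to-right scan, which is not necessarily the algorithm ChmCommons uses; the class name ByteSearchSketch and the conventions of returning -1 when nothing matches and 0 for an empty pattern are assumptions made for the example.

```java
// Hypothetical naive byte-pattern search; not the ChmCommons implementation.
final class ByteSearchSketch {
    private ByteSearchSketch() {
    }

    // Returns the index of the first occurrence of pattern in text, or -1.
    static int indexOf(byte[] text, byte[] pattern) {
        if (pattern.length == 0) {
            return 0; // assumed convention for an empty pattern
        }
        outer:
        for (int i = 0; i <= text.length - pattern.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (text[i + j] != pattern[j]) {
                    continue outer; // mismatch: try the next start position
                }
            }
            return i;
        }
        return -1;
    }
}
```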
indexOf(List<DirectoryListingEntry>, String) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
Searches for some pattern in the directory listing entry list
indexOfResetTableBlock(byte[], byte[]) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
Returns an index of the reset table
initialize(Map<String, Param>) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
 
initialize(GeoParserConfig) - Method in class org.apache.tika.parser.geo.topic.GeoParser
Initializes this parser
initialize(Map<String, Param>) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
No-op
initialize(Map<String, Param>) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
no-op
initialize(Map<String, Param>) - Method in class org.apache.tika.parser.pdf.PDFParser
This is a no-op.
initialize(Map<String, Param>) - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
This is the hook for configuring the recogniser
initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
 
initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
 
initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTVideoRecogniser
 
initialize(Map<String, Param>) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
 
inputFilterEnabled() - Method in class org.apache.tika.parser.txt.CharsetDetector
Test whether or not input filtering is enabled.
INSTANCE - Static variable in class org.apache.tika.parser.html.DefaultHtmlMapper
 
INSTANCE - Static variable in class org.apache.tika.parser.html.IdentityHtmlMapper
 
int64BitsToDouble(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
intelE8Decoding() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
 
IntermediateNodeObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
IntermediateNodeObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
Initializes a new instance of the IntermediateNodeObject class.
IntermediateNodeObject.RootNodeObjectBuilder - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
The class is used to build a root node object.
IntermediateNodeObjectBuilder() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject.IntermediateNodeObjectBuilder
 
intermediateNodeObjectList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
 
intValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
intValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
intValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
intValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
IProperty - Interface in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
The interface of the property in OneNote file.
IptcAnpaParser - Class in org.apache.tika.parser.iptc
Parser for IPTC ANPA News Wire Feeds
IptcAnpaParser() - Constructor for class org.apache.tika.parser.iptc.IptcAnpaParser
 
ISArchiveParser - Class in org.apache.tika.parser.isatab
 
ISArchiveParser() - Constructor for class org.apache.tika.parser.isatab.ISArchiveParser
Default constructor.
ISArchiveParser(String) - Constructor for class org.apache.tika.parser.isatab.ISArchiveParser
Constructor that accepts the pathname of ISArchive folder.
ISATabUtils - Class in org.apache.tika.parser.isatab
 
ISATabUtils() - Constructor for class org.apache.tika.parser.isatab.ISATabUtils
 
isAudioHeader(int, int, int, int) - Static method in class org.apache.tika.parser.mp3.AudioFrame
Does this appear to be a 4 byte audio frame header?
isAvailable() - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
 
isAvailable(GeoParserConfig) - Method in class org.apache.tika.parser.geo.topic.GeoParser
 
isAvailable() - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
 
isAvailable() - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
 
isAvailable() - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
 
isAvailable() - Method in interface org.apache.tika.parser.ner.NERecogniser
Checks if this Named Entity recogniser is available for service
isAvailable() - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
 
isAvailable() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
 
isAvailable() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
isAvailable() - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
 
isAvailable() - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
Is this service available?
isAvailable() - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
 
isAvailable() - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
isBase64() - Method in class org.apache.tika.parser.utils.DataURIScheme
 
isBinary - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
 
isBitSet(byte[], long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.Bit
Reads a bit value from a byte array at the specified bit position.
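A minimal sketch of reading one bit out of a byte array is shown below. It is not the Bit class itself: the class name BitSketch is hypothetical, and the bit numbering (bit 0 is the least significant bit of the first byte) is an assumption that may not match the actual implementation.

```java
// Hypothetical sketch; the bit-ordering convention here is an assumption.
final class BitSketch {
    private BitSketch() {
    }

    // Assumption: bit 0 is the least significant bit of array[0].
    static boolean isBitSet(byte[] array, long bit) {
        int byteIndex = (int) (bit / 8); // which byte holds the bit
        int bitInByte = (int) (bit % 8); // position of the bit inside that byte
        return (array[byteIndex] & (1 << bitInByte)) != 0;
    }
}
```

Under this convention, for the array {0x05, (byte) 0x80} bits 0, 2 and 15 are set and all others are clear.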
isBold() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
 
isCatchIntermediateIOExceptions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
isComplete() - Method in class org.apache.tika.parser.csv.CSVParams
 
isCrawlAllFileNodesFromRoot() - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
Do this to ignore revisions and just parse all file nodes from the root recursively.
isDiscardElement(String) - Method in class org.apache.tika.parser.html.DefaultHtmlMapper
 
isDiscardElement(String) - Method in interface org.apache.tika.parser.html.HtmlMapper
Checks whether all content within the given HTML element should be discarded instead of including it in the parse output.
isDiscardElement(String) - Method in class org.apache.tika.parser.html.HtmlParser
Deprecated.
Use the HtmlMapper mechanism to customize the HTML mapping. This method will be removed in Tika 1.0.
isDiscardElement(String) - Method in class org.apache.tika.parser.html.IdentityHtmlMapper
 
isEmpty(String) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
 
isEmpty() - Method in class org.apache.tika.parser.csv.CSVParams
 
isEnableImageProcessing() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
isFileData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
 
isFileHeader(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ZipHeader
Checks whether the input data is a local file header.
isGraphNode - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
 
isHeading() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
 
isIncludeMarkup() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
 
isItalics() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
 
isListenForAllRecords() - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
Returns true if this parser is configured to listen for all records instead of just the specified few.
isMatchingElement(String, String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
 
isMatchingParentElement(String, String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
 
isMetadataField(String) - Static method in class org.apache.tika.parser.image.MetadataFields
 
isMetadataField(Property) - Static method in class org.apache.tika.parser.image.MetadataFields
 
isMimetype() - Method in class org.apache.tika.parser.strings.FileConfig
Returns true if the mime option is enabled.
isMSB() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
 
isOnlyLatestRevision() - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
Only parse the latest revision.
isPrettyPrint() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns true if formatted output is enabled, false otherwise.
isPropertySet - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
 
isReadOnly - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
 
isSerialize() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns true if CAS serialization is enabled, false otherwise.
isStrikeThrough() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
 
isStyle - Variable in class org.apache.tika.parser.rtf.ListDescriptor
 
isText() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns true if content text analysis is enabled, false otherwise.
isTracking() - Method in class org.apache.tika.parser.mbox.MboxParser
 
isUnordered(int) - Method in class org.apache.tika.parser.rtf.ListDescriptor
 
ITSF - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
ITSP - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
IWORK13_COMMON_ENTRY - Static variable in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
All iWork 13 files contain this, so we can detect based on it
IWork13PackageParser - Class in org.apache.tika.parser.iwork.iwana
 
IWork13PackageParser() - Constructor for class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
 
IWork13PackageParser.IWork13DocumentType - Enum in org.apache.tika.parser.iwork.iwana
 
IWork18PackageParser - Class in org.apache.tika.parser.iwork.iwana
For now, this parser isn't even registered.
IWork18PackageParser() - Constructor for class org.apache.tika.parser.iwork.iwana.IWork18PackageParser
 
IWork18PackageParser.IWork18DocumentType - Enum in org.apache.tika.parser.iwork.iwana
 
IWORK_COMMON_ENTRY - Static variable in class org.apache.tika.parser.iwork.IWorkPackageParser
All iWork files contain one of these, so we can detect based on it
IWORK_CONTENT_ENTRIES - Static variable in class org.apache.tika.parser.iwork.IWorkPackageParser
Which files within an iWork file contain the actual content?
IWorkPackageParser - Class in org.apache.tika.parser.iwork
A parser for the iWork container files.
IWorkPackageParser() - Constructor for class org.apache.tika.parser.iwork.IWorkPackageParser
 
IWorkPackageParser.IWORKDocumentType - Enum in org.apache.tika.parser.iwork
 

J

JackcessParser - Class in org.apache.tika.parser.microsoft
Parser that handles Microsoft Access files via Jackcess
JackcessParser() - Constructor for class org.apache.tika.parser.microsoft.JackcessParser
 
JCID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
This class is used to represent a JCID
JCID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
 
jcid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.JCIDObject
 
jcid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
 
JCIDObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
This class is used to represent the JCID object.
JCIDObject(ObjectGroupObjectDeclare, ObjectGroupObjectData) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.JCIDObject
Construct the JCIDObject instance.
JempboxExtractor - Class in org.apache.tika.parser.image.xmp
 
JempboxExtractor(Metadata) - Constructor for class org.apache.tika.parser.image.xmp.JempboxExtractor
 
joinCreators(List<String>) - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
 
JournalParser - Class in org.apache.tika.parser.journal
 
JournalParser() - Constructor for class org.apache.tika.parser.journal.JournalParser
 
JpegParser - Class in org.apache.tika.parser.jpeg
 
JpegParser() - Constructor for class org.apache.tika.parser.jpeg.JpegParser
 

L

label - Variable in class org.apache.tika.parser.recognition.RecognisedObject
Label of this object.
LABEL_LANG - Static variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
labelLang - Variable in class org.apache.tika.parser.recognition.RecognisedObject
Language of the label, for example: english
largeLength - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
Gets or sets an optional compact uint64 that specifies the length in bytes for additional data (if any).
Latin1StringsParser - Class in org.apache.tika.parser.strings
Parser to extract printable Latin1 strings from arbitrary files with pure java without running any external process.
Latin1StringsParser() - Constructor for class org.apache.tika.parser.strings.Latin1StringsParser
 
LAYER_1 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
Constant for audio layer 1.
LAYER_2 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
Constant for audio layer 2.
LAYER_3 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
Constant for audio layer 3.
LeafNodeObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
LeafNodeObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
Initializes a new instance of the LeafNodeObject class.
LeafNodeObject.IntermediateNodeObjectBuilder - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
The class is used to build an intermediate node object.
leftShift(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
length - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
 
length - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
 
lengthTreeLengtsTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
lengthTreeTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
LevelTuple(String) - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager.LevelTuple
 
LevelTuple(int, int, String, String, boolean) - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager.LevelTuple
 
LinkedCell - Class in org.apache.tika.parser.microsoft
Linked cell.
LinkedCell(Cell, String) - Constructor for class org.apache.tika.parser.microsoft.LinkedCell
 
ListDescriptor - Class in org.apache.tika.parser.rtf
Contains the information for a single list in the list or list override tables.
ListDescriptor() - Constructor for class org.apache.tika.parser.rtf.ListDescriptor
 
listLevelMap - Variable in class org.apache.tika.parser.microsoft.AbstractListManager
 
ListManager - Class in org.apache.tika.parser.microsoft
Computes the number text which goes at the beginning of each list paragraph
ListManager(HWPFDocument) - Constructor for class org.apache.tika.parser.microsoft.ListManager
Ordinary constructor for a new list reader
LITTLE - Static variable in class org.apache.tika.parser.executable.MachineMetadata.Endian
 
LittleEndianBitConverter - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
Implements a converter which converts to/from little-endian byte arrays
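Little-endian conversion can be illustrated with the standard java.nio.ByteBuffer API. The sketch below is not the LittleEndianBitConverter code; the class name LittleEndianSketch and the method names toInt32 and getBytes are hypothetical and chosen only for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch built on java.nio; not the Tika converter itself.
final class LittleEndianSketch {
    private LittleEndianSketch() {
    }

    // Reads 4 bytes starting at offset, least significant byte first.
    static int toInt32(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
    }

    // Writes value as 4 bytes, least significant byte first.
    static byte[] getBytes(int value) {
        return ByteBuffer.allocate(4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(value)
                .array();
    }
}
```

With this sketch, the bytes 0x78 0x56 0x34 0x12 decode to the int 0x12345678, and encoding that value yields the same four bytes again.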
loadLinkedRelationships(PackagePart, boolean, Metadata) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
This is used by the SAX docx and pptx decorators to load hyperlinks and other linked objects
LOCAL_FILE_HEADER - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ZipHeader
The file header in zip.
Location - Class in org.apache.tika.parser.geo.topic.gazetteer
 
Location() - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.Location
 
LOCATION - Static variable in interface org.apache.tika.parser.ner.NERecogniser
 
LOCATION_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
LOG - Static variable in class org.apache.tika.parser.hwp.HwpTextExtractorV5
 
LOG - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
 
longValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
longValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
longValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
longValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
LyricsHandler - Class in org.apache.tika.parser.mp3
This is used to parse Lyrics3 tag information from an MP3 file, if available.
LyricsHandler(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.LyricsHandler
 
LyricsHandler(byte[]) - Constructor for class org.apache.tika.parser.mp3.LyricsHandler
Looks for the Lyrics data, which will be just before the ID3v1 data (if present), and processes it.
LZX_ALIGNED_MAXSYMBOLS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_ALIGNED_NUM_ELEMENTS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_ALIGNED_TABLEBITS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_BLOCKTYPE_ALIGNED - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_BLOCKTYPE_INVALID - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_BLOCKTYPE_UNCOMPRESSED - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_BLOCKTYPE_VERBATIM - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_LENGTH_MAXSYMBOLS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_LENGTH_TABLEBITS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_LENTABLE_SAFETY - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_MAIN_MAXSYMBOLS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_MAINTREE_MAXSYMBOLS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_MAINTREE_TABLEBITS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_MAX_MATCH - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_MIN_MATCH - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_NUM_CHARS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_NUM_PRIMARY_LENGTHS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_NUM_SECONDARY_LENGTHS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_PRETREE_MAXSYMBOLS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_PRETREE_NUM_ELEMENTS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_PRETREE_NUM_ELEMENTS_BITS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZX_PRETREE_TABLEBITS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
LZXC - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 

M

MACHINE_ALPHA - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_ARM - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_EFI - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_IA_64 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_M32R - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_M68K - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_M88K - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_MIPS - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_PPC - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_S370 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_S390 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_SH3 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_SH4 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_SH5 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_SPARC - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_TYPE - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_UNKNOWN - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_VAX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_x86_32 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MACHINE_x86_64 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
MachineMetadata - Interface in org.apache.tika.parser.executable
Metadata for describing machines, such as their architecture, type and endian-ness
MachineMetadata.Endian - Class in org.apache.tika.parser.executable
 
MAIL_MAX_SIZE - Static variable in class org.apache.tika.parser.mbox.MboxParser
 
MailUtil - Class in org.apache.tika.parser.mail
 
MailUtil() - Constructor for class org.apache.tika.parser.mail.MailUtil
 
main(String[]) - Static method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
 
main(String[]) - Static method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
 
main(String[]) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
 
main(String[]) - Static method in class org.apache.tika.parser.chm.lzx.ChmSection
 
main(String[]) - Static method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
main(String[]) - Static method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
main(String[]) - Static method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
 
mainTreeLengtsTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
mainTreeTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
manifestMappingExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
 
manifestMappingSerialNumber - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
 
mapSafeAttribute(String, String) - Method in class org.apache.tika.parser.html.DefaultHtmlMapper
Normalizes an attribute name.
mapSafeAttribute(String, String) - Method in interface org.apache.tika.parser.html.HtmlMapper
Maps "safe" HTML attribute names to semantic XHTML equivalents.
mapSafeAttribute(String, String) - Method in class org.apache.tika.parser.html.HtmlParser
Deprecated.
Use the HtmlMapper mechanism to customize the HTML mapping. This method will be removed in Tika 1.0.
mapSafeAttribute(String, String) - Method in class org.apache.tika.parser.html.IdentityHtmlMapper
 
mapSafeElement(String) - Method in class org.apache.tika.parser.html.DefaultHtmlMapper
 
mapSafeElement(String) - Method in interface org.apache.tika.parser.html.HtmlMapper
Maps "safe" HTML element names to semantic XHTML equivalents.
mapSafeElement(String) - Method in class org.apache.tika.parser.html.HtmlParser
Deprecated.
Use the HtmlMapper mechanism to customize the HTML mapping. This method will be removed in Tika 1.0.
mapSafeElement(String) - Method in class org.apache.tika.parser.html.IdentityHtmlMapper
 
MATLAB_MIME_TYPE - Static variable in class org.apache.tika.parser.mat.MatParser
 
MatParser - Class in org.apache.tika.parser.mat
 
MatParser() - Constructor for class org.apache.tika.parser.mat.MatParser
 
MAX - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
A constant holding the maximum value an unsigned byte can have as UByte, 2^8-1.
MAX - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
A constant holding the maximum value an unsigned int can have as UInteger, 2^32-1.
MAX - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
A constant holding the maximum value + 1 a signed long can have as ULong, 2^63.
max(UByte, UByte) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
Returns the greater of two UByte values.
max(UInteger, UInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
Returns the greater of two UInteger values.
max(ULong, ULong) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
Returns the greater of two ULong values.
max(UShort, UShort) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
Returns the greater of two UShort values.
MAX - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
A constant holding the maximum value an unsigned short can have as UShort, 2^16-1.
MAX_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
A constant holding the maximum value an unsigned byte can have, 2^8-1.
MAX_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
A constant holding the maximum value an unsigned int can have, 2^32-1.
MAX_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
A constant holding the maximum value an unsigned long can have, 2^64-1.
MAX_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
A constant holding the maximum value an unsigned short can have, 2^16-1.
MAX_VALUE_LONG - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
A constant holding the maximum value + 1 a signed long can have, 2^63.
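The bounds quoted for the MAX and MAX_VALUE constants above are all of the form 2^n-1 (or 2^63 for the signed-long boundary). The following sketch computes them with the JDK's BigInteger as a cross-check. It does not use the UByte, UShort, UInteger or ULong classes themselves; UnsignedRanges and maxUnsigned are hypothetical names introduced only for illustration.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of Tika: computes unsigned range bounds with the JDK only.
final class UnsignedRanges {

    // Largest value representable in an unsigned integer of the given bit width: 2^bits - 1.
    static BigInteger maxUnsigned(int bits) {
        return BigInteger.ONE.shiftLeft(bits).subtract(BigInteger.ONE);
    }

    public static void main(String[] args) {
        System.out.println(maxUnsigned(8));   // 255
        System.out.println(maxUnsigned(16));  // 65535
        System.out.println(maxUnsigned(32));  // 4294967295
        System.out.println(maxUnsigned(64));  // 18446744073709551615

        // 2^63 is exactly one more than Long.MAX_VALUE.
        System.out.println(BigInteger.ONE.shiftLeft(63)); // 9223372036854775808

        // The JDK can also reinterpret signed primitives as unsigned values.
        System.out.println(Byte.toUnsignedInt((byte) -1)); // 255
        System.out.println(Long.toUnsignedString(-1L));    // 18446744073709551615
    }
}
```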
MAXSUBREQUESTID - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
Specify the max sub request ID.
MAXTOKENVALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
Specify the max token value.
MBOX_MIME_TYPE - Static variable in class org.apache.tika.parser.mbox.MboxParser
 
MBOX_RECORD_DIVIDER - Static variable in class org.apache.tika.parser.mbox.MboxParser
 
MboxParser - Class in org.apache.tika.parser.mbox
Mbox (mailbox) parser.
MboxParser() - Constructor for class org.apache.tika.parser.mbox.MboxParser
 
MD_KEY_IMG_CAP - Static variable in class org.apache.tika.parser.recognition.ObjectRecognitionParser
 
MD_KEY_OBJ_REC - Static variable in class org.apache.tika.parser.recognition.ObjectRecognitionParser
 
MD_KEY_PREFIX - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
 
MD_REC_IMPL_KEY - Static variable in class org.apache.tika.parser.recognition.ObjectRecognitionParser
 
MDB_PROPERTY_PREFIX - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
 
MDB_PW - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
 
MEDIA_TYPES - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
 
memcmp(int[], int[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.GUID
 
metadata - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
MetadataExtractor - Class in org.apache.tika.parser.microsoft.ooxml
OOXML metadata extractor.
MetadataExtractor(POIXMLTextExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.MetadataExtractor
 
MetadataFields - Class in org.apache.tika.parser.image
Knows about all declared Metadata fields.
MetadataFields() - Constructor for class org.apache.tika.parser.image.MetadataFields
 
MetadataHandler - Class in org.apache.tika.parser.xml
Deprecated.
MetadataHandler(Metadata, String) - Constructor for class org.apache.tika.parser.xml.MetadataHandler
Deprecated.
 
MetadataHandler(Metadata, Property) - Constructor for class org.apache.tika.parser.xml.MetadataHandler
Deprecated.
 
MidiParser - Class in org.apache.tika.parser.audio
 
MidiParser() - Constructor for class org.apache.tika.parser.audio.MidiParser
 
MIFContentHandler - Class in org.apache.tika.parser.mif
Content handler for MIF Content and Metadata.
MIFExtractor - Class in org.apache.tika.parser.mif
Helper Class to Parse and Extract Adobe MIF Files.
MIFExtractor() - Constructor for class org.apache.tika.parser.mif.MIFExtractor
 
MIFParser - Class in org.apache.tika.parser.mif
 
MIFParser() - Constructor for class org.apache.tika.parser.mif.MIFParser
 
MIFParser(EncodingDetector) - Constructor for class org.apache.tika.parser.mif.MIFParser
 
MIN - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
A constant holding the minimum value an unsigned byte can have as UByte, 0.
MIN - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
A constant holding the minimum value an unsigned int can have as UInteger, 0.
MIN - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
A constant holding the minimum value an unsigned long can have as ULong, 0.
min(UByte, UByte) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
Returns the smaller of two UByte values.
min(UInteger, UInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
Returns the smaller of two UInteger values.
min(ULong, ULong) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
Returns the smaller of two ULong values.
min(UShort, UShort) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
Returns the smaller of two UShort values.
MIN - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
A constant holding the minimum value an unsigned short can have as UShort, 0.
MIN_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
A constant holding the minimum value an unsigned byte can have, 0.
MIN_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
A constant holding the minimum value an unsigned int can have, 0.
MIN_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
A constant holding the minimum value an unsigned long can have, 0.
MIN_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
A constant holding the minimum value an unsigned short can have, 0.
minConfidence - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
MISCELLANEOUS - Static variable in interface org.apache.tika.parser.ner.NERecogniser
 
MITIENERecogniser - Class in org.apache.tika.parser.ner.mitie
This class offers an implementation of NERecogniser based on trained models using state-of-the-art information extraction tools.
MITIENERecogniser() - Constructor for class org.apache.tika.parser.ner.mitie.MITIENERecogniser
 
MITIENERecogniser(String) - Constructor for class org.apache.tika.parser.ner.mitie.MITIENERecogniser
Creates a NERecogniser by loading model from given path
MODEL_PROP_NAME - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
 
MODEL_PROP_NAME - Static variable in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
 
MODELS_DIR - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
MONEY - Static variable in interface org.apache.tika.parser.ner.NERecogniser
 
MONEY_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
moveNext() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
Advances the enumerator to the next bit of the byte array.
MP3Frame - Interface in org.apache.tika.parser.mp3
A frame in an MP3 file, such as ID3v2 Tags or some audio.
Mp3Parser - Class in org.apache.tika.parser.mp3
The Mp3Parser is used to parse ID3 Version 1 Tag information from an MP3 file, if available.
Mp3Parser() - Constructor for class org.apache.tika.parser.mp3.Mp3Parser
 
Mp3Parser.ID3TagsAndAudio - Class in org.apache.tika.parser.mp3
 
MP4Parser - Class in org.apache.tika.parser.mp4
Parser for the MP4 media container format, as well as the older QuickTime format that MP4 is based on.
MP4Parser() - Constructor for class org.apache.tika.parser.mp4.MP4Parser
 
MPEG_V1 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
Constant for the MPEG version 1.
MPEG_V2 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
Constant for the MPEG version 2.
MPEG_V2_5 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
Constant for the MPEG version 2.5.
MPP - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Microsoft Project
MS_EQUATION - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Equation embedded in Office docs
MS_GRAPH_CHART - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Graph/Charts embedded in PowerPoint and Excel
MS_OUTLOOK_PST_MIMETYPE - Static variable in class org.apache.tika.parser.mbox.OutlookPSTParser
 
MSG - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Microsoft Outlook
MSOneStorePackage - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb
 
MSOneStorePackage() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
MSOneStoreParser - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb
 
MSOneStoreParser() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStoreParser
 
MSOwnerFileParser - Class in org.apache.tika.parser.microsoft
Parser for temporary MSOffice files.
MSOwnerFileParser() - Constructor for class org.apache.tika.parser.microsoft.MSOwnerFileParser
 

N

n - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
 
name - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
 
NamedEntityParser - Class in org.apache.tika.parser.ner
This implementation of Parser extracts entity names from text content and adds them to the metadata.
NamedEntityParser() - Constructor for class org.apache.tika.parser.ner.NamedEntityParser
 
NameEntityExtractor - Class in org.apache.tika.parser.geo.topic
 
NameEntityExtractor(NameFinderME) - Constructor for class org.apache.tika.parser.geo.topic.NameEntityExtractor
 
NER_3CLASS_MODEL - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
 
NER_4CLASS_MODEL - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
 
NER_7CLASS_MODEL - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
 
NER_DATE_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
NER_LOCATION_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
NER_MONEY_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
NER_ORGANIZATION_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
NER_PERCENT_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
NER_PERSON_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
NER_REGEX_FILE - Static variable in class org.apache.tika.parser.ner.regex.RegexNERecogniser
 
NER_TIME_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
NERecogniser - Interface in org.apache.tika.parser.ner
Defines a contract for a named entity recogniser.
NetCDFParser - Class in org.apache.tika.parser.netcdf
A Parser for NetCDF files using the UCAR, MIT-licensed NetCDF for Java API.
NetCDFParser() - Constructor for class org.apache.tika.parser.netcdf.NetCDFParser
 
newDecoder() - Method in class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
 
newDecoder() - Method in class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
 
newEncoder() - Method in class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
 
newEncoder() - Method in class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
 
next() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTagIterator
 
nil() - Static method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
nil() - Static method in class org.apache.tika.parser.microsoft.onenote.GUID
 
NLTKNERecogniser - Class in org.apache.tika.parser.ner.nltk
This class offers an implementation of NERecogniser based on the ne_chunk() module of NLTK.
NLTKNERecogniser() - Constructor for class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
 
NoakesMP4Parser - Class in org.apache.tika.parser.mp4
Parser for the MP4 media container format, as well as the older QuickTime format that MP4 is based on.
NoakesMP4Parser() - Constructor for class org.apache.tika.parser.mp4.NoakesMP4Parser
 
NoData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
This class is used to represent a property that contains no data.
NoData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.NoData
 
NodeObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
NodeObject(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
Initializes a new instance of the NodeObject class.
NSNormalizerContentHandler - Class in org.apache.tika.parser.odf
Content handler decorator that maps old OpenOffice 1.0 namespaces to the OpenDocument ones and returns a fake DTD when the parser requests the OpenOffice DTD.
NSNormalizerContentHandler(ContentHandler) - Constructor for class org.apache.tika.parser.odf.NSNormalizerContentHandler
 
number - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.ArrayNumber
 
NUMBER_TYPE_BULLET - Static variable in class org.apache.tika.parser.rtf.ListDescriptor
 
NumberCell - Class in org.apache.tika.parser.microsoft
Number cell.
NumberCell(double, NumberFormat) - Constructor for class org.apache.tika.parser.microsoft.NumberCell
 
numberType - Variable in class org.apache.tika.parser.rtf.ListDescriptor
 

O

ObjectChangeFrequency - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadata
Gets or sets a compact unsigned 64-bit integer that specifies the expected change frequency of the object.
objectData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataNodeObjectData
 
objectData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.HeaderCell
 
objectDataBLOBExGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
 
objectDataSize - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
Gets or sets a compact unsigned 64-bit integer that specifies the size in bytes of the opaque binary data for the declared object.
objectDataSize - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
Gets or sets a compact unsigned 64-bit integer that specifies the size in bytes of the binary data, opaque to this protocol, for the declared object.
objectDeclaration - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.HeaderCell
 
objectDeclaration - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.JCIDObject
 
objectDeclaration - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySetObject
 
objectDeclarationList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
 
objectExGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
 
objectExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
 
objectExGUIDArray - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
 
objectExtendedGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
 
objectExtendedGUIDArray - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
 
ObjectGroupData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
The ObjectGroupData class.
ObjectGroupData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
Initializes a new instance of the ObjectGroupData class.
objectGroupData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
 
ObjectGroupDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
ObjectGroupDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
Initializes a new instance of the ObjectGroupDataElementData class.
ObjectGroupDataElementData.Builder - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
The internal class for building a list of DataElement from a node object.
objectGroupDeclarations - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
 
ObjectGroupDeclarations - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Object Group Declarations
ObjectGroupDeclarations() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
Initializes a new instance of the ObjectGroupDeclarations class.
objectGroupExtendedGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
 
objectGroupID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
 
objectGroupID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
 
ObjectGroupMetadata - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Specifies an object group metadata
ObjectGroupMetadata() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadata
Initializes a new instance of the ObjectGroupMetadata class.
ObjectGroupMetadataDeclarations - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Object Metadata Declaration
ObjectGroupMetadataDeclarations() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadataDeclarations
Initializes a new instance of the ObjectGroupMetadataDeclarations class.
objectGroupMetadataList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadataDeclarations
 
ObjectGroupObjectBLOBDataDeclaration - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
object data BLOB declaration
ObjectGroupObjectBLOBDataDeclaration() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
Initializes a new instance of the ObjectGroupObjectBLOBDataDeclaration class.
objectGroupObjectBLOBDataDeclarationList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
 
ObjectGroupObjectData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
ObjectGroupObjectData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
Initializes a new instance of the ObjectGroupObjectData class.
ObjectGroupObjectDataBLOBReference - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
object data BLOB reference
ObjectGroupObjectDataBLOBReference() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
Initializes a new instance of the ObjectGroupObjectDataBLOBReference class.
objectGroupObjectDataBLOBReferenceList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
 
objectGroupObjectDataList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
 
ObjectGroupObjectDeclare - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
ObjectGroupObjectDeclare() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
Initializes a new instance of the ObjectGroupObjectDeclare class.
objectID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
 
objectMetadataDeclaration - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
 
objectPartitionID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
 
objectPartitionID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
 
ObjectRecogniser - Interface in org.apache.tika.parser.recognition
This is a contract for object recognisers used by ObjectRecognitionParser
ObjectRecognitionParser - Class in org.apache.tika.parser.recognition
This parser recognises objects from Images.
ObjectRecognitionParser() - Constructor for class org.apache.tika.parser.recognition.ObjectRecognitionParser
 
objectReferencesCount - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
 
objectReferencesCount - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
 
objects - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
 
objectSpaceObjectPropSet - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySetObject
 
ObjectSpaceObjectPropSet - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space
This class is used to represent an ObjectSpaceObjectPropSet.
ObjectSpaceObjectPropSet() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
 
ObjectSpaceObjectPropSet - Class in org.apache.tika.parser.microsoft.onenote
 
ObjectSpaceObjectPropSet() - Constructor for class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
 
ObjectSpaceObjectStreamHeader - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space
 
ObjectSpaceObjectStreamHeader() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
 
ObjectSpaceObjectStreamOfContextIDs - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space
This class is used to represent an ObjectSpaceObjectStreamOfContextIDs.
ObjectSpaceObjectStreamOfContextIDs() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
 
ObjectSpaceObjectStreamOfOIDs - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space
This class is used to represent an ObjectSpaceObjectStreamOfOIDs.
ObjectSpaceObjectStreamOfOIDs() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
 
ObjectSpaceObjectStreamOfOSIDs - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space
This class is used to represent an ObjectSpaceObjectStreamOfOSIDs.
ObjectSpaceObjectStreamOfOSIDs() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
 
of(Long) - Static method in enum org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
 
OfficeParser - Class in org.apache.tika.parser.microsoft
Defines a Microsoft document content extractor.
OfficeParser() - Constructor for class org.apache.tika.parser.microsoft.OfficeParser
 
OfficeParser.POIFSDocumentType - Enum in org.apache.tika.parser.microsoft
 
OfficeParserConfig - Class in org.apache.tika.parser.microsoft
 
OfficeParserConfig() - Constructor for class org.apache.tika.parser.microsoft.OfficeParserConfig
 
oids - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
 
OldExcelParser - Class in org.apache.tika.parser.microsoft
A POI-powered Tika Parser for very old versions of Excel, from pre-OLE2 days, such as Excel 4.
OldExcelParser() - Constructor for class org.apache.tika.parser.microsoft.OldExcelParser
 
OLE - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
The OLE base file format
OLE10_NATIVE - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
An OLE10 Native embedded document within another OLE2 document
ONE_NOTE_PREFIX - Static variable in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
 
OneByteOfData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
This class is used to represent a property that contains 1 byte of data in the PropertySet.rgData stream field.
OneByteOfData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
 
OneNoteParser - Class in org.apache.tika.parser.microsoft.onenote
OneNote Tika parser capable of parsing Microsoft OneNote files.
OneNoteParser() - Constructor for class org.apache.tika.parser.microsoft.onenote.OneNoteParser
 
OneNotePropertyEnum - Enum in org.apache.tika.parser.microsoft.onenote
 
OneNoteTreeWalkerOptions - Class in org.apache.tika.parser.microsoft.onenote
Options used when walking the OneNote tree.
OneNoteTreeWalkerOptions() - Constructor for class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
 
OOXML_PROTECTED - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
The protected OOXML base file format
OOXMLExtractor - Interface in org.apache.tika.parser.microsoft.ooxml
Interface implemented by all Tika OOXML extractors.
OOXMLExtractorFactory - Class in org.apache.tika.parser.microsoft.ooxml
Figures out the correct OOXMLExtractor for the supplied document and returns it.
OOXMLExtractorFactory() - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory
 
OOXMLParser - Class in org.apache.tika.parser.microsoft.ooxml
Office Open XML (OOXML) parser.
OOXMLParser() - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
 
OOXMLTikaBodyPartHandler - Class in org.apache.tika.parser.microsoft.ooxml
 
OOXMLTikaBodyPartHandler(XHTMLContentHandler) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
OOXMLTikaBodyPartHandler(XHTMLContentHandler, XWPFStylesShim, XWPFListManager, OfficeParserConfig) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
OOXMLWordAndPowerPointTextHandler - Class in org.apache.tika.parser.microsoft.ooxml
This class is intended to handle anything that might contain IBodyElements: main document, headers, footers, notes, slides, etc.
OOXMLWordAndPowerPointTextHandler(OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler, Map<String, String>) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
OOXMLWordAndPowerPointTextHandler(OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler, Map<String, String>, boolean, boolean) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
OOXMLWordAndPowerPointTextHandler.EditType - Enum in org.apache.tika.parser.microsoft.ooxml
 
OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler - Interface in org.apache.tika.parser.microsoft.ooxml
 
OpenDocumentContentParser - Class in org.apache.tika.parser.odf
Parser for ODF content.xml files.
OpenDocumentContentParser() - Constructor for class org.apache.tika.parser.odf.OpenDocumentContentParser
 
OpenDocumentMetaParser - Class in org.apache.tika.parser.odf
Parser for OpenDocument meta.xml files.
OpenDocumentMetaParser() - Constructor for class org.apache.tika.parser.odf.OpenDocumentMetaParser
 
OpenDocumentParser - Class in org.apache.tika.parser.odf
OpenOffice parser
OpenDocumentParser() - Constructor for class org.apache.tika.parser.odf.OpenDocumentParser
 
OpenNLPNameFinder - Class in org.apache.tika.parser.ner.opennlp
An implementation of NERecogniser that finds names in text using an OpenNLP model.
OpenNLPNameFinder(String, String) - Constructor for class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
Creates an OpenNLP name finder.
OpenNLPNERecogniser - Class in org.apache.tika.parser.ner.opennlp
This implementation of NERecogniser chains an array of OpenNLPNameFinders for which NER models are available on the classpath.
OpenNLPNERecogniser() - Constructor for class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
Creates a default chain of name finders using the default OpenNLP recognizers.
OpenNLPNERecogniser(Map<String, String>) - Constructor for class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
Creates a chain of Named Entity recognisers
OpenOfficeParser - Class in org.apache.tika.parser.opendocument
Deprecated.
Use the OpenDocumentParser class instead. This class will be removed in Apache Tika 1.0.
OpenOfficeParser() - Constructor for class org.apache.tika.parser.opendocument.OpenOfficeParser
Deprecated.
 
org.apache.tika.parser.apple - package org.apache.tika.parser.apple
 
org.apache.tika.parser.asm - package org.apache.tika.parser.asm
 
org.apache.tika.parser.audio - package org.apache.tika.parser.audio
 
org.apache.tika.parser.captioning - package org.apache.tika.parser.captioning
 
org.apache.tika.parser.captioning.tf - package org.apache.tika.parser.captioning.tf
 
org.apache.tika.parser.chm - package org.apache.tika.parser.chm
 
org.apache.tika.parser.chm.accessor - package org.apache.tika.parser.chm.accessor
 
org.apache.tika.parser.chm.assertion - package org.apache.tika.parser.chm.assertion
 
org.apache.tika.parser.chm.core - package org.apache.tika.parser.chm.core
 
org.apache.tika.parser.chm.exception - package org.apache.tika.parser.chm.exception
 
org.apache.tika.parser.chm.lzx - package org.apache.tika.parser.chm.lzx
 
org.apache.tika.parser.code - package org.apache.tika.parser.code
 
org.apache.tika.parser.crypto - package org.apache.tika.parser.crypto
 
org.apache.tika.parser.csv - package org.apache.tika.parser.csv
 
org.apache.tika.parser.ctakes - package org.apache.tika.parser.ctakes
 
org.apache.tika.parser.dbf - package org.apache.tika.parser.dbf
 
org.apache.tika.parser.dif - package org.apache.tika.parser.dif
 
org.apache.tika.parser.dwg - package org.apache.tika.parser.dwg
 
org.apache.tika.parser.envi - package org.apache.tika.parser.envi
 
org.apache.tika.parser.epub - package org.apache.tika.parser.epub
 
org.apache.tika.parser.executable - package org.apache.tika.parser.executable
 
org.apache.tika.parser.feed - package org.apache.tika.parser.feed
 
org.apache.tika.parser.font - package org.apache.tika.parser.font
 
org.apache.tika.parser.gdal - package org.apache.tika.parser.gdal
 
org.apache.tika.parser.geo.topic - package org.apache.tika.parser.geo.topic
 
org.apache.tika.parser.geo.topic.gazetteer - package org.apache.tika.parser.geo.topic.gazetteer
 
org.apache.tika.parser.geoinfo - package org.apache.tika.parser.geoinfo
 
org.apache.tika.parser.grib - package org.apache.tika.parser.grib
 
org.apache.tika.parser.hdf - package org.apache.tika.parser.hdf
 
org.apache.tika.parser.html - package org.apache.tika.parser.html
 
org.apache.tika.parser.html.charsetdetector - package org.apache.tika.parser.html.charsetdetector
 
org.apache.tika.parser.html.charsetdetector.charsets - package org.apache.tika.parser.html.charsetdetector.charsets
 
org.apache.tika.parser.hwp - package org.apache.tika.parser.hwp
 
org.apache.tika.parser.image - package org.apache.tika.parser.image
 
org.apache.tika.parser.image.xmp - package org.apache.tika.parser.image.xmp
 
org.apache.tika.parser.indesign - package org.apache.tika.parser.indesign
 
org.apache.tika.parser.indesign.xmp - package org.apache.tika.parser.indesign.xmp
 
org.apache.tika.parser.internal - package org.apache.tika.parser.internal
 
org.apache.tika.parser.iptc - package org.apache.tika.parser.iptc
 
org.apache.tika.parser.isatab - package org.apache.tika.parser.isatab
 
org.apache.tika.parser.iwork - package org.apache.tika.parser.iwork
 
org.apache.tika.parser.iwork.iwana - package org.apache.tika.parser.iwork.iwana
 
org.apache.tika.parser.jdbc - package org.apache.tika.parser.jdbc
 
org.apache.tika.parser.journal - package org.apache.tika.parser.journal
 
org.apache.tika.parser.jpeg - package org.apache.tika.parser.jpeg
 
org.apache.tika.parser.mail - package org.apache.tika.parser.mail
 
org.apache.tika.parser.mat - package org.apache.tika.parser.mat
 
org.apache.tika.parser.mbox - package org.apache.tika.parser.mbox
 
org.apache.tika.parser.microsoft - package org.apache.tika.parser.microsoft
 
org.apache.tika.parser.microsoft.onenote - package org.apache.tika.parser.microsoft.onenote
 
org.apache.tika.parser.microsoft.onenote.fsshttpb - package org.apache.tika.parser.microsoft.onenote.fsshttpb
 
org.apache.tika.parser.microsoft.onenote.fsshttpb.exception - package org.apache.tika.parser.microsoft.onenote.fsshttpb.exception
 
org.apache.tika.parser.microsoft.onenote.fsshttpb.property - package org.apache.tika.parser.microsoft.onenote.fsshttpb.property
 
org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj - package org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic - package org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking - package org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
 
org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space - package org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space
 
org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned - package org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
 
org.apache.tika.parser.microsoft.onenote.fsshttpb.util - package org.apache.tika.parser.microsoft.onenote.fsshttpb.util
 
org.apache.tika.parser.microsoft.ooxml - package org.apache.tika.parser.microsoft.ooxml
 
org.apache.tika.parser.microsoft.ooxml.xps - package org.apache.tika.parser.microsoft.ooxml.xps
 
org.apache.tika.parser.microsoft.ooxml.xslf - package org.apache.tika.parser.microsoft.ooxml.xslf
 
org.apache.tika.parser.microsoft.ooxml.xwpf - package org.apache.tika.parser.microsoft.ooxml.xwpf
 
org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006 - package org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006
 
org.apache.tika.parser.microsoft.xml - package org.apache.tika.parser.microsoft.xml
 
org.apache.tika.parser.mif - package org.apache.tika.parser.mif
 
org.apache.tika.parser.mp3 - package org.apache.tika.parser.mp3
 
org.apache.tika.parser.mp4 - package org.apache.tika.parser.mp4
 
org.apache.tika.parser.mp4.boxes - package org.apache.tika.parser.mp4.boxes
 
org.apache.tika.parser.ner - package org.apache.tika.parser.ner
 
org.apache.tika.parser.ner.corenlp - package org.apache.tika.parser.ner.corenlp
 
org.apache.tika.parser.ner.grobid - package org.apache.tika.parser.ner.grobid
 
org.apache.tika.parser.ner.mitie - package org.apache.tika.parser.ner.mitie
 
org.apache.tika.parser.ner.nltk - package org.apache.tika.parser.ner.nltk
 
org.apache.tika.parser.ner.opennlp - package org.apache.tika.parser.ner.opennlp
 
org.apache.tika.parser.ner.regex - package org.apache.tika.parser.ner.regex
 
org.apache.tika.parser.netcdf - package org.apache.tika.parser.netcdf
 
org.apache.tika.parser.ocr - package org.apache.tika.parser.ocr
 
org.apache.tika.parser.odf - package org.apache.tika.parser.odf
 
org.apache.tika.parser.opendocument - package org.apache.tika.parser.opendocument
 
org.apache.tika.parser.pdf - package org.apache.tika.parser.pdf
 
org.apache.tika.parser.pkg - package org.apache.tika.parser.pkg
 
org.apache.tika.parser.pot - package org.apache.tika.parser.pot
 
org.apache.tika.parser.prt - package org.apache.tika.parser.prt
 
org.apache.tika.parser.recognition - package org.apache.tika.parser.recognition
 
org.apache.tika.parser.recognition.tf - package org.apache.tika.parser.recognition.tf
 
org.apache.tika.parser.rtf - package org.apache.tika.parser.rtf
 
org.apache.tika.parser.sas - package org.apache.tika.parser.sas
 
org.apache.tika.parser.sentiment - package org.apache.tika.parser.sentiment
 
org.apache.tika.parser.strings - package org.apache.tika.parser.strings
 
org.apache.tika.parser.txt - package org.apache.tika.parser.txt
 
org.apache.tika.parser.utils - package org.apache.tika.parser.utils
 
org.apache.tika.parser.video - package org.apache.tika.parser.video
 
org.apache.tika.parser.wordperfect - package org.apache.tika.parser.wordperfect
 
org.apache.tika.parser.xliff - package org.apache.tika.parser.xliff
 
org.apache.tika.parser.xml - package org.apache.tika.parser.xml
 
ORGANIZATION - Static variable in interface org.apache.tika.parser.ner.NERecogniser
 
ORGANIZATION_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
osids - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
 
osidStreamNotPresent - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
 
OtherFileNodeList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
OutlookExtractor - Class in org.apache.tika.parser.microsoft
Outlook Message Parser.
OutlookExtractor(POIFSFileSystem, ParseContext) - Constructor for class org.apache.tika.parser.microsoft.OutlookExtractor
 
OutlookExtractor(DirectoryNode, ParseContext) - Constructor for class org.apache.tika.parser.microsoft.OutlookExtractor
 
OutlookExtractor.RECIPIENT_TYPE - Enum in org.apache.tika.parser.microsoft
 
OutlookPSTParser - Class in org.apache.tika.parser.mbox
Parser for MS Outlook PST email storage files
OutlookPSTParser() - Constructor for class org.apache.tika.parser.mbox.OutlookPSTParser
 
overrideTupleMap - Variable in class org.apache.tika.parser.microsoft.AbstractListManager
 

P

PackageParser - Class in org.apache.tika.parser.pkg
Parser for various packaging formats.
PackageParser() - Constructor for class org.apache.tika.parser.pkg.PackageParser
 
PackageParser(EncodingDetector) - Constructor for class org.apache.tika.parser.pkg.PackageParser
 
packagingEnd - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
packagingStart - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
padding - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
 
ParagraphLevelCounter(AbstractListManager.LevelTuple[]) - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager.ParagraphLevelCounter
 
ParagraphProperties - Class in org.apache.tika.parser.microsoft.ooxml
 
ParagraphProperties() - Constructor for class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.apple.AppleSingleFileParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.apple.PListParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.asm.ClassParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.audio.AudioParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.audio.MidiParser
 
parse(byte[], T) - Method in interface org.apache.tika.parser.chm.accessor.ChmAccessor
Parses a CHM accessor.
parse(byte[], ChmItsfHeader) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
 
parse(byte[], ChmItspHeader) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
 
parse(byte[], ChmLzxcControlData) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
 
parse(byte[], ChmLzxcResetTable) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
 
parse(byte[], ChmPmgiHeader) - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
 
parse(byte[], ChmPmglHeader) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.chm.ChmParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.code.SourceCodeParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.crypto.Pkcs7Parser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.crypto.TSDParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.csv.TextAndCSVParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ctakes.CTAKESParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dbf.DBFParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dwg.DWGParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.envi.EnviHeaderParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.epub.EpubContentParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.epub.EpubParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.executable.ExecutableParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.feed.FeedParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.font.AdobeFontMetricParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.font.TrueTypeParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.gdal.GDALParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.geo.topic.GeoParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.geoinfo.GeographicInformationParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.grib.GribParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.hdf.HDFParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.html.HtmlParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.hwp.HwpV5Parser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.BPGParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.HeifParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.ICNSParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.ImageParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.PSDParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.TiffParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.WebPParser
 
parse(InputStream) - Method in class org.apache.tika.parser.image.xmp.JempboxExtractor
 
parse(InputStream, OutputStream) - Method in class org.apache.tika.parser.image.xmp.XMPPacketScanner
Locates an XMP packet in a stream, parses it and returns the XMP metadata.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.indesign.IDMLParser
 
parse(InputStream, Metadata) - Static method in class org.apache.tika.parser.indesign.xmp.XMPMetadataExtractor
Parse the XMP Packets.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.iptc.IptcAnpaParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.iptc.IptcAnpaParser
Deprecated.
This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.isatab.ISArchiveParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork18PackageParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.iwork.IWorkPackageParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
 
parse(String, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.journal.GrobidRESTParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.journal.JournalParser
 
parse(String, ParseContext) - Method in class org.apache.tika.parser.journal.TEIDOMParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.jpeg.JpegParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mail.RFC822Parser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mat.MatParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mbox.MboxParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mbox.OutlookPSTParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.EMFParser
 
parse(POIFSFileSystem, XHTMLContentHandler, Locale) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
Extracts text from an Excel workbook, writing the extracted content to the specified Appendable.
parse(DirectoryNode, XHTMLContentHandler, Locale) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
 
parse(POIFSFileSystem, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.HSLFExtractor
 
parse(DirectoryNode, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.HSLFExtractor
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.JackcessParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.MSOwnerFileParser
Extracts the owner from an MS temp file.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.OfficeParser
Extracts properties and text from an MS Document input stream
parse(DirectoryNode, ParseContext, Metadata, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.OfficeParser
 
parse(OldExcelExtractor, XHTMLContentHandler) - Static method in class org.apache.tika.parser.microsoft.OldExcelParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.OldExcelParser
Extracts properties and text from an MS Document input stream
parse(DataElementPackage) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStoreParser
 
parse(byte[], AtomicInteger, Class<T>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
Used to parse a byte array into a specific object.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
 
parse(XHTMLContentHandler, Metadata) - Method in class org.apache.tika.parser.microsoft.OutlookExtractor
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.TNEFParser
Extracts properties and text from an MS Document input stream
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.WMFParser
 
parse(POIFSFileSystem, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.WordExtractor
 
parse(DirectoryNode, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.WordExtractor
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mif.MIFParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mp3.Mp3Parser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mp4.MP4Parser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mp4.NoakesMP4Parser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ner.NamedEntityParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.netcdf.NetCDFParser
 
parse(Image, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.FlatOpenDocumentParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentContentParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentMetaParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.CompressorParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.PackageParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.RarParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pot.PooledTimeSeriesParser
Parses a document stream into a sequence of XHTML SAX events.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.prt.PRTParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.rtf.RTFParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.sas.SAS7BDATParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
Performs the parse
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.strings.StringsParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.txt.TXTParser
 
parse(String) - Static method in class org.apache.tika.parser.utils.CommonsDigester
Deprecated.
parse(String) - Method in class org.apache.tika.parser.utils.DataURISchemeUtil
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.video.FLVParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.wordperfect.QuattroProParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xliff.XLIFF12Parser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xliff.XLZParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLProfiler
 
parseAssay(InputStream, XHTMLContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.parser.isatab.ISATabUtils
 
parseContext - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
parseDate(String) - Static method in class org.apache.tika.parser.mbox.MboxParser
 
parseELF(XHTMLContentHandler, Metadata, InputStream, byte[]) - Method in class org.apache.tika.parser.executable.ExecutableParser
Parses a Unix ELF file
parseHeif(InputStream) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
 
parseInline(InputStream, XHTMLContentHandler, TesseractOCRConfig) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
parseInline(InputStream, XHTMLContentHandler, ParseContext, TesseractOCRConfig) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
Use this to parse content without starting a new document.
parseInvestigation(InputStream, XHTMLContentHandler, Metadata, ParseContext, String) - Static method in class org.apache.tika.parser.isatab.ISATabUtils
 
parseInvestigation(InputStream, XHTMLContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.parser.isatab.ISATabUtils
 
parseJpeg(File) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
 
parseObject(String, ParsePosition) - Method in class org.apache.tika.parser.microsoft.TikaExcelGeneralFormat
 
parseOOXMLContentTypes(InputStream) - Static method in class org.apache.tika.parser.pkg.StreamingZipContainerDetector
 
parseOOXMLRels(InputStream) - Static method in class org.apache.tika.parser.pkg.StreamingZipContainerDetector
 
parsePE(XHTMLContentHandler, Metadata, InputStream, byte[]) - Method in class org.apache.tika.parser.executable.ExecutableParser
Parses a DOS or Windows PE file
parseRawExif(InputStream, int, boolean) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
 
parseRawExif(byte[]) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
 
parseRawXMP(byte[]) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
 
parseStreamObject(StreamObjectHeaderStart, byte[], AtomicInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
Parse stream object from byte array.
parseStudy(InputStream, XHTMLContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.parser.isatab.ISATabUtils
 
parseSummaries(POIFSFileSystem) - Method in class org.apache.tika.parser.microsoft.SummaryExtractor
 
parseSummaries(DirectoryNode) - Method in class org.apache.tika.parser.microsoft.SummaryExtractor
 
parseTiff(File) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
 
parseWebP(File) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
 
parseWord6(POIFSFileSystem, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.WordExtractor
 
parseWord6(DirectoryNode, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.WordExtractor
 
PASSWORD - Static variable in class org.apache.tika.parser.pdf.PDFParser
Deprecated.
Supply a PasswordProvider on the ParseContext instead
patterns - Variable in class org.apache.tika.parser.ner.regex.RegexNERecogniser
 
PDFMarkedContent2XHTML - Class in org.apache.tika.parser.pdf
This was added in Tika 1.24 as an alpha version of a text extractor that builds the text from the marked text tree and includes/normalizes some of the structural tags.
PDFParser - Class in org.apache.tika.parser.pdf
PDF parser.
PDFParser() - Constructor for class org.apache.tika.parser.pdf.PDFParser
 
PDFParserConfig - Class in org.apache.tika.parser.pdf
Config for PDFParser.
PDFParserConfig() - Constructor for class org.apache.tika.parser.pdf.PDFParserConfig
 
PDFParserConfig(InputStream) - Constructor for class org.apache.tika.parser.pdf.PDFParserConfig
Loads properties from InputStream and then tries to close InputStream.
PDFParserConfig.OCR_STRATEGY - Enum in org.apache.tika.parser.pdf
 
PDFPreflightParser - Class in org.apache.tika.parser.pdf
Deprecated.
This will be removed in 2.x. The PDFBox community voted to retire the preflight parser in PDFBox 4.x.
PDFPreflightParser() - Constructor for class org.apache.tika.parser.pdf.PDFPreflightParser
Deprecated.
 
peekBits(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
PERCENT - Static variable in interface org.apache.tika.parser.ner.NERecogniser
 
PERCENT_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
PERSON - Static variable in interface org.apache.tika.parser.ner.NERecogniser
 
PERSON_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
Pkcs7Parser - Class in org.apache.tika.parser.crypto
Basic parser for PKCS7 data.
Pkcs7Parser() - Constructor for class org.apache.tika.parser.crypto.Pkcs7Parser
 
PLATFORM - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PLATFORM_AIX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PLATFORM_ARM - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PLATFORM_EMBEDDED - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PLATFORM_FREEBSD - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PLATFORM_HPUX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PLATFORM_IRIX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PLATFORM_LINUX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PLATFORM_NETBSD - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PLATFORM_SOLARIS - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PLATFORM_SYSV - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PLATFORM_TRU64 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PLATFORM_WINDOWS - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
PListParser - Class in org.apache.tika.parser.apple
Parser for Apple's plist and bplist.
PListParser() - Constructor for class org.apache.tika.parser.apple.PListParser
 
PMGL - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
POIFSContainerDetector - Class in org.apache.tika.parser.microsoft
A detector that works on a POIFS OLE2 document to figure out exactly what the file is.
POIFSContainerDetector() - Constructor for class org.apache.tika.parser.microsoft.POIFSContainerDetector
 
POIXMLTextExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
 
POIXMLTextExtractorDecorator(ParseContext, POIXMLTextExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.POIXMLTextExtractorDecorator
 
PooledTimeSeriesParser - Class in org.apache.tika.parser.pot
Uses the Pooled Time Series algorithm and its command-line tool to generate a numeric representation of the video suitable for similarity searches.
PooledTimeSeriesParser() - Constructor for class org.apache.tika.parser.pot.PooledTimeSeriesParser
 
POSITION_BASE - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
PPT - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Microsoft PowerPoint
PREFIX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
 
process(PDDocument, ContentHandler, ParseContext, Metadata, PDFParserConfig) - Static method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
Converts the given PDF document (and related metadata) to a stream of XHTML SAX events sent to the given content handler.
processBox(Box, byte[], Mp4Context) - Method in class org.apache.tika.parser.mp4.TikaMp4BoxHandler
 
processCommand(InputStream) - Method in class org.apache.tika.parser.gdal.GDALParser
 
processingInstruction(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
processPages(PDPageTree) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
 
processShapes(List<XSSFShape>, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
processSheet(XSSFSheetXMLHandler.SheetContentsHandler, CommentsTable, StylesTable, ReadOnlySharedStringsTable, InputStream) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
propertyID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
 
PropertyID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
This class is used to represent a PropertyID.
PropertyID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
 
PropertySet - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
This class is used to represent a PropertySet.
PropertySet() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
 
propertySet - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
 
PropertySetObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
This class is used to represent the property set.
PropertySetObject(ObjectGroupObjectDeclare, ObjectGroupObjectData) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySetObject
Construct the PropertySetObject instance.
PropertyType - Enum in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
PRT_MIME_TYPE - Static variable in class org.apache.tika.parser.prt.PRTParser
 
PrtArrayOfPropertyValues - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
This class is used to represent the prtArrayOfPropertyValues.
PrtArrayOfPropertyValues() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
 
PrtFourBytesOfLengthFollowedByData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
This class is used to represent the prtFourBytesOfLengthFollowedByData.
PrtFourBytesOfLengthFollowedByData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
 
PRTParser - Class in org.apache.tika.parser.prt
A basic text extracting parser for the CADKey PRT (CAD Drawing) format.
PRTParser() - Constructor for class org.apache.tika.parser.prt.PRTParser
 
PSDParser - Class in org.apache.tika.parser.image
Parser for the Adobe Photoshop PSD File Format.
PSDParser() - Constructor for class org.apache.tika.parser.image.PSDParser
 
PUB - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Microsoft Publisher

Q

QP_7_8 - Static variable in class org.apache.tika.parser.wordperfect.QuattroProParser
 
QP_9 - Static variable in class org.apache.tika.parser.wordperfect.QuattroProParser
 
QUATTROPRO - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Base QuattroPro mime
QuattroProParser - Class in org.apache.tika.parser.wordperfect
Parser for Corel QuattroPro documents (part of Corel WordPerfect Office Suite).
QuattroProParser() - Constructor for class org.apache.tika.parser.wordperfect.QuattroProParser
 

R

RarParser - Class in org.apache.tika.parser.pkg
Parser for Rar files.
RarParser() - Constructor for class org.apache.tika.parser.pkg.RarParser
 
RawTagIterator(int, int, int, int) - Constructor for class org.apache.tika.parser.mp3.ID3v2Frame.RawTagIterator
 
RDCAnalysisChunking - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
This class is used to process RDC analysis chunking
RDCAnalysisChunking(byte[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.RDCAnalysisChunking
Initializes a new instance of the class
readBytes(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
Reading the bytes specified by the byte length.
readFully(InputStream, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
 
readFully(InputStream, int, boolean) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
 
readGuid(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AdapterHelper
This method is used to read the GUID from a byte array.
readGuid() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
Read a GUID from the current offset position and increase the bit offset by 128 bits.
readInt16(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
Read specified bit length content as a UInt16 type and increase the bit offset by the specified length.
readInt32(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
Read specified bit length content as an Int32 type and increase the bit offset by the specified length.
readUInt16(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
 
readUInt32(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
Read specified bit length content as a UInt32 type and increase the bit offset by the specified length.
readUInt64(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
Read specified bit length content as a UInt64 type and increase the bit offset.
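
The BitReader entries above all follow one pattern: read a given number of bits as an integer, then advance the bit offset by that length. A minimal sketch of that pattern follows. The class `BitReaderSketch` and its methods are hypothetical and are not Tika's `BitReader`; the sketch also assumes bits are consumed least-significant-bit first within each byte, which may differ from the real implementation.

```java
// Hypothetical stand-alone sketch; not the Tika BitReader class.
class BitReaderSketch {
    private final byte[] data;
    private int bitOffset; // index of the next unread bit

    BitReaderSketch(byte[] data) {
        this.data = data.clone();
    }

    /** Reads bitLength bits (0..32) as an unsigned value and advances the bit offset. */
    long readUInt32(int bitLength) {
        if (bitLength < 0 || bitLength > 32) {
            throw new IllegalArgumentException("bitLength must be between 0 and 32");
        }
        if (bitOffset + (long) bitLength > data.length * 8L) {
            throw new IllegalStateException("not enough bits left");
        }
        long value = 0;
        for (int i = 0; i < bitLength; i++) {
            int byteIndex = bitOffset >>> 3;
            int bitInByte = bitOffset & 7; // assumption: LSB-first within a byte
            long bit = (data[byteIndex] >>> bitInByte) & 1;
            value |= bit << i;
            bitOffset++;
        }
        return value;
    }

    /** Returns the index of the next unread bit. */
    int bitOffset() {
        return bitOffset;
    }

    /** Moves the reader back to the first bit. */
    void reset() {
        bitOffset = 0;
    }
}
```

For the two bytes `0xB5, 0x01`, reading 4 bits and then 8 bits under this bit order yields 5 and 27, leaving the offset at 12.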
recognise(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
 
recognise(String) - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
Recognises names of entities in the text.
recognise(String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
Recognises names of entities in the text.
recognise(String) - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
Recognises names of entities in the text.
recognise(String) - Method in interface org.apache.tika.parser.ner.NERecogniser
Runs name recognition on the given text.
recognise(String) - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
Recognises names of entities in the text.
recognise(String) - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
 
recognise(String) - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
recognise(String) - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
 
recognise(InputStream, ContentHandler, Metadata, ParseContext) - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
Recognise the objects in the stream
recognise(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
 
recognise(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
RecognisedObject - Class in org.apache.tika.parser.recognition
A model for objects recognised in graphics and text; it typically includes a human-readable label for the object, the language of the label, an id, and a confidence score.
RecognisedObject(String, String, String, double) - Constructor for class org.apache.tika.parser.recognition.RecognisedObject
 
referencedObjectID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
 
referencedObjectSpacesID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
 
RegexNERecogniser - Class in org.apache.tika.parser.ner.regex
This class offers an implementation of NERecogniser based on Regular Expressions.
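
As an illustration of regex-based entity recognition in general, the sketch below maps entity type names to compiled patterns and collects every match per type. `RegexNerSketch`, `addPattern`, and the sample pattern are hypothetical; this is not the Tika class, and the pattern file format accepted by `RegexNERecogniser(InputStream)` is not shown here.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; not the Tika RegexNERecogniser class.
class RegexNerSketch {
    // Entity type name -> compiled pattern, kept in registration order.
    private final Map<String, Pattern> patterns = new LinkedHashMap<>();

    void addPattern(String entityType, String regex) {
        patterns.put(entityType, Pattern.compile(regex));
    }

    /** Returns entity type -> distinct matches found in the text; types without matches are omitted. */
    Map<String, Set<String>> recognise(String text) {
        Map<String, Set<String>> result = new HashMap<>();
        for (Map.Entry<String, Pattern> entry : patterns.entrySet()) {
            Matcher matcher = entry.getValue().matcher(text);
            Set<String> found = new LinkedHashSet<>();
            while (matcher.find()) {
                found.add(matcher.group());
            }
            if (!found.isEmpty()) {
                result.put(entry.getKey(), found);
            }
        }
        return result;
    }
}
```

Registering a `PERCENT` pattern such as `\d+(\.\d+)?%` and calling `recognise("Growth was 12.5% in Q1 and 7% in Q2.")` would return the two percentages under the `PERCENT` key.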
RegexNERecogniser() - Constructor for class org.apache.tika.parser.ner.regex.RegexNERecogniser
 
RegexNERecogniser(InputStream) - Constructor for class org.apache.tika.parser.ner.regex.RegexNERecogniser
 
remove() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTagIterator
 
render(XHTMLContentHandler) - Method in interface org.apache.tika.parser.microsoft.Cell
Renders the content to the given XHTML SAX event stream.
render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.CellDecorator
 
render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.LinkedCell
 
render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.NumberCell
 
render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.TextCell
 
ReplacementCharset - Class in org.apache.tika.parser.html.charsetdetector.charsets
An implementation of the standard "replacement" charset defined by the W3C.
ReplacementCharset() - Constructor for class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
 
RequestTypes - Enum in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
The enumeration of request type.
reserved - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
 
reserved - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
 
reserved - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
 
reset(AnalysisEngine, JCas) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
Resets cTAKES objects, if created.
reset() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
Sets the enumerator to its initial position, which is before the first bit in the byte array.
reset() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
 
RESET_TABLE - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
resetAE(AnalysisEngine) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
Resets the AE (AnalysisEngine), releasing all resources held by the current AE.
resetCAS(JCas) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
Resets the CAS (Common Analysis System), emptying it of all content.
resolveEntity(String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
Do not load any DTDs (may be requested by parser).
reverse(byte[]) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
Reverses the order of the given array.
reverseByteOrder(byte[]) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
revisionExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
 
revisionID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
 
RevisionManifest - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
RevisionManifest() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
Initializes a new instance of the RevisionManifest class.
revisionManifest - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
 
RevisionManifestDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
RevisionManifestDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
Initializes a new instance of the RevisionManifestDataElementData class.
revisionManifestObjectGroupReferences - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
 
RevisionManifestObjectGroupReferences - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Specifies revision manifest object group references, each followed by object group extended GUIDs.
RevisionManifestObjectGroupReferences() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
Initializes a new instance of the RevisionManifestObjectGroupReferences class.
RevisionManifestObjectGroupReferences(ExGuid) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
Initializes a new instance of the RevisionManifestObjectGroupReferences class.
RevisionManifestRootDeclare - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Specifies revision manifest root declares, each followed by root and object extended GUIDs.
RevisionManifestRootDeclare() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
Initializes a new instance of the RevisionManifestRootDeclare class.
revisionManifestRootDeclareList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
 
revisionManifests - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
revisionMappingExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
 
revisionMappingSerialNumber - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
 
RevisionStoreObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
The class is used to represent the revision store object.
RevisionStoreObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
Initialize the class.
RevisionStoreObjectGroup - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
RevisionStoreObjectGroup(ExGuid) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
 
RFC822Parser - Class in org.apache.tika.parser.mail
Uses apache-mime4j to parse emails.
RFC822Parser() - Constructor for class org.apache.tika.parser.mail.RFC822Parser
 
rgbReserved - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
rgData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
 
rgPrids - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
 
rightShift(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
ROOT_ENTITY - Static variable in class org.apache.tika.parser.xml.XMLProfiler
 
rootExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
 
rootExGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
 
RootExGuid - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
 
RootNodeObjectBuilder() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject.RootNodeObjectBuilder
 
RTFParser - Class in org.apache.tika.parser.rtf
RTF parser
RTFParser() - Constructor for class org.apache.tika.parser.rtf.RTFParser
 
run(RunProperties, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
run(RunProperties, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
RunProperties - Class in org.apache.tika.parser.microsoft.ooxml
WARNING: This class is mutable.
RunProperties() - Constructor for class org.apache.tika.parser.microsoft.ooxml.RunProperties
 

S

salvageCopy(InputStream, File, boolean) - Static method in class org.apache.tika.parser.utils.ZipSalvager
This streams the broken zip and rebuilds a new zip that is at least a valid zip file.
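
The idea of streaming a damaged zip and rebuilding a valid one can be sketched with the JDK's `java.util.zip` classes alone. This is an assumption-laden illustration, not Tika's `ZipSalvager`: the class `ZipSalvageSketch` and its helper methods are hypothetical, and the real implementation may use a different zip library and different recovery rules.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical stand-alone sketch; not the Tika ZipSalvager class.
class ZipSalvageSketch {

    /** Streams entries out of a possibly damaged zip and rewrites them into a fresh zip. */
    static void salvage(InputStream broken, OutputStream repaired) throws IOException {
        byte[] buffer = new byte[8192];
        // Closing 'out' finishes any half-written entry and writes the central directory.
        try (ZipOutputStream out = new ZipOutputStream(repaired)) {
            ZipInputStream in = new ZipInputStream(broken);
            try {
                ZipEntry entry;
                while ((entry = in.getNextEntry()) != null) {
                    // Copy only the name; sizes and CRCs are recomputed for the new zip.
                    out.putNextEntry(new ZipEntry(entry.getName()));
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                    out.closeEntry();
                }
            } catch (IOException e) {
                // Truncated or corrupt input: stop and keep whatever was copied so far.
                // (A production version would separate read errors from write errors.)
            }
        }
    }

    /** In-memory convenience wrapper around salvage(InputStream, OutputStream). */
    static byte[] salvage(byte[] broken) {
        ByteArrayOutputStream repaired = new ByteArrayOutputStream();
        try {
            salvage(new ByteArrayInputStream(broken), repaired);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return repaired.toByteArray();
    }

    /** Builds a zip in memory from name -> content pairs (demo helper). */
    static byte[] zipOf(Map<String, byte[]> files) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                out.putNextEntry(new ZipEntry(file.getKey()));
                out.write(file.getValue());
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads every entry of a valid zip into a name -> content map (demo helper). */
    static Map<String, byte[]> readAll(byte[] zip) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        byte[] buffer = new byte[8192];
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zip))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int n;
                while ((n = in.read(buffer)) != -1) {
                    content.write(buffer, 0, n);
                }
                files.put(entry.getName(), content.toByteArray());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }
}
```

Because the copy is entry by entry, a zip that was cut off mid-entry still yields a readable archive containing the complete earlier entries plus the partial data of the entry where reading failed.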
salvageCopy(File, File) - Static method in class org.apache.tika.parser.utils.ZipSalvager
 
SAS7BDATParser - Class in org.apache.tika.parser.sas
Processes the SAS7BDAT columnar database file format used by SAS and similar languages.
SAS7BDATParser() - Constructor for class org.apache.tika.parser.sas.SAS7BDATParser
 
SchemaGuid - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
 
SDA - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
StarOffice Draw
SDC - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
StarOffice Calc
SDD - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
StarOffice Impress
SDW - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
StarOffice Writer
searchGeoNames(ArrayList<String>) - Method in class org.apache.tika.parser.geo.topic.GeoParser
 
secondaryParser - Variable in class org.apache.tika.parser.ner.NamedEntityParser
 
SentimentAnalysisParser - Class in org.apache.tika.parser.sentiment
This parser classifies documents based on the sentiment of the document.
SentimentAnalysisParser() - Constructor for class org.apache.tika.parser.sentiment.SentimentAnalysisParser
 
SequenceNumberGenerator - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
 
SequenceNumberGenerator() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
 
serialize(JCas, CTAKESSerializer, boolean, OutputStream) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
Serializes a CAS in the given format.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
Used to convert the element into a byte List
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
Used to convert the element into a byte List
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
Used to convert the element into a byte List
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
Used to convert the element into a byte List
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadata
Used to convert the element into a byte List
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadataDeclarations
Used to convert the element into a byte List
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
Used to convert the element into a byte List
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
Used to convert the element into a byte List.
serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
Serialize items to byte list.
SerializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
This method is used to convert the element of ExtendedGUID object into a byte List.
serializeToByteList() - Method in interface org.apache.tika.parser.microsoft.onenote.fsshttpb.IFSSHTTPBSerializable
Serialize to byte list.
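
The many `serializeToByteList()` entries share one contract: each element turns itself into a `List<Byte>`, and containers concatenate the lists of their parts. A minimal sketch of that contract follows. The types `ByteListSerializable`, `FourBytesSketch`, and `CompositeSketch` are hypothetical stand-ins, not Tika classes, and the little-endian byte order is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a "serialize to byte list" contract; not a Tika type.
interface ByteListSerializable {
    List<Byte> serializeToByteList();
}

// A fixed-width value written as four bytes (assumption: little-endian order).
class FourBytesSketch implements ByteListSerializable {
    private final int value;

    FourBytesSketch(int value) {
        this.value = value;
    }

    @Override
    public List<Byte> serializeToByteList() {
        List<Byte> bytes = new ArrayList<>(4);
        for (int shift = 0; shift < 32; shift += 8) {
            bytes.add((byte) (value >>> shift)); // lowest byte first
        }
        return bytes;
    }
}

// A container that serializes by concatenating the byte lists of its parts.
class CompositeSketch implements ByteListSerializable {
    private final List<ByteListSerializable> parts = new ArrayList<>();

    CompositeSketch add(ByteListSerializable part) {
        parts.add(part);
        return this;
    }

    @Override
    public List<Byte> serializeToByteList() {
        List<Byte> bytes = new ArrayList<>();
        for (ByteListSerializable part : parts) {
            bytes.addAll(part.serializeToByteList());
        }
        return bytes;
    }
}
```

Under these assumptions, `new FourBytesSketch(0x01020304).serializeToByteList()` produces the bytes 4, 3, 2, 1 in that order.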
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.ArrayNumber
This method is used to convert the element of ArrayNumber into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
This method is used to convert the element of EightBytesOfData into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
This method is used to convert the element of FourBytesOfData into a byte List.
serializeToByteList() - Method in interface org.apache.tika.parser.microsoft.onenote.fsshttpb.property.IProperty
This method is used to convert the element of property into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.NoData
This method is used to convert the element of NoData into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
This method is used to convert the element of OneByteOfData into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
This method is used to convert the element of the prtArrayOfPropertyValues into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
This method is used to convert the element of prtFourBytesOfLengthFollowedByData into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
This method is used to convert the element of TwoBytesOfData into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
Used to serialize item to byte list.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
This method is used to convert the element of BinaryItem basic object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
This method is used to convert the element of CellID basic object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
This method is used to convert the element of CellIDArray basic object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
This method is used to convert the element of Compact64bitInt basic object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
This method is used to convert the element of CompactID object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
This method is used to convert the element of ExGuid basic object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
This method is used to convert the element of ExGUIDArray basic object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
This method is used to convert the element of JCID object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
This method is used to convert the element of PropertyID object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
This method is used to convert the element of SerialNumber basic object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
Used to convert the element into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementData
Serialize item to byte list.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
Used to convert the element into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
This method is used to convert the element of PropertySet into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
Used to convert the element into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
This method is used to convert the element of the ObjectSpaceObjectPropSet into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
This method is used to convert the element of ObjectSpaceObjectStreamHeader into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
This method is used to convert the element of ObjectSpaceObjectStreamOfContextIDs object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
This method is used to convert the element of ObjectSpaceObjectStreamOfOIDs object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
This method is used to convert the element of ObjectSpaceObjectStreamOfOSIDs object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
Used to convert the element into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
Used to convert the element into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
Serialize item to byte list.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
This method is used to convert the element of StreamObjectHeaderEnd16bit basic object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
This method is used to convert the element of StreamObjectHeaderEnd8bit basic object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
This method is used to convert the element of StreamObjectHeaderStart16bit basic object into a byte List.
serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
This method is used to convert the element of StreamObjectHeaderStart32bit basic object into a byte List.
SerialNumber - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
SerialNumber(UUID, long) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
Initializes a new instance of the SerialNumber class with specified values.
SerialNumber(SerialNumber) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
Initializes a new instance of the SerialNumber class; this is the copy constructor.
SerialNumber() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
Initializes a new instance of the SerialNumber class; this is the default constructor.
serialNumber - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
 
setAccessChecker(AccessChecker) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
setAdmin1Code(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
 
setAdmin2Code(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
 
setAeDescriptorPath(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Sets the path to XML descriptor for AnalysisEngine.
setAlignedLenTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setAlignedTreeTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setAnnotationProps(CTAKESAnnotationProperty[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Sets the CTAKESAnnotationProperty values that will be included in cTAKES metadata.
setAnnotationProps(String[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Sets the CTAKESAnnotationProperty values that will be included in cTAKES metadata.
setApplyRotation(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Sets whether or not a rotation value should be calculated and passed to ImageMagick.
setApplyRotation(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setAverageCharTolerance(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
See PDFTextStripper.setAverageCharTolerance(float)
setBit(byte[], long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.Bit
Set a bit value to "On" in the specified byte array with the specified bit position.
setBlock_len(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets block length
setBlockAddress(long[]) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Sets block addresses
setBlockCount(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Sets a block count
setBlockidx_intvl(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets block index interval
setBlockLength(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setBlockLlen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Sets a block length
setBlockNext(int) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
setBlockPrev(int) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
setBlockRemaining(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setBlockType(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setBody(PropertySet) - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
 
setBold(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
 
setByteArrayMaxOverride(int) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
WARNING: this sets a static variable in POI.
setCatchIntermediateIOExceptions(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
The PDFBox parser will throw an IOException if there is a problem with a stream.
setCenter(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
 
setCharset(Charset) - Method in class org.apache.tika.parser.csv.CSVParams
 
setChmDirList(ChmDirectoryListingSet) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setChmItsfHeader(ChmItsfHeader) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setChmItspHeader(ChmItspHeader) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setChmLzxcControlData(ChmLzxcControlData) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setChmLzxcResetTable(ChmLzxcResetTable) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setColorspace(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
setColorspace(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setCommand(String) - Method in class org.apache.tika.parser.gdal.GDALParser
 
setCompressedLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Sets compressed length
setConcatenatePhoneticRuns(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
setConcatenatePhoneticRuns(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
Microsoft Excel files can sometimes contain phonetic (furigana) strings.
setConfidence(double) - Method in class org.apache.tika.parser.recognition.RecognisedObject
 
setContent(List<ExGuid>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
 
setContentLength(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
 
setContentParser(Parser) - Method in class org.apache.tika.parser.epub.EpubParser
 
setContentParser(Parser) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
 
setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
 
setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
 
setContextIDs(ObjectSpaceObjectStreamOfOIDsOSIDsOrContextIDs) - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
 
setControlDataIndex(int) - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Sets control data index
setCountryCode(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
 
setCrawlAllFileNodesFromRoot(boolean) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
Do this to ignore revisions and just parse all file nodes from the root recursively.
setData(byte[]) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setDataOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets data offset
setDateFormatOverride(String) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
setDateFormatOverride(String) - Method in class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
 
setDateOverrideFormat(String) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
A user may wish to override the date formats in xls and xlsx files.
setDeclaredEncoding(String) - Method in class org.apache.tika.parser.txt.CharsetDetector
Set the declared encoding for charset detection.
setDecodedValue(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
 
setDelimiter(Character) - Method in class org.apache.tika.parser.csv.CSVParams
 
setDensity(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
setDensity(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setDepth(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
setDepth(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setDetectableCharset(String, boolean) - Method in class org.apache.tika.parser.txt.CharsetDetector
Deprecated.
This API is ICU internal only.
setDetectAngles(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
setDetectCharsetsInEntryNames(boolean) - Method in class org.apache.tika.parser.pkg.PackageParser
Whether or not to run the default charset detector against entry names in ZipFiles.
setDir_uuid(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets directory uuid
setDirectoryListingEntryList(List<DirectoryListingEntry>) - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Sets chm directory listing entry list
setDirLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets directory length
setDirOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets directory offset
setDocumentLocator(Locator) - Method in class org.apache.tika.parser.dif.DIFContentHandler
 
setDocumentLocator(Locator) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
setDropThreshold(float) - Method in class org.apache.tika.parser.pdf.PDFParser
 
setDropThreshold(float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
setEnableAutoSpace(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
If true (the default), the parser should estimate where spaces should be inserted between words.
setEnableAutoSpace(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
If true (the default), the parser should estimate where spaces should be inserted between words.
setEnableImageProcessing(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Set the value to true if processing is to be enabled.
setEnableImageProcessing(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setEncoding(StringsEncoding) - Method in class org.apache.tika.parser.strings.StringsConfig
Sets the character encoding of the strings that are to be found.
setEntriesToCopy(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
 
setEntryType(ChmCommons.EntryType) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
 
setExtractAcroFormContent(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
If true (the default), extract content from AcroForms at the end of the document.
setExtractActions(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Whether or not to extract PDActions from the file.
setExtractAllAlternatives(boolean) - Method in class org.apache.tika.parser.mail.RFC822Parser
Until version 1.17, Tika handled all body parts as embedded objects (see TIKA-2478).
setExtractAllAlternativesFromMSG(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
Some .msg files can contain body content in html, rtf and/or text.
setExtractAllAlternativesFromMSG(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
Some .msg files can contain body content in html, rtf and/or text.
setExtractAnnotationText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
If true (the default), text in annotations will be extracted.
setExtractAnnotationText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
If true (the default), text in annotations will be extracted.
setExtractBookmarksText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
If true, extract bookmarks (document outline) text.
setExtractFontNames(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Extract font names into a metadata field
setExtractInlineImages(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
If true, extract inline embedded OBXImages.
setExtractMacros(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
setExtractMacros(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
Sets whether or not MSOffice parsers should extract macros.
setExtractMacros(boolean) - Method in class org.apache.tika.parser.odf.FlatOpenDocumentParser
 
setExtractMacros(boolean) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
setExtractMarkedContent(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
If the PDF contains marked content, try to extract text and its marked structure.
setExtractScripts(boolean) - Method in class org.apache.tika.parser.html.HtmlParser
Whether or not to extract contents in script entities.
setExtractUniqueInlineImagesOnly(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Multiple pages within a PDF file might refer to the same underlying image.
setFilePath(String) - Method in class org.apache.tika.parser.strings.FileConfig
Sets the "file" installation folder.
setFilter(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
setFilter(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setFramesRead(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setFreeSpace(long) - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
Sets pmgi free space
setFreeSpace(long) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
setGazetteerRestEndpoint(String) - Method in class org.apache.tika.parser.geo.topic.GeoParser
 
setGazetteerRestEndpoint(String) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
Configure REST endpoint for lucene-geo-gazetteer
setGuid(GUID) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
setGuid(GUID) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
 
setGuid(int[]) - Method in class org.apache.tika.parser.microsoft.onenote.GUID
 
setHadStarted(ChmCommons.LzxState) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setHeader_len(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets itsp header length
setHeaderLen(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets itsf header length
setId(String) - Method in class org.apache.tika.parser.recognition.RecognisedObject
 
setIfXFAExtractOnlyXFA(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
If false (the default), extract content from the full PDF as well as the XFA form.
setIlvl(int) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
 
setImageMagickPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Set the path to the ImageMagick executable directory, needed if it is not on system path.
setImageMagickPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
Sets whether or not the parser should include deleted content.
setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
Whether or not to include deleted content.
setIncludeHeadersAndFooters(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
Whether or not to include headers and footers.
setIncludeMarkup(boolean) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
 
setIncludeMissingRows(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
For table-like formats, and tables within other formats, should missing rows in sparse tables be output where detected? The default is to output only rows defined within the file, which avoids lots of blank lines but means layout isn't preserved.
setIncludeMoveFromContent(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
setIncludeMoveFromContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
With track changes on, when a section is moved, the content is stored in both the "moveFrom" section and in the "moveTo" section.
setIncludeShapeBasedContent(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
setIncludeShapeBasedContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
In Excel and Word, there can be text stored within drawing shapes.
setIncludeSlideMasterContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
Whether or not to include contents from any of the three types of masters -- slide, notes, handout -- in a .ppt or ppt[xm] file.
setIncludeSlideNotes(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
Whether or not to process slide notes content.
setIndex(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
 
setIndex_depth(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets an index depth
setIndex_head(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets an index head
setIndex_root(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets an index root
setIndexCopyFromStart(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
 
setIndexCopyToStart(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
 
setIndexOfContent(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setIndexOfResetData(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setIndexOfResetTable(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setInitializableProblemHandler(InitializableProblemHandler) - Method in class org.apache.tika.parser.pdf.PDFParser
 
setIntelCurrentPossition(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setIntelFileSize(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setIntelState(ChmCommons.IntelState) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setItalics(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
 
setLabel(String) - Method in class org.apache.tika.parser.recognition.RecognisedObject
 
setLabelLang(String) - Method in class org.apache.tika.parser.recognition.RecognisedObject
 
setLang_id(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets language id
setLangId(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets language_id
setLanguage(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Set tesseract language dictionary to be used.
setLanguage(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setLastModified(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets last modified date of the chm file
setLatitude(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
 
setLeft(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
 
setLength(int) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
 
setLengthTreeLengtsTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setLengthTreeTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setListenForAllRecords(boolean) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
Specifies whether this parser should listen for all records or just for the specified few.
setLongitude(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
 
setLzxBlockLength(long) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setLzxBlockOffset(long) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setLzxBlocksCache(List<ChmLzxBlock>) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setMain(String, String, String) - Method in class org.apache.tika.parser.geo.topic.GeoTag
 
setMainTreeElements(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setMainTreeLengtsTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setMainTreeTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setMarkLimit(int) - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
How far into the stream to read for charset detection.
setMarkLimit(int) - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
How far into the stream to read for charset detection.
setMarkLimit(int) - Method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
 
setMarkLimit(int) - Method in class org.apache.tika.parser.pkg.ZipContainerDetector
If this is less than 0, the file will be spooled to disk, and detection will run on the full file.
setMarkLimit(int) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
How far into the stream to read for charset detection.
setMarkLimit(int) - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
How far into the stream to read for charset detection.
setMaxBytesForEmbeddedObject(int) - Static method in class org.apache.tika.parser.rtf.RTFParser
Deprecated.
setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Set the maximum size of a file that will be submitted to OCR.
setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setMaxMainMemoryBytes(long) - Method in class org.apache.tika.parser.pdf.PDFParser
 
setMaxMainMemoryBytes(int) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
setMaxMainMemoryBytes(long) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
setMaxRecordSize(long) - Method in class org.apache.tika.parser.mp4.MP4Parser
Override the maximum record size limit.
setMaxXMPMMHistory(int) - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
Maximum number of events to extract from the event history in the XMP Media Management (XMPMM) section.
setMediaType(MediaType) - Method in class org.apache.tika.parser.csv.CSVParams
 
setMemoryLimitInKb(int) - Method in class org.apache.tika.parser.pkg.CompressorParser
 
setMemoryLimitInKb(int) - Method in class org.apache.tika.parser.rtf.RTFParser
 
setMetadata(String[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Sets the metadata whose values will be analyzed using cTAKES.
setMetaParser(Parser) - Method in class org.apache.tika.parser.epub.EpubParser
 
setMetaParser(Parser) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
setMimetype(boolean) - Method in class org.apache.tika.parser.strings.FileConfig
Sets the mime option.
setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Set the minimum size of a file that will be submitted to OCR.
setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setMinLength(int) - Method in class org.apache.tika.parser.strings.StringsConfig
Sets the minimum sequence length (characters) to print.
setMinSize(int) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
Sets the minimum size of a character sequence to be extracted.
setN(long) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
setName(String) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
Sets entry name
setName(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
 
setNameLength(int) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
Sets an entry name length
setNERModelPath(String) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
 
setNerModelUrl(String) - Method in class org.apache.tika.parser.geo.topic.GeoParser
 
setNerModelUrl(URL) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
 
setNum_blocks(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets the number of blocks contained in the chm file
setNumId(int) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
 
setOcrDPI(int) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Dots per inch used to render the page image for OCR.
setOcrImageFormatName(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
setOcrImageQuality(float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Image quality used to render the page image for OCR.
setOcrImageScale(float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Deprecated.
(as of Tika 1.23, this is no longer used in rendering page images)
setOcrImageType(String) - Method in class org.apache.tika.parser.pdf.PDFParser
 
setOcrImageType(ImageType) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Image type used to render the page image for OCR.
setOcrImageType(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Image type used to render the page image for OCR.
setOcrStrategy(String) - Method in class org.apache.tika.parser.pdf.PDFParser
 
setOcrStrategy(PDFParserConfig.OCR_STRATEGY) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Which strategy to use for OCR
setOcrStrategy(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Which strategy to use for OCR
setOffset(int) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
 
setOids(ObjectSpaceObjectStreamOfOIDsOSIDsOrContextIDs) - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
 
setOnlyLatestRevision(boolean) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
Only parse the latest revision.
setOsids(ObjectSpaceObjectStreamOfOIDsOSIDsOrContextIDs) - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
 
setOutputStream(OutputStream) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Sets the OutputStream object used to write the CAS.
setOutputType(TesseractOCRConfig.OUTPUT_TYPE) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Set output type from ocr process.
setOutputType(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
setOutputType(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setPageSegMode(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Set tesseract page segmentation mode.
setPageSegMode(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setPageSeparator(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
The page separator to use in plain text output.
setPDFParserConfig(PDFParserConfig) - Method in class org.apache.tika.parser.pdf.PDFParser
 
setPersonAndEmail(String, Property, Property, Metadata) - Static method in class org.apache.tika.parser.mail.MailUtil
This tries to split a "from" or "to" value into a person field and an email field.
setPreserveInterwordSpacing(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Whether or not to maintain interword spacing.
setPreserveInterwordSpacing(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setPrettyPrint(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Enables formatted output for the serializer.
setR0(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setR1(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setR2(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setRecogniser(String) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
 
setResetInterval(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Sets a reset interval
setResetTableIndex(int) - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Sets reset table index
setResize(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
setResize(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setRight(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
 
setSeparatorChar(char) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Sets the separator character used for annotation properties.
setSerialize(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Enables CAS serialization.
setSerializerType(CTAKESSerializer) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Sets the type of cTAKES (UIMA) serializer used to write CAS.
setSetKCMS(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Whether to call System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider").
setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets itsf header signature
setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets itsp signature
setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Sets a signature of control data block
setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
Sets pmgi signature
setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
setSize(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Sets a size of control data
setSortByPosition(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
If true, sort text tokens by their x/y position before extracting text.
setSortByPosition(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
If true, sort text tokens by their x/y position before extracting text.
setSpacingTolerance(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
See PDFTextStripper.setSpacingTolerance(float)
setStartIndex(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
 
setStream_uuid(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets stream uuid
setStrike(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
 
setStringsPath(String) - Method in class org.apache.tika.parser.strings.StringsConfig
Sets the "strings" installation folder.
setStripMarkup(boolean) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
Whether or not to attempt to strip html-ish markup from the stream before sending it to the underlying detector.
setStyleID(String) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
 
setSuppressDuplicateOverlappingText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
If true, the parser should try to remove duplicated text over the same region.
setSuppressDuplicateOverlappingText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
If true, the parser should try to remove duplicated text over the same region.
setSwath(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
setSystem_uuid(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets system uuid
setTableOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Sets a table offset
setTessdataPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Set the path to the 'tessdata' folder, which contains language files and config files.
setTessdataPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setTesseractPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Set the path to the Tesseract executable's directory, needed if it is not on system path.
setTesseractPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setText(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Enables content text analysis using cTAKES.
setText(byte[]) - Method in class org.apache.tika.parser.txt.CharsetDetector
Set the input text (byte) data whose charset is to be detected.
setText(InputStream) - Method in class org.apache.tika.parser.txt.CharsetDetector
Set the input text (byte) data whose charset is to be detected.
setTimeout(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Set the maximum time (in seconds) to wait for the OCR process to terminate.
setTimeout(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
setTimeout(int) - Method in class org.apache.tika.parser.strings.StringsConfig
Sets the maximum time (in seconds) to wait for the "strings" command to terminate.
setTotal(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
setTracking(boolean) - Method in class org.apache.tika.parser.mbox.MboxParser
 
setTrustedPageSeparator(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Same as TesseractOCRConfig.setPageSeparator(String) but does not perform any checks on the string.
setType(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
 
setUMLSPass(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Sets the UMLS password.
setUMLSUser(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Sets the UMLS username.
setUncompressedLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Sets uncompressed length
setUnderline(String) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
 
setUnknown(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Sets an unknown
setUnknown0008(long) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
setUnknown_000c(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets unknown_000c
setUnknown_000c(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets the 000c unknown bytes. "Unknown" here means that those who reverse-engineered the CHM format do not know what this field is for.
setUnknown_0024(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets 0024 unknown bytes
setUnknown_002c(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets 002c unknown bytes
setUnknown_0044(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets 0044 unknown bytes
setUnknown_18(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Sets unknown 18 bytes
setUnknownLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets unknown length
setUnknownOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets unknown offset
setUseSAXDocxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
setUseSAXDocxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
Use the experimental SAX-based streaming DOCX parser? If set to false, the classic parser will be used; if true, the new experimental parser will be used.
setUseSAXPptxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
setUseSAXPptxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
Use the experimental SAX-based streaming PPTX parser? If set to false, the classic parser will be used; if true, the new experimental parser will be used.
setUtf16PropertiesToPrint(Set<OneNotePropertyEnum>) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
Print file node data in UTF-16 format when they match these props.
setVersion(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Sets itsf version
setVersion(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Sets a version of itsp header
setVersion(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Sets version of control data block
setVersion(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Sets the version
setWindow(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setWindowPosition(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setWindowSize(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Sets a window size
setWindowSize(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
 
setWindowsPerReset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Sets windows per reset
sheetParts - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
SheetTextAsHTML(OfficeParserConfig, XHTMLContentHandler) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
 
shouldAcceptBox(Box) - Method in class org.apache.tika.parser.mp4.TikaMp4BoxHandler
 
shouldAcceptContainer(Box) - Method in class org.apache.tika.parser.mp4.TikaMp4BoxHandler
 
signature - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
 
SIGNATURE_RELATIONSHIP - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
 
signatureData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
Gets or sets a binary item as specified in [MS-FSSHTTPB] section 2.2.1.3 that specifies a value that is unique to the file data represented by this root node object.
SignatureObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Signature Object
SignatureObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
Initializes a new instance of the SignatureObject class.
SimpleChunking - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
 
SimpleChunking(byte[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.SimpleChunking
Initializes a new instance of the SimpleChunking class
skippedEntity(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
SLDWORKS - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
SolidWorks CAD file
SourceCodeParser - Class in org.apache.tika.parser.code
Generic source code parser for Java, Groovy, and C++.
SourceCodeParser() - Constructor for class org.apache.tika.parser.code.SourceCodeParser
 
SourceCodeParser(EncodingDetector) - Constructor for class org.apache.tika.parser.code.SourceCodeParser
 
SpreadsheetMLParser - Class in org.apache.tika.parser.microsoft.xml
Parses SpreadsheetML 2003 format Excel files.
SpreadsheetMLParser() - Constructor for class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
 
SQLite3Parser - Class in org.apache.tika.parser.jdbc
This is the main class for parsing SQLite3 files.
SQLite3Parser() - Constructor for class org.apache.tika.parser.jdbc.SQLite3Parser
Checks whether the org.sqlite.JDBC class is available.
StandardHtmlEncodingDetector - Class in org.apache.tika.parser.html.charsetdetector
An encoding detector that tries to respect the spirit of the HTML spec part 12.2.3 "The input byte stream", or at least the part that is compatible with the implementation of Tika.
StandardHtmlEncodingDetector() - Constructor for class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
 
start(BundleContext) - Method in class org.apache.tika.parser.internal.Activator
 
START_PMGL - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
 
startBookmark(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
startBookmark(String, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
startDocument() - Method in class org.apache.tika.parser.dif.DIFContentHandler
 
startDocument() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
 
startDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
startDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
startDocument() - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
 
startEditedSection(String, Date, OOXMLWordAndPowerPointTextHandler.EditType) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
startEditedSection(String, Date, OOXMLWordAndPowerPointTextHandler.EditType) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.dif.DIFContentHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.mif.MIFContentHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.AttributeMetadataHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.MetadataHandler
Deprecated.
 
startParagraph(ParagraphProperties) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
startParagraph(ParagraphProperties) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
startPrefixMapping(String, String) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
 
startPrefixMapping(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
startPrefixMapping(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
startPrefixMapping(String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
 
startRow(int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
 
startSDT() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
startSDT() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
startsWith(byte[], String) - Static method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
 
startTable() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
startTable() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
startTableCell() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
startTableCell() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
startTableRow() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
startTableRow() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
stop(BundleContext) - Method in class org.apache.tika.parser.internal.Activator
 
storageIndex - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
StorageIndexCellMapping - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Specifies the storage index cell mappings (with cell identifier, cell mapping extended GUID, and cell mapping serial number)
StorageIndexCellMapping() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
Initializes a new instance of the StorageIndexCellMapping class.
storageIndexCellMappingList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
 
StorageIndexDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
StorageIndexDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
Initializes a new instance of the StorageIndexDataElementData class.
storageIndexExtendedGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
storageIndexManifestMapping - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
 
StorageIndexManifestMapping - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
StorageIndexManifestMapping() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
Initializes a new instance of the StorageIndexManifestMapping class.
StorageIndexRevisionMapping - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Specifies the storage index revision mappings (with revision and revision mapping extended GUIDs, and revision mapping serial number)
StorageIndexRevisionMapping() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
Initializes a new instance of the StorageIndexRevisionMapping class.
storageIndexRevisionMappingList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
 
storageManifest - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
StorageManifestDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
StorageManifestDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
Initializes a new instance of the StorageManifestDataElementData class.
StorageManifestRootDeclare - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Specifies one or more storage manifest root declarations.
StorageManifestRootDeclare() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
Initializes a new instance of the StorageManifestRootDeclare class.
storageManifestRootDeclareList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
 
storageManifestSchemaGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
 
StorageManifestSchemaGUID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Specifies a storage manifest schema GUID
StorageManifestSchemaGUID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
Initializes a new instance of the StorageManifestSchemaGUID class.
STREAM_OBJECT_HEADER_START_16_BIT - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
Specify for 16-bit stream object header start.
STREAM_OBJECT_HEADER_START_32_BIT - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
Specify for 32-bit stream object header start.
StreamingZipContainerDetector - Class in org.apache.tika.parser.pkg
 
StreamingZipContainerDetector(int) - Constructor for class org.apache.tika.parser.pkg.StreamingZipContainerDetector
 
StreamObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
StreamObject(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
Initializes a new instance of the StreamObject class.
StreamObjectHeaderEnd - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
StreamObjectHeaderEnd() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd
 
StreamObjectHeaderEnd16bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
A 16-bit header for a compound object that indicates the end of a stream object
StreamObjectHeaderEnd16bit(int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
Initializes a new instance of the StreamObjectHeaderEnd16bit class with the specified type value.
StreamObjectHeaderEnd16bit(StreamObjectTypeHeaderEnd) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
Initializes a new instance of the StreamObjectHeaderEnd16bit class with the specified type value.
StreamObjectHeaderEnd16bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
Initializes a new instance of the StreamObjectHeaderEnd16bit class, this is the default constructor.
StreamObjectHeaderEnd8bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
An 8-bit header for a compound object that indicates the end of a stream object
StreamObjectHeaderEnd8bit(int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
Initializes a new instance of the StreamObjectHeaderEnd8bit class with the specified type value.
StreamObjectHeaderEnd8bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
Initializes a new instance of the StreamObjectHeaderEnd8bit class, this is the default constructor.
StreamObjectHeaderEnd8bit(StreamObjectTypeHeaderEnd) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
Initializes a new instance of the StreamObjectHeaderEnd8bit class with the specified type value.
StreamObjectHeaderStart - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
This class specifies the base class for 16-bit or 32-bit stream object header start
StreamObjectHeaderStart() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
Initializes a new instance of the StreamObjectHeaderStart class.
StreamObjectHeaderStart(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
Initializes a new instance of the StreamObjectHeaderStart class with specified header type.
StreamObjectHeaderStart16bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
A 16-bit header for a compound object that indicates the start of a stream object
StreamObjectHeaderStart16bit(StreamObjectTypeHeaderStart, int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
Initializes a new instance of the StreamObjectHeaderStart16bit class with specified type and length.
StreamObjectHeaderStart16bit(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
Initializes a new instance of the StreamObjectHeaderStart16bit class with specified type.
StreamObjectHeaderStart16bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
Initializes a new instance of the StreamObjectHeaderStart16bit class, this is the default constructor.
StreamObjectHeaderStart32bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
A 32-bit header for a compound object that indicates the start of a stream object
StreamObjectHeaderStart32bit(StreamObjectTypeHeaderStart, int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
Initializes a new instance of the StreamObjectHeaderStart32bit class with specified type and length.
StreamObjectHeaderStart32bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
Initializes a new instance of the StreamObjectHeaderStart32bit class, this is the default constructor.
StreamObjectHeaderStart32bit(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
Initializes a new instance of the StreamObjectHeaderStart32bit class with specified type.
StreamObjectParseErrorException - Exception in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
StreamObjectParseErrorException(int, String, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectParseErrorException
Initializes a new instance of the StreamObjectParseErrorException class
StreamObjectParseErrorException(int, String, String, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectParseErrorException
Initializes a new instance of the StreamObjectParseErrorException class
StreamObjectTypeHeaderEnd - Enum in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
StreamObjectTypeHeaderStart - Enum in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
The enumeration of the stream object type header start
streamObjectTypeName - Variable in exception org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectParseErrorException
 
StringsConfig - Class in org.apache.tika.parser.strings
Configuration for the "strings" (or strings-alternative) command.
StringsConfig() - Constructor for class org.apache.tika.parser.strings.StringsConfig
Default constructor.
StringsConfig(InputStream) - Constructor for class org.apache.tika.parser.strings.StringsConfig
Loads properties from InputStream and then tries to close InputStream.
StringsEncoding - Enum in org.apache.tika.parser.strings
Character encoding of the strings that are to be found using the "strings" command.
StringsParser - Class in org.apache.tika.parser.strings
Parser that uses the "strings" (or strings-alternative) command to find the printable strings in an object file or other binary file (application/octet-stream).
StringsParser() - Constructor for class org.apache.tika.parser.strings.StringsParser
 
stringToAsciiBytes(String) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
subtract(UByte) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
subtract(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
subtract(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
subtract(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
subtract(ULong) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
subtract(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
subtract(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
subtract(UShort) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
subtract(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
SUMMARY_PROPERTY_PREFIX - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
 
SummaryExtractor - Class in org.apache.tika.parser.microsoft
Extractor for Common OLE2 (HPSF) metadata
SummaryExtractor(Metadata) - Constructor for class org.apache.tika.parser.microsoft.SummaryExtractor
 
SUPPORTED_TYPES - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
 
SUPPORTED_TYPES - Static variable in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
 
SXSLFPowerPointExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
SAX/streaming PPTX extractor
SXSLFPowerPointExtractorDecorator(Metadata, ParseContext, XSLFEventBasedPowerPointExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
 
SXWPFWordExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
This is an experimental, alternative extractor for docx files.
SXWPFWordExtractorDecorator(Metadata, ParseContext, XWPFEventBasedWordExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
 
SYS_PROP_NER_IMPL - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
 

T

TagAndStyle(String, String) - Constructor for class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
 
tagName() - Method in enum org.apache.tika.parser.microsoft.FormattingUtils.Tag
 
TEIDOMParser - Class in org.apache.tika.parser.journal
 
TEIDOMParser() - Constructor for class org.apache.tika.parser.journal.TEIDOMParser
 
templateID - Variable in class org.apache.tika.parser.rtf.ListDescriptor
 
TensorflowImageRecParser - Class in org.apache.tika.parser.recognition.tf
TensorflowImageRecParser() - Constructor for class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
 
TensorflowRESTCaptioner - Class in org.apache.tika.parser.captioning.tf
Tensorflow image captioner.
TensorflowRESTCaptioner() - Constructor for class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
 
TensorflowRESTRecogniser - Class in org.apache.tika.parser.recognition.tf
TensorFlow image recogniser with high performance.
TensorflowRESTRecogniser() - Constructor for class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
TensorflowRESTVideoRecogniser - Class in org.apache.tika.parser.recognition.tf
TensorFlow video recogniser with high performance.
TensorflowRESTVideoRecogniser() - Constructor for class org.apache.tika.parser.recognition.tf.TensorflowRESTVideoRecogniser
 
TesseractOCRConfig - Class in org.apache.tika.parser.ocr
Configuration for TesseractOCRParser.
TesseractOCRConfig() - Constructor for class org.apache.tika.parser.ocr.TesseractOCRConfig
Default constructor.
TesseractOCRConfig(InputStream) - Constructor for class org.apache.tika.parser.ocr.TesseractOCRConfig
Loads properties from InputStream and then tries to close InputStream.
TesseractOCRConfig.OUTPUT_TYPE - Enum in org.apache.tika.parser.ocr
 
TesseractOCRParser - Class in org.apache.tika.parser.ocr
TesseractOCRParser, powered by the tesseract-ocr engine.
TesseractOCRParser() - Constructor for class org.apache.tika.parser.ocr.TesseractOCRParser
 
TextAndCSVParser - Class in org.apache.tika.parser.csv
Unless the TikaCoreProperties.CONTENT_TYPE_OVERRIDE is set, this parser tries to assess whether the file is a text file, csv or tsv.
TextAndCSVParser() - Constructor for class org.apache.tika.parser.csv.TextAndCSVParser
 
TextAndCSVParser(EncodingDetector) - Constructor for class org.apache.tika.parser.csv.TextAndCSVParser
 
TextCell - Class in org.apache.tika.parser.microsoft
Text cell.
TextCell(String) - Constructor for class org.apache.tika.parser.microsoft.TextCell
 
TiffParser - Class in org.apache.tika.parser.image
 
TiffParser() - Constructor for class org.apache.tika.parser.image.TiffParser
 
TikaExcelDataFormatter - Class in org.apache.tika.parser.microsoft
Overrides Excel's General format to include more significant digits than the MS Spec allows.
TikaExcelDataFormatter() - Constructor for class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
 
TikaExcelDataFormatter(Locale) - Constructor for class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
 
TikaExcelGeneralFormat - Class in org.apache.tika.parser.microsoft
A Format that allows up to 15 significant digits for integers.
TikaExcelGeneralFormat(Locale) - Constructor for class org.apache.tika.parser.microsoft.TikaExcelGeneralFormat
 
TikaMp4BoxHandler - Class in org.apache.tika.parser.mp4
 
TikaMp4BoxHandler(Metadata, Metadata, XHTMLContentHandler) - Constructor for class org.apache.tika.parser.mp4.TikaMp4BoxHandler
 
TikaUserDataBox - Class in org.apache.tika.parser.mp4.boxes
 
TikaUserDataBox(Box, byte[], Metadata, XHTMLContentHandler) - Constructor for class org.apache.tika.parser.mp4.boxes.TikaUserDataBox
 
TIME - Static variable in interface org.apache.tika.parser.ner.NERecogniser
 
TIME_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
TNEFParser - Class in org.apache.tika.parser.microsoft
A POI-powered Tika Parser for TNEF (Transport Neutral Encapsulation Format) messages, aka winmail.dat
TNEFParser() - Constructor for class org.apache.tika.parser.microsoft.TNEFParser
 
toBigInteger() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
toBigInteger() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
toBigInteger() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UNumber
Get this number as a BigInteger.
toBigInteger() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
toBoolean(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
toByte() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
This method is used to get the byte value of the 8bit stream object header End.
toByteArray(List<Byte>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.ByteUtil
 
toByteArray() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
 
toChar(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
toDouble(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
toGeoTag(Map<String, List<Location>>, String) - Method in class org.apache.tika.parser.geo.topic.GeoTag
 
toInt16(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
toInt16(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
Returns a 16-bit signed integer converted from two bytes at a specified position in a byte array.
toInt32(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
toInt32(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
Returns a 32-bit signed integer converted from four bytes at a specified position in a byte array.
toInt64(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
tokenize(String) - Static method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
 
toListOfByte(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.ByteUtil
 
topN - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
toSingle(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
toString() - Method in class org.apache.tika.parser.captioning.CaptionObject
 
toString() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
 
toString() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns the values of ChmItsfHeader as text
toString() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
 
toString() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns textual representation of ChmLzxcControlData
toString() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
 
toString() - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
Returns textual representation of the pmgi header
toString() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
toString() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
 
toString() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
Returns textual representation of ChmBlockInfo
toString() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
Returns an informative textual summary of the state
toString() - Method in class org.apache.tika.parser.csv.CSVResult
 
toString() - Method in class org.apache.tika.parser.dif.DIFContentHandler
 
toString() - Method in class org.apache.tika.parser.microsoft.NumberCell
 
toString() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
toString() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
 
toString() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
toString() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
toString() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
toString() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
toString(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
toString() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
 
toString() - Method in class org.apache.tika.parser.microsoft.TextCell
 
toString() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
toString() - Method in class org.apache.tika.parser.recognition.RecognisedObject
 
toString() - Method in enum org.apache.tika.parser.strings.StringsEncoding
 
toString() - Method in class org.apache.tika.parser.txt.CharsetMatch
 
toTags(CharacterRun) - Static method in class org.apache.tika.parser.microsoft.FormattingUtils
 
toUint16() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
This method is used to get the Uint16 value of the 16-bit stream object header End.
ToUint16() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
This method is used to get the Uint16 value of the 16bit stream object header.
ToUInt16(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
Returns a 16-bit unsigned integer converted from two bytes at a specified position in a byte array.
toUInt32(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
toUInt32(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
Returns a 32-bit unsigned integer converted from four bytes at a specified position in a byte array.
toUInt64(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
Returns a 64-bit unsigned integer converted from eight bytes at a specified position in a byte array.
TrueTypeParser - Class in org.apache.tika.parser.font
Parser for TrueType font files (TTF).
TrueTypeParser() - Constructor for class org.apache.tika.parser.font.TrueTypeParser
 
tryAnalyzeWhetherConfirmSchema(List<DataElement>, ExGuid) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to analyze whether the data elements conform to the schema defined in MS-FSSHTTPD.
tryAnalyzeWhetherFullDataElementList(List<DataElement>, ExGuid) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to try to analyze whether the returned data elements are complete.
tryGetCurrent(byte[], AtomicInteger, AtomicReference<T>, Class<T>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
Tries to get the current object; returns true on success.
tryParse(byte[], int, AtomicReference<StreamObjectHeaderStart>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
This method is used to parse the actual 16bit or 32bit stream header.
TSD_MIME_TYPE - Static variable in class org.apache.tika.parser.crypto.TSDParser
 
TSDParser - Class in org.apache.tika.parser.crypto
Tika parser for Time Stamped Data Envelope (application/timestamped-data)
TSDParser() - Constructor for class org.apache.tika.parser.crypto.TSDParser
 
TwoBytesOfData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
This class represents a property that contains 2 bytes of data in the PropertySet.rgData stream field.
TwoBytesOfData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
 
TXTParser - Class in org.apache.tika.parser.txt
Plain text parser.
TXTParser() - Constructor for class org.apache.tika.parser.txt.TXTParser
 
TXTParser(EncodingDetector) - Constructor for class org.apache.tika.parser.txt.TXTParser
 
type - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
 
type - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
 
type - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
 
type - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
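The LittleEndianBitConverter entries in this section describe reading fixed-width integers from a byte array at a given position. As a rough illustration only, the sketch below shows how such little-endian reads can be written in plain Java. The class name `LittleEndianSketch` and the method bodies are hypothetical; they are not taken from the Tika source, and the actual implementation may differ.

```java
// Hypothetical sketch; not the Tika implementation.
final class LittleEndianSketch {
    private LittleEndianSketch() {
    }

    // Reads two bytes starting at index, least significant byte first.
    static short toInt16(byte[] array, int index) {
        return (short) ((array[index] & 0xFF) | ((array[index + 1] & 0xFF) << 8));
    }

    // Reads four bytes starting at index, least significant byte first.
    static int toInt32(byte[] array, int index) {
        return (array[index] & 0xFF)
                | ((array[index + 1] & 0xFF) << 8)
                | ((array[index + 2] & 0xFF) << 16)
                | ((array[index + 3] & 0xFF) << 24);
    }

    // Reads four bytes as an unsigned value, widened to long because
    // Java has no primitive unsigned 32-bit type.
    static long toUInt32(byte[] array, int index) {
        return toInt32(array, index) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = {0x34, 0x12, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        System.out.println(toInt16(data, 0));  // 4660 (0x1234)
        System.out.println(toInt32(data, 2));  // -1
        System.out.println(toUInt32(data, 2)); // 4294967295
    }
}
```

Masking each byte with 0xFF before shifting avoids sign extension, which is the usual pitfall when assembling multi-byte values from Java's signed bytes.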
 

U

UByte - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
The unsigned byte type
ubyte(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned byte
ubyte(byte) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned byte by masking it with 0xFF.
ubyte(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned byte
ubyte(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned byte
ubyte(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned byte
uint(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned int
uint(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned int by masking it with 0xFFFFFFFF.
uint(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned int
uint16() - Method in class org.apache.tika.parser.hwp.HwpStreamReader
Unsigned 2-byte value
uint16(int) - Method in class org.apache.tika.parser.hwp.HwpStreamReader
Array of unsigned 2-byte values
uint32() - Method in class org.apache.tika.parser.hwp.HwpStreamReader
Unsigned 4-byte value
uint8() - Method in class org.apache.tika.parser.hwp.HwpStreamReader
Unsigned 1-byte value
UInteger - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
The unsigned int type
ULong - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
The unsigned long type
ulong(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned long
ulong(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned long by masking it with 0xFFFFFFFFFFFFFFFF.
ulong(BigInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned long
UMath - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
 
UNCOMPRESSED - Static variable in class org.apache.tika.parser.chm.core.ChmCommons
 
UNDEFINED - Static variable in class org.apache.tika.parser.chm.core.ChmCommons
Represents lzx block types in order to decompress differently
UniversalEncodingDetector - Class in org.apache.tika.parser.txt
 
UniversalEncodingDetector() - Constructor for class org.apache.tika.parser.txt.UniversalEncodingDetector
 
unmarshalBytes(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
unmarshalCharArray(byte[], ChmPmglHeader, int) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
 
unmarshalInt() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
unmarshalUByte() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
unmarshalUInt() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
unmarshalUlong() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
unmarshalUtfChar() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
 
unravelStringMet(NetcdfFile, Group, Metadata) - Method in class org.apache.tika.parser.hdf.HDFParser
 
Unsigned - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
A utility class for static access to unsigned number functionality.
UNSPECIFIED_MEDIA_TYPE - Static variable in class org.apache.tika.parser.utils.DataURISchemeUtil
 
UNSUPPORTED_OOXML_TYPES - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
We claim to support all OOXML files, but we actually don't support a small number of them.
UNumber - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
A base type for unsigned numbers.
UNumber() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UNumber
 
USER_DEFINED_PROPERTY_PREFIX - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
 
ushort(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned short
ushort(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned short by masking it with 0xFFFF.
ushort(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
Create an unsigned short
UShort - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
The unsigned short type
UuidUtils - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
 
UuidUtils() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.UuidUtils
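Several entries in this section describe creating unsigned values "by masking" a signed primitive. The sketch below illustrates that idea with plain Java primitives and standard JDK calls only. The class `UnsignedMaskingSketch` and its methods are hypothetical stand-ins; they return widened primitives rather than the UByte, UShort, UInteger and ULong wrapper types listed here, whose internals are not shown in this index.

```java
// Hypothetical sketch; not the Tika implementation.
final class UnsignedMaskingSketch {
    private UnsignedMaskingSketch() {
    }

    // Interprets the 8 bits of a signed byte as a value in 0..255.
    static int ubyte(byte value) {
        return value & 0xFF;
    }

    // Interprets the 16 bits of a signed short as a value in 0..65535.
    static int ushort(short value) {
        return value & 0xFFFF;
    }

    // Interprets the 32 bits of a signed int as a value in 0..4294967295.
    static long uint(int value) {
        return value & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(ubyte((byte) -1));           // 255
        System.out.println(ushort((short) -1));         // 65535
        System.out.println(uint(-1));                   // 4294967295
        // A 64-bit unsigned value does not fit in any primitive, so the JDK
        // offers helpers that reinterpret the bits of a long instead.
        System.out.println(Long.toUnsignedString(-1L)); // 18446744073709551615
    }
}
```

The JDK methods `Byte.toUnsignedInt`, `Short.toUnsignedInt` and `Integer.toUnsignedLong` perform the same masks and may be preferable where no wrapper type is needed.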
 

V

value - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
 
value - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
 
value - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
 
valueOf(String) - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.EntryType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.IntelState
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.LzxState
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.FormattingUtils.Tag
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.Error
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingMethod
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
Get an instance of an unsigned byte
valueOf(byte) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
Get an instance of an unsigned byte by masking it with 0xFF.
valueOf(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
Get an instance of an unsigned byte
valueOf(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
Get an instance of an unsigned byte
valueOf(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
Get an instance of an unsigned byte
valueOf(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
Create an unsigned int
valueOf(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
Create an unsigned int by masking it with 0xFFFFFFFF.
valueOf(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
Create an unsigned int
valueOf(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
Create an unsigned long
valueOf(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
Create an unsigned long by masking it with 0xFFFFFFFFFFFFFFFF.
valueOf(BigInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
Create an unsigned long
valueOf(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
Create an unsigned short
valueOf(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
Create an unsigned short by masking it with 0xFFFF.
valueOf(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
Create an unsigned short
valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.EditType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.ocr.TesseractOCRConfig.OUTPUT_TYPE
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.pdf.PDFParserConfig.OCR_STRATEGY
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.strings.StringsEncoding
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.tika.parser.utils.CommonsDigester.DigestAlgorithm
Returns the enum constant of this type with the specified name.
values() - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.EntryType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.IntelState
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.LzxState
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.microsoft.FormattingUtils.Tag
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.microsoft.onenote.Error
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingMethod
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.EditType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.ocr.TesseractOCRConfig.OUTPUT_TYPE
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.pdf.PDFParserConfig.OCR_STRATEGY
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.strings.StringsEncoding
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.tika.parser.utils.CommonsDigester.DigestAlgorithm
Returns an array containing the constants of this enum type, in the order they are declared.
VERBATIM - Static variable in class org.apache.tika.parser.chm.core.ChmCommons
 
VSD - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Microsoft Visio
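The many `valueOf(String)` and `values()` entries in this section are the standard methods the Java compiler generates for every enum. The sketch below shows their general behaviour using a hypothetical enum, `SketchOutputType`, defined only for this illustration; it is not one of the Tika enums listed here, and the helper `parseOrDefault` is likewise an assumption.

```java
import java.util.Arrays;

// Hypothetical sketch of standard enum behaviour; not Tika code.
final class EnumLookupSketch {
    private EnumLookupSketch() {
    }

    // Stand-in enum used only for this example.
    enum SketchOutputType { TXT, HOCR }

    // valueOf(String) requires an exact, case-sensitive match and throws
    // IllegalArgumentException otherwise, so callers handling untrusted
    // input often wrap it with a fallback. A null name would throw
    // NullPointerException, which this helper does not catch.
    static SketchOutputType parseOrDefault(String name, SketchOutputType fallback) {
        try {
            return SketchOutputType.valueOf(name);
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // values() returns a fresh array in declaration order.
        System.out.println(Arrays.toString(SketchOutputType.values())); // [TXT, HOCR]
        System.out.println(parseOrDefault("HOCR", SketchOutputType.TXT)); // HOCR
        System.out.println(parseOrDefault("hocr", SketchOutputType.TXT)); // TXT
    }
}
```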

W

W_NS - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
walkTree(OneNoteTreeWalkerOptions, Metadata, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
warn() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
WebPParser - Class in org.apache.tika.parser.image
 
WebPParser() - Constructor for class org.apache.tika.parser.image.WebPParser
 
WMFParser - Class in org.apache.tika.parser.microsoft
This parser offers a very rough capability to extract text if there is text stored in the WMF files.
WMFParser() - Constructor for class org.apache.tika.parser.microsoft.WMFParser
 
Word2006MLParser - Class in org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006
 
Word2006MLParser() - Constructor for class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
 
WordExtractor - Class in org.apache.tika.parser.microsoft
 
WordExtractor(ParseContext, Metadata) - Constructor for class org.apache.tika.parser.microsoft.WordExtractor
 
WordExtractor.TagAndStyle - Class in org.apache.tika.parser.microsoft
 
WordMLParser - Class in org.apache.tika.parser.microsoft.xml
Parses Word files in the WordML 2003 format.
WordMLParser() - Constructor for class org.apache.tika.parser.microsoft.xml.WordMLParser
 
WordPerfectParser - Class in org.apache.tika.parser.wordperfect
Parser for Corel WordPerfect documents.
WordPerfectParser() - Constructor for class org.apache.tika.parser.wordperfect.WordPerfectParser
 
WPS - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Microsoft Works
writeFile(byte[][], String) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
Writes byte[][] to the file

X

XLIFF12ContentHandler - Class in org.apache.tika.parser.xliff
Content Handler for XLIFF 1.2 documents.
XLIFF12Parser - Class in org.apache.tika.parser.xliff
Parser for XLIFF 1.2 files.
XLIFF12Parser() - Constructor for class org.apache.tika.parser.xliff.XLIFF12Parser
 
XLR - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Microsoft Works Spreadsheet 7.0
XLS - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
Microsoft Excel
XLZParser - Class in org.apache.tika.parser.xliff
Parser for XLZ Archives.
XLZParser() - Constructor for class org.apache.tika.parser.xliff.XLZParser
 
XMLParser - Class in org.apache.tika.parser.xml
XML parser.
XMLParser() - Constructor for class org.apache.tika.parser.xml.XMLParser
 
XMLProfiler - Class in org.apache.tika.parser.xml
This parser enables profiling of XML.
XMLProfiler() - Constructor for class org.apache.tika.parser.xml.XMLProfiler
 
XMPMetadataExtractor - Class in org.apache.tika.parser.indesign.xmp
XMP Metadata Extractor based on Apache XmpBox.
XMPMetadataExtractor() - Constructor for class org.apache.tika.parser.indesign.xmp.XMPMetadataExtractor
 
XMPPacketScanner - Class in org.apache.tika.parser.image.xmp
This class is a parser for XMP packets.
XMPPacketScanner() - Constructor for class org.apache.tika.parser.image.xmp.XMPPacketScanner
 
xor(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
xor(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
xor(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
xorExtendedGUID(ExtendedGUID, ExtendedGUID) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AdapterHelper
XOR two ExtendedGUID instances.
XPS - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
 
XPSExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml.xps
 
XPSExtractorDecorator(ParseContext, POIXMLTextExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
 
XPSTextExtractor - Class in org.apache.tika.parser.microsoft.ooxml.xps
Currently, mostly a pass-through class to hold pkg and properties and keep the general framework similar to our other POI-integrated extractors.
XPSTextExtractor(OPCPackage) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
 
XSLFEventBasedPowerPointExtractor - Class in org.apache.tika.parser.microsoft.ooxml.xslf
 
XSLFEventBasedPowerPointExtractor(String) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
XSLFEventBasedPowerPointExtractor(OPCPackage) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
XSLFPowerPointExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
 
XSLFPowerPointExtractorDecorator(Metadata, ParseContext, XSLFPowerPointExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator
 
XSLFPowerPointExtractorDecorator(ParseContext, XSLFPowerPointExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator
Deprecated.
XSSFBExcelExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
 
XSSFBExcelExtractorDecorator(ParseContext, POIXMLTextExtractor, Locale) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
 
XSSFExcelExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
 
XSSFExcelExtractorDecorator(ParseContext, POIXMLTextExtractor, Locale) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
XSSFExcelExtractorDecorator.HeaderFooterFromString - Class in org.apache.tika.parser.microsoft.ooxml
 
XSSFExcelExtractorDecorator.SheetTextAsHTML - Class in org.apache.tika.parser.microsoft.ooxml
Turns formatted sheet events into HTML
XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer - Class in org.apache.tika.parser.microsoft.ooxml
Captures information on interesting tags, whilst delegating the main work to the formatting handler
XSSFSheetInterestingPartsCapturer(ContentHandler) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
XUserDefinedCharset - Class in org.apache.tika.parser.html.charsetdetector.charsets
 
XUserDefinedCharset() - Constructor for class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
 
XWPFEventBasedWordExtractor - Class in org.apache.tika.parser.microsoft.ooxml.xwpf
Experimental class that is based on POI's XSSFEventBasedExcelExtractor
XWPFEventBasedWordExtractor(String) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
XWPFEventBasedWordExtractor(OPCPackage) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
XWPFListManager - Class in org.apache.tika.parser.microsoft.ooxml
 
XWPFListManager(XWPFNumbering) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
 
XWPFNumberingShim - Class in org.apache.tika.parser.microsoft.ooxml.xwpf
Stub class of POI's XWPFNumbering because onDocumentRead() is protected
XWPFNumberingShim(PackagePart) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFNumberingShim
 
XWPFStylesShim - Class in org.apache.tika.parser.microsoft.ooxml.xwpf
For Tika, all we need (so far) is a mapping between styleId and a style's name.
XWPFStylesShim(PackagePart, ParseContext) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFStylesShim
 
XWPFWordExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
 
XWPFWordExtractorDecorator(Metadata, ParseContext, XWPFWordExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator
 
XWPFWordExtractorDecorator(ParseContext, XWPFWordExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator

Z

ZipContainerDetector - Class in org.apache.tika.parser.pkg
A detector that works on Zip documents and other archive and compression formats to figure out exactly what the file is.
ZipContainerDetector() - Constructor for class org.apache.tika.parser.pkg.ZipContainerDetector
 
ZipFilesChunking - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
This class is used to process zip file chunking
ZipFilesChunking(byte[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ZipFilesChunking
Initializes a new instance of the ZipFilesChunking class.
ZipHeader - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
ZipSalvager - Class in org.apache.tika.parser.utils
 
ZipSalvager() - Constructor for class org.apache.tika.parser.utils.ZipSalvager
 

Copyright © 2007–2022 The Apache Software Foundation. All rights reserved.