- canRun() - Static method in class org.apache.tika.parser.journal.GrobidRESTParser
-
- CaptionObject - Class in org.apache.tika.parser.captioning
-
A model for caption objects from graphics and texts; it typically includes a
human-readable sentence, the language of the sentence, and a confidence score.
- CaptionObject(String, String, double) - Constructor for class org.apache.tika.parser.captioning.CaptionObject
-
- cb - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
-
- Cell - Interface in org.apache.tika.parser.microsoft
-
Cell of content.
- cell(String, String, XSSFComment) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
-
- CellDecorator - Class in org.apache.tika.parser.microsoft
-
Cell decorator.
- CellDecorator(Cell) - Constructor for class org.apache.tika.parser.microsoft.CellDecorator
-
- CellID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
- CellID(ExGuid, ExGuid) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
-
Initializes a new instance of the CellID class with specified ExGuids.
- CellID(CellID) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
-
Initializes a new instance of the CellID class; this is the copy constructor.
- CellID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
-
Initializes a new instance of the CellID class; this is the default constructor.
- cellID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
-
- cellID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
-
- CellIDArray - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
- CellIDArray(long, List<CellID>) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
-
Initializes a new instance of the CellIDArray class.
- CellIDArray(CellIDArray) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
-
Initializes a new instance of the CellIDArray class; this is the copy constructor.
- CellIDArray() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
-
Initializes a new instance of the CellIDArray class; this is the default constructor.
- cellIDArray - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
-
- cellIDArray - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
-
- CellManifestCurrentRevision - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
- CellManifestCurrentRevision() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
-
Initializes a new instance of the CellManifestCurrentRevision class.
- cellManifestCurrentRevision - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
-
- cellManifestCurrentRevisionExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
-
- CellManifestDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Cell manifest data element
- CellManifestDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
-
Initializes a new instance of the CellManifestDataElementData class.
- cellManifests - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
-
- cellMappingExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
-
- cellMappingSerialNumber - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
-
- cellReferencesCount - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
-
- cellReferencesCount - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
-
- CellSecondExGuid - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.mif.MIFContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- CharsetDetector - Class in org.apache.tika.parser.txt
-
CharsetDetector provides a facility for detecting the
charset or encoding of character data in an unknown format.
- CharsetDetector() - Constructor for class org.apache.tika.parser.txt.CharsetDetector
-
Constructor
- CharsetDetector(int) - Constructor for class org.apache.tika.parser.txt.CharsetDetector
-
- CharsetMatch - Class in org.apache.tika.parser.txt
-
This class represents a charset that has been identified by a CharsetDetector
as a possible encoding for a set of input data.
- check(Metadata) - Method in class org.apache.tika.parser.pdf.AccessChecker
-
- checkAvail() - Method in class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
Ping lucene-geo-gazetteer API
- checkBit(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
- CHM_ITSF_V2_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_ITSF_V3_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_ITSP_V1_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_LZXC_MIN_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_LZXC_RESETTABLE_V1_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_LZXC_V2_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_PMGI_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_PMGI_MARKER - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_PMGL_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_SIGNATURE_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_VER_1 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_VER_2 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_VER_3 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_WINDOW_SIZE_BLOCK - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- ChmAccessor<T> - Interface in org.apache.tika.parser.chm.accessor
-
Defines an accessor interface
- ChmAssert - Class in org.apache.tika.parser.chm.assertion
-
Contains chm extractor assertions
- ChmAssert() - Constructor for class org.apache.tika.parser.chm.assertion.ChmAssert
-
- ChmBlockInfo - Class in org.apache.tika.parser.chm.lzx
-
A container that contains chm block information such as: i.
- ChmCommons - Class in org.apache.tika.parser.chm.core
-
- ChmCommons.EntryType - Enum in org.apache.tika.parser.chm.core
-
Represents entry types: uncompressed, compressed
- ChmCommons.IntelState - Enum in org.apache.tika.parser.chm.core
-
Represents intel file states during decompression
- ChmCommons.LzxState - Enum in org.apache.tika.parser.chm.core
-
Represents lzx states: started decoding, not started decoding
- ChmConstants - Class in org.apache.tika.parser.chm.core
-
- ChmDirectoryListingSet - Class in org.apache.tika.parser.chm.accessor
-
Holds chm listing entries
- ChmDirectoryListingSet(byte[], ChmItsfHeader, ChmItspHeader) - Constructor for class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Constructs chm directory listing set
- ChmExtractor - Class in org.apache.tika.parser.chm.core
-
Extracts text from chm file.
- ChmExtractor(InputStream) - Constructor for class org.apache.tika.parser.chm.core.ChmExtractor
-
- ChmItsfHeader - Class in org.apache.tika.parser.chm.accessor
-
The Header:
0000: char[4] 'ITSF'
0004: DWORD 3 (Version number)
0008: DWORD Total header length, including header section table and following data.
- ChmItsfHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
- ChmItspHeader - Class in org.apache.tika.parser.chm.accessor
-
Directory header. The directory starts with a header; its format is as follows:
0000: char[4] 'ITSP'
0004: DWORD Version number 1
0008: DWORD Length of the directory header
000C: DWORD $0a (unknown)
0010: DWORD $1000 Directory chunk size
0014: DWORD "Density" of quickref section, usually 2
0018: DWORD Depth of the index tree: 1 if there is no index, 2 if there is one level of PMGI chunks
001C: DWORD Chunk number of root index chunk, -1 if there is none (though at least one file has 0 despite there being no index chunk, probably a bug)
0020: DWORD Chunk number of first PMGL (listing) chunk
0024: DWORD Chunk number of last PMGL (listing) chunk
0028: DWORD -1 (unknown)
002C: DWORD Number of directory chunks (total)
0030: DWORD Windows language ID
0034: GUID {5D02926A-212E-11D0-9DF9-00A0C922E6EC}
0044: DWORD $54 (This is the length again)
0048: DWORD -1 (unknown)
004C: DWORD -1 (unknown)
0050: DWORD -1 (unknown)
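The fixed offsets in the ITSP description lend themselves to a direct read. The following is a minimal sketch, not Tika's `ChmItspHeader`: the class `ItspHeaderSketch` and its field names are hypothetical, and it assumes that DWORDs are stored little-endian and that the header occupies 0x54 bytes, as the listed offsets suggest.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical reader for a few ITSP fields; not Tika's ChmItspHeader. */
final class ItspHeaderSketch {
    static final int HEADER_LENGTH = 0x54; // the last DWORD starts at 0x50

    final int version;        // offset 0x04
    final int headerLength;   // offset 0x08
    final int chunkSize;      // offset 0x10
    final int indexDepth;     // offset 0x18
    final int rootIndexChunk; // offset 0x1C, -1 if there is none

    private ItspHeaderSketch(ByteBuffer buf) {
        version = buf.getInt(0x04);
        headerLength = buf.getInt(0x08);
        chunkSize = buf.getInt(0x10);
        indexDepth = buf.getInt(0x18);
        rootIndexChunk = buf.getInt(0x1C);
    }

    static ItspHeaderSketch parse(byte[] data) {
        if (data.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("need at least 0x54 bytes");
        }
        String signature = new String(data, 0, 4, StandardCharsets.US_ASCII);
        if (!"ITSP".equals(signature)) {
            throw new IllegalArgumentException("not an ITSP header");
        }
        // Assumption: DWORDs are stored little-endian.
        return new ItspHeaderSketch(ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN));
    }

    public static void main(String[] args) {
        // Build a synthetic header purely for demonstration.
        ByteBuffer sample = ByteBuffer.allocate(HEADER_LENGTH).order(ByteOrder.LITTLE_ENDIAN);
        sample.put("ITSP".getBytes(StandardCharsets.US_ASCII));
        sample.putInt(0x04, 1).putInt(0x08, HEADER_LENGTH).putInt(0x10, 0x1000);
        sample.putInt(0x18, 1).putInt(0x1C, -1);
        ItspHeaderSketch header = parse(sample.array());
        System.out.println("chunk size = " + header.chunkSize);
    }
}
```

Absolute `getInt(offset)` calls keep each read tied to the offset given in the layout, so fields can be added or skipped without tracking a cursor.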
- ChmItspHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
- ChmLzxBlock - Class in org.apache.tika.parser.chm.lzx
-
Decompresses a chm block.
- ChmLzxBlock(int, byte[], long, ChmLzxBlock) - Constructor for class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- ChmLzxcControlData - Class in org.apache.tika.parser.chm.accessor
-
::DataSpace/Storage//ControlData This file contains $20 bytes of
information on the compression.
- ChmLzxcControlData() - Constructor for class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
- ChmLzxcResetTable - Class in org.apache.tika.parser.chm.accessor
-
LZXC reset table For ensuring a decompression.
- ChmLzxcResetTable() - Constructor for class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
- ChmLzxState - Class in org.apache.tika.parser.chm.lzx
-
- ChmLzxState(int) - Constructor for class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- ChmParser - Class in org.apache.tika.parser.chm
-
- ChmParser() - Constructor for class org.apache.tika.parser.chm.ChmParser
-
- ChmParsingException - Exception in org.apache.tika.parser.chm.exception
-
- ChmParsingException(String) - Constructor for exception org.apache.tika.parser.chm.exception.ChmParsingException
-
- ChmPmgiHeader - Class in org.apache.tika.parser.chm.accessor
-
Description. Note: does not always exist. An index chunk has the following format:
0000: char[4] 'PMGI'
0004: DWORD Length of quickref/free area at end of directory chunk
0008: Directory index entries (to quickref/free area)
The quickref area in a PMGI is the same as in a PMGL. The format of a directory
index entry is as follows:
BYTE: length of name
BYTEs: name (UTF-8 encoded)
ENCINT: directory listing chunk which starts with name
Encoded Integers aka ENCINT: An ENCINT is a variable-length integer.
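To make the ENCINT field concrete, here is a minimal decoding sketch. It assumes the encoding commonly described for the CHM format: each byte carries 7 payload bits, a set high bit means another byte follows, and the most significant group comes first. The class `EncIntSketch` is hypothetical and is not taken from Tika.

```java
/** Hypothetical ENCINT decoder; assumes 7 payload bits per byte, most significant group first. */
final class EncIntSketch {
    private static final int MAX_BYTES = 9; // 9 * 7 = 63 bits, fits in a non-negative long

    /** Decodes one ENCINT that starts at the given offset and returns its value. */
    static long decode(byte[] data, int offset) {
        long value = 0;
        for (int i = offset; i < data.length && i - offset < MAX_BYTES; i++) {
            int b = data[i] & 0xFF;
            value = (value << 7) | (b & 0x7F); // append the 7 payload bits
            if ((b & 0x80) == 0) {
                return value;                  // high bit clear: last byte
            }
        }
        throw new IllegalArgumentException("truncated or oversized ENCINT");
    }

    public static void main(String[] args) {
        byte[] encoded = {(byte) 0x82, 0x2C};
        System.out.println(decode(encoded, 0)); // prints 300
    }
}
```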
- ChmPmgiHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
- ChmPmglHeader - Class in org.apache.tika.parser.chm.accessor
-
Description There are two types of directory chunks -- index chunks, and
listing chunks.
- ChmPmglHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- ChmSection - Class in org.apache.tika.parser.chm.lzx
-
- ChmSection(byte[]) - Constructor for class org.apache.tika.parser.chm.lzx.ChmSection
-
- ChmSection(byte[], byte[]) - Constructor for class org.apache.tika.parser.chm.lzx.ChmSection
-
- ChmWrapper - Class in org.apache.tika.parser.chm.core
-
- ChmWrapper() - Constructor for class org.apache.tika.parser.chm.core.ChmWrapper
-
- chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.AbstractChunking
-
This method is used to chunk the file data.
- chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.RDCAnalysisChunking
-
This method is used to chunk the file data.
- chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.SimpleChunking
-
This method is used to chunk the file data.
- chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ZipFilesChunking
-
This method is used to chunk the file data.
- ChunkingFactory - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
-
This class is used to create instance of AbstractChunking.
- ChunkingMethod - Enum in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
-
- ClassParser - Class in org.apache.tika.parser.asm
-
Parser for Java .class files.
- ClassParser() - Constructor for class org.apache.tika.parser.asm.ClassParser
-
- clearBit(byte[], long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.Bit
-
Sets the bit at the specified bit position in the specified byte array to "Off".
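Clearing a single bit in a byte array comes down to one mask operation. The sketch below is illustrative only: `BitSketch` is a hypothetical class, and it assumes that bit position 0 is the least significant bit of the first byte, which the index entry does not state.

```java
/** Hypothetical helper; assumes bit 0 is the least significant bit of array[0]. */
final class BitSketch {
    /** Sets the bit at the given position to 0 ("Off"). */
    static void clearBit(byte[] array, long bit) {
        int index = (int) (bit / 8);     // which byte holds the bit
        int mask = 1 << (int) (bit % 8); // position inside that byte
        array[index] &= (byte) ~mask;    // keep every bit except the masked one
    }

    public static void main(String[] args) {
        byte[] flags = {(byte) 0xFF, (byte) 0xFF};
        clearBit(flags, 9); // second byte, second-lowest bit
        System.out.printf("%02X %02X%n", flags[0] & 0xFF, flags[1] & 0xFF); // prints FF FD
    }
}
```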
- clone() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- closeStyleTags(XHTMLContentHandler, Deque<FormattingUtils.Tag>) - Static method in class org.apache.tika.parser.microsoft.FormattingUtils
-
Closes all formatting tags.
- CommonsDigester - Class in org.apache.tika.parser.utils
-
Implementation of DigestingParser.Digester
that relies on commons.codec.digest.DigestUtils to calculate digest hashes.
- CommonsDigester(int, String) - Constructor for class org.apache.tika.parser.utils.CommonsDigester
-
Include a string representing the comma-separated algorithms to run: e.g.
- CommonsDigester(int, CommonsDigester.DigestAlgorithm...) - Constructor for class org.apache.tika.parser.utils.CommonsDigester
-
- CommonsDigester.DigestAlgorithm - Enum in org.apache.tika.parser.utils
-
- COMP_OBJ - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Some other kind of embedded document, in a CompObj container within another OLE2 document
- Compact64bitInt - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
A 9-byte encoding of values in the range 0x0002000000000000 through 0xFFFFFFFFFFFFFFFF
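The 9-byte form is the widest of the compact unsigned 64-bit encodings whose type constants appear further down this index (zero, 7, 14, 21, 28, 35, 42, 49 and 64 bits). The sketch below follows one reading of [MS-FSSHTTPB]: an n-byte form (n from 1 to 7) stores a type marker of `1 << (n - 1)` in the low n bits, followed by 7n value bits, little-endian, and the 9-byte form is a marker byte 0x80 followed by the 8 value bytes. Treat that layout as an assumption; `CompactUint64Sketch` is a hypothetical class and not Tika's `Compact64bitInt`.

```java
/** Hypothetical encoder; the bit layout is an assumption based on one reading of MS-FSSHTTPB. */
final class CompactUint64Sketch {
    static byte[] encode(long value) {
        if (value == 0) {
            return new byte[] {0};                          // "zero" form: a single 0x00 byte
        }
        int bits = 64 - Long.numberOfLeadingZeros(value);   // significant bits, value treated as unsigned
        if (bits > 49) {                                    // 0x0002000000000000 and above
            byte[] out = new byte[9];
            out[0] = (byte) 0x80;                           // 8-bit type marker
            for (int i = 0; i < 8; i++) {
                out[1 + i] = (byte) (value >>> (8 * i));    // little-endian payload
            }
            return out;
        }
        int n = (bits + 6) / 7;                             // each byte contributes 7 value bits
        long encoded = (value << n) | (1L << (n - 1));      // value above the n-bit type marker
        byte[] out = new byte[n];
        for (int i = 0; i < n; i++) {
            out[i] = (byte) (encoded >>> (8 * i));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(encode(0x7FL).length);                // 1
        System.out.println(encode(0x80L).length);                // 2
        System.out.println(encode(0x0002000000000000L).length);  // 9
    }
}
```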
- Compact64bitInt(long) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Initializes a new instance of the Compact64bitInt class with specified value.
- Compact64bitInt() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Initializes a new instance of the Compact64bitInt class; this is the default constructor.
- CompactID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
This class is used to represent the CompactID structure.
- CompactID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
-
- CompactUint14bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 14 bits type value.
- CompactUint21bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 21 bits type value.
- CompactUint28bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 28 bits type value.
- CompactUint35bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 35 bits type value.
- CompactUint42bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 42 bits type value.
- CompactUint49bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 49 bits type value.
- CompactUint64bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 64 bits type value.
- CompactUint7bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 7 bits type value.
- CompactUintNullType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint zero type value.
- compare(long, long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
- compareTo(CSVResult) - Method in class org.apache.tika.parser.csv.CSVResult
-
Sorts in descending order of confidence
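A descending natural order is obtained by swapping the operands of the comparison. The following sketch uses a hypothetical `ScoredResult` class as a stand-in for `CSVResult`, whose actual fields are not shown in this index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a result type that sorts by descending confidence. */
final class ScoredResult implements Comparable<ScoredResult> {
    final String label;
    final double confidence;

    ScoredResult(String label, double confidence) {
        this.label = label;
        this.confidence = confidence;
    }

    @Override
    public int compareTo(ScoredResult other) {
        // Operands are swapped so that higher confidence sorts first.
        return Double.compare(other.confidence, this.confidence);
    }

    public static void main(String[] args) {
        List<ScoredResult> results = new ArrayList<>();
        results.add(new ScoredResult("tab", 0.2));
        results.add(new ScoredResult("comma", 0.9));
        Collections.sort(results);
        System.out.println(results.get(0).label); // prints comma
    }
}
```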
- compareTo(ExtendedGUID) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
-
- compareTo(UByte) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
- compareTo(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
- compareTo(ULong) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
- compareTo(UShort) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
- compareTo(GUID) - Method in class org.apache.tika.parser.microsoft.onenote.GUID
-
- compareTo(CharsetMatch) - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Compare to other CharsetMatch objects.
- CompositeTagHandler - Class in org.apache.tika.parser.mp3
-
Takes an array of ID3Tags in preference order and, when asked for a given tag,
returns it from the first ID3Tags that has it.
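The first-match lookup described here can be sketched in a few lines. The `TagSource` interface and `FirstMatchTags` class below are hypothetical stand-ins for `ID3Tags` and `CompositeTagHandler`; the sketch also assumes that a missing tag is reported as `null`.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for ID3Tags with a single accessor. */
interface TagSource {
    String getTitle(); // assumed to return null when the tag is absent
}

/** Returns each tag from the first source, in preference order, that has it. */
final class FirstMatchTags implements TagSource {
    private final List<TagSource> sources;

    FirstMatchTags(TagSource... sourcesInPreferenceOrder) {
        this.sources = Arrays.asList(sourcesInPreferenceOrder);
    }

    @Override
    public String getTitle() {
        for (TagSource source : sources) {
            String title = source.getTitle();
            if (title != null) {
                return title; // first source that has the tag wins
            }
        }
        return null;          // no source has it
    }

    public static void main(String[] args) {
        TagSource newer = () -> null;        // preferred source, but no title
        TagSource older = () -> "Old Title"; // fallback source
        System.out.println(new FirstMatchTags(newer, older).getTitle()); // prints Old Title
    }
}
```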
- CompositeTagHandler(ID3Tags[]) - Constructor for class org.apache.tika.parser.mp3.CompositeTagHandler
-
- compound - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
-
Gets or sets a value that, if set, specifies that a compound parse type is needed and that the
stream object MUST be ended with either an 8-bit stream object header end or a 16-bit stream object header end.
- CompressorParser - Class in org.apache.tika.parser.pkg
-
Parser for various compression formats.
- CompressorParser() - Constructor for class org.apache.tika.parser.pkg.CompressorParser
-
- CompressorParserOptions - Interface in org.apache.tika.parser.pkg
-
Interface for setting options for the CompressorParser, passed via the ParseContext.
- confidence - Variable in class org.apache.tika.parser.recognition.RecognisedObject
-
Confidence score
- config - Variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- configure(ParseContext) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- configure(PDF2XHTML) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Configures the given pdf2XHTML.
- configureExtractor(POIXMLTextExtractor, Locale) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
-
- configureExtractor(POIXMLTextExtractor, Locale) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
-
- contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
-
- containsEmail(String) - Static method in class org.apache.tika.parser.mail.MailUtil
-
If the chunk looks like it contains an email
- CONTENT - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- content - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
-
- content - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
-
- content - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
-
Gets or sets an extended GUID array
- contextIDs - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
-
- CONTROL_DATA - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- converttoInt(byte[]) - Static method in class org.apache.tika.parser.image.ICNSType
-
- convertToJSONArray(JSONObject, String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
Converts JSON Object to JSON Array
- convertToJSONObject(String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
Parses a JSON String and converts it to a JSON Object
- copyOfRange(byte[], int, int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
- CoreNLPNERecogniser - Class in org.apache.tika.parser.ner.corenlp
-
This class offers an implementation of NERecogniser based on
CRF classifiers from Stanford CoreNLP.
- CoreNLPNERecogniser() - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
- CoreNLPNERecogniser(String) - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
Creates a NERecogniser by loading model from given path
- count - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
-
- count - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
-
- count - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
-
- cProperties - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
-
- cProperties - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
-
- createCellMainifestDataElement(ExGuid, Map<CellID, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to create the cell manifest data element.
- createChunkingInstance(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingFactory
-
This method is used to create the instance of AbstractChunking.
- createChunkingInstance(IntermediateNodeObject) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingFactory
-
This method is used to create the instance of AbstractChunking.
- createChunkingInstance(byte[], ChunkingMethod) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingFactory
-
This method is used to create the instance of AbstractChunking.
- createDecryptStream(InputStream, Key) - Method in class org.apache.tika.parser.hwp.HwpTextExtractorV5
-
- createFrameIfPresent(InputStream) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Returns the next ID3v2 Frame in the file, or null if the next batch of data
doesn't correspond to an ID3v2 header.
- createInstance(ObjectGroupDataElementData) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.HeaderCell
-
Create the instance of Header Cell.
- createInstance(ExGuid, ObjectGroupDataElementData, boolean) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
-
- createObjectGroupDataElement(byte[], AtomicReference<ExGuid>, List<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to create object group data/blob element list.
- createOneNoteDocumentFromDirectFileResource(OneNoteDirectFileResource) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
-
Create a OneNoteDocument object.
- createRevisionManifestDataElement(ExGuid, ExGuid, List<ExGuid>, Map<ExGuid, ExGuid>, AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to create the revision manifest data element.
- createStorageIndexDataElement(ExGuid, Map<CellID, ExGuid>, Map<ExGuid, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to create the storage index data element.
- createStorageManifestDataElement(Map<CellID, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to create the storage manifest data element.
- CSVParams - Class in org.apache.tika.parser.csv
-
- CSVResult - Class in org.apache.tika.parser.csv
-
- CSVResult(double, MediaType, Character) - Constructor for class org.apache.tika.parser.csv.CSVResult
-
- CTAKES_META_PREFIX - Static variable in class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
- CTAKESAnnotationProperty - Enum in org.apache.tika.parser.ctakes
-
This enumeration includes the properties that an IdentifiedAnnotation object can provide.
- CTAKESConfig - Class in org.apache.tika.parser.ctakes
-
- CTAKESConfig() - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
-
Default constructor.
- CTAKESConfig(InputStream) - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
-
Loads properties from InputStream and then tries to close InputStream.
- CTAKESContentHandler - Class in org.apache.tika.parser.ctakes
-
Class used to extract biomedical information while parsing.
- CTAKESContentHandler(ContentHandler, Metadata, CTAKESConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
- CTAKESContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
- CTAKESContentHandler() - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
Default constructor.
- CTAKESParser - Class in org.apache.tika.parser.ctakes
-
CTAKESParser decorates a Parser and leverages CTAKESContentHandler to extract
biomedical information from clinical text using Apache cTAKES.
- CTAKESParser() - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
-
Wraps the default Parser
- CTAKESParser(TikaConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
-
Wraps the default Parser for this Config
- CTAKESParser(Parser) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
-
Wraps the specified Parser
- CTAKESSerializer - Enum in org.apache.tika.parser.ctakes
-
Enumeration for types of cTAKES (UIMA) CAS serializer supported by cTAKES.
- CTAKESUtils - Class in org.apache.tika.parser.ctakes
-
This class provides methods to extract biomedical information from plain text
using CTAKESContentHandler, which relies on Apache cTAKES.
- CTAKESUtils() - Constructor for class org.apache.tika.parser.ctakes.CTAKESUtils
-
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
-
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
-
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
-
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
-
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
-
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
-
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
-
Gets or sets a binary item as specified in [MS-FSSHTTPB] section 2.2.1.3 that specifies a
value that is unique to the file data represented by this root node object.
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
-
- data - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
-
- DataElement - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
- DataElement(DataElementType, DataElementData) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
Initializes a new instance of the DataElement class.
- DataElement() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
Initializes a new instance of the DataElement class.
- DataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Base class of data element
- DataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementData
-
- dataElementExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
- DataElementHash - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies a data element hash stream object
- DataElementHash() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
-
Initializes a new instance of the DataElementHash class.
- dataElementHash - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
-
- dataElementHashData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
-
- dataElementHashScheme - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
-
- dataElementPackage - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
-
- DataElementPackage - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
- DataElementPackage() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
-
Initializes a new instance of the DataElementPackage class.
- DataElementParseErrorException - Exception in org.apache.tika.parser.microsoft.onenote.fsshttpb.exception
-
- DataElementParseErrorException(int, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.exception.DataElementParseErrorException
-
- DataElementParseErrorException(int, String, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.exception.DataElementParseErrorException
-
- dataElements - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
-
- DataElementType - Enum in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
The enumeration of the data element types.
- dataElementType - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
- DataElementUtils - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
-
- DataElementUtils() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
- dataHash - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
-
- DataHashObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
- DataHashObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
-
Initializes a new instance of the DataHashObject class.
- DataNodeObjectData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
Data Node Object data
- DataNodeObjectData(byte[], int, int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataNodeObjectData
-
Initializes a new instance of the DataNodeObjectData class.
- dataNodeObjectData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
-
- dataRoot - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
-
- dataSize - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
-
- dataSize - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
-
- DataSizeObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Data Size Object
- DataSizeObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
-
Initializes a new instance of the DataSizeObject class.
- DataURIScheme - Class in org.apache.tika.parser.utils
-
- DataURISchemeParseException - Exception in org.apache.tika.parser.utils
-
- DataURISchemeParseException(String) - Constructor for exception org.apache.tika.parser.utils.DataURISchemeParseException
-
- DataURISchemeUtil - Class in org.apache.tika.parser.utils
-
Not thread safe.
- DataURISchemeUtil() - Constructor for class org.apache.tika.parser.utils.DataURISchemeUtil
-
- DATE - Static variable in interface org.apache.tika.parser.ner.NERecogniser
-
- DATE_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- DBFParser - Class in org.apache.tika.parser.dbf
-
This is a Tika wrapper around the DBFReader.
- DBFParser() - Constructor for class org.apache.tika.parser.dbf.DBFParser
-
- DcXMLParser - Class in org.apache.tika.parser.xml
-
Dublin Core metadata parser
- DcXMLParser() - Constructor for class org.apache.tika.parser.xml.DcXMLParser
-
- decompressConcatenated(Metadata) - Method in interface org.apache.tika.parser.pkg.CompressorParserOptions
-
- DEF_MODEL - Static variable in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
- DEFAULT_CHARSET - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- DEFAULT_MODEL_PATH - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
Default model path.
- DEFAULT_MODELS - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- DEFAULT_NER_IMPL - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- DefaultHtmlMapper - Class in org.apache.tika.parser.html
-
The default HTML mapping rules in Tika.
- DefaultHtmlMapper() - Constructor for class org.apache.tika.parser.html.DefaultHtmlMapper
-
- DELIMITER_PROPERTY - Static variable in class org.apache.tika.parser.csv.TextAndCSVParser
-
- deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
-
Used to return the length of this element.
- deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementData
-
De-serialize data element data from byte array.
- deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
-
Used to return the length of this element.
- deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
-
Used to return the length of this element.
- deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
-
Used to de-serialize the data element.
- deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
-
Used to de-serialize data element.
- deserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
-
Used to return the length of this element.
- deserializeFromByteArray(StreamObjectHeaderStart, byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Used to return the length of this element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadata
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadataDeclarations
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
-
Used to de-serialize the items.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
-
Used to de-serialize the items.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
-
Used to de-serialize the items.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
-
Used to de-serialize the items.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
-
Used to de-serialize the items.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
De-serialize items from byte array.
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.apple.BPListDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
-
- detect(ZipFile) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
- detect(ZipFile) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
-
- detect(Set<String>) - Static method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
- detect(Set<String>, DirectoryEntry) - Static method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Internal detection of the specific kind of OLE2 document, based on the
names of the top-level streams within the file.
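As a rough illustration of detection by top-level stream name, the sketch below maps a few widely known OLE2 stream names to media types. It is not the POIFSContainerDetector implementation: the class name `Ole2StreamNameSketch` and the method `guessType` are hypothetical, and the real detector recognises many more formats.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch, not Tika code: guesses a media type from the
// names of the top-level streams of an OLE2 container.
final class Ole2StreamNameSketch {

    static Optional<String> guessType(Set<String> streamNames) {
        // "WordDocument" is the main stream of a binary Word file.
        if (streamNames.contains("WordDocument")) {
            return Optional.of("application/msword");
        }
        // Binary Excel files store their content in "Workbook" (or the older "Book").
        if (streamNames.contains("Workbook") || streamNames.contains("Book")) {
            return Optional.of("application/vnd.ms-excel");
        }
        // Binary PowerPoint files use a stream named "PowerPoint Document".
        if (streamNames.contains("PowerPoint Document")) {
            return Optional.of("application/vnd.ms-powerpoint");
        }
        // Unknown container: leave the decision to the caller.
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(guessType(Set.of("WordDocument", "1Table")));
        System.out.println(guessType(Set.of("SomethingElse")));
    }
}
```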
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.pkg.StreamingZipContainerDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.pkg.ZipContainerDetector
-
- detect() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Return the charset that best matches the supplied input data.
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
-
- detectAll() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Return an array of all charsets that appear to be plausible
matches with the input data.
- detectIfPossible(ZipEntry) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
- detectIfPossible(ZipEntry) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
-
- detectOfficeOpenXML(OPCPackage) - Static method in class org.apache.tika.parser.pkg.ZipContainerDetector
-
Detects the type of an OfficeOpenXML (OOXML) file from
an opened Package.
- detectType(ZipArchiveEntry, ZipFile) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- detectType(ZipArchiveEntry, ZipArchiveInputStream) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- detectType(InputStream) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- detectType(POIFSFileSystem) - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
- detectType(DirectoryEntry) - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
- DIFContentHandler - Class in org.apache.tika.parser.dif
-
- DIFContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.dif.DIFContentHandler
-
- DIFParser - Class in org.apache.tika.parser.dif
-
- DIFParser() - Constructor for class org.apache.tika.parser.dif.DIFParser
-
- DirectoryListingEntry - Class in org.apache.tika.parser.chm.accessor
-
The format of a directory listing entry is as follows:
BYTE: length of name
BYTEs: name (UTF-8 encoded)
ENCINT: content section
ENCINT: offset
ENCINT: length
The offset is from the beginning of the content section the file is
in, after the section has been decompressed (if appropriate).
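The layout described above can be sketched as a small standalone reader. This is an illustration only, not the DirectoryListingEntry code: the class `ChmDirectoryEntrySketch` and its methods are hypothetical, and the ENCINT decoding assumes the encoding given in the commonly cited unofficial CHM format notes (7 data bits per byte, most significant group first, high bit set when another byte follows).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Tika code: reads one directory listing entry
// laid out as name length, name, then three ENCINT values.
final class ChmDirectoryEntrySketch {
    final String name;
    final long section;
    final long offset;
    final long length;

    private ChmDirectoryEntrySketch(String name, long section, long offset, long length) {
        this.name = name;
        this.section = section;
        this.offset = offset;
        this.length = length;
    }

    // Reads one ENCINT starting at pos[0] and advances pos[0] past it.
    // Assumed encoding: high bit = "more bytes follow", low 7 bits = data.
    static long readEncInt(byte[] data, int[] pos) {
        long value = 0;
        int b;
        do {
            b = data[pos[0]++] & 0xFF;
            value = (value << 7) | (b & 0x7F);
        } while ((b & 0x80) != 0);
        return value;
    }

    static ChmDirectoryEntrySketch parse(byte[] data, int start) {
        int[] pos = {start};
        int nameLength = data[pos[0]++] & 0xFF;   // BYTE: length of name
        String name = new String(data, pos[0], nameLength, StandardCharsets.UTF_8);
        pos[0] += nameLength;                     // BYTEs: name
        long section = readEncInt(data, pos);     // ENCINT: content section
        long offset = readEncInt(data, pos);      // ENCINT: offset
        long length = readEncInt(data, pos);      // ENCINT: length
        return new ChmDirectoryEntrySketch(name, section, offset, length);
    }

    public static void main(String[] args) {
        // Name "/a", section 1, offset encoded in two bytes, length 5.
        byte[] raw = {2, '/', 'a', 0x01, (byte) 0x81, 0x00, 0x05};
        ChmDirectoryEntrySketch e = parse(raw, 0);
        System.out.println(e.name + " " + e.section + " " + e.offset + " " + e.length);
    }
}
```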
- DirectoryListingEntry() - Constructor for class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- DirectoryListingEntry(int, String, ChmCommons.EntryType, int, int) - Constructor for class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Constructs a DirectoryListingEntry.
- dispose() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
Assign the internal read buffer to null.
- DOC - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Microsoft Word
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.ArrayNumber
-
This method is used to deserialize the number of array from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
-
This method is used to deserialize the EightBytesOfData from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
-
This method is used to deserialize the FourBytesOfData from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in interface org.apache.tika.parser.microsoft.onenote.fsshttpb.property.IProperty
-
This method is used to deserialize the property from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.NoData
-
This method is used to deserialize the NoData from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
-
This method is used to deserialize the OneByteOfData from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
-
This method is used to deserialize the prtArrayOfPropertyValues from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
-
This method is used to deserialize the prtFourBytesOfLengthFollowedByData from
the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
-
This method is used to deserialize the TwoBytesOfData from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
-
This method is used to deserialize the Alternative Packaging object from the specified byte
array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
-
Used to return the length of this element.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
-
This method is used to de-serialize the BinaryItem basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
-
This method is used to deserialize the CellID basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
-
This method is used to deserialize the CellIDArray basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
This method is used to deserialize the Compact64bitInt basic object from the specified byte
array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
-
This method is used to deserialize the CompactID object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
This method is used to deserialize the ExGuid basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
-
This method is used to deserialize the ExGUIDArray basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
-
This method is used to deserialize the JCID object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
-
This method is used to deserialize the PropertyID object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
-
This method is used to deserialize the SerialNumber basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
-
This method is used to deserialize the PropertySet from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
-
This method is used to deserialize the ObjectSpaceObjectPropSet from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
-
This method is used to deserialize the ObjectSpaceObjectStreamHeader object from
the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
-
This method is used to deserialize the ObjectSpaceObjectStreamOfContextIDs object
from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
-
This method is used to deserialize the ObjectSpaceObjectStreamOfOIDs object from the
specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
-
This method is used to deserialize the ObjectSpaceObjectStreamOfOSIDs object from the
specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
-
This method is used to deserialize the StreamObjectHeaderEnd16bit basic object from the
specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
-
This method is used to deserialize the StreamObjectHeaderEnd8bit basic object from the
specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
-
This method is used to deserialize the StreamObjectHeaderStart16bit basic object from the
specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
-
This method is used to deserialize the StreamObjectHeaderStart32bit basic object
from the specified byte array and start index.
- doubleByte - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.TextEncoding
-
- doubleToInt64Bits(double) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
-
- doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
- doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
- doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
- doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
- drawingHyperlinks - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- DWGParser - Class in org.apache.tika.parser.dwg
-
DWG (CAD Drawing) parser.
- DWGParser() - Constructor for class org.apache.tika.parser.dwg.DWGParser
-
- GDALParser - Class in org.apache.tika.parser.gdal
-
- GDALParser() - Constructor for class org.apache.tika.parser.gdal.GDALParser
-
- GENERAL_EMBEDDED - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
General embedded document type within an OLE2 container
- GENRES - Static variable in interface org.apache.tika.parser.mp3.ID3Tags
-
List of predefined genres.
- GeoGazetteerClient - Class in org.apache.tika.parser.geo.topic.gazetteer
-
- GeoGazetteerClient(String) - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
Pass URL on which lucene-geo-gazetteer is available - eg.
- GeoGazetteerClient(GeoParserConfig) - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
- GeographicInformationParser - Class in org.apache.tika.parser.geoinfo
-
- GeographicInformationParser() - Constructor for class org.apache.tika.parser.geoinfo.GeographicInformationParser
-
- geoInfoType - Static variable in class org.apache.tika.parser.geoinfo.GeographicInformationParser
-
- GeoParser - Class in org.apache.tika.parser.geo.topic
-
- GeoParser() - Constructor for class org.apache.tika.parser.geo.topic.GeoParser
-
- GeoParserConfig - Class in org.apache.tika.parser.geo.topic
-
- GeoParserConfig() - Constructor for class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- GeoTag - Class in org.apache.tika.parser.geo.topic
-
- GeoTag() - Constructor for class org.apache.tika.parser.geo.topic.GeoTag
-
- get() - Method in enum org.apache.tika.parser.strings.StringsEncoding
-
- get7BitsInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
AKA a Synchsafe integer.
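For readers unfamiliar with the term, a synchsafe integer in ID3v2 stores 7 significant bits per byte, with the top bit of every byte left at zero. The sketch below decodes a four-byte value; the class `SynchsafeSketch` and its `decode` method are hypothetical names for illustration and are not the ID3v2Frame implementation.

```java
// Hypothetical sketch, not Tika code: decodes a 4-byte ID3v2 synchsafe integer.
final class SynchsafeSketch {

    static int decode(byte[] data, int offset) {
        if (offset < 0 || data.length - offset < 4) {
            throw new IllegalArgumentException("need 4 bytes at offset " + offset);
        }
        int value = 0;
        for (int i = 0; i < 4; i++) {
            // Keep only the low 7 bits of each byte, most significant byte first.
            value = (value << 7) | (data[offset + i] & 0x7F);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] size = {0x00, 0x00, 0x02, 0x01};
        System.out.println(decode(size, 0)); // (2 << 7) | 1 = 257
    }
}
```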
- getAccessChecker() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getAdmin1Code() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getAdmin2Code() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getAeDescriptorPath() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the path to XML descriptor for AnalysisEngine.
- getAlbum() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getAlbum() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getAlbumArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The Artist for the overall album / compilation of albums
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have album-wide artists,
so this returns null.
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getAlignedLenTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getAlignedTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getAllDetectableCharsets() - Static method in class org.apache.tika.parser.txt.CharsetDetector
-
Get the names of all charsets supported by CharsetDetector class.
- getAllNameEntitiesfromInput(InputStream) - Method in class org.apache.tika.parser.geo.topic.NameEntityExtractor
-
- getAllTagHandlers(InputStream, ContentHandler) - Static method in class org.apache.tika.parser.mp3.Mp3Parser
-
Scans the MP3 frames for ID3 tags, and creates ID3Tag Handlers
for each supported set of tags.
- getAnalysisEngine(String, String, String) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Returns a new UIMA Analysis Engine (AE).
- getAnnotationProperty(IdentifiedAnnotation, CTAKESAnnotationProperty) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Returns the annotation value based on the given annotation type.
- getAnnotationProps() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- getAnnotationPropsAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns a string containing a comma-separated list of
CTAKESAnnotationProperty names that will be included into cTAKES metadata.
- getApiUri(Metadata) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- getApiUri(Metadata) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- getApiUri(Metadata) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTVideoRecogniser
-
- getApplyRotation() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The Artist for the track
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getAverageCharTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getBestNameEntity() - Method in class org.apache.tika.parser.geo.topic.NameEntityExtractor
-
- getBigInteger(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getBitRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the bit rate in bits per second.
- getBitsPerPixel() - Method in class org.apache.tika.parser.image.ICNSType
-
- getBlock_len() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns block's length
- getBlockAddress() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Returns block addresses
- getBlockCount() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets the block count.
- getBlockidx_intvl() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns block index interval
- getBlockLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets the block length.
- getBlockLength() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getBlockNext() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getBlockNumber() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getBlockPrev() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getBlockRemaining() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getBlockType() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getBody() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
-
- getByte() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
-
- getBytes(boolean) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
-
- getBytes(char) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
-
- getBytes(double) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
-
- getBytes(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
-
- getBytes(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
-
- getBytes(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
-
- getBytes(float) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
-
- getBytes(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
-
- getBytes() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
-
Gets a copy of the byte array containing the bytes written so far.
- getBytes(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
-
Returns the specified 64-bit unsigned integer value as an array of bytes.
- getBytes(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
-
Returns the specified 32-bit unsigned integer value as an array of bytes.
- getCatchIntermediateIOExceptions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getCellManifestDataElementData(List<DataElement>, StorageManifestDataElementData, HashMap<CellID, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
Gets the cell manifest data element from a list of data elements.
- getCenter() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- getChannels() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the number of channels (1=mono, 2=stereo)
- getCharset() - Method in class org.apache.tika.parser.csv.CSVParams
-
- getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Deprecated.
- getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData, ChmBlockInfo) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
- getChmBlockSegment(byte[], ChmLzxcResetTable, int, int, int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
- getChmDirList() - Method in class org.apache.tika.parser.chm.core.ChmExtractor
-
- getChmDirList() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChmItsfHeader() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChmItspHeader() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChmLzxcControlData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChmLzxcResetTable() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getClassName() - Method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
-
- getColorspace() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getCommand() - Method in class org.apache.tika.parser.gdal.GDALParser
-
- getComment(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Builds up the ID3 comment, by parsing and extracting
the comment string parts from the given data.
- getComments() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getComments() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
Retrieves the comments, if any.
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getCompilation() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getCompilation() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have compilations,
so this returns null.
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
ID3v2.2 doesn't have compilations,
so this returns null.
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getComposer() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getComposer() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have composers,
so this returns null.
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getCompoundTypes() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Gets the StreamObjectTypeHeaderStart
- getCompressedLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets compressed length
- getConcatenatePhoneticRuns() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getConfidence() - Method in class org.apache.tika.parser.csv.CSVResult
-
- getConfidence() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- getConfidence() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Get an indication of the confidence in the charset detected.
- getContent() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getContent(int, int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getContent(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
-
- getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
-
Get all the content which is represented by the root node object.
- getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
-
Get all the content which is represented by the intermediate node object.
- getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
-
Get all the content which is represented by the node object.
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
-
- getContentHandler(ContentHandler, Metadata) - Method in class org.apache.tika.parser.mif.MIFParser
-
Get the content handler to use.
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentMetaParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.DcXMLParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
-
- getContentLength() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getContentParser() - Method in class org.apache.tika.parser.epub.EpubParser
-
- getContentParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- getContextIDs() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
-
- getControlDataIndex() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Returns the index at which the control data is located in the list.
- getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getCountryCode() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getCurrent(byte[], AtomicInteger, Class<T>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Get current stream object.
- getCurrent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
- getCurrentFSSHTTPBSubRequestID() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
-
Gets the current sub-request ID and atomically increments the counter by 1.
- GetCurrentSerialNumber() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
-
Gets the current serial number and atomically increments the counter by 1.
- getCurrentToken() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
-
Gets the current token value and atomically increments the token by 1.
- getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getData() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getData(Class<T>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
Used to get data.
- getData() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getDataObjectDataElementData(List<DataElement>, ExGuid, AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
Gets the list of object group data elements from a list of data elements.
- getDataObjectDataElementData(List<DataElement>, RevisionManifestDataElementData, AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
Gets a list of object group data elements from a list of data elements.
- getDataOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Returns data offset
- getDataOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns data offset
- getDateFormatOverride() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getDecodedValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
- getDecorationName() - Method in class org.apache.tika.parser.ctakes.CTAKESParser
-
- getDefaultConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- getDelimiter() - Method in class org.apache.tika.parser.csv.CSVParams
-
- getDelimiter() - Method in class org.apache.tika.parser.csv.CSVResult
-
- getDensity() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getDepth() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getDescription() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Gets the description, if present
- getDetectableCharsets() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
- getDetectAngles() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getDir_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns directory uuid
- getDirectoryListingEntryList() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Returns chm directory listing entry list
- getDirLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns directory length
- getDirOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns directory offset
- getDisc() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getDisc() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The number of the disc this belongs to, within the set
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have disc numbers,
so this returns null.
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- getDocument() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
-
Returns the opened document.
- getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
-
- getDropThreshold() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getDuration() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Returns the duration in milliseconds.
- getEnableAutoSpace() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getEnableAutoSpace() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getEncint() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getEncoding() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the character encoding of the strings that are to be found.
- getEndBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns the end block index
- getEndOffset() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns the end offset index
- getEntityTypes() - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in interface org.apache.tika.parser.ner.NERecogniser
-
Gets the set of entity types whose names are recognisable by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
-
- getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- getEntityTypes() - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
- getEntriesToCopy() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- getEntryType() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Returns ChmCommons.EntryType (COMPRESSED or UNCOMPRESSED)
- getExtendedGuidString() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
-
- getExtendedHeader() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getExtension() - Method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
- getExtractAcroFormContent() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractActions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractAllAlternativesFromMSG() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getExtractAllAlternativesFromMSG() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getExtractAnnotationText() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getExtractAnnotationText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractBookmarksText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractFontNames() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractInlineImages() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractMacros() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getExtractMacros() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getExtractMarkedContent() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractScripts() - Method in class org.apache.tika.parser.html.HtmlParser
-
- getExtractUniqueInlineImagesOnly() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getFilePath() - Method in class org.apache.tika.parser.strings.FileConfig
-
Returns the "file" installation folder.
- getFileProg() - Static method in class org.apache.tika.parser.strings.StringsParser
-
- getFilter() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getFlags() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getFormattedNumber(Paragraph) - Method in class org.apache.tika.parser.microsoft.ListManager
-
Get the formatted number for a given paragraph
- getFormattedNumber(XWPFParagraph) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
-
- getFormattedNumber(BigInteger, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
-
- getFramesRead() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getFreeSpace() - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Returns pmgi free space
- getFreeSpace() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getGazetteerRestEndpoint() - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- getGazetteerRestEndpoint() - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- getGenre() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getGenre() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
-
- getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
-
- getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
-
- getGuidString() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
-
- getHadStarted() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getHeader_len() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns header length
- getHeaderLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns itsf header length
- getHeight() - Method in class org.apache.tika.parser.image.ICNSType
-
- getId() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- getIfXFAExtractOnlyXFA() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getIlvl() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- getImageMagickPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getIncludeDeletedContent() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getIncludeDeletedContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeDeletedText() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- getIncludeDeletedText() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- getIncludeHeadersAndFooters() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeMissingRows() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeMoveFromContent() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getIncludeMoveFromContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeMoveFromText() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- getIncludeMoveFromText() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- getIncludeShapeBasedContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeSlideMasterContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeSlideNotes() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIndex() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
-
- getIndex_depth() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns an index depth
- getIndex_head() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns an index head
- getIndex_root() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns index root
- getIndexCopyFromStart() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- getIndexCopyToStart() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- getIndexOfContent() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getIndexOfResetData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getIndexOfResetTable() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getIniBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns an initial block index
- getInlineBool(OneNotePropertyEnum) - Static method in enum org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
-
- getInputStream() - Method in class org.apache.tika.parser.utils.DataURIScheme
-
- getInstance() - Static method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
- getInt(byte[]) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getInt2(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getInt3(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getIntelCurrentPossition() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getIntelFileSize() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getIntelState() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getIntVal() - Method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
- getIntVal() - Method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
- getIntVal() - Method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
- getIntVal() - Method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
- getIntVal() - Method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
- getJCas(AnalysisEngine) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Returns a new JCas appropriate for the given Analysis Engine.
- getJustFileName(String) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- getLabel() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- getLabelLang() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- getLang_id() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns language id
- getLangId() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns language ID
- getLanguage(long) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
Returns textual representation of LangID
- getLanguage() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Gets the language, if present
- getLanguage() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getLanguage() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Get the ISO code for the language of the detected charset.
- getLastModified() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns last modified date of the chm file
- getLatitude() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getLayer() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the audio layer code.
- getLeft() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getLeft() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- getLength() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- getLength() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Returns the frame length in bytes.
- getLength() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getLengthTreeLengtsTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getLengthTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getLinearizedDictionary(PDDocument) - Static method in class org.apache.tika.parser.pdf.PDFPreflightParser
-
Deprecated.
Copied verbatim from PDFBox.
According to the PDF Reference, a linearized PDF contains a dictionary as its first object (the linearized dictionary),
and only this one in the first section.
- getLocations(List<String>) - Method in class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
Calls the lucene-geo-gazetteer API to search for location names in the gazetteer.
- getLongitude() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getLzxBlockLength() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getLzxBlockOffset() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getLzxBlocksCache() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
Return a list of the main parts of the document, used
when searching for embedded resources.
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.POIXMLTextExtractorDecorator
-
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
-
In PowerPoint files, slides have things embedded in them,
and slide drawings hold the images.
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
-
This returns all items that might contain embedded objects:
main document, headers, footers, comments, etc.
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
-
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator
-
In PowerPoint files, slides have things embedded in them,
and slide drawings hold the images.
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
In Excel files, sheets have things embedded in them,
and sheet drawings hold the images.
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator
-
Include main body and anything else that can
have an attachment/embedded object
- getMainTreeElements() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getMainTreeLengtsTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getMainTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getMajorVersion() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getMarkLimit() - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
-
- getMarkLimit() - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
-
- getMarkLimit() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
- getMarkLimit() - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
-
- getMaxBytesForEmbeddedObject() - Static method in class org.apache.tika.parser.rtf.RTFParser
-
Deprecated.
- getMaxFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getMaxMainMemoryBytes() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
The maximum amount of memory to use when loading a pdf into a PDDocument.
- getMaxXMPMMHistory() - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
-
- getMediaType() - Method in class org.apache.tika.parser.csv.CSVParams
-
- getMediaType() - Method in class org.apache.tika.parser.csv.CSVResult
-
- getMediaType() - Method in class org.apache.tika.parser.utils.DataURIScheme
-
- getMessageClass(String) - Static method in class org.apache.tika.parser.microsoft.OutlookExtractor
-
- getMetadata() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns an array of metadata whose values will be analyzed using cTAKES.
- getMetadata() - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
Returns metadata that includes cTAKES annotations.
- getMetadataAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns a string containing a comma-separated list of metadata whose values will be analyzed using cTAKES.
- getMetadataExtractor() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- getMetadataExtractor() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
-
POIXMLTextExtractor.getMetadataTextExtractor() not yet supported
for OOXML by POI.
- getMetaParser() - Method in class org.apache.tika.parser.epub.EpubParser
-
- getMetaParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- getMinFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getMinLength() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the minimum sequence length (characters) to print.
- getMinorVersion() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getMinSize() - Method in class org.apache.tika.parser.strings.Latin1StringsParser
-
Returns the minimum size of a character sequence to be extracted.
- getMSB() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
-
- getN() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
-
- getName() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Returns an entry name
- getName() - Method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
-
- getName() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
-
- getName() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getName() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Get the name of the detected charset.
- getNameLength() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Returns an entry name length
- getNamespace() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- getNerModelUrl() - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- getNerModelUrl() - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- getNum_blocks() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns number of blocks
- getNumberOfLevels() - Method in class org.apache.tika.parser.microsoft.AbstractListManager.ParagraphLevelCounter
-
- getNumId() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- getOcrDPI() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Dots per inch used to render the page image for OCR
- getOcrImageFormatName() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
String representation of the image format used to render
the page image for OCR (examples: png, tiff, jpeg)
- getOcrImageQuality() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image quality used to render the page image for OCR.
- getOcrImageScale() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getOcrImageType() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image type used to render the page image for OCR.
- getOcrStrategy() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getOffset() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- getOids() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
-
- getOsids() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
-
- getOtherTesseractConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getOutputStream() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- getOutputType() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getPageSegMode() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getPageSeparator() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getPart() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- getPDDocument(InputStream, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getPDDocument(Path, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getPDDocument(InputStream, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFPreflightParser
-
Deprecated.
- getPDDocument(Path, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFPreflightParser
-
Deprecated.
- getPDFParserConfig() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getPreserveInterwordSpacing() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getPrevContent() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getR0() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getR1() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getR2() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getReader(InputStream, String) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Autodetect the charset of an inputStream, and return a Java Reader
to access the converted input data.
- getReader() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Create a java.io.Reader for reading the Unicode character data corresponding
to the original byte data supplied to the Charset detect operation.
- getResetInterval() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns reset interval
- getResetTableIndex() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Return index of reset table
- getResize() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getRevisionManifestDataElementData(List<DataElement>, CellManifestDataElementData, HashMap<ExGuid, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to get the revision manifest data element from a list of data elements.
- getRight() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- getSampleRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the sampling rate, in Hz
- getSeparatorChar() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the separator character used for annotation properties.
- getSerializerType() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the type of cTAKES (UIMA) serializer used to write the CAS.
- getSetKCMS() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns a signature of itsf header
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns a signature of the header
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns a signature of control data block
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Returns pmgi signature if exists
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getSize() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns a size of control data
- getSize() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
-
- getSortByPosition() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getSortByPosition() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getSpacingTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getStartBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns the start block index
- getStartIndex() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getStartOffset() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns the start offset index
- getState() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getStorageManifestDataElementData(List<DataElement>, ExGuid) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to get the storage manifest data element from a list of data elements.
- getStream_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns stream uuid
- getStreamObjectTypeMapping() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Gets the StreamObjectTypeMapping
- getString(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Returns the String at the given
offset and length.
- getString(byte[], String) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Autodetect the charset of a byte array, and return a String
containing the converted input data.
- getString() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Create a Java String from Unicode character data corresponding
to the original byte data supplied to the Charset detect operation.
- getString(int) - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Create a Java String from Unicode character data corresponding
to the original byte data supplied to the Charset detect operation.
- getStringsPath() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the "strings" installation folder.
- getStringsProg() - Static method in class org.apache.tika.parser.strings.StringsParser
-
- getStripMarkup() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
- getStyleClass() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
-
- getStyleID() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- getStyleName(String) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFStylesShim
-
- getSuffix(InputStream, int) - Static method in class org.apache.tika.parser.mp3.LyricsHandler
-
Reads and returns the last length bytes from the
given stream.
- getSupportedMimes() - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- getSupportedMimes() - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
-
The mimes supported by this recogniser
- getSupportedMimes() - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
-
- getSupportedMimes() - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.apple.AppleSingleFileParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.apple.PListParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.asm.ClassParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.audio.AudioParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.audio.MidiParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.chm.ChmParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.code.SourceCodeParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.crypto.Pkcs7Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.crypto.TSDParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.csv.TextAndCSVParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dbf.DBFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dwg.DWGParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.envi.EnviHeaderParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.epub.EpubContentParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.epub.EpubParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.executable.ExecutableParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.feed.FeedParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.font.AdobeFontMetricParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.font.TrueTypeParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.gdal.GDALParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geoinfo.GeographicInformationParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.grib.GribParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.hdf.HDFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.html.HtmlParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.hwp.HwpV5Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.BPGParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.HeifParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.ICNSParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.ImageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.PSDParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.TiffParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.WebPParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.indesign.IDMLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iptc.IptcAnpaParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.isatab.ISArchiveParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork18PackageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.IWorkPackageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.journal.JournalParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.jpeg.JpegParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mail.RFC822Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mat.MatParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mbox.MboxParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mbox.OutlookPSTParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.EMFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.JackcessParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.MSOwnerFileParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.OfficeParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.OldExcelParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.TNEFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.WMFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mif.MIFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp3.Mp3Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp4.MP4Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp4.NoakesMP4Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ner.NamedEntityParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.netcdf.NetCDFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.FlatOpenDocumentParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.CompressorParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.PackageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.RarParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pot.PooledTimeSeriesParser
-
Returns the set of media types supported by this parser when used with the
given parse context.
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.prt.PRTParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.rtf.RTFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.sas.SAS7BDATParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
Returns the types supported
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.strings.StringsParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.txt.TXTParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.video.FLVParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wordperfect.QuattroProParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xliff.XLIFF12Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xliff.XLZParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.XMLProfiler
-
- getSuppressDuplicateOverlappingText() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getSuppressDuplicateOverlappingText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getSwath() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getSyncBits(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getSystem_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns system uuid
- getTableOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets a table offset
- getTag() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
-
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getTagsPresent() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
Does the file contain this kind of tag?
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getTagString(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Returns the (possibly null padded) String at the given offset and
length.
- getTessdataPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getTesseractPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getText() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getText() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getText() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getText() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Gets the text, if present
- getTextDocument() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
Retrieves the built TextDocument
- getTimeout() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getTimeout() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the maximum time (in seconds) to wait for the "strings" command
to terminate.
- getTitle() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getTitle() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getTotal() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getTrackingMetadata() - Method in class org.apache.tika.parser.mbox.MboxParser
-
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getTrackNumber() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The number of the track within the album / recording
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getType() - Method in class org.apache.tika.parser.image.ICNSType
-
- getType() - Method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
- getType() - Method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
-
- getType() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- getType() - Method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
- getType() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
- getType(OneNotePropertyEnum) - Static method in enum org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
-
- getTypeFromVal(int) - Static method in enum org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
-
- getUMLSPass() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the UMLS password.
- getUMLSUser() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the UMLS username.
- getUncompressedLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets uncompressed length
- getUnderline() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- getUnknown() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets unknown
- getUnknown0008() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getUnknown_000c() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns unknown_000c value
- getUnknown_000c() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns 000c unknown bytes
- getUnknown_0024() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns 0024 unknown bytes
- getUnknown_002c() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns 002c unknown bytes
- getUnknown_0044() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns 0044 unknown bytes
- getUnknown_18() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns unknown 18 bytes
- getUnknownLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns unknown length
- getUnknownOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns unknown offset
- getUseSAXDocxExtractor() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getUseSAXDocxExtractor() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getUseSAXPptxExtractor() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getUtf16PropertiesToPrint() - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
-
Print file node data in UTF-16 format when they match these props.
- getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns itsf header version
- getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns version of itsp header
- getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns a version of control data block
- getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Returns the version
- getVersion() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
- getVersionCode() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the version code.
- getWidth() - Method in class org.apache.tika.parser.image.ICNSType
-
- getWindow() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getWindowPosition() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getWindowSize() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns a window size
- getWindowSize(int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
LZX supports window sizes of 2^15 (32 KB) through 2^21 (2 MB). Returns X,
i.e. the exponent in 2^X.
- getWindowSize() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getWindowsPerReset() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns windows per reset
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
-
Parses the document into a sequence of XHTML SAX events sent to the
given content handler.
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
-
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- getYear() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getYear() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- GlobalIdTableEntry3FNDX - Class in org.apache.tika.parser.microsoft.onenote
-
- GlobalIdTableEntry3FNDX() - Constructor for class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- GlobalIdTableEntryFNDX - Class in org.apache.tika.parser.microsoft.onenote
-
- GlobalIdTableEntryFNDX() - Constructor for class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
-
- GRIB_MIME_TYPE - Static variable in class org.apache.tika.parser.grib.GribParser
-
- GribParser - Class in org.apache.tika.parser.grib
-
- GribParser() - Constructor for class org.apache.tika.parser.grib.GribParser
-
- GrobidNERecogniser - Class in org.apache.tika.parser.ner.grobid
-
- GrobidNERecogniser() - Constructor for class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
- GrobidRESTParser - Class in org.apache.tika.parser.journal
-
- GrobidRESTParser() - Constructor for class org.apache.tika.parser.journal.GrobidRESTParser
-
- guid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
- guid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
-
- guid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
-
- GUID - Class in org.apache.tika.parser.microsoft.onenote
-
- GUID(int[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.GUID
-
- guidCellSchemaId - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
-
- guidFile - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
-
- guidFileFormat - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
-
- guidFileType - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
-
- guidIndex - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
-
- guidLegacyFileVersion - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
-
- GuidUtil - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
-
- GuidUtil() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.GuidUtil
-
- MACHINE_ALPHA - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_ARM - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_EFI - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_IA_64 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_M32R - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_M68K - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_M88K - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_MIPS - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_PPC - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_S370 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_S390 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_SH3 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_SH4 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_SH5 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_SPARC - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_TYPE - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_UNKNOWN - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_VAX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_x86_32 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_x86_64 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MachineMetadata - Interface in org.apache.tika.parser.executable
-
Metadata for describing machines, such as their
architecture, type and endian-ness
- MachineMetadata.Endian - Class in org.apache.tika.parser.executable
-
- MAIL_MAX_SIZE - Static variable in class org.apache.tika.parser.mbox.MboxParser
-
- MailUtil - Class in org.apache.tika.parser.mail
-
- MailUtil() - Constructor for class org.apache.tika.parser.mail.MailUtil
-
- main(String[]) - Static method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
- main(String[]) - Static method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
- main(String[]) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
- main(String[]) - Static method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- main(String[]) - Static method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- main(String[]) - Static method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- main(String[]) - Static method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
- mainTreeLengtsTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- mainTreeTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- manifestMappingExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
-
- manifestMappingSerialNumber - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
-
- mapSafeAttribute(String, String) - Method in class org.apache.tika.parser.html.DefaultHtmlMapper
-
Normalizes an attribute name.
- mapSafeAttribute(String, String) - Method in interface org.apache.tika.parser.html.HtmlMapper
-
Maps "safe" HTML attribute names to semantic XHTML equivalents.
- mapSafeAttribute(String, String) - Method in class org.apache.tika.parser.html.HtmlParser
-
- mapSafeAttribute(String, String) - Method in class org.apache.tika.parser.html.IdentityHtmlMapper
-
- mapSafeElement(String) - Method in class org.apache.tika.parser.html.DefaultHtmlMapper
-
- mapSafeElement(String) - Method in interface org.apache.tika.parser.html.HtmlMapper
-
Maps "safe" HTML element names to semantic XHTML equivalents.
- mapSafeElement(String) - Method in class org.apache.tika.parser.html.HtmlParser
-
- mapSafeElement(String) - Method in class org.apache.tika.parser.html.IdentityHtmlMapper
-
- MATLAB_MIME_TYPE - Static variable in class org.apache.tika.parser.mat.MatParser
-
- MatParser - Class in org.apache.tika.parser.mat
-
- MatParser() - Constructor for class org.apache.tika.parser.mat.MatParser
-
- MAX - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
A constant holding the maximum value an unsigned byte can
have as UByte, 2^8-1.
- MAX - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
A constant holding the maximum value an unsigned int can
have as UInteger, 2^32-1.
- MAX - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
A constant holding the maximum value + 1 a signed long can
have as ULong, 2^63.
- max(UByte, UByte) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the greater of two UByte values.
- max(UInteger, UInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the greater of two UInteger values.
- max(ULong, ULong) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the greater of two ULong values.
- max(UShort, UShort) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the greater of two UShort values.
- MAX - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
A constant holding the maximum value an unsigned short can
have as UShort, 2^16-1.
- MAX_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
A constant holding the maximum value an unsigned byte can
have, 2^8-1.
- MAX_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
A constant holding the maximum value an unsigned int can
have, 2^32-1.
- MAX_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
A constant holding the maximum value an unsigned long can
have, 2^64-1.
- MAX_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
A constant holding the maximum value an unsigned short can
have, 2^16-1.
- MAX_VALUE_LONG - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
A constant holding the maximum value + 1 a signed long can
have, 2^63.
- MAXSUBREQUESTID - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
-
Specify the max sub request ID.
- MAXTOKENVALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
-
Specify the max token value.
- MBOX_MIME_TYPE - Static variable in class org.apache.tika.parser.mbox.MboxParser
-
- MBOX_RECORD_DIVIDER - Static variable in class org.apache.tika.parser.mbox.MboxParser
-
- MboxParser - Class in org.apache.tika.parser.mbox
-
Mbox (mailbox) parser.
- MboxParser() - Constructor for class org.apache.tika.parser.mbox.MboxParser
-
- MD_KEY_IMG_CAP - Static variable in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- MD_KEY_OBJ_REC - Static variable in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- MD_KEY_PREFIX - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- MD_REC_IMPL_KEY - Static variable in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- MDB_PROPERTY_PREFIX - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
-
- MDB_PW - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
-
- MEDIA_TYPES - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- memcmp(int[], int[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.GUID
-
- metadata - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- MetadataExtractor - Class in org.apache.tika.parser.microsoft.ooxml
-
OOXML metadata extractor.
- MetadataExtractor(POIXMLTextExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.MetadataExtractor
-
- MetadataFields - Class in org.apache.tika.parser.image
-
Knows about all declared Metadata fields.
- MetadataFields() - Constructor for class org.apache.tika.parser.image.MetadataFields
-
- MetadataHandler - Class in org.apache.tika.parser.xml
-
- MetadataHandler(Metadata, String) - Constructor for class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- MetadataHandler(Metadata, Property) - Constructor for class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- MidiParser - Class in org.apache.tika.parser.audio
-
- MidiParser() - Constructor for class org.apache.tika.parser.audio.MidiParser
-
- MIFContentHandler - Class in org.apache.tika.parser.mif
-
Content handler for MIF Content and Metadata.
- MIFExtractor - Class in org.apache.tika.parser.mif
-
Helper Class to Parse and Extract Adobe MIF Files.
- MIFExtractor() - Constructor for class org.apache.tika.parser.mif.MIFExtractor
-
- MIFParser - Class in org.apache.tika.parser.mif
-
- MIFParser() - Constructor for class org.apache.tika.parser.mif.MIFParser
-
- MIFParser(EncodingDetector) - Constructor for class org.apache.tika.parser.mif.MIFParser
-
- MIN - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
A constant holding the minimum value an unsigned byte can
have as UByte, 0.
- MIN - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
A constant holding the minimum value an unsigned int can
have as UInteger, 0.
- MIN - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
A constant holding the minimum value an unsigned long can
have as ULong, 0.
- min(UByte, UByte) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the smaller of two UByte values.
- min(UInteger, UInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the smaller of two UInteger values.
- min(ULong, ULong) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the smaller of two ULong values.
- min(UShort, UShort) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the smaller of two UShort values.
- MIN - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
A constant holding the minimum value an unsigned short can
have as UShort, 0.
- MIN_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
A constant holding the minimum value an unsigned byte can
have, 0.
- MIN_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
A constant holding the minimum value an unsigned int can
have, 0.
- MIN_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
A constant holding the minimum value an unsigned long can
have, 0.
- MIN_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
A constant holding the minimum value an unsigned short can
have, 0.
- minConfidence - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- MISCELLANEOUS - Static variable in interface org.apache.tika.parser.ner.NERecogniser
-
- MITIENERecogniser - Class in org.apache.tika.parser.ner.mitie
-
This class offers an implementation of
NERecogniser based on
trained models using state-of-the-art information extraction tools.
- MITIENERecogniser() - Constructor for class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
- MITIENERecogniser(String) - Constructor for class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
Creates an NERecogniser by loading the model from the given path.
- MODEL_PROP_NAME - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
- MODEL_PROP_NAME - Static variable in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
- MODELS_DIR - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- MONEY - Static variable in interface org.apache.tika.parser.ner.NERecogniser
-
- MONEY_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- moveNext() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
Advances the enumerator to the next bit of the byte array.
- MP3Frame - Interface in org.apache.tika.parser.mp3
-
A frame in an MP3 file, such as ID3v2 Tags or some
audio.
- Mp3Parser - Class in org.apache.tika.parser.mp3
-
The Mp3Parser is used to parse ID3 Version 1 Tag information
from an MP3 file, if available.
- Mp3Parser() - Constructor for class org.apache.tika.parser.mp3.Mp3Parser
-
- Mp3Parser.ID3TagsAndAudio - Class in org.apache.tika.parser.mp3
-
- MP4Parser - Class in org.apache.tika.parser.mp4
-
Parser for the MP4 media container format, as well as the older
QuickTime format that MP4 is based on.
- MP4Parser() - Constructor for class org.apache.tika.parser.mp4.MP4Parser
-
- MPEG_V1 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for the MPEG version 1.
- MPEG_V2 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for the MPEG version 2.
- MPEG_V2_5 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for the MPEG version 2.5.
- MPP - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Microsoft Project
- MS_EQUATION - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Equation embedded in Office docs
- MS_GRAPH_CHART - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Graph/Charts embedded in PowerPoint and Excel
- MS_OUTLOOK_PST_MIMETYPE - Static variable in class org.apache.tika.parser.mbox.OutlookPSTParser
-
- MSG - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Microsoft Outlook
- MSOneStorePackage - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb
-
- MSOneStorePackage() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
-
- MSOneStoreParser - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb
-
- MSOneStoreParser() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStoreParser
-
- MSOwnerFileParser - Class in org.apache.tika.parser.microsoft
-
Parser for temporary MSOffice files.
- MSOwnerFileParser() - Constructor for class org.apache.tika.parser.microsoft.MSOwnerFileParser
-
- salvageCopy(InputStream, File, boolean) - Static method in class org.apache.tika.parser.utils.ZipSalvager
-
This streams the broken zip and rebuilds a new zip that
is at least a valid zip file.
- salvageCopy(File, File) - Static method in class org.apache.tika.parser.utils.ZipSalvager
-
- SAS7BDATParser - Class in org.apache.tika.parser.sas
-
Processes the SAS7BDAT data columnar database file used by SAS and
other similar languages.
- SAS7BDATParser() - Constructor for class org.apache.tika.parser.sas.SAS7BDATParser
-
- SchemaGuid - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
- SDA - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
StarOffice Draw
- SDC - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
StarOffice Calc
- SDD - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
StarOffice Impress
- SDW - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
StarOffice Writer
- searchGeoNames(ArrayList<String>) - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- secondaryParser - Variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- SentimentAnalysisParser - Class in org.apache.tika.parser.sentiment
-
This parser classifies documents based on the sentiment of the document.
- SentimentAnalysisParser() - Constructor for class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
- SequenceNumberGenerator - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
-
- SequenceNumberGenerator() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
-
- serialize(JCas, CTAKESSerializer, boolean, OutputStream) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Serializes a CAS in the given format.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadata
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadataDeclarations
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Serialize items to byte list.
- SerializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
-
This method is used to convert the element of ExtendedGUID object into a byte List.
- serializeToByteList() - Method in interface org.apache.tika.parser.microsoft.onenote.fsshttpb.IFSSHTTPBSerializable
-
Serialize to byte list.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.ArrayNumber
-
This method is used to convert the element of the number of array into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
-
This method is used to convert the element of EightBytesOfData into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
-
This method is used to convert the element of FourBytesOfData into a byte List.
- serializeToByteList() - Method in interface org.apache.tika.parser.microsoft.onenote.fsshttpb.property.IProperty
-
This method is used to convert the element of property into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.NoData
-
This method is used to convert the element of NoData into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
-
This method is used to convert the element of OneByteOfData into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
-
This method is used to convert the element of the prtArrayOfPropertyValues into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
-
This method is used to convert the element of prtFourBytesOfLengthFollowedByData into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
-
This method is used to convert the element of TwoBytesOfData into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
-
Used to serialize item to byte list.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
-
This method is used to convert the element of BinaryItem basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
-
This method is used to convert the element of CellID basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
-
This method is used to convert the element of CellIDArray basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
This method is used to convert the element of Compact64bitInt basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
-
This method is used to convert the element of CompactID object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
This method is used to convert the element of ExGuid basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
-
This method is used to convert the element of ExGUIDArray basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
-
This method is used to convert the element of JCID object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
-
This method is used to convert the element of PropertyID object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
-
This method is used to convert the element of SerialNumber basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
-
Used to convert the element into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementData
-
Serialize item to byte list.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
-
Used to convert the element into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
-
This method is used to convert the element of PropertySet into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
-
Used to convert the element into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
-
This method is used to convert the element of the ObjectSpaceObjectPropSet into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
-
This method is used to convert the element of ObjectSpaceObjectStreamHeader into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
-
This method is used to convert the element of ObjectSpaceObjectStreamOfContextIDs object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
-
This method is used to convert the element of ObjectSpaceObjectStreamOfOIDs object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
-
This method is used to convert the element of ObjectSpaceObjectStreamOfOSIDs object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
-
Used to convert the element into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
-
Used to convert the element into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Serialize item to byte list.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
-
This method is used to convert the element of StreamObjectHeaderEnd16bit basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
-
This method is used to convert the element of StreamObjectHeaderEnd8bit basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
-
This method is used to convert the element of StreamObjectHeaderStart16bit basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
-
This method is used to convert the element of StreamObjectHeaderStart32bit basic object into a byte List.
- SerialNumber - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
- SerialNumber(UUID, long) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
-
Initializes a new instance of the SerialNumber class with specified values.
- SerialNumber(SerialNumber) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
-
Initializes a new instance of the SerialNumber class, this is the copy constructor.
- SerialNumber() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
-
Initializes a new instance of the SerialNumber class, this is the default constructor.
- serialNumber - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
- setAccessChecker(AccessChecker) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setAdmin1Code(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setAdmin2Code(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setAeDescriptorPath(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the path to XML descriptor for AnalysisEngine.
- setAlignedLenTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setAlignedTreeTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setAnnotationProps(CTAKESAnnotationProperty[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- setAnnotationProps(String[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- setApplyRotation(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Sets whether or not a rotation value should be calculated and passed to ImageMagick.
- setApplyRotation(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setAverageCharTolerance(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
See PDFTextStripper.setAverageCharTolerance(float)
- setBit(byte[], long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.Bit
-
Set a bit value to "On" in the specified byte array with the specified bit position.
- setBlock_len(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets block length
- setBlockAddress(long[]) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets block addresses
- setBlockCount(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets a block count
- setBlockidx_intvl(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets block index interval
- setBlockLength(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setBlockLlen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets a block length
- setBlockNext(int) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setBlockPrev(int) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setBlockRemaining(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setBlockType(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setBody(PropertySet) - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
-
- setBold(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- setByteArrayMaxOverride(int) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
WARNING: this sets a static variable in POI.
- setCatchIntermediateIOExceptions(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
The PDFBox parser will throw an IOException if there is
a problem with a stream.
- setCenter(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- setCharset(Charset) - Method in class org.apache.tika.parser.csv.CSVParams
-
- setChmDirList(ChmDirectoryListingSet) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setChmItsfHeader(ChmItsfHeader) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setChmItspHeader(ChmItspHeader) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setChmLzxcControlData(ChmLzxcControlData) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setChmLzxcResetTable(ChmLzxcResetTable) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setColorspace(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setColorspace(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setCommand(String) - Method in class org.apache.tika.parser.gdal.GDALParser
-
- setCompressedLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets compressed length
- setConcatenatePhoneticRuns(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setConcatenatePhoneticRuns(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Microsoft Excel files can sometimes contain phonetic (furigana) strings.
- setConfidence(double) - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- setContent(List<ExGuid>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
-
- setContentLength(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- setContentParser(Parser) - Method in class org.apache.tika.parser.epub.EpubParser
-
- setContentParser(Parser) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
-
- setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
-
- setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
-
- setContextIDs(ObjectSpaceObjectStreamOfOIDsOSIDsOrContextIDs) - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
-
- setControlDataIndex(int) - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Sets control data index
- setCountryCode(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setCrawlAllFileNodesFromRoot(boolean) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
-
Do this to ignore revisions and just parse all file nodes from the root recursively.
- setData(byte[]) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setDataOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets data offset
- setDateFormatOverride(String) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setDateFormatOverride(String) - Method in class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
-
- setDateOverrideFormat(String) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
A user may wish to override the date formats in xls and xlsx files.
- setDeclaredEncoding(String) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Set the declared encoding for charset detection.
- setDecodedValue(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
- setDelimiter(Character) - Method in class org.apache.tika.parser.csv.CSVParams
-
- setDensity(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setDensity(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setDepth(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setDepth(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setDetectableCharset(String, boolean) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
- setDetectAngles(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setDetectCharsetsInEntryNames(boolean) - Method in class org.apache.tika.parser.pkg.PackageParser
-
Whether or not to run the default charset detector against entry
names in ZipFiles.
- setDir_uuid(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets directory uuid
- setDirectoryListingEntryList(List<DirectoryListingEntry>) - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Sets chm directory listing entry list
- setDirLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets directory length
- setDirOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets directory offset
- setDocumentLocator(Locator) - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- setDocumentLocator(Locator) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- setDropThreshold(float) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setDropThreshold(float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setEnableAutoSpace(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
If true (the default), the parser should estimate
where spaces should be inserted between words.
- setEnableAutoSpace(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), the parser should estimate
where spaces should be inserted between words.
- setEnableImageProcessing(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set to 1 to enable image processing (the parameter is an int flag).
- setEnableImageProcessing(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setEncoding(StringsEncoding) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the character encoding of the strings that are to be found.
- setEntriesToCopy(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- setEntryType(ChmCommons.EntryType) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- setExtractAcroFormContent(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), extract content from AcroForms
at the end of the document.
- setExtractActions(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Whether or not to extract PDActions from the file.
- setExtractAllAlternatives(boolean) - Method in class org.apache.tika.parser.mail.RFC822Parser
-
Until version 1.17, Tika handled all body parts as embedded objects (see TIKA-2478).
- setExtractAllAlternativesFromMSG(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
Some .msg files can contain body content in html, rtf and/or text.
- setExtractAllAlternativesFromMSG(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Some .msg files can contain body content in html, rtf and/or text.
- setExtractAnnotationText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
If true (the default), text in annotations will be
extracted.
- setExtractAnnotationText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), text in annotations will be
extracted.
- setExtractBookmarksText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, extract bookmarks (document outline) text.
- setExtractFontNames(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Extract font names into a metadata field
- setExtractInlineImages(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, extract inline embedded OBXImages.
- setExtractMacros(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setExtractMacros(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Sets whether or not MSOffice parsers should extract macros.
- setExtractMacros(boolean) - Method in class org.apache.tika.parser.odf.FlatOpenDocumentParser
-
- setExtractMacros(boolean) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- setExtractMarkedContent(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If the PDF contains marked content, try to extract text and its marked structure.
- setExtractScripts(boolean) - Method in class org.apache.tika.parser.html.HtmlParser
-
Whether or not to extract contents in script entities.
- setExtractUniqueInlineImagesOnly(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Multiple pages within a PDF file might refer to the same underlying image.
- setFilePath(String) - Method in class org.apache.tika.parser.strings.FileConfig
-
Sets the "file" installation folder.
- setFilter(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setFilter(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setFramesRead(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setFreeSpace(long) - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Sets pmgi free space
- setFreeSpace(long) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setGazetteerRestEndpoint(String) - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- setGazetteerRestEndpoint(String) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
Configure REST endpoint for lucene-geo-gazetteer
- setGuid(GUID) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
-
- setGuid(GUID) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
-
- setGuid(int[]) - Method in class org.apache.tika.parser.microsoft.onenote.GUID
-
- setHadStarted(ChmCommons.LzxState) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setHeader_len(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets itsp header length
- setHeaderLen(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets itsf header length
- setId(String) - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- setIfXFAExtractOnlyXFA(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If false (the default), extract content from the full PDF
as well as the XFA form.
- setIlvl(int) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- setImageMagickPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the path to the ImageMagick executable directory, needed if it is not on system path.
- setImageMagickPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Sets whether or not the parser should include deleted content.
- setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
-
Whether or not to include deleted content.
- setIncludeHeadersAndFooters(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to include headers and footers.
- setIncludeMarkup(boolean) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- setIncludeMissingRows(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
For table-like formats, and tables within other formats, should
missing rows in sparse tables be output where detected?
The default is to only output rows defined within the file, which
avoids lots of blank lines, but means layout isn't preserved.
- setIncludeMoveFromContent(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setIncludeMoveFromContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
With track changes on, when a section is moved, the content
is stored in both the "moveFrom" section and in the "moveTo" section.
- setIncludeShapeBasedContent(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setIncludeShapeBasedContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
In Excel and Word, there can be text stored within drawing shapes.
- setIncludeSlideMasterContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to include contents from any of the three
types of masters -- slide, notes, handout -- in a .ppt or ppt[xm] file.
- setIncludeSlideNotes(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to process slide notes content.
- setIndex(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
-
- setIndex_depth(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets an index depth
- setIndex_head(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets an index head
- setIndex_root(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets an index root
- setIndexCopyFromStart(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- setIndexCopyToStart(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
-
- setIndexOfContent(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setIndexOfResetData(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setIndexOfResetTable(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setInitializableProblemHandler(InitializableProblemHandler) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setIntelCurrentPossition(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setIntelFileSize(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setIntelState(ChmCommons.IntelState) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setItalics(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- setLabel(String) - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- setLabelLang(String) - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- setLang_id(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets language id
- setLangId(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets language_id
- setLanguage(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set tesseract language dictionary to be used.
- setLanguage(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setLastModified(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets last modified date of the chm file
- setLatitude(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setLeft(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- setLength(int) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- setLengthTreeLengtsTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setLengthTreeTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setListenForAllRecords(boolean) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
-
Specifies whether this parser should listen for all
records or just for the specified few.
- setLongitude(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setLzxBlockLength(long) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setLzxBlockOffset(long) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setLzxBlocksCache(List<ChmLzxBlock>) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setMain(String, String, String) - Method in class org.apache.tika.parser.geo.topic.GeoTag
-
- setMainTreeElements(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setMainTreeLengtsTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setMainTreeTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setMarkLimit(int) - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
-
How far into the stream to read for charset detection.
- setMarkLimit(int) - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
-
How far into the stream to read for charset detection.
- setMarkLimit(int) - Method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
- setMarkLimit(int) - Method in class org.apache.tika.parser.pkg.ZipContainerDetector
-
If this is less than 0, the file will be spooled to disk,
and detection will run on the full file.
- setMarkLimit(int) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
How far into the stream to read for charset detection.
- setMarkLimit(int) - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
-
How far into the stream to read for charset detection.
- setMaxBytesForEmbeddedObject(int) - Static method in class org.apache.tika.parser.rtf.RTFParser
-
- setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the maximum size of a file that will be submitted to OCR.
- setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setMaxMainMemoryBytes(long) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setMaxMainMemoryBytes(int) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setMaxMainMemoryBytes(long) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setMaxRecordSize(long) - Method in class org.apache.tika.parser.mp4.MP4Parser
-
Override the maximum record size limit.
- setMaxXMPMMHistory(int) - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
-
Maximum number of events to extract from the
event history in the XMP Media Management (XMPMM) section.
- setMediaType(MediaType) - Method in class org.apache.tika.parser.csv.CSVParams
-
- setMemoryLimitInKb(int) - Method in class org.apache.tika.parser.pkg.CompressorParser
-
- setMemoryLimitInKb(int) - Method in class org.apache.tika.parser.rtf.RTFParser
-
- setMetadata(String[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the metadata whose values will be analyzed using cTAKES.
- setMetaParser(Parser) - Method in class org.apache.tika.parser.epub.EpubParser
-
- setMetaParser(Parser) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- setMimetype(boolean) - Method in class org.apache.tika.parser.strings.FileConfig
-
Sets the mime option.
- setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the minimum size of a file that will be submitted to OCR.
- setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setMinLength(int) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the minimum sequence length (characters) to print.
- setMinSize(int) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
-
Sets the minimum size of a character sequence to be extracted.
- setN(long) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
-
- setName(String) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Sets entry name
- setName(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setNameLength(int) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Sets an entry name length
- setNERModelPath(String) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- setNerModelUrl(String) - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- setNerModelUrl(URL) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- setNum_blocks(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets the number of blocks contained in the chm file
- setNumId(int) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- setOcrDPI(int) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Dots per inch used to render the page image for OCR.
- setOcrImageFormatName(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setOcrImageQuality(float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image quality used to render the page image for OCR.
- setOcrImageScale(float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setOcrImageType(String) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setOcrImageType(ImageType) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image type used to render the page image for OCR.
- setOcrImageType(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image type used to render the page image for OCR.
- setOcrStrategy(String) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setOcrStrategy(PDFParserConfig.OCR_STRATEGY) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Which strategy to use for OCR
- setOcrStrategy(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Which strategy to use for OCR
- setOffset(int) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- setOids(ObjectSpaceObjectStreamOfOIDsOSIDsOrContextIDs) - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
-
- setOnlyLatestRevision(boolean) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
-
Only parse the latest revision.
- setOsids(ObjectSpaceObjectStreamOfOIDsOSIDsOrContextIDs) - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
-
- setOutputStream(OutputStream) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- setOutputType(TesseractOCRConfig.OUTPUT_TYPE) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the output type of the OCR process.
- setOutputType(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setOutputType(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setPageSegMode(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set tesseract page segmentation mode.
- setPageSegMode(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setPageSeparator(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
The page separator to use in plain text output.
- setPDFParserConfig(PDFParserConfig) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setPersonAndEmail(String, Property, Property, Metadata) - Static method in class org.apache.tika.parser.mail.MailUtil
-
This tries to split a "from" or "to" value into a person field and an email field.
- setPreserveInterwordSpacing(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Whether or not to maintain interword spacing.
- setPreserveInterwordSpacing(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setPrettyPrint(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Enables the formatted output for serializer.
- setR0(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setR1(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setR2(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setRecogniser(String) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- setResetInterval(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets a reset interval
- setResetTableIndex(int) - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Sets reset table index
- setResize(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setResize(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setRight(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- setSeparatorChar(char) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the separator character used for annotation properties.
- setSerialize(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Enables CAS serialization.
- setSerializerType(CTAKESSerializer) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the type of cTAKES (UIMA) serializer used to write CAS.
- setSetKCMS(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Whether to call System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider").
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets itsf header signature
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets itsp signature
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets a signature of control data block
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Sets pmgi signature
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setSize(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets a size of control data
- setSortByPosition(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
If true, sort text tokens by their x/y position
before extracting text.
- setSortByPosition(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, sort text tokens by their x/y position
before extracting text.
- setSpacingTolerance(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
See PDFTextStripper.setSpacingTolerance(float)
- setStartIndex(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setStream_uuid(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets stream uuid
- setStrike(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- setStringsPath(String) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the "strings" installation folder.
- setStripMarkup(boolean) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
Whether or not to attempt to strip html-ish markup
from the stream before sending it to the underlying
detector.
- setStyleID(String) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- setSuppressDuplicateOverlappingText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
If true, the parser should try to remove duplicated
text over the same region.
- setSuppressDuplicateOverlappingText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, the parser should try to remove duplicated
text over the same region.
- setSwath(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- setSystem_uuid(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets system uuid
- setTableOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets a table offset
- setTessdataPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the path to the 'tessdata' folder, which contains language files and config files.
- setTessdataPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setTesseractPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the path to the Tesseract executable's directory, needed if it is not on system path.
- setTesseractPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setText(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Enables content text analysis using cTAKES.
- setText(byte[]) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Set the input text (byte) data whose charset is to be detected.
- setText(InputStream) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Set the input text (byte) data whose charset is to be detected.
- setTimeout(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the maximum time (in seconds) to wait for the OCR process to terminate.
- setTimeout(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setTimeout(int) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the maximum time (in seconds) to wait for the "strings" command to
terminate.
- setTotal(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- setTracking(boolean) - Method in class org.apache.tika.parser.mbox.MboxParser
-
- setTrustedPageSeparator(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setType(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
- setUMLSPass(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the UMLS password.
- setUMLSUser(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the UMLS username.
- setUncompressedLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets uncompressed length
- setUnderline(String) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- setUnknown(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets an unknown
- setUnknown0008(long) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setUnknown_000c(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets unknown_000c
- setUnknown_000c(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets the 000c unknown bytes. "Unknown" here means that those who
reverse-engineered the chm format do not know the purpose of these bytes.
- setUnknown_0024(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets 0024 unknown bytes
- setUnknown_002c(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets 002c unknown bytes
- setUnknown_0044(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets 0044 unknown bytes
- setUnknown_18(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets unknown 18 bytes
- setUnknownLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets unknown length
- setUnknownOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets unknown offset
- setUseSAXDocxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setUseSAXDocxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Use the experimental SAX-based streaming DOCX parser?
If set to false, the classic parser will be used; if true,
the new experimental parser will be used.
- setUseSAXPptxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setUseSAXPptxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Use the experimental SAX-based streaming PPTX parser?
If set to false, the classic parser will be used; if true,
the new experimental parser will be used.
- setUtf16PropertiesToPrint(Set<OneNotePropertyEnum>) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
-
Print file node data in UTF-16 format when they match these props.
- setVersion(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets itsf version
- setVersion(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets a version of itsp header
- setVersion(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets version of control data block
- setVersion(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets the version
- setWindow(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setWindowPosition(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setWindowSize(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets a window size
- setWindowSize(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setWindowsPerReset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets windows per reset
- sheetParts - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- SheetTextAsHTML(OfficeParserConfig, XHTMLContentHandler) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
-
- shouldAcceptBox(Box) - Method in class org.apache.tika.parser.mp4.TikaMp4BoxHandler
-
- shouldAcceptContainer(Box) - Method in class org.apache.tika.parser.mp4.TikaMp4BoxHandler
-
- signature - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
-
- SIGNATURE_RELATIONSHIP - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
-
- signatureData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
-
Gets or sets a binary item as specified in [MS-FSSHTTPB] section 2.2.1.3 that specifies a
value that is unique to the file data represented by this root node object.
- SignatureObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Signature Object
- SignatureObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
-
Initializes a new instance of the SignatureObject class.
- SimpleChunking - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
-
- SimpleChunking(byte[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.SimpleChunking
-
Initializes a new instance of the SimpleChunking class
- skippedEntity(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- SLDWORKS - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
SolidWorks CAD file
- SourceCodeParser - Class in org.apache.tika.parser.code
-
Generic source code parser for Java, Groovy, and C++.
- SourceCodeParser() - Constructor for class org.apache.tika.parser.code.SourceCodeParser
-
- SourceCodeParser(EncodingDetector) - Constructor for class org.apache.tika.parser.code.SourceCodeParser
-
- SpreadsheetMLParser - Class in org.apache.tika.parser.microsoft.xml
-
Parses Excel files in the 2003 SpreadsheetML format.
- SpreadsheetMLParser() - Constructor for class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
-
- SQLite3Parser - Class in org.apache.tika.parser.jdbc
-
This is the main class for parsing SQLite3 files.
- SQLite3Parser() - Constructor for class org.apache.tika.parser.jdbc.SQLite3Parser
-
Checks to see if class is available for org.sqlite.JDBC.
- StandardHtmlEncodingDetector - Class in org.apache.tika.parser.html.charsetdetector
-
An encoding detector that tries to respect the spirit of the HTML spec
part 12.2.3 "The input byte stream", or at least the part that is compatible with
the implementation of Tika.
- StandardHtmlEncodingDetector() - Constructor for class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
-
- start(BundleContext) - Method in class org.apache.tika.parser.internal.Activator
-
- START_PMGL - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- startBookmark(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startBookmark(String, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startDocument() - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- startDocument() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- startDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- startDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- startDocument() - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
-
- startEditedSection(String, Date, OOXMLWordAndPowerPointTextHandler.EditType) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startEditedSection(String, Date, OOXMLWordAndPowerPointTextHandler.EditType) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.mif.MIFContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.AttributeMetadataHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- startParagraph(ParagraphProperties) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startParagraph(ParagraphProperties) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
-
- startRow(int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
-
- startSDT() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startSDT() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startsWith(byte[], String) - Static method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
- startTable() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startTable() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startTableCell() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startTableCell() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startTableRow() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startTableRow() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- stop(BundleContext) - Method in class org.apache.tika.parser.internal.Activator
-
- storageIndex - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
-
- StorageIndexCellMapping - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies the storage index cell mappings (with cell identifier, cell mapping extended GUID,
and cell mapping serial number)
- StorageIndexCellMapping() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
-
Initializes a new instance of the StorageIndexCellMapping class.
- storageIndexCellMappingList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
-
- StorageIndexDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
- StorageIndexDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
-
Initializes a new instance of the StorageIndexDataElementData class.
- storageIndexExtendedGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
-
- storageIndexManifestMapping - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
-
- StorageIndexManifestMapping - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
- StorageIndexManifestMapping() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
-
Initializes a new instance of the StorageIndexManifestMapping class.
- StorageIndexRevisionMapping - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies the storage index revision mappings (with revision and revision mapping
extended GUIDs, and revision mapping serial number)
- StorageIndexRevisionMapping() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
-
Initializes a new instance of the StorageIndexRevisionMapping class.
- storageIndexRevisionMappingList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
-
- storageManifest - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
-
- StorageManifestDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
- StorageManifestDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
-
Initializes a new instance of the StorageManifestDataElementData class.
- StorageManifestRootDeclare - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies one or more storage manifest root declarations.
- StorageManifestRootDeclare() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
-
Initializes a new instance of the StorageManifestRootDeclare class.
- storageManifestRootDeclareList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
-
- storageManifestSchemaGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
-
- StorageManifestSchemaGUID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies a storage manifest schema GUID
- StorageManifestSchemaGUID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
-
Initializes a new instance of the StorageManifestSchemaGUID class.
- STREAM_OBJECT_HEADER_START_16_BIT - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
-
Specifies the 16-bit stream object header start.
- STREAM_OBJECT_HEADER_START_32_BIT - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
-
Specifies the 32-bit stream object header start.
- StreamingZipContainerDetector - Class in org.apache.tika.parser.pkg
-
- StreamingZipContainerDetector(int) - Constructor for class org.apache.tika.parser.pkg.StreamingZipContainerDetector
-
- StreamObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
- StreamObject(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Initializes a new instance of the StreamObject class.
- StreamObjectHeaderEnd - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
- StreamObjectHeaderEnd() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd
-
- StreamObjectHeaderEnd16bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
A 16-bit header for a compound object that indicates the end of a stream object
- StreamObjectHeaderEnd16bit(int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
-
Initializes a new instance of the StreamObjectHeaderEnd16bit class with the specified type value.
- StreamObjectHeaderEnd16bit(StreamObjectTypeHeaderEnd) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
-
Initializes a new instance of the StreamObjectHeaderEnd16bit class with the specified type value.
- StreamObjectHeaderEnd16bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
-
Initializes a new instance of the StreamObjectHeaderEnd16bit class, this is the default constructor.
- StreamObjectHeaderEnd8bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
An 8-bit header for a compound object that indicates the end of a stream object
- StreamObjectHeaderEnd8bit(int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
-
Initializes a new instance of the StreamObjectHeaderEnd8bit class with the specified type value.
- StreamObjectHeaderEnd8bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
-
Initializes a new instance of the StreamObjectHeaderEnd8bit class, this is the default constructor.
- StreamObjectHeaderEnd8bit(StreamObjectTypeHeaderEnd) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
-
Initializes a new instance of the StreamObjectHeaderEnd8bit class with the specified type value.
- StreamObjectHeaderStart - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Base class for the 16-bit and 32-bit stream object header start
- StreamObjectHeaderStart() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
-
Initializes a new instance of the StreamObjectHeaderStart class.
- StreamObjectHeaderStart(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
-
Initializes a new instance of the StreamObjectHeaderStart class with specified header type.
- StreamObjectHeaderStart16bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
A 16-bit header for a compound object that indicates the start of a stream object
- StreamObjectHeaderStart16bit(StreamObjectTypeHeaderStart, int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
-
Initializes a new instance of the StreamObjectHeaderStart16bit class with specified type and length.
- StreamObjectHeaderStart16bit(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
-
Initializes a new instance of the StreamObjectHeaderStart16bit class with specified type.
- StreamObjectHeaderStart16bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
-
Initializes a new instance of the StreamObjectHeaderStart16bit class, this is the default constructor.
- StreamObjectHeaderStart32bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
A 32-bit header for a compound object that indicates the start of a stream object
- StreamObjectHeaderStart32bit(StreamObjectTypeHeaderStart, int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
-
Initializes a new instance of the StreamObjectHeaderStart32bit class with specified type and length.
- StreamObjectHeaderStart32bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
-
Initializes a new instance of the StreamObjectHeaderStart32bit class, this is the default constructor.
- StreamObjectHeaderStart32bit(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
-
Initializes a new instance of the StreamObjectHeaderStart32bit class with specified type.
- StreamObjectParseErrorException - Exception in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
- StreamObjectParseErrorException(int, String, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectParseErrorException
-
Initializes a new instance of the StreamObjectParseErrorException class
- StreamObjectParseErrorException(int, String, String, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectParseErrorException
-
Initializes a new instance of the StreamObjectParseErrorException class
- StreamObjectTypeHeaderEnd - Enum in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
- StreamObjectTypeHeaderStart - Enum in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
The enumeration of the stream object type header start
- streamObjectTypeName - Variable in exception org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectParseErrorException
-
- StringsConfig - Class in org.apache.tika.parser.strings
-
Configuration for the "strings" (or strings-alternative) command.
- StringsConfig() - Constructor for class org.apache.tika.parser.strings.StringsConfig
-
Default constructor.
- StringsConfig(InputStream) - Constructor for class org.apache.tika.parser.strings.StringsConfig
-
Loads properties from InputStream and then tries to close InputStream.
- StringsEncoding - Enum in org.apache.tika.parser.strings
-
Character encoding of the strings that are to be found using the "strings" command.
- StringsParser - Class in org.apache.tika.parser.strings
-
Parser that uses the "strings" (or strings-alternative) command to find the
printable strings in an object file or other binary file
(application/octet-stream).
- StringsParser() - Constructor for class org.apache.tika.parser.strings.StringsParser
-
- stringToAsciiBytes(String) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- subtract(UByte) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
- subtract(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
- subtract(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
- subtract(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
- subtract(ULong) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
- subtract(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
- subtract(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
- subtract(UShort) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
- subtract(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
- SUMMARY_PROPERTY_PREFIX - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
-
- SummaryExtractor - Class in org.apache.tika.parser.microsoft
-
Extractor for common OLE2 (HPSF) metadata
- SummaryExtractor(Metadata) - Constructor for class org.apache.tika.parser.microsoft.SummaryExtractor
-
- SUPPORTED_TYPES - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
-
- SUPPORTED_TYPES - Static variable in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
-
- SXSLFPowerPointExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
-
SAX/streaming pptx extractor
- SXSLFPowerPointExtractorDecorator(Metadata, ParseContext, XSLFEventBasedPowerPointExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
-
- SXWPFWordExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
-
This is an experimental, alternative extractor for docx files.
- SXWPFWordExtractorDecorator(Metadata, ParseContext, XWPFEventBasedWordExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
-
- SYS_PROP_NER_IMPL - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- value - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
- value - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
-
- value - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
-
- valueOf(String) - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.EntryType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.IntelState
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.LzxState
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.FormattingUtils.Tag
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.Error
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingMethod
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
Get an instance of an unsigned byte
- valueOf(byte) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
Get an instance of an unsigned byte by masking it with
0xFF i.e.
- valueOf(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
Get an instance of an unsigned byte
- valueOf(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
Get an instance of an unsigned byte
- valueOf(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
Get an instance of an unsigned byte
- valueOf(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
Create an unsigned int
- valueOf(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
Create an unsigned int by masking it with
0xFFFFFFFF i.e.
- valueOf(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
Create an unsigned int
- valueOf(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
Create an unsigned long
- valueOf(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
Create an unsigned long by masking it with
0xFFFFFFFFFFFFFFFF i.e.
- valueOf(BigInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
Create an unsigned long
- valueOf(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
Create an unsigned short
- valueOf(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
Create an unsigned short by masking it with
0xFFFF i.e.
- valueOf(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
Create an unsigned short
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.EditType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.ocr.TesseractOCRConfig.OUTPUT_TYPE
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.pdf.PDFParserConfig.OCR_STRATEGY
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.strings.StringsEncoding
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.utils.CommonsDigester.DigestAlgorithm
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.EntryType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.IntelState
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.LzxState
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.FormattingUtils.Tag
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.onenote.Error
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingMethod
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.EditType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.ocr.TesseractOCRConfig.OUTPUT_TYPE
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.pdf.PDFParserConfig.OCR_STRATEGY
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.strings.StringsEncoding
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.utils.CommonsDigester.DigestAlgorithm
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- VERBATIM - Static variable in class org.apache.tika.parser.chm.core.ChmCommons
-
- VSD - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Microsoft Visio