public class NamedEntityParser
extends org.apache.tika.parser.AbstractParser
Parser that extracts entity names from text content and adds them to the metadata.
All the metadata keys have the common prefix "NER_" (see MD_KEY_PREFIX).
The named entity recogniser implementation can be changed by setting the
system property "ner.impl.class" to the name of a class that
implements the NERecogniser contract.
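
The property-based selection described above can be illustrated with a minimal, standalone sketch. It is not Tika's actual implementation: `SketchRecogniser`, `UpperCaseRecogniser` and `NerImplLookupSketch` are hypothetical names introduced only for illustration, and the reflection-based loading is an assumption about how such a lookup is commonly written. Only the property name "ner.impl.class" comes from this page.

```java
// Hypothetical stand-in for the NERecogniser contract; the real Tika interface is not reproduced here.
interface SketchRecogniser {
    String name();
}

// Hypothetical implementation, used only so the sketch has something to load.
class UpperCaseRecogniser implements SketchRecogniser {
    @Override
    public String name() {
        return "upper-case";
    }
}

class NerImplLookupSketch {
    // Property name taken from the class description.
    static final String SYS_PROP_NER_IMPL = "ner.impl.class";

    // Resolves the class named by the system property, falling back to
    // defaultImpl when the property is unset, and instantiates it reflectively.
    static SketchRecogniser load(String defaultImpl) {
        String className = System.getProperty(SYS_PROP_NER_IMPL, defaultImpl);
        try {
            Class<?> clazz = Class.forName(className);
            if (!SketchRecogniser.class.isAssignableFrom(clazz)) {
                throw new IllegalArgumentException(className + " does not implement SketchRecogniser");
            }
            return (SketchRecogniser) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }
}
```

With a lookup of this kind, the implementation could be switched without code changes by passing the property on the JVM command line, for example `-Dner.impl.class=com.example.MyRecogniser` (a hypothetical class name).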

See Also: OpenNLPNERecogniser, NERecogniser, Serialized Form

| Modifier and Type | Field and Description |
|---|---|
| static String | DEFAULT_NER_IMPL |
| static org.slf4j.Logger | LOG |
| static String | MD_KEY_PREFIX |
| static Set<org.apache.tika.mime.MediaType> | MEDIA_TYPES |
| org.apache.tika.Tika | secondaryParser |
| static String | SYS_PROP_NER_IMPL |

| Constructor and Description |
|---|
| NamedEntityParser() |

| Modifier and Type | Method and Description |
|---|---|
| Set<org.apache.tika.mime.MediaType> | getSupportedTypes(org.apache.tika.parser.ParseContext parseContext) |
| void | parse(InputStream inputStream, ContentHandler contentHandler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext parseContext) |

public static final org.slf4j.Logger LOG
public static final Set<org.apache.tika.mime.MediaType> MEDIA_TYPES
public static final String MD_KEY_PREFIX
public static final String DEFAULT_NER_IMPL
public static final String SYS_PROP_NER_IMPL
public org.apache.tika.Tika secondaryParser
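
Taken together, MD_KEY_PREFIX and the class description suggest that extracted names end up under metadata keys of the form `NER_<type>`. The following standalone sketch shows that naming scheme using plain Java collections. `NerMetadataKeySketch` and `toMetadata` are hypothetical names, the entity type "PERSON" and the sample name are made-up inputs, and storing values as a sorted set is an assumption, not a statement about how Tika's Metadata class holds them.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class NerMetadataKeySketch {
    // Assumed value, based on the "NER_" prefix named in the class description.
    static final String MD_KEY_PREFIX = "NER_";

    // Maps each entity type to a prefixed metadata key and copies its names
    // into a sorted set, preserving the order in which types were supplied.
    static Map<String, Set<String>> toMetadata(Map<String, Set<String>> entitiesByType) {
        Map<String, Set<String>> metadata = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : entitiesByType.entrySet()) {
            metadata.put(MD_KEY_PREFIX + entry.getKey(), new TreeSet<>(entry.getValue()));
        }
        return metadata;
    }
}
```

A caller reading results back would then filter metadata names by the shared prefix to find everything the recogniser contributed.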
public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext parseContext)
public void parse(InputStream inputStream, ContentHandler contentHandler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext parseContext) throws IOException, SAXException, org.apache.tika.exception.TikaException
Throws: IOException, SAXException, org.apache.tika.exception.TikaException

Copyright © 2007–2025 The Apache Software Foundation. All rights reserved.