Class CSVPipesIterator
- java.lang.Object
  - org.apache.tika.config.ConfigBase
    - org.apache.tika.pipes.pipesiterator.PipesIterator
      - org.apache.tika.pipes.pipesiterator.csv.CSVPipesIterator
- All Implemented Interfaces:
Iterable<org.apache.tika.pipes.FetchEmitTuple>, Callable<Integer>, org.apache.tika.config.Initializable
public class CSVPipesIterator extends org.apache.tika.pipes.pipesiterator.PipesIterator implements org.apache.tika.config.Initializable
Iterates through a UTF-8 CSV file. This adds all columns (except for the 'idColumn', 'fetchKeyColumn' and 'emitKeyColumn', if specified) to the metadata object.
- If an 'idColumn' is specified, this will use that column's value as the id.
- If no 'idColumn' is specified, but a 'fetchKeyColumn' is specified, the string in the 'fetchKeyColumn' will be used as the 'id'.
- The 'idColumn' value is not added to the metadata.
- If a 'fetchKeyColumn' is specified, this will use that column's value as the fetchKey.
- If no 'fetchKeyColumn' is specified, this will send the metadata from the other columns.
- The 'fetchKeyColumn' value is not added to the metadata.
- If an 'emitKeyColumn' is specified, this will use that column's value as the emit key.
- If an 'emitKeyColumn' is not specified, this will use the value from the 'fetchKeyColumn'.
- The 'emitKeyColumn' value is not added to the metadata.
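The column rules above can be sketched in plain Java. This is a minimal illustration of the documented fallbacks only, not the class's actual implementation: `CsvRowMappingSketch`, `ResolvedRow` and `resolve` are hypothetical names, and the row is assumed to have already been parsed into a header-to-value map. What happens when neither an 'idColumn' nor a 'fetchKeyColumn' is configured is not stated above; the sketch simply leaves those values null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CsvRowMappingSketch {

    /** Hypothetical holder for the values derived from one CSV row. */
    static final class ResolvedRow {
        final String id;
        final String fetchKey;
        final String emitKey;
        final Map<String, String> metadata;

        ResolvedRow(String id, String fetchKey, String emitKey, Map<String, String> metadata) {
            this.id = id;
            this.fetchKey = fetchKey;
            this.emitKey = emitKey;
            this.metadata = metadata;
        }
    }

    /**
     * Applies the documented rules to one row (header name -> cell value).
     * Any of the three column names may be null, meaning "not specified".
     */
    static ResolvedRow resolve(Map<String, String> row, String idColumn,
                               String fetchKeyColumn, String emitKeyColumn) {
        // fetchKey comes from the fetchKeyColumn, if one is configured.
        String fetchKey = fetchKeyColumn == null ? null : row.get(fetchKeyColumn);
        // id falls back to the fetchKey when no idColumn is configured.
        String id = idColumn != null ? row.get(idColumn) : fetchKey;
        // emitKey falls back to the fetchKey when no emitKeyColumn is configured.
        String emitKey = emitKeyColumn != null ? row.get(emitKeyColumn) : fetchKey;

        // Every remaining column becomes metadata; the three key columns are skipped.
        Map<String, String> metadata = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : row.entrySet()) {
            String column = cell.getKey();
            if (column.equals(idColumn) || column.equals(fetchKeyColumn)
                    || column.equals(emitKeyColumn)) {
                continue;
            }
            metadata.put(column, cell.getValue());
        }
        return new ResolvedRow(id, fetchKey, emitKey, metadata);
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("path", "docs/a.pdf");
        row.put("title", "Report");
        row.put("author", "Kim");

        // Only a fetchKeyColumn is configured, so id and emitKey both fall back to it.
        ResolvedRow resolved = resolve(row, null, "path", null);
        System.out.println("id=" + resolved.id);
        System.out.println("fetchKey=" + resolved.fetchKey);
        System.out.println("emitKey=" + resolved.emitKey);
        System.out.println("metadata=" + resolved.metadata);
    }
}
```

In the sample row only "path" is configured as the 'fetchKeyColumn', so the same value serves as id, fetchKey and emit key, while "title" and "author" end up in the metadata.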
-
Constructor Summary
Constructors
- CSVPipesIterator()
-
Method Summary
- void checkInitialization(org.apache.tika.config.InitializableProblemHandler problemHandler)
- protected void enqueue()
- void setCsvPath(String csvPath)
- void setCsvPath(Path csvPath)
- void setEmitKeyColumn(String emitKeyColumn)
- void setFetchKeyColumn(String fetchKeyColumn)
- void setIdColumn(String idColumn)
-
Methods inherited from class org.apache.tika.pipes.pipesiterator.PipesIterator
build, call, getEmitterName, getFetcherName, getHandlerConfig, getOnParseException, initialize, iterator, setEmitterName, setFetcherName, setHandlerType, setMaxEmbeddedResources, setMaxWaitMs, setOnParseException, setOnParseException, setParseMode, setParseMode, setQueueSize, setThrowOnWriteLimitReached, setWriteLimit, tryToAdd
-
Methods inherited from class org.apache.tika.config.ConfigBase
buildComposite, buildComposite, buildSingle, buildSingle, configure, handleSettings
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Method Detail
-
setCsvPath
@Field public void setCsvPath(String csvPath)
-
setFetchKeyColumn
@Field public void setFetchKeyColumn(String fetchKeyColumn)
-
setEmitKeyColumn
@Field public void setEmitKeyColumn(String emitKeyColumn)
-
setIdColumn
@Field public void setIdColumn(String idColumn)
-
setCsvPath
@Field public void setCsvPath(Path csvPath)
-
enqueue
protected void enqueue() throws InterruptedException, IOException, TimeoutException
- Specified by: enqueue in class org.apache.tika.pipes.pipesiterator.PipesIterator
- Throws: InterruptedException, IOException, TimeoutException
-
checkInitialization
public void checkInitialization(org.apache.tika.config.InitializableProblemHandler problemHandler) throws org.apache.tika.exception.TikaConfigException
- Specified by: checkInitialization in interface org.apache.tika.config.Initializable
- Overrides: checkInitialization in class org.apache.tika.pipes.pipesiterator.PipesIterator
- Throws: org.apache.tika.exception.TikaConfigException
-