Class JDBCEmitter

  • All Implemented Interfaces:
    Closeable, AutoCloseable, org.apache.tika.config.Initializable, org.apache.tika.pipes.emitter.Emitter

    public class JDBCEmitter
    extends org.apache.tika.pipes.emitter.AbstractEmitter
    implements org.apache.tika.config.Initializable, Closeable
    This is only an initial, basic implementation of an emitter for JDBC.

    It is currently NOT thread safe because the prepared statement is shared and, depending on the JDBC implementation, because the connection is shared as well.

    As of the 2.5.0 release, this is an ALPHA version. There may be breaking changes in the future.
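
    One common way to use a component that is not thread safe from several threads is to serialize access to it. The sketch below illustrates that idea with a hypothetical RowSink interface standing in for an emitter. RowSink, serialized and emitFromThreads are names invented for this illustration and are not part of Tika.

```java
import java.util.ArrayList;
import java.util.List;

final class SerializedEmitSketch {

    /** Hypothetical stand-in for an emitter that is not thread safe. */
    interface RowSink {
        void emit(String emitKey);
    }

    /** Wraps a sink so that only one thread at a time can call emit. */
    static RowSink serialized(RowSink delegate) {
        Object lock = new Object();
        return emitKey -> {
            synchronized (lock) {
                delegate.emit(emitKey);
            }
        };
    }

    /** Emits from several threads through the wrapper and returns the row count. */
    static int emitFromThreads(int threads, int perThread) {
        // ArrayList is not thread safe on its own; the wrapper guards it.
        List<String> rows = new ArrayList<>();
        RowSink sink = serialized(rows::add);

        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    sink.emit("doc-" + id + "-" + i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return rows.size();
    }
}
```

    An alternative with less contention would be to give each thread its own emitter instance. Whether that suits a given database and driver is left as an assumption to verify.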

    • Constructor Detail

      • JDBCEmitter

        public JDBCEmitter()
    • Method Detail

      • setAlterTable

        public void setAlterTable​(String alterTable)
        This SQL is executed immediately after the table is created. It allows adding a complex primary key or other constraints to the table after creation.
        Parameters:
        alterTable - the SQL statement to execute after the table is created
      • setCreateTable

        @Field
        public void setCreateTable​(String createTable)
      • setInsert

        @Field
        public void setInsert​(String insert)
      • setConnection

        @Field
        public void setConnection​(String connection)
      • setMaxStringLength

        @Field
        public void setMaxStringLength​(int maxStringLength)
        Set the maximum string length in characters (not bytes). This applies only to fields of type "string", not to "varchar".
        Parameters:
        maxStringLength - the maximum string length, in characters
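
        The difference between a length in characters and a length in bytes can be sketched as follows. The truncate helper is hypothetical and is not the emitter's own code. It assumes that "characters" means UTF-16 code units, which is what String.length() counts.

```java
import java.nio.charset.StandardCharsets;

final class StringLengthSketch {

    /**
     * Hypothetical helper, not part of JDBCEmitter: keeps at most maxChars
     * UTF-16 code units of the value. Assumes maxChars is not negative.
     */
    static String truncate(String value, int maxChars) {
        if (value == null || value.length() <= maxChars) {
            return value;
        }
        return value.substring(0, maxChars);
    }

    /** Number of bytes the value occupies when encoded as UTF-8. */
    static int utf8Bytes(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // "r\u00e9sum\u00e9" is "resume" with two accented e characters.
        String truncated = truncate("r\u00e9sum\u00e9", 4);
        System.out.println(truncated.length() + " chars, " + utf8Bytes(truncated) + " bytes");
    }
}
```

        A limit counted in characters can therefore still yield more bytes than the limit suggests when the text contains non-ASCII characters.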
      • setMaxRetries

        public void setMaxRetries​(int maxRetries)
      • setPostConnection

        @Field
        public void setPostConnection​(String postConnection)
        This SQL is executed immediately after the connection is made. It was initially added for setting pragmas on sqlite3, but may be used for other connection configuration in other databases. Note: this is executed before the table is created, if the table needs to be created.
        Parameters:
        postConnection - the SQL statement to execute after the connection is made
      • setMultivaluedFieldStrategy

        @Field
        public void setMultivaluedFieldStrategy​(String strategy)
                                         throws org.apache.tika.exception.TikaConfigException
        This applies to fields of type 'string' or 'varchar'. If a metadata object contains a multivalued field, this determines whether only the first value is used or all values are concatenated with the delimiter set via setMultivaluedFieldDelimiter(String).

        As of 2.6.1, the default strategy is JDBCEmitter.MultivaluedFieldStrategy.CONCATENATE and the default delimiter is ", ".

        Parameters:
        strategy - the name of the multivalued field strategy to use
        Throws:
        org.apache.tika.exception.TikaConfigException
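
        The two behaviours described above can be sketched as follows. This is an illustration only, not the emitter's own code. The local Strategy enum and the resolve method are hypothetical, and the constant name FIRST_ONLY is an assumption. Only CONCATENATE and the ", " delimiter are taken from the description above.

```java
import java.util.List;

final class MultivaluedSketch {

    /** Local illustration enum; FIRST_ONLY is an assumed name. */
    enum Strategy { FIRST_ONLY, CONCATENATE }

    /**
     * Hypothetical helper: reduces the values of one metadata field
     * to the single string that would be written to a column.
     */
    static String resolve(List<String> values, Strategy strategy, String delimiter) {
        if (values == null || values.isEmpty()) {
            return null;
        }
        if (strategy == Strategy.FIRST_ONLY) {
            return values.get(0);
        }
        // CONCATENATE: join all values with the configured delimiter.
        return String.join(delimiter, values);
    }
}
```

        For the values "Alice" and "Bob", FIRST_ONLY in this sketch keeps "Alice", while CONCATENATE with the delimiter ", " produces "Alice, Bob".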
      • setKeys

        @Field
        public void setKeys​(Map<String,​String> keys)
        The implementation of keys should be a LinkedHashMap because order matters!

        Each key is the name of a metadata field; each value is the type of the column: boolean, string, int, or long.

        Parameters:
        keys - ordered map of metadata field names to column types
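
        A minimal sketch of building such an ordered map is shown below. The field names are hypothetical and chosen only for illustration. A LinkedHashMap iterates in insertion order, which is what keeps the fields in a predictable sequence.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class KeysOrderSketch {

    /** Builds an example map; the metadata field names are made up. */
    static Map<String, String> exampleKeys() {
        Map<String, String> keys = new LinkedHashMap<>();
        keys.put("mime", "string");
        keys.put("page_count", "int");
        keys.put("file_size", "long");
        keys.put("encrypted", "boolean");
        return keys;
    }

    /** Returns the field names in iteration order, joined by commas. */
    static String columnOrder(Map<String, String> keys) {
        return String.join(",", keys.keySet());
    }
}
```

        A map built this way could then be passed to setKeys(Map). With a plain HashMap the iteration order is unspecified, so the order of the fields could differ from the order in which they were added.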
      • setAttachmentStrategy

        @Field
        public void setAttachmentStrategy​(String attachmentStrategy)
      • emit

        public void emit​(String emitKey,
                         List<org.apache.tika.metadata.Metadata> metadataList,
                         org.apache.tika.parser.ParseContext parseContext)
                  throws IOException,
                         org.apache.tika.pipes.emitter.TikaEmitterException
        This executes the emit on every call. For more efficient batch execution, use emit(List).
        Specified by:
        emit in interface org.apache.tika.pipes.emitter.Emitter
        Parameters:
        emitKey - emit key
        metadataList - list of metadata per file
        Throws:
        IOException
        org.apache.tika.pipes.emitter.TikaEmitterException
      • emit

        public void emit​(List<? extends org.apache.tika.pipes.emitter.EmitData> emitData)
                  throws IOException,
                         org.apache.tika.pipes.emitter.TikaEmitterException
        Specified by:
        emit in interface org.apache.tika.pipes.emitter.Emitter
        Overrides:
        emit in class org.apache.tika.pipes.emitter.AbstractEmitter
        Throws:
        IOException
        org.apache.tika.pipes.emitter.TikaEmitterException
      • initialize

        public void initialize​(Map<String,​org.apache.tika.config.Param> params)
                        throws org.apache.tika.exception.TikaConfigException
        Specified by:
        initialize in interface org.apache.tika.config.Initializable
        Throws:
        org.apache.tika.exception.TikaConfigException
      • checkInitialization

        public void checkInitialization​(org.apache.tika.config.InitializableProblemHandler problemHandler)
                                 throws org.apache.tika.exception.TikaConfigException
        Specified by:
        checkInitialization in interface org.apache.tika.config.Initializable
        Throws:
        org.apache.tika.exception.TikaConfigException