Class CreateHadoopSequenceFile

java.lang.Object
org.apache.nifi.components.AbstractConfigurableComponent
org.apache.nifi.processor.AbstractSessionFactoryProcessor
org.apache.nifi.processor.AbstractProcessor
org.apache.nifi.processors.hadoop.AbstractHadoopProcessor
org.apache.nifi.processors.hadoop.CreateHadoopSequenceFile
All Implemented Interfaces:
org.apache.nifi.components.ClassloaderIsolationKeyProvider, org.apache.nifi.components.ConfigurableComponent, org.apache.nifi.processor.Processor

@DeprecationNotice(reason="NIFI-14846: Uses custom file format specific to Apache NiFi and minimal maintenance since initial implementation")
@SideEffectFree
@InputRequirement(INPUT_REQUIRED)
@Tags({"hadoop","sequence file","create","sequencefile"})
@CapabilityDescription("Creates Hadoop Sequence Files from incoming flow files")
@SeeAlso(PutHDFS.class)
public class CreateHadoopSequenceFile extends AbstractHadoopProcessor

This processor creates a Hadoop SequenceFile, which is essentially a file of key/value pairs. Each key is a file name and each value is the corresponding flow file content. The processor accepts either a merged (a.k.a. packaged) flow file or a single flow file. Historically, this processor merged flow files by type and by size or time before creating the SequenceFile output; it no longer does this. To create a SequenceFile that contains multiple files of the same type, precede this processor with a RouteOnAttribute processor to segregate files by type, followed by a MergeContent processor to bundle them. If the type of the files is not important, use the MergeContent processor alone. When using the MergeContent processor, this processor supports the following Merge Formats:

  • TAR
  • ZIP
  • FlowFileStream v3

The created SequenceFile has the same name as the incoming FlowFile, with the suffix '.sf' appended. For incoming FlowFiles that are bundled, the keys in the SequenceFile are the individual file names and the values are the contents of each file.
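
The key/value layout described above can be illustrated with a minimal sketch that uses only the JDK. It is not the processor's implementation and writes no actual SequenceFile: the class name `SequenceFileKeyValueSketch` and the helpers `sequenceFileName`, `pairsFromZip`, and `sampleZip` are hypothetical names introduced for illustration. It only shows how a ZIP bundle maps to one key (entry name) and one value (entry content) per file, and how the '.sf' suffix is applied. Skipping directory entries is an assumption of the sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Illustrative only: models the key/value pairing, not the Hadoop file format.
class SequenceFileKeyValueSketch {

    // Hypothetical helper: the output name is the incoming name plus ".sf".
    static String sequenceFileName(String flowFileName) {
        return flowFileName + ".sf";
    }

    // Hypothetical helper: one pair per ZIP entry, key = entry name,
    // value = entry content. Each value is read fully into memory.
    static Map<String, byte[]> pairsFromZip(InputStream in) {
        Map<String, byte[]> pairs = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // assumption: directories carry no content
                }
                // readAllBytes() stops at the end of the current entry.
                pairs.put(entry.getName(), zip.readAllBytes());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return pairs;
    }

    // Builds a small in-memory ZIP bundle with two files for the demo.
    static byte[] sampleZip() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write("alpha".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("b.txt"));
            zip.write("bravo!".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, byte[]> pairs = pairsFromZip(new ByteArrayInputStream(sampleZip()));
        pairs.forEach((key, value) ->
                System.out.println(key + " -> " + value.length + " bytes"));
        System.out.println(sequenceFileName("bundle.zip"));
    }
}
```

For a single, unpackaged flow file the same idea would yield exactly one pair, keyed by the flow file's own name.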

NOTE: The value portion of a key/value pair is loaded into memory. Although there is a maximum size limit of 2GB, memory issues can still occur when many concurrent tasks are configured and the flow files are large.
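
As a rough way to reason about the note above, the following sketch estimates a worst case in which every concurrent task holds one value in memory at the same time. The class `ValueMemoryEstimateSketch` and its methods are hypothetical, the 2GB limit is interpreted here as 2 GiB per value, and real heap usage depends on many factors this estimate ignores.

```java
// Illustrative arithmetic only; not part of the NiFi API.
class ValueMemoryEstimateSketch {

    // Assumption: the documented 2GB limit is treated as 2 GiB per value.
    static final long MAX_VALUE_BYTES = 2L * 1024 * 1024 * 1024;

    // Worst case: each concurrent task buffers its largest value at once.
    static long worstCaseValueBytes(int concurrentTasks, long largestValueBytes) {
        if (concurrentTasks < 0 || largestValueBytes < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        long perTask = Math.min(largestValueBytes, MAX_VALUE_BYTES);
        return concurrentTasks * perTask;
    }

    // True when the worst-case estimate does not fit in the given heap size.
    static boolean exceedsHeap(int concurrentTasks, long largestValueBytes, long heapBytes) {
        return worstCaseValueBytes(concurrentTasks, largestValueBytes) > heapBytes;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        // Four tasks, each holding a 512 MiB value, against a 1 GiB heap.
        System.out.println(worstCaseValueBytes(4, 512 * mib));
        System.out.println(exceedsHeap(4, 512 * mib, 1024 * mib));
    }
}
```

Lowering the number of concurrent tasks or limiting bundle entry sizes upstream reduces this estimate proportionally.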
  • Field Details

    • TAR_FORMAT

      public static final String TAR_FORMAT
    • ZIP_FORMAT

      public static final String ZIP_FORMAT
    • FLOWFILE_STREAM_FORMAT_V3

      public static final String FLOWFILE_STREAM_FORMAT_V3
    • NOT_PACKAGED

      private static final String NOT_PACKAGED
    • RELATIONSHIP_SUCCESS

      public static final org.apache.nifi.processor.Relationship RELATIONSHIP_SUCCESS
    • RELATIONSHIP_FAILURE

      public static final org.apache.nifi.processor.Relationship RELATIONSHIP_FAILURE
    • COMPRESSION_TYPE

      static final org.apache.nifi.components.PropertyDescriptor COMPRESSION_TYPE
    • DEFAULT_COMPRESSION_TYPE

      public static final String DEFAULT_COMPRESSION_TYPE
    • RELATIONSHIPS

      private static final Set<org.apache.nifi.processor.Relationship> RELATIONSHIPS
    • PROPERTY_DESCRIPTORS

      private static final List<org.apache.nifi.components.PropertyDescriptor> PROPERTY_DESCRIPTORS
  • Constructor Details

    • CreateHadoopSequenceFile

      public CreateHadoopSequenceFile()
  • Method Details

    • getRelationships

      public Set<org.apache.nifi.processor.Relationship> getRelationships()
      Specified by:
      getRelationships in interface org.apache.nifi.processor.Processor
      Overrides:
      getRelationships in class org.apache.nifi.processor.AbstractSessionFactoryProcessor
    • getSupportedPropertyDescriptors

      public List<org.apache.nifi.components.PropertyDescriptor> getSupportedPropertyDescriptors()
      Overrides:
      getSupportedPropertyDescriptors in class AbstractHadoopProcessor
    • onTrigger

      public void onTrigger(org.apache.nifi.processor.ProcessContext context, org.apache.nifi.processor.ProcessSession session) throws org.apache.nifi.processor.exception.ProcessException
      Specified by:
      onTrigger in class org.apache.nifi.processor.AbstractProcessor
      Throws:
      org.apache.nifi.processor.exception.ProcessException