Class BaseAppenderatorDriver

java.lang.Object
org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver
All Implemented Interfaces:
Closeable, AutoCloseable
Direct Known Subclasses:
BatchAppenderatorDriver, StreamAppenderatorDriver

public abstract class BaseAppenderatorDriver extends Object implements Closeable
A BaseAppenderatorDriver drives an Appenderator to index a finite stream of data. This class does not help you index unbounded streams. All handoff is done at the end of indexing.

This class helps with things that Appenderators don't do, including deciding which segments to use (with a SegmentAllocator) and publishing segments to the metadata store (with a SegmentPublisher).

This class has two child classes, BatchAppenderatorDriver and StreamAppenderatorDriver, which are for batch and streaming ingestion, respectively. It provides fundamental methods that the child classes build on, such as pushInBackground(WrappedCommitter, Collection<SegmentIdWithShardSpec>, boolean), dropInBackground(SegmentsAndCommitMetadata), and publishInBackground(Set<DataSegment>, Set<DataSegment>, SegmentsAndCommitMetadata, TransactionalSegmentPublisher, Function<Set<DataSegment>, Set<DataSegment>>).

Note that the commit metadata stored by this class via the underlying Appenderator is not the same metadata as you pass in. It's wrapped in some extra metadata needed by the driver.
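
The wrapping described above can be pictured with a minimal sketch. It is not Druid's implementation: WrappedMetadataSketch, Envelope, wrap, and unwrap are hypothetical names, and the shape of the driver's extra metadata (a simple string map here) is an assumption made only for illustration. The point is that what gets persisted is an envelope around the caller's metadata, and that restoring it hands back only the caller's part.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the wrapping behaviour; these are not Druid classes.
final class WrappedMetadataSketch {

    // Hypothetical envelope: driver bookkeeping plus the caller's own metadata.
    static final class Envelope {
        final Map<String, String> driverState; // assumed shape, for illustration only
        final Object callerMetadata;

        Envelope(Map<String, String> driverState, Object callerMetadata) {
            this.driverState = driverState;
            this.callerMetadata = callerMetadata;
        }
    }

    // What a driver would persist: the caller's metadata wrapped with its own state.
    static Envelope wrap(Object callerMetadata, Map<String, String> driverState) {
        return new Envelope(new LinkedHashMap<>(driverState), callerMetadata);
    }

    // What a restore step would hand back: only the caller's metadata, or null
    // if nothing has been persisted yet.
    static Object unwrap(Object persisted) {
        return persisted == null ? null : ((Envelope) persisted).callerMetadata;
    }
}
```

In this sketch, wrap("offset=42", state) produces an object that is not itself the string "offset=42", while unwrap on that object returns "offset=42" again.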

  • Field Details

  • Method Details

    • getSegments

    • startJob

      @Nullable public abstract Object startJob(AppenderatorDriverSegmentLockHelper lockHelper)
      Perform any initial setup and return currently persisted commit metadata.

      Note that this method returns the same metadata you've passed in with your Committers, even though this class stores extra metadata on disk.

      Returns:
      currently persisted commit metadata
    • append

      protected AppenderatorDriverAddResult append(InputRow row, String sequenceName, @Nullable com.google.common.base.Supplier<Committer> committerSupplier, boolean skipSegmentLineageCheck, boolean allowIncrementalPersists) throws IOException
      Add a row. Must not be called concurrently from multiple threads.
      Parameters:
      row - the row to add
      sequenceName - sequenceName for this row's segment
      committerSupplier - supplier of a committer associated with all data that has been added, including this row; if allowIncrementalPersists is set to false, this will not be used
      skipSegmentLineageCheck - if false, perform lineage validation using previousSegmentId for this sequence. Should be set to false if replica tasks would index events in the same order
      allowIncrementalPersists - whether to allow a persist to happen when maxRowsInMemory or the intermediate persist period threshold is hit
      Returns:
      AppenderatorDriverAddResult
      Throws:
      IOException - if there is an I/O error while allocating or writing to a segment
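
Because append must not be called concurrently, one way a caller could honour that constraint is to funnel every add through a single thread. The sketch below shows that pattern only; FakeDriver is a hypothetical stand-in with an unsynchronized counter, not the real driver, and SerializedAppendSketch and appendAll are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: serialize all appends onto one thread.
final class SerializedAppendSketch {

    // Stand-in for a driver whose append is not thread-safe.
    static final class FakeDriver {
        private int rowCount; // deliberately unsynchronized

        int append(String row) {
            rowCount++;
            return rowCount;
        }
    }

    // Submits every row to a single-threaded executor, so append calls never overlap.
    // Returns the row count reported by the last append (0 for an empty list).
    static int appendAll(List<String> rows) {
        FakeDriver driver = new FakeDriver();
        ExecutorService appendThread = Executors.newSingleThreadExecutor();
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (String row : rows) {
                Callable<Integer> task = () -> driver.append(row);
                results.add(appendThread.submit(task));
            }
            int lastCount = 0;
            for (Future<Integer> result : results) {
                lastCount = result.get();
            }
            return lastCount;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            appendThread.shutdown();
        }
    }
}
```

A single-threaded executor is only one option; confining the driver to one task thread from the start achieves the same thing without a queue.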
    • clear

      public void clear() throws InterruptedException
      Clears out all our state and also calls Appenderator.clear() on the underlying Appenderator.
      Throws:
      InterruptedException
    • close

      public void close()
      Closes this driver. Does not close the underlying Appenderator; you should do that yourself.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
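
Since close() leaves the underlying Appenderator open, the caller has to release both objects. The sketch below illustrates one way to do that with try-with-resources. FakeAppenderator and FakeDriver are hypothetical stand-ins, and making the appenderator stand-in Closeable is an assumption of the sketch rather than a statement about the real Appenderator interface. The ordering shown (driver first, then appenderator) follows from try-with-resources closing resources in reverse declaration order.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of closing a driver and its appenderator separately.
final class CloseOrderSketch {
    static final List<String> events = new ArrayList<>();

    // Stand-in for the underlying appenderator (assumed Closeable for this sketch).
    static final class FakeAppenderator implements Closeable {
        @Override
        public void close() {
            events.add("appenderator closed");
        }
    }

    // Stand-in for the driver: its close() does not touch the appenderator.
    static final class FakeDriver implements Closeable {
        private final FakeAppenderator appenderator;

        FakeDriver(FakeAppenderator appenderator) {
            this.appenderator = appenderator;
        }

        @Override
        public void close() {
            events.add("driver closed");
        }
    }

    // Resources are closed in reverse declaration order: driver, then appenderator.
    static List<String> run() {
        events.clear();
        try (FakeAppenderator appenderator = new FakeAppenderator();
             FakeDriver driver = new FakeDriver(appenderator)) {
            events.add("indexing");
        }
        return new ArrayList<>(events);
    }
}
```

If only the driver were declared in the try header, the "appenderator closed" event would never be recorded, which is the situation the note on close() warns about.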