Class StreamAppenderator

java.lang.Object
org.apache.druid.segment.realtime.appenderator.StreamAppenderator
All Implemented Interfaces:
QuerySegmentWalker, Appenderator

public class StreamAppenderator extends Object implements Appenderator
  • Field Details

    • ROUGH_OVERHEAD_PER_DIMENSION_COLUMN_HOLDER

      public static final int ROUGH_OVERHEAD_PER_DIMENSION_COLUMN_HOLDER
    • ROUGH_OVERHEAD_PER_METRIC_COLUMN_HOLDER

      public static final int ROUGH_OVERHEAD_PER_METRIC_COLUMN_HOLDER
    • ROUGH_OVERHEAD_PER_TIME_COLUMN_HOLDER

      public static final int ROUGH_OVERHEAD_PER_TIME_COLUMN_HOLDER
    • ROUGH_OVERHEAD_PER_SINK

      public static final int ROUGH_OVERHEAD_PER_SINK
    • ROUGH_OVERHEAD_PER_HYDRANT

      public static final int ROUGH_OVERHEAD_PER_HYDRANT
  • Method Details

    • getId

      public String getId()
      Description copied from interface: Appenderator
      Return the identifier of this Appenderator; useful for log messages and such.
      Specified by:
      getId in interface Appenderator
    • getDataSource

      public String getDataSource()
      Description copied from interface: Appenderator
      Return the name of the dataSource associated with this Appenderator.
      Specified by:
      getDataSource in interface Appenderator
    • startJob

      public Object startJob()
      Description copied from interface: Appenderator
      Perform any initial setup. Should be called before using any other methods.
      Specified by:
      startJob in interface Appenderator
      Returns:
      currently persisted commit metadata
    • add

      public Appenderator.AppenderatorAddResult add(SegmentIdWithShardSpec identifier, InputRow row, @Nullable com.google.common.base.Supplier<Committer> committerSupplier, boolean allowIncrementalPersists) throws SegmentNotWritableException
      Description copied from interface: Appenderator
      Add a row. Must not be called concurrently from multiple threads.

      If no pending segment exists for the provided identifier, a new one will be created.

      This method may trigger an Appenderator.persistAll(Committer) using the supplied Committer. If it does this, the Committer is guaranteed to be *created* synchronously with the call to add, but will actually be used asynchronously.

      If committer is not provided, no metadata is persisted.

      Specified by:
      add in interface Appenderator
      Parameters:
      identifier - the segment into which this row should be added
      row - the row to add
      committerSupplier - supplier of a committer associated with all data that has been added, including this row. If allowIncrementalPersists is set to false, the supplier is not used, because no persist is performed automatically.
      allowIncrementalPersists - indicates whether a persist should be performed automatically when one is required. If this flag is set to false and a persist was skipped because of it, the return value has Appenderator.AppenderatorAddResult.isPersistRequired set to true, and the caller is responsible for calling Appenderator.persistAll(Committer).
      Returns:
      Appenderator.AppenderatorAddResult
      Throws:
      SegmentNotWritableException - if the requested segment is known, but has been closed
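
      The allowIncrementalPersists contract described above can be sketched with a small in-memory model. This is only an illustration: MiniAppenderator, AddResult, and the maxRowsInMemory threshold are hypothetical stand-ins written for this example, not Druid classes, and the real persist trigger may depend on more than a row count.

```java
import java.util.ArrayList;
import java.util.List;

class AddLoopSketch {
    /** Hypothetical stand-in for Appenderator.AppenderatorAddResult. */
    public static final class AddResult {
        public final boolean persistRequired;

        AddResult(boolean persistRequired) {
            this.persistRequired = persistRequired;
        }
    }

    /** Hypothetical in-memory model; not Druid's implementation. */
    public static final class MiniAppenderator {
        private final int maxRowsInMemory; // assumed persist trigger
        private final List<String> inMemory = new ArrayList<>();
        private int persistedRows = 0;

        public MiniAppenderator(int maxRowsInMemory) {
            this.maxRowsInMemory = maxRowsInMemory;
        }

        public AddResult add(String row, boolean allowIncrementalPersists) {
            inMemory.add(row);
            boolean limitReached = inMemory.size() >= maxRowsInMemory;
            if (limitReached && allowIncrementalPersists) {
                persistAll();                 // persist happens automatically
                return new AddResult(false);
            }
            // Persist was skipped because of the flag: report it to the caller.
            return new AddResult(limitReached);
        }

        public void persistAll() {
            persistedRows += inMemory.size();
            inMemory.clear();
        }

        public int inMemoryRows() { return inMemory.size(); }

        public int persistedRows() { return persistedRows; }
    }

    /** Adds rows with automatic persists disabled; returns how often the caller persisted. */
    public static int ingest(MiniAppenderator app, List<String> rows) {
        int callerPersists = 0;
        for (String row : rows) {
            AddResult result = app.add(row, false);
            if (result.persistRequired) {
                app.persistAll();             // the caller's responsibility
                callerPersists++;
            }
        }
        return callerPersists;
    }

    public static void main(String[] args) {
        MiniAppenderator app = new MiniAppenderator(2);
        int persists = ingest(app, List.of("a", "b", "c", "d", "e"));
        System.out.println(persists + " caller-driven persists, "
            + app.inMemoryRows() + " row(s) still in memory");
    }
}
```

      With automatic persists disabled, the loop that calls add is the only place where a persist can be initiated, so it must check the result after every call.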
    • getSegments

      public List<SegmentIdWithShardSpec> getSegments()
      Description copied from interface: Appenderator
      Returns a list of all currently active segments.
      Specified by:
      getSegments in interface Appenderator
    • getRowCount

      public int getRowCount(SegmentIdWithShardSpec identifier)
      Description copied from interface: Appenderator
      Returns the number of rows in a particular pending segment.
      Specified by:
      getRowCount in interface Appenderator
      Parameters:
      identifier - segment to examine
      Returns:
      row count
    • getTotalRowCount

      public int getTotalRowCount()
      Description copied from interface: Appenderator
      Returns the total number of rows, across all segments in this appenderator, that are pending push.
      Specified by:
      getTotalRowCount in interface Appenderator
      Returns:
      total number of rows
    • getQueryRunnerForIntervals

      public <T> QueryRunner<T> getQueryRunnerForIntervals(Query<T> query, Iterable<org.joda.time.Interval> intervals)
      Specified by:
      getQueryRunnerForIntervals in interface QuerySegmentWalker
    • getQueryRunnerForSegments

      public <T> QueryRunner<T> getQueryRunnerForSegments(Query<T> query, Iterable<SegmentDescriptor> specs)
      Specified by:
      getQueryRunnerForSegments in interface QuerySegmentWalker
    • clear

      public void clear() throws InterruptedException
      Description copied from interface: Appenderator
      Drop all in-memory and on-disk data, and forget any previously-remembered commit metadata. This could be useful if, for some reason, rows have been added that we do not actually want to hand off. Blocks until all data has been cleared. This may take some time, since all pending persists must finish first.
      Specified by:
      clear in interface Appenderator
      Throws:
      InterruptedException
    • drop

      public com.google.common.util.concurrent.ListenableFuture<?> drop(SegmentIdWithShardSpec identifier)
      Description copied from interface: Appenderator
      Schedule dropping all data associated with a particular pending segment. Unlike Appenderator.clear(), any on-disk commit metadata will remain unchanged. If there is no pending segment with this identifier, then this method will do nothing.

      You should not write to the dropped segment after calling "drop". If you need to drop all your data and re-write it, consider Appenderator.clear() instead. This method might be called concurrently from a thread different from the "main data appending / indexing thread", from where all other methods in this class (except those inherited from QuerySegmentWalker) are called. This typically happens when drop() is called in an async future callback. drop() itself is cheap and relays heavy dropping work to an internal executor of this Appenderator.

      Specified by:
      drop in interface Appenderator
      Parameters:
      identifier - the pending segment to drop
      Returns:
      future that resolves when data is dropped
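
      The "cheap call that relays the heavy work to an internal executor" pattern described above might look roughly like the following. It is a sketch under stated assumptions: MiniAppenderator is a hypothetical stand-in, segment identifiers are plain strings, and CompletableFuture is used in place of Guava's ListenableFuture to keep the example dependency-free.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class DropSketch {
    /** Hypothetical stand-in; not Druid's implementation. */
    public static final class MiniAppenderator {
        private final Map<String, Integer> rowCounts = new ConcurrentHashMap<>();
        // Assumed internal executor that performs the actual dropping work.
        private final ExecutorService dropExecutor = Executors.newSingleThreadExecutor();

        public void add(String segmentId) {
            rowCounts.merge(segmentId, 1, Integer::sum);
        }

        /** Returns quickly; the removal itself runs on the internal executor. */
        public CompletableFuture<Void> drop(String segmentId) {
            if (!rowCounts.containsKey(segmentId)) {
                // No pending segment with this identifier: nothing to do.
                return CompletableFuture.completedFuture(null);
            }
            return CompletableFuture.runAsync(() -> rowCounts.remove(segmentId), dropExecutor);
        }

        public boolean hasSegment(String segmentId) {
            return rowCounts.containsKey(segmentId);
        }

        public void close() {
            dropExecutor.shutdown();
        }
    }

    /** Drops one known and one unknown segment; true if only the other segment remains. */
    public static boolean demo() {
        MiniAppenderator app = new MiniAppenderator();
        try {
            app.add("seg-1");
            app.add("seg-2");
            app.drop("seg-1").join();   // resolves once the data is dropped
            app.drop("missing").join(); // unknown identifier is a no-op
            return !app.hasSegment("seg-1") && app.hasSegment("seg-2");
        } finally {
            app.close();
        }
    }

    public static void main(String[] args) {
        System.out.println("drop behaved as expected: " + demo());
    }
}
```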
    • persistAll

      public com.google.common.util.concurrent.ListenableFuture<Object> persistAll(@Nullable Committer committer)
      Description copied from interface: Appenderator
      Persist any in-memory indexed data to durable storage. This may be only somewhat durable, e.g. the machine's local disk. The Committer will be made synchronously with the call to persistAll, but will actually be used asynchronously. Any metadata returned by the committer will be associated with the data persisted to disk.

      If committer is not provided, no metadata is persisted.

      Specified by:
      persistAll in interface Appenderator
      Parameters:
      committer - a committer associated with all data that has been added so far
      Returns:
      future that resolves when all pending data has been persisted, contains commit metadata for this persist
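
      One way to picture "made synchronously, used asynchronously" is a committer that snapshots its metadata at call time, so that later ingestion progress does not leak into an earlier persist. The sketch below assumes a hypothetical MiniCommitter interface and a caller-supplied executor, and uses CompletableFuture rather than Guava's ListenableFuture; none of these names come from Druid.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

class PersistSketch {
    /** Hypothetical stand-in for a Committer: exposes a metadata snapshot. */
    public interface MiniCommitter {
        Object getMetadata();
    }

    /** Models an asynchronous persist; the committer is consulted on the executor thread. */
    public static CompletableFuture<Object> persistAll(MiniCommitter committer,
                                                       ExecutorService persistExecutor) {
        return CompletableFuture.supplyAsync(
            // If no committer is provided, no metadata is persisted.
            () -> committer == null ? null : committer.getMetadata(),
            persistExecutor);
    }

    /** Returns the metadata associated with a persist that was requested at offset 100. */
    public static Object demo() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            AtomicLong currentOffset = new AtomicLong(100L);

            // The committer is created synchronously and captures the offset now.
            long snapshot = currentOffset.get();
            MiniCommitter committer = () -> snapshot;

            CompletableFuture<Object> future = persistAll(committer, executor);

            // Ingestion moves on while the persist may still be running.
            currentOffset.set(250L);

            return future.join();
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("metadata persisted: " + demo());
    }
}
```

      Because the snapshot is taken before the asynchronous work starts, the metadata returned here reflects the offset at the time of the call, not the later value.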
    • push

      public com.google.common.util.concurrent.ListenableFuture<SegmentsAndCommitMetadata> push(Collection<SegmentIdWithShardSpec> identifiers, @Nullable Committer committer, boolean useUniquePath)
      Description copied from interface: Appenderator
      Merge and push particular segments to deep storage. This will trigger an implicit Appenderator.persistAll(Committer) using the provided Committer.

      After this method is called, you cannot add new data to any segments that were previously under construction.

      If committer is not provided, no metadata is persisted.

      Specified by:
      push in interface Appenderator
      Parameters:
      identifiers - list of segments to push
      committer - a committer associated with all data that has been added so far
      useUniquePath - true if the segment should be written to a path with a unique identifier
      Returns:
      future that resolves when all segments have been pushed. The segment list will be the list of segments that have been pushed and the commit metadata from the Committer.
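
      The two effects described above (an implicit persist, and pushed segments becoming closed to new data) can be modelled in a few lines. This is a simplified, synchronous sketch: MiniAppenderator and NotWritableException are hypothetical stand-ins for illustration, and the real method returns a future and also merges and uploads data, which is omitted here.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PushSketch {
    /** Hypothetical stand-in for SegmentNotWritableException. */
    public static final class NotWritableException extends Exception {
        public NotWritableException(String segmentId) {
            super("Segment is closed: " + segmentId);
        }
    }

    /** Hypothetical synchronous model; not Druid's implementation. */
    public static final class MiniAppenderator {
        private final Map<String, List<String>> segments = new HashMap<>();
        private final Set<String> closed = new HashSet<>();
        private int persistCount = 0;

        public void add(String segmentId, String row) throws NotWritableException {
            if (closed.contains(segmentId)) {
                throw new NotWritableException(segmentId);
            }
            segments.computeIfAbsent(segmentId, id -> new ArrayList<>()).add(row);
        }

        public void persistAll() {
            persistCount++;
        }

        /** Persists implicitly, then closes and reports each known segment. */
        public List<String> push(Collection<String> segmentIds) {
            persistAll();
            List<String> pushed = new ArrayList<>();
            for (String id : segmentIds) {
                if (segments.containsKey(id)) {
                    closed.add(id);
                    pushed.add(id);
                }
            }
            return pushed;
        }

        public int persistCount() { return persistCount; }
    }

    /** True if adding to a segment after pushing it is rejected. */
    public static boolean addFailsAfterPush() {
        MiniAppenderator app = new MiniAppenderator();
        try {
            app.add("seg-1", "row-a");
            app.push(List.of("seg-1"));
            app.add("seg-1", "row-b"); // expected to throw
            return false;
        } catch (NotWritableException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("add rejected after push: " + addFailsAfterPush());
    }
}
```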
    • close

      public void close()
      Description copied from interface: Appenderator
      Stop any currently-running processing and clean up after ourselves. This allows currently running persists and pushes to finish. This will not remove any on-disk persisted data, but it will drop any data that has not yet been persisted.
      Specified by:
      close in interface Appenderator
    • closeNow

      public void closeNow()
      Unannounce the segments and wait for outstanding persists to finish. The base persist directory is not unlocked, because this method does not wait for the push executor to shut down; it relies on the current JVM shutting down so that no locking problem arises if the task is restored. If the task is restored while the current task is still active because of the push executor (which should not happen, since the push executor starts daemon threads), the locking should fail and the new task should fail to start. This also means that this method should be called only when the task is shutting down.
      Specified by:
      closeNow in interface Appenderator
    • registerUpgradedPendingSegment

      public void registerUpgradedPendingSegment(PendingSegmentRecord pendingSegmentRecord) throws IOException
      Throws:
      IOException