Class UnifiedIndexerAppenderatorsManager
java.lang.Object
org.apache.druid.segment.realtime.appenderator.UnifiedIndexerAppenderatorsManager
- All Implemented Interfaces:
AppenderatorsManager
Manages Appenderator instances for the CliIndexer task execution service, which runs all tasks in
a single process.
This class keeps a map of UnifiedIndexerAppenderatorsManager.DatasourceBundle objects, keyed by datasource name. Each bundle contains:
- A per-datasource SinkQuerySegmentWalker (with an associated per-datasource timeline)
- A map that associates a taskId with a list of Appenderators created for that task
Access to the datasource bundle map and the task->appenderator maps is synchronized. The methods
on this class can be called concurrently from multiple task threads. If there are no remaining
appenderators for a given datasource, the corresponding bundle will be removed from the bundle map.
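The bookkeeping described above can be sketched as follows. This is a minimal illustration, not Druid's implementation: `BundleMapSketch`, `Bundle`, `register`, `removeTask`, and `hasBundle` are hypothetical names, and appenderators are represented by plain strings so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in only; all names here are hypothetical, not Druid classes.
public class BundleMapSketch {
    // Hypothetical bundle: maps a taskId to the appenderators created for that task.
    static final class Bundle {
        final Map<String, List<String>> taskAppenderators = new HashMap<>();
    }

    // Bundles keyed by datasource name; every access goes through synchronized methods.
    private final Map<String, Bundle> bundles = new HashMap<>();

    public synchronized void register(String dataSource, String taskId, String appenderatorId) {
        bundles.computeIfAbsent(dataSource, ds -> new Bundle())
               .taskAppenderators
               .computeIfAbsent(taskId, id -> new ArrayList<>())
               .add(appenderatorId);
    }

    public synchronized void removeTask(String taskId, String dataSource) {
        Bundle bundle = bundles.get(dataSource);
        if (bundle == null) {
            return;
        }
        bundle.taskAppenderators.remove(taskId);
        // Drop the bundle once no task holds an appenderator for this datasource.
        if (bundle.taskAppenderators.isEmpty()) {
            bundles.remove(dataSource);
        }
    }

    public synchronized boolean hasBundle(String dataSource) {
        return bundles.containsKey(dataSource);
    }
}
```

Synchronizing on the manager keeps the sketch simple; it says nothing about the lock granularity the real class uses.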
Appenderators created by this class will use the shared per-datasource SinkQuerySegmentWalkers.
The per-datasource SinkQuerySegmentWalkers share a common queryExecutorService.
Each task that requests an Appenderator from this AppenderatorsManager will receive a heap memory limit
equal to WorkerConfig.globalIngestionHeapLimitBytes evenly divided by WorkerConfig.capacity.
This assumes that each task will only ingest to one Appenderator simultaneously.
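As a worked example of the division described above, the sketch below computes a per-task limit. The class and method names are hypothetical, the 4 GiB and capacity values are made up for illustration, and the use of integer division is an assumption rather than something this page states.

```java
// Illustrative arithmetic only; HeapLimitSketch is a hypothetical name.
public class HeapLimitSketch {
    // Per-task limit = global ingestion heap limit divided evenly by worker capacity.
    public static long perTaskHeapLimitBytes(long globalIngestionHeapLimitBytes, int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        return globalIngestionHeapLimitBytes / capacity; // integer division (assumed)
    }

    public static void main(String[] args) {
        long global = 4L * 1024 * 1024 * 1024; // 4 GiB, an example value
        System.out.println(perTaskHeapLimitBytes(global, 4)); // prints 1073741824 (1 GiB)
    }
}
```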
The Appenderators created by this class share an executor pool for IndexMerger persist
and merge operations, with concurrent operations limited to `druid.worker.capacity` divided by 2. This limit is imposed
to reduce overall memory usage.
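One way to picture a shared pool capped at half the worker capacity is a fixed-size thread pool, as in the sketch below. This is an assumption-laden illustration: `LimitedMergePoolSketch` and `poolSize` are hypothetical names, and both the integer division and the floor of one thread are assumptions, not behavior confirmed by this page.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only; not the executor wiring used by Druid.
public class LimitedMergePoolSketch {
    // Half the worker capacity, with an assumed minimum of one thread.
    public static int poolSize(int workerCapacity) {
        return Math.max(1, workerCapacity / 2);
    }

    public static void main(String[] args) throws Exception {
        // With capacity 4, at most 2 persist/merge jobs run at once; the rest queue.
        ExecutorService pool = Executors.newFixedThreadPool(poolSize(4));
        try {
            Future<String> result = pool.submit(() -> "merged"); // stand-in for a merge call
            System.out.println(result.get());
        } finally {
            pool.shutdown();
        }
    }
}
```

Bounding the pool caps how many merge buffers are live at the same time, which matches the stated goal of reducing memory usage.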
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description):
- class
- static class: This wrapper around IndexMerger limits concurrent calls to the merge/persist methods used by StreamAppenderator with a shared executor service.
Constructor Summary
Constructors
- UnifiedIndexerAppenderatorsManager(QueryProcessingPool queryProcessingPool, WorkerConfig workerConfig, Cache cache, CacheConfig cacheConfig, CachePopulatorStats cachePopulatorStats, PolicyEnforcer policyEnforcer, com.fasterxml.jackson.databind.ObjectMapper objectMapper, ServiceEmitter serviceEmitter, com.google.inject.Provider<QueryRunnerFactoryConglomerate> queryRunnerFactoryConglomerateProvider)
Method Summary
Methods (Modifier and Type, Method, Description):
- Appenderator createBatchAppenderatorForTask(String taskId, DataSchema schema, AppenderatorConfig config, TaskDirectory taskDirectory, SegmentGenerationMetrics metrics, DataSegmentPusher dataSegmentPusher, com.fasterxml.jackson.databind.ObjectMapper objectMapper, IndexIO indexIO, IndexMerger indexMerger, RowIngestionMeters rowIngestionMeters, ParseExceptionHandler parseExceptionHandler, CentralizedDatasourceSchemaConfig centralizedDatasourceSchemaConfig): Creates a BatchAppenderator suitable for batch ingestion with no ability to process queries against the processed data.
- Appenderator createRealtimeAppenderatorForTask(SegmentLoaderConfig segmentLoaderConfig, String taskId, DataSchema schema, AppenderatorConfig config, TaskDirectory taskDirectory, SegmentGenerationMetrics metrics, DataSegmentPusher dataSegmentPusher, com.fasterxml.jackson.databind.ObjectMapper objectMapper, IndexIO indexIO, IndexMerger indexMerger, QueryRunnerFactoryConglomerate conglomerate, DataSegmentAnnouncer segmentAnnouncer, ServiceEmitter emitter, QueryProcessingPool queryProcessingPool, JoinableFactory joinableFactory, Cache cache, CacheConfig cacheConfig, CachePopulatorStats cachePopulatorStats, PolicyEnforcer policyEnforcer, RowIngestionMeters rowIngestionMeters, ParseExceptionHandler parseExceptionHandler, CentralizedDatasourceSchemaConfig centralizedDatasourceSchemaConfig): Creates a StreamAppenderator suited for realtime ingestion.
- <T> QueryRunner<T> getQueryRunnerForIntervals(Query<T> query, Iterable<org.joda.time.Interval> intervals): Returns a query runner for the given intervals over the Appenderators managed by this AppenderatorsManager.
- <T> QueryRunner<T> getQueryRunnerForSegments(Query<T> query, Iterable<SegmentDescriptor> specs): Returns a query runner for the given segment specs over the Appenderators managed by this AppenderatorsManager.
- void removeAppenderatorsForTask(String taskId, String dataSource): Removes any internal Appenderator-tracking state associated with the provided taskId.
- boolean shouldTaskMakeNodeAnnouncements(): As AppenderatorsManager implementations are service dependent (i.e., Peons and Indexers have different impls), this method allows Tasks to know whether they should announce themselves as nodes and segment servers to the rest of the cluster.
- void shutdown(): Shut down the AppenderatorsManager.
-
Constructor Details
-
UnifiedIndexerAppenderatorsManager
@Inject public UnifiedIndexerAppenderatorsManager(QueryProcessingPool queryProcessingPool, WorkerConfig workerConfig, Cache cache, CacheConfig cacheConfig, CachePopulatorStats cachePopulatorStats, PolicyEnforcer policyEnforcer, com.fasterxml.jackson.databind.ObjectMapper objectMapper, ServiceEmitter serviceEmitter, com.google.inject.Provider<QueryRunnerFactoryConglomerate> queryRunnerFactoryConglomerateProvider)
-
-
Method Details
-
createRealtimeAppenderatorForTask
public Appenderator createRealtimeAppenderatorForTask(SegmentLoaderConfig segmentLoaderConfig, String taskId, DataSchema schema, AppenderatorConfig config, TaskDirectory taskDirectory, SegmentGenerationMetrics metrics, DataSegmentPusher dataSegmentPusher, com.fasterxml.jackson.databind.ObjectMapper objectMapper, IndexIO indexIO, IndexMerger indexMerger, QueryRunnerFactoryConglomerate conglomerate, DataSegmentAnnouncer segmentAnnouncer, ServiceEmitter emitter, QueryProcessingPool queryProcessingPool, JoinableFactory joinableFactory, Cache cache, CacheConfig cacheConfig, CachePopulatorStats cachePopulatorStats, PolicyEnforcer policyEnforcer, RowIngestionMeters rowIngestionMeters, ParseExceptionHandler parseExceptionHandler, CentralizedDatasourceSchemaConfig centralizedDatasourceSchemaConfig)
Description copied from interface: AppenderatorsManager
Creates a StreamAppenderator suited for realtime ingestion. Note that this method's parameters include objects used for query processing. Intermediary segments are persisted to disk and memory mapped to be available for query processing.
- Specified by:
createRealtimeAppenderatorForTask in interface AppenderatorsManager
-
createBatchAppenderatorForTask
public Appenderator createBatchAppenderatorForTask(String taskId, DataSchema schema, AppenderatorConfig config, TaskDirectory taskDirectory, SegmentGenerationMetrics metrics, DataSegmentPusher dataSegmentPusher, com.fasterxml.jackson.databind.ObjectMapper objectMapper, IndexIO indexIO, IndexMerger indexMerger, RowIngestionMeters rowIngestionMeters, ParseExceptionHandler parseExceptionHandler, CentralizedDatasourceSchemaConfig centralizedDatasourceSchemaConfig)
Description copied from interface: AppenderatorsManager
Creates a BatchAppenderator suitable for batch ingestion with no ability to process queries against the processed data. Intermediary segments are persisted to temporary disk and then merged into the final set of segments at publishing time.
- Specified by:
createBatchAppenderatorForTask in interface AppenderatorsManager
-
removeAppenderatorsForTask
public void removeAppenderatorsForTask(String taskId, String dataSource)
Description copied from interface: AppenderatorsManager
Removes any internal Appenderator-tracking state associated with the provided taskId. This method should be called when a task has finished using the Appenderators previously created for it by createRealtimeAppenderatorForTask or createBatchAppenderatorForTask. The method can be called by the entity managing Tasks when the Tasks finish, such as ThreadingTaskRunner.
- Specified by:
removeAppenderatorsForTask in interface AppenderatorsManager
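The cleanup contract for removeAppenderatorsForTask can be illustrated with a try/finally wrapper. This is only a sketch of the calling pattern: `TaskCleanupSketch`, `AppenderatorRegistry`, and `runTask` are hypothetical names, and the one-method interface stands in for the real manager so the example compiles on its own.

```java
// Illustrative calling pattern only; these types are hypothetical, not Druid classes.
public class TaskCleanupSketch {
    // Stand-in for the manager, reduced to the one method this example needs.
    public interface AppenderatorRegistry {
        void removeAppenderatorsForTask(String taskId, String dataSource);
    }

    // Runs the task body and always releases the task's tracking state afterwards,
    // whether the body completes normally or throws.
    public static void runTask(AppenderatorRegistry registry, String taskId,
                               String dataSource, Runnable taskBody) {
        try {
            taskBody.run();
        } finally {
            registry.removeAppenderatorsForTask(taskId, dataSource);
        }
    }
}
```

Putting the call in a finally block means a failed task still releases its state, which matters when all tasks share one process.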
-
getQueryRunnerForIntervals
public <T> QueryRunner<T> getQueryRunnerForIntervals(Query<T> query, Iterable<org.joda.time.Interval> intervals)
Description copied from interface: AppenderatorsManager
Returns a query runner for the given intervals over the Appenderators managed by this AppenderatorsManager.
- Specified by:
getQueryRunnerForIntervals in interface AppenderatorsManager
-
getQueryRunnerForSegments
public <T> QueryRunner<T> getQueryRunnerForSegments(Query<T> query, Iterable<SegmentDescriptor> specs)
Description copied from interface: AppenderatorsManager
Returns a query runner for the given segment specs over the Appenderators managed by this AppenderatorsManager.
- Specified by:
getQueryRunnerForSegments in interface AppenderatorsManager
-
shouldTaskMakeNodeAnnouncements
public boolean shouldTaskMakeNodeAnnouncements()
Description copied from interface: AppenderatorsManager
As AppenderatorsManager implementations are service dependent (i.e., Peons and Indexers have different impls), this method allows Tasks to know whether they should announce themselves as nodes and segment servers to the rest of the cluster. Only Tasks running in Peons (i.e., as separate processes) should make their own individual node announcements.
- Specified by:
shouldTaskMakeNodeAnnouncements in interface AppenderatorsManager
-
shutdown
public void shutdown()
Description copied from interface: AppenderatorsManager
Shut down the AppenderatorsManager.
- Specified by:
shutdown in interface AppenderatorsManager
-
getDatasourceBundles
-
getWorkerConfig
-