@Alpha public class DagManager extends com.google.common.util.concurrent.AbstractIdleService
Manages the life cycle of a Dag. A Dag is submitted to the
DagManager by the Orchestrator.orchestrate(Spec) method. On receiving a Dag, the
DagManager first persists the Dag to the DagStateStore, and then submits it to the specific
DagManager.DagManagerThread's BlockingQueue based on the flowExecutionId of the Flow.
This guarantees that each Dag received by the DagManager can be recovered in case of a leadership
change or service restart.
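The persist-then-enqueue order described above can be sketched as follows. This is a minimal illustration, not the Gobblin implementation: the class `DagSubmissionSketch`, the in-memory map standing in for the DagStateStore, and the modulo routing rule are all assumptions made for the example. The source only says that the queue is chosen based on the flowExecutionId.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names are actual Gobblin classes.
class DagSubmissionSketch {
    // In-memory stand-in for the DagStateStore: dag id -> serialized Dag.
    final Map<String, String> stateStore = new ConcurrentHashMap<>();
    // One queue per worker thread.
    final List<BlockingQueue<String>> queues = new ArrayList<>();

    DagSubmissionSketch(int numThreads) {
        for (int i = 0; i < numThreads; i++) {
            queues.add(new LinkedBlockingQueue<>());
        }
    }

    // Assumed routing rule: flowExecutionId modulo the number of queues.
    int queueIndex(long flowExecutionId) {
        return (int) Math.floorMod(flowExecutionId, (long) queues.size());
    }

    void addDag(String dagId, String serializedDag, long flowExecutionId) {
        // Persist first, so the Dag survives a restart even if it is never dequeued.
        stateStore.put(dagId, serializedDag);
        // Then hand the Dag to the queue of the thread selected by flowExecutionId.
        queues.get(queueIndex(flowExecutionId)).add(dagId);
    }
}
```

Because the routing depends only on the flowExecutionId, any later request that carries the same id can be sent to the same queue.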
The implementation of the DagManager is multi-threaded. Each DagManager.DagManagerThread polls the
BlockingQueue for new Dag submissions at fixed intervals. It dequeues any newly submitted Dags and coordinates
the execution of the individual jobs in each Dag. The coordination logic involves polling the JobStatuses of running
jobs. When a job completes, the thread either schedules the next job in the Dag (on SUCCESS) or marks the Dag as failed
(on FAILURE). When a Dag execution completes, the thread performs the required cleanup actions.
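One polling cycle of such a thread might look like the sketch below. It is an illustration under stated assumptions rather than the actual DagManager.DagManagerThread: a Dag is simplified to a linear chain of job names, a plain map stands in for the job status source, and `DagPollingSketch`, `SimpleDag`, and `runOnce` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these types are Gobblin classes.
class DagPollingSketch {
    enum JobStatus { RUNNING, SUCCESS, FAILURE }
    enum DagState { RUNNING, COMPLETE, FAILED }

    // A Dag simplified to a linear chain of job names (assumption for brevity).
    static final class SimpleDag {
        final String id;
        final Queue<String> pendingJobs;
        String runningJob; // null when no job is running
        DagState state = DagState.RUNNING;

        SimpleDag(String id, List<String> jobs) {
            this.id = id;
            this.pendingJobs = new ArrayDeque<>(jobs);
        }
    }

    final BlockingQueue<SimpleDag> queue = new LinkedBlockingQueue<>();
    final Map<String, JobStatus> jobStatuses = new HashMap<>(); // stand-in status source
    final List<SimpleDag> activeDags = new ArrayList<>();
    final List<String> cleanedUp = new ArrayList<>();

    // One polling cycle: dequeue new Dags, poll running jobs, clean up finished Dags.
    void runOnce() {
        List<SimpleDag> fresh = new ArrayList<>();
        queue.drainTo(fresh);
        for (SimpleDag dag : fresh) {
            launchNext(dag);
            activeDags.add(dag);
        }
        for (SimpleDag dag : activeDags) {
            if (dag.runningJob == null) {
                continue;
            }
            JobStatus status = jobStatuses.get(dag.runningJob);
            if (status == JobStatus.SUCCESS) {
                launchNext(dag);              // schedule the next job in the Dag
            } else if (status == JobStatus.FAILURE) {
                dag.state = DagState.FAILED;  // mark the whole Dag as failed
            }
        }
        activeDags.removeIf(dag -> {
            if (dag.state != DagState.RUNNING) {
                cleanedUp.add(dag.id);        // placeholder for the cleanup actions
                return true;
            }
            return false;
        });
    }

    private void launchNext(SimpleDag dag) {
        String next = dag.pendingJobs.poll();
        dag.runningJob = next;
        if (next == null) {
            dag.state = DagState.COMPLETE;    // no jobs left
        } else {
            jobStatuses.put(next, JobStatus.RUNNING);
        }
    }
}
```

A method like `runOnce` could be run at fixed intervals with `ScheduledExecutorService.scheduleAtFixedRate`; keeping the cycle in a plain method makes it easy to step through by hand.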
For deleteSpec/cancellation requests for a flow URI, the DagManager looks up the flowExecutionId using
JobStatusRetriever and forwards the request to the DagManager.DagManagerThread that handled the addSpec request
for this flow. Each DagManager.DagManagerThread needs its own BlockingQueue because
cancellation needs information that is stored only in that DagManager.DagManagerThread.
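The cancellation path can be sketched in the same spirit. The example below assumes that a cancellation is routed with the same rule that was used for submission, so it reaches the thread holding the Dag's state. The map standing in for the JobStatusRetriever lookup, the modulo rule, and the class `CancelRoutingSketch` are all hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names are actual Gobblin classes.
class CancelRoutingSketch {
    // Stand-in for the JobStatusRetriever lookup: flow URI -> flowExecutionId.
    final Map<URI, Long> latestExecutionIds = new HashMap<>();
    // One cancellation queue per worker thread.
    final List<BlockingQueue<Long>> cancelQueues = new ArrayList<>();

    CancelRoutingSketch(int numThreads) {
        for (int i = 0; i < numThreads; i++) {
            cancelQueues.add(new LinkedBlockingQueue<>());
        }
    }

    // Assumed routing rule, identical to the one used when the Dag was submitted.
    int queueIndex(long flowExecutionId) {
        return (int) Math.floorMod(flowExecutionId, (long) cancelQueues.size());
    }

    // Returns the index of the queue that received the request, or -1 if the
    // flow URI has no known execution.
    int stopDag(URI flowUri) {
        Long flowExecutionId = latestExecutionIds.get(flowUri);
        if (flowExecutionId == null) {
            return -1;
        }
        int index = queueIndex(flowExecutionId);
        cancelQueues.get(index).add(flowExecutionId);
        return index;
    }
}
```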
The DagManager is active only in leader mode. To make recovery possible, each Dag managed by a DagManager is
checkpointed to a persistent location. On startup or leadership change,
the DagManager loads all the checkpointed Dags and adds them to the BlockingQueue.
Current implementation supports only FileSystem-based checkpointing of the Dag statuses.

| Modifier and Type | Class and Description |
|---|---|
| static class | DagManager.DagManagerThread - Each DagManager.DagManagerThread performs 2 actions when scheduled: Dequeues any newly submitted Dags from the Dag queue. |
| static class | DagManager.FailureOption - Action to be performed on a Dag, in case of a job failure. |

| Modifier and Type | Field and Description |
|---|---|
| static String | DAG_MANAGER_PREFIX |
| static String | DEFAULT_FLOW_FAILURE_OPTION |
| static Integer | DEFAULT_NUM_THREADS |
| static String | JOB_STATUS_POLLING_INTERVAL_KEY |
| static String | NUM_THREADS_KEY |

| Constructor and Description |
|---|
| DagManager(com.typesafe.config.Config config) |
| DagManager(com.typesafe.config.Config config, boolean instrumentationEnabled) |

| Modifier and Type | Method and Description |
|---|---|
| void | handleKillFlowEvent(KillFlowEvent killFlowEvent) |
| void | setActive(boolean active) - When a DagManager becomes active, it loads the serialized representations of the currently running Dags from the checkpoint directory, deserializes the Dags and adds them to a queue to be consumed by the DagManager.DagManagerThreads. |
| void | setTopologySpecMap(Map<URI,TopologySpec> topologySpecMap) |
| protected void | shutDown() - Stop the service. |
| protected void | startUp() - Start the service. |
| void | stopDag(URI uri) - Method to submit a URI for cancellation requests to the DagManager. |

public static final String DEFAULT_FLOW_FAILURE_OPTION
public static final String DAG_MANAGER_PREFIX
public static final Integer DEFAULT_NUM_THREADS
public static final String NUM_THREADS_KEY
public static final String JOB_STATUS_POLLING_INTERVAL_KEY
public DagManager(com.typesafe.config.Config config,
boolean instrumentationEnabled)
public DagManager(com.typesafe.config.Config config)
protected void startUp()
Start the service. On startup, the service launches the DagManager.DagManagerThreads, which are scheduled at
fixed intervals. The service also loads any checkpointed Dags.
Overrides: startUp in class com.google.common.util.concurrent.AbstractIdleService
public void stopDag(URI uri) throws IOException
Method to submit a URI for cancellation requests to the DagManager.
The DagManager adds the dag to the BlockingQueue to be picked up by one of the DagManager.DagManagerThreads.
Throws: IOException
public void handleKillFlowEvent(KillFlowEvent killFlowEvent)
public void setTopologySpecMap(Map<URI,TopologySpec> topologySpecMap)
public void setActive(boolean active)
When a DagManager becomes active, it loads the serialized representations of the currently running Dags
from the checkpoint directory, deserializes the Dags and adds them to a queue to be consumed by
the DagManager.DagManagerThreads.
Parameters: active - a boolean to indicate if the DagManager is the leader.
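
The reload performed when the DagManager becomes active can be sketched as follows. This is a minimal illustration under stated assumptions, not the Gobblin implementation: the class `CheckpointReloadSketch`, the `.dag` file suffix, the one-file-per-Dag layout, and the use of plain strings as the serialized form are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names are actual Gobblin classes.
class CheckpointReloadSketch {
    final Path checkpointDir;
    // Queue consumed by the worker threads; holds serialized Dags in this sketch.
    final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private boolean active;

    CheckpointReloadSketch(Path checkpointDir) {
        this.checkpointDir = checkpointDir;
    }

    // Convenience factory that uses a fresh temporary directory for checkpoints.
    static CheckpointReloadSketch inTempDir() {
        try {
            return new CheckpointReloadSketch(Files.createTempDirectory("dag-checkpoints"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Write one file per Dag (assumed layout: <dagId>.dag).
    void checkpoint(String dagId, String serializedDag) {
        try {
            Files.write(checkpointDir.resolve(dagId + ".dag"),
                serializedDag.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // On becoming active (leader), reload every checkpointed Dag into the queue.
    void setActive(boolean active) {
        this.active = active;
        if (!active) {
            return;
        }
        try (DirectoryStream<Path> files = Files.newDirectoryStream(checkpointDir, "*.dag")) {
            for (Path file : files) {
                queue.add(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isActive() {
        return active;
    }
}
```

In this sketch, becoming inactive is a no-op; what the real DagManager does in that case is not stated in this page.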