public class TemporalRowTimeJoinOperator extends BaseTwoInputStreamOperatorWithStateRetention
Cleaning up the state drops all of the "old" values from the probe side, where "old" means older than the current watermark. The build side is cleaned up in a similar fashion, but we always keep at least one record, the latest one, even if it is past the last watermark.
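The cleanup rule described above can be sketched in plain Java, without any Flink dependency. This is only an illustration under stated assumptions: the class `TemporalJoinCleanupSketch`, its method names, and the use of a `TreeMap` keyed by event time are hypothetical and are not the operator's actual state layout. Rows at or before the watermark are treated as "old" here, which is an interpretation of the description.

```java
import java.util.TreeMap;

// Hypothetical sketch: rows are stored per key in a TreeMap ordered by event time.
final class TemporalJoinCleanupSketch {

    /** Drops every probe-side row whose event time is at or before the watermark. */
    static void cleanUpProbeSide(TreeMap<Long, String> probeRows, long watermark) {
        // headMap(..., true) is a live view, so clearing it removes the rows from probeRows.
        probeRows.headMap(watermark, true).clear();
    }

    /**
     * Drops old build-side rows, but keeps the latest row at or before the watermark
     * and every row that is newer than the watermark.
     */
    static void cleanUpBuildSide(TreeMap<Long, String> buildRows, long watermark) {
        Long latestOld = buildRows.floorKey(watermark);
        if (latestOld == null) {
            return; // no row is old yet, nothing to clean up
        }
        // Remove everything strictly older than the latest old row.
        buildRows.headMap(latestOld, false).clear();
    }

    public static void main(String[] args) {
        TreeMap<Long, String> probe = new TreeMap<>();
        TreeMap<Long, String> build = new TreeMap<>();
        for (long t : new long[] {1, 2, 5, 8, 9, 12}) {
            probe.put(t, "probe@" + t);
        }
        for (long t : new long[] {3, 7}) {
            build.put(t, "build@" + t);
        }
        cleanUpProbeSide(probe, 10L);
        cleanUpBuildSide(build, 10L);
        System.out.println(probe.keySet()); // [12]
        System.out.println(build.keySet()); // [7]
    }
}
```

Keeping the latest build-side row matters because a later probe record may still need to join against the most recent version, even when that version is older than the watermark.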
One more trick is how emitting results and cleaning up are triggered. This is achieved by registering timers for the keys. We could register a timer for every probe and build side element's event time (when the watermark exceeds this timer, we emit results and/or clean up the state). However, this would cause a huge number of registered timers. For example, with the following event times of accumulated probe records: {1, 2, 5, 8, 9}, receiving Watermark(10) would trigger 5 separate timers for the same key. To avoid that, we always keep only a single registered timer for any given key, registered for the minimal value. Upon triggering it, we process all records with event times older than or equal to currentWatermark.
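The single-timer idea can be illustrated with a small, self-contained Java sketch for one key. It does not use Flink's timer service: the class `SingleTimerSketch` and its methods are hypothetical, and the watermark is simulated by a direct method call. Event times are kept in a `TreeSet`, so duplicate timestamps collapse into one entry, which is a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical sketch of "one registered timer per key, at the minimal event time".
final class SingleTimerSketch {
    private final TreeSet<Long> pendingEventTimes = new TreeSet<>();
    private Long registeredTimer = null; // at most one timer for this key
    private int timerFirings = 0;

    /** Buffers an element and keeps the single timer at the smallest pending event time. */
    void onElement(long eventTime) {
        pendingEventTimes.add(eventTime);
        long minimum = pendingEventTimes.first();
        if (registeredTimer == null || minimum < registeredTimer) {
            registeredTimer = minimum; // replaces the previous timer, never adds a second one
        }
    }

    /** Simulates a watermark: fires the timer once and processes all due records. */
    List<Long> onWatermark(long watermark) {
        List<Long> processed = new ArrayList<>();
        if (registeredTimer == null || registeredTimer > watermark) {
            return processed; // the timer is not due yet
        }
        timerFirings++;
        registeredTimer = null;
        // Process every record with event time <= watermark in a single pass.
        NavigableSet<Long> due = pendingEventTimes.headSet(watermark, true);
        processed.addAll(due);
        due.clear();
        if (!pendingEventTimes.isEmpty()) {
            registeredTimer = pendingEventTimes.first(); // re-register for the next minimum
        }
        return processed;
    }

    int timerFirings() {
        return timerFirings;
    }

    Long registeredTimer() {
        return registeredTimer;
    }

    public static void main(String[] args) {
        SingleTimerSketch key = new SingleTimerSketch();
        for (long t : new long[] {1, 2, 5, 8, 9}) {
            key.onElement(t);
        }
        System.out.println(key.onWatermark(10L)); // [1, 2, 5, 8, 9]
        System.out.println(key.timerFirings());   // 1
    }
}
```

With the event times {1, 2, 5, 8, 9} and a watermark of 10, the sketch fires one timer instead of five and still processes all five records.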
Fields inherited from class BaseTwoInputStreamOperatorWithStateRetention: stateCleaningEnabled

| Constructor and Description |
|---|
| TemporalRowTimeJoinOperator(RowDataTypeInfo leftType, RowDataTypeInfo rightType, GeneratedJoinCondition generatedJoinCondition, int leftTimeAttribute, int rightTimeAttribute, long minRetentionTime, long maxRetentionTime) |

| Modifier and Type | Method and Description |
|---|---|
| void | cleanupState(long time): The method to be called when a cleanup timer fires. |
| void | close() |
| void | onEventTime(org.apache.flink.streaming.api.operators.InternalTimer<Object,org.apache.flink.runtime.state.VoidNamespace> timer) |
| void | open() |
| void | processElement1(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<org.apache.flink.table.data.RowData> element) |
| void | processElement2(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<org.apache.flink.table.data.RowData> element) |

Methods inherited from class BaseTwoInputStreamOperatorWithStateRetention: cleanupLastTimer, onProcessingTime, registerProcessingCleanupTimer

Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator: dispose, getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, initializeState, notifyCheckpointAborted, notifyCheckpointComplete, numEventTimeTimers, numProcessingTimeTimers, prepareSnapshotPreBarrier, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark, processWatermark1, processWatermark2, reportOrForwardLatencyMarker, setChainingStrategy, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState, snapshotState

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.flink.streaming.api.operators.TwoInputStreamOperator: processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2

Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator: dispose, getMetricGroup, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState

public TemporalRowTimeJoinOperator(RowDataTypeInfo leftType, RowDataTypeInfo rightType, GeneratedJoinCondition generatedJoinCondition, int leftTimeAttribute, int rightTimeAttribute, long minRetentionTime, long maxRetentionTime)

public void open() throws Exception
Specified by: open in interface org.apache.flink.streaming.api.operators.StreamOperator<org.apache.flink.table.data.RowData>
Overrides: open in class BaseTwoInputStreamOperatorWithStateRetention
Throws: Exception

public void processElement1(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<org.apache.flink.table.data.RowData> element) throws Exception
Throws: Exception

public void processElement2(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<org.apache.flink.table.data.RowData> element) throws Exception
Throws: Exception

public void onEventTime(org.apache.flink.streaming.api.operators.InternalTimer<Object,org.apache.flink.runtime.state.VoidNamespace> timer) throws Exception
Throws: Exception

public void close() throws Exception
Specified by: close in interface org.apache.flink.streaming.api.operators.StreamOperator<org.apache.flink.table.data.RowData>
Overrides: close in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<org.apache.flink.table.data.RowData>
Throws: Exception

public void cleanupState(long time)
Specified by: cleanupState in class BaseTwoInputStreamOperatorWithStateRetention
Parameters: time - The timestamp of the fired timer.

Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.