OT - The output type of the wrapped InputFormat.S - The InputSplit type of the wrapped InputFormat.@PublicEvolving public final class ReplicatingInputFormat<OT,S extends InputSplit> extends RichInputFormat<OT,S>
InputFormat to all parallel instances of a DataSource,
i.e., the full input of the replicated InputFormat is completely processed by each parallel instance of the DataSource.
This is done by assigning all InputSplits generated by the
replicated InputFormat to each parallel instance.
Replicated data can only be used as input for a InnerJoinOperatorBase or
CrossOperatorBase with the same parallelism as the DataSource.
Before being used as an input to a Join or Cross operator, replicated data might be processed in local pipelines by
by Map-based operators with the same parallelism as the source. Map-based operators are
MapOperatorBase,
FlatMapOperatorBase,
FilterOperatorBase, and
MapPartitionOperatorBase.
Replicated DataSources can be used for local join processing (no data shipping) if one input is accessible on all
parallel instance of a join and the other input is (randomly) partitioned across all parallel instances.
However, a replicated DataSource is a plan hint that can invalidate a Flink program if not used correctly (see
usage instructions above). In such situations, the optimizer is not able to generate a valid execution plan and
the program execution will fail.| Constructor and Description |
|---|
ReplicatingInputFormat(InputFormat<OT,S> wrappedIF) |
| Modifier and Type | Method and Description |
|---|---|
void |
close()
Method that marks the end of the life-cycle of an input split.
|
void |
closeInputFormat()
Closes this InputFormat instance.
|
void |
configure(Configuration parameters)
Configures this input format.
|
S[] |
createInputSplits(int minNumSplits)
Creates the different splits of the input that can be processed in parallel.
|
InputSplitAssigner |
getInputSplitAssigner(S[] inputSplits)
Gets the type of the input splits that are processed by this input format.
|
InputFormat<OT,S> |
getReplicatedInputFormat() |
RuntimeContext |
getRuntimeContext() |
BaseStatistics |
getStatistics(BaseStatistics cachedStatistics)
Gets the basic statistics from the input described by this format.
|
OT |
nextRecord(OT reuse)
Reads the next record from the input.
|
void |
open(S split)
Opens a parallel instance of the input format to work on a split.
|
void |
openInputFormat()
Opens this InputFormat instance.
|
boolean |
reachedEnd()
Method used to check if the end of the input is reached.
|
void |
setRuntimeContext(RuntimeContext context) |
public ReplicatingInputFormat(InputFormat<OT,S> wrappedIF)
public InputFormat<OT,S> getReplicatedInputFormat()
public void configure(Configuration parameters)
InputFormatThis method is always called first on a newly instantiated input format.
parameters - The configuration with all parameters.public BaseStatistics getStatistics(BaseStatistics cachedStatistics) throws IOException
InputFormatWhen this method is called, the input format it guaranteed to be configured.
cachedStatistics - The statistics that were cached. May be null.IOExceptionpublic S[] createInputSplits(int minNumSplits) throws IOException
InputFormatWhen this method is called, the input format it guaranteed to be configured.
minNumSplits - The minimum desired number of splits. If fewer are created, some parallel
instances may remain idle.IOException - Thrown, when the creation of the splits was erroneous.public InputSplitAssigner getInputSplitAssigner(S[] inputSplits)
InputFormatpublic void open(S split) throws IOException
InputFormatWhen this method is called, the input format it guaranteed to be configured.
split - The split to be opened.IOException - Thrown, if the spit could not be opened due to an I/O problem.public boolean reachedEnd()
throws IOException
InputFormatWhen this method is called, the input format it guaranteed to be opened.
IOException - Thrown, if an I/O error occurred.public OT nextRecord(OT reuse) throws IOException
InputFormatWhen this method is called, the input format it guaranteed to be opened.
reuse - Object that may be reused.IOException - Thrown, if an I/O error occurred.public void close()
throws IOException
InputFormatWhen this method is called, the input format it guaranteed to be opened.
IOException - Thrown, if the input could not be closed properly.public void setRuntimeContext(RuntimeContext context)
setRuntimeContext in class RichInputFormat<OT,S extends InputSplit>public RuntimeContext getRuntimeContext()
getRuntimeContext in class RichInputFormat<OT,S extends InputSplit>public void openInputFormat()
RichInputFormatopenInputFormat in class RichInputFormat<OT,S extends InputSplit>InputFormatpublic void closeInputFormat()
RichInputFormatRichInputFormat.openInputFormat() should be closed in this method.closeInputFormat in class RichInputFormat<OT,S extends InputSplit>InputFormatCopyright © 2014–2016 The Apache Software Foundation. All rights reserved.