Type Parameters:
IN - The data type of the input data set.
OUT - The data type of the returned data set.
public abstract class SingleInputUdfOperator<IN,OUT,O extends SingleInputUdfOperator<IN,OUT,O>> extends SingleInputOperator<IN,OUT,O> implements UdfOperator<O>
Base class for operators that execute user-defined functions (UDFs) with a single input (such as RichMapFunction or RichReduceFunction).
This class encapsulates utilities for the UDFs, such as broadcast variables, parameterization through configuration objects, and semantic properties.
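The recursive type parameter `O extends SingleInputUdfOperator<IN,OUT,O>` in the class declaration is what lets methods such as withParameters and withBroadcastSet return the concrete operator type, so calls can be chained. The following is a minimal, self-contained sketch of that pattern, not Flink code: `ToyUdfOperator`, `ToyMapOperator`, `SelfTypeDemo`, and the `self()` hook are hypothetical names, and parameters are modeled as a plain String rather than a Configuration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a UDF operator base class (not part of Flink).
abstract class ToyUdfOperator<O extends ToyUdfOperator<O>> {
    private final Map<String, Object> broadcastSets = new LinkedHashMap<>();
    private String parameters;

    // Each concrete subclass returns itself, so fluent methods can return O.
    protected abstract O self();

    O withParameters(String parameters) {
        this.parameters = parameters;
        return self();
    }

    O withBroadcastSet(Object data, String name) {
        broadcastSets.put(name, data);
        return self();
    }

    String getParameters() {
        return parameters;
    }

    Map<String, Object> getBroadcastSets() {
        return broadcastSets;
    }
}

// Hypothetical concrete operator that binds O to its own type.
final class ToyMapOperator extends ToyUdfOperator<ToyMapOperator> {
    private String name = "map";

    @Override
    protected ToyMapOperator self() {
        return this;
    }

    // Exists only on the concrete type; reachable in a chain because
    // the base-class methods return ToyMapOperator, not the base type.
    ToyMapOperator name(String name) {
        this.name = name;
        return this;
    }

    String getName() {
        return name;
    }
}

final class SelfTypeDemo {
    static ToyMapOperator build() {
        return new ToyMapOperator()
                .withParameters("threshold=5")
                .withBroadcastSet(Arrays.asList(1, 2, 3), "limits")
                .name("scaled-map");
    }

    public static void main(String[] args) {
        ToyMapOperator op = build();
        System.out.println(op.getName() + " " + op.getBroadcastSets().keySet());
    }
}
```

Without the self-referencing bound, the base-class methods could only return the base type, and the final `name(...)` call in the chain would not compile.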
| Modifier | Constructor and Description |
|---|---|
| `protected` | `SingleInputUdfOperator(DataSet<IN> input, TypeInformation<OUT> resultType)` Creates a new operator with the given data set as input. |
| Modifier and Type | Method and Description |
|---|---|
| `protected void` | `extractSemanticAnnotationsFromUdf(Class<?> udfClass)` |
| `Map<String,DataSet<?>>` | `getBroadcastSets()` Gets the broadcast sets (name and data set) that have been added to the context of the UDF. |
| `Configuration` | `getParameters()` Gets the configuration parameters that will be passed to the UDF's open method `AbstractRichFunction.open(Configuration)`. |
| `SingleInputSemanticProperties` | `getSemanticProperties()` Gets the semantic properties that have been set for the user-defined function (UDF). |
| `void` | `setSemanticProperties(SingleInputSemanticProperties properties)` Sets the semantic properties for the user-defined function (UDF). |
| `O` | `withBroadcastSet(DataSet<?> data, String name)` Adds a certain data set as a broadcast set to this operator. |
| `O` | `withConstantSet(String... constantSet)` Adds a constant-set annotation for the UDF. |
| `O` | `withParameters(Configuration parameters)` Sets the configuration parameters for the UDF. |
Methods inherited from class SingleInputOperator: getInput, getInputType, translateToDataFlow
Methods inherited from class Operator: getName, getParallelism, getResultType, name, setParallelism
Methods inherited from class DataSet: aggregate, checkSameExecutionContext, clean, coGroup, cross, crossWithHuge, crossWithTiny, distinct, distinct, distinct, distinct, filter, first, flatMap, getExecutionEnvironment, getType, groupBy, groupBy, groupBy, iterate, iterateDelta, join, join, joinWithHuge, joinWithTiny, map, mapPartition, max, maxBy, min, minBy, output, partitionByHash, partitionByHash, partitionByHash, partitionCustom, partitionCustom, partitionCustom, print, printToErr, project, rebalance, reduce, reduceGroup, runOperation, sum, union, write, write, writeAsCsv, writeAsCsv, writeAsCsv, writeAsCsv, writeAsFormattedText, writeAsFormattedText, writeAsText, writeAsText
protected SingleInputUdfOperator(DataSet<IN> input, TypeInformation<OUT> resultType)
Parameters:
input - The data set that is the input to the operator.
resultType - The type of the elements in the resulting data set.
protected void extractSemanticAnnotationsFromUdf(Class<?> udfClass)
public O withParameters(Configuration parameters)
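Description copied from interface: UdfOperator
Sets the configuration parameters for the UDF. The parameters are passed to the UDF in the AbstractRichFunction.open(Configuration) method.
Specified by: withParameters in interface UdfOperator<O extends SingleInputUdfOperator<IN,OUT,O>>
Parameters:
parameters - The configuration parameters for the UDF.
public O withBroadcastSet(DataSet<?> data, String name)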
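Description copied from interface: UdfOperator
Adds a certain data set as a broadcast set to this operator. The broadcast data set is registered under the given name and can be retrieved under that name via RuntimeContext.getBroadcastVariable(String).
The runtime context itself is available in all UDFs via AbstractRichFunction.getRuntimeContext().
Specified by: withBroadcastSet in interface UdfOperator<O extends SingleInputUdfOperator<IN,OUT,O>>
Parameters:
data - The data set to be broadcast.
name - The name under which the broadcast data set is retrieved.
public O withConstantSet(String... constantSet)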
Constant set annotations are used by the optimizer to infer the existence of data properties (sorted, partitioned, grouped).
In certain cases, these annotations allow the optimizer to generate a more efficient execution plan which can lead to improved performance.
Constant set annotations can only be specified if the input and the output type of the UDF are Tuple data types.
A constant-set annotation is a set of constant field specifications. The constant field specification String "4->3" specifies that this UDF copies field 4 of an input tuple to field 3 of the output tuple. Field references are zero-indexed, so these are the fifth input field and the fourth output field.
NOTICE: Constant set annotations are optional, but if given need to be correct. Otherwise, the program might produce wrong results!
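To make the specification format concrete, here is a small sketch that reads strings of the form "source->target" into a map from input field index to output field index. It is an illustration only, not Flink's parser: `ConstantSetSketch` and `parse` are hypothetical names, and the sketch assumes that only the simple two-index form shown above is used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; Flink's own parsing may differ.
final class ConstantSetSketch {

    // Maps each zero-indexed input field to the output field it is copied to.
    static Map<Integer, Integer> parse(String... constantSet) {
        Map<Integer, Integer> forwarded = new LinkedHashMap<>();
        for (String spec : constantSet) {
            String[] parts = spec.split("->");
            if (parts.length != 2) {
                throw new IllegalArgumentException(
                        "Expected 'source->target', got: " + spec);
            }
            int source = Integer.parseInt(parts[0].trim());
            int target = Integer.parseInt(parts[1].trim());
            if (source < 0 || target < 0) {
                throw new IllegalArgumentException(
                        "Field indexes must not be negative: " + spec);
            }
            forwarded.put(source, target);
        }
        return forwarded;
    }

    public static void main(String[] args) {
        // Input field 4 is copied to output field 3; input field 0 stays at position 0.
        System.out.println(parse("4->3", "0->0"));
    }
}
```

A check like this only validates the syntax. Whether the UDF really copies the declared fields unchanged is still the caller's responsibility, as the notice above points out.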
Parameters:
constantSet - A list of constant field specification Strings.
public Map<String,DataSet<?>> getBroadcastSets()
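Description copied from interface: UdfOperator
Gets the broadcast sets (name and data set) that have been added to the context of the UDF. Broadcast sets are added to a UDF via UdfOperator.withBroadcastSet(DataSet, String).
Specified by: getBroadcastSets in interface UdfOperator<O extends SingleInputUdfOperator<IN,OUT,O>>
public Configuration getParameters()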
Description copied from interface: UdfOperator
Gets the configuration parameters that will be passed to the UDF's open method AbstractRichFunction.open(Configuration). The configuration is set via the UdfOperator.withParameters(Configuration) method.
Specified by: getParameters in interface UdfOperator<O extends SingleInputUdfOperator<IN,OUT,O>>
public SingleInputSemanticProperties getSemanticProperties()
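Description copied from interface: UdfOperator
Gets the semantic properties that have been set for the user-defined function (UDF).
Specified by: getSemanticProperties in interface UdfOperator<O extends SingleInputUdfOperator<IN,OUT,O>>
public void setSemanticProperties(SingleInputSemanticProperties properties)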
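Sets the semantic properties for the user-defined function (UDF). The configured properties can be retrieved via UdfOperator.getSemanticProperties().
Parameters:
properties - The semantic properties for the UDF.
See Also: UdfOperator.getSemanticProperties()
Copyright © 2015 The Apache Software Foundation. All rights reserved.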