PTransforms for transforming
data in a pipeline.
| Interface | Description |
|---|---|
| Aggregator<InputT,OutputT> |
An
Aggregator<InputT> enables monitoring of values of type InputT,
to be combined across all bundles. |
| Combine.AccumulatingCombineFn.Accumulator<InputT,AccumT,OutputT> |
The type of mutable accumulator values used by this
AccumulatingCombineFn. |
| CombineFnBase.GlobalCombineFn<InputT,AccumT,OutputT> |
A
GlobalCombineFn<InputT, AccumT, OutputT> specifies how to combine a
collection of input values of type InputT into a single
output value of type OutputT. |
| CombineFnBase.PerKeyCombineFn<K,InputT,AccumT,OutputT> |
A
PerKeyCombineFn<K, InputT, AccumT, OutputT> specifies how to combine
a collection of input values of type InputT, associated with
a key of type K, into a single output value of type
OutputT. |
| CombineWithContext.RequiresContextInternal |
An internal interface for signaling that a
GlobalCombineFn
or a PerKeyCombineFn needs to access CombineWithContext.Context. |
| DoFn.RequiresWindowAccess |
Interface for signaling that a
DoFn needs to access the window the
element is being processed in, via DoFn.ProcessContext.window(). |
| DoFnReflector.DoFnInvoker<InputT,OutputT> |
Interface for invoking the
DoFn processing methods. |
| DoFnWithContext.ExtraContextFactory<InputT,OutputT> |
Interface for runner implementors to provide implementations of extra context information.
|
| Partition.PartitionFn<T> |
A function object that chooses an output partition for an element.
|
| SerializableComparator<T> |
A
Comparator that is also Serializable. |
| SerializableFunction<InputT,OutputT> |
A function that computes an output value of type
OutputT from an input value of type
InputT and is Serializable. |
| Class | Description |
|---|---|
| AggregatorRetriever |
An internal class for extracting
Aggregators from DoFns. |
| AppliedPTransform<InputT extends PInput,OutputT extends POutput,TransformT extends PTransform<? super InputT,OutputT>> |
Represents the application of a
PTransform to a specific input to produce
a specific output. |
| ApproximateQuantiles |
PTransforms for getting an idea of a PCollection's
data distribution using approximate N-tiles. |
| ApproximateQuantiles.ApproximateQuantilesCombineFn<T,ComparatorT extends Comparator<T> & Serializable> |
The
ApproximateQuantilesCombineFn combiner gives an idea
of the distribution of a collection of values using approximate
N-tiles. |
| ApproximateUnique |
PTransforms for estimating the number of distinct elements
in a PCollection, or the number of distinct values
associated with each key in a PCollection of KVs. |
| ApproximateUnique.ApproximateUniqueCombineFn<T> |
CombineFn that computes an estimate of the number of
distinct values that were combined. |
| ApproximateUnique.ApproximateUniqueCombineFn.LargestUnique |
A heap utility class to efficiently track the largest added elements.
|
| Combine |
PTransforms for combining PCollection elements
globally and per-key. |
| Combine.AccumulatingCombineFn<InputT,AccumT extends Combine.AccumulatingCombineFn.Accumulator<InputT,AccumT,OutputT>,OutputT> |
A
CombineFn that uses a subclass of
Combine.AccumulatingCombineFn.Accumulator as its accumulator
type. |
| Combine.BinaryCombineDoubleFn |
An abstract subclass of
Combine.CombineFn for implementing combiners that are more
easily and efficiently expressed as binary operations on doubles. |
| Combine.BinaryCombineFn<V> |
An abstract subclass of
Combine.CombineFn for implementing combiners that are more
easily expressed as binary operations. |
| Combine.BinaryCombineIntegerFn |
An abstract subclass of
Combine.CombineFn for implementing combiners that are more
easily and efficiently expressed as binary operations on ints. |
| Combine.BinaryCombineLongFn |
An abstract subclass of
Combine.CombineFn for implementing combiners that are more
easily and efficiently expressed as binary operations on longs. |
| Combine.CombineFn<InputT,AccumT,OutputT> |
A
CombineFn<InputT, AccumT, OutputT> specifies how to combine a
collection of input values of type InputT into a single
output value of type OutputT. |
| Combine.Globally<InputT,OutputT> |
Combine.Globally<InputT, OutputT> takes a PCollection<InputT>
and returns a PCollection<OutputT> whose elements are the result of
combining all the elements in each window of the input PCollection,
using a specified CombineFn<InputT, AccumT, OutputT>. |
| Combine.GloballyAsSingletonView<InputT,OutputT> |
Combine.GloballyAsSingletonView<InputT, OutputT> takes a PCollection<InputT>
and returns a PCollectionView<OutputT> whose elements are the result of
combining all the elements in each window of the input PCollection,
using a specified CombineFn<InputT, AccumT, OutputT>. |
| Combine.GroupedValues<K,InputT,OutputT> |
GroupedValues<K, InputT, OutputT> takes a
PCollection<KV<K, Iterable<InputT>>>, such as the result of
GroupByKey, applies a specified
KeyedCombineFn<K, InputT, AccumT, OutputT>
to each of the input KV<K, Iterable<InputT>> elements to
produce a combined output KV<K, OutputT> element, and returns a
PCollection<KV<K, OutputT>> containing all the combined output
elements. |
| Combine.Holder<V> |
Holds a single value of type
V which may or may not be present. |
| Combine.IterableCombineFn<V> | |
| Combine.KeyedCombineFn<K,InputT,AccumT,OutputT> |
A
KeyedCombineFn<K, InputT, AccumT, OutputT> specifies how to combine
a collection of input values of type InputT, associated with
a key of type K, into a single output value of type
OutputT. |
| Combine.PerKey<K,InputT,OutputT> |
PerKey<K, InputT, OutputT> takes a
PCollection<KV<K, InputT>>, groups it by key, applies a
combining function to the InputT values associated with each
key to produce a combined OutputT value, and returns a
PCollection<KV<K, OutputT>> representing a map from each
distinct key of the input PCollection to the corresponding
combined value. |
| Combine.PerKeyWithHotKeyFanout<K,InputT,OutputT> |
Like
Combine.PerKey, but sharding the combining of hot keys. |
| Combine.SimpleCombineFn<V> | Deprecated |
| CombineFnBase |
This class contains the shared interfaces and abstract classes for different types of combine
functions.
|
| CombineFns |
Static utility methods that create combine function instances.
|
| CombineFns.CoCombineResult |
A tuple of outputs produced by a composed combine function.
|
| CombineFns.ComposeCombineFnBuilder |
A builder class to construct a composed
CombineFnBase.GlobalCombineFn. |
| CombineFns.ComposedCombineFn<DataT> |
A composed
Combine.CombineFn that applies multiple CombineFns. |
| CombineFns.ComposedCombineFnWithContext<DataT> |
A composed
CombineWithContext.CombineFnWithContext that applies multiple
CombineFnWithContexts. |
| CombineFns.ComposedKeyedCombineFn<DataT,K> |
A composed
Combine.KeyedCombineFn that applies multiple KeyedCombineFns. |
| CombineFns.ComposedKeyedCombineFnWithContext<DataT,K> |
A composed
CombineWithContext.KeyedCombineFnWithContext that applies multiple
KeyedCombineFnWithContexts. |
| CombineFns.ComposeKeyedCombineFnBuilder |
A builder class to construct a composed
CombineFnBase.PerKeyCombineFn. |
| CombineWithContext |
This class contains combine functions that have access to
PipelineOptions and side inputs
through CombineWithContext.Context. |
| CombineWithContext.CombineFnWithContext<InputT,AccumT,OutputT> |
A combine function that has access to
PipelineOptions and side inputs through
CombineWithContext.Context. |
| CombineWithContext.Context |
Information accessible to all methods in
CombineFnWithContext
and KeyedCombineFnWithContext. |
| CombineWithContext.KeyedCombineFnWithContext<K,InputT,AccumT,OutputT> |
A keyed combine function that has access to
PipelineOptions and side inputs through
CombineWithContext.Context. |
| Count |
PTransforms to count the elements in a PCollection. |
| Count.PerElement<T> |
Count.PerElement<T> takes a PCollection<T> and returns a
PCollection<KV<T, Long>> representing a map from each distinct element of the input
PCollection to the number of times that element occurs in the input. |
| Create<T> |
Create<T> takes a collection of elements of type T
known when the pipeline is constructed and returns a
PCollection<T> containing the elements. |
| Create.TimestampedValues<T> |
A
PTransform that creates a PCollection whose elements have
associated timestamps. |
| Create.Values<T> |
A
PTransform that creates a PCollection from a set of in-memory objects. |
| DoFn<InputT,OutputT> |
The argument to
ParDo providing the code to use to process
elements of the input
PCollection. |
| DoFnReflector |
Utility implementing the necessary reflection for working with
DoFnWithContexts. |
| DoFnTester<InputT,OutputT> |
A harness for unit-testing a
DoFn. |
| DoFnWithContext<InputT,OutputT> |
The argument to
ParDo providing the code to use to process
elements of the input
PCollection. |
| Filter<T> |
PTransforms for filtering from a PCollection the
elements satisfying a predicate, or satisfying an inequality with
a given value based on the elements' natural ordering. |
| FlatMapElements<InputT,OutputT> |
PTransforms for mapping a simple function that returns iterables over the elements of a
PCollection and merging the results. |
| FlatMapElements.MissingOutputTypeDescriptor<InputT,OutputT> |
An intermediate builder for a
FlatMapElements transform. |
| Flatten |
Flatten<T> takes multiple PCollection<T>s bundled
into a PCollectionList<T> and returns a single
PCollection<T> containing all the elements in all the input
PCollections. |
| Flatten.FlattenIterables<T> |
FlattenIterables<T> takes a PCollection<Iterable<T>> and returns a
PCollection<T> that contains all the elements from each iterable. |
| Flatten.FlattenPCollectionList<T> |
A
PTransform that flattens a PCollectionList
into a PCollection containing all the elements of all
the PCollections in its input. |
| GroupByKey<K,V> |
GroupByKey<K, V> takes a PCollection<KV<K, V>>,
groups the values by key and windows, and returns a
PCollection<KV<K, Iterable<V>>> representing a map from
each distinct key and window of the input PCollection to an
Iterable over all the values associated with that key in
the input per window. |
| IntraBundleParallelization |
Provides multi-threading of
DoFns, using threaded execution to
process multiple elements concurrently within a bundle. |
| IntraBundleParallelization.Bound<InputT,OutputT> |
A
PTransform that, when applied to a PCollection<InputT>,
invokes a user-specified DoFn<InputT, OutputT> on all its elements,
with all its outputs collected into an output
PCollection<OutputT>. |
| IntraBundleParallelization.MultiThreadedIntraBundleProcessingDoFn<InputT,OutputT> |
A multi-threaded
DoFn wrapper. |
| IntraBundleParallelization.Unbound |
An incomplete
IntraBundleParallelization transform, with unbound input/output types. |
| Keys<K> |
Keys<K> takes a PCollection of KV<K, V>s and
returns a PCollection<K> of the keys. |
| KvSwap<K,V> |
KvSwap<K, V> takes a PCollection<KV<K, V>> and
returns a PCollection<KV<V, K>>, where all the keys and
values have been swapped. |
| MapElements<InputT,OutputT> |
PTransforms for mapping a simple function over the elements of a PCollection. |
| MapElements.MissingOutputTypeDescriptor<InputT,OutputT> |
An intermediate builder for a
MapElements transform. |
| Max |
PTransforms for computing the maximum of the elements in a PCollection, or the
maximum of the values associated with each key in a PCollection of KVs. |
| Max.MaxDoubleFn |
A
CombineFn that computes the maximum of a collection of Doubles, useful as an
argument to Combine.globally(SerializableFunction<Iterable<V>, V>) or Combine.perKey(SerializableFunction<Iterable<V>, V>). |
| Max.MaxFn<T> |
A
CombineFn that computes the maximum of a collection of elements of type T
using an arbitrary Comparator, useful as an argument to Combine.globally(SerializableFunction<Iterable<V>, V>) or
Combine.perKey(SerializableFunction<Iterable<V>, V>). |
| Max.MaxIntegerFn |
A
CombineFn that computes the maximum of a collection of Integers, useful as an
argument to Combine.globally(SerializableFunction<Iterable<V>, V>) or Combine.perKey(SerializableFunction<Iterable<V>, V>). |
| Max.MaxLongFn |
A
CombineFn that computes the maximum of a collection of Longs, useful as an
argument to Combine.globally(SerializableFunction<Iterable<V>, V>) or Combine.perKey(SerializableFunction<Iterable<V>, V>). |
| Mean |
PTransforms for computing the arithmetic mean. |
| Min |
PTransforms for computing the minimum of the elements in a PCollection, or the
minimum of the values associated with each key in a PCollection of KVs. |
| Min.MinDoubleFn |
A
CombineFn that computes the minimum of a collection of Doubles, useful as an
argument to Combine.globally(SerializableFunction<Iterable<V>, V>) or Combine.perKey(SerializableFunction<Iterable<V>, V>). |
| Min.MinFn<T> |
A
CombineFn that computes the minimum of a collection of elements of type T
using an arbitrary Comparator, useful as an argument to Combine.globally(SerializableFunction<Iterable<V>, V>) or
Combine.perKey(SerializableFunction<Iterable<V>, V>). |
| Min.MinIntegerFn |
A
CombineFn that computes the minimum of a collection of Integers, useful as an
argument to Combine.globally(SerializableFunction<Iterable<V>, V>) or Combine.perKey(SerializableFunction<Iterable<V>, V>). |
| Min.MinLongFn |
A
CombineFn that computes the minimum of a collection of Longs, useful as an
argument to Combine.globally(SerializableFunction<Iterable<V>, V>) or Combine.perKey(SerializableFunction<Iterable<V>, V>). |
| ParDo |
ParDo is the core element-wise transform in Google Cloud
Dataflow, invoking a user-specified function on each of the elements of the input
PCollection to produce zero or more output elements, all
of which are collected into the output PCollection. |
| ParDo.Bound<InputT,OutputT> |
A
PTransform that, when applied to a PCollection<InputT>,
invokes a user-specified DoFn<InputT, OutputT> on all its elements,
with all its outputs collected into an output
PCollection<OutputT>. |
| ParDo.BoundMulti<InputT,OutputT> |
A
PTransform that, when applied to a
PCollection<InputT>, invokes a user-specified
DoFn<InputT, OutputT> on all its elements, which can emit elements
to any of the PTransform's main and side output
PCollections, which are bundled into a result
PCollectionTuple. |
| ParDo.Unbound |
An incomplete
ParDo transform, with unbound input/output types. |
| ParDo.UnboundMulti<OutputT> |
An incomplete multi-output
ParDo transform, with unbound
input type. |
| Partition<T> |
Partition takes a PCollection<T> and a
PartitionFn, uses the PartitionFn to split the
elements of the input PCollection into N partitions, and
returns a PCollectionList<T> that bundles N
PCollection<T>s containing the split elements. |
| PTransform<InputT extends PInput,OutputT extends POutput> | |
| RemoveDuplicates<T> |
RemoveDuplicates<T> takes a PCollection<T> and
returns a PCollection<T> that has all the elements of the
input but with duplicate elements removed such that each element is
unique within each window. |
| RemoveDuplicates.WithRepresentativeValues<T,IdT> |
A
RemoveDuplicates PTransform that uses a SerializableFunction to
obtain a representative value for each input element. |
| Sample |
PTransforms for taking samples of the elements in a
PCollection, or samples of the values associated with each
key in a PCollection of KVs. |
| Sample.FixedSizedSampleFn<T> |
CombineFn that computes a fixed-size sample of a
collection of values. |
| Sample.SampleAny<T> |
A
PTransform that takes a PCollection<T> and a limit, and
produces a new PCollection<T> containing up to limit
elements of the input PCollection. |
| SimpleFunction<InputT,OutputT> |
A
SerializableFunction which is not a functional interface. |
| Sum |
PTransforms for computing the sum of the elements in a
PCollection, or the sum of the values associated with
each key in a PCollection of KVs. |
| Sum.SumDoubleFn |
A
SerializableFunction that computes the sum of an
Iterable of Doubles, useful as an argument to
Combine.globally(SerializableFunction<Iterable<V>, V>) or Combine.perKey(SerializableFunction<Iterable<V>, V>). |
| Sum.SumIntegerFn |
A
SerializableFunction that computes the sum of an
Iterable of Integers, useful as an argument to
Combine.globally(SerializableFunction<Iterable<V>, V>) or Combine.perKey(SerializableFunction<Iterable<V>, V>). |
| Sum.SumLongFn |
A
SerializableFunction that computes the sum of an
Iterable of Longs, useful as an argument to
Combine.globally(SerializableFunction<Iterable<V>, V>) or Combine.perKey(SerializableFunction<Iterable<V>, V>). |
| Top |
PTransforms for finding the largest (or smallest) set
of elements in a PCollection, or the largest (or smallest)
set of values associated with each key in a PCollection of
KVs. |
| Top.Largest<T extends Comparable<? super T>> |
A
Serializable Comparator that uses the compared elements' natural
ordering. |
| Top.Smallest<T extends Comparable<? super T>> |
Serializable Comparator that uses the reverse of the compared elements'
natural ordering. |
| Top.TopCombineFn<T,ComparatorT extends Comparator<T> & Serializable> |
CombineFn for Top transforms that combines a
bunch of Ts into a single count-long
List<T>, using compareFn to choose the largest
Ts. |
| Values<V> |
Values<V> takes a PCollection of KV<K, V>s and
returns a PCollection<V> of the values. |
| View |
Transforms for creating
PCollectionViews from
PCollections (to read them as side inputs). |
| View.AsIterable<T> |
Not intended for direct use by pipeline authors; public only so a
PipelineRunner may
override its behavior. |
| View.AsList<T> |
Not intended for direct use by pipeline authors; public only so a
PipelineRunner may
override its behavior. |
| View.AsMap<K,V> |
Not intended for direct use by pipeline authors; public only so a
PipelineRunner may
override its behavior. |
| View.AsMultimap<K,V> |
Not intended for direct use by pipeline authors; public only so a
PipelineRunner may
override its behavior. |
| View.AsSingleton<T> |
Not intended for direct use by pipeline authors; public only so a
PipelineRunner may
override its behavior. |
| View.CreatePCollectionView<ElemT,ViewT> |
Creates a primitive
PCollectionView. |
| WithKeys<K,V> |
WithKeys<K, V> takes a PCollection<V>, and either a
constant key of type K or a function from V to
K, and returns a PCollection<KV<K, V>>, where each
of the values in the input PCollection has been paired with
either the constant key or a key computed from the value. |
| WithTimestamps<T> |
A
PTransform for assigning timestamps to all the elements of a PCollection. |
| Enum | Description |
|---|---|
| DoFnTester.CloningBehavior |
Whether or not a
DoFnTester should clone the DoFn under test. |
| Annotation Type | Description |
|---|---|
| DoFnWithContext.FinishBundle |
Annotation for the method to use to finish processing a batch of elements.
|
| DoFnWithContext.ProcessElement |
Annotation for the method to use for processing elements.
|
| DoFnWithContext.StartBundle |
Annotation for the method to use to prepare an instance for processing a batch of elements.
|
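
The combine-function entries above describe turning a collection of InputT values into one OutputT, using an accumulator whose partial results can be merged across bundles. As a rough illustration of that shape only, here is a minimal plain-Java sketch of a mean combiner. The class `MeanCombiner` and its nested `Accum` are hypothetical and do not extend or use any SDK class; the four method names are chosen to echo the accumulate, merge, and extract steps.

```java
import java.util.List;

/** Hypothetical mean combiner in plain Java; it models the combine contract, not the SDK classes. */
final class MeanCombiner {
    /** Mutable accumulator holding a running sum and an element count. */
    static final class Accum {
        double sum;
        long count;
    }

    /** Starts an empty accumulator (one per bundle in this model). */
    static Accum createAccumulator() {
        return new Accum();
    }

    /** Folds one input value into an accumulator and returns it. */
    static Accum addInput(Accum acc, double input) {
        acc.sum += input;
        acc.count++;
        return acc;
    }

    /** Merges partial accumulators, e.g. one per bundle, into a single accumulator. */
    static Accum mergeAccumulators(List<Accum> accumulators) {
        Accum merged = createAccumulator();
        for (Accum a : accumulators) {
            merged.sum += a.sum;
            merged.count += a.count;
        }
        return merged;
    }

    /** Converts the final accumulator into the output value; NaN when no input was seen. */
    static double extractOutput(Accum acc) {
        return acc.count == 0 ? Double.NaN : acc.sum / acc.count;
    }
}
```

Because the accumulator keeps the sum and the count rather than a partial mean, merging is exact regardless of how the inputs were split.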
A PTransform is an operation that takes an InputT (some subtype of PInput) and produces an OutputT (some subtype of POutput).
Common PTransforms include root PTransforms like TextIO.Read and Create, processing and conversion operations like ParDo, GroupByKey, CoGroupByKey, Combine, and Count, and outputting PTransforms like TextIO.Write.
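
To make the processing operations named above more concrete, the following plain-Java sketch models what three of them compute, using in-memory lists in place of PCollections. It is an illustration under simplifying assumptions: windowing, distribution, and serialization are ignored, and `TransformShapes` with its methods is a hypothetical helper, not part of the SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

/** Plain-Java model of three transform shapes; none of these names are SDK APIs. */
final class TransformShapes {
    /** ParDo-like: apply fn to every element; each call may yield zero or more outputs. */
    static <I, O> List<O> parDo(List<I> input, Function<I, List<O>> fn) {
        List<O> out = new ArrayList<>();
        for (I element : input) {
            out.addAll(fn.apply(element));
        }
        return out;
    }

    /** GroupByKey-like: collect all values that share a key (a single implicit window). */
    static <K extends Comparable<K>, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> input) {
        Map<K, List<V>> grouped = new TreeMap<>();
        for (Map.Entry<K, V> kv : input) {
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        return grouped;
    }

    /** Count.PerElement-like: map each distinct element to its number of occurrences. */
    static <T extends Comparable<T>> Map<T, Long> countPerElement(List<T> input) {
        Map<T, Long> counts = new TreeMap<>();
        for (T element : input) {
            counts.merge(element, 1L, Long::sum);
        }
        return counts;
    }
}
```

For example, passing a list of text lines through `parDo` with a function that splits each line into words, and then through `countPerElement`, gives a word-to-count map.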
New PTransforms can be created by composing existing PTransforms. Most PTransforms in this package are composites, and users can also create composite PTransforms for their own application-specific logic.
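
The idea of building a composite from existing transforms can be sketched in plain Java as function composition. The `Transform` interface, its `then` method, and `WordCountComposite` below are hypothetical stand-ins invented for this illustration; they are not the SDK's PTransform class, and lists stand in for PCollections.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a transform: one input in, one output out. */
interface Transform<I, O> {
    O apply(I input);

    /** Builds a composite that feeds this transform's output into the next one. */
    default <R> Transform<I, R> then(Transform<O, R> next) {
        return input -> next.apply(this.apply(input));
    }
}

final class WordCountComposite {
    /** Element-wise step: split every line on whitespace, dropping empty tokens. */
    static final Transform<List<String>, List<String>> SPLIT_WORDS = lines -> {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    };

    /** Counting step: map each distinct word to its number of occurrences. */
    static final Transform<List<String>, Map<String, Long>> COUNT = words -> {
        Map<String, Long> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word, 1L, Long::sum);
        }
        return counts;
    };

    /** The composite: callers see a single transform from lines to word counts. */
    static final Transform<List<String>, Map<String, Long>> COUNT_WORDS = SPLIT_WORDS.then(COUNT);
}
```

Callers apply `COUNT_WORDS` like any other single transform, which is the point of a composite: application-specific logic gets one name and one input/output signature while reusing existing steps.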