Type Parameters: T - the type of the elements of this PCollection
public class PCollection<T> extends TypedPValue<T>
A PCollection<T> is an immutable collection of values of type
T. A PCollection can contain either a bounded or unbounded
number of elements. Bounded and unbounded PCollections are produced
as the output of PTransforms
(including root PTransforms like Read and Create), and can
be passed as the inputs of other PTransforms.
Some root transforms produce bounded PCollections and others
produce unbounded ones. For example, CountingInput.upTo(long) produces a fixed set of integers,
so it produces a bounded PCollection. CountingInput.unbounded() produces all
integers as an infinite stream, so it produces an unbounded PCollection.
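As a rough analogy only, the bounded/unbounded distinction can be sketched in plain Java with finite and infinite streams. This does not use the Beam SDK: the class BoundednessSketch and its methods are hypothetical stand-ins, and the assumption that the bounded case yields 0 through n-1 is made for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

// Plain-Java analogy, not Beam code: all names here are hypothetical.
final class BoundednessSketch {

    // Stand-in for a bounded root transform: a fixed set of integers
    // (assumed here to be 0 .. n-1) that eventually ends.
    static LongStream upTo(long n) {
        return LongStream.range(0, n);
    }

    // Stand-in for an unbounded root transform: 0, 1, 2, ... with no end.
    static LongStream unbounded() {
        return LongStream.iterate(0, i -> i + 1);
    }

    // An infinite stream can only be materialized after cutting it off.
    static List<Long> take(LongStream stream, long k) {
        return stream.limit(k).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The bounded case can be consumed in full.
        System.out.println(upTo(5).count());
        // The unbounded case must be limited before collecting.
        System.out.println(take(unbounded(), 3));
    }
}
```

The point of the analogy is that whole-collection operations such as count() terminate only on the bounded case; the unbounded case needs some way of carving out finite pieces first.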
Each element in a PCollection has an associated timestamp. Readers assign timestamps
to elements when they create PCollections, and other
PTransforms propagate these timestamps from their input to their output. See
the documentation on BoundedSource.BoundedReader and UnboundedSource.UnboundedReader for more information on
how these readers produce timestamps and watermarks.
Additionally, a PCollection has an associated
WindowFn and each element is assigned to a set of windows.
By default, the windowing function is GlobalWindows
and all elements are assigned into a single default window.
This default can be overridden with the Window
PTransform.
See the individual PTransform subclasses for specific information
on how they propagate timestamps and windowing.
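The relationship between timestamps and windows described above can be illustrated with a small plain-Java model. It is a sketch under stated assumptions, not the Beam SDK's implementation: WindowAssignmentSketch and its methods are hypothetical, and fixed-size windows are chosen here only as one example of a non-default windowing function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of window assignment, not Beam code: all names are hypothetical.
final class WindowAssignmentSketch {

    static final String GLOBAL = "global";

    // Models the default: every timestamp lands in the same single window.
    static String globalWindow(long timestampMillis) {
        return GLOBAL;
    }

    // Models a fixed-size windowing function: returns the start of the
    // window [start, start + size) that contains the timestamp.
    // floorMod keeps the result correct for negative timestamps.
    static long fixedWindowStart(long timestampMillis, long sizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, sizeMillis);
    }

    // Groups element timestamps by the start of their fixed window.
    static Map<Long, List<Long>> groupByFixedWindow(long[] timestamps, long sizeMillis) {
        Map<Long, List<Long>> windows = new TreeMap<>();
        for (long ts : timestamps) {
            windows.computeIfAbsent(fixedWindowStart(ts, sizeMillis), k -> new ArrayList<>())
                   .add(ts);
        }
        return windows;
    }

    public static void main(String[] args) {
        long[] timestamps = {5L, 59_999L, 60_000L, 125_000L};
        // One-minute windows split the four elements into three groups.
        System.out.println(groupByFixedWindow(timestamps, 60_000L));
        // prints {0=[5, 59999], 60000=[60000], 120000=[125000]}
    }
}
```

Under the default, all four elements would share the one global window; overriding the windowing function changes only which window each timestamp maps to, not the elements themselves.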
| Modifier and Type | Class and Description |
|---|---|
| static class | PCollection.IsBounded: The enumeration of cases for whether a PCollection is bounded. |
| Modifier and Type | Method and Description |
|---|---|
| <OutputT extends POutput> OutputT | apply(PTransform<? super PCollection<T>,OutputT> t): Like apply(String, PTransform) but defaulting to the name of the PTransform. |
| <OutputT extends POutput> OutputT | apply(String name, PTransform<? super PCollection<T>,OutputT> t): Applies the given PTransform to this input PCollection, using name to identify this specific application of the transform. |
| static <T> PCollection<T> | createPrimitiveOutputInternal(Pipeline pipeline, org.apache.beam.sdk.util.WindowingStrategy<?,?> windowingStrategy, PCollection.IsBounded isBounded): Creates and returns a new PCollection for a primitive output. |
| Coder<T> | getCoder(): Returns the Coder used by this PCollection to encode and decode the values stored in it. |
| String | getName(): Returns the name of this PCollection. |
| org.apache.beam.sdk.util.WindowingStrategy<?,?> | getWindowingStrategy(): Returns the WindowingStrategy of this PCollection. |
| PCollection.IsBounded | isBounded() |
| PCollection<T> | setCoder(Coder<T> coder): Sets the Coder used by this PCollection to encode and decode the values stored in it. |
| PCollection<T> | setIsBoundedInternal(PCollection.IsBounded isBounded): Sets the PCollection.IsBounded of this PCollection. |
| PCollection<T> | setName(String name): Sets the name of this PCollection. |
| PCollection<T> | setTypeDescriptorInternal(TypeDescriptor<T> typeDescriptor): Sets the TypeDescriptor<T> for this PCollection<T>. |
| PCollection<T> | setWindowingStrategyInternal(org.apache.beam.sdk.util.WindowingStrategy<?,?> windowingStrategy): Sets the WindowingStrategy of this PCollection. |
Inherited methods (one line per declaring type):
finishSpecifying, getTypeDescriptor
expand, getKindString, isFinishedSpecifyingInternal, recordAsOutput, recordAsOutput, toString
finishSpecifyingOutput, getPipeline, getProducingTransformInternal
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
getProducingTransformInternal
expand, finishSpecifyingOutput, getPipeline, recordAsOutput
expand, getPipeline

public String getName()
Returns the name of this PCollection.
By default, the name of a PCollection is based on the name of the
PTransform that produces it. It can be specified explicitly by
calling setName(java.lang.String).
Specified by: getName in interface PValue
Overrides: getName in class PValueBase
Throws: IllegalStateException - if the name hasn't been set yet

public PCollection<T> setName(String name)
Sets the name of this PCollection. Returns this.
Overrides: setName in class PValueBase
Throws: IllegalStateException - if this PCollection has already been
finalized and may no longer be set.
Once apply(org.apache.beam.sdk.transforms.PTransform<? super org.apache.beam.sdk.values.PCollection<T>, OutputT>) has been called, this will be the case.

public Coder<T> getCoder()
Returns the Coder used by this PCollection to encode and decode
the values stored in it.
Overrides: getCoder in class TypedPValue<T>
Throws: IllegalStateException - if the Coder hasn't been set, and
couldn't be inferred.

public PCollection<T> setCoder(Coder<T> coder)
Sets the Coder used by this PCollection to encode and decode the
values stored in it.
Overrides: setCoder in class TypedPValue<T>
Throws: IllegalStateException - if this PCollection has already
been finalized and may no longer be set.
Once apply(org.apache.beam.sdk.transforms.PTransform<? super org.apache.beam.sdk.values.PCollection<T>, OutputT>) has been called, this will be the case.

public <OutputT extends POutput> OutputT apply(PTransform<? super PCollection<T>,OutputT> t)
Like apply(String, PTransform) but defaulting to the name
of the PTransform.
Returns: the output of the applied PTransform

public <OutputT extends POutput> OutputT apply(String name, PTransform<? super PCollection<T>,OutputT> t)
Applies the given PTransform to this input PCollection,
using name to identify this specific application of the transform.
This name is used in various places, including the monitoring UI, logging,
and to stably identify this application node in the job graph.
Returns: the output of the applied PTransform

public org.apache.beam.sdk.util.WindowingStrategy<?,?> getWindowingStrategy()
Returns the WindowingStrategy of this PCollection.

public PCollection.IsBounded isBounded()

public PCollection<T> setTypeDescriptorInternal(TypeDescriptor<T> typeDescriptor)
Sets the TypeDescriptor<T> for this
PCollection<T>. This may allow the enclosing
PCollectionTuple, PCollectionList, or PTransform<?, PCollection<T>>,
etc., to provide more detailed reflective information.
Overrides: setTypeDescriptorInternal in class TypedPValue<T>

public PCollection<T> setWindowingStrategyInternal(org.apache.beam.sdk.util.WindowingStrategy<?,?> windowingStrategy)
Sets the WindowingStrategy of this PCollection.

public PCollection<T> setIsBoundedInternal(PCollection.IsBounded isBounded)
Sets the PCollection.IsBounded of this PCollection.

public static <T> PCollection<T> createPrimitiveOutputInternal(Pipeline pipeline, org.apache.beam.sdk.util.WindowingStrategy<?,?> windowingStrategy, PCollection.IsBounded isBounded)
Creates and returns a new PCollection for a primitive output.
For use by primitive transformations only.