@Experimental(value=SOURCE_SINK)
public class SyntheticBoundedIO
extends java.lang.Object
SyntheticBoundedIO provides a deterministic, parameterizable custom source for batch
pipelines.
SyntheticBoundedIO.SyntheticBoundedSource generates a PCollection of KV<byte[], byte[]>.
A fraction of the generated records are associated with "hot" keys and are uniformly
distributed over a fixed number of hot keys. The remaining records are associated with
"random" keys. When the SyntheticSourceReader reads a record, it delays that record by a
sleep time generated from the specified sleep time distribution. Each record is generated
deterministically from its position in the source, which enables repeatable execution for
debugging. The configurable parameters of SyntheticBoundedIO are defined in
SyntheticBoundedIO.SyntheticSourceOptions.
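The position-based generation described above can be illustrated with a minimal sketch. This is not the Beam implementation: the class SyntheticRecordSketch, its constants, the seed-mixing function, and the use of java.util.Random are all assumptions chosen for illustration. It only shows why deriving each record from its position makes reads repeatable, and how a fraction of records can be mapped onto a fixed set of hot keys.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch only; none of these names come from SyntheticBoundedIO.
final class SyntheticRecordSketch {
    // Hypothetical parameters, standing in for values that real options would supply.
    static final long SEED = 42L;
    static final int KEY_SIZE = 8;
    static final int VALUE_SIZE = 16;
    static final int NUM_HOT_KEYS = 4;
    static final double HOT_KEY_FRACTION = 0.25;

    // SplitMix64-style bit mixer, used so that neighboring positions
    // do not produce correlated first draws from java.util.Random.
    static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    /** Returns {key, value} for the record at the given position. */
    static byte[][] recordAt(long position) {
        // The generator depends only on SEED and position, never on read order.
        Random random = new Random(mix(SEED + position));
        byte[] key = new byte[KEY_SIZE];
        if (random.nextDouble() < HOT_KEY_FRACTION) {
            // Hot record: choose one of the fixed hot keys uniformly,
            // then derive the key bytes from that hot key's index alone.
            int hotIndex = random.nextInt(NUM_HOT_KEYS);
            new Random(mix(hotIndex)).nextBytes(key);
        } else {
            // Random record: the key bytes come from the per-position generator.
            random.nextBytes(key);
        }
        byte[] value = new byte[VALUE_SIZE];
        random.nextBytes(value);
        return new byte[][] {key, value};
    }

    public static void main(String[] args) {
        byte[][] first = recordAt(7);
        byte[][] second = recordAt(7);
        // Reading the same position twice yields identical bytes.
        System.out.println(
            Arrays.equals(first[0], second[0]) && Arrays.equals(first[1], second[1])); // prints "true"
    }
}
```

Because recordAt takes nothing but the position, a reader in this sketch could start at any offset and still produce the same records as a reader that started at the beginning.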
To read a PCollection of KV<byte[], byte[]> from SyntheticBoundedIO, use
readFrom(SyntheticBoundedIO.SyntheticSourceOptions) to construct the synthetic source from
a set of synthetic source options. See SyntheticBoundedIO.SyntheticSourceOptions for how to
construct an instance.
An example is below:
Pipeline p = ...;
SyntheticBoundedIO.SyntheticSourceOptions sso = ...;
// Construct the synthetic input with synthetic source options.
PCollection<KV<byte[], byte[]>> input = p.apply(SyntheticBoundedIO.readFrom(sso));
| Modifier and Type | Class and Description |
|---|---|
| static class | SyntheticBoundedIO.ProgressShape: Shape of the progress reporting curve as a function of the current offset in the SyntheticBoundedIO.SyntheticBoundedSource. |
| static class | SyntheticBoundedIO.SyntheticBoundedSource: A SyntheticBoundedIO.SyntheticBoundedSource that reads KV<byte[], byte[]>. |
| static class | SyntheticBoundedIO.SyntheticSourceOptions: Synthetic bounded source options. |

| Constructor and Description |
|---|
| SyntheticBoundedIO() |

| Modifier and Type | Method and Description |
|---|---|
| static org.apache.beam.sdk.io.Read.Bounded<org.apache.beam.sdk.values.KV<byte[],byte[]>> | readFrom(SyntheticBoundedIO.SyntheticSourceOptions options): Read from the synthetic source options. |

public static org.apache.beam.sdk.io.Read.Bounded<org.apache.beam.sdk.values.KV<byte[],byte[]>> readFrom(SyntheticBoundedIO.SyntheticSourceOptions options)