@Experimental(value=SOURCE_SINK)
public class SyntheticBoundedSource
extends org.apache.beam.sdk.io.OffsetBasedSource<org.apache.beam.sdk.values.KV<byte[],byte[]>>
A synthetic bounded source that reads KV<byte[], byte[]> records.
The SyntheticBoundedSource generates a PCollection of KV<byte[], byte[]>. A fraction of the
generated records are assigned "hot" keys, which are drawn uniformly from a fixed number of hot
keys. The remaining records are assigned "random" keys. When the
SyntheticBoundedSource.SyntheticSourceReader reads a record, it delays that record by a sleep time
drawn from the specified sleep time distribution. Each record KV<byte[], byte[]> is generated
deterministically from the record's position in the source, which makes executions repeatable
for debugging. The configurable parameters of SyntheticBoundedSource are defined in
SyntheticSourceOptions.
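
The position-based generation described above can be pictured with a small standalone sketch. It is not the Beam implementation: the class name, the constants (which stand in for values that would come from SyntheticSourceOptions), the key encoding, and the seed-mixing function are all assumptions made for illustration. The only idea it demonstrates is that seeding a random generator from the record's position makes the hot/random key decision and the record bytes a pure function of that position.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Random;

/** Illustrative sketch only; none of this is taken from the Beam source. */
final class SyntheticRecordSketch {
  // Hypothetical settings standing in for SyntheticSourceOptions values.
  static final double HOT_KEY_FRACTION = 0.25;
  static final int NUM_HOT_KEYS = 4;
  static final int KEY_SIZE_BYTES = 8;
  static final int VALUE_SIZE_BYTES = 16;

  /** Scrambles a position so that neighbouring positions get unrelated seeds. */
  static long mix(long z) {
    z += 0x9E3779B97F4A7C15L;
    z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
    z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
    return z ^ (z >>> 31);
  }

  /** Returns a generator whose output depends only on the record position. */
  static Random randomFor(long position) {
    return new Random(mix(position));
  }

  /** True if the record at this position is assigned a hot key. */
  static boolean isHot(long position) {
    return randomFor(position).nextDouble() < HOT_KEY_FRACTION;
  }

  /** Key for the record at this position; the same position always yields the same key. */
  static byte[] keyAt(long position) {
    Random random = randomFor(position);
    boolean hot = random.nextDouble() < HOT_KEY_FRACTION; // same first draw as isHot
    if (hot) {
      // Hot record: choose one of NUM_HOT_KEYS keys uniformly and encode its index.
      int hotIndex = random.nextInt(NUM_HOT_KEYS);
      return ByteBuffer.allocate(KEY_SIZE_BYTES).putLong(hotIndex).array();
    }
    // Random record: fill the key with pseudo-random bytes.
    byte[] key = new byte[KEY_SIZE_BYTES];
    random.nextBytes(key);
    return key;
  }

  /** Value for the record at this position, generated from a separate seed. */
  static byte[] valueAt(long position) {
    Random random = new Random(mix(~position));
    byte[] value = new byte[VALUE_SIZE_BYTES];
    random.nextBytes(value);
    return value;
  }

  public static void main(String[] args) {
    for (long position = 0; position < 5; position++) {
      System.out.println(position + " hot=" + isHot(position)
          + " key=" + Arrays.toString(keyAt(position)));
    }
  }
}
```

Because nothing in the sketch depends on shared mutable state, two readers that cover the same offset would produce identical records, which is the property that makes a rerun reproducible.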
org.apache.beam.sdk.io.OffsetBasedSource.OffsetBasedReader<T>

| Constructor and Description |
|---|
| SyntheticBoundedSource(long startOffset, long endOffset, SyntheticSourceOptions sourceOptions) |
| SyntheticBoundedSource(SyntheticSourceOptions sourceOptions) |

| Modifier and Type | Method and Description |
|---|---|
| org.apache.beam.sdk.io.synthetic.SyntheticBoundedSource.SyntheticSourceReader | createReader(org.apache.beam.sdk.options.PipelineOptions pipelineOptions) |
| SyntheticBoundedSource | createSourceForSubrange(long start, long end) |
| long | getBytesPerOffset() |
| org.apache.beam.sdk.coders.Coder<org.apache.beam.sdk.values.KV<byte[],byte[]>> | getDefaultOutputCoder() |
| long | getMaxEndOffset(org.apache.beam.sdk.options.PipelineOptions options) |
| java.util.List<SyntheticBoundedSource> | split(long desiredBundleSizeBytes, org.apache.beam.sdk.options.PipelineOptions options) |
| java.lang.String | toString() |
| void | validate() |

public SyntheticBoundedSource(SyntheticSourceOptions sourceOptions)
public SyntheticBoundedSource(long startOffset,
long endOffset,
SyntheticSourceOptions sourceOptions)
public org.apache.beam.sdk.coders.Coder<org.apache.beam.sdk.values.KV<byte[],byte[]>> getDefaultOutputCoder()
getDefaultOutputCoder in class org.apache.beam.sdk.io.Source<org.apache.beam.sdk.values.KV<byte[],byte[]>>

public long getBytesPerOffset()
getBytesPerOffset in class org.apache.beam.sdk.io.OffsetBasedSource<org.apache.beam.sdk.values.KV<byte[],byte[]>>

public void validate()
validate in class org.apache.beam.sdk.io.OffsetBasedSource<org.apache.beam.sdk.values.KV<byte[],byte[]>>

public java.lang.String toString()
toString in class org.apache.beam.sdk.io.OffsetBasedSource<org.apache.beam.sdk.values.KV<byte[],byte[]>>

public final SyntheticBoundedSource createSourceForSubrange(long start, long end)
createSourceForSubrange in class org.apache.beam.sdk.io.OffsetBasedSource<org.apache.beam.sdk.values.KV<byte[],byte[]>>

public long getMaxEndOffset(org.apache.beam.sdk.options.PipelineOptions options)
getMaxEndOffset in class org.apache.beam.sdk.io.OffsetBasedSource<org.apache.beam.sdk.values.KV<byte[],byte[]>>

public org.apache.beam.sdk.io.synthetic.SyntheticBoundedSource.SyntheticSourceReader createReader(org.apache.beam.sdk.options.PipelineOptions pipelineOptions)
createReader in class org.apache.beam.sdk.io.BoundedSource<org.apache.beam.sdk.values.KV<byte[],byte[]>>

public java.util.List<SyntheticBoundedSource> split(long desiredBundleSizeBytes, org.apache.beam.sdk.options.PipelineOptions options) throws java.lang.Exception
split in class org.apache.beam.sdk.io.OffsetBasedSource<org.apache.beam.sdk.values.KV<byte[],byte[]>>
Throws: java.lang.Exception