SyntheticBoundedIO.SyntheticSourceOptions sourceOptions
long numRecords
long splitPointFrequencyRecords
SyntheticOptions.Sampler bundleSizeDistribution
When splitting into bundles of "desiredBundleSizeBytes", we compute the desired number of bundles N, sample N numbers from this distribution, normalize them so that they sum to 1, and use the normalized values to set the boundaries of the generated bundles.
The Zipf distribution is expected to be particularly useful here.
For example, empirically, with 100 bundles, the Zipf distribution with a parameter of 3.5 generates bundles where the largest is about 3x-10x larger than the median; with a parameter of 3.0 this ratio is about 5x-50x; with 2.5, 5x-100x (i.e. one bundle can be as large as all the others combined).
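The sampling-and-normalizing step described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the class name BundleBoundaries, the method boundaries, and the use of a plain DoubleSupplier in place of SyntheticOptions.Sampler are all assumptions made for the example, and the number of bundles N is taken as an input because the text does not say how it is derived from "desiredBundleSizeBytes".

```java
import java.util.Random;
import java.util.function.DoubleSupplier;

// Hypothetical helper, not part of the Beam API.
final class BundleBoundaries {

    /**
     * Returns numBundles + 1 record offsets in [0, numRecords].
     * Bundle i covers the half-open range [offsets[i], offsets[i + 1]).
     * The sampler is assumed to return strictly positive values.
     */
    public static long[] boundaries(long numRecords, int numBundles, DoubleSupplier sampler) {
        // Draw one raw weight per bundle and accumulate the total.
        double[] weights = new double[numBundles];
        double total = 0.0;
        for (int i = 0; i < numBundles; i++) {
            weights[i] = sampler.getAsDouble();
            total += weights[i];
        }

        // Normalize so the weights sum to 1, then turn the running sum
        // into absolute record offsets.
        long[] offsets = new long[numBundles + 1];
        double cumulative = 0.0;
        for (int i = 0; i < numBundles; i++) {
            cumulative += weights[i] / total;
            offsets[i + 1] = Math.min(numRecords, Math.round(cumulative * numRecords));
        }
        // Guard against floating-point drift: the last bundle always ends at numRecords.
        offsets[numBundles] = numRecords;
        return offsets;
    }

    public static void main(String[] args) {
        // A uniform sampler stands in for the configured distribution here.
        Random random = new Random(42);
        long[] offsets = boundaries(1000L, 5, () -> 1.0 + random.nextDouble());
        for (int i = 0; i < offsets.length - 1; i++) {
            System.out.println("bundle " + i + ": [" + offsets[i] + ", " + offsets[i + 1] + ")");
        }
    }
}
```

With a heavy-tailed sampler such as Zipf, a few of the drawn weights dominate the total, which is what produces the skewed bundle sizes mentioned above.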
java.lang.Integer forceNumInitialBundles
SyntheticBoundedIO.ProgressShape progressShape
SyntheticOptions.Sampler initializeDelayDistribution
SyntheticOptions.delayDistribution.
long keySizeBytes
long valueSizeBytes
long bytesPerRecord
long numHotKeys
double hotKeyFraction
double largeKeyFraction
double largeKeySizeBytes
int seed
SyntheticOptions.Sampler delayDistribution
The delayDistribution field is not used by the synthetic unbounded source, which instead uses a RateLimiter to control QPS.
SyntheticOptions.DelayType delayType
double cpuUtilizationInMixedDelay
SyntheticStep.Options options
org.apache.beam.sdk.values.KV<K,V> idAndThroughput
org.apache.beam.sdk.metrics.Counter throttlingCounter
double outputRecordsPerInputRecord
boolean preservesInputKeyDistribution
SyntheticOptions, and input records are merely used as a “clock”. If true, the shape of the input distribution is preserved, and the DoFn only does sleeping and amplification/filtering.
long maxWorkerThroughput
long perBundleDelay
maxWorkerThroughput >= 0.
SyntheticOptions.DelayType perBundleDelayType
boolean reportThrottlingMicros