Type Parameters:
T - Type of records read by the source.

public abstract class BoundedSource<T> extends Source<T>
A Source that reads a finite amount of input and, because of that, supports
some additional operations.
The operations are:
- Splitting into bundles of approximately a given size: splitIntoBundles(long, org.apache.beam.sdk.options.PipelineOptions);
- Size estimation: getEstimatedSizeBytes(org.apache.beam.sdk.options.PipelineOptions);
- Reporting whether the source produces key/value pairs sorted by encoded key: producesSortedKeys(org.apache.beam.sdk.options.PipelineOptions).

The reader (BoundedSource.BoundedReader) has additional functionality to enable runners
to dynamically adapt based on runtime conditions:
- Progress estimation (BoundedSource.BoundedReader.getFractionConsumed());
- Tracking of parallelism (BoundedSource.BoundedReader.getSplitPointsConsumed() and
BoundedSource.BoundedReader.getSplitPointsRemaining());
- Dynamic work rebalancing (BoundedSource.BoundedReader.splitAtFraction(double)).
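
The reader operations listed above can be illustrated without the SDK. The following is a minimal sketch, assuming a reader over a simple offset range in which each offset holds one record. The class OffsetRangeProgressSketch and all of its members are hypothetical stand-ins and are not part of the Beam API; a real BoundedSource.BoundedReader has further requirements (for example thread safety) that this sketch ignores.

```java
/**
 * Hypothetical, SDK-independent stand-in for a reader over the offset range
 * [start, end). It only illustrates the arithmetic behind progress estimation
 * and dynamic splitting; it is not a Beam class.
 */
final class OffsetRangeProgressSketch {
    private final long start;
    private long end;      // exclusive; shrinks when the range is split
    private long current;  // offset of the next record to read

    OffsetRangeProgressSketch(long start, long end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("need 0 <= start <= end");
        }
        this.start = start;
        this.end = end;
        this.current = start;
    }

    /** Consumes one record (one offset). Returns false once the range is exhausted. */
    boolean advance() {
        if (current >= end) {
            return false;
        }
        current++;
        return true;
    }

    /** Fraction of the current range already consumed, in [0.0, 1.0]. */
    double fractionConsumed() {
        if (end == start) {
            return 1.0; // an empty range is trivially finished
        }
        return (double) (current - start) / (end - start);
    }

    /**
     * Tries to split the range at the given fraction. On success the reader keeps
     * [start, splitOffset) and the residual {splitOffset, oldEnd} is returned.
     * Returns null when the split is refused.
     */
    long[] splitAtFraction(double fraction) {
        if (fraction <= 0.0 || fraction >= 1.0) {
            return null;
        }
        long splitOffset = start + (long) Math.floor(fraction * (end - start));
        // Conservatively refuse split points at or before the read position.
        if (splitOffset <= current || splitOffset >= end) {
            return null;
        }
        long[] residual = {splitOffset, end};
        end = splitOffset;
        return residual;
    }

    public static void main(String[] args) {
        OffsetRangeProgressSketch reader = new OffsetRangeProgressSketch(0, 100);
        for (int i = 0; i < 25; i++) {
            reader.advance();
        }
        System.out.println("fraction consumed: " + reader.fractionConsumed());
        long[] residual = reader.splitAtFraction(0.5);
        System.out.println("residual: [" + residual[0] + ", " + residual[1] + ")");
        System.out.println("fraction consumed after split: " + reader.fractionConsumed());
    }
}
```

In this sketch a successful split shrinks the reader's own range, so the consumed fraction rises even though no further record was read. A split request that falls at or before the current position is refused rather than honored.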

To use this class to support a custom input type, derive your class
from it and override the abstract methods. For an example, see DatastoreIO.

| Modifier and Type | Class and Description |
|---|---|
| static class | BoundedSource.BoundedReader<T>: A Reader that reads a bounded amount of input and supports some additional operations, such as progress estimation and dynamic work rebalancing. |

Nested classes inherited from class Source: Source.Reader<T>

| Constructor and Description |
|---|
| BoundedSource() |

| Modifier and Type | Method and Description |
|---|---|
| abstract BoundedSource.BoundedReader<T> | createReader(PipelineOptions options): Returns a new BoundedSource.BoundedReader that reads from this source. |
| abstract long | getEstimatedSizeBytes(PipelineOptions options): An estimate of the total size (in bytes) of the data that would be read from this source. |
| abstract boolean | producesSortedKeys(PipelineOptions options): Whether this source is known to produce key/value pairs sorted by lexicographic order on the bytes of the encoded key. |
| abstract List<? extends BoundedSource<T>> | splitIntoBundles(long desiredBundleSizeBytes, PipelineOptions options): Splits the source into bundles of approximately desiredBundleSizeBytes. |
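
Methods inherited from class Source: getDefaultOutputCoder, populateDisplayData, validate

public abstract List<? extends BoundedSource<T>> splitIntoBundles(long desiredBundleSizeBytes, PipelineOptions options) throws Exception
Splits the source into bundles of approximately desiredBundleSizeBytes.
Throws: Exception

public abstract long getEstimatedSizeBytes(PipelineOptions options) throws Exception
An estimate of the total size (in bytes) of the data that would be read from this source.
Throws: Exception

public abstract boolean producesSortedKeys(PipelineOptions options) throws Exception
Whether this source is known to produce key/value pairs sorted by lexicographic order on the bytes of the encoded key.
Throws: Exception

public abstract BoundedSource.BoundedReader<T> createReader(PipelineOptions options) throws IOException
Returns a new BoundedSource.BoundedReader that reads from this source.
Throws: IOException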
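
As an illustration of what splitting into bundles of approximately desiredBundleSizeBytes can mean in practice, here is a minimal sketch that assumes the source covers a contiguous byte range and can be cut at any offset. The class BundleSplitSketch and its methods are hypothetical and do not use the Beam SDK; an actual splitIntoBundles implementation returns BoundedSource instances rather than raw ranges, and may have to respect record boundaries.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical, SDK-independent sketch of the arithmetic behind size estimation
 * and splitting a byte range [start, end) into bundles. Not a Beam class.
 */
final class BundleSplitSketch {

    /** Size estimate for a contiguous byte range: simply its length. */
    static long estimatedSizeBytes(long start, long end) {
        return end - start;
    }

    /**
     * Cuts [start, end) into consecutive sub-ranges of desiredBundleSizeBytes each;
     * the last sub-range may be smaller. Each entry is {bundleStart, bundleEnd}.
     */
    static List<long[]> split(long start, long end, long desiredBundleSizeBytes) {
        if (desiredBundleSizeBytes <= 0) {
            throw new IllegalArgumentException("desiredBundleSizeBytes must be positive");
        }
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("need 0 <= start <= end");
        }
        List<long[]> bundles = new ArrayList<>();
        long bundleStart = start;
        while (bundleStart < end) {
            // Compare against the remaining length to avoid overflow near Long.MAX_VALUE.
            long bundleEnd = (end - bundleStart <= desiredBundleSizeBytes)
                    ? end
                    : bundleStart + desiredBundleSizeBytes;
            bundles.add(new long[] {bundleStart, bundleEnd});
            bundleStart = bundleEnd;
        }
        return bundles;
    }

    public static void main(String[] args) {
        System.out.println("estimated size: " + estimatedSizeBytes(0, 100));
        for (long[] bundle : split(0, 100, 30)) {
            System.out.println("[" + bundle[0] + ", " + bundle[1] + ")");
        }
    }
}
```

The bundles are contiguous and non-overlapping, so reading all of them covers the original range exactly once. The word "approximately" in the method description leaves room for the final, smaller bundle.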