Type Parameters: T - the type of each of the elements of the input PCollection

public static class AvroIO.Write.Bound<T> extends PTransform<PCollection<T>,PDone>

A PTransform that writes a bounded PCollection to an Avro file (or
multiple Avro files matching a sharding pattern).

Fields inherited from class PTransform: name

| Modifier and Type | Method and Description |
|---|---|
| PDone | apply(PCollection<T> input): Applies this PTransform on the given InputT, and returns its Output. |
| protected Coder<Void> | getDefaultOutputCoder(): Returns the default Coder to use for the output of this single-output PTransform. |
| String | getFilenamePrefix() |
| String | getFilenameSuffix() |
| int | getNumShards() |
| Schema | getSchema() |
| String | getShardNameTemplate(): Returns the current shard name template string. |
| String | getShardTemplate() |
| Class<T> | getType() |
| boolean | needsValidation() |
| void | populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component. |
| AvroIO.Write.Bound<T> | to(String filenamePrefix): Returns a new PTransform that's like this one but that writes to the file(s) with the given filename prefix. |
| AvroIO.Write.Bound<T> | withNumShards(int numShards): Returns a new PTransform that's like this one but that uses the provided shard count. |
| AvroIO.Write.Bound<T> | withoutSharding(): Returns a new PTransform that's like this one but that forces a single file as output. |
| AvroIO.Write.Bound<T> | withoutValidation(): Returns a new PTransform that's like this one but that has GCS output path validation on pipeline creation disabled. |
| <X> AvroIO.Write.Bound<X> | withSchema(Class<X> type): Returns a new PTransform that's like this one but that writes to Avro file(s) containing records whose type is the specified Avro-generated class. |
| AvroIO.Write.Bound<GenericRecord> | withSchema(Schema schema): Returns a new PTransform that's like this one but that writes to Avro file(s) containing records of the specified schema. |
| AvroIO.Write.Bound<GenericRecord> | withSchema(String schema): Returns a new PTransform that's like this one but that writes to Avro file(s) containing records of the specified schema in a JSON-encoded string form. |
| AvroIO.Write.Bound<T> | withShardNameTemplate(String shardTemplate): Returns a new PTransform that's like this one but that uses the given shard name template. |
| AvroIO.Write.Bound<T> | withSuffix(String filenameSuffix): Returns a new PTransform that's like this one but that writes to the file(s) with the given filename suffix. |

Methods inherited from class PTransform: getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, toString, validate

public AvroIO.Write.Bound<T> to(String filenamePrefix)
Returns a new PTransform that's like this one but
that writes to the file(s) with the given filename prefix.
See AvroIO.Write.to(String) for more information
about filenames.
Does not modify this object.

public AvroIO.Write.Bound<T> withSuffix(String filenameSuffix)
Returns a new PTransform that's like this one but
that writes to the file(s) with the given filename suffix.
See ShardNameTemplate for a description of shard templates.
Does not modify this object.
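
The filename prefix, shard name template, and filename suffix combine into one output name per shard. The following is a minimal, self-contained sketch of that composition, not the SDK's implementation. It assumes the template convention described by ShardNameTemplate, in which each run of S is replaced by the zero-padded shard index and each run of N by the zero-padded shard count. The class ShardNameSketch, its method names, the template value -SSSSS-of-NNNNN, and the prefix and suffix used in main are all hypothetical and chosen only for illustration.

```java
import java.util.Locale;

public class ShardNameSketch {
    /**
     * Expands a shard name template (assumed convention): each run of 'S' becomes
     * the zero-padded shard index, each run of 'N' the zero-padded shard count.
     * All other characters are copied unchanged.
     */
    static String expandTemplate(String template, int shardIndex, int numShards) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < template.length()) {
            char c = template.charAt(i);
            if (c == 'S' || c == 'N') {
                // Measure the run length; it determines the padding width.
                int start = i;
                while (i < template.length() && template.charAt(i) == c) {
                    i++;
                }
                int width = i - start;
                int value = (c == 'S') ? shardIndex : numShards;
                out.append(String.format(Locale.ROOT, "%0" + width + "d", value));
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    /** Joins prefix, expanded template, and suffix into one shard file name. */
    static String shardFileName(String prefix, String template, String suffix,
                                int shardIndex, int numShards) {
        return prefix + expandTemplate(template, shardIndex, numShards) + suffix;
    }

    public static void main(String[] args) {
        // Hypothetical values: three shards written under the prefix "out".
        for (int shard = 0; shard < 3; shard++) {
            System.out.println(shardFileName("out", "-SSSSS-of-NNNNN", ".avro", shard, 3));
        }
    }
}
```

Under these assumptions, an empty template yields a name made of only the prefix and suffix, which matches the single-file case where no shard numbering is needed.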

public AvroIO.Write.Bound<T> withNumShards(int numShards)
Returns a new PTransform that's like this one but
that uses the provided shard count.
Constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
Does not modify this object.
Parameters: numShards - the number of shards to use, or 0 to let the system decide.
See Also: ShardNameTemplate

public AvroIO.Write.Bound<T> withShardNameTemplate(String shardTemplate)
Returns a new PTransform that's like this one but
that uses the given shard name template.
Does not modify this object.
See Also: ShardNameTemplate

public AvroIO.Write.Bound<T> withoutSharding()
Returns a new PTransform that's like this one but
that forces a single file as output.
This is a shortcut for
.withNumShards(1).withShardNameTemplate("")
Does not modify this object.
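
Each of these methods returns a new transform and leaves the receiver unchanged. A minimal sketch of that copy-on-write style follows, with withoutSharding() written as the shortcut stated above. It is an illustration only and does not use the SDK: the class WriteConfigSketch, its fields, and its default values (empty prefix, 0 shards, template -SSSSS-of-NNNNN) are hypothetical.

```java
public final class WriteConfigSketch {
    private final String filenamePrefix;
    private final int numShards;        // 0 means "let the system decide"
    private final String shardTemplate;

    /** Hypothetical defaults, chosen for this sketch only. */
    public WriteConfigSketch() {
        this("", 0, "-SSSSS-of-NNNNN");
    }

    private WriteConfigSketch(String filenamePrefix, int numShards, String shardTemplate) {
        this.filenamePrefix = filenamePrefix;
        this.numShards = numShards;
        this.shardTemplate = shardTemplate;
    }

    /** Returns a copy with the given prefix; this object is not modified. */
    public WriteConfigSketch to(String prefix) {
        return new WriteConfigSketch(prefix, numShards, shardTemplate);
    }

    /** Returns a copy with the given shard count; this object is not modified. */
    public WriteConfigSketch withNumShards(int shards) {
        if (shards < 0) {
            throw new IllegalArgumentException("numShards must be >= 0");
        }
        return new WriteConfigSketch(filenamePrefix, shards, shardTemplate);
    }

    /** Returns a copy with the given template; this object is not modified. */
    public WriteConfigSketch withShardNameTemplate(String template) {
        return new WriteConfigSketch(filenamePrefix, numShards, template);
    }

    /** Shortcut for withNumShards(1).withShardNameTemplate(""). */
    public WriteConfigSketch withoutSharding() {
        return withNumShards(1).withShardNameTemplate("");
    }

    public String getFilenamePrefix() { return filenamePrefix; }
    public int getNumShards() { return numShards; }
    public String getShardTemplate() { return shardTemplate; }

    public static void main(String[] args) {
        WriteConfigSketch base = new WriteConfigSketch().to("out").withNumShards(4);
        WriteConfigSketch single = base.withoutSharding();
        // The original keeps its settings; only the copy forces a single file.
        System.out.println("base shards: " + base.getNumShards());
        System.out.println("single shards: " + single.getNumShards());
    }
}
```

Because every call returns a fresh object, a partially configured transform can be kept and reused as a starting point for several differently configured writes.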

public <X> AvroIO.Write.Bound<X> withSchema(Class<X> type)
Returns a new PTransform that's like this one but
that writes to Avro file(s) containing records whose type is the
specified Avro-generated class.
Does not modify this object.
Type Parameters: X - the type of the elements of the input PCollection

public AvroIO.Write.Bound<GenericRecord> withSchema(Schema schema)
Returns a new PTransform that's like this one but
that writes to Avro file(s) containing records of the specified
schema.
Does not modify this object.

public AvroIO.Write.Bound<GenericRecord> withSchema(String schema)
Returns a new PTransform that's like this one but
that writes to Avro file(s) containing records of the specified
schema in a JSON-encoded string form.
Does not modify this object.

public AvroIO.Write.Bound<T> withoutValidation()
Returns a new PTransform that's like this one but
that has GCS output path validation on pipeline creation disabled.
Does not modify this object.
This can be useful in the case where the GCS output location does not exist at the pipeline creation time, but is expected to be available at execution time.

public PDone apply(PCollection<T> input)
Description copied from class: PTransform
Applies this PTransform on the given InputT, and returns its
Output.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
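Overrides: apply in class PTransform<PCollection<T>,PDone>

public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: PTransform
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect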
display data via DisplayData.from(HasDisplayData). Implementations may call
super.populateDisplayData(builder) in order to register display data in the current
namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use
the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by: populateDisplayData in interface HasDisplayData
Overrides: populateDisplayData in class PTransform<PCollection<T>,PDone>
Parameters: builder - The builder to populate with display data.
See Also: HasDisplayData

public String getShardNameTemplate()
Returns the current shard name template string.

protected Coder<Void> getDefaultOutputCoder()
Description copied from class: PTransform
Returns the default Coder to use for the output of this
single-output PTransform.
By default, always throws
Overrides: getDefaultOutputCoder in class PTransform<PCollection<T>,PDone>

public String getFilenamePrefix()
public String getShardTemplate()
public int getNumShards()
public String getFilenameSuffix()
public Schema getSchema()
public boolean needsValidation()