public static class TextIO.Write extends Object
A PTransform that writes a PCollection to a text file (or
multiple text files matching a sharding pattern), with each
element of the input collection encoded into its own line.
| Modifier and Type | Class and Description |
|---|---|
| static class | TextIO.Write.Bound<T>: A PTransform that writes a bounded PCollection to a text file (or multiple text files matching a sharding pattern), with each PCollection element being encoded into its own line. |
| Constructor and Description |
|---|
| Write() |
| Modifier and Type | Method and Description |
|---|---|
| static TextIO.Write.Bound<String> | to(String prefix): Returns a transform for writing to text files that writes to the file(s) with the given prefix. |
| static <T> TextIO.Write.Bound<T> | withCoder(Coder<T> coder): Returns a transform for writing to text files that uses the given Coder to encode each of the elements of the input PCollection into an output text line. |
| static TextIO.Write.Bound<String> | withNumShards(int numShards): Returns a transform for writing to text files that uses the provided shard count. |
| static TextIO.Write.Bound<String> | withoutSharding(): Returns a transform for writing to text files that forces a single file as output. |
| static TextIO.Write.Bound<String> | withoutValidation(): Returns a transform for writing to text files that has GCS path validation on pipeline creation disabled. |
| static TextIO.Write.Bound<String> | withShardNameTemplate(String shardTemplate): Returns a transform for writing to text files that uses the given shard name template. |
| static TextIO.Write.Bound<String> | withSuffix(String nameExtension): Returns a transform for writing to text files that appends the specified suffix to the created files. |
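public static TextIO.Write.Bound<String> to(String prefix)
Returns a transform for writing to text files that writes to the file(s)
with the given prefix. This can be a local filename (if running locally),
or a Google Cloud Storage filename of the form
"gs://<bucket>/<filepath>"
(if running locally or via the Google Cloud Dataflow service).
The files written will begin with this prefix, followed by
a shard identifier (see TextIO.Write.Bound.withNumShards(int)), and end
in a common extension, if given by TextIO.Write.Bound.withSuffix(String).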
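public static TextIO.Write.Bound<String> withSuffix(String nameExtension)
Returns a transform for writing to text files that appends the specified suffix
to the created files.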
public static TextIO.Write.Bound<String> withNumShards(int numShards)
Returns a transform for writing to text files that uses the provided shard count.
Constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
numShards - the number of shards to use, or 0 to let the system decide.
public static TextIO.Write.Bound<String> withShardNameTemplate(String shardTemplate)
Returns a transform for writing to text files that uses the given shard name
template.
See ShardNameTemplate for a description of shard templates.
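The file naming described above (prefix, then shard identifier, then optional suffix) can be sketched in plain Java. This is an illustration only, not the SDK's code: the class ShardNameSketch, its methods, and the path "output/part" are hypothetical, and the template convention it assumes (each run of S replaced by the zero-padded shard index, each run of N by the zero-padded shard count, with "-SSSSS-of-NNNNN" as a sample template) should be checked against ShardNameTemplate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of output file naming; not the SDK's implementation. */
final class ShardNameSketch {

    /**
     * Expands a shard template under the ASSUMED convention that a run of 'S'
     * is the zero-padded shard index and a run of 'N' is the zero-padded
     * shard count. All other characters are copied through unchanged.
     */
    static String expand(String template, int shard, int numShards) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < template.length()) {
            char c = template.charAt(i);
            if (c == 'S' || c == 'N') {
                int start = i;
                // Consume the whole run of identical placeholder characters.
                while (i < template.length() && template.charAt(i) == c) {
                    i++;
                }
                int width = i - start;
                int value = (c == 'S') ? shard : numShards;
                out.append(String.format(Locale.ROOT, "%0" + width + "d", value));
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    /** Builds one name per shard: prefix + expanded template + suffix. */
    static List<String> fileNames(String prefix, String template, String suffix, int numShards) {
        List<String> names = new ArrayList<>();
        for (int shard = 0; shard < numShards; shard++) {
            names.add(prefix + expand(template, shard, numShards) + suffix);
        }
        return names;
    }

    public static void main(String[] args) {
        // "output/part" is a made-up prefix used only for this demonstration.
        for (String name : fileNames("output/part", "-SSSSS-of-NNNNN", ".txt", 3)) {
            System.out.println(name);
        }
    }
}
```

Under these assumptions, running the sketch prints output/part-00000-of-00003.txt, output/part-00001-of-00003.txt, and output/part-00002-of-00003.txt, which shows how the prefix, shard count, shard name template, and suffix each contribute to the final file name.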
public static TextIO.Write.Bound<String> withoutSharding()
Returns a transform for writing to text files that forces a single file as
output.
public static <T> TextIO.Write.Bound<T> withCoder(Coder<T> coder)
Returns a transform for writing to text files that uses the given
Coder to encode each of the elements of the input
PCollection into an output text line.
By default, uses StringUtf8Coder, which writes input
Java strings directly as output lines.
T - the type of the elements of the input PCollection
public static TextIO.Write.Bound<String> withoutValidation()
Returns a transform for writing to text files that has GCS path validation on
pipeline creation disabled.
This can be useful when the GCS output location does not exist at pipeline creation time but is expected to be available at execution time.