Type Parameters: T - the Protocol Buffers Message handled by this Coder.
public class ProtoCoder<T extends com.google.protobuf.Message> extends AtomicCoder<T>
A Coder using the Google Protocol Buffers binary format. ProtoCoder supports both
Protocol Buffers syntax versions 2 and 3.
To learn more about Protocol Buffers, visit: https://developers.google.com/protocol-buffers
ProtoCoder is registered in the global CoderRegistry as the default
Coder for any Message object. Custom message extensions are also supported, but
these extensions must be registered for a particular ProtoCoder instance and that
instance must be registered on the PCollection that needs the extensions:
import MyProtoFile;
import MyProtoFile.MyMessage;
Coder<MyMessage> coder = ProtoCoder.of(MyMessage.class).withExtensionsFrom(MyProtoFile.class);
PCollection<MyMessage> records = input.apply(...).setCoder(coder);
ProtoCoder supports both versions 2 and 3 of the Protocol Buffers syntax. However,
the Java runtime version of the com.google.protobuf library must exactly match the
version of protoc that was used to produce the JAR files containing the compiled
.proto messages.
For more information, see the Protocol Buffers documentation.
ProtoCoder and Determinism
In general, Protocol Buffers messages can be encoded deterministically within a single pipeline as long as:
- The encoded messages (and any transitively linked messages) do not use map fields.
- Every Java VM that encodes or decodes the messages uses the same runtime version of the Protocol Buffers library and the same compiled .proto file JAR.
ProtoCoder and Encoding Stability
When changing Protocol Buffers messages, follow the rules in the Protocol Buffers language guides for proto2 and proto3 syntaxes, depending on your message type. Following these guidelines will ensure that the old encoded data can be read by new versions of the code.
Generally, any change to the message type, registered extensions, runtime library, or compiled proto JARs may change the encoding. Thus even if both the original and updated messages can be encoded deterministically within a single job, these deterministic encodings may not be the same across jobs.
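The map-field condition above can be illustrated without the protobuf library. The following is a minimal sketch using only the JDK, not ProtoCoder's actual encoding: the class and method names are hypothetical, and the byte layout is invented for illustration. It shows how two maps that compare as equal can still serialize to different bytes when entries are written in iteration order, and how sorting the keys first yields a canonical encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class MapEncodingOrderSketch {

    /** Hypothetical naive encoding: writes entries in the map's own iteration order. */
    public static byte[] encodeInIterationOrder(Map<String, Integer> map) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(map.size());
            for (Map.Entry<String, Integer> entry : map.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeInt(entry.getValue());
            }
        } catch (IOException e) {
            // Writing to an in-memory stream is not expected to fail.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Canonical variant: copies into a TreeMap so keys are written in sorted order. */
    public static byte[] encodeInSortedOrder(Map<String, Integer> map) {
        return encodeInIterationOrder(new TreeMap<>(map));
    }

    public static void main(String[] args) {
        // Same entries, inserted in a different order.
        Map<String, Integer> first = new LinkedHashMap<>();
        first.put("a", 1);
        first.put("b", 2);

        Map<String, Integer> second = new LinkedHashMap<>();
        second.put("b", 2);
        second.put("a", 1);

        System.out.println("maps equal: " + first.equals(second));
        System.out.println("iteration-order bytes equal: "
            + Arrays.equals(encodeInIterationOrder(first), encodeInIterationOrder(second)));
        System.out.println("sorted-order bytes equal: "
            + Arrays.equals(encodeInSortedOrder(first), encodeInSortedOrder(second)));
    }
}
```

The first and third lines print true and the second prints false. An encoding that depends on entry order therefore cannot guarantee that equal values have equal bytes.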
Nested classes/interfaces inherited from interface Coder: Coder.Context, Coder.NonDeterministicException

| Modifier and Type | Method and Description |
|---|---|
| org.apache.beam.sdk.util.CloudObject | asCloudObject() Returns the CloudObject that represents this Coder. |
| static CoderProvider | coderProvider() |
| T | decode(InputStream inStream, Coder.Context context) Decodes a value of type T from the given input stream in the given context. |
| void | encode(T value, OutputStream outStream, Coder.Context context) Encodes the given value of type T onto the given output stream in the given context. |
| boolean | equals(Object other) |
| String | getEncodingId() The encoding identifier is designed to support evolution as per the design of Protocol Buffers. |
| com.google.protobuf.ExtensionRegistry | getExtensionRegistry() Returns the ExtensionRegistry listing all known Protocol Buffers extension messages to T registered with this ProtoCoder. |
| Class<T> | getMessageType() Returns the Protocol Buffers Message type this ProtoCoder supports. |
| int | hashCode() |
| static <T extends com.google.protobuf.Message> ProtoCoder<T> | of(Class<T> protoMessageClass) Returns a ProtoCoder for the given Protocol Buffers Message. |
| static <T extends com.google.protobuf.Message> ProtoCoder<T> | of(String protoMessageClassName, List<String> extensionHostClassNames) Deprecated. For JSON deserialization only. |
| static <T extends com.google.protobuf.Message> ProtoCoder<T> | of(TypeDescriptor<T> protoMessageType) |
| void | verifyDeterministic() Throws Coder.NonDeterministicException if the coding is not deterministic. |
| ProtoCoder<T> | withExtensionsFrom(Class<?>... moreExtensionHosts) |
| ProtoCoder<T> | withExtensionsFrom(Iterable<Class<?>> moreExtensionHosts) Returns a ProtoCoder like this one, but with the extensions from the given classes registered. |
Methods inherited from class AtomicCoder: getCoderArguments, getInstanceComponents
Methods inherited from class StandardCoder: consistentWithEquals, getAllowedEncodings, getComponents, getEncodedElementByteSize, isRegisterByteSizeObserverCheap, registerByteSizeObserver, structuralValue, toString, verifyDeterministic, verifyDeterministic
public static CoderProvider coderProvider()
public static <T extends com.google.protobuf.Message> ProtoCoder<T> of(Class<T> protoMessageClass)
Returns a ProtoCoder for the given Protocol Buffers Message.
public static <T extends com.google.protobuf.Message> ProtoCoder<T> of(TypeDescriptor<T> protoMessageType)
public ProtoCoder<T> withExtensionsFrom(Iterable<Class<?>> moreExtensionHosts)
Returns a ProtoCoder like this one, but with the extensions from the given classes
registered.
Each of the extension host classes must be a class automatically generated by the
Protocol Buffers compiler, protoc, that contains messages.
Does not modify this object.
public ProtoCoder<T> withExtensionsFrom(Class<?>... moreExtensionHosts)
See withExtensionsFrom(Iterable).
Does not modify this object.
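The "does not modify this object" contract of both withExtensionsFrom overloads can be sketched with a small immutable class. The SketchCoder type below is a hypothetical stand-in written only against the JDK. It is not the Beam ProtoCoder class and makes no claim about how Beam implements it. String.class and Integer.class are placeholders for a message class and a protoc-generated extension host class.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

public class ImmutableExtensionsSketch {

    /** Hypothetical stand-in for a coder; NOT the Beam ProtoCoder class. */
    public static final class SketchCoder {
        private final Class<?> messageType;
        private final Set<Class<?>> extensionHosts;

        private SketchCoder(Class<?> messageType, Set<Class<?>> extensionHosts) {
            this.messageType = messageType;
            this.extensionHosts = Collections.unmodifiableSet(extensionHosts);
        }

        /** Creates a coder with no registered extension hosts. */
        public static SketchCoder of(Class<?> messageType) {
            return new SketchCoder(messageType, new LinkedHashSet<>());
        }

        /** Varargs convenience overload; delegates to the Iterable overload. */
        public SketchCoder withExtensionsFrom(Class<?>... moreExtensionHosts) {
            return withExtensionsFrom(Arrays.asList(moreExtensionHosts));
        }

        /** Returns a NEW coder with the extra hosts; this instance is left unchanged. */
        public SketchCoder withExtensionsFrom(Iterable<Class<?>> moreExtensionHosts) {
            Set<Class<?>> merged = new LinkedHashSet<>(extensionHosts);
            for (Class<?> host : moreExtensionHosts) {
                merged.add(host);
            }
            return new SketchCoder(messageType, merged);
        }

        public Class<?> getMessageType() {
            return messageType;
        }

        public Set<Class<?>> getExtensionHosts() {
            return extensionHosts;
        }
    }

    public static void main(String[] args) {
        SketchCoder base = SketchCoder.of(String.class);
        SketchCoder extended = base.withExtensionsFrom(Integer.class);

        System.out.println("base hosts: " + base.getExtensionHosts().size());
        System.out.println("extended hosts: " + extended.getExtensionHosts().size());
        System.out.println("same instance: " + (base == extended));
    }
}
```

Because each call returns a fresh instance, a coder that is already in use elsewhere never observes extensions registered later.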
public void encode(T value, OutputStream outStream, Coder.Context context) throws IOException
Description copied from interface: Coder
Encodes the given value of type T onto the given output stream in the given context.
Throws:
IOException - if writing to the OutputStream fails for some reason
CoderException - if the value could not be encoded for some reason
public T decode(InputStream inStream, Coder.Context context) throws IOException
Description copied from interface: Coder
Decodes a value of type T from the given input stream in the given context. Returns the decoded value.
Throws:
IOException - if reading from the InputStream fails for some reason
CoderException - if the value could not be decoded for some reason
public boolean equals(Object other)
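Description copied from class: StandardCoder
Overrides: equals in class StandardCoder<T extends com.google.protobuf.Message>
Returns: true if the two StandardCoder instances have the same class and equal components.
public int hashCode()
Overrides: hashCode in class StandardCoder<T extends com.google.protobuf.Message>
public String getEncodingId()
The encoding identifier is designed to support evolution as per the design of Protocol Buffers.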
In particular, the encoding identifier is guaranteed to be the same for ProtoCoder
instances of the same principal message class, with the same registered extension host classes,
and otherwise distinct. Note that the encoding ID does not encode any version of the message
or extensions, nor does it include the message schema.
When modifying a message class, here are the broadest guidelines; see the above link for greater detail.
- Never remove a required field.
- Only add optional or repeated fields, with sensible defaults.
Code consuming this message class should be prepared to support all versions of the class until it is certain that no remaining serialized instances exist.
If backwards incompatible changes must be made, the best recourse is to change the name of your Protocol Buffers message class.
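To see why adding fields is safe for existing readers while renumbering them is not, it helps to look at how the Protocol Buffers wire format tags each value with its field number. The sketch below hand-rolls just the varint part of that format with the JDK only; it does not use the protobuf library, handles only wire type 0, and every class and method name in it is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.TreeMap;

public class WireCompatibilitySketch {
    private static final int WIRETYPE_VARINT = 0;

    /** Writes a base-128 varint: 7 payload bits per byte, high bit set on all but the last. */
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    static long readVarint(ByteBuffer in) {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = in.get();
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    /** Encodes each (field number, value) pair as a key varint followed by a value varint. */
    public static byte[] encode(Map<Integer, Long> fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<Integer, Long> field : fields.entrySet()) {
            writeVarint(out, ((long) field.getKey() << 3) | WIRETYPE_VARINT);
            writeVarint(out, field.getValue());
        }
        return out.toByteArray();
    }

    /** Reads one field by number, skipping fields it does not know about. */
    public static long readField(byte[] encoded, int wantedFieldNumber, long defaultValue) {
        ByteBuffer in = ByteBuffer.wrap(encoded);
        long result = defaultValue;
        while (in.hasRemaining()) {
            long key = readVarint(in);
            int fieldNumber = (int) (key >>> 3);
            int wireType = (int) (key & 0x7);
            if (wireType != WIRETYPE_VARINT) {
                throw new IllegalArgumentException("this sketch handles varint fields only");
            }
            long value = readVarint(in);
            if (fieldNumber == wantedFieldNumber) {
                result = value;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Long> v1 = new TreeMap<>();
        v1.put(1, 150L);                 // original message: only field 1
        Map<Integer, Long> v2 = new TreeMap<>(v1);
        v2.put(2, 7L);                   // updated message: adds field 2

        // An old reader that only knows field 1 still finds it in new data.
        System.out.println(readField(encode(v2), 1, 0L));
        // A new reader falls back to its default when old data lacks field 2.
        System.out.println(readField(encode(v1), 2, -1L));

        Map<Integer, Long> renumbered = new TreeMap<>();
        renumbered.put(3, 150L);         // same value moved to a different field number
        // The old reader no longer finds the value under field 1.
        System.out.println(readField(encode(renumbered), 1, 0L));
    }
}
```

The three lines printed are 150, -1, and 0. Under the assumptions of this sketch, a field number that changes makes existing readers silently lose the value, which is the motivation for renaming the message class when an incompatible change is unavoidable.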
Specified by: getEncodingId in interface Coder<T extends com.google.protobuf.Message>
Overrides: getEncodingId in class StandardCoder<T extends com.google.protobuf.Message>
public void verifyDeterministic() throws Coder.NonDeterministicException
Description copied from class: DeterministicStandardCoder
Throws Coder.NonDeterministicException if the coding is not deterministic.
In order for a Coder to be considered deterministic,
the following must be true:
- two values that compare as equal (via Object.equals()
or Comparable.compareTo(), if supported) have the same
encoding.
- the Coder always produces a canonical encoding, which is the
same for an instance of an object even if produced on different
computers at different times.
Specified by: verifyDeterministic in interface Coder<T extends com.google.protobuf.Message>
Overrides: verifyDeterministic in class DeterministicStandardCoder<T extends com.google.protobuf.Message>
Throws: Coder.NonDeterministicException - if this coder is not deterministic.
public Class<T> getMessageType()
Returns the Protocol Buffers Message type this ProtoCoder supports.
public com.google.protobuf.ExtensionRegistry getExtensionRegistry()
Returns the ExtensionRegistry listing all known Protocol Buffers extension messages to T registered with this ProtoCoder.
@Deprecated public static <T extends com.google.protobuf.Message> ProtoCoder<T> of(String protoMessageClassName, @Nullable List<String> extensionHostClassNames)
Deprecated. For JSON deserialization only.
public org.apache.beam.sdk.util.CloudObject asCloudObject()
Description copied from interface: Coder
Returns the CloudObject that represents this Coder.
Specified by: asCloudObject in interface Coder<T extends com.google.protobuf.Message>
Overrides: asCloudObject in class StandardCoder<T extends com.google.protobuf.Message>