InputT - the type of the elements in the input PCollection

public abstract static class ApproximateDistinct.GloballyDistinct<InputT>
extends org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<InputT>,org.apache.beam.sdk.values.PCollection<java.lang.Long>>

Implementation of ApproximateDistinct.globally().

| Constructor and Description |
|---|
| GloballyDistinct() |

| Modifier and Type | Method and Description |
|---|---|
| org.apache.beam.sdk.values.PCollection<java.lang.Long> | expand(org.apache.beam.sdk.values.PCollection<InputT> input) |
| ApproximateDistinct.GloballyDistinct<InputT> | withPrecision(int p) Sets the precision p. |
| ApproximateDistinct.GloballyDistinct<InputT> | withSparsePrecision(int sp) Sets the sparse representation's precision sp. |

public ApproximateDistinct.GloballyDistinct<InputT> withPrecision(int p)
Sets the precision p.
Keep in mind that p cannot be lower than 4, because the estimation would be too
inaccurate.
See ApproximateDistinct.precisionForRelativeError(double) and ApproximateDistinct.relativeErrorForPrecision(int) for more information about the
relationship between precision and relative error.
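
To make the trade-off between precision and relative error concrete, here is a minimal sketch. It assumes the textbook HyperLogLog estimate of roughly 1.04 / sqrt(2^p); the library's own helper methods may use a different constant. The class and method names are hypothetical and are not part of ApproximateDistinct.

```java
class PrecisionErrorSketch {
    // Lowest precision allowed, as stated in the note above.
    static final int MIN_PRECISION = 4;
    // Arbitrary upper bound used only to keep this sketch's search finite.
    static final int SEARCH_LIMIT = 30;

    // Assumption: textbook HyperLogLog standard error for 2^p registers.
    static double estimatedRelativeError(int p) {
        if (p < MIN_PRECISION) {
            throw new IllegalArgumentException("p must be at least 4, got " + p);
        }
        return 1.04 / Math.sqrt(Math.pow(2, p));
    }

    // Smallest precision whose estimated error does not exceed the target.
    static int smallestPrecisionFor(double targetError) {
        for (int p = MIN_PRECISION; p <= SEARCH_LIMIT; p++) {
            if (estimatedRelativeError(p) <= targetError) {
                return p;
            }
        }
        throw new IllegalArgumentException("target error too small for this sketch: " + targetError);
    }

    public static void main(String[] args) {
        // Each extra bit of precision doubles the register count
        // and shrinks the estimated error by a factor of sqrt(2).
        for (int p = 4; p <= 16; p += 4) {
            System.out.println("p=" + p + " -> estimated relative error " + estimatedRelativeError(p));
        }
        System.out.println("precision for 2% error: " + smallestPrecisionFor(0.02));
    }
}
```

Under this assumption, halving the relative error requires two more bits of precision, which quadruples the number of registers.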
p - the precision value for the normal representation

public ApproximateDistinct.GloballyDistinct<InputT> withSparsePrecision(int sp)
Sets the sparse representation's precision sp.
Values above 32 are not yet supported by the AddThis version of HyperLogLog+.
For more information about the sparse representation, read Google's paper on HyperLogLog+.
sp - the precision of HyperLogLog+'s sparse representation

public org.apache.beam.sdk.values.PCollection<java.lang.Long> expand(org.apache.beam.sdk.values.PCollection<InputT> input)
expand in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<InputT>,org.apache.beam.sdk.values.PCollection<java.lang.Long>>
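
The limits documented for withPrecision and withSparsePrecision can be checked before a transform is configured. The sketch below encodes only the two limits stated on this page (p no lower than 4, sp no higher than 32); the library may enforce further constraints that are not covered here. The class and method names are hypothetical.

```java
class PrecisionCheckSketch {
    // From the withPrecision note: p cannot be lower than 4.
    static final int MIN_PRECISION = 4;
    // From the withSparsePrecision note: values above 32 are not yet supported.
    static final int MAX_SPARSE_PRECISION = 32;

    // Returns true when both values respect the two documented limits.
    static boolean isWithinDocumentedLimits(int p, int sp) {
        return p >= MIN_PRECISION && sp <= MAX_SPARSE_PRECISION;
    }

    // Fails fast with a message naming the limit that was violated.
    static void checkPrecisions(int p, int sp) {
        if (p < MIN_PRECISION) {
            throw new IllegalArgumentException("p must be at least " + MIN_PRECISION + ", got " + p);
        }
        if (sp > MAX_SPARSE_PRECISION) {
            throw new IllegalArgumentException("sp must be at most " + MAX_SPARSE_PRECISION + ", got " + sp);
        }
    }

    public static void main(String[] args) {
        checkPrecisions(12, 25); // within both limits, no exception
        System.out.println(isWithinDocumentedLimits(3, 25));  // p is below the minimum
        System.out.println(isWithinDocumentedLimits(12, 33)); // sp is above the maximum
    }
}
```

Validating the values up front keeps a bad configuration from surfacing only when the pipeline is built or run.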