Enum StatisticsType
- java.lang.Object
- java.lang.Enum<StatisticsType>
- org.apache.iceberg.flink.sink.shuffle.StatisticsType
- All Implemented Interfaces:
java.io.Serializable, java.lang.Comparable<StatisticsType>
public enum StatisticsType extends java.lang.Enum<StatisticsType>
Range distribution requires gathering statistics on the sort keys to determine proper range boundaries, so that rows can be distributed/clustered before they reach the writer operators. This enum selects how those statistics are collected.
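To make the idea of range boundaries concrete, here is a minimal sketch of how a sorted array of bounds could route a key to a range. The class `RangeRouting`, the helper `rangeIndex`, and the use of plain strings as keys are assumptions made for illustration; this is not Iceberg's partitioner.

```java
import java.util.Arrays;

public class RangeRouting {
    // Hypothetical helper: returns the index of the range a key falls into.
    // sortedBounds holds the inclusive upper bound of every range except the last.
    static int rangeIndex(String[] sortedBounds, String key) {
        int pos = Arrays.binarySearch(sortedBounds, key);
        // When the key is absent, binarySearch returns -(insertionPoint) - 1.
        return pos >= 0 ? pos : -pos - 1;
    }

    public static void main(String[] args) {
        // Two bounds define three ranges: up to "g", up to "p", and everything above "p".
        String[] bounds = {"g", "p"};
        for (String key : new String[] {"apple", "kiwi", "zebra"}) {
            System.out.println(key + " -> range " + rangeIndex(bounds, key));
        }
    }
}
```

With well-chosen bounds, each range receives a similar share of rows, which is why the statistics on the sort keys matter.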
-
-
Method Summary
All Methods / Static Methods / Concrete Methods
Modifier and Type, Method, Description:
- static StatisticsType valueOf(java.lang.String name): Returns the enum constant of this type with the specified name.
- static StatisticsType[] values(): Returns an array containing the constants of this enum type, in the order they are declared.
-
-
-
Enum Constant Detail
-
Map
public static final StatisticsType Map
Tracks the data statistics as a Map<SortKey, Long> of key frequencies. It works better for low-cardinality scenarios (such as country or event_type) where the cardinalities are in the hundreds or thousands.
- Pro: accurate measurement of the statistics/weight of every key.
- Con: memory footprint can be large if the key cardinality is high.
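The frequency-map approach can be sketched in a few lines. This is an illustration only: `MapStatisticsSketch` and `countFrequencies` are hypothetical names, and a `String` stands in for the real `SortKey` type.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class MapStatisticsSketch {
    // Counts how often each key occurs; one map entry is kept per distinct key,
    // which is why memory grows with key cardinality.
    static Map<String, Long> countFrequencies(Iterable<String> keys) {
        Map<String, Long> freq = new HashMap<>();
        for (String key : keys) {
            freq.merge(key, 1L, Long::sum);
        }
        return freq;
    }

    public static void main(String[] args) {
        Map<String, Long> freq = countFrequencies(Arrays.asList("US", "DE", "US", "FR", "US"));
        System.out.println(freq);
    }
}
```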
-
Sketch
public static final StatisticsType Sketch
Samples the sort keys via reservoir sampling, then splits the range partitions using range bounds derived from the sampled values. It works better for high-cardinality scenarios (such as device_id, user_id, or uuid) where the cardinalities can be in the millions or billions.
- Pro: relatively low memory footprint for high-cardinality sort keys.
- Con: the result is an approximation, so the range bounds may be less accurate.
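As a rough illustration of the two steps described above, the sketch below uses the classic reservoir sampling scheme (Algorithm R) and then picks bounds at evenly spaced positions of the sorted sample. The class and method names are hypothetical, `long` values stand in for sort keys, and the bound-selection rule is an assumption rather than Iceberg's actual logic.

```java
import java.util.Arrays;
import java.util.Random;

public class ReservoirSketch {
    // Algorithm R: keeps a uniform random sample of at most `size` items,
    // using memory proportional to `size` rather than to the stream length.
    static long[] sample(long[] stream, int size, Random rnd) {
        long[] reservoir = new long[Math.min(size, stream.length)];
        for (int i = 0; i < stream.length; i++) {
            if (i < reservoir.length) {
                reservoir[i] = stream[i];
            } else {
                int j = rnd.nextInt(i + 1); // uniform in [0, i]
                if (j < reservoir.length) {
                    reservoir[j] = stream[i];
                }
            }
        }
        return reservoir;
    }

    // Picks numRanges - 1 bounds at evenly spaced positions of the sorted sample.
    static long[] rangeBounds(long[] sample, int numRanges) {
        long[] sorted = sample.clone();
        Arrays.sort(sorted);
        long[] bounds = new long[numRanges - 1];
        for (int i = 1; i < numRanges; i++) {
            int idx = Math.max(0, i * sorted.length / numRanges - 1);
            bounds[i - 1] = sorted[idx];
        }
        return bounds;
    }

    public static void main(String[] args) {
        long[] stream = new long[10_000];
        for (int i = 0; i < stream.length; i++) {
            stream[i] = i;
        }
        long[] sampled = sample(stream, 100, new Random());
        System.out.println(Arrays.toString(rangeBounds(sampled, 4)));
    }
}
```

Because the bounds come from a sample, they vary from run to run and only approximate the true quantiles of the stream.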
-
Auto
public static final StatisticsType Auto
Initially uses Map for statistics tracking. If key cardinality turns out to be high, automatically switches to Sketch sampling.
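The switching behavior might look roughly like the sketch below. Everything here is assumed for illustration: the class `AutoSwitchSketch`, the `SWITCH_THRESHOLD` value, and the choice to simply discard the map on switch are not taken from Iceberg.

```java
import java.util.HashMap;
import java.util.Map;

public class AutoSwitchSketch {
    // Hypothetical threshold on distinct keys; not Iceberg's actual value.
    static final int SWITCH_THRESHOLD = 3;

    private Map<String, Long> freq = new HashMap<>();
    private boolean usingSketch = false;
    private long keysSentToSketch = 0;

    void add(String key) {
        if (usingSketch) {
            // A fuller implementation would feed a reservoir sampler here.
            keysSentToSketch++;
            return;
        }
        freq.merge(key, 1L, Long::sum);
        if (freq.size() > SWITCH_THRESHOLD) {
            // Cardinality is too high for exact counting: switch modes for good.
            usingSketch = true;
            freq = null; // release the map's memory
        }
    }

    boolean usingSketch() {
        return usingSketch;
    }

    public static void main(String[] args) {
        AutoSwitchSketch stats = new AutoSwitchSketch();
        for (String key : new String[] {"a", "b", "a", "c", "d", "e"}) {
            stats.add(key);
            System.out.println(key + " -> usingSketch=" + stats.usingSketch());
        }
    }
}
```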
-
-
Method Detail
-
values
public static StatisticsType[] values()
Returns an array containing the constants of this enum type, in the order they are declared. This method may be used to iterate over the constants as follows:
for (StatisticsType c : StatisticsType.values())
    System.out.println(c);
- Returns:
- an array containing the constants of this enum type, in the order they are declared
-
valueOf
public static StatisticsType valueOf(java.lang.String name)
Returns the enum constant of this type with the specified name. The string must match exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace characters are not permitted.)
- Parameters:
name - the name of the enum constant to be returned.
- Returns:
- the enum constant with the specified name
- Throws:
java.lang.IllegalArgumentException - if this enum type has no constant with the specified name
java.lang.NullPointerException - if the argument is null
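Because valueOf is case-sensitive and rejects surrounding whitespace, callers reading the name from user-supplied text may want a small wrapper. The sketch below uses a local stand-in enum that mirrors the constants documented on this page, so it compiles without Iceberg on the classpath; `parseOrDefault` is a hypothetical helper, not part of the library.

```java
public class ParseStatisticsType {
    // Local stand-in for org.apache.iceberg.flink.sink.shuffle.StatisticsType.
    enum StatisticsType { Map, Sketch, Auto }

    // Hypothetical helper: trims the input and returns a fallback for null or unknown names.
    static StatisticsType parseOrDefault(String raw, StatisticsType fallback) {
        if (raw == null) {
            return fallback; // valueOf(null) would throw NullPointerException
        }
        try {
            return StatisticsType.valueOf(raw.trim());
        } catch (IllegalArgumentException e) {
            return fallback; // no constant with that exact name
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("Sketch", StatisticsType.Auto));
        System.out.println(parseOrDefault(" Map ", StatisticsType.Auto));
        System.out.println(parseOrDefault("map", StatisticsType.Auto)); // wrong case, falls back
    }
}
```

Silently falling back hides typos in configuration, so logging or rethrowing may be preferable in some settings.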