Enum StatisticsType

  • All Implemented Interfaces:
    java.io.Serializable, java.lang.Comparable<StatisticsType>

    public enum StatisticsType
    extends java.lang.Enum<StatisticsType>
    Range distribution requires gathering statistics on the sort keys to determine suitable range boundaries, which are then used to distribute (cluster) rows before they reach the writer operators.
    • Enum Constant Summary

      Enum Constants 
      Enum Constant Description
      Auto
      Initially use Map for statistics tracking.
      Map
      Tracks the data statistics as Map<SortKey, Long> frequency.
      Sketch
      Sample the sort keys via reservoir sampling.
    • Method Summary

      Modifier and Type Method Description
      static StatisticsType valueOf(java.lang.String name)
      Returns the enum constant of this type with the specified name.
      static StatisticsType[] values()
      Returns an array containing the constants of this enum type, in the order they are declared.
      • Methods inherited from class java.lang.Enum

        clone, compareTo, equals, finalize, getDeclaringClass, hashCode, name, ordinal, toString, valueOf
      • Methods inherited from class java.lang.Object

        getClass, notify, notifyAll, wait, wait, wait
    • Enum Constant Detail

      • Map

        public static final StatisticsType Map
        Tracks the data statistics as Map<SortKey, Long> frequency. It works better for low-cardinality scenarios (like country, event_type, etc.) where the cardinalities are in the hundreds or thousands.
        • Pro: accurate measurement on the statistics/weight of every key.
        • Con: memory footprint can be large if the key cardinality is high.
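To illustrate the frequency-map idea described above, here is a minimal sketch. It is not the library's implementation: `String` stands in for `SortKey`, and the class and method names (`MapStatisticsSketch`, `countKeys`) are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapStatisticsSketch {
    // Counts how often each sort key occurs.
    // Assumption: String is used here as a stand-in for SortKey.
    static Map<String, Long> countKeys(List<String> sortKeys) {
        Map<String, Long> frequency = new HashMap<>();
        for (String key : sortKeys) {
            // One map entry per distinct key, so memory grows with cardinality.
            frequency.merge(key, 1L, Long::sum);
        }
        return frequency;
    }

    public static void main(String[] args) {
        Map<String, Long> stats = countKeys(List.of("US", "DE", "US", "FR", "US"));
        // Exact weight of the key "US" in the input.
        System.out.println(stats.get("US"));
    }
}
```

Because the map holds one entry per distinct key, the exact counts come at the cost of memory that scales with key cardinality, which matches the trade-off listed above.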
      • Sketch

        public static final StatisticsType Sketch
        Samples the sort keys via reservoir sampling, then splits the range partitions using range bounds derived from the sampled values. It works better for high-cardinality scenarios (like device_id, user_id, uuid, etc.) where the cardinalities can be in the millions or billions.
        • Pro: relatively low memory footprint for high-cardinality sort keys.
        • Con: the sample is only an approximation, so the resulting range bounds may be less accurate.
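The sampling approach above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's code: it uses the classic "Algorithm R" form of reservoir sampling, `String` stands in for `SortKey`, and the names `ReservoirSketch`, `sample`, and `rangeBounds` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class ReservoirSketch {
    // Algorithm R: keeps a uniform random sample of at most `capacity` keys,
    // regardless of how many keys are seen in total.
    static List<String> sample(Iterable<String> keys, int capacity, Random random) {
        List<String> reservoir = new ArrayList<>(capacity);
        long seen = 0;
        for (String key : keys) {
            seen++;
            if (reservoir.size() < capacity) {
                reservoir.add(key);
            } else {
                // Pick a slot in [0, seen); the key is kept only if the slot
                // falls inside the reservoir, i.e. with probability capacity / seen.
                long slot = (long) (random.nextDouble() * seen);
                if (slot < capacity) {
                    reservoir.set((int) slot, key);
                }
            }
        }
        return reservoir;
    }

    // Picks up to numPartitions - 1 evenly spaced bounds from the sorted sample.
    static List<String> rangeBounds(List<String> sampled, int numPartitions) {
        List<String> sorted = new ArrayList<>(sampled);
        Collections.sort(sorted);
        List<String> bounds = new ArrayList<>();
        if (sorted.isEmpty()) {
            return bounds;
        }
        for (int i = 1; i < numPartitions; i++) {
            int index = (int) ((long) i * sorted.size() / numPartitions);
            bounds.add(sorted.get(Math.min(index, sorted.size() - 1)));
        }
        return bounds;
    }
}
```

Memory use is bounded by `capacity` rather than by the number of distinct keys, which is why this style suits high-cardinality sort keys; the bounds are only as good as the sample.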
      • Auto

        public static final StatisticsType Auto
        Initially use Map for statistics tracking. If key cardinality turns out to be high, automatically switch to sketch sampling.
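The switch-over behaviour can be pictured with a small sketch like the one below. Everything in it is an assumption made for illustration: the threshold value, the class name `AutoStatisticsSketch`, and the use of `String` in place of `SortKey`. This page does not state the actual switch-over criterion or how existing counts are carried over.

```java
import java.util.HashMap;
import java.util.Map;

class AutoStatisticsSketch {
    // Hypothetical threshold, kept tiny so the switch is easy to observe.
    static final int SWITCH_THRESHOLD = 3;

    private Map<String, Long> frequency = new HashMap<>();
    private boolean usingSketch = false;

    void add(String key) {
        if (usingSketch) {
            // A fuller version would feed the key into a reservoir sampler here.
            return;
        }
        frequency.merge(key, 1L, Long::sum);
        if (frequency.size() > SWITCH_THRESHOLD) {
            // Too many distinct keys for exact counting: stop using the map.
            usingSketch = true;
            frequency = null;
        }
    }

    boolean isUsingSketch() {
        return usingSketch;
    }
}
```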
    • Method Detail

      • values

        public static StatisticsType[] values()
        Returns an array containing the constants of this enum type, in the order they are declared. This method may be used to iterate over the constants as follows:
        for (StatisticsType c : StatisticsType.values())
            System.out.println(c);
        
        Returns:
        an array containing the constants of this enum type, in the order they are declared
      • valueOf

        public static StatisticsType valueOf(java.lang.String name)
        Returns the enum constant of this type with the specified name. The string must match exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace characters are not permitted.)
        Parameters:
        name - the name of the enum constant to be returned.
        Returns:
        the enum constant with the specified name
        Throws:
        java.lang.IllegalArgumentException - if this enum type has no constant with the specified name
        java.lang.NullPointerException - if the argument is null
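Since valueOf requires an exact, case-sensitive match and throws on bad input, callers that parse the name from configuration may want a tolerant wrapper. The following is a minimal sketch: the nested enum is a local stand-in that only mirrors the constant names of StatisticsType so the example compiles on its own, and `ValueOfSketch` and `parse` are hypothetical names.

```java
import java.util.Optional;

class ValueOfSketch {
    // Local stand-in with the same constant names as StatisticsType.
    enum StatisticsType { Map, Sketch, Auto }

    // Returns an empty Optional instead of throwing when the name is null
    // or does not exactly match a declared constant.
    static Optional<StatisticsType> parse(String name) {
        if (name == null) {
            return Optional.empty();
        }
        try {
            // trim() removes surrounding whitespace, which valueOf itself rejects.
            return Optional.of(StatisticsType.valueOf(name.trim()));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("Sketch").isPresent());
        System.out.println(parse("sketch").isPresent());
    }
}
```

Note that the match stays case-sensitive: "Sketch" resolves, while "sketch" does not.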