@EnumDescription(value="Modes for sorting records during bulk insert.") public enum BulkInsertSortMode extends Enum<BulkInsertSortMode>
| Enum Constant and Description |
|---|
| GLOBAL_SORT |
| NONE |
| PARTITION_PATH_REPARTITION |
| PARTITION_PATH_REPARTITION_AND_SORT |
| PARTITION_SORT |

| Modifier and Type | Method and Description |
|---|---|
| static BulkInsertSortMode | valueOf(String name): Returns the enum constant of this type with the specified name. |
| static BulkInsertSortMode[] | values(): Returns an array containing the constants of this enum type, in the order they are declared. |

@EnumFieldDescription(value="No sorting. Fastest and matches `spark.write.parquet()` in number of files and overhead.") public static final BulkInsertSortMode NONE
@EnumFieldDescription(value="This ensures best file sizes, with lowest memory overhead at cost of sorting.") public static final BulkInsertSortMode GLOBAL_SORT
@EnumFieldDescription(value="Strikes a balance by only sorting within a Spark RDD partition, still keeping the memory overhead of writing low. File sizing is not as good as GLOBAL_SORT.") public static final BulkInsertSortMode PARTITION_SORT
@EnumFieldDescription(value="This ensures that the data for a single physical partition in the table is written by the same Spark executor. This should only be used when input data is evenly distributed across different partition paths. If data is skewed (most records are intended for a handful of partition paths among all) then this can cause an imbalance among Spark executors.") public static final BulkInsertSortMode PARTITION_PATH_REPARTITION
@EnumFieldDescription(value="This ensures that the data for a single physical partition in the table is written by the same Spark executor. This should only be used when input data is evenly distributed across different partition paths. Compared to PARTITION_PATH_REPARTITION, this sort mode does an additional step of sorting the records based on the partition path within a single Spark partition, given that data for multiple physical partitions can be sent to the same Spark partition and executor. If data is skewed (most records are intended for a handful of partition paths among all) then this can cause an imbalance among Spark executors.") public static final BulkInsertSortMode PARTITION_PATH_REPARTITION_AND_SORT
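
Since these constants are usually selected by name (for instance from a user-supplied string), the following is a minimal sketch of a tolerant lookup built on `valueOf`. The nested `SortMode` enum is a hypothetical stand-in that mirrors the constant names above so the example compiles without Hudi on the classpath; `SortModeLookup` and `parse` are likewise illustrative names, not part of the library.

```java
import java.util.Locale;
import java.util.Optional;

public class SortModeLookup {
    // Hypothetical stand-in mirroring the documented constant names.
    enum SortMode {
        NONE,
        GLOBAL_SORT,
        PARTITION_SORT,
        PARTITION_PATH_REPARTITION,
        PARTITION_PATH_REPARTITION_AND_SORT
    }

    // Resolves a raw name to a constant; returns empty for null or unknown names.
    static Optional<SortMode> parse(String raw) {
        if (raw == null) {
            // valueOf(null) would throw NullPointerException, so guard first.
            return Optional.empty();
        }
        try {
            // valueOf requires an exact match, so normalize whitespace and case.
            return Optional.of(SortMode.valueOf(raw.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            // Thrown when no constant has the given name.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("global_sort")); // prints Optional[GLOBAL_SORT]
        System.out.println(parse("sorted"));      // prints Optional.empty
    }
}
```

Returning an `Optional` leaves the fallback decision to the caller; a caller that prefers to fail fast could instead let the `IllegalArgumentException` propagate.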
public static BulkInsertSortMode[] values()
```java
for (BulkInsertSortMode c : BulkInsertSortMode.values())
    System.out.println(c);
```
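
As a small illustration of the "order they are declared" guarantee, the sketch below lists each constant with its ordinal and shows that every call to `values()` returns a fresh array. The `SortMode` enum is a hypothetical stand-in, and its declaration order is an assumption taken from the order of the field descriptions on this page, not confirmed against the Hudi source.

```java
public class ValuesDemo {
    // Hypothetical stand-in; declaration order is assumed for illustration.
    enum SortMode {
        NONE,
        GLOBAL_SORT,
        PARTITION_SORT,
        PARTITION_PATH_REPARTITION,
        PARTITION_PATH_REPARTITION_AND_SORT
    }

    // Builds one "ordinal=NAME" line per constant, in declaration order.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (SortMode mode : SortMode.values()) {
            sb.append(mode.ordinal()).append('=').append(mode.name()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());

        // values() hands back a new array each time, so mutating one copy
        // does not affect later calls.
        SortMode[] first = SortMode.values();
        first[0] = SortMode.GLOBAL_SORT;
        System.out.println(SortMode.values()[0]); // prints NONE
    }
}
```

Because each call allocates a new array, code on a hot path may prefer to call `values()` once and reuse the result.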
public static BulkInsertSortMode valueOf(String name)
Parameters: name - the name of the enum constant to be returned.

Throws: IllegalArgumentException - if this enum type has no constant with the specified name

Throws: NullPointerException - if the argument is null

Copyright © 2024 The Apache Software Foundation. All rights reserved.