public class BucketAssigner extends Object implements AutoCloseable
This assigner assigns a bucket to each record, one record at a time. If the record is an update, it checks for an existing UPDATE bucket and reuses it, or generates a new one. If the record is an insert, it first checks the record's partition for small files, looking for one that has space to append new records, and reuses that small file's data bucket; if there is no small file (or no space left for new records), it generates an INSERT bucket.
The bucket identifier is {partition}_{fileId}, so each bucket is unique both within a partition and across partitions.
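The assignment rules described above can be illustrated with a minimal, self-contained sketch. It is not Hudi's implementation: `BucketAssignSketch`, `Bucket`, `SmallFileSlot`, `registerSmallFile`, the `recordsPerNewFile` capacity, and the `new-file-N` file ids are all hypothetical names introduced for illustration. Only the `{partition}_{fileId}` identifier and the update/insert decision order come from the description. The sketch also assumes that appending to a small file is modeled as an UPDATE bucket, since it writes to an existing file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified, hypothetical model of the assignment rules; not Hudi's code. */
class BucketAssignSketch {
    enum BucketType { UPDATE, INSERT }

    /** Hypothetical stand-in for a bucket: a type plus the file it writes to. */
    static final class Bucket {
        final BucketType type;
        final String partition;
        final String fileId;

        Bucket(BucketType type, String partition, String fileId) {
            this.type = type;
            this.partition = partition;
            this.fileId = fileId;
        }
    }

    /** Hypothetical record of a file that can still accept `remaining` records. */
    static final class SmallFileSlot {
        final String fileId;
        long remaining;

        SmallFileSlot(String fileId, long remaining) {
            this.fileId = fileId;
            this.remaining = remaining;
        }
    }

    // Buckets keyed by {partition}_{fileId}, accumulated within one checkpoint interval.
    private final Map<String, Bucket> buckets = new HashMap<>();
    private final Map<String, List<SmallFileSlot>> slotsByPartition = new HashMap<>();
    private final long recordsPerNewFile; // assumed capacity of a newly created file
    private int newFileCounter = 0;

    BucketAssignSketch(long recordsPerNewFile) {
        this.recordsPerNewFile = recordsPerNewFile;
    }

    /** Builds the bucket identifier described above. */
    static String bucketId(String partition, String fileId) {
        return partition + "_" + fileId;
    }

    void registerSmallFile(String partition, String fileId, long remaining) {
        slotsByPartition.computeIfAbsent(partition, p -> new ArrayList<>())
            .add(new SmallFileSlot(fileId, remaining));
    }

    /** Update: reuse the existing bucket for this file, or create an UPDATE bucket. */
    Bucket addUpdate(String partition, String fileIdHint) {
        return buckets.computeIfAbsent(bucketId(partition, fileIdHint),
            k -> new Bucket(BucketType.UPDATE, partition, fileIdHint));
    }

    /** Insert: prefer a file with spare room; otherwise open a new INSERT bucket. */
    Bucket addInsert(String partition) {
        List<SmallFileSlot> slots =
            slotsByPartition.computeIfAbsent(partition, p -> new ArrayList<>());
        for (SmallFileSlot slot : slots) {
            if (slot.remaining > 0) {
                slot.remaining--;
                // Assumption of this sketch: appending to an existing small file
                // is modeled as an UPDATE bucket unless a bucket already exists.
                return buckets.computeIfAbsent(bucketId(partition, slot.fileId),
                    k -> new Bucket(BucketType.UPDATE, partition, slot.fileId));
            }
        }
        // No file has room: create a new file and remember its remaining capacity.
        String fileId = "new-file-" + newFileCounter++;
        slots.add(new SmallFileSlot(fileId, recordsPerNewFile - 1));
        Bucket bucket = new Bucket(BucketType.INSERT, partition, fileId);
        buckets.put(bucketId(partition, fileId), bucket);
        return bucket;
    }

    /** Clears the per-checkpoint state, mirroring the role of reset(). */
    void reset() {
        buckets.clear();
        slotsByPartition.clear();
    }
}
```

In this sketch, two updates to the same file id return the same bucket, and inserts keep filling a file until its assumed capacity is used up before a new INSERT bucket is opened.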
| Modifier and Type | Field and Description |
|---|---|
| `protected HoodieWriteConfig` | `config` The write config. |
| Constructor and Description |
|---|
| `BucketAssigner(int taskID, int maxParallelism, int numTasks, WriteProfile profile, HoodieWriteConfig config)` |
| Modifier and Type | Method and Description |
|---|---|
| `BucketInfo` | `addInsert(String partitionPath)` |
| `BucketInfo` | `addUpdate(String partitionPath, String fileIdHint)` |
| `void` | `close()` |
| `String` | `createFileIdOfThisTask()` |
| `void` | `reload(long checkpointId)` Refreshes the table state, such as TableFileSystemView and HoodieTimeline. |
| `void` | `reset()` Resets the state of this assigner. This should be done once per checkpoint, because all state is accumulated within one checkpoint interval. |
| `List<SmallFile>` | `smallFilesOfThisTask(List<SmallFile> smallFiles)` |
protected final HoodieWriteConfig config
public BucketAssigner(int taskID,
int maxParallelism,
int numTasks,
WriteProfile profile,
HoodieWriteConfig config)
public void reset()
public BucketInfo addUpdate(String partitionPath, String fileIdHint)
public BucketInfo addInsert(String partitionPath)
public void reload(long checkpointId)
@VisibleForTesting public String createFileIdOfThisTask()
@VisibleForTesting public List<SmallFile> smallFilesOfThisTask(List<SmallFile> smallFiles)
public void close()
close in interface AutoCloseable
Copyright © 2023 The Apache Software Foundation. All rights reserved.