public class LSMTimelineWriter extends Object
| Modifier and Type | Field and Description |
|---|---|
| static int | FILE_LAYER_ZERO |
| static long | MAX_FILE_SIZE_IN_BYTES |
| Modifier and Type | Method and Description |
|---|---|
| void | clean(HoodieEngineContext context, int compactedVersions) - Checks whether there is any unfinished compaction operation. |
| void | compactAndClean(HoodieEngineContext context) - Compacts the small parquet files. |
| static String | compactedFileName(List<String> files) - Returns a new file name. |
| void | compactFiles(List<String> candidateFiles, String compactedFileName) |
| static LSMTimelineWriter | getInstance(HoodieWriteConfig config, HoodieTable<?,?,?,?> table) |
| static LSMTimelineWriter | getInstance(HoodieWriteConfig config, HoodieTable<?,?,?,?> table, Option<StoragePath> archivePath) |
| static LSMTimelineWriter | getInstance(HoodieWriteConfig config, TaskContextSupplier taskContextSupplier, HoodieTableMetaClient metaClient) |
| void | updateManifest(List<String> filesToRemove, String fileToAdd) - Updates a manifest file. |
| void | updateManifest(String fileToAdd) - Updates a manifest file. |
| void | write(List<ActiveAction> activeActions, Option<Consumer<ActiveAction>> preWriteCallback, Option<Consumer<Exception>> exceptionHandler) - Writes the list of active actions into the timeline. |
public static final int FILE_LAYER_ZERO
public static final long MAX_FILE_SIZE_IN_BYTES
public static LSMTimelineWriter getInstance(HoodieWriteConfig config, HoodieTable<?,?,?,?> table)
public static LSMTimelineWriter getInstance(HoodieWriteConfig config, HoodieTable<?,?,?,?> table, Option<StoragePath> archivePath)
public static LSMTimelineWriter getInstance(HoodieWriteConfig config, TaskContextSupplier taskContextSupplier, HoodieTableMetaClient metaClient)
public void write(List<ActiveAction> activeActions, Option<Consumer<ActiveAction>> preWriteCallback, Option<Consumer<Exception>> exceptionHandler) throws HoodieCommitException
Parameters:
activeActions - The active actions
preWriteCallback - The callback invoked before writing each action
exceptionHandler - The handler for exceptions
Throws:
HoodieCommitException

public void updateManifest(String fileToAdd) throws IOException
3 steps:
Parameters:
fileToAdd - New file name to add
Throws:
IOException

public void updateManifest(List<String> filesToRemove, String fileToAdd) throws IOException
4 steps:
Parameters:
filesToRemove - File names to remove
fileToAdd - New file name to add
Throws:
IOException

public void compactAndClean(HoodieEngineContext context) throws IOException
The parquet naming convention is:
${min_instant}_${max_instant}_${level}.parquet
'min_instant' and 'max_instant' give the instant time range covered by the parquet file. 'level' is the number of the layer in which the file is located; currently there is no limit on the number of layers.
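The naming convention above can be illustrated with a minimal sketch. `TimelineFileNameSketch` and its methods are hypothetical helpers written for this example and are not part of the Hudi API; the sketch also assumes that instant times are plain digit strings, so that `_` appears only as a separator.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Hudi class): formats and parses file names of the
// form ${min_instant}_${max_instant}_${level}.parquet.
final class TimelineFileNameSketch {
  // Assumption: instants and the level are digit-only, so '_' is unambiguous.
  private static final Pattern NAME =
      Pattern.compile("^(\\d+)_(\\d+)_(\\d+)\\.parquet$");

  // Builds a file name from an instant range and a layer number.
  static String format(String minInstant, String maxInstant, int level) {
    return minInstant + "_" + maxInstant + "_" + level + ".parquet";
  }

  static String minInstant(String fileName) {
    return match(fileName).group(1);
  }

  static String maxInstant(String fileName) {
    return match(fileName).group(2);
  }

  static int level(String fileName) {
    return Integer.parseInt(match(fileName).group(3));
  }

  // Rejects names that do not follow the convention.
  private static Matcher match(String fileName) {
    Matcher m = NAME.matcher(fileName);
    if (!m.matches()) {
      throw new IllegalArgumentException("Unexpected file name: " + fileName);
    }
    return m;
  }

  public static void main(String[] args) {
    String name = format("20240101000000000", "20240102000000000", 0);
    System.out.println(name);        // 20240101000000000_20240102000000000_0.parquet
    System.out.println(level(name)); // 0
  }
}
```

Parsing with a strict pattern, rather than a bare split on `_`, makes malformed names fail loudly in this sketch; whether the real writer validates names this way is not stated here.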
These parquet files form an LSM tree layout. Each parquet file contains instant metadata entries with consecutive timestamps. The instant time ranges of different parquet files may overlap.
t1_t2_0.parquet, t3_t4_0.parquet, ... t5_t6_0.parquet       L0 layer
                         \            /
                            \     /
                               |
                               V
                         t3_t6_1.parquet                      L1 layer
Compaction and cleaning: once the number of files exceeds a threshold N (currently the constant 10), the oldest N files are replaced with a single compacted file in the next layer. A cleaning action is triggered right after the compaction.
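The compaction rule described above can be sketched as follows. This is an illustration under stated assumptions, not Hudi's implementation: `CompactionPlanSketch`, `pickCandidates`, and `compactedName` are hypothetical names; the files passed in are assumed to belong to one layer; instants are assumed to be fixed-width strings, so lexicographic order matches chronological order; and the compacted file is assumed to span the smallest 'min_instant' and largest 'max_instant' of its inputs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch (not a Hudi class) of "replace the oldest N files with
// one compacted file in the next layer".
final class CompactionPlanSketch {
  // The text gives 10 as the current constant; this field name is made up.
  static final int THRESHOLD = 10;
  private static final String SUFFIX = ".parquet";

  // Splits "min_max_level.parquet" into {min, max, level}.
  private static String[] parts(String fileName) {
    String base = fileName.substring(0, fileName.length() - SUFFIX.length());
    return base.split("_");
  }

  // Returns the oldest `threshold` files when the layer holds more than
  // `threshold` files, otherwise an empty list (nothing to compact).
  static List<String> pickCandidates(List<String> layerFiles, int threshold) {
    if (layerFiles.size() <= threshold) {
      return new ArrayList<>();
    }
    List<String> sorted = new ArrayList<>(layerFiles);
    // Assumption: fixed-width instants, so string order is time order.
    sorted.sort(Comparator.comparing((String f) -> parts(f)[0]));
    return new ArrayList<>(sorted.subList(0, threshold));
  }

  // Names the compacted file: widest instant range of the inputs, level + 1.
  static String compactedName(List<String> candidates) {
    if (candidates.isEmpty()) {
      throw new IllegalArgumentException("No candidate files to compact");
    }
    String min = parts(candidates.get(0))[0];
    String max = parts(candidates.get(0))[1];
    for (String f : candidates) {
      String[] p = parts(f);
      if (p[0].compareTo(min) < 0) {
        min = p[0];
      }
      if (p[1].compareTo(max) > 0) {
        max = p[1];
      }
    }
    int nextLevel = Integer.parseInt(parts(candidates.get(0))[2]) + 1;
    return min + "_" + max + "_" + nextLevel + SUFFIX;
  }

  public static void main(String[] args) {
    List<String> files = new ArrayList<>();
    for (int i = 1; i <= 11; i++) {
      files.add(String.format("%03d_%03d_0.parquet", 2 * i - 1, 2 * i));
    }
    List<String> candidates = pickCandidates(files, THRESHOLD);
    System.out.println(compactedName(candidates)); // 001_020_1.parquet
  }
}
```

With eleven L0 files, the ten oldest are selected and the newest one stays in L0 until the threshold is exceeded again.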
Parameters:
context - HoodieEngineContext
Throws:
IOException

public void clean(HoodieEngineContext context, int compactedVersions) throws IOException
Parameters:
context - HoodieEngineContext used to parallelize the deletion of obsolete files if necessary.
Throws:
IOException

Copyright © 2024 The Apache Software Foundation. All rights reserved.