public class LSMTimeline extends Object
After several instants have accumulated as a batch on the active timeline, they are flushed as a parquet file into the LSM timeline. In general, the timeline is composed of parquet files in an LSM-style file layout. Each new operation on the timeline yields a new snapshot version. In theory, multiple snapshot versions can coexist on the timeline.

    t111, t112 ... t120 ... ->
        \              /
           \        /
               |
               V
    t111_t120_0.parquet, t101_t110_0.parquet, ... t11_t20_0.parquet    L0
                          \                    /
                             \              /
                                    |
                                    V
                            t11_t100_1.parquet                        L1

    manifest_1, manifest_2, ... manifest_12
        |
        V
    _version_

A benchmark shows that reading 1000 instants costs about 10 ms.
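The naming in the diagram above suggests that a compacted file covers the combined instant range of its source files and sits one layer higher. The following is a minimal sketch of that naming rule only, not Hudi's compaction code. The class `LsmCompactionNameSketch`, the type `LsmFile`, and the method `compact` are hypothetical names, and the sketch assumes instant times are fixed-width strings so that lexicographic comparison matches time order.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the file naming in the diagram; not Hudi's implementation. */
final class LsmCompactionNameSketch {

  /** Hypothetical descriptor for a file named [min]_[max]_[layer].parquet. */
  static final class LsmFile {
    final String minInstant;
    final String maxInstant;
    final int layer;

    LsmFile(String minInstant, String maxInstant, int layer) {
      this.minInstant = minInstant;
      this.maxInstant = maxInstant;
      this.layer = layer;
    }

    String fileName() {
      return minInstant + "_" + maxInstant + "_" + layer + ".parquet";
    }
  }

  /** Derives the descriptor of the file produced by merging files of one layer. */
  static LsmFile compact(List<LsmFile> sources) {
    if (sources.isEmpty()) {
      throw new IllegalArgumentException("nothing to compact");
    }
    LsmFile first = sources.get(0);
    String min = first.minInstant;
    String max = first.maxInstant;
    for (LsmFile f : sources) {
      if (f.layer != first.layer) {
        throw new IllegalArgumentException("sources must share one layer");
      }
      // Assumption: fixed-width instants, so String order equals time order.
      if (f.minInstant.compareTo(min) < 0) min = f.minInstant;
      if (f.maxInstant.compareTo(max) > 0) max = f.maxInstant;
    }
    return new LsmFile(min, max, first.layer + 1);
  }

  public static void main(String[] args) {
    LsmFile merged = compact(Arrays.asList(
        new LsmFile("0011", "0020", 0),
        new LsmFile("0091", "0100", 0)));
    System.out.println(merged.fileName()); // prints 0011_0100_1.parquet
  }
}
```

The sketch only computes the name of the merged file; how many L0 files trigger a compaction is not stated on this page and is left out.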
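To provide snapshot isolation between writes to and reads from the LSM timeline, two kinds of metadata files are used for LSM tree version management: the manifest files (manifest_1, manifest_2, ... in the diagram above) and the version file (_version_).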
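One way to picture how versioned metadata gives readers an isolated view is the in-memory toy model below. It is a sketch under assumptions, not Hudi's implementation: the class `SnapshotVersionSketch` and its methods are hypothetical, each manifest is assumed to hold the full file list of one snapshot, and the `currentVersion` field stands in for the version file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory toy model of manifest and version metadata; not Hudi's code. */
final class SnapshotVersionSketch {
  // Manifest version -> immutable file list visible in that snapshot.
  private final Map<Integer, List<String>> manifests = new TreeMap<>();
  // Stands in for the version file (assumption: it records the latest version).
  private int currentVersion = 0;

  /** Writer side: publishes a complete file list as the next snapshot version. */
  int commit(List<String> files) {
    int next = currentVersion + 1;
    manifests.put(next, Collections.unmodifiableList(new ArrayList<>(files)));
    // The version pointer moves only after the manifest is fully written.
    currentVersion = next;
    return next;
  }

  /** Reader side: resolves the latest committed snapshot version. */
  int latestVersion() {
    return currentVersion;
  }

  /** Reader side: returns the file list of a given snapshot version. */
  List<String> filesOf(int version) {
    List<String> files = manifests.get(version);
    if (files == null) {
      throw new IllegalArgumentException("unknown snapshot version: " + version);
    }
    return files;
  }

  public static void main(String[] args) {
    SnapshotVersionSketch timeline = new SnapshotVersionSketch();
    int v1 = timeline.commit(Arrays.asList("0111_0120_0.parquet"));
    List<String> pinned = timeline.filesOf(v1); // a reader pins version 1
    timeline.commit(Arrays.asList("0011_0100_1.parquet", "0111_0120_0.parquet"));
    System.out.println(pinned.size());                 // prints 1
    System.out.println(timeline.latestVersion());      // prints 2
  }
}
```

A reader that resolved version 1 keeps seeing the file list of version 1 even after a later commit, which is the isolation property the metadata files are meant to provide.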
| Modifier and Type | Field and Description |
|---|---|
| static int | LSM_TIMELINE_INSTANT_VERSION_1 |
| Constructor and Description |
|---|
| LSMTimeline() |
| Modifier and Type | Method and Description |
|---|---|
| static List<Integer> | allSnapshotVersions(HoodieTableMetaClient metaClient) Returns all the valid snapshot versions. |
| static int | getFileLayer(String fileName) Parses the layer number from the file name. |
| static StoragePath | getManifestFilePath(HoodieTableMetaClient metaClient, int snapshotVersion) Returns the full manifest file path for the given version number. |
| static StoragePathFilter | getManifestFilePathFilter() Returns a path filter for the manifest files. |
| static int | getManifestVersion(String fileName) Parses the snapshot version from the manifest file name. |
| static String | getMaxInstantTime(String fileName) Parses the maximum instant time from the file name. |
| static String | getMinInstantTime(String fileName) Parses the minimum instant time from the file name. |
| static org.apache.avro.Schema | getReadSchema(HoodieArchivedTimeline.LoadMode loadMode) |
| static StoragePath | getVersionFilePath(HoodieTableMetaClient metaClient) Returns the full version file path. |
| static boolean | isFileFromLayer(String fileName, int layer) Returns whether a file belongs to the specified layer within the LSM layout. |
| static boolean | isFileInRange(HoodieArchivedTimeline.TimeRangeFilter filter, String fileName) Returns whether the given file falls within the time range of the filter. |
| static HoodieLSMTimelineManifest | latestSnapshotManifest(HoodieTableMetaClient metaClient) Returns the latest snapshot metadata files. |
| static HoodieLSMTimelineManifest | latestSnapshotManifest(HoodieTableMetaClient metaClient, int latestVersion) Reads the file list from the manifest file for the latest snapshot. |
| static int | latestSnapshotVersion(HoodieTableMetaClient metaClient) Returns the latest snapshot version. |
| static List<StoragePathInfo> | listAllManifestFiles(HoodieTableMetaClient metaClient) Lists all the manifest files. |
| static List<StoragePathInfo> | listAllMetaFiles(HoodieTableMetaClient metaClient) Lists all the parquet metadata files. |
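
The file-name helpers summarized above (getMinInstantTime, getMaxInstantTime, getFileLayer, getManifestVersion) all extract fields from a name. The sketch below shows one way such parsing could look. It is not the library's code: the class `LsmFileNameSketch`, its method names, and the two regular expressions are assumptions inferred from the names in the diagram (for example t11_t100_1.parquet and manifest_12), and throwing on a non-matching name is a choice made for this sketch only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical file-name parser; the patterns are inferred, not taken from Hudi. */
final class LsmFileNameSketch {
  // Assumed data file pattern: [minInstant]_[maxInstant]_[layer].parquet
  private static final Pattern DATA_FILE =
      Pattern.compile("^([^_]+)_([^_]+)_(\\d+)\\.parquet$");
  // Assumed manifest file pattern: manifest_[version]
  private static final Pattern MANIFEST_FILE = Pattern.compile("^manifest_(\\d+)$");

  private static Matcher match(Pattern pattern, String fileName) {
    Matcher m = pattern.matcher(fileName);
    if (!m.matches()) {
      throw new IllegalArgumentException("unexpected file name: " + fileName);
    }
    return m;
  }

  static String minInstant(String fileName) {
    return match(DATA_FILE, fileName).group(1);
  }

  static String maxInstant(String fileName) {
    return match(DATA_FILE, fileName).group(2);
  }

  static int layer(String fileName) {
    return Integer.parseInt(match(DATA_FILE, fileName).group(3));
  }

  static int manifestVersion(String fileName) {
    return Integer.parseInt(match(MANIFEST_FILE, fileName).group(1));
  }

  public static void main(String[] args) {
    System.out.println(minInstant("t11_t100_1.parquet"));  // prints t11
    System.out.println(maxInstant("t11_t100_1.parquet"));  // prints t100
    System.out.println(layer("t11_t100_1.parquet"));       // prints 1
    System.out.println(manifestVersion("manifest_12"));    // prints 12
  }
}
```

A layer check in the spirit of isFileFromLayer could then be written as `layer(fileName) == expectedLayer` on top of this sketch.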
public static final int LSM_TIMELINE_INSTANT_VERSION_1
public static org.apache.avro.Schema getReadSchema(HoodieArchivedTimeline.LoadMode loadMode)
public static boolean isFileInRange(HoodieArchivedTimeline.TimeRangeFilter filter, String fileName)
public static int latestSnapshotVersion(HoodieTableMetaClient metaClient) throws IOException
public static List<Integer> allSnapshotVersions(HoodieTableMetaClient metaClient) throws IOException
public static HoodieLSMTimelineManifest latestSnapshotManifest(HoodieTableMetaClient metaClient) throws IOException
public static HoodieLSMTimelineManifest latestSnapshotManifest(HoodieTableMetaClient metaClient, int latestVersion)
public static StoragePath getManifestFilePath(HoodieTableMetaClient metaClient, int snapshotVersion)
public static StoragePath getVersionFilePath(HoodieTableMetaClient metaClient)
public static List<StoragePathInfo> listAllManifestFiles(HoodieTableMetaClient metaClient) throws IOException
public static List<StoragePathInfo> listAllMetaFiles(HoodieTableMetaClient metaClient) throws IOException
public static int getManifestVersion(String fileName)
public static int getFileLayer(String fileName)
public static String getMinInstantTime(String fileName)
public static String getMaxInstantTime(String fileName)
public static boolean isFileFromLayer(String fileName, int layer)
Returns whether a file belongs to the specified layer within the LSM layout.
public static StoragePathFilter getManifestFilePathFilter()
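Because each data file name carries a minimum and a maximum instant time, a reader can decide from the name alone whether a file is relevant to a time range, which is the kind of decision isFileInRange makes. The sketch below shows one plausible interpretation, an inclusive interval-overlap test. It is not the logic of isFileInRange, whose exact semantics are not described on this page. The class `TimeRangeOverlapSketch` and the method `overlaps` are hypothetical, the file-name pattern is assumed from the diagram, and instant times are assumed to be fixed-width strings so that lexicographic order matches time order.

```java
/** Hypothetical overlap check on file names; not the logic of isFileInRange. */
final class TimeRangeOverlapSketch {

  /**
   * Returns true if the instant range encoded in the file name intersects the
   * inclusive range [start, end].
   */
  static boolean overlaps(String fileName, String start, String end) {
    String suffix = ".parquet";
    if (!fileName.endsWith(suffix)) {
      throw new IllegalArgumentException("unexpected file name: " + fileName);
    }
    // Assumed layout: [minInstant]_[maxInstant]_[layer].parquet
    String[] parts = fileName.substring(0, fileName.length() - suffix.length()).split("_");
    if (parts.length != 3) {
      throw new IllegalArgumentException("unexpected file name: " + fileName);
    }
    String fileMin = parts[0];
    String fileMax = parts[1];
    // Two closed intervals intersect when each one starts before the other ends.
    return fileMin.compareTo(end) <= 0 && fileMax.compareTo(start) >= 0;
  }

  public static void main(String[] args) {
    System.out.println(overlaps("0011_0100_1.parquet", "0050", "0060")); // prints true
    System.out.println(overlaps("0011_0100_1.parquet", "0101", "0120")); // prints false
  }
}
```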
Copyright © 2024 The Apache Software Foundation. All rights reserved.