public class TimelineUtils extends Object
This is useful in multiple places including: 1) HiveSync - this can be used to query partitions that changed since previous sync. 2) Incremental reads - InputFormats can use this API to query
| Modifier and Type | Class and Description |
|---|---|
| static class | TimelineUtils.HollowCommitHandling |
| Constructor and Description |
|---|
| TimelineUtils() |
| Modifier and Type | Method and Description |
|---|---|
| static HoodieTimeline | concatTimeline(HoodieTimeline timeline1, HoodieTimeline timeline2, HoodieTableMetaClient metaClient): Concat two timelines timeline1 and timeline2 to build a new timeline. |
| static List<String> | getAffectedPartitions(HoodieTimeline timeline): Returns partitions that have been modified including internal operations such as clean in the passed timeline. |
| static Map<String,Option<String>> | getAllExtraMetadataForKey(HoodieTableMetaClient metaClient, String extraMetadataKey): Get extra metadata for specified key from all active commit/deltacommit instants. |
| static HoodieCommitMetadata | getCommitMetadata(HoodieInstant instant, HoodieTimeline timeline): Returns the commit metadata of the given instant. |
| static HoodieTimeline | getCommitsTimelineAfter(HoodieTableMetaClient metaClient, String exclusiveStartInstantTime, Option<String> lastMaxCompletionTime): Returns a Hudi timeline with commits after the given instant time (exclusive). |
| static List<String> | getDroppedPartitions(HoodieTableMetaClient metaClient, Option<String> lastCommitTimeSynced, Option<String> lastCommitCompletionTimeSynced): Returns partitions that have been deleted or marked for deletion in the timeline between given commit time range. |
| static Option<HoodieInstant> | getEarliestInstantForMetadataArchival(HoodieActiveTimeline dataTableActiveTimeline, boolean shouldArchiveBeyondSavepoint): Gets the qualified earliest instant from the active timeline of the data table for the archival in metadata table. |
| static Option<String> | getExtraMetadataFromLatest(HoodieTableMetaClient metaClient, String extraMetadataKey): Get extra metadata for specified key from latest commit/deltacommit/replacecommit(eg. |
| static Option<String> | getExtraMetadataFromLatestIncludeClustering(HoodieTableMetaClient metaClient, String extraMetadataKey): Get extra metadata for specified key from latest commit/deltacommit/replacecommit instant including internal commits such as clustering. |
| static HoodieDefaultTimeline | getTimeline(HoodieTableMetaClient metaClient, boolean includeArchivedTimeline) |
| static List<String> | getWrittenPartitions(HoodieTimeline timeline): Returns partitions that have new data strictly after commitTime. |
| static HoodieTimeline | handleHollowCommitIfNeeded(HoodieTimeline completedCommitTimeline, HoodieTableMetaClient metaClient, TimelineUtils.HollowCommitHandling handlingMode): Handles hollow commit as per HoodieCommonConfig.INCREMENTAL_READ_HANDLE_HOLLOW_COMMIT and returns the filtered or non-filtered timeline for the incremental query to run against. |
| static boolean | isClusteringCommit(HoodieTableMetaClient metaClient, HoodieInstant instant) |
| static boolean | isDeletePartition(WriteOperationType operation) |
| static void | validateTimestampAsOf(HoodieTableMetaClient metaClient, String timestampAsOf): Validate user-specified timestamp of time travel query against incomplete commit's timestamp. |
public static List<String> getWrittenPartitions(HoodieTimeline timeline)
public static List<String> getDroppedPartitions(HoodieTableMetaClient metaClient, Option<String> lastCommitTimeSynced, Option<String> lastCommitCompletionTimeSynced)
public static List<String> getAffectedPartitions(HoodieTimeline timeline)
public static Option<String> getExtraMetadataFromLatest(HoodieTableMetaClient metaClient, String extraMetadataKey)
public static Option<String> getExtraMetadataFromLatestIncludeClustering(HoodieTableMetaClient metaClient, String extraMetadataKey)
public static Map<String,Option<String>> getAllExtraMetadataForKey(HoodieTableMetaClient metaClient, String extraMetadataKey)
public static boolean isClusteringCommit(HoodieTableMetaClient metaClient, HoodieInstant instant)
public static HoodieDefaultTimeline getTimeline(HoodieTableMetaClient metaClient, boolean includeArchivedTimeline)
public static HoodieTimeline getCommitsTimelineAfter(HoodieTableMetaClient metaClient, String exclusiveStartInstantTime, Option<String> lastMaxCompletionTime)
metaClient - HoodieTableMetaClient instance.
exclusiveStartInstantTime - Start instant time (exclusive).
lastMaxCompletionTime - Last commit max completion time synced.
public static HoodieCommitMetadata getCommitMetadata(HoodieInstant instant, HoodieTimeline timeline) throws IOException
instant - The hoodie instant.
timeline - The timeline.
IOException
public static Option<HoodieInstant> getEarliestInstantForMetadataArchival(HoodieActiveTimeline dataTableActiveTimeline, boolean shouldArchiveBeyondSavepoint)
The qualified earliest instant is the earlier of two candidates: the earliest commit (COMMIT, DELTA_COMMIT, and REPLACE_COMMIT only, considering only non-savepoint commits when archiving beyond savepoint is enabled) and the earliest inflight instant (all actions).
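The selection rule described for `getEarliestInstantForMetadataArchival` can be illustrated with a minimal, dependency-free sketch. It is not Hudi's implementation: the class `EarliestInstantSketch` and its helpers are hypothetical, instants are modelled as plain timestamp strings of equal width (so lexicographic order matches chronological order), and the behaviour when one candidate is absent is an assumption.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in, not part of Hudi: models instants as timestamp strings.
final class EarliestInstantSketch {

  // Earliest timestamp in a list, or empty if the list has no entries.
  static Optional<String> earliest(List<String> instantTimes) {
    return instantTimes.stream().min(Comparator.naturalOrder());
  }

  // Picks the earlier of the two candidates. Assumption: if only one
  // candidate exists, that candidate is returned.
  static Optional<String> earlierOf(Optional<String> earliestCommit,
                                    Optional<String> earliestInflight) {
    if (!earliestCommit.isPresent()) {
      return earliestInflight;
    }
    if (!earliestInflight.isPresent()) {
      return earliestCommit;
    }
    return earliestCommit.get().compareTo(earliestInflight.get()) <= 0
        ? earliestCommit
        : earliestInflight;
  }

  public static void main(String[] args) {
    // Made-up instant times for illustration only.
    List<String> commits = Arrays.asList("20240102000000000", "20240103000000000");
    List<String> inflight = Arrays.asList("20240101120000000");
    // The inflight instant is older than every commit, so it is selected.
    System.out.println(earlierOf(earliest(commits), earliest(inflight)).orElse("none"));
  }
}
```

In this sketch a pending instant that predates all completed commits becomes the result, which matches the idea that archival should not move past work that is still in flight.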
dataTableActiveTimeline - the active timeline of the data table.
shouldArchiveBeyondSavepoint - whether to archive beyond savepoint.
public static void validateTimestampAsOf(HoodieTableMetaClient metaClient, String timestampAsOf)
HoodieException - when time travel query's timestamp >= incomplete commit's timestamp
public static HoodieTimeline handleHollowCommitIfNeeded(HoodieTimeline completedCommitTimeline, HoodieTableMetaClient metaClient, TimelineUtils.HollowCommitHandling handlingMode)
Handles hollow commit as per HoodieCommonConfig.INCREMENTAL_READ_HANDLE_HOLLOW_COMMIT and returns the filtered or non-filtered timeline for the incremental query to run against.
public static HoodieTimeline concatTimeline(HoodieTimeline timeline1, HoodieTimeline timeline2, HoodieTableMetaClient metaClient)
public static boolean isDeletePartition(WriteOperationType operation)
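The failure condition documented for `validateTimestampAsOf` (the time travel timestamp is greater than or equal to an incomplete commit's timestamp) can be sketched without any Hudi dependency. Everything here is an assumption made for illustration: the class `TimestampAsOfSketch` is hypothetical, `IllegalArgumentException` stands in for `HoodieException`, the check is made against the earliest incomplete commit, and timestamps are equal-width strings compared lexicographically.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in, not part of Hudi.
final class TimestampAsOfSketch {

  // Rejects a time travel timestamp that is not strictly before the
  // earliest incomplete commit. With no incomplete commits, any value passes.
  static void validate(List<String> incompleteCommitTimes, String timestampAsOf) {
    Optional<String> firstIncomplete =
        incompleteCommitTimes.stream().min(Comparator.naturalOrder());
    if (firstIncomplete.isPresent()
        && timestampAsOf.compareTo(firstIncomplete.get()) >= 0) {
      // Stand-in for the HoodieException named in the documentation.
      throw new IllegalArgumentException(
          "Time travel timestamp " + timestampAsOf
              + " must be earlier than incomplete commit " + firstIncomplete.get());
    }
  }

  public static void main(String[] args) {
    // Made-up instant times for illustration only.
    List<String> incomplete = Arrays.asList("20240105000000000");
    validate(incomplete, "20240104000000000");               // accepted
    validate(Collections.<String>emptyList(), "20240106000000000"); // accepted
    try {
      validate(incomplete, "20240105000000000");             // equal, so rejected
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The comparison uses `>=` rather than `>`, so a query pinned exactly to the incomplete commit's timestamp is rejected as well.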
Copyright © 2024 The Apache Software Foundation. All rights reserved.