public class RemoteHoodieTableFileSystemView extends Object implements SyncableFileSystemView, Serializable

Nested classes/interfaces inherited from interface TableFileSystemView: TableFileSystemView.BaseFileOnlyView, TableFileSystemView.BaseFileOnlyViewWithLatestSlice, TableFileSystemView.SliceView, TableFileSystemView.SliceViewWithLatestSlice

| Constructor and Description |
|---|
| RemoteHoodieTableFileSystemView(HoodieTableMetaClient metaClient, FileSystemViewStorageConfig viewConf) |
| RemoteHoodieTableFileSystemView(String server, int port, HoodieTableMetaClient metaClient) |

| Modifier and Type | Method and Description |
|---|---|
| void | close() - Allow the view to release resources and close. |
| Stream<HoodieBaseFile> | getAllBaseFiles(String partitionPath) - Stream all the data file versions grouped by FileId for a given partition. |
| Stream<HoodieFileGroup> | getAllFileGroups(String partitionPath) - Stream all the file groups for a given partition. |
| Stream<HoodieFileGroup> | getAllFileGroupsStateless(String partitionPath) - Stream all the file groups for a given partition without caching the file group mappings. |
| Stream<FileSlice> | getAllFileSlices(String partitionPath) - Stream all the file slices for a given partition, latest or not. |
| Map<String,Stream<HoodieBaseFile>> | getAllLatestBaseFilesBeforeOrOn(String maxCommitTime) - Stream the latest version base files in all partitions, with the precondition that commitTime(file) is before or on maxCommitTime. |
| Map<String,Stream<FileSlice>> | getAllLatestFileSlicesBeforeOrOn(String maxCommitTime) - Stream all latest file slices, with the precondition that commitTime(file) is before or on maxCommitTime. |
| Stream<HoodieFileGroup> | getAllReplacedFileGroups(String partitionPath) - Stream all the replaced file groups for the given partition. |
| Option<HoodieBaseFile> | getBaseFileOn(String partitionPath, String instantTime, String fileId) - Get the version of the data file matching the instant time in the given partition. |
| Stream<Pair<HoodieFileGroupId,HoodieInstant>> | getFileGroupsInPendingClustering() - File groups that are in pending clustering. |
| Option<HoodieInstant> | getLastInstant() - Last known instant on which the view is built. |
| Option<HoodieBaseFile> | getLatestBaseFile(String partitionPath, String fileId) - Get the latest data file for a partition and file ID. |
| Stream<HoodieBaseFile> | getLatestBaseFiles() - Stream all the latest data files in the file system view. |
| Stream<HoodieBaseFile> | getLatestBaseFiles(String partitionPath) - Stream all the latest data files in the given partition. |
| Stream<HoodieBaseFile> | getLatestBaseFilesBeforeOrOn(String partitionPath, String maxCommitTime) - Stream all the latest version data files in the given partition, with the precondition that commitTime(file) is before or on maxCommitTime. |
| Stream<HoodieBaseFile> | getLatestBaseFilesInRange(List<String> commitsToReturn) - Stream all the latest data files whose commit times are in the given list (commitsToReturn). |
| Option<FileSlice> | getLatestFileSlice(String partitionPath, String fileId) - Get the latest file slice for a given fileId in a given partition. |
| Stream<FileSlice> | getLatestFileSliceInRange(List<String> commitsToReturn) - Stream all the latest file slices in the given range. |
| Stream<FileSlice> | getLatestFileSlices(String partitionPath) - Stream all the latest file slices in the given partition. |
| Stream<FileSlice> | getLatestFileSlicesBeforeOrOn(String partitionPath, String maxCommitTime, boolean includeFileSlicesInPendingCompaction) - Stream all latest file slices in the given partition, with the precondition that commitTime(file) is before or on maxCommitTime. |
| Stream<FileSlice> | getLatestFileSlicesIncludingInflight(String partitionPath) - Get the latest file slices for a given partition, including the inflight ones. |
| Stream<FileSlice> | getLatestFileSlicesStateless(String partitionPath) - Stream all the latest file slices in the given partition without caching the file group mappings. |
| Stream<FileSlice> | getLatestMergedFileSlicesBeforeOrOn(String partitionPath, String maxInstantTime) - Stream all "merged" file slices before or on an instant time. If a file group has a pending compaction request, the file slices before and after the compaction request instant are merged and returned. |
| Stream<FileSlice> | getLatestUnCompactedFileSlices(String partitionPath) - Stream all the latest uncompacted file slices in the given partition. |
| Stream<Pair<String,CompactionOperation>> | getPendingCompactionOperations() - Return pending compaction operations. |
| Stream<Pair<String,CompactionOperation>> | getPendingLogCompactionOperations() - Return pending log compaction operations. |
| Stream<HoodieFileGroup> | getReplacedFileGroupsAfterOrOn(String minCommitTime, String partitionPath) - Stream all the replaced file groups after or on minCommitTime for the given partition. |
| Stream<HoodieFileGroup> | getReplacedFileGroupsBefore(String maxCommitTime, String partitionPath) - Stream all the replaced file groups before maxCommitTime for the given partition. |
| Stream<HoodieFileGroup> | getReplacedFileGroupsBeforeOrOn(String maxCommitTime, String partitionPath) - Stream all the replaced file groups before or on maxCommitTime for the given partition. |
| HoodieTimeline | getTimeline() - Timeline corresponding to the view. |
| void | loadAllPartitions() - Load all partitions and file slices into the view. |
| void | loadPartitions(List<String> partitionPaths) - Load all partitions and file slices into the view for the provided partition paths. |
| boolean | refresh() |
| void | reset() - Reset the view so that it can be refreshed. |
| void | sync() - Read the latest timeline and refresh the file-system view to match the current state of the file-system. |

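Several methods in the summary share the "before or on maxCommitTime" precondition. The following is a minimal sketch of that selection rule, not Hudi's implementation: `BeforeOrOnSketch` and its nested `Slice` class are hypothetical stand-ins, and it assumes commit times are fixed-width strings that order correctly under plain string comparison.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BeforeOrOnSketch {
    // Hypothetical stand-in for a file slice; this is not a Hudi class.
    static final class Slice {
        final String fileId;
        final String commitTime;

        Slice(String fileId, String commitTime) {
            this.fileId = fileId;
            this.commitTime = commitTime;
        }
    }

    // For each file ID, keep the slice with the greatest commit time that is
    // before or on maxCommitTime. Assumes commit times compare correctly as strings.
    static Map<String, Slice> latestBeforeOrOn(List<Slice> slices, String maxCommitTime) {
        Map<String, Slice> latest = new LinkedHashMap<>();
        for (Slice s : slices) {
            if (s.commitTime.compareTo(maxCommitTime) > 0) {
                continue; // written after the cutoff, so it is ignored
            }
            Slice current = latest.get(s.fileId);
            if (current == null || s.commitTime.compareTo(current.commitTime) > 0) {
                latest.put(s.fileId, s);
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        List<Slice> slices = Arrays.asList(
            new Slice("f1", "001"),
            new Slice("f1", "003"),
            new Slice("f1", "005"),
            new Slice("f2", "004"));
        // With a cutoff of "003", f1 resolves to commit 003 and f2 has no match.
        for (Slice s : latestBeforeOrOn(slices, "003").values()) {
            System.out.println(s.fileId + " -> " + s.commitTime);
        }
    }
}
```

Note that a slice whose commit time equals the cutoff is included, which is the "or on" part of the method names.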
public static final String LATEST_PARTITION_SLICES_URL
public static final String LATEST_PARTITION_SLICES_INFLIGHT_URL
public static final String LATEST_PARTITION_SLICES_STATELESS_URL
public static final String LATEST_PARTITION_SLICE_URL
public static final String LATEST_PARTITION_UNCOMPACTED_SLICES_URL
public static final String ALL_SLICES_URL
public static final String LATEST_SLICES_MERGED_BEFORE_ON_INSTANT_URL
public static final String LATEST_SLICES_RANGE_INSTANT_URL
public static final String LATEST_SLICES_BEFORE_ON_INSTANT_URL
public static final String ALL_LATEST_SLICES_BEFORE_ON_INSTANT_URL
public static final String PENDING_COMPACTION_OPS_URL
public static final String PENDING_LOG_COMPACTION_OPS_URL
public static final String LATEST_PARTITION_DATA_FILES_URL
public static final String LATEST_PARTITION_DATA_FILE_URL
public static final String ALL_DATA_FILES_URL
public static final String LATEST_ALL_DATA_FILES_URL
public static final String LATEST_DATA_FILE_ON_INSTANT_URL
public static final String LATEST_DATA_FILES_RANGE_INSTANT_URL
public static final String LATEST_DATA_FILES_BEFORE_ON_INSTANT_URL
public static final String ALL_LATEST_BASE_FILES_BEFORE_ON_INSTANT_URL
public static final String ALL_FILEGROUPS_FOR_PARTITION_URL
public static final String ALL_FILEGROUPS_FOR_PARTITION_STATELESS_URL
public static final String ALL_REPLACED_FILEGROUPS_BEFORE_OR_ON_URL
public static final String ALL_REPLACED_FILEGROUPS_BEFORE_URL
public static final String ALL_REPLACED_FILEGROUPS_AFTER_OR_ON_URL
public static final String ALL_REPLACED_FILEGROUPS_PARTITION_URL
public static final String PENDING_CLUSTERING_FILEGROUPS_URL
public static final String LAST_INSTANT_URL
public static final String LAST_INSTANTS_URL
public static final String TIMELINE_URL
public static final String REFRESH_TABLE_URL
public static final String LOAD_ALL_PARTITIONS_URL
public static final String LOAD_PARTITIONS_URL
public static final String PARTITION_PARAM
public static final String PARTITIONS_PARAM
public static final String BASEPATH_PARAM
public static final String INSTANT_PARAM
public static final String MAX_INSTANT_PARAM
public static final String MIN_INSTANT_PARAM
public static final String INSTANTS_PARAM
public static final String FILEID_PARAM
public static final String LAST_INSTANT_TS
public static final String TIMELINE_HASH
public static final String REFRESH_OFF
public static final String INCLUDE_FILES_IN_PENDING_COMPACTION_PARAM
public static final String MULTI_VALUE_SEPARATOR
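The field names above read like endpoint paths (`*_URL`) and query parameter names (`*_PARAM`) for requests sent to a remote server. Their values are not listed on this page, so the sketch below uses placeholder values throughout. `ViewRequestSketch`, `buildUrl`, and every constant in it are hypothetical and only illustrate how a path plus named parameters could be assembled into a request URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class ViewRequestSketch {
    // Placeholder values, assumed for illustration only; not the real constant values.
    static final String EXAMPLE_SLICES_PATH = "/example/slices/partition/latest";
    static final String EXAMPLE_PARTITION_PARAM = "partition";
    static final String EXAMPLE_BASEPATH_PARAM = "basepath";

    // Builds http://host:port/path?k1=v1&k2=v2 with URL-encoded keys and values.
    static String buildUrl(String host, int port, String path, Map<String, String> params) {
        StringBuilder sb = new StringBuilder("http://")
            .append(host).append(':').append(port).append(path);
        char separator = '?';
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                sb.append(separator)
                  .append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
                separator = '&';
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on the JVM, so this is not expected to happen.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put(EXAMPLE_PARTITION_PARAM, "2024/01/15");
        params.put(EXAMPLE_BASEPATH_PARAM, "/tmp/hudi/trips");
        System.out.println(buildUrl("localhost", 8080, EXAMPLE_SLICES_PATH, params));
    }
}
```

Encoding the values matters here because partition paths and base paths typically contain `/`, which would otherwise be ambiguous inside a query string.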
public RemoteHoodieTableFileSystemView(String server, int port, HoodieTableMetaClient metaClient)
public RemoteHoodieTableFileSystemView(HoodieTableMetaClient metaClient, FileSystemViewStorageConfig viewConf)
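public Stream<HoodieBaseFile> getLatestBaseFiles(String partitionPath)
Description copied from interface: TableFileSystemView.BaseFileOnlyViewWithLatestSlice
Stream all the latest data files in the given partition.
Specified by: getLatestBaseFiles in interface TableFileSystemView.BaseFileOnlyViewWithLatestSlice

public Stream<HoodieBaseFile> getLatestBaseFiles()
Description copied from interface: TableFileSystemView.BaseFileOnlyViewWithLatestSlice
Stream all the latest data files in the file system view.
Specified by: getLatestBaseFiles in interface TableFileSystemView.BaseFileOnlyViewWithLatestSlice

public Stream<HoodieBaseFile> getLatestBaseFilesBeforeOrOn(String partitionPath, String maxCommitTime)
Description copied from interface: TableFileSystemView.BaseFileOnlyViewWithLatestSlice
Stream all the latest version data files in the given partition, with the precondition that commitTime(file) is before or on maxCommitTime.
Specified by: getLatestBaseFilesBeforeOrOn in interface TableFileSystemView.BaseFileOnlyViewWithLatestSlice

public Map<String,Stream<HoodieBaseFile>> getAllLatestBaseFilesBeforeOrOn(String maxCommitTime)
Description copied from interface: TableFileSystemView.BaseFileOnlyViewWithLatestSlice
Stream the latest version base files in all partitions, with the precondition that commitTime(file) is before or on maxCommitTime.
Specified by: getAllLatestBaseFilesBeforeOrOn in interface TableFileSystemView.BaseFileOnlyViewWithLatestSlice
Parameters: maxCommitTime - The max commit time to consider.
Returns: Map of partition path to the latest version base files before or on the commit time

public Option<HoodieBaseFile> getBaseFileOn(String partitionPath, String instantTime, String fileId)
Description copied from interface: TableFileSystemView.BaseFileOnlyView
Get the version of the data file matching the instant time in the given partition.
Specified by: getBaseFileOn in interface TableFileSystemView.BaseFileOnlyView

public Stream<HoodieBaseFile> getLatestBaseFilesInRange(List<String> commitsToReturn)
Description copied from interface: TableFileSystemView.BaseFileOnlyViewWithLatestSlice
Stream all the latest data files whose commit times are in the given list (commitsToReturn).
Specified by: getLatestBaseFilesInRange in interface TableFileSystemView.BaseFileOnlyViewWithLatestSlice

public Stream<HoodieBaseFile> getAllBaseFiles(String partitionPath)
Description copied from interface: TableFileSystemView.BaseFileOnlyView
Stream all the data file versions grouped by FileId for a given partition.
Specified by: getAllBaseFiles in interface TableFileSystemView.BaseFileOnlyView

public Option<HoodieBaseFile> getLatestBaseFile(String partitionPath, String fileId)
Description copied from interface: TableFileSystemView.BaseFileOnlyViewWithLatestSlice
Get the latest data file for a partition and file ID.
Specified by: getLatestBaseFile in interface TableFileSystemView.BaseFileOnlyViewWithLatestSlice

public Stream<FileSlice> getLatestFileSlices(String partitionPath)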
Description copied from interface: TableFileSystemView.SliceViewWithLatestSlice
Stream all the latest file slices in the given partition.
Specified by: getLatestFileSlices in interface TableFileSystemView.SliceViewWithLatestSlice

public Stream<FileSlice> getLatestFileSlicesIncludingInflight(String partitionPath)
Description copied from interface: TableFileSystemView.SliceViewWithLatestSlice
Get the latest file slices for a given partition, including the inflight ones.
Specified by: getLatestFileSlicesIncludingInflight in interface TableFileSystemView.SliceViewWithLatestSlice
Parameters: partitionPath - The partition path of interest
Returns: FileSlice in the partition path.

public Stream<FileSlice> getLatestFileSlicesStateless(String partitionPath)
Description copied from interface: TableFileSystemView.SliceViewWithLatestSlice
Stream all the latest file slices in the given partition without caching the file group mappings.
This is useful for some table services such as compaction and clustering. These services may scan files within old data partitions; if a full table service is triggered across an enormous number of partitions, the cache could put heavy memory pressure on the timeline server and cause an OOM exception.
Caching these file groups rarely benefits writers, because writers usually write to recent data partitions.
Specified by: getLatestFileSlicesStateless in interface TableFileSystemView.SliceViewWithLatestSlice

public Option<FileSlice> getLatestFileSlice(String partitionPath, String fileId)
Description copied from interface: TableFileSystemView.SliceViewWithLatestSlice
Get the latest file slice for a given fileId in a given partition.
Specified by: getLatestFileSlice in interface TableFileSystemView.SliceViewWithLatestSlice

public Stream<FileSlice> getLatestUnCompactedFileSlices(String partitionPath)
Description copied from interface: TableFileSystemView.SliceViewWithLatestSlice
Stream all the latest uncompacted file slices in the given partition.
Specified by: getLatestUnCompactedFileSlices in interface TableFileSystemView.SliceViewWithLatestSlice

public Stream<FileSlice> getLatestFileSlicesBeforeOrOn(String partitionPath, String maxCommitTime, boolean includeFileSlicesInPendingCompaction)
Description copied from interface: TableFileSystemView.SliceViewWithLatestSlice
Stream all latest file slices in the given partition, with the precondition that commitTime(file) is before or on maxCommitTime.
Specified by: getLatestFileSlicesBeforeOrOn in interface TableFileSystemView.SliceViewWithLatestSlice
Parameters:
partitionPath - Partition path
maxCommitTime - Max Instant Time
includeFileSlicesInPendingCompaction - include file-slices that are in pending compaction

public Map<String,Stream<FileSlice>> getAllLatestFileSlicesBeforeOrOn(String maxCommitTime)
Description copied from interface: TableFileSystemView.SliceViewWithLatestSlice
Stream all latest file slices, with the precondition that commitTime(file) is before or on maxCommitTime.
Specified by: getAllLatestFileSlicesBeforeOrOn in interface TableFileSystemView.SliceViewWithLatestSlice
Parameters: maxCommitTime - Max Instant Time
Returns: Map of partition path to the latest file slices before or on maxCommitTime.

public Stream<FileSlice> getLatestMergedFileSlicesBeforeOrOn(String partitionPath, String maxInstantTime)
Description copied from interface: TableFileSystemView.SliceViewWithLatestSlice
Stream all "merged" file slices before or on an instant time. If a file group has a pending compaction request, the file slices before and after the compaction request instant are merged and returned.
Specified by: getLatestMergedFileSlicesBeforeOrOn in interface TableFileSystemView.SliceViewWithLatestSlice
Parameters:
partitionPath - Partition Path
maxInstantTime - Max Instant Time

public Stream<FileSlice> getLatestFileSliceInRange(List<String> commitsToReturn)
Description copied from interface: TableFileSystemView.SliceViewWithLatestSlice
Stream all the latest file slices in the given range.
Specified by: getLatestFileSliceInRange in interface TableFileSystemView.SliceViewWithLatestSlice

public Stream<FileSlice> getAllFileSlices(String partitionPath)
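Description copied from interface: TableFileSystemView.SliceView
Stream all the file slices for a given partition, latest or not.
Specified by: getAllFileSlices in interface TableFileSystemView.SliceView

public Stream<HoodieFileGroup> getAllFileGroups(String partitionPath)
Description copied from interface: TableFileSystemView
Stream all the file groups for a given partition.
Specified by: getAllFileGroups in interface TableFileSystemView

public Stream<HoodieFileGroup> getAllFileGroupsStateless(String partitionPath)
Description copied from interface: TableFileSystemView
Stream all the file groups for a given partition without caching the file group mappings.
This is useful for some table services such as cleaning. The cleaning service may scan for files to clean within old data partitions; if a full table cleaning is triggered across an enormous number of partitions, the cache could put heavy memory pressure on the timeline server and cause an OOM exception.
Caching these file groups rarely benefits writers, because writers usually write to recent data partitions.
Specified by: getAllFileGroupsStateless in interface TableFileSystemView

public Stream<HoodieFileGroup> getReplacedFileGroupsBeforeOrOn(String maxCommitTime, String partitionPath)
Description copied from interface: TableFileSystemView
Stream all the replaced file groups before or on maxCommitTime for the given partition.
Specified by: getReplacedFileGroupsBeforeOrOn in interface TableFileSystemView

public Stream<HoodieFileGroup> getReplacedFileGroupsBefore(String maxCommitTime, String partitionPath)
Description copied from interface: TableFileSystemView
Stream all the replaced file groups before maxCommitTime for the given partition.
Specified by: getReplacedFileGroupsBefore in interface TableFileSystemView

public Stream<HoodieFileGroup> getReplacedFileGroupsAfterOrOn(String minCommitTime, String partitionPath)
Description copied from interface: TableFileSystemView
Stream all the replaced file groups after or on minCommitTime for the given partition.
Specified by: getReplacedFileGroupsAfterOrOn in interface TableFileSystemView

public Stream<HoodieFileGroup> getAllReplacedFileGroups(String partitionPath)
Description copied from interface: TableFileSystemView
Stream all the replaced file groups for the given partition.
Specified by: getAllReplacedFileGroups in interface TableFileSystemView

public boolean refresh()
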
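The descriptions of getAllFileGroupsStateless and getLatestFileSlicesStateless above explain that skipping the file group cache avoids memory pressure when a table service walks many old partitions. The sketch below illustrates that trade-off in isolation. `StatelessLookupSketch` is a hypothetical class that borrows two method names for readability; its loader, its cache, and its counters are assumptions and say nothing about how the timeline server is actually implemented.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class StatelessLookupSketch {
    private final Map<String, List<String>> cache = new HashMap<>();
    private int loads = 0;

    // Hypothetical loader standing in for an expensive listing of a partition's file groups.
    private List<String> loadFileGroups(String partitionPath) {
        loads++;
        return Arrays.asList(partitionPath + "/fg-1", partitionPath + "/fg-2");
    }

    // Cached lookup: the first call loads and stores the result, later calls reuse it.
    List<String> getAllFileGroups(String partitionPath) {
        List<String> cached = cache.get(partitionPath);
        if (cached == null) {
            cached = loadFileGroups(partitionPath);
            cache.put(partitionPath, cached);
        }
        return cached;
    }

    // Stateless lookup: always loads and never stores, so memory use does not grow.
    List<String> getAllFileGroupsStateless(String partitionPath) {
        return loadFileGroups(partitionPath);
    }

    int cachedPartitionCount() {
        return cache.size();
    }

    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        StatelessLookupSketch view = new StatelessLookupSketch();
        // A service scanning many old partitions leaves nothing behind in the cache.
        for (int i = 0; i < 1000; i++) {
            view.getAllFileGroupsStateless("old/p" + i);
        }
        System.out.println("cached after stateless scan: " + view.cachedPartitionCount());
        // A writer that revisits one recent partition benefits from the cached path.
        view.getAllFileGroups("recent/p0");
        view.getAllFileGroups("recent/p0");
        System.out.println("cached after writer lookups: " + view.cachedPartitionCount());
        System.out.println("total loads: " + view.loadCount());
    }
}
```

The stateless path trades repeated loading for bounded memory, which suits one-off scans; the cached path suits callers that return to the same few partitions.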
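public void loadAllPartitions()
Description copied from interface: TableFileSystemView
Load all partitions and file slices into the view.
Specified by: loadAllPartitions in interface TableFileSystemView

public void loadPartitions(List<String> partitionPaths)
Description copied from interface: TableFileSystemView
Load all partitions and file slices into the view for the provided partition paths.
Specified by: loadPartitions in interface TableFileSystemView
Parameters: partitionPaths - List of partition paths to load

public Stream<Pair<String,CompactionOperation>> getPendingCompactionOperations()
Description copied from interface: TableFileSystemView
Return pending compaction operations.
Specified by: getPendingCompactionOperations in interface TableFileSystemView

public Stream<Pair<String,CompactionOperation>> getPendingLogCompactionOperations()
Description copied from interface: TableFileSystemView
Return pending log compaction operations.
Specified by: getPendingLogCompactionOperations in interface TableFileSystemView

public Stream<Pair<HoodieFileGroupId,HoodieInstant>> getFileGroupsInPendingClustering()
Description copied from interface: TableFileSystemView
File groups that are in pending clustering.
Specified by: getFileGroupsInPendingClustering in interface TableFileSystemView

public Option<HoodieInstant> getLastInstant()
Description copied from interface: TableFileSystemView
Last known instant on which the view is built.
Specified by: getLastInstant in interface TableFileSystemView

public HoodieTimeline getTimeline()
Description copied from interface: TableFileSystemView
Timeline corresponding to the view.
Specified by: getTimeline in interface TableFileSystemView

public void close()
Description copied from interface: SyncableFileSystemView
Allow the view to release resources and close.
Specified by: close in interface AutoCloseable
Specified by: close in interface SyncableFileSystemView

public void reset()
Description copied from interface: SyncableFileSystemView
Reset the view so that it can be refreshed.
Specified by: reset in interface SyncableFileSystemView

public void sync()
Description copied from interface: SyncableFileSystemView
Read the latest timeline and refresh the file-system view to match the current state of the file-system.
Specified by: sync in interface SyncableFileSystemView

Copyright © 2024 The Apache Software Foundation. All rights reserved.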