| Package | Description |
|---|---|
| org.apache.hudi.common.data | |
| org.apache.hudi.common.engine | |
| org.apache.hudi.metadata | |
| Modifier and Type | Class and Description |
|---|---|
| `class` | `HoodieListData<T>`: In-memory implementation of `HoodieData` holding internally a `Stream` of objects. |
| Modifier and Type | Method and Description |
|---|---|
| `HoodieData<T>` | `HoodieData.distinct()`: Returns a new `HoodieData` collection holding only the distinct objects of the original one. This is a stateful intermediate operation. |
| `HoodieData<T>` | `HoodieListData.distinct()` |
| `HoodieData<T>` | `HoodieData.distinct(int parallelism)`: Returns a new `HoodieData` collection holding only the distinct objects of the original one. This is a stateful intermediate operation. |
| `HoodieData<T>` | `HoodieListData.distinct(int parallelism)` |
| `default <O> HoodieData<T>` | `HoodieData.distinctWithKey(SerializableFunction<T,O> keyGetter, int parallelism)` |
| `<O> HoodieData<T>` | `HoodieListData.distinctWithKey(SerializableFunction<T,O> keyGetter, int parallelism)` |
| `HoodieData<T>` | `HoodieData.filter(SerializableFunction<T,Boolean> filterFunc)`: Returns a new `HoodieData` collection containing only the elements that match the provided `filterFunc` (i.e., those for which it returns `true`). |
| `HoodieData<T>` | `HoodieListData.filter(SerializableFunction<T,Boolean> filterFunc)` |
| `<O> HoodieData<O>` | `HoodieData.flatMap(SerializableFunction<T,Iterator<O>> func)`: Maps every element in the collection into a collection of new elements using the provided mapping `func`, then flattens the result (by concatenation) into a single collection. This is an intermediate operation. |
| `<O> HoodieData<O>` | `HoodieListData.flatMap(SerializableFunction<T,Iterator<O>> func)` |
| `HoodieData<K>` | `HoodieListPairData.keys()` |
| `HoodieData<K>` | `HoodiePairData.keys()`: Returns a `HoodieData` holding the key from every corresponding pair. |
| `<O> HoodieData<O>` | `HoodieListPairData.map(SerializableFunction<Pair<K,V>,O> func)` |
| `<O> HoodieData<O>` | `HoodiePairData.map(SerializableFunction<Pair<K,V>,O> func)`: Maps the key-value pairs of this `HoodiePairData` container using the provided mapper. NOTE: this returns `HoodieData`, not `HoodiePairData`. |
| `<O> HoodieData<O>` | `HoodieData.map(SerializableFunction<T,O> func)`: Maps every element in the collection using the provided mapping `func`. |
| `<O> HoodieData<O>` | `HoodieListData.map(SerializableFunction<T,O> func)` |
| `<O> HoodieData<O>` | `HoodieData.mapPartitions(SerializableFunction<Iterator<T>,Iterator<O>> func, boolean preservesPartitioning)`: Applies the provided mapping `func` to each partition of the collection (if applicable). This is an intermediate operation. |
| `<O> HoodieData<O>` | `HoodieListData.mapPartitions(SerializableFunction<Iterator<T>,Iterator<O>> func, boolean preservesPartitioning)` |
| `HoodieData<T>` | `HoodieData.repartition(int parallelism)`: Re-partitions the underlying collection (if applicable), making sure the new `HoodieData` has exactly `parallelism` partitions. |
| `HoodieData<T>` | `HoodieListData.repartition(int parallelism)` |
| `HoodieData<T>` | `HoodieData.union(HoodieData<T> other)`: Unions this `HoodieData` with another instance of `HoodieData`. |
| `HoodieData<T>` | `HoodieListData.union(HoodieData<T> other)` |
| `HoodieData<V>` | `HoodieListPairData.values()` |
| `HoodieData<V>` | `HoodiePairData.values()`: Returns a `HoodieData` holding the value from every corresponding pair. |
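
The operations listed above follow the familiar intermediate-operation vocabulary of `java.util.stream`, and `HoodieListData` is described as holding a `Stream` internally. The sketch below is an analogy only: it uses plain JDK streams rather than any Hudi class, and the class and method names (`StreamSemanticsSketch`, `distinctLongWords`, `unionThenMap`) are hypothetical. It illustrates what flatMap, distinct, filter, union-style concatenation, and map do to a small collection.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustration with JDK streams only; this is NOT the Hudi API.
class StreamSemanticsSketch {

    // flatMap: each line becomes several words, flattened into one stream.
    // distinct: duplicates are dropped (stateful intermediate operation).
    // filter: only words longer than two characters are kept.
    static List<String> distinctLongWords(List<String> lines) {
        return lines.stream()
            .flatMap(line -> Arrays.stream(line.split(" ")))
            .distinct()
            .filter(word -> word.length() > 2)
            .collect(Collectors.toList());
    }

    // Stream.concat plays the role of a union here; it keeps duplicates.
    // map: every element is transformed one-to-one.
    static List<Integer> unionThenMap(List<Integer> a, List<Integer> b) {
        return Stream.concat(a.stream(), b.stream())
            .map(x -> x * 10)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [ccc, dddd]
        System.out.println(distinctLongWords(Arrays.asList("a bb ccc", "ccc dddd")));
        // prints [10, 20, 20, 30]
        System.out.println(unionThenMap(Arrays.asList(1, 2), Arrays.asList(2, 3)));
    }
}
```

Whether a given `HoodieData` implementation evaluates these operations lazily, or how it treats duplicates in a union, depends on the engine behind it; the sketch only conveys the general shape of each transformation.
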
| Modifier and Type | Method and Description |
|---|---|
| `HoodieData<T>` | `HoodieData.union(HoodieData<T> other)`: Unions `HoodieData` with another instance of `HoodieData`. |
| `HoodieData<T>` | `HoodieListData.union(HoodieData<T> other)` |
| Modifier and Type | Method and Description |
|---|---|
| `abstract <T> HoodieData<T>` | `HoodieEngineContext.emptyHoodieData()` |
| `<T> HoodieData<T>` | `HoodieLocalEngineContext.emptyHoodieData()` |
| `<T> HoodieData<T>` | `HoodieEngineContext.parallelize(List<T> data)` |
| `abstract <T> HoodieData<T>` | `HoodieEngineContext.parallelize(List<T> data, int parallelism)` |
| `<T> HoodieData<T>` | `HoodieLocalEngineContext.parallelize(List<T> data, int parallelism)` |
| Modifier and Type | Method and Description |
|---|---|
| `abstract <I,O> O` | `HoodieEngineContext.aggregate(HoodieData<I> data, O zeroValue, Functions.Function2<O,I,O> seqOp, Functions.Function2<O,O,O> combOp)`: Aggregates the elements of each partition, and then the results for all the partitions, using the given combine functions and a neutral "zero value". |
| `<I,O> O` | `HoodieLocalEngineContext.aggregate(HoodieData<I> data, O zeroValue, Functions.Function2<O,I,O> seqOp, Functions.Function2<O,O,O> combOp)` |
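
The two-function aggregation described for `aggregate` can be sketched without Hudi at all. The example below is a minimal illustration under stated assumptions: partitions are modeled as a `List<List<I>>`, `java.util.function.BiFunction` stands in for `Functions.Function2`, and the class name `AggregateSketch` is hypothetical. It is not the implementation used by `HoodieEngineContext` or `HoodieLocalEngineContext`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

// Plain-Java sketch of "aggregate per partition, then combine"; NOT the Hudi implementation.
class AggregateSketch {

    // seqOp folds one element into a partition-local accumulator;
    // combOp merges two accumulators; zeroValue is the neutral starting point.
    static <I, O> O aggregate(List<List<I>> partitions,
                              O zeroValue,
                              BiFunction<O, I, O> seqOp,
                              BiFunction<O, O, O> combOp) {
        O result = zeroValue;
        for (List<I> partition : partitions) {
            O partial = zeroValue;
            for (I element : partition) {
                partial = seqOp.apply(partial, element);
            }
            result = combOp.apply(result, partial);
        }
        return result;
    }

    // Sums the lengths of all strings across the (hypothetical) partitions.
    static int totalLength(List<List<String>> partitions) {
        return aggregate(partitions, 0, (acc, s) -> acc + s.length(), Integer::sum);
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
            Arrays.asList("a", "bb"),
            Arrays.asList("ccc"));
        // prints 6
        System.out.println(totalLength(partitions));
    }
}
```

Because the zero value seeds every partition as well as the final merge, it should be neutral for `combOp` (0 for a sum, an empty collection for a concatenation); otherwise the result would depend on how the data happens to be partitioned.
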
| Modifier and Type | Method and Description |
|---|---|
| `static HoodieData<HoodieRecord>` | `HoodieTableMetadataUtil.convertFilesToBloomFilterRecords(HoodieEngineContext engineContext, Map<String,List<String>> partitionToDeletedFiles, Map<String,Map<String,Long>> partitionToAppendedFiles, String instantTime, HoodieTableMetaClient dataMetaClient, int bloomIndexParallelism, String bloomFilterType)`: Convert added and deleted files metadata to bloom filter index records. |
| `static HoodieData<HoodieRecord>` | `HoodieTableMetadataUtil.convertFilesToColumnStatsRecords(HoodieEngineContext engineContext, Map<String,List<String>> partitionToDeletedFiles, Map<String,Map<String,Long>> partitionToAppendedFiles, HoodieTableMetaClient dataMetaClient, boolean isColumnStatsIndexEnabled, int columnStatsIndexParallelism, List<String> targetColumnsForColumnStatsIndex)`: Convert added and deleted action metadata to column stats index records. |
| `static HoodieData<HoodieRecord>` | `HoodieTableMetadataUtil.convertFilesToPartitionStatsRecords(HoodieEngineContext engineContext, List<HoodieTableMetadataUtil.DirectoryInfo> partitionInfoList, HoodieMetadataConfig metadataConfig, HoodieTableMetaClient dataTableMetaClient)` |
| `static HoodieData<HoodieRecord>` | `HoodieTableMetadataUtil.convertMetadataToBloomFilterRecords(HoodieCleanMetadata cleanMetadata, HoodieEngineContext engineContext, String instantTime, int bloomIndexParallelism)`: Convert clean metadata to bloom filter index records. |
| `static HoodieData<HoodieRecord>` | `HoodieTableMetadataUtil.convertMetadataToBloomFilterRecords(HoodieEngineContext context, HoodieConfig hoodieConfig, HoodieCommitMetadata commitMetadata, String instantTime, HoodieTableMetaClient dataMetaClient, String bloomFilterType, int bloomIndexParallelism)`: Convert commit action metadata to bloom filter records. |
| `static HoodieData<HoodieRecord>` | `HoodieTableMetadataUtil.convertMetadataToColumnStatsRecords(HoodieCleanMetadata cleanMetadata, HoodieEngineContext engineContext, HoodieTableMetaClient dataMetaClient, boolean isColumnStatsIndexEnabled, int columnStatsIndexParallelism, List<String> targetColumnsForColumnStatsIndex)`: Convert clean metadata to column stats index records. |
| `static HoodieData<HoodieRecord>` | `HoodieTableMetadataUtil.convertMetadataToColumnStatsRecords(HoodieCommitMetadata commitMetadata, HoodieEngineContext engineContext, HoodieTableMetaClient dataMetaClient, boolean isColumnStatsIndexEnabled, int columnStatsIndexParallelism, List<String> targetColumnsForColumnStatsIndex)` |
| `static HoodieData<HoodieRecord>` | `HoodieTableMetadataUtil.convertMetadataToPartitionStatsRecords(HoodieCommitMetadata commitMetadata, HoodieEngineContext engineContext, HoodieTableMetaClient dataMetaClient, HoodieMetadataConfig metadataConfig)` |
| `HoodieData<HoodieRecord<HoodieMetadataPayload>>` | `FileSystemBackedTableMetadata.getRecordsByKeyPrefixes(List<String> keyPrefixes, String partitionName, boolean shouldLoadInMemory)` |
| `HoodieData<HoodieRecord<HoodieMetadataPayload>>` | `HoodieBackedTableMetadata.getRecordsByKeyPrefixes(List<String> keyPrefixes, String partitionName, boolean shouldLoadInMemory)` |
| `HoodieData<HoodieRecord<HoodieMetadataPayload>>` | `HoodieTableMetadata.getRecordsByKeyPrefixes(List<String> keyPrefixes, String partitionName, boolean shouldLoadInMemory)`: Fetch records by key prefixes. |
| `static HoodieData<HoodieRecord>` | `HoodieTableMetadataUtil.readRecordKeysFromBaseFiles(HoodieEngineContext engineContext, HoodieConfig config, List<Pair<String,HoodieBaseFile>> partitionBaseFilePairs, boolean forDelete, int recordIndexMaxParallelism, StoragePath basePath, StorageConfiguration<?> configuration, String activeModule)`: Deprecated. |
| `static HoodieData<HoodieRecord>` | `HoodieTableMetadataUtil.readRecordKeysFromFileSlices(HoodieEngineContext engineContext, List<Pair<String,FileSlice>> partitionFileSlicePairs, boolean forDelete, int recordIndexMaxParallelism, String activeModule, HoodieTableMetaClient metaClient, EngineType engineType)`: Reads the record keys from the given file slices and returns a `HoodieData` of `HoodieRecord` to be updated in the metadata table. |
| `static HoodieData<HoodieRecord>` | `HoodieTableMetadataUtil.readSecondaryKeysFromBaseFiles(HoodieEngineContext engineContext, List<Pair<String,Pair<String,List<String>>>> partitionFiles, int secondaryIndexMaxParallelism, String activeModule, HoodieTableMetaClient metaClient, EngineType engineType, HoodieIndexDefinition indexDefinition)` |
| `static HoodieData<HoodieRecord>` | `HoodieTableMetadataUtil.readSecondaryKeysFromFileSlices(HoodieEngineContext engineContext, List<Pair<String,FileSlice>> partitionFileSlicePairs, int secondaryIndexMaxParallelism, String activeModule, HoodieTableMetaClient metaClient, EngineType engineType, HoodieIndexDefinition indexDefinition)` |
| Modifier and Type | Method and Description |
|---|---|
| `static Map<MetadataPartitionType,HoodieData<HoodieRecord>>` | `HoodieTableMetadataUtil.convertMetadataToRecords(HoodieEngineContext engineContext, HoodieCleanMetadata cleanMetadata, String instantTime, HoodieTableMetaClient dataMetaClient, List<MetadataPartitionType> enabledPartitionTypes, int bloomIndexParallelism, boolean isColumnStatsIndexEnabled, int columnStatsIndexParallelism, List<String> targetColumnsForColumnStatsIndex)`: Convert the clean action to metadata records. |
| `static Map<MetadataPartitionType,HoodieData<HoodieRecord>>` | `HoodieTableMetadataUtil.convertMetadataToRecords(HoodieEngineContext context, HoodieConfig hoodieConfig, HoodieCommitMetadata commitMetadata, String instantTime, HoodieTableMetaClient dataMetaClient, List<MetadataPartitionType> enabledPartitionTypes, String bloomFilterType, int bloomIndexParallelism, boolean isColumnStatsIndexEnabled, int columnStatsIndexParallelism, List<String> targetColumnsForColumnStatsIndex, HoodieMetadataConfig metadataConfig)`: Convert commit action to metadata records for the enabled partition types. |
| `static Map<MetadataPartitionType,HoodieData<HoodieRecord>>` | `HoodieTableMetadataUtil.convertMetadataToRecords(HoodieEngineContext engineContext, HoodieTableMetaClient dataTableMetaClient, HoodieRollbackMetadata rollbackMetadata, String instantTime)`: Convert rollback action metadata to metadata table records. |
| `static Map<MetadataPartitionType,HoodieData<HoodieRecord>>` | `HoodieTableMetadataUtil.convertMissingPartitionRecords(HoodieEngineContext engineContext, List<String> deletedPartitions, Map<String,Map<String,Long>> filesAdded, Map<String,List<String>> filesDeleted, String instantTime)` |
Copyright © 2024 The Apache Software Foundation. All rights reserved.