public class FileIndex extends Object
It caches the partition paths to avoid redundant lookups.
| Modifier and Type | Method and Description |
|---|---|
| `org.apache.hadoop.fs.FileStatus[]` | `getFilesInPartitions()` Returns all the file statuses under the table base path. |
| `List<String>` | `getOrBuildPartitionPaths()` Returns all the relative partition paths. |
| `List<Map<String,String>>` | `getPartitions(List<String> partitionKeys, String defaultParName, boolean hivePartition)` Returns the partition path keys and values as a list of maps; each map in the list maps a partition key name to its actual partition value. |
| `static FileIndex` | `instance(org.apache.hadoop.fs.Path path, org.apache.flink.configuration.Configuration conf, org.apache.flink.table.types.logical.RowType rowType)` |
| `void` | `reset()` Resets the state of the file index. |
| `void` | `setFilters(List<org.apache.flink.table.expressions.ResolvedExpression> filters)` Sets up pushed-down filters. |
| `void` | `setPartitionPaths(Set<String> partitionPaths)` Sets up explicit partition paths for pruning. |
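
The class description and the `getOrBuildPartitionPaths()` / `reset()` pair suggest a build-once, reuse-afterwards cache. The following is a minimal sketch of that pattern only, not the actual `FileIndex` implementation: the class `CachedPartitionPaths`, its `lister` supplier, and the method name `getOrBuild` are hypothetical names introduced for illustration, and the sketch assumes single-threaded use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in illustrating the caching pattern; not part of Hudi. */
final class CachedPartitionPaths {
  // Assumed to be an expensive call, e.g. a file system listing.
  private final Supplier<List<String>> lister;
  // Stays null until the paths are built for the first time.
  private List<String> cached;

  CachedPartitionPaths(Supplier<List<String>> lister) {
    this.lister = lister;
  }

  /** Returns the cached paths, invoking the lister only when nothing is cached. */
  List<String> getOrBuild() {
    if (cached == null) {
      cached = new ArrayList<>(lister.get());
    }
    return cached;
  }

  /** Drops the cached state so the next call lists the paths again. */
  void reset() {
    cached = null;
  }

  public static void main(String[] args) {
    int[] calls = {0};
    CachedPartitionPaths index = new CachedPartitionPaths(() -> {
      calls[0]++;
      return List.of("par1", "par2");
    });
    index.getOrBuild();
    index.getOrBuild();
    System.out.println(calls[0]); // 1: the second call was served from the cache
    index.reset();
    index.getOrBuild();
    System.out.println(calls[0]); // 2: reset forced a fresh listing
  }
}
```

The sketch trades freshness for fewer listing calls: callers see the same list until `reset()` is invoked.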
public static FileIndex instance(org.apache.hadoop.fs.Path path, org.apache.flink.configuration.Configuration conf, org.apache.flink.table.types.logical.RowType rowType)
public List<Map<String,String>> getPartitions(List<String> partitionKeys, String defaultParName, boolean hivePartition)
For example, given the partition paths:
-- file:/// ... key1=val1/key2=val2/key3=val3
-- file:/// ... key1=val4/key2=val5/key3=val6
The return list should be [{key1:val1, key2:val2, key3:val3}, {key1:val4, key2:val5, key3:val6}].
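
The mapping described above can be sketched in plain Java. This is an illustration under stated assumptions, not the code behind `getPartitions`: the class `PartitionPathParser` and its `parse` method are hypothetical, the input is assumed to be relative partition paths with one segment per partition key, and the handling of `defaultParName` is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Hudi: maps relative partition paths to key/value pairs. */
final class PartitionPathParser {

  static List<Map<String, String>> parse(
      List<String> partitionPaths, List<String> partitionKeys, boolean hivePartition) {
    List<Map<String, String>> result = new ArrayList<>();
    for (String path : partitionPaths) {
      String[] segments = path.split("/");
      if (segments.length != partitionKeys.size()) {
        throw new IllegalArgumentException("Unexpected partition depth: " + path);
      }
      // LinkedHashMap keeps the keys in partition order.
      Map<String, String> partition = new LinkedHashMap<>();
      for (int i = 0; i < segments.length; i++) {
        String key = partitionKeys.get(i);
        String prefix = key + "=";
        // Hive-style segments look like "key=value"; otherwise the segment is the value.
        String value = hivePartition && segments[i].startsWith(prefix)
            ? segments[i].substring(prefix.length())
            : segments[i];
        partition.put(key, value);
      }
      result.add(partition);
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> keys = List.of("key1", "key2", "key3");
    List<String> paths = List.of(
        "key1=val1/key2=val2/key3=val3",
        "key1=val4/key2=val5/key3=val6");
    // Prints [{key1=val1, key2=val2, key3=val3}, {key1=val4, key2=val5, key3=val6}]
    System.out.println(parse(paths, keys, true));
  }
}
```

With `hivePartition` set to `false`, the same helper treats each segment as a bare value, so `val1/val2/val3` would yield the same first map.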
Parameters:
partitionKeys - The partition key list
defaultParName - The default partition name for nulls
hivePartition - Whether the partition path is in Hive style
public org.apache.hadoop.fs.FileStatus[] getFilesInPartitions()
@VisibleForTesting public void reset()
public void setPartitionPaths(@Nullable Set<String> partitionPaths)
public void setFilters(List<org.apache.flink.table.expressions.ResolvedExpression> filters)
Copyright © 2022 The Apache Software Foundation. All rights reserved.