public class FilePathUtils extends Object
Modeled on Flink's PartitionPathUtils, but supports simple partition paths in addition to the Hive style.

| Constructor and Description |
|---|
| FilePathUtils() |
| Modifier and Type | Method and Description |
|---|---|
| static String[] | extractHivePartitionFields(org.apache.flink.configuration.Configuration conf): Extracts the Hive sync partition fields from the given configuration. |
| static String[] | extractPartitionKeys(org.apache.flink.configuration.Configuration conf): Extracts the partition keys from the given configuration. |
| static LinkedHashMap<String,String> | extractPartitionKeyValues(org.apache.hadoop.fs.Path currPath, boolean hivePartition, String[] partitionKeys): Generates the partition key-value mapping from the path. |
| static String | generatePartitionPath(LinkedHashMap<String,String> partitionKVs, boolean hivePartition, boolean sepSuffix): Makes the partition path from the partition spec. |
| static org.apache.hadoop.fs.FileStatus[] | getFileStatusRecursively(org.apache.hadoop.fs.Path path, int expectLevel, org.apache.hadoop.conf.Configuration conf) |
| static org.apache.hadoop.fs.FileStatus[] | getFileStatusRecursively(org.apache.hadoop.fs.Path path, int expectLevel, org.apache.hadoop.fs.FileSystem fs) |
| static List<Map<String,String>> | getPartitions(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration hadoopConf, List<String> partitionKeys, String defaultParName, boolean hivePartition): Returns the partition path keys and values as a list of maps; each map in the list maps a partition key name to its actual partition value. |
| static org.apache.hadoop.fs.Path[] | getReadPaths(org.apache.hadoop.fs.Path path, org.apache.flink.configuration.Configuration conf, org.apache.hadoop.conf.Configuration hadoopConf, List<String> partitionKeys): Returns all the file paths that are the parents of the data files. |
| static boolean | isHiveStylePartitioning(String path) |
| static org.apache.hadoop.fs.Path[] | partitionPath2ReadPath(org.apache.hadoop.fs.Path path, List<String> partitionKeys, List<Map<String,String>> partitionPaths, boolean hivePartition): Transforms the given partition key-value mapping to read paths. |
| static List<org.apache.flink.api.java.tuple.Tuple2<LinkedHashMap<String,String>,org.apache.hadoop.fs.Path>> | searchPartKeyValueAndPaths(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, boolean hivePartition, String[] partitionKeys): Searches all partitions in this path. |
| static org.apache.flink.core.fs.Path | toFlinkPath(org.apache.hadoop.fs.Path path): Transforms the Hadoop path to a Flink path. |
| static org.apache.flink.core.fs.Path[] | toFlinkPaths(org.apache.hadoop.fs.Path[] paths): Transforms the array of Hadoop paths to Flink paths. |
| static Set<String> | toRelativePartitionPaths(List<String> partitionKeys, List<Map<String,String>> partitionPaths, boolean hivePartition): Transforms the given partition key-value mapping to relative partition paths. |
| static String | unescapePathName(String path) |
| static LinkedHashMap<String,String> | validateAndReorderPartitions(Map<String,String> partitionKVs, List<String> partitionKeys): Reorders the partition key-value mapping based on the given partition key sequence. |
public static String generatePartitionPath(LinkedHashMap<String,String> partitionKVs, boolean hivePartition, boolean sepSuffix)
Parameters:
partitionKVs - The partition key value mapping
hivePartition - Whether the partition path is in Hive style, e.g. {partition key} = {partition value}
sepSuffix - Whether to append the path separator as a suffix

public static LinkedHashMap<String,String> extractPartitionKeyValues(org.apache.hadoop.fs.Path currPath, boolean hivePartition, String[] partitionKeys)
Parameters:
currPath - Partition file path
hivePartition - Whether the partition path is in Hive style
partitionKeys - Partition keys

public static List<org.apache.flink.api.java.tuple.Tuple2<LinkedHashMap<String,String>,org.apache.hadoop.fs.Path>> searchPartKeyValueAndPaths(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, boolean hivePartition, String[] partitionKeys)
Parameters:
fs - File system
path - Search path
hivePartition - Whether the partition path is in Hive style
partitionKeys - Partition keys

public static org.apache.hadoop.fs.FileStatus[] getFileStatusRecursively(org.apache.hadoop.fs.Path path,
int expectLevel,
org.apache.hadoop.conf.Configuration conf)
public static org.apache.hadoop.fs.FileStatus[] getFileStatusRecursively(org.apache.hadoop.fs.Path path,
int expectLevel,
org.apache.hadoop.fs.FileSystem fs)
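
The difference between the Hive style and the simple style that generatePartitionPath and extractPartitionKeyValues deal with can be illustrated with a small JDK-only sketch. This is not the FilePathUtils implementation: the class PartitionPathSketch and the methods joinPartitionPath and parsePartitionPath are hypothetical names, the sketch works on a relative partition path string rather than an org.apache.hadoop.fs.Path, and it assumes that no escaping of special characters is needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical JDK-only sketch; NOT the FilePathUtils implementation. */
final class PartitionPathSketch {

  /** Builds "k1=v1/k2=v2" (Hive style) or "v1/v2" (simple style). Assumption: no escaping. */
  static String joinPartitionPath(
      LinkedHashMap<String, String> partitionKVs, boolean hivePartition, boolean sepSuffix) {
    if (partitionKVs.isEmpty()) {
      return "";
    }
    StringBuilder sb = new StringBuilder();
    boolean first = true;
    for (Map.Entry<String, String> e : partitionKVs.entrySet()) {
      if (!first) {
        sb.append('/');
      }
      first = false;
      if (hivePartition) {
        // Hive style prefixes every value with its key.
        sb.append(e.getKey()).append('=');
      }
      sb.append(e.getValue());
    }
    if (sepSuffix) {
      sb.append('/');
    }
    return sb.toString();
  }

  /** Parses a RELATIVE partition path string (assumption: not a full file path). */
  static LinkedHashMap<String, String> parsePartitionPath(
      String partitionPath, boolean hivePartition, String[] partitionKeys) {
    String[] segments = partitionPath.split("/");
    if (!hivePartition && segments.length != partitionKeys.length) {
      throw new IllegalArgumentException(
          "Expected " + partitionKeys.length + " path levels but got " + segments.length);
    }
    LinkedHashMap<String, String> kvs = new LinkedHashMap<>();
    for (int i = 0; i < segments.length; i++) {
      String segment = segments[i];
      if (hivePartition) {
        // In Hive style the key is carried by the path segment itself.
        int idx = segment.indexOf('=');
        if (idx < 0) {
          throw new IllegalArgumentException("Not a key=value segment: " + segment);
        }
        kvs.put(segment.substring(0, idx), segment.substring(idx + 1));
      } else {
        // In simple style the key comes from the position in partitionKeys.
        kvs.put(partitionKeys[i], segment);
      }
    }
    return kvs;
  }

  public static void main(String[] args) {
    LinkedHashMap<String, String> kvs = new LinkedHashMap<>();
    kvs.put("year", "2022");
    kvs.put("month", "01");
    System.out.println(joinPartitionPath(kvs, true, false)); // year=2022/month=01
    System.out.println(joinPartitionPath(kvs, false, true)); // 2022/01/
    System.out.println(parsePartitionPath("2022/01", false, new String[] {"year", "month"}));
  }
}
```

In the simple style the path carries only values, which is why the partition keys have to be supplied separately when mapping a path back to key-value pairs.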
public static List<Map<String,String>> getPartitions(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration hadoopConf, List<String> partitionKeys, String defaultParName, boolean hivePartition)
For example, given the partition paths:
-- file:/// ... key1=val1/key2=val2/key3=val3
-- file:/// ... key1=val4/key2=val5/key3=val6
The returned list should be [{key1:val1, key2:val2, key3:val3}, {key1:val4, key2:val5, key3:val6}].
Parameters:
path - The base path
hadoopConf - The Hadoop configuration
partitionKeys - The partition key list
defaultParName - The default partition name for nulls
hivePartition - Whether the partition path is in Hive style

public static LinkedHashMap<String,String> validateAndReorderPartitions(Map<String,String> partitionKVs, List<String> partitionKeys)
Parameters:
partitionKVs - The partition key and value mapping
partitionKeys - The partition key list

public static org.apache.hadoop.fs.Path[] getReadPaths(org.apache.hadoop.fs.Path path,
org.apache.flink.configuration.Configuration conf,
org.apache.hadoop.conf.Configuration hadoopConf,
List<String> partitionKeys)
Parameters:
path - The base path
conf - The Flink configuration
hadoopConf - The Hadoop configuration
partitionKeys - The partition key list

public static org.apache.hadoop.fs.Path[] partitionPath2ReadPath(org.apache.hadoop.fs.Path path,
List<String> partitionKeys,
List<Map<String,String>> partitionPaths,
boolean hivePartition)
Parameters:
path - The base path
partitionKeys - The partition key list
partitionPaths - The partition key value mapping
hivePartition - Whether the partition path is in Hive style
See Also:
getReadPaths(org.apache.hadoop.fs.Path, org.apache.flink.configuration.Configuration, org.apache.hadoop.conf.Configuration, java.util.List<java.lang.String>)

public static Set<String> toRelativePartitionPaths(List<String> partitionKeys, List<Map<String,String>> partitionPaths, boolean hivePartition)
Parameters:
partitionKeys - The partition key list
partitionPaths - The partition key value mapping
hivePartition - Whether the partition path is in Hive style
See Also:
getReadPaths(org.apache.hadoop.fs.Path, org.apache.flink.configuration.Configuration, org.apache.hadoop.conf.Configuration, java.util.List<java.lang.String>)

public static org.apache.flink.core.fs.Path[] toFlinkPaths(org.apache.hadoop.fs.Path[] paths)

public static org.apache.flink.core.fs.Path toFlinkPath(org.apache.hadoop.fs.Path path)
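
The reordering described for validateAndReorderPartitions, and the conversion to relative paths described for toRelativePartitionPaths, can be sketched with plain JDK collections using the key1/key2/key3 example from getPartitions. This is an illustration under stated assumptions, not the library code: PartitionReorderSketch, reorderPartitions and toRelativePaths are hypothetical names, a missing key is reported with IllegalArgumentException (the exception type actually used by FilePathUtils is not stated in this page), and path escaping is ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical JDK-only sketch; NOT the FilePathUtils implementation. */
final class PartitionReorderSketch {

  /** Returns the key-value pairs in the order given by partitionKeys. */
  static LinkedHashMap<String, String> reorderPartitions(
      Map<String, String> partitionKVs, List<String> partitionKeys) {
    LinkedHashMap<String, String> ordered = new LinkedHashMap<>();
    for (String key : partitionKeys) {
      if (!partitionKVs.containsKey(key)) {
        // Assumption: an incomplete spec is treated as an error.
        throw new IllegalArgumentException(
            "Incomplete partition spec " + partitionKVs + " for keys " + partitionKeys);
      }
      ordered.put(key, partitionKVs.get(key));
    }
    return ordered;
  }

  /** Turns each partition spec into a relative path such as "key1=val1/key2=val2". */
  static Set<String> toRelativePaths(
      List<String> partitionKeys, List<Map<String, String>> partitionPaths, boolean hivePartition) {
    Set<String> paths = new LinkedHashSet<>();
    for (Map<String, String> spec : partitionPaths) {
      List<String> segments = new ArrayList<>();
      for (Map.Entry<String, String> e : reorderPartitions(spec, partitionKeys).entrySet()) {
        segments.add(hivePartition ? e.getKey() + "=" + e.getValue() : e.getValue());
      }
      paths.add(String.join("/", segments));
    }
    return paths;
  }

  public static void main(String[] args) {
    // The specs are deliberately filled in a different order than the key list.
    Map<String, String> p1 = new HashMap<>();
    p1.put("key3", "val3");
    p1.put("key1", "val1");
    p1.put("key2", "val2");
    Map<String, String> p2 = new HashMap<>();
    p2.put("key2", "val5");
    p2.put("key3", "val6");
    p2.put("key1", "val4");
    List<String> keys = Arrays.asList("key1", "key2", "key3");
    // Prints [key1=val1/key2=val2/key3=val3, key1=val4/key2=val5/key3=val6]
    System.out.println(toRelativePaths(keys, Arrays.asList(p1, p2), true));
  }
}
```

Reordering before joining matters because the directory levels follow the partition key sequence, while a generic Map gives no ordering guarantee.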
public static String[] extractPartitionKeys(org.apache.flink.configuration.Configuration conf)
Parameters:
conf - The Flink configuration

public static String[] extractHivePartitionFields(org.apache.flink.configuration.Configuration conf)
Parameters:
conf - The Flink configuration

public static boolean isHiveStylePartitioning(String path)
Copyright © 2022 The Apache Software Foundation. All rights reserved.