public class StreamerUtil extends Object
| Constructor and Description |
|---|
| StreamerUtil() |
| Modifier and Type | Method and Description |
|---|---|
| static TypedProperties | appendKafkaProps(FlinkStreamerConfig config) |
| static TypedProperties | buildProperties(List&lt;String&gt; props) |
| static HoodieTableMetaClient | createMetaClient(org.apache.flink.configuration.Configuration conf) — Creates the meta client. |
| static HoodieTableMetaClient | createMetaClient(String basePath, org.apache.hadoop.conf.Configuration hadoopConf) — Creates the meta client. |
| static Option&lt;Transformer&gt; | createTransformer(List&lt;String&gt; classNames) |
| static boolean | fileExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path) |
| static TypedProperties | flinkConf2TypedProperties(org.apache.flink.configuration.Configuration conf) — Converts the given Configuration to TypedProperties. |
| static String | generateBucketKey(String partitionPath, String fileId) — Generates the bucket ID using format {partition path}_{fileID}. |
| static HoodieIndexConfig | getIndexConfig(org.apache.flink.configuration.Configuration conf) — Returns the index config with the given configuration. |
| static String | getLastCompletedInstant(HoodieTableMetaClient metaClient) |
| static String | getLastPendingInstant(HoodieTableMetaClient metaClient) |
| static String | getLastPendingInstant(HoodieTableMetaClient metaClient, boolean reloadTimeline) |
| static org.apache.avro.Schema | getLatestTableSchema(String path, org.apache.hadoop.conf.Configuration hadoopConf) |
| static long | getMaxCompactionMemoryInBytes(org.apache.flink.configuration.Configuration conf) — Returns the max compaction memory in bytes with the given conf. |
| static HoodiePayloadConfig | getPayloadConfig(org.apache.flink.configuration.Configuration conf) — Returns the payload config with the given configuration. |
| static TypedProperties | getProps(FlinkStreamerConfig cfg) |
| static org.apache.avro.Schema | getSourceSchema(org.apache.flink.configuration.Configuration conf) |
| static org.apache.avro.Schema | getTableAvroSchema(HoodieTableMetaClient metaClient, boolean includeMetadataFields) |
| static Option&lt;HoodieTableConfig&gt; | getTableConfig(String basePath, org.apache.hadoop.conf.Configuration hadoopConf) — Returns the table config, or empty if the table does not exist. |
| static boolean | haveSuccessfulCommits(HoodieTableMetaClient metaClient) — Returns whether there are successful commits on the timeline. |
| static HoodieTableMetaClient | initTableIfNotExists(org.apache.flink.configuration.Configuration conf) — Initializes the table if it does not exist. |
| static HoodieTableMetaClient | initTableIfNotExists(org.apache.flink.configuration.Configuration conf, org.apache.hadoop.conf.Configuration hadoopConf) — Initializes the table if it does not exist. |
| static long | instantTimeDiffSeconds(String newInstantTime, String oldInstantTime) — Returns the time interval in seconds between the given instant times. |
| static boolean | isValidFile(org.apache.hadoop.fs.FileStatus fileStatus) — Returns whether the given file is in valid hoodie format. |
| static boolean | isWriteCommit(HoodieTableType tableType, HoodieInstant instant, HoodieTimeline timeline) — Returns whether the given instant is a data writing commit. |
| static Option&lt;String&gt; | medianInstantTime(String highVal, String lowVal) — Returns the median instant time between the two given instant times. |
| static HoodieTableMetaClient | metaClientForReader(org.apache.flink.configuration.Configuration conf, org.apache.hadoop.conf.Configuration hadoopConf) — Creates the meta client for the reader. |
| static boolean | partitionExists(String tablePath, String partitionPath, org.apache.hadoop.conf.Configuration hadoopConf) — Returns whether the hoodie partition exists under the given table path tablePath and partition path partitionPath. |
| static DFSPropertiesConfiguration | readConfig(org.apache.hadoop.conf.Configuration hadoopConfig, org.apache.hadoop.fs.Path cfgPath, List&lt;String&gt; overriddenProps) — Reads config from the properties file (`--props` option) and the command line (`--hoodie-conf` option). |
| static boolean | tableExists(String basePath, org.apache.hadoop.conf.Configuration hadoopConf) — Returns whether the hoodie table exists under the given path basePath. |
public static TypedProperties appendKafkaProps(FlinkStreamerConfig config)
public static TypedProperties getProps(FlinkStreamerConfig cfg)
public static TypedProperties buildProperties(List<String> props)
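The behavior of buildProperties can be sketched with plain JDK types. This is an assumption based on the `--hoodie-conf` convention (each entry is a `key=value` string); the real method returns Hudi's TypedProperties, a subclass of java.util.Properties, and may handle malformed entries differently.

```java
import java.util.List;
import java.util.Properties;

public class BuildPropertiesSketch {
    // Assumed behavior: each element of props is a "key=value" pair,
    // as passed via the --hoodie-conf command line option.
    static Properties buildProperties(List<String> props) {
        Properties properties = new Properties();
        for (String prop : props) {
            int idx = prop.indexOf('=');
            if (idx < 0) {
                throw new IllegalArgumentException("Invalid property: " + prop);
            }
            properties.setProperty(prop.substring(0, idx), prop.substring(idx + 1));
        }
        return properties;
    }

    public static void main(String[] args) {
        Properties p = buildProperties(List.of(
            "hoodie.datasource.write.recordkey.field=uuid",
            "hoodie.table.name=t1"));
        System.out.println(p.getProperty("hoodie.table.name")); // t1
    }
}
```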
public static org.apache.avro.Schema getSourceSchema(org.apache.flink.configuration.Configuration conf)
public static DFSPropertiesConfiguration readConfig(org.apache.hadoop.conf.Configuration hadoopConfig, org.apache.hadoop.fs.Path cfgPath, List<String> overriddenProps)
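The merge order implied by the parameter name overriddenProps can be sketched as follows. This is an assumption, not the real implementation: the real method reads the `--props` file through Hadoop's FileSystem API and returns a DFSPropertiesConfiguration; here a plain Properties object and a hypothetical merge helper stand in, assuming command-line values take precedence over the file.

```java
import java.util.List;
import java.util.Properties;

public class ReadConfigSketch {
    // Hypothetical helper: properties loaded from the --props file come first,
    // then --hoodie-conf overrides are applied on top of them.
    static Properties merge(Properties fromFile, List<String> overriddenProps) {
        Properties merged = new Properties();
        merged.putAll(fromFile);
        for (String prop : overriddenProps) {
            int idx = prop.indexOf('=');
            merged.setProperty(prop.substring(0, idx), prop.substring(idx + 1));
        }
        return merged;
    }

    public static void main(String[] args) {
        Properties file = new Properties();
        file.setProperty("hoodie.table.name", "from_file");
        Properties merged = merge(file, List.of("hoodie.table.name=from_cli"));
        System.out.println(merged.getProperty("hoodie.table.name")); // from_cli
    }
}
```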
public static HoodiePayloadConfig getPayloadConfig(org.apache.flink.configuration.Configuration conf)
public static HoodieIndexConfig getIndexConfig(org.apache.flink.configuration.Configuration conf)
public static TypedProperties flinkConf2TypedProperties(org.apache.flink.configuration.Configuration conf)
Converts the given Configuration to TypedProperties. The default values are also set up.
Parameters: conf - the Flink configuration

public static HoodieTableMetaClient initTableIfNotExists(org.apache.flink.configuration.Configuration conf) throws IOException
Initializes the table if it does not exist.
Parameters: conf - the configuration
Throws: IOException - if an error occurs when writing the metadata

public static HoodieTableMetaClient initTableIfNotExists(org.apache.flink.configuration.Configuration conf, org.apache.hadoop.conf.Configuration hadoopConf) throws IOException
Initializes the table if it does not exist.
Parameters: conf - the configuration
Throws: IOException - if an error occurs when writing the metadata

public static boolean tableExists(String basePath, org.apache.hadoop.conf.Configuration hadoopConf)
Returns whether the hoodie table exists under the given path basePath.

public static boolean partitionExists(String tablePath, String partitionPath, org.apache.hadoop.conf.Configuration hadoopConf)
Returns whether the hoodie partition exists under the given table path tablePath and partition path partitionPath.
Parameters: tablePath - base path of the table; partitionPath - the path of the partition; hadoopConf - the hadoop configuration

public static String generateBucketKey(String partitionPath, String fileId)
Generates the bucket ID using format {partition path}_{fileID}.
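The documented {partition path}_{fileID} format can be sketched directly; this is a minimal re-statement of the format described above, not the real StreamerUtil code, which may validate its inputs.

```java
public class BucketKeySketch {
    // Sketch of the documented bucket key format {partition path}_{fileID}.
    static String generateBucketKey(String partitionPath, String fileId) {
        return partitionPath + "_" + fileId;
    }

    public static void main(String[] args) {
        System.out.println(generateBucketKey("par1", "file-0001")); // par1_file-0001
    }
}
```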
public static HoodieTableMetaClient metaClientForReader(org.apache.flink.configuration.Configuration conf, org.apache.hadoop.conf.Configuration hadoopConf)
Creates the meta client for the reader. The streaming pipeline process is long-running, so an empty table path is allowed; the reader would then check and refresh the meta client.
See Also: StreamReadMonitoringFunction

public static HoodieTableMetaClient createMetaClient(String basePath, org.apache.hadoop.conf.Configuration hadoopConf)
Creates the meta client.

public static HoodieTableMetaClient createMetaClient(org.apache.flink.configuration.Configuration conf)
Creates the meta client.

public static Option<HoodieTableConfig> getTableConfig(String basePath, org.apache.hadoop.conf.Configuration hadoopConf)
Returns the table config, or empty if the table does not exist.

public static Option<String> medianInstantTime(String highVal, String lowVal)
Returns the median instant time between the two given instant times.

public static long instantTimeDiffSeconds(String newInstantTime, String oldInstantTime)
Returns the time interval in seconds between the given instant times.
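The two instant-time helpers above can be sketched with java.time, assuming Hudi's classic "yyyyMMddHHmmss" instant-time format; the real methods may use Hudi's own timeline parsing and a different rounding for the median.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class InstantTimeSketch {
    // Assumption: instant times are formatted as "yyyyMMddHHmmss".
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Seconds elapsed from oldInstantTime to newInstantTime.
    static long instantTimeDiffSeconds(String newInstantTime, String oldInstantTime) {
        return Duration.between(
            LocalDateTime.parse(oldInstantTime, FMT),
            LocalDateTime.parse(newInstantTime, FMT)).getSeconds();
    }

    // Midpoint between lowVal and highVal, rendered back as an instant time.
    static String medianInstantTime(String highVal, String lowVal) {
        long half = instantTimeDiffSeconds(highVal, lowVal) / 2;
        return LocalDateTime.parse(lowVal, FMT).plusSeconds(half).format(FMT);
    }

    public static void main(String[] args) {
        System.out.println(instantTimeDiffSeconds("20230101000130", "20230101000100")); // 30
        System.out.println(medianInstantTime("20230101000200", "20230101000100")); // 20230101000130
    }
}
```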
public static Option<Transformer> createTransformer(List<String> classNames) throws IOException
Throws: IOException

public static boolean isValidFile(org.apache.hadoop.fs.FileStatus fileStatus)
Returns whether the given file is in valid hoodie format.

public static String getLastPendingInstant(HoodieTableMetaClient metaClient)

public static String getLastPendingInstant(HoodieTableMetaClient metaClient, boolean reloadTimeline)

public static String getLastCompletedInstant(HoodieTableMetaClient metaClient)

public static boolean haveSuccessfulCommits(HoodieTableMetaClient metaClient)
Returns whether there are successful commits on the timeline.
Parameters: metaClient - the meta client

public static long getMaxCompactionMemoryInBytes(org.apache.flink.configuration.Configuration conf)
Returns the max compaction memory in bytes with the given conf.

public static org.apache.avro.Schema getTableAvroSchema(HoodieTableMetaClient metaClient, boolean includeMetadataFields) throws Exception
Throws: Exception

public static org.apache.avro.Schema getLatestTableSchema(String path, org.apache.hadoop.conf.Configuration hadoopConf)

public static boolean fileExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)

public static boolean isWriteCommit(HoodieTableType tableType, HoodieInstant instant, HoodieTimeline timeline)
Returns whether the given instant is a data writing commit.
Parameters: tableType - the table type; instant - the instant; timeline - the timeline

Copyright © 2023 The Apache Software Foundation. All rights reserved.