public class CopyOnWriteInputFormat
extends org.apache.flink.api.common.io.FileInputFormat<org.apache.flink.table.data.RowData>
A FileInputFormat that reads RowData records from Parquet files.
Note: Modeled on org.apache.flink.formats.parquet.ParquetFileSystemFormatFactory.ParquetInputFormat from Flink release 1.11.2, with added support for TIMESTAMP_MILLIS.
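For context, Parquet's TIMESTAMP_MILLIS annotation stores a timestamp as a 64-bit count of milliseconds since the Unix epoch. The sketch below uses only java.time to show how such a value might be decoded; it is not this class's code. The names `TimestampMillisSketch` and `decode` are hypothetical, and the reading of the `utcTimestamp` flag (UTC when true, the JVM default zone when false) is an assumption that this page does not confirm.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TimestampMillisSketch {

    /**
     * Decodes an epoch-millisecond value into a local date-time.
     * Assumption: utcTimestamp == true means "interpret in UTC",
     * false means "interpret in the JVM default time zone".
     */
    static LocalDateTime decode(long epochMillis, boolean utcTimestamp) {
        ZoneId zone = utcTimestamp ? ZoneOffset.UTC : ZoneId.systemDefault();
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), zone);
    }

    public static void main(String[] args) {
        long millis = 1_600_000_000_123L; // made-up sample value

        // Same stored value, two interpretations.
        System.out.println("UTC:   " + decode(millis, true));
        System.out.println("Local: " + decode(millis, false));
    }
}
```

The two lines differ only when the JVM default zone is not UTC, which is why a flag of this kind matters when files are written and read in different environments.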
Note: Overrides createInputSplits(int) from the parent class to change how the FileSystem is created:
it calls FSUtils.getFs(java.lang.String, org.apache.hadoop.conf.Configuration) to obtain a plugin filesystem.
See Also:
ParquetSplitReaderUtil, Serialized Form

| Constructor and Description |
|---|
| CopyOnWriteInputFormat(org.apache.flink.core.fs.Path[] paths, String[] fullFieldNames, org.apache.flink.table.types.DataType[] fullFieldTypes, int[] selectedFields, String partDefaultName, long limit, org.apache.hadoop.conf.Configuration conf, boolean utcTimestamp) |
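
The constructor pairs a full schema (`fullFieldNames`, `fullFieldTypes`) with `selectedFields`. A plausible reading, assumed here rather than stated on this page, is that each entry of `selectedFields` is an index into the full field arrays. The sketch below illustrates that kind of index-based projection in plain Java; `ProjectionSketch` and `project` are hypothetical names, and the field names are made up.

```java
import java.util.Arrays;

public class ProjectionSketch {

    /**
     * Picks the names at the given positions, in the order requested.
     * Assumption: selectedFields holds zero-based indices into fullFieldNames.
     */
    static String[] project(String[] fullFieldNames, int[] selectedFields) {
        String[] projected = new String[selectedFields.length];
        for (int i = 0; i < selectedFields.length; i++) {
            projected[i] = fullFieldNames[selectedFields[i]];
        }
        return projected;
    }

    public static void main(String[] args) {
        String[] full = {"uuid", "name", "age", "ts"}; // illustrative schema
        int[] selected = {3, 0};                        // read only "ts" and "uuid"

        System.out.println(Arrays.toString(project(full, selected)));
    }
}
```

Under this reading, the order of `selectedFields` would also determine the order of fields in the records that are produced.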
| Modifier and Type | Method and Description |
|---|---|
| boolean | acceptFile(org.apache.hadoop.fs.FileStatus fileStatus) A simple hook to filter files and directories from the input. |
| void | close() |
| org.apache.flink.core.fs.FileInputSplit[] | createInputSplits(int minNumSplits) |
| org.apache.flink.table.data.RowData | nextRecord(org.apache.flink.table.data.RowData reuse) |
| void | open(org.apache.flink.core.fs.FileInputSplit fileSplit) |
| boolean | reachedEnd() |
| void | setFilesFilter(org.apache.flink.api.common.io.FilePathFilter filesFilter) |
| boolean | supportsMultiPaths() |
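
The open, reachedEnd, nextRecord, and close methods listed above follow the usual input-format read loop: open a split, pull records until the end is reached, then close. The sketch below mimics that loop with a hypothetical stand-in interface (`SplitReader`) and an in-memory reader (`ListReader`) so that it runs without Flink on the classpath. Treating `limit` as a cap on the total number of records emitted across all splits is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReadLoopSketch {

    /** Hypothetical stand-in for the four lifecycle methods; not a Flink type. */
    interface SplitReader<T, S> {
        void open(S split);
        boolean reachedEnd();
        T nextRecord(T reuse);
        void close();
    }

    /** In-memory reader: each "split" is simply a list of strings. */
    static class ListReader implements SplitReader<String, List<String>> {
        private final long limit;   // assumed: max records across all splits
        private long emitted;       // records handed out so far
        private Iterator<String> current;

        ListReader(long limit) {
            this.limit = limit;
        }

        @Override
        public void open(List<String> split) {
            current = split.iterator();
        }

        @Override
        public boolean reachedEnd() {
            return emitted >= limit || !current.hasNext();
        }

        @Override
        public String nextRecord(String reuse) {
            emitted++;
            return current.next();
        }

        @Override
        public void close() {
            current = null;
        }
    }

    /** Drives the open / reachedEnd / nextRecord / close cycle for every split. */
    static <T, S> List<T> readAll(SplitReader<T, S> reader, List<S> splits) {
        List<T> out = new ArrayList<>();
        for (S split : splits) {
            reader.open(split);
            try {
                while (!reader.reachedEnd()) {
                    out.add(reader.nextRecord(null));
                }
            } finally {
                reader.close(); // always release the split, even on failure
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> splits =
                Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c", "d"));
        System.out.println(readAll(new ListReader(3), splits));
    }
}
```

In the real API these methods can throw IOException and the caller is the Flink runtime rather than user code; the stand-in omits both to keep the control flow visible.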

Methods inherited from class org.apache.flink.api.common.io.FileInputFormat:
acceptFile, configure, decorateInputStream, extractFileExtension, getFilePath, getFilePaths, getFileStats, getFileStats, getInflaterInputStreamFactory, getInputSplitAssigner, getMinSplitSize, getNestedFileEnumeration, getNumSplits, getOpenTimeout, getSplitLength, getSplitStart, getStatistics, registerInflaterInputStreamFactory, setFilePath, setFilePath, setFilePaths, setFilePaths, setMinSplitSize, setNestedFileEnumeration, setNumSplits, setOpenTimeout, testForUnsplittable, toString

public void open(org.apache.flink.core.fs.FileInputSplit fileSplit) throws IOException
Specified by: open in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.table.data.RowData,org.apache.flink.core.fs.FileInputSplit>
Overrides: open in class org.apache.flink.api.common.io.FileInputFormat<org.apache.flink.table.data.RowData>
Throws: IOException

public org.apache.flink.core.fs.FileInputSplit[] createInputSplits(int minNumSplits) throws IOException
Specified by: createInputSplits in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.table.data.RowData,org.apache.flink.core.fs.FileInputSplit>
Specified by: createInputSplits in interface org.apache.flink.core.io.InputSplitSource<org.apache.flink.core.fs.FileInputSplit>
Overrides: createInputSplits in class org.apache.flink.api.common.io.FileInputFormat<org.apache.flink.table.data.RowData>
Throws: IOException

public boolean supportsMultiPaths()
Overrides: supportsMultiPaths in class org.apache.flink.api.common.io.FileInputFormat<org.apache.flink.table.data.RowData>

public boolean reachedEnd() throws IOException
Throws: IOException

public org.apache.flink.table.data.RowData nextRecord(org.apache.flink.table.data.RowData reuse)

public void close() throws IOException
Specified by: close in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.table.data.RowData,org.apache.flink.core.fs.FileInputSplit>
Overrides: close in class org.apache.flink.api.common.io.FileInputFormat<org.apache.flink.table.data.RowData>
Throws: IOException

public void setFilesFilter(org.apache.flink.api.common.io.FilePathFilter filesFilter)
Overrides: setFilesFilter in class org.apache.flink.api.common.io.FileInputFormat<org.apache.flink.table.data.RowData>

public boolean acceptFile(org.apache.hadoop.fs.FileStatus fileStatus)
Parameters: fileStatus - The file status to check.

Copyright © 2022 The Apache Software Foundation. All rights reserved.