public abstract class BinaryInputFormat<T> extends FileInputFormat<T>
| Modifier and Type | Class and Description |
|---|---|
| protected class | BinaryInputFormat.BlockBasedInput: Reads blocks that end with a block info. The current implementation uses only int and not long. |

Nested classes/interfaces inherited from class FileInputFormat: FileInputFormat.AbstractConfigBuilder<T>, FileInputFormat.ConfigBuilder, FileInputFormat.FileBaseStatistics, FileInputFormat.InputSplitOpenThread

| Modifier and Type | Field and Description |
|---|---|
| static String | BLOCK_SIZE_PARAMETER_KEY: The config parameter which defines the fixed length of a record. |
| static long | NATIVE_BLOCK_SIZE |

Fields inherited from class FileInputFormat: currentSplit, ENUMERATE_NESTED_FILES_FLAG, enumerateNestedFiles, filePath, INFLATER_INPUT_STREAM_FACTORIES, minSplitSize, numSplits, openTimeout, READ_WHOLE_SPLIT_FLAG, splitLength, splitStart, stream, unsplittable

| Constructor and Description |
|---|
| BinaryInputFormat() |

| Modifier and Type | Method and Description |
|---|---|
| void | configure(Configuration parameters): Configures the file input format by reading the file path from the configuration. |
| BlockInfo | createBlockInfo() |
| FileInputSplit[] | createInputSplits(int minNumSplits): Computes the input splits for the file. |
| protected org.apache.flink.api.common.io.BinaryInputFormat.SequentialStatistics | createStatistics(List<FileStatus> files, FileInputFormat.FileBaseStatistics stats): Fills in the statistics. |
| protected abstract T | deserialize(T reuse, DataInputView dataInput) |
| protected List<FileStatus> | getFiles() |
| protected FileInputSplit[] | getInputSplits() |
| org.apache.flink.api.common.io.BinaryInputFormat.SequentialStatistics | getStatistics(BaseStatistics cachedStats): Obtains basic file statistics containing only file size. |
| T | nextRecord(T record): Tries to read the next record from the input. |
| void | open(FileInputSplit split): Opens an input stream to the file defined in the input format. |
| boolean | reachedEnd(): Checks whether the end of the input has been reached. |
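
The open, reachedEnd, and nextRecord methods in the table above suggest a simple read loop, with deserialize supplying the record decoding. The following is a minimal, self-contained sketch of that pattern using only java.io. MiniBinaryFormat, Reading, ReadingFormat, and ReadLoopSketch are hypothetical stand-ins, not Flink classes, and details such as passing a record count to open or returning null at the end of the input are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that mirrors the method names above; NOT the Flink class.
abstract class MiniBinaryFormat<T> {
    private DataInputStream in;
    private int remaining; // records still unread (assumed to be known up front)

    // Must be called before reachedEnd() or nextRecord().
    void open(byte[] data, int recordCount) {
        this.in = new DataInputStream(new ByteArrayInputStream(data));
        this.remaining = recordCount;
    }

    boolean reachedEnd() {
        return remaining <= 0;
    }

    // Returns null once the input is exhausted (an assumption of this sketch).
    T nextRecord(T reuse) throws IOException {
        if (reachedEnd()) {
            return null;
        }
        remaining--;
        return deserialize(reuse, in);
    }

    // Subclasses decode exactly one record into the reusable object.
    protected abstract T deserialize(T reuse, DataInput dataInput) throws IOException;
}

// Hypothetical record type: one int followed by one double.
final class Reading {
    int id;
    double value;
}

final class ReadingFormat extends MiniBinaryFormat<Reading> {
    @Override
    protected Reading deserialize(Reading reuse, DataInput dataInput) throws IOException {
        reuse.id = dataInput.readInt();
        reuse.value = dataInput.readDouble();
        return reuse;
    }
}

final class ReadLoopSketch {
    // Serializes the records in the layout that ReadingFormat expects.
    static byte[] encode(int[] ids, double[] values) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (int i = 0; i < ids.length; i++) {
                out.writeInt(ids[i]);
                out.writeDouble(values[i]);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Drives the open / reachedEnd / nextRecord loop with a single reused object.
    static List<String> readAll(byte[] data, int recordCount) {
        ReadingFormat format = new ReadingFormat();
        format.open(data, recordCount);
        List<String> result = new ArrayList<>();
        Reading reuse = new Reading();
        try {
            while (!format.reachedEnd()) {
                Reading r = format.nextRecord(reuse);
                result.add(r.id + "=" + r.value); // copy out before the object is reused
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] data = encode(new int[] {1, 2}, new double[] {0.5, 1.5});
        System.out.println(readAll(data, 2));
    }
}
```

Because the same Reading instance is handed back on every call, the loop copies each record's contents out before requesting the next one.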
Methods inherited from class FileInputFormat: acceptFile, close, configureFileFormat, decorateInputStream, extractFileExtension, getFilePath, getFileStats, getInflaterInputStreamFactory, getInputSplitAssigner, getMinSplitSize, getNumSplits, getOpenTimeout, getSplitLength, getSplitStart, registerInflaterInputStreamFactory, setFilePath, setFilePath, setMinSplitSize, setNumSplits, setOpenTimeout, testForUnsplittable, toString

Methods inherited from class RichInputFormat: getRuntimeContext, setRuntimeContext

public static final String BLOCK_SIZE_PARAMETER_KEY

public static final long NATIVE_BLOCK_SIZE

public void configure(Configuration parameters)
Description copied from class: FileInputFormat
Configures the file input format by reading the file path from the configuration.
Specified by: configure in interface InputFormat<T,FileInputSplit>
Overrides: configure in class FileInputFormat<T>
Parameters: parameters - The configuration with all parameters.
See Also: InputFormat.configure(org.apache.flink.configuration.Configuration)

public FileInputSplit[] createInputSplits(int minNumSplits) throws IOException
Description copied from class: FileInputFormat
Computes the input splits for the file.
Specified by: createInputSplits in interface InputFormat<T,FileInputSplit>
Specified by: createInputSplits in interface InputSplitSource<FileInputSplit>
Overrides: createInputSplits in class FileInputFormat<T>
Parameters: minNumSplits - The minimum desired number of file splits.
Throws: IOException - Thrown when the creation of the splits was erroneous.
See Also: InputFormat.createInputSplits(int)

protected List<FileStatus> getFiles() throws IOException
Throws: IOException

public org.apache.flink.api.common.io.BinaryInputFormat.SequentialStatistics getStatistics(BaseStatistics cachedStats)
Description copied from class: FileInputFormat
Obtains basic file statistics containing only file size.
Specified by: getStatistics in interface InputFormat<T,FileInputSplit>
Overrides: getStatistics in class FileInputFormat<T>
Parameters: cachedStats - The statistics that were cached. May be null.
See Also: InputFormat.getStatistics(org.apache.flink.api.common.io.statistics.BaseStatistics)

protected FileInputSplit[] getInputSplits() throws IOException
Throws: IOException

public BlockInfo createBlockInfo()
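
createBlockInfo() is listed without a description, and the nested BlockBasedInput class is described only as dealing with a block info at the end of the blocks. As a rough illustration of what such a layout can look like, the sketch below stores a small fixed-size trailer in the last bytes of each block. MiniBlockInfo, BlockTrailerSketch, the two trailer fields, and the 16-byte trailer size are all assumptions invented for this example; they are not taken from the library and should not be read as its on-disk format. Block sizes are plain int values here, in line with the note that the current implementation uses only int.

```java
import java.nio.ByteBuffer;

// Hypothetical trailer; the field names and sizes are assumptions of this sketch.
final class MiniBlockInfo {
    static final int INFO_SIZE = 2 * Long.BYTES; // two long fields, 16 bytes

    final long recordCount;      // number of records stored in the block
    final long firstRecordStart; // offset of the first record within the block

    MiniBlockInfo(long recordCount, long firstRecordStart) {
        this.recordCount = recordCount;
        this.firstRecordStart = firstRecordStart;
    }
}

final class BlockTrailerSketch {
    // Bytes left for record data once the trailer is reserved.
    static int payloadCapacity(int blockSize) {
        return blockSize - MiniBlockInfo.INFO_SIZE;
    }

    // Builds one fixed-size block: payload first, zero padding, trailer last.
    static byte[] writeBlock(byte[] payload, MiniBlockInfo info, int blockSize) {
        if (payload.length > payloadCapacity(blockSize)) {
            throw new IllegalArgumentException("payload does not fit into the block");
        }
        ByteBuffer buf = ByteBuffer.allocate(blockSize); // zero-filled by default
        buf.put(payload);
        buf.position(blockSize - MiniBlockInfo.INFO_SIZE);
        buf.putLong(info.recordCount);
        buf.putLong(info.firstRecordStart);
        return buf.array();
    }

    // Reads the trailer back from the last INFO_SIZE bytes of the block.
    static MiniBlockInfo readInfo(byte[] block) {
        ByteBuffer buf = ByteBuffer.wrap(block);
        buf.position(block.length - MiniBlockInfo.INFO_SIZE);
        long recordCount = buf.getLong();
        long firstRecordStart = buf.getLong();
        return new MiniBlockInfo(recordCount, firstRecordStart);
    }

    public static void main(String[] args) {
        byte[] block = writeBlock(new byte[] {1, 2, 3}, new MiniBlockInfo(3L, 0L), 64);
        MiniBlockInfo info = readInfo(block);
        System.out.println(info.recordCount + " records, first at " + info.firstRecordStart);
    }
}
```

Keeping the trailer at a fixed distance from the end of the block means a reader can locate it from the block size alone, without scanning the payload.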
protected org.apache.flink.api.common.io.BinaryInputFormat.SequentialStatistics createStatistics(List<FileStatus> files, FileInputFormat.FileBaseStatistics stats) throws IOException
Fills in the statistics.
Parameters: files - The files that are associated with this block input format.
Parameters: stats - The pre-filled statistics.
Throws: IOException

public void open(FileInputSplit split) throws IOException
Description copied from class: FileInputFormat
Opens an input stream to the file defined in the input format. The stream is actually opened in an asynchronous thread to make sure any interruptions to the thread working on the input format do not reach the file system.
Specified by: open in interface InputFormat<T,FileInputSplit>
Overrides: open in class FileInputFormat<T>
Parameters: split - The split to be opened.
Throws: IOException - Thrown if the split could not be opened due to an I/O problem.

public boolean reachedEnd() throws IOException
Description copied from interface: InputFormat
Method used to check if the end of the input is reached. When this method is called, the input format is guaranteed to be opened.
Throws: IOException - Thrown if an I/O error occurred.

public T nextRecord(T record) throws IOException
Description copied from interface: InputFormat
Tries to read the next record from the input. When this method is called, the input format is guaranteed to be opened.
Parameters: record - Object that may be reused.
Throws: IOException - Thrown if an I/O error occurred.

protected abstract T deserialize(T reuse, DataInputView dataInput) throws IOException
Throws: IOException

Copyright © 2014–2016 The Apache Software Foundation. All rights reserved.