public class CsvInputFormat extends GenericCsvInputFormat<Record>
Configuration.
The number of fields to parse must be configured as well.
For each field, a data type must be specified using the FIELD_TYPE_PARAMETER_PREFIX config key.
The position within the text record can be configured for each field using the TEXT_POSITION_PARAMETER_PREFIX config key.
Either all text positions must be configured or none. If none is configured, the index of the config key is used.
The position of a value within the Record is the index of the config key.

See Also: Configuration, Record, Serialized Form

| Modifier and Type | Class and Description |
|---|---|
| protected static class | CsvInputFormat.AbstractConfigBuilder<T>: An abstract builder used to set parameters to the input format's configuration in a fluent way. |
| static class | CsvInputFormat.ConfigBuilder: A builder used to set parameters to the input format's configuration in a fluent way. |
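
The text-position rules from the class description can be sketched in plain Java. This is a minimal model under stated assumptions, not the class's actual code: the key prefix string, the class name `TextPositionSketch`, and the method `resolveTextPositions` are hypothetical names introduced only for illustration, and the configuration is modelled as a simple `Map`. It shows the "all or none" rule and the fallback to the config key index.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class TextPositionSketch {

    // Hypothetical prefix: the real value behind TEXT_POSITION_PARAMETER_PREFIX
    // is not shown on this page.
    static final String TEXT_POSITION_PREFIX = "hypothetical.textpos_";

    /**
     * For each config key index i in [0, numFields), returns the position in the
     * text record that field i is read from. Field i always lands at position i
     * of the resulting record.
     */
    static int[] resolveTextPositions(Map<String, String> config, int numFields) {
        int[] positions = new int[numFields];
        int configured = 0;
        for (int i = 0; i < numFields; i++) {
            String value = config.get(TEXT_POSITION_PREFIX + i);
            if (value != null) {
                positions[i] = Integer.parseInt(value);
                configured++;
            } else {
                // Fallback: the index of the config key is used.
                positions[i] = i;
            }
        }
        // Either all text positions must be configured or none.
        if (configured != 0 && configured != numFields) {
            throw new IllegalArgumentException(
                "Text positions configured for " + configured + " of " + numFields + " fields");
        }
        return positions;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put(TEXT_POSITION_PREFIX + 0, "3"); // record field 0 reads text column 3
        config.put(TEXT_POSITION_PREFIX + 1, "0"); // record field 1 reads text column 0
        System.out.println(Arrays.toString(resolveTextPositions(config, 2))); // prints [3, 0]
    }
}
```

In this sketch, configuring a position for only some of the fields is rejected with an `IllegalArgumentException`; how the real input format reports that error is not stated on this page.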
Nested classes inherited from class FileInputFormat: FileInputFormat.FileBaseStatistics, FileInputFormat.InputSplitOpenThread

Fields inherited from class DelimitedInputFormat: RECORD_DELIMITER

Fields inherited from class FileInputFormat: DEFLATE_SUFFIX, enumerateNestedFiles, filePath, minSplitSize, numSplits, openTimeout, READ_WHOLE_SPLIT_FLAG, splitLength, splitStart, stream, unsplittable

| Constructor and Description |
|---|
| CsvInputFormat() |
| CsvInputFormat(char fieldDelimiter) |
| CsvInputFormat(char fieldDelimiter, Class<? extends Value>... fields) |
| CsvInputFormat(Class<? extends Value>... fields) |

| Modifier and Type | Method and Description |
|---|---|
| void | configure(Configuration config) |
| static CsvInputFormat.ConfigBuilder | configureRecordFormat(FileDataSource target): Creates a configuration builder that can be used to set the input format's parameters to the config in a fluent fashion. |
| void | open(FileInputSplit split) |
| Record | readRecord(Record reuse, byte[] bytes, int offset, int numBytes) |
| void | setFields(int[] sourceFieldIndices, Class<? extends Value>[] fieldTypes) |
| void | setFieldTypes(Class<? extends Value>... fieldTypes) |
| void | setFieldTypesArray(Class<? extends Value>[] fieldTypes) |
Methods inherited from class GenericCsvInputFormat: getFieldDelimiter, getFieldParsers, getGenericFieldTypes, getNumberOfFieldsTotal, getNumberOfNonNullFields, isLenient, isSkippingFirstLineAsHeader, parseRecord, setFieldDelimiter, setFieldsGeneric, setFieldsGeneric, setFieldTypesGeneric, setLenient, setSkipFirstLineAsHeader, skipFields

Methods inherited from class DelimitedInputFormat: close, configureDelimitedFormat, getBufferSize, getDelimiter, getLineLengthLimit, getNumLineSamples, getStatistics, loadGloablConfigParams, nextRecord, reachedEnd, readLine, setBufferSize, setDelimiter, setDelimiter, setDelimiter, setDelimiter, setDelimiter, setLineLengthLimit, setNumLineSamples

Methods inherited from class FileInputFormat: acceptFile, configureFileFormat, createInputSplits, getFilePath, getFileStats, getInputSplitAssigner, getMinSplitSize, getNumSplits, getOpenTimeout, getSplitLength, getSplitStart, setFilePath, setFilePath, setMinSplitSize, setNumSplits, setOpenTimeout, testForUnsplittable, toString

public CsvInputFormat()

public CsvInputFormat(char fieldDelimiter)

public void configure(Configuration config)
Specified by: configure in interface InputFormat<Record,FileInputSplit>
Overrides: configure in class DelimitedInputFormat<Record>

public void open(FileInputSplit split) throws IOException
Specified by: open in interface InputFormat<Record,FileInputSplit>
Overrides: open in class GenericCsvInputFormat<Record>
Throws: IOException

public Record readRecord(Record reuse, byte[] bytes, int offset, int numBytes) throws ParseException
Specified by: readRecord in class DelimitedInputFormat<Record>
Throws: ParseException

public static CsvInputFormat.ConfigBuilder configureRecordFormat(FileDataSource target)
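
The readRecord signature passes a window (offset, numBytes) into a shared byte buffer rather than a separate array per record. The following is a rough, self-contained sketch of splitting such a window on a field delimiter. It is not the library's parsing code: the class `ReadRecordSketch`, the method `splitFields`, and the choice of returning strings decoded as UTF-8 are all assumptions made for illustration, and typed field parsing and Record reuse are left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ReadRecordSketch {

    /**
     * Splits the window bytes[offset, offset + numBytes) on a single-byte
     * field delimiter and returns the raw field texts in order.
     * Hypothetical helper, not part of the documented API.
     */
    static List<String> splitFields(byte[] bytes, int offset, int numBytes, char fieldDelimiter) {
        List<String> fields = new ArrayList<>();
        int end = offset + numBytes;
        int start = offset;
        for (int i = offset; i < end; i++) {
            if (bytes[i] == (byte) fieldDelimiter) {
                // Field spans [start, i); the delimiter itself is skipped.
                fields.add(new String(bytes, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        // Last field runs to the end of the window.
        fields.add(new String(bytes, start, end - start, StandardCharsets.UTF_8));
        return fields;
    }

    public static void main(String[] args) {
        // Two records in one buffer; the second starts at byte 6 and is 5 bytes long.
        byte[] buffer = "1|foo\n2|bar\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(splitFields(buffer, 6, 5, '|')); // prints [2, bar]
    }
}
```

Only the bytes inside the window are read, so the record delimiter and neighbouring records in the buffer do not leak into the result.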
Copyright © 2015 The Apache Software Foundation. All rights reserved.