Class CsvReader.CsvReaderBuilder
This builder is used to create configured instances of CsvReader. The default
configuration of this class adheres to RFC 4180:
- Field separator: , (comma)
- Quote character: " (double quotes)
- Comment strategy: CommentStrategy.NONE (as RFC 4180 doesn't handle comments)
- Comment character: # (hash) (in case comment strategy is enabled)
- Skip empty lines: true
- Allow extra fields: false
- Allow missing fields: false
- Allow extra characters after closing quotes: false
- Trim whitespaces around quotes: false
- Detect BOM header: false
- Max buffer size: 16,777,216 characters
The line delimiter (line-feed, carriage-return or the combination of both) is detected automatically and thus not configurable.
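As a rough illustration of what "detected automatically" means for the three line delimiters, the JDK-only sketch below splits text that mixes LF, CRLF and CR. It is not FastCSV's implementation; the class and method names are hypothetical, and unlike a CSV parser it does not keep line breaks that occur inside quoted fields.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of FastCSV.
final class LineDelimiterSketch {

    // String.lines() (Java 11+) treats LF, CR and CRLF as line terminators.
    // Assumption: no quoted field contains a line break.
    static List<String> splitLines(String text) {
        return text.lines().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // One input mixing all three delimiters yields four lines.
        System.out.println(splitLines("a,b\nc,d\r\ne,f\rg,h"));
    }
}
```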
-
Method Summary
Modifier and Type / Method / Description
- allowExtraCharsAfterClosingQuote(boolean allowExtraCharsAfterClosingQuote) - Specifies whether the presence of characters between a closing quote and a field separator or the end of a line should be treated as an error or not.
- allowExtraFields(boolean allowExtraFields) - Defines whether a CsvParseException should be thrown if records contain more fields than the first record.
- allowMissingFields(boolean allowMissingFields) - Defines whether a CsvParseException should be thrown if records contain fewer fields than the first record.
- <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, InputStream inputStream) - Constructs a new callback-based CsvReader for the specified input stream.
- <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, InputStream inputStream, Charset charset) - Constructs a new callback-based CsvReader for the specified input stream and character set.
- <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, Reader reader) - Constructs a new callback-based CsvReader for the specified reader.
- <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, String data) - Constructs a new callback-based CsvReader for the specified String.
- <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, Path file) - Constructs a new callback-based CsvReader for the specified file.
- <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, Path file, Charset charset) - Constructs a new callback-based CsvReader for the specified file and character set.
- commentCharacter(char commentCharacter) - Sets the commentCharacter used to comment lines.
- commentStrategy(CommentStrategy commentStrategy) - Sets the strategy that defines how (and if) commented lines should be handled (default: CommentStrategy.NONE as comments are not defined in RFC 4180).
- detectBomHeader(boolean detectBomHeader) - Defines if an optional BOM (Byte order mark) header should be detected.
- fieldSeparator(char fieldSeparator) - Sets the fieldSeparator used when reading CSV data.
- fieldSeparator(String fieldSeparator) - Sets the fieldSeparator used when reading CSV data.
- maxBufferSize(int maxBufferSize) - Defines the maximum buffer size used when parsing data.
- ofCsvRecord(InputStream inputStream) - Constructs a new index-based CsvReader for the specified input stream.
- ofCsvRecord(InputStream inputStream, Charset charset) - Constructs a new index-based CsvReader for the specified input stream and character set.
- ofCsvRecord(Reader reader) - Constructs a new index-based CsvReader for the specified reader.
- ofCsvRecord(String data) - Constructs a new index-based CsvReader for the specified String.
- ofCsvRecord(Path file) - Constructs a new index-based CsvReader for the specified file.
- ofCsvRecord(Path file, Charset charset) - Constructs a new index-based CsvReader for the specified file and character set.
- ofNamedCsvRecord(InputStream inputStream) - Constructs a new name-based CsvReader for the specified input stream.
- ofNamedCsvRecord(InputStream inputStream, Charset charset) - Constructs a new name-based CsvReader for the specified input stream and character set.
- ofNamedCsvRecord(Reader reader) - Constructs a new name-based CsvReader for the specified reader.
- ofNamedCsvRecord(String data) - Constructs a new name-based CsvReader for the specified String.
- ofNamedCsvRecord(Path file) - Constructs a new name-based CsvReader for the specified file.
- ofNamedCsvRecord(Path file, Charset charset) - Constructs a new name-based CsvReader for the specified file and character set.
- <T> T ofSingleCsvRecord(CsvCallbackHandler<T> callbackHandler, String data) - Convenience method to read a single CSV record using a custom callback handler.
- ofSingleCsvRecord(String data) - Convenience method to read a single CSV record from the specified string.
- quoteCharacter(char quoteCharacter) - Sets the quoteCharacter used when reading CSV data.
- skipEmptyLines(boolean skipEmptyLines) - Defines whether empty lines should be skipped when reading data.
- toString()
- trimWhitespacesAroundQuotes(boolean trimWhitespacesAroundQuotes) - Defines whether whitespaces before an opening quote and after a closing quote should be allowed and trimmed.
-
Method Details
-
fieldSeparator
Sets the fieldSeparator used when reading CSV data.
- Parameters:
  fieldSeparator - the field separator character (default: , - comma).
- Returns:
  This updated object, allowing additional method calls to be chained together.
-
fieldSeparator
Sets the fieldSeparator used when reading CSV data.
Unlike fieldSeparator(char), this method allows specifying a string of multiple characters to separate fields. The entire string is used as the delimiter, meaning fields are only separated if the full string matches. Individual characters within the string are not treated as separate delimiters.
If multiple characters are used, the less performant RelaxedCsvParser is used!
- Parameters:
  fieldSeparator - the field separator string (default: , - comma).
- Returns:
  This updated object, allowing additional method calls to be chained together.
- Throws:
  IllegalArgumentException - if fieldSeparator is null or empty
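The "entire string is the delimiter" rule can be illustrated with a small JDK-only sketch. This is not how FastCSV or its RelaxedCsvParser is implemented: the class and method names are hypothetical, and quoting is ignored entirely. It only shows that a lone | is kept as data when the separator is ||.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of FastCSV; ignores quoting on purpose.
final class SeparatorSketch {

    // Splits a line wherever the full separator string occurs.
    static List<String> splitOnSeparator(String line, String separator) {
        if (separator == null || separator.isEmpty()) {
            throw new IllegalArgumentException("separator must not be null or empty");
        }
        List<String> fields = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = line.indexOf(separator, start)) >= 0) {
            fields.add(line.substring(start, idx));
            start = idx + separator.length();
        }
        fields.add(line.substring(start)); // remainder after the last separator
        return fields;
    }

    public static void main(String[] args) {
        // The single '|' inside "b|c" is not a separator, so two fields result.
        System.out.println(splitOnSeparator("a||b|c", "||"));
    }
}
```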
-
quoteCharacter
Sets the quoteCharacter used when reading CSV data.
- Parameters:
  quoteCharacter - the character used to enclose fields (default: " - double quotes).
- Returns:
  This updated object, allowing additional method calls to be chained together.
-
commentStrategy
Sets the strategy that defines how (and if) commented lines should be handled (default: CommentStrategy.NONE as comments are not defined in RFC 4180).
If a comment strategy other than CommentStrategy.NONE is used, special parsing rules are applied for commented lines. FastCSV defines a comment as a line that starts with a comment character. No (whitespace) character is allowed before the comment character. Everything after the comment character until the end of the line is considered the comment value.
- Parameters:
  commentStrategy - the strategy for handling comments.
- Returns:
  This updated object, allowing additional method calls to be chained together.
-
commentCharacter
Sets the commentCharacter used to comment lines.
- Parameters:
  commentCharacter - the character used to comment lines (default: # - hash)
- Returns:
  This updated object, allowing additional method calls to be chained together.
-
skipEmptyLines
Defines whether empty lines should be skipped when reading data.
The default implementation interprets empty lines as lines that do not contain any data (no whitespace, no quotes, nothing).
Commented lines are not considered empty lines. Use commentStrategy(CommentStrategy) for handling commented lines.
- Parameters:
  skipEmptyLines - Whether empty lines should be skipped (default: true).
- Returns:
  This updated object, allowing additional method calls to be chained together.
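The definition of "empty" above is strict: a line containing only whitespace or only a pair of quotes still carries data. The JDK-only sketch below mimics that rule on pre-split lines. It is an illustration under the assumption that lines have already been separated, not FastCSV's code, and the names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of FastCSV.
final class EmptyLineSketch {

    // Drops only lines with zero characters; whitespace-only or quote-only lines are kept.
    static List<String> skipEmptyLines(List<String> lines) {
        return lines.stream()
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a", "", " ", "\"\"", "b");
        // Only the second entry (zero characters) is removed.
        System.out.println(skipEmptyLines(lines).size());
    }
}
```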
-
allowExtraFields
Defines whether a CsvParseException should be thrown if records contain more fields than the first record. The first record is defined as the first record that is not a comment or an empty line.
- Parameters:
  allowExtraFields - Whether extra fields should be allowed (default: false).
- Returns:
  This updated object, allowing additional method calls to be chained together.
-
allowMissingFields
Defines whether a CsvParseException should be thrown if records contain fewer fields than the first record. The first record is defined as the first record that is not a comment or an empty line.
Empty lines are allowed even if this is set to false.
- Parameters:
  allowMissingFields - Whether missing fields should be allowed (default: false).
- Returns:
  This updated object, allowing additional method calls to be chained together.
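The field-count rules of allowExtraFields and allowMissingFields can be summarised as a comparison against the first record. The sketch below is a JDK-only illustration with hypothetical names. It assumes comments and empty lines have already been removed, and it returns the index of the first offending record where the library would throw a CsvParseException.

```java
import java.util.List;

// Hypothetical helper, not part of FastCSV.
final class FieldCountSketch {

    // Returns the index of the first record whose field count violates the settings, or -1.
    // Assumption: 'records' contains no comments and no empty lines.
    static int firstViolation(List<List<String>> records,
                              boolean allowExtraFields,
                              boolean allowMissingFields) {
        if (records.isEmpty()) {
            return -1;
        }
        int expected = records.get(0).size(); // the first record defines the expected count
        for (int i = 1; i < records.size(); i++) {
            int actual = records.get(i).size();
            if (actual > expected && !allowExtraFields) {
                return i;
            }
            if (actual < expected && !allowMissingFields) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<List<String>> records = List.of(
                List.of("a", "b"),
                List.of("1", "2", "3"), // one extra field
                List.of("4"));          // one missing field
        System.out.println(firstViolation(records, false, false));
    }
}
```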
-
allowExtraCharsAfterClosingQuote
public CsvReader.CsvReaderBuilder allowExtraCharsAfterClosingQuote(boolean allowExtraCharsAfterClosingQuote)
Specifies whether the presence of characters between a closing quote and a field separator or the end of a line should be treated as an error or not.
Example: "a"b,"c"
If this is set to true, the value ab will be returned for the first field.
If this is set to false, a CsvParseException will be thrown.
- Parameters:
  allowExtraCharsAfterClosingQuote - allow extra characters after closing quotes (default: false).
- Returns:
  This updated object, allowing additional method calls to be chained together.
-
trimWhitespacesAroundQuotes
Defines whether whitespaces before an opening quote and after a closing quote should be allowed and trimmed.
RFC 4180 does not allow whitespaces between the quotation mark and the field separator or the end of the line. CSV data that contains such whitespaces causes two major problems for the parser:
- Whitespace before an opening quote causes the parser to treat the field as unquoted, leading it to misinterpret characters that are meant to be regular data as control characters, such as field separators, even though they are not intended as control characters.
- Whitespaces after a closing quote are appended to the field value (without the quote character). It is then unclear whether the whitespace is part of the field value or not, leading to potential misinterpretations of the data. A CsvParseException is thrown in this case unless allowExtraCharsAfterClosingQuote(boolean) is enabled.
Enabling this option allows the parser to handle such cases more leniently. In the following example, whitespaces are shown as underscores (_) for clarity.
A record _"x"_,_"foo,bar"_ would be parsed as two fields: x and foo,bar.
Whitespace in this context is defined as any character whose code point is less than or equal to U+0020 (the space character), which is the same logic as in Java's String.trim() method.
When enabling this, the less performant RelaxedCsvParser is used!
- Parameters:
  trimWhitespacesAroundQuotes - if whitespaces should be allowed/trimmed (default: false).
- Returns:
  This updated object, allowing additional method calls to be chained together.
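The whitespace definition above (code point less than or equal to U+0020) is narrower than Unicode whitespace. The JDK-only sketch below shows the consequence using String.trim() itself. The class and helper names are hypothetical, and nothing here calls FastCSV.

```java
// Hypothetical helper, not part of FastCSV.
final class TrimWhitespaceSketch {

    // Same rule as String.trim(): anything up to and including the space character.
    static boolean isTrimWhitespace(char c) {
        return c <= ' ';
    }

    public static void main(String[] args) {
        // Tab, space and line feed are all at or below the space character, so they are trimmed.
        System.out.println("[" + "\t x \n".trim() + "]");
        // The non-breaking space (code point 160) is above the space character and is kept.
        System.out.println("\u00A0x".trim().length());
    }
}
```

Under this rule, a non-breaking space next to a quote would presumably not be trimmed, since it lies above U+0020.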
-
detectBomHeader
Defines if an optional BOM (Byte order mark) header should be detected.
BOM detection only applies to InputStream and Path based data sources. It does not apply to Reader or String based data sources as they are already decoded.
Supported BOMs are: UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, UTF-32BE.
- Parameters:
  detectBomHeader - if detection should be enabled (default: false)
- Returns:
  This updated object, allowing additional method calls to be chained together.
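For orientation, the five supported BOMs correspond to well-known byte prefixes. The JDK-only sketch below maps a byte prefix to a charset name. It is not FastCSV's detection code, and the class and method names are hypothetical. The UTF-32LE check is placed before UTF-16LE because the UTF-32LE BOM begins with the same two bytes.

```java
// Hypothetical helper, not part of FastCSV.
final class BomSketch {

    // Returns the charset name indicated by a leading BOM, or null if none is present.
    static String detectCharsetName(byte[] data) {
        if (startsWith(data, 0xFF, 0xFE, 0x00, 0x00)) {
            return "UTF-32LE"; // must be tested before UTF-16LE (shared FF FE prefix)
        }
        if (startsWith(data, 0x00, 0x00, 0xFE, 0xFF)) {
            return "UTF-32BE";
        }
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) {
            return "UTF-8";
        }
        if (startsWith(data, 0xFF, 0xFE)) {
            return "UTF-16LE";
        }
        if (startsWith(data, 0xFE, 0xFF)) {
            return "UTF-16BE";
        }
        return null;
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) { // mask to compare as unsigned
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf8WithBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, (byte) 'a'};
        System.out.println(detectCharsetName(utf8WithBom));
    }
}
```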
-
maxBufferSize
Defines the maximum buffer size used when parsing data.
The size of the internal buffer is automatically adjusted to the needs of the parser. To protect against out-of-memory errors, its maximum size is limited.
The buffer is used for two purposes:
- Reading data from the underlying stream of data in chunks
- Storing the data of a single field before it is passed to the callback handler
Set a larger value only if you expect to read fields larger than the default limit. In that case you probably also need to adjust the maximum field size of the callback handler.
Set a smaller value if your runtime environment does not have enough memory available for the default value. Setting values smaller than 16,384 characters will most likely lead to performance degradation.
- Parameters:
  maxBufferSize - the maximum buffer size in characters (default: 16,777,216)
- Returns:
  This updated object, allowing additional method calls to be chained together.
- Throws:
  IllegalArgumentException - if maxBufferSize is not positive
-
ofSingleCsvRecord
Convenience method to read a single CSV record from the specified string.
If the string contains multiple records, only the first one is returned.
- Parameters:
  data - the CSV data to read; must not be null
- Returns:
  a single CsvRecord instance containing the parsed data
- Throws:
  NullPointerException - if data is null
  CsvParseException - if the data cannot be parsed
-
ofSingleCsvRecord
Convenience method to read a single CSV record using a custom callback handler.
If the string contains multiple records, only the first one is returned.
- Type Parameters:
  T - the type of the CSV record.
- Parameters:
  callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
  data - the CSV data to read; must not be null
- Returns:
  a single record as processed by the callback handler
- Throws:
  NullPointerException - if callbackHandler or data is null
  CsvParseException - if the data cannot be parsed
-
ofCsvRecord
Constructs a new index-based CsvReader for the specified input stream.
This is a convenience method for calling build(CsvCallbackHandler,InputStream) with CsvRecordHandler as the callback handler.
If detectBomHeader(boolean) is enabled, the character set is determined by the BOM header. By default, the character set is StandardCharsets.UTF_8.
- Parameters:
  inputStream - the input stream to read data from.
- Returns:
  a new CsvReader - never null.
- Throws:
  NullPointerException - if inputStream is null
-
ofCsvRecord
Constructs a new index-based CsvReader for the specified input stream and character set.
This is a convenience method for calling build(CsvCallbackHandler,InputStream,Charset) with CsvRecordHandler as the callback handler.
- Parameters:
  inputStream - the input stream to read data from.
  charset - the character set to use. If BOM header detection is enabled (via detectBomHeader(boolean)), this acts as a default when no BOM header was found.
- Returns:
  a new CsvReader - never null.
- Throws:
  NullPointerException - if inputStream or charset is null
-
ofCsvRecord
Constructs a new index-based CsvReader for the specified reader.
This is a convenience method for calling build(CsvCallbackHandler,Reader) with CsvRecordHandler as the callback handler.
detectBomHeader(boolean) has no effect on this method.
- Parameters:
  reader - the data source to read from.
- Returns:
  a new CsvReader - never null.
- Throws:
  NullPointerException - if reader is null
-
ofCsvRecord
Constructs a new index-based CsvReader for the specified String.
This is a convenience method for calling build(CsvCallbackHandler,String) with CsvRecordHandler as the callback handler.
detectBomHeader(boolean) has no effect on this method.
- Parameters:
  data - the data to read.
- Returns:
  a new CsvReader - never null.
- Throws:
  NullPointerException - if data is null
-
ofCsvRecord
Constructs a new index-based CsvReader for the specified file.
This is a convenience method for calling build(CsvCallbackHandler,Path) with CsvRecordHandler as the callback handler.
If detectBomHeader(boolean) is enabled, the character set is determined by the BOM header. By default, the character set is StandardCharsets.UTF_8.
- Parameters:
  file - the file to read data from.
- Returns:
  a new CsvReader - never null. Don't forget to close it!
- Throws:
  IOException - if an I/O error occurs.
  NullPointerException - if file is null
-
ofCsvRecord
Constructs a new index-based CsvReader for the specified file and character set.
This is a convenience method for calling build(CsvCallbackHandler,Path,Charset) with CsvRecordHandler as the callback handler.
- Parameters:
  file - the file to read data from.
  charset - the character set to use. If BOM header detection is enabled (via detectBomHeader(boolean)), this acts as a default when no BOM header was found.
- Returns:
  a new CsvReader - never null. Don't forget to close it!
- Throws:
  IOException - if an I/O error occurs.
  NullPointerException - if file or charset is null
-
ofNamedCsvRecord
Constructs a new name-based CsvReader for the specified input stream.
This is a convenience method for calling build(CsvCallbackHandler,InputStream) with NamedCsvRecordHandler as the callback handler.
If detectBomHeader(boolean) is enabled, the character set is determined by the BOM header. By default, the character set is StandardCharsets.UTF_8.
- Parameters:
  inputStream - the input stream to read data from.
- Returns:
  a new CsvReader - never null.
- Throws:
  NullPointerException - if inputStream is null
-
ofNamedCsvRecord
Constructs a new name-based CsvReader for the specified input stream and character set.
This is a convenience method for calling build(CsvCallbackHandler,InputStream,Charset) with NamedCsvRecordHandler as the callback handler.
- Parameters:
  inputStream - the input stream to read data from.
  charset - the character set to use. If BOM header detection is enabled (via detectBomHeader(boolean)), this acts as a default when no BOM header was found.
- Returns:
  a new CsvReader - never null.
- Throws:
  NullPointerException - if inputStream or charset is null
-
ofNamedCsvRecord
Constructs a new name-based CsvReader for the specified reader.
This is a convenience method for calling build(CsvCallbackHandler,Reader) with NamedCsvRecordHandler as the callback handler.
detectBomHeader(boolean) has no effect on this method.
- Parameters:
  reader - the data source to read from.
- Returns:
  a new CsvReader - never null.
- Throws:
  NullPointerException - if reader is null
-
ofNamedCsvRecord
Constructs a new name-based CsvReader for the specified String.
This is a convenience method for calling build(CsvCallbackHandler,String) with NamedCsvRecordHandler as the callback handler.
detectBomHeader(boolean) has no effect on this method.
- Parameters:
  data - the data to read.
- Returns:
  a new CsvReader - never null.
- Throws:
  NullPointerException - if data is null
-
ofNamedCsvRecord
Constructs a new name-based CsvReader for the specified file.
This is a convenience method for calling build(CsvCallbackHandler,Path) with NamedCsvRecordHandler as the callback handler.
If detectBomHeader(boolean) is enabled, the character set is determined by the BOM header. By default, the character set is StandardCharsets.UTF_8.
- Parameters:
  file - the file to read data from.
- Returns:
  a new CsvReader - never null. Don't forget to close it!
- Throws:
  IOException - if an I/O error occurs.
  NullPointerException - if file is null
-
ofNamedCsvRecord
Constructs a new name-based CsvReader for the specified file and character set.
This is a convenience method for calling build(CsvCallbackHandler,Path,Charset) with NamedCsvRecordHandler as the callback handler.
- Parameters:
  file - the file to read data from.
  charset - the character set to use. If BOM header detection is enabled (via detectBomHeader(boolean)), this acts as a default when no BOM header was found.
- Returns:
  a new CsvReader - never null. Don't forget to close it!
- Throws:
  IOException - if an I/O error occurs.
  NullPointerException - if file or charset is null
-
build
Constructs a new callback-based CsvReader for the specified input stream.
This is a convenience method for calling build(CsvCallbackHandler,InputStream,Charset).
This library uses built-in buffering, so you do not need to pass in a buffered InputStream implementation such as BufferedInputStream. Performance is likely even better if you do not.
If detectBomHeader(boolean) is enabled, the character set is determined by the BOM header. By default, the character set is StandardCharsets.UTF_8.
Use build(CsvCallbackHandler,Path) for optimal performance when reading files.
- Type Parameters:
  T - the type of the CSV record.
- Parameters:
  callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
  inputStream - the input stream to read data from.
- Returns:
  a new CsvReader - never null.
- Throws:
  NullPointerException - if callbackHandler or inputStream is null
-
build
public <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, InputStream inputStream, Charset charset)
Constructs a new callback-based CsvReader for the specified input stream and character set.
This library uses built-in buffering, so you do not need to pass in a buffered InputStream implementation such as BufferedInputStream. Performance is likely even better if you do not.
If detectBomHeader(boolean) is enabled, this method will immediately cause consumption of the input stream to read the BOM header and determine the character set.
Use build(CsvCallbackHandler,Path,Charset) for optimal performance when reading files.
- Type Parameters:
  T - the type of the CSV record.
- Parameters:
  callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
  inputStream - the input stream to read data from.
  charset - the character set to use. If BOM header detection is enabled (via detectBomHeader(boolean)), this acts as a default when no BOM header was found.
- Returns:
  a new CsvReader - never null.
- Throws:
  NullPointerException - if callbackHandler, inputStream or charset is null
-
build
Constructs a new callback-based CsvReader for the specified reader.
This library uses built-in buffering, so you do not need to pass in a buffered Reader implementation such as BufferedReader. Performance is likely even better if you do not.
Use build(CsvCallbackHandler,Path) for optimal performance when reading files and build(CsvCallbackHandler,String) when reading Strings.
detectBomHeader(boolean) has no effect on this method.
- Type Parameters:
  T - the type of the CSV record.
- Parameters:
  callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
  reader - the data source to read from.
- Returns:
  a new CsvReader - never null.
- Throws:
  NullPointerException - if callbackHandler or reader is null
  IllegalArgumentException - if argument validation fails.
-
build
Constructs a new callback-based CsvReader for the specified String.
detectBomHeader(boolean) has no effect on this method.
- Type Parameters:
  T - the type of the CSV record.
- Parameters:
  callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
  data - the data to read.
- Returns:
  a new CsvReader - never null.
- Throws:
  NullPointerException - if callbackHandler or data is null
  IllegalArgumentException - if argument validation fails.
-
build
Constructs a new callback-based CsvReader for the specified file.
If detectBomHeader(boolean) is enabled, the character set is determined by the BOM header. By default, the character set is StandardCharsets.UTF_8.
- Type Parameters:
  T - the type of the CSV record.
- Parameters:
  callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
  file - the file to read data from.
- Returns:
  a new CsvReader - never null. Remember to close it!
- Throws:
  IOException - if an I/O error occurs.
  NullPointerException - if callbackHandler or file is null
-
build
public <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, Path file, Charset charset) throws IOException
Constructs a new callback-based CsvReader for the specified file and character set.
- Type Parameters:
  T - the type of the CSV record.
- Parameters:
  callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
  file - the file to read data from.
  charset - the character set to use. If BOM header detection is enabled (via detectBomHeader(boolean)), this acts as a default when no BOM header was found.
- Returns:
  a new CsvReader - never null. Remember to close it!
- Throws:
  IOException - if an I/O error occurs.
  NullPointerException - if callbackHandler, file or charset is null
-
toString
-