Class CsvReader.CsvReaderBuilder

java.lang.Object
de.siegmar.fastcsv.reader.CsvReader.CsvReaderBuilder
Enclosing class:
CsvReader<T>

public static final class CsvReader.CsvReaderBuilder extends Object

This builder is used to create configured instances of CsvReader. The default configuration of this class adheres to RFC 4180:

  • Field separator: , (comma)
  • Quote character: " (double quotes)
  • Comment strategy: CommentStrategy.NONE (as RFC 4180 does not define comments)
  • Comment character: # (hash) (in case comment strategy is enabled)
  • Skip empty lines: true
  • Allow extra fields: false
  • Allow missing fields: false
  • Allow extra characters after closing quotes: false
  • Trim whitespaces around quotes: false
  • Detect BOM header: false
  • Max buffer size: 16,777,216 characters

The line delimiter (line-feed, carriage-return or the combination of both) is detected automatically and thus not configurable.

  • Method Details

    • fieldSeparator

      public CsvReader.CsvReaderBuilder fieldSeparator(char fieldSeparator)
      Sets the fieldSeparator used when reading CSV data.
      Parameters:
      fieldSeparator - the field separator character (default: , - comma).
      Returns:
      This updated object, allowing additional method calls to be chained together.
    • fieldSeparator

      public CsvReader.CsvReaderBuilder fieldSeparator(String fieldSeparator)

      Sets the fieldSeparator used when reading CSV data.

      Unlike fieldSeparator(char), this method allows specifying a string of multiple characters to separate fields. The entire string is used as the delimiter, meaning fields are only separated if the full string matches. Individual characters within the string are not treated as separate delimiters.

      If the separator consists of more than one character, the less performant RelaxedCsvParser is used!
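
      The whole-string matching described above can be pictured with a small, library-independent sketch. The class and method names are hypothetical, and the code ignores quoting entirely; it is not how FastCSV parses, only an illustration of "the full string must match":

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only: not part of FastCSV and unaware of quoting.
final class MultiCharSeparatorSketch {

    // Splits a line only where the complete separator string occurs.
    static List<String> splitOnWholeSeparator(String line, String separator) {
        if (separator == null || separator.isEmpty()) {
            throw new IllegalArgumentException("separator must not be null or empty");
        }
        // Pattern.quote treats the separator literally; limit -1 keeps trailing empty fields.
        return Arrays.asList(line.split(Pattern.quote(separator), -1));
    }

    public static void main(String[] args) {
        // The single '|' inside "b|c" is not a delimiter because only "||" matches.
        System.out.println(splitOnWholeSeparator("a||b|c||d", "||")); // [a, b|c, d]
    }
}
```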

      Parameters:
      fieldSeparator - the field separator string (default: , - comma).
      Returns:
      This updated object, allowing additional method calls to be chained together.
      Throws:
      IllegalArgumentException - if fieldSeparator is null or empty
    • quoteCharacter

      public CsvReader.CsvReaderBuilder quoteCharacter(char quoteCharacter)
      Sets the quoteCharacter used when reading CSV data.
      Parameters:
      quoteCharacter - the character used to enclose fields (default: " - double quotes).
      Returns:
      This updated object, allowing additional method calls to be chained together.
    • commentStrategy

      public CsvReader.CsvReaderBuilder commentStrategy(CommentStrategy commentStrategy)

      Sets the strategy that defines how (and if) commented lines should be handled (default: CommentStrategy.NONE as comments are not defined in RFC 4180).

      If a comment strategy other than CommentStrategy.NONE is used, special parsing rules are applied for commented lines. FastCSV defines a comment as a line that starts with a comment character. No (whitespace) character is allowed before the comment character. Everything after the comment character until the end of the line is considered the comment value.
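
      As a rough illustration of this definition, the following sketch classifies a single line. The class and method names are hypothetical and the code is not taken from FastCSV; it only mirrors the rule that the comment character must be the very first character of the line:

```java
import java.util.Optional;

// Hypothetical illustration only: mirrors the comment rule stated above.
final class CommentLineSketch {

    // Returns the comment value if the line is a comment, otherwise an empty Optional.
    static Optional<String> commentValue(String line, char commentCharacter) {
        if (!line.isEmpty() && line.charAt(0) == commentCharacter) {
            // Everything after the comment character belongs to the comment value.
            return Optional.of(line.substring(1));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(commentValue("# generated file", '#')); // Optional[ generated file]
        // Leading whitespace means the line is not treated as a comment.
        System.out.println(commentValue(" # indented", '#'));      // Optional.empty
    }
}
```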

      Parameters:
      commentStrategy - the strategy for handling comments.
      Returns:
      This updated object, allowing additional method calls to be chained together.
    • commentCharacter

      public CsvReader.CsvReaderBuilder commentCharacter(char commentCharacter)
      Sets the commentCharacter used to comment lines.
      Parameters:
      commentCharacter - the character used to comment lines (default: # - hash)
      Returns:
      This updated object, allowing additional method calls to be chained together.
    • skipEmptyLines

      public CsvReader.CsvReaderBuilder skipEmptyLines(boolean skipEmptyLines)

      Defines whether empty lines should be skipped when reading data.

      The default implementation interprets empty lines as lines that do not contain any data (no whitespace, no quotes, nothing).

      Commented lines are not considered empty lines. Use commentStrategy(CommentStrategy) for handling commented lines.

      Parameters:
      skipEmptyLines - Whether empty lines should be skipped (default: true).
      Returns:
      This updated object, allowing additional method calls to be chained together.
    • allowExtraFields

      public CsvReader.CsvReaderBuilder allowExtraFields(boolean allowExtraFields)
      Defines whether a CsvParseException should be thrown if records contain more fields than the first record. The first record is defined as the first record that is not a comment or an empty line.
      Parameters:
      allowExtraFields - Whether extra fields should be allowed (default: false).
      Returns:
      This updated object, allowing additional method calls to be chained together.
    • allowMissingFields

      public CsvReader.CsvReaderBuilder allowMissingFields(boolean allowMissingFields)

      Defines whether a CsvParseException should be thrown if records contain fewer fields than the first record. The first record is defined as the first record that is not a comment or an empty line.

      Empty lines are allowed even if this is set to false.
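
      The interplay of allowExtraFields(boolean) and allowMissingFields(boolean) can be summarized in a small sketch. The names are hypothetical, the special handling of empty lines is not modeled, and where the sketch returns false the real reader would throw a CsvParseException:

```java
// Hypothetical illustration only: summarizes the field-count rules described above.
final class FieldCountSketch {

    // Returns true if a record with actualFields is accepted, given the first record's field count.
    static boolean fieldCountAccepted(int firstRecordFields, int actualFields,
                                      boolean allowExtraFields, boolean allowMissingFields) {
        if (actualFields > firstRecordFields) {
            return allowExtraFields;   // more fields than the first record
        }
        if (actualFields < firstRecordFields) {
            return allowMissingFields; // fewer fields than the first record
        }
        return true;                   // same field count is always fine
    }

    public static void main(String[] args) {
        System.out.println(fieldCountAccepted(3, 3, false, false)); // true
        System.out.println(fieldCountAccepted(3, 4, false, false)); // false
        System.out.println(fieldCountAccepted(3, 2, false, true));  // true
    }
}
```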

      Parameters:
      allowMissingFields - Whether missing fields should be allowed (default: false).
      Returns:
      This updated object, allowing additional method calls to be chained together.
    • allowExtraCharsAfterClosingQuote

      public CsvReader.CsvReaderBuilder allowExtraCharsAfterClosingQuote(boolean allowExtraCharsAfterClosingQuote)

      Specifies whether the presence of characters between a closing quote and a field separator or the end of a line should be treated as an error or not.

      Example: "a"b,"c"

      If this is set to true, the value ab will be returned for the first field.

      If this is set to false, a CsvParseException will be thrown.

      Parameters:
      allowExtraCharsAfterClosingQuote - allow extra characters after closing quotes (default: false).
      Returns:
      This updated object, allowing additional method calls to be chained together.
    • trimWhitespacesAroundQuotes

      public CsvReader.CsvReaderBuilder trimWhitespacesAroundQuotes(boolean trimWhitespacesAroundQuotes)

      Defines whether whitespaces before an opening quote and after a closing quote should be allowed and trimmed.

      RFC 4180 does not allow whitespaces between the quotation mark and the field separator or the end of the line. CSV data that contains such whitespaces causes two major problems for the parser:

      • Whitespace before an opening quote causes the parser to treat the field as unquoted. Characters inside the intended quotes, such as field separators, are then interpreted as control characters rather than as regular data.
      • Whitespace after a closing quote is appended to the field value (without the quote character). It is then unclear whether the whitespace is part of the field value, which can lead to misinterpretation of the data. A CsvParseException is thrown in this case unless allowExtraCharsAfterClosingQuote(boolean) is enabled.

      Enabling this option allows the parser to handle such cases more leniently. In the following example, whitespace is shown as underscores (_) for clarity:

      A record _"x"_,_"foo,bar"_ would be parsed as two fields: x and foo,bar.

      Whitespace in this context is defined as any character whose code point is less than or equal to U+0020 (the space character) – the same logic as in Java's String.trim() method.

      When this option is enabled, the less performant RelaxedCsvParser is used!
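
      The whitespace definition above (code point less than or equal to U+0020) can be made concrete with a short sketch. The names are hypothetical and the code is not FastCSV's implementation; it only applies the same rule as String.trim() to a raw field:

```java
// Hypothetical illustration only: applies the "code point <= U+0020" rule to a raw field.
final class QuoteWhitespaceSketch {

    // True for space, tab, CR, LF and all other characters up to and including U+0020.
    static boolean isTrimmable(char c) {
        return c <= ' ';
    }

    // Removes trimmable characters before the first and after the last other character.
    static String stripAroundQuotes(String rawField) {
        int start = 0;
        int end = rawField.length();
        while (start < end && isTrimmable(rawField.charAt(start))) {
            start++;
        }
        while (end > start && isTrimmable(rawField.charAt(end - 1))) {
            end--;
        }
        return rawField.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(stripAroundQuotes(" \t\"x\" ")); // "x"
        // A non-breaking space (U+00A0) is above U+0020 and is therefore not trimmed.
        System.out.println(isTrimmable('\u00A0'));           // false
    }
}
```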

      Parameters:
      trimWhitespacesAroundQuotes - if whitespaces should be allowed/trimmed (default: false).
      Returns:
      This updated object, allowing additional method calls to be chained together.
    • detectBomHeader

      public CsvReader.CsvReaderBuilder detectBomHeader(boolean detectBomHeader)

      Defines whether an optional BOM (byte order mark) header should be detected.

      BOM detection applies only to InputStream and Path based data sources. It does not apply to Reader or String based data sources, as they are already decoded.

      Supported BOMs are: UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, UTF-32BE.
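
      For orientation, the byte sequences of these BOMs can be matched as in the following sketch. The class and method names are hypothetical and this is not FastCSV's detection code; it only shows the standard BOM byte patterns and why the 4-byte UTF-32 marks need to be checked before the 2-byte UTF-16 marks:

```java
import java.util.Optional;

// Hypothetical illustration only: matches the standard BOM byte sequences.
final class BomSketch {

    // Returns the charset name indicated by a leading BOM, or an empty Optional if none is present.
    static Optional<String> detectCharsetName(byte[] head) {
        // UTF-32LE starts with the same two bytes as UTF-16LE, so check the longer marks first.
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) {
            return Optional.of("UTF-32BE");
        }
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) {
            return Optional.of("UTF-32LE");
        }
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) {
            return Optional.of("UTF-8");
        }
        if (startsWith(head, 0xFE, 0xFF)) {
            return Optional.of("UTF-16BE");
        }
        if (startsWith(head, 0xFF, 0xFE)) {
            return Optional.of("UTF-16LE");
        }
        return Optional.empty();
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare bytes as unsigned values.
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf8WithBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x61};
        System.out.println(detectCharsetName(utf8WithBom));            // Optional[UTF-8]
        System.out.println(detectCharsetName(new byte[] {0x61, 0x62})); // Optional.empty
    }
}
```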

      Parameters:
      detectBomHeader - if detection should be enabled (default: false)
      Returns:
      This updated object, allowing additional method calls to be chained together.
    • maxBufferSize

      public CsvReader.CsvReaderBuilder maxBufferSize(int maxBufferSize)

      Defines the maximum buffer size used when parsing data.

      The size of the internal buffer is automatically adjusted to the needs of the parser. To protect against out-of-memory errors, its maximum size is limited.

      The buffer is used for two purposes:

      • Reading data from the underlying stream of data in chunks
      • Storing the data of a single field before it is passed to the callback handler

      Set a larger value only if you expect to read fields larger than the default limit. In that case you probably also need to adjust the maximum field size of the callback handler.

      Set a smaller value if your runtime environment does not have enough memory available for the default value. Setting values smaller than 16,384 characters will most likely lead to performance degradation.

      Parameters:
      maxBufferSize - the maximum buffer size in characters (default: 16,777,216)
      Returns:
      This updated object, allowing additional method calls to be chained together.
      Throws:
      IllegalArgumentException - if maxBufferSize is not positive
    • ofSingleCsvRecord

      public CsvRecord ofSingleCsvRecord(String data)

      Convenience method to read a single CSV record from the specified string.

      If the string contains multiple records, only the first one is returned.

      Parameters:
      data - the CSV data to read; must not be null
      Returns:
      a single CsvRecord instance containing the parsed data
      Throws:
      NullPointerException - if data is null
      CsvParseException - if the data cannot be parsed
    • ofSingleCsvRecord

      public <T> T ofSingleCsvRecord(CsvCallbackHandler<T> callbackHandler, String data)

      Convenience method to read a single CSV record using a custom callback handler.

      If the string contains multiple records, only the first one is returned.

      Type Parameters:
      T - the type of the CSV record.
      Parameters:
      callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
      data - the CSV data to read; must not be null
      Returns:
      a single record as processed by the callback handler
      Throws:
      NullPointerException - if callbackHandler or data is null
      CsvParseException - if the data cannot be parsed
    • ofCsvRecord

      public CsvReader<CsvRecord> ofCsvRecord(InputStream inputStream)

      Constructs a new index-based CsvReader for the specified input stream.

      This is a convenience method for calling build(CsvCallbackHandler,InputStream) with CsvRecordHandler as the callback handler.

      If detectBomHeader(boolean) is enabled, the character set is determined by the BOM header. By default, the character set is StandardCharsets.UTF_8.

      Parameters:
      inputStream - the input stream to read data from.
      Returns:
      a new CsvReader - never null.
      Throws:
      NullPointerException - if inputStream is null
    • ofCsvRecord

      public CsvReader<CsvRecord> ofCsvRecord(InputStream inputStream, Charset charset)

      Constructs a new index-based CsvReader for the specified input stream and character set.

      This is a convenience method for calling build(CsvCallbackHandler,InputStream,Charset) with CsvRecordHandler as the callback handler.

      Parameters:
      inputStream - the input stream to read data from.
      charset - the character set to use. If BOM header detection is enabled (via detectBomHeader(boolean)), this acts as a default when no BOM header was found.
      Returns:
      a new CsvReader - never null.
      Throws:
      NullPointerException - if inputStream or charset is null
    • ofCsvRecord

      public CsvReader<CsvRecord> ofCsvRecord(Reader reader)

      Constructs a new index-based CsvReader for the specified reader.

      This is a convenience method for calling build(CsvCallbackHandler,Reader) with CsvRecordHandler as the callback handler.

      detectBomHeader(boolean) has no effect on this method.

      Parameters:
      reader - the data source to read from.
      Returns:
      a new CsvReader - never null.
      Throws:
      NullPointerException - if reader is null
    • ofCsvRecord

      public CsvReader<CsvRecord> ofCsvRecord(String data)

      Constructs a new index-based CsvReader for the specified String.

      This is a convenience method for calling build(CsvCallbackHandler,String) with CsvRecordHandler as the callback handler.

      detectBomHeader(boolean) has no effect on this method.

      Parameters:
      data - the data to read.
      Returns:
      a new CsvReader - never null.
      Throws:
      NullPointerException - if data is null
    • ofCsvRecord

      public CsvReader<CsvRecord> ofCsvRecord(Path file) throws IOException

      Constructs a new index-based CsvReader for the specified file.

      This is a convenience method for calling build(CsvCallbackHandler,Path) with CsvRecordHandler as the callback handler.

      If detectBomHeader(boolean) is enabled, the character set is determined by the BOM header. By default, the character set is StandardCharsets.UTF_8.

      Parameters:
      file - the file to read data from.
      Returns:
      a new CsvReader - never null. Don't forget to close it!
      Throws:
      IOException - if an I/O error occurs.
      NullPointerException - if file is null
    • ofCsvRecord

      public CsvReader<CsvRecord> ofCsvRecord(Path file, Charset charset) throws IOException

      Constructs a new index-based CsvReader for the specified file and character set.

      This is a convenience method for calling build(CsvCallbackHandler,Path,Charset) with CsvRecordHandler as the callback handler.

      Parameters:
      file - the file to read data from.
      charset - the character set to use. If BOM header detection is enabled (via detectBomHeader(boolean)), this acts as a default when no BOM header was found.
      Returns:
      a new CsvReader - never null. Don't forget to close it!
      Throws:
      IOException - if an I/O error occurs.
      NullPointerException - if file or charset is null
    • ofNamedCsvRecord

      public CsvReader<NamedCsvRecord> ofNamedCsvRecord(InputStream inputStream)

      Constructs a new name-based CsvReader for the specified input stream.

      This is a convenience method for calling build(CsvCallbackHandler,InputStream) with NamedCsvRecordHandler as the callback handler.

      If detectBomHeader(boolean) is enabled, the character set is determined by the BOM header. By default, the character set is StandardCharsets.UTF_8.

      Parameters:
      inputStream - the input stream to read data from.
      Returns:
      a new CsvReader - never null.
      Throws:
      NullPointerException - if inputStream is null
    • ofNamedCsvRecord

      public CsvReader<NamedCsvRecord> ofNamedCsvRecord(InputStream inputStream, Charset charset)

      Constructs a new name-based CsvReader for the specified input stream and character set.

      This is a convenience method for calling build(CsvCallbackHandler,InputStream,Charset) with NamedCsvRecordHandler as the callback handler.

      Parameters:
      inputStream - the input stream to read data from.
      charset - the character set to use. If BOM header detection is enabled (via detectBomHeader(boolean)), this acts as a default when no BOM header was found.
      Returns:
      a new CsvReader - never null.
      Throws:
      NullPointerException - if inputStream or charset is null
    • ofNamedCsvRecord

      public CsvReader<NamedCsvRecord> ofNamedCsvRecord(Reader reader)

      Constructs a new name-based CsvReader for the specified reader.

      This is a convenience method for calling build(CsvCallbackHandler,Reader) with NamedCsvRecordHandler as the callback handler.

      detectBomHeader(boolean) has no effect on this method.

      Parameters:
      reader - the data source to read from.
      Returns:
      a new CsvReader - never null.
      Throws:
      NullPointerException - if reader is null
    • ofNamedCsvRecord

      public CsvReader<NamedCsvRecord> ofNamedCsvRecord(String data)

      Constructs a new name-based CsvReader for the specified String.

      This is a convenience method for calling build(CsvCallbackHandler,String) with NamedCsvRecordHandler as the callback handler.

      detectBomHeader(boolean) has no effect on this method.

      Parameters:
      data - the data to read.
      Returns:
      a new CsvReader - never null.
      Throws:
      NullPointerException - if data is null
    • ofNamedCsvRecord

      public CsvReader<NamedCsvRecord> ofNamedCsvRecord(Path file) throws IOException

      Constructs a new name-based CsvReader for the specified file.

      This is a convenience method for calling build(CsvCallbackHandler,Path) with NamedCsvRecordHandler as the callback handler.

      If detectBomHeader(boolean) is enabled, the character set is determined by the BOM header. By default, the character set is StandardCharsets.UTF_8.

      Parameters:
      file - the file to read data from.
      Returns:
      a new CsvReader - never null. Don't forget to close it!
      Throws:
      IOException - if an I/O error occurs.
      NullPointerException - if file is null
    • ofNamedCsvRecord

      public CsvReader<NamedCsvRecord> ofNamedCsvRecord(Path file, Charset charset) throws IOException

      Constructs a new name-based CsvReader for the specified file and character set.

      This is a convenience method for calling build(CsvCallbackHandler,Path,Charset) with NamedCsvRecordHandler as the callback handler.

      Parameters:
      file - the file to read data from.
      charset - the character set to use. If BOM header detection is enabled (via detectBomHeader(boolean)), this acts as a default when no BOM header was found.
      Returns:
      a new CsvReader - never null. Don't forget to close it!
      Throws:
      IOException - if an I/O error occurs.
      NullPointerException - if file or charset is null
    • build

      public <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, InputStream inputStream)

      Constructs a new callback-based CsvReader for the specified input stream.

      This is a convenience method for calling build(CsvCallbackHandler,InputStream,Charset).

      This library uses built-in buffering, so you do not need to pass in a buffered InputStream implementation such as BufferedInputStream. Performance is likely even better if you do not.

      If detectBomHeader(boolean) is enabled, the character set is determined by the BOM header. By default, the character set is StandardCharsets.UTF_8.

      Use build(CsvCallbackHandler,Path) for optimal performance when reading files.

      Type Parameters:
      T - the type of the CSV record.
      Parameters:
      callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
      inputStream - the input stream to read data from.
      Returns:
      a new CsvReader - never null.
      Throws:
      NullPointerException - if callbackHandler or inputStream is null
    • build

      public <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, InputStream inputStream, Charset charset)

      Constructs a new callback-based CsvReader for the specified input stream and character set.

      This library uses built-in buffering, so you do not need to pass in a buffered InputStream implementation such as BufferedInputStream. Performance is likely even better if you do not.

      If detectBomHeader(boolean) is enabled, this method will immediately cause consumption of the input stream to read the BOM header and determine the character set.

      Use build(CsvCallbackHandler,Path,Charset) for optimal performance when reading files.

      Type Parameters:
      T - the type of the CSV record.
      Parameters:
      callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
      inputStream - the input stream to read data from.
      charset - the character set to use. If BOM header detection is enabled (via detectBomHeader(boolean)), this acts as a default when no BOM header was found.
      Returns:
      a new CsvReader - never null.
      Throws:
      NullPointerException - if callbackHandler, inputStream or charset is null
    • build

      public <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, Reader reader)

      Constructs a new callback-based CsvReader for the specified reader.

      This library uses built-in buffering, so you do not need to pass in a buffered Reader implementation such as BufferedReader. Performance is likely even better if you do not.

      Use build(CsvCallbackHandler,Path) for optimal performance when reading files and build(CsvCallbackHandler,String) when reading Strings.

      detectBomHeader(boolean) has no effect on this method.

      Type Parameters:
      T - the type of the CSV record.
      Parameters:
      callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
      reader - the data source to read from.
      Returns:
      a new CsvReader - never null.
      Throws:
      NullPointerException - if callbackHandler or reader is null
      IllegalArgumentException - if argument validation fails.
    • build

      public <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, String data)

      Constructs a new callback-based CsvReader for the specified String.

      detectBomHeader(boolean) has no effect on this method.

      Type Parameters:
      T - the type of the CSV record.
      Parameters:
      callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
      data - the data to read.
      Returns:
      a new CsvReader - never null.
      Throws:
      NullPointerException - if callbackHandler or data is null
      IllegalArgumentException - if argument validation fails.
    • build

      public <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, Path file) throws IOException

      Constructs a new callback-based CsvReader for the specified file.

      If detectBomHeader(boolean) is enabled, the character set is determined by the BOM header. By default, the character set is StandardCharsets.UTF_8.

      Type Parameters:
      T - the type of the CSV record.
      Parameters:
      callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
      file - the file to read data from.
      Returns:
      a new CsvReader - never null. Remember to close it!
      Throws:
      IOException - if an I/O error occurs.
      NullPointerException - if callbackHandler or file is null
    • build

      public <T> CsvReader<T> build(CsvCallbackHandler<T> callbackHandler, Path file, Charset charset) throws IOException
      Constructs a new callback-based CsvReader for the specified file and character set.
      Type Parameters:
      T - the type of the CSV record.
      Parameters:
      callbackHandler - the record handler to use. Do not reuse a handler after it has been used!
      file - the file to read data from.
      charset - the character set to use. If BOM header detection is enabled (via detectBomHeader(boolean)), this acts as a default when no BOM header was found.
      Returns:
      a new CsvReader - never null. Remember to close it!
      Throws:
      IOException - if an I/O error occurs.
      NullPointerException - if callbackHandler, file or charset is null
    • toString

      public String toString()
      Overrides:
      toString in class Object