Package de.mhus.lib.core.io
Class CSVReader
- java.lang.Object
-
- de.mhus.lib.core.io.CSVReader
-
public class CSVReader extends Object
Read CSV (Comma Separated Value) files. This format is used by Microsoft Word and Excel. Fields are separated by commas, and enclosed in quotes if they contain commas or quotes. Embedded quotes are doubled. Embedded spaces do not normally require surrounding quotes. The last field on the line is not followed by a comma. Null fields are represented by two commas in a row. Leading and trailing spaces on fields are optionally trimmed, even inside quotes. The file must normally end with a single CrLf; otherwise you will get a null when trying to read a field on older JVMs.
- Author:
- copyright (c) 2002-2006 Roedy Green Canadian Mind Products version
1.0 2002 March 27
1.1 2002 March 28 - close - configurable separator char - no longer sensitive to line-ending convention. - uses a categorise routine to massage categories for use in case clauses. - faster skipToNextLine
1.2 2002 April 23 - put in to separate package
1.4 2002 April 19 - fix bug if last field on line is empty, was not counting as a field.
1.6 2002 May 25 - allow choice of " or ' quote char.
1.7 2002 August 29 - getAllFieldsInLine
1.8 2002 November 12 - allow Microsoft Excel format fields that can span several lines. sponsored by Steve Hunter of agilense.com
1.9 2002 November 14 - trim parameter to control whether fields are trimmed of lead/trail whitespace (blanks, Cr, Lf, Tab etc.)
2.0 2003 August 10 - getInt, getLong, getFloat, getDouble
2.1 2005-07-16 - reorganisation, new bat files.
2.2 2005-08-28 - add CSVAlign and CSVPack to the suite.
There is another CSVReader at http://ostermiller.org/utils/ExcelCSV.html. If this CSVReader is not suitable for you, try that one.
There is one written in C# at http://www.csvreader.com/
Future ideas:
1. allow specifying various comment chars that mean the rest of the line should be ignored, e.g. ; ! #. These chars would then have to be quoted when they appear in data.
2. allow \ to be used for quoting characters.
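The quoting rules in the class description (separator between fields, quotes around fields that contain separators or quotes, embedded quotes doubled) can be illustrated with a small stand-alone sketch. This is not the CSVReader implementation: the class CsvLineSketch and the method splitLine are hypothetical names, the sketch handles a single line only, does no trimming, and returns an empty string for a field between two adjacent separators.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; hypothetical helper, not part of de.mhus.lib.core.io. */
final class CsvLineSketch {

    /** Splits one CSV line into fields using the quoting rules described above. */
    static List<String> splitLine(String line, char separator, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == quote) {
                    if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
                        current.append(quote); // doubled quote becomes one literal quote
                        i++;
                    } else {
                        inQuotes = false;      // closing quote
                    }
                } else {
                    current.append(c);         // separators inside quotes are data
                }
            } else if (c == quote) {
                inQuotes = true;               // opening quote
            } else if (c == separator) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());        // last field has no trailing separator
        return fields;
    }

    public static void main(String[] args) {
        // Five fields: a plain one, two quoted ones, an empty one, and a plain one.
        System.out.println(splitLine("a,\"b,c\",\"say \"\"hi\"\"\",,z", ',', '"'));
    }
}
```

Passing ';' or '\t' as the separator, or '\'' as the quote, mirrors the configurable separator and quote characters mentioned in the version history.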
-
-
Field Summary
Modifier and Type   Field
static char         NO_QUOTS
-
Method Summary
Modifier and Type   Method                      Description
void                close()                     Close the Reader.
String              get()                       Read one field from the CSV file
String              get(int row)
String              get(String row)
String[]            getAllFieldsInLine()        Get all fields in the line
String[]            getCurrentLine()
double              getDouble()                 Read one double field from the CSV file.
float               getFloat()                  Read one float field from the CSV file.
int                 getInt()                    Read one integer field from the CSV file
int                 getLineColumns()
int                 getLineCount()
long                getLong()                   Read one long field from the CSV file
String[]            getRowNames()
static void         main(String[] args)         Test driver
boolean             next()
void                readHeader(boolean lower)
void                skip(int fields)            Skip over fields you don't want to process.
void                skipToNextLine()            Skip over remaining fields on this line you don't want to process.
-
-
-
Field Detail
-
NO_QUOTS
public static final char NO_QUOTS
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
CSVReader
public CSVReader(Reader r)
Convenience constructor; defaults to comma separator, " for quote, no multiline fields, with trimming.
- Parameters:
r - input Reader source of CSV fields to read.
-
CSVReader
public CSVReader(Reader r, char separator, char quote, boolean allowMultiLineFields, boolean trim)
Constructor
- Parameters:
r - input Reader source of CSV fields to read.
separator - field separator character, usually ',' in North America, ';' in Europe and sometimes '\t' for tab.
quote - char to use to enclose fields containing a separator, usually '\"'
allowMultiLineFields - true if reader should allow quoted fields to span more than one line. Microsoft Excel sometimes generates files like this.
trim - true if reader should trim leading/trailing whitespace (e.g. blanks, Cr, Lf, Tab) off fields.
-
-
Method Detail
-
close
public void close() throws IOException
Close the Reader.
- Throws:
IOException
-
get
public String get() throws EOFException, IOException
Read one field from the CSV file
- Returns:
- String value, even if the field is numeric. Surrounding and embedded double quotes are stripped. Possibly "". null means end of line.
- Throws:
EOFException - at end of file after all the fields have been read.
IOException - Some problem reading the file, possibly malformed data.
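The documented contract of get() (null marks end of line, EOFException marks end of file) suggests a read loop along the following lines. To keep the block compilable without the library, it runs against FieldSource, a hypothetical stand-in interface that exposes only that contract; with the real class, a CSVReader instance would take its place. The names ReadLoopSketch, readAll and replay are likewise assumptions made for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; FieldSource is a hypothetical stand-in for CSVReader. */
final class ReadLoopSketch {

    /** Mirrors the documented get() contract and nothing else. */
    interface FieldSource {
        String get() throws EOFException, IOException;
    }

    /** Collects rows: a null field ends the current row, EOFException ends the input. */
    static List<List<String>> readAll(FieldSource source) throws IOException {
        List<List<String>> rows = new ArrayList<>();
        List<String> row = new ArrayList<>();
        try {
            while (true) {
                String field = source.get();
                if (field == null) {        // end of line
                    rows.add(row);
                    row = new ArrayList<>();
                } else {
                    row.add(field);
                }
            }
        } catch (EOFException endOfFile) {
            // Documented end-of-file signal; keep a partial last row if there is one.
            if (!row.isEmpty()) {
                rows.add(row);
            }
        }
        return rows;
    }

    /** Test stub that replays canned values, then signals end of file. */
    static FieldSource replay(String... values) {
        int[] next = {0};
        return () -> {
            if (next[0] >= values.length) {
                throw new EOFException();
            }
            return values[next[0]++];
        };
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAll(replay("id", "name", null, "1", "Ada", null)));
    }
}
```

Treating EOFException as the normal loop exit follows from the Throws clause above; other IOExceptions are left to propagate to the caller.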
-
getAllFieldsInLine
public String[] getAllFieldsInLine() throws EOFException, IOException
Get all fields in the line
- Returns:
- Array of strings, one for each field. Possibly empty, but never null.
- Throws:
EOFException
IOException
-
getDouble
public double getDouble() throws EOFException, IOException, NumberFormatException
Read one double field from the CSV file.
- Returns:
- double value, empty field returns 0, as does end of line.
- Throws:
EOFException - at end of file after all the fields have been read.
IOException - Some problem reading the file, possibly malformed data.
NumberFormatException - if field does not contain a well-formed double.
-
getFloat
public float getFloat() throws EOFException, IOException, NumberFormatException
Read one float field from the CSV file.
- Returns:
- float value, empty field returns 0, as does end of line.
- Throws:
EOFException - at end of file after all the fields have been read.
IOException - Some problem reading the file, possibly malformed data.
NumberFormatException - if field does not contain a well-formed float.
-
getInt
public int getInt() throws EOFException, IOException, NumberFormatException
Read one integer field from the CSV file
- Returns:
- int value, empty field returns 0, as does end of line.
- Throws:
EOFException - at end of file after all the fields have been read.
IOException - Some problem reading the file, possibly malformed data.
NumberFormatException - if field does not contain a well-formed int.
-
getLong
public long getLong() throws EOFException, IOException, NumberFormatException
Read one long field from the CSV file
- Returns:
- long value, empty field returns 0, as does end of line.
- Throws:
EOFException - at end of file after all the fields have been read.
IOException - Some problem reading the file, possibly malformed data.
NumberFormatException - if field does not contain a well-formed long.
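The numeric getters above share one documented rule: an empty field, or the end of the line, yields 0, while malformed text raises NumberFormatException. A minimal sketch of that mapping follows, assuming the field text has already been read as a String (null standing for end of line). NumericFieldSketch, toInt and toDouble are hypothetical names, and how CSVReader performs the conversion internally is not stated in this documentation.

```java
/** Illustrative only; hypothetical helpers, not part of de.mhus.lib.core.io. */
final class NumericFieldSketch {

    /** Empty field or end of line (null) maps to 0; anything else must parse as an int. */
    static int toInt(String field) {
        if (field == null || field.isEmpty()) {
            return 0;
        }
        return Integer.parseInt(field); // throws NumberFormatException if malformed
    }

    /** Same rule for double values. */
    static double toDouble(String field) {
        if (field == null || field.isEmpty()) {
            return 0.0;
        }
        return Double.parseDouble(field); // throws NumberFormatException if malformed
    }

    public static void main(String[] args) {
        System.out.println(toInt("42"));
        System.out.println(toInt(""));
        System.out.println(toDouble("2.5"));
    }
}
```

One consequence of this rule is that a caller cannot tell an empty field from a literal 0 through the numeric getters alone; reading the field with get() first preserves that distinction.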
-
skip
public void skip(int fields) throws EOFException, IOException
Skip over fields you don't want to process.
- Parameters:
fields - How many fields you want to bypass. The newline counts as one field.
- Throws:
EOFException - at end of file after all the fields have been read.
IOException - Some problem reading the file, possibly malformed data.
-
skipToNextLine
public void skipToNextLine() throws EOFException, IOException
Skip over remaining fields on this line you don't want to process.
- Throws:
EOFException - at end of file after all the fields have been read.
IOException - Some problem reading the file, possibly malformed data.
-
readHeader
public void readHeader(boolean lower) throws EOFException, IOException
- Throws:
EOFException
IOException
-
getRowNames
public String[] getRowNames()
-
next
public boolean next() throws IOException
- Throws:
IOException
-
get
public String get(String row) throws IOException
- Throws:
IOException
-
get
public String get(int row)
-
getLineCount
public int getLineCount()
-
getLineColumns
public int getLineColumns()
-
getCurrentLine
public String[] getCurrentLine()
-
main
public static void main(String[] args)
Test driver
- Parameters:
args - not used
-
-