intarsys runtime library | |||||||||
java.lang.Object
  de.intarsys.tools.reader.ReaderTools
public class ReaderTools
Tool class for common Reader related tasks.
| Constructor Summary |
|---|
| ReaderTools() |
| Method Summary | |
|---|---|
| static InputStreamReader | createReaderScanBom(InputStream is): Try to detect the Unicode transformation format (UTF encoding) from the BOM. |
| static InputStreamReader | createReaderScanMeta(InputStream is): Try to detect the input stream encoding from the meta tags "$$$" embedded in the stream. |
| static TaggedReader | createTaggedReader(InputStream is, String defaultCharsetName, int size): Create a TaggedReader and automatically detect the encoding using several heuristics. |
| static Map.Entry<String,String> | readEntry(Reader reader, char delimiter): Read a Map.Entry object from reader. |
| static Map<String,String> | readMetaData(Reader reader): Try to detect meta data embedded in the input. |
| static String | readMetaEncoding(Reader reader): Try to detect encoding specific meta data embedded in the input. |
| static String | readToken(Reader reader, char delimiter): Read a string token from reader. |
| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public ReaderTools()
| Method Detail |
|---|
public static InputStreamReader createReaderScanBom(InputStream is)
throws IOException
Try to detect the Unicode transformation format (UTF encoding) from the BOM.
The InputStream passed as is must support the mark operation.
For BOM marker bytes, see http://unicode.org/faq/utf_bom.html
| Bytes | Encoding Form |
|---|---|
| 00 00 FE FF | UTF-32, big-endian |
| FF FE 00 00 | UTF-32, little-endian |
| FE FF | UTF-16, big-endian |
| FF FE | UTF-16, little-endian |
| EF BB BF | UTF-8 |
Parameters:
is -
Returns:
InputStreamReader with the correct encoding
Throws:
IOException
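The BOM table above can be turned into a small detection routine. The following is a minimal sketch using only the JDK; it is not the ReaderTools implementation, and the class name BomSketch, the method open and the fallback parameter are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the ReaderTools implementation.
final class BomSketch {

    /** True if the first len bytes of buf begin with the given BOM bytes. */
    private static boolean startsWith(byte[] buf, int len, int... bom) {
        if (len < bom.length) {
            return false;
        }
        for (int i = 0; i < bom.length; i++) {
            if ((buf[i] & 0xFF) != bom[i]) {
                return false;
            }
        }
        return true;
    }

    /** Opens a reader with the charset named by the BOM, else the fallback (an assumption). */
    static InputStreamReader open(InputStream is, Charset fallback) throws IOException {
        if (!is.markSupported()) {
            throw new IllegalArgumentException("InputStream must support mark/reset");
        }
        is.mark(4);
        byte[] buf = new byte[4];
        int len = 0;
        int n;
        // Read up to 4 bytes; a single read() may return fewer than requested.
        while (len < 4 && (n = is.read(buf, len, 4 - len)) != -1) {
            len += n;
        }
        is.reset();

        Charset charset = fallback;
        int skip = 0;
        // The 4-byte UTF-32 marks are tested first: FF FE 00 00 would
        // otherwise be mistaken for the UTF-16 little-endian mark FF FE.
        if (startsWith(buf, len, 0x00, 0x00, 0xFE, 0xFF)) {
            charset = Charset.forName("UTF-32BE");
            skip = 4;
        } else if (startsWith(buf, len, 0xFF, 0xFE, 0x00, 0x00)) {
            charset = Charset.forName("UTF-32LE");
            skip = 4;
        } else if (startsWith(buf, len, 0xFE, 0xFF)) {
            charset = StandardCharsets.UTF_16BE;
            skip = 2;
        } else if (startsWith(buf, len, 0xFF, 0xFE)) {
            charset = StandardCharsets.UTF_16LE;
            skip = 2;
        } else if (startsWith(buf, len, 0xEF, 0xBB, 0xBF)) {
            charset = StandardCharsets.UTF_8;
            skip = 3;
        }
        // Consume the BOM so that it does not show up as a character.
        for (int i = 0; i < skip; i++) {
            is.read();
        }
        return new InputStreamReader(is, charset);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        InputStreamReader reader = open(new ByteArrayInputStream(data), StandardCharsets.ISO_8859_1);
        System.out.println((char) reader.read());
    }
}
```

A FileInputStream does not support mark; wrapping it in a BufferedInputStream is the usual way to satisfy that requirement.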
public static InputStreamReader createReaderScanMeta(InputStream is)
throws IOException
Try to detect the input stream encoding from the meta tags "$$$" embedded in the stream.
The InputStream passed as is must support the mark operation.
Parameters:
is -
Returns:
InputStreamReader with the correct encoding
Throws:
IOException
public static TaggedReader createTaggedReader(InputStream is,
String defaultCharsetName,
int size)
throws IOException
Create a TaggedReader and automatically detect the encoding using several
heuristics. First, the BOM markers are checked; then embedded meta
information (lines starting with '$$$') is scanned.
If no encoding can be guessed, either the defaultCharsetName or the platform encoding is used.
Parameters:
is -
defaultCharsetName -
Returns:
TaggedReader with the correct encoding
Throws:
IOException
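The order in which the heuristics are consulted can be sketched as a simple fallback chain. This is an illustration only, not the library's code: the class CharsetFallbackSketch and the method resolve are hypothetical, and it assumes that a null argument means "nothing detected" or "no default given".

```java
import java.nio.charset.Charset;

// Hypothetical sketch of the lookup order; not the ReaderTools implementation.
final class CharsetFallbackSketch {

    /**
     * Picks a charset in this order: BOM result, meta tag result,
     * caller-supplied default, platform encoding.
     * A null argument is assumed to mean "not available".
     */
    static Charset resolve(String fromBom, String fromMeta, String defaultCharsetName) {
        if (fromBom != null) {
            return Charset.forName(fromBom);
        }
        if (fromMeta != null) {
            return Charset.forName(fromMeta);
        }
        if (defaultCharsetName != null) {
            return Charset.forName(defaultCharsetName);
        }
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        // No BOM, but a meta tag named an encoding: the meta tag wins over the default.
        System.out.println(resolve(null, "ISO-8859-1", "UTF-8"));
    }
}
```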
public static Map.Entry<String,String> readEntry(Reader reader,
char delimiter)
throws IOException
Read a Map.Entry object from reader. The syntax for an entry is:
ws* key ws* '=' value [delimiter | EOF]
value = string | quoted_string
quoted_string = '"' [ char | escape ]* '"'
Parameters:
reader -
delimiter -
Throws:
IOException
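To make the entry grammar concrete, here is a minimal hand-written parser for it. It is a sketch under stated assumptions, not the library's code: the class EntrySketch is hypothetical, the escape rule is not defined in this documentation and is assumed to be a backslash followed by the escaped character, and a null return at end of input is likewise an assumption.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.AbstractMap;
import java.util.Map;

// Hypothetical sketch of the documented grammar; not the ReaderTools implementation.
final class EntrySketch {

    /** value = string | quoted_string, terminated by the delimiter or EOF. */
    static String readToken(Reader reader, char delimiter) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c = reader.read();
        if (c == '"') {
            // quoted_string: the delimiter may appear inside the quotes.
            c = reader.read();
            while (c != -1 && c != '"') {
                if (c == '\\') {
                    // Assumption: a backslash escapes the next character.
                    c = reader.read();
                    if (c == -1) {
                        break;
                    }
                }
                sb.append((char) c);
                c = reader.read();
            }
            // Consume everything up to and including the delimiter.
            c = reader.read();
            while (c != -1 && c != delimiter) {
                c = reader.read();
            }
            return sb.toString();
        }
        // Plain string: everything up to the delimiter or EOF.
        while (c != -1 && c != delimiter) {
            sb.append((char) c);
            c = reader.read();
        }
        return sb.toString();
    }

    /** ws* key ws* '=' value; returns null at end of input (an assumption). */
    static Map.Entry<String, String> readEntry(Reader reader, char delimiter) throws IOException {
        int c = reader.read();
        while (c != -1 && Character.isWhitespace(c)) {
            c = reader.read();
        }
        StringBuilder key = new StringBuilder();
        while (c != -1 && c != '=') {
            key.append((char) c);
            c = reader.read();
        }
        if (c == -1 && key.length() == 0) {
            return null;
        }
        // trim() drops the optional whitespace between key and '='.
        String value = (c == '=') ? readToken(reader, delimiter) : "";
        return new AbstractMap.SimpleEntry<>(key.toString().trim(), value);
    }

    public static void main(String[] args) throws IOException {
        Reader reader = new StringReader("name=\"a;b\";x=1");
        Map.Entry<String, String> entry;
        while ((entry = readEntry(reader, ';')) != null) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Quoting is what allows a value to contain the delimiter itself, as the first entry in main shows.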
public static Map<String,String> readMetaData(Reader reader)
throws IOException
Try to detect meta data embedded in the input. Meta data lines start with '$$$' immediately at the beginning of the line and end at the end of the line. Lines are scanned until a line without meta data is found. Meta data is encoded as entries, in the syntax accepted by the readEntry method.
The maximum length for a meta data line is 1024.
After execution, reader is positioned after the last meta tag. The reader instance must support the "mark/reset" sequence.
Parameters:
reader -
Returns:
Map
Throws:
IOException
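The scanning loop described above relies on mark/reset to "un-read" the first line that is not meta data. The following sketch illustrates that pattern with JDK classes only. It is not the library's implementation: the class MetaDataSketch is hypothetical, and for brevity it assumes one key=value entry per '$$$' line and does not handle quoted values.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the scanning loop; not the ReaderTools implementation.
final class MetaDataSketch {

    private static final String TAG = "$$$";
    private static final int MAX_LINE = 1024;

    /** Reads one line of at most MAX_LINE characters; returns null at EOF. */
    private static String readLine(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        int count = 0;
        int c;
        while (count < MAX_LINE && (c = reader.read()) != -1) {
            count++;
            if (c == '\n') {
                return sb.toString();
            }
            if (c != '\r') {
                sb.append((char) c);
            }
        }
        return count == 0 ? null : sb.toString();
    }

    /** Collects leading "$$$" lines; assumes one key=value entry per line. */
    static Map<String, String> readMetaData(Reader reader) throws IOException {
        if (!reader.markSupported()) {
            throw new IllegalArgumentException("Reader must support mark/reset");
        }
        Map<String, String> meta = new LinkedHashMap<>();
        while (true) {
            // Remember the line start so a non-meta line can be pushed back.
            reader.mark(MAX_LINE);
            String line = readLine(reader);
            if (line == null || !line.startsWith(TAG)) {
                reader.reset();
                return meta;
            }
            int eq = line.indexOf('=');
            if (eq >= 0) {
                String key = line.substring(TAG.length(), eq).trim();
                meta.put(key, line.substring(eq + 1));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Reader reader = new StringReader("$$$ encoding=UTF-8\n$$$author=me\nbody text\n");
        System.out.println(readMetaData(reader));
    }
}
```

After the call, the reader is positioned at the first line that is not meta data, so the caller can continue reading the actual content.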
public static String readMetaEncoding(Reader reader)
throws IOException
Try to detect encoding specific meta data embedded in the input. After execution, reader is positioned either at the start or after the "encoding" meta tag. The reader instance must support the "mark/reset" sequence.
For more information on meta data see readMetaData.
Parameters:
reader -
Throws:
IOException
public static String readToken(Reader reader,
char delimiter)
throws IOException
Read a string token from reader. The syntax for a token is:
value [delimiter | EOF]
value = string | quoted_string
quoted_string = '"' [ char | escape ]* '"'
Parameters:
reader -
delimiter -
Throws:
IOException