java.lang.Object
  java.lang.Enum<DiacriticalMark>
    net.sf.mmm.util.text.api.DiacriticalMark
public enum DiacriticalMark
This enum contains the most important diacritical marks.
If you are not familiar with Unicode and languages that use non-ASCII
characters, you should know that each DiacriticalMark represents a
specific shape (e.g. '~' or '^') that is added at a specific position
(on top, at the bottom, etc.) to a letter. For instance, if you add
two dots on top of the letter 'a', you get 'ä'.
To complicate matters, Unicode defines
combining characters that represent the mark
itself in addition to the precomposed characters (combinations of a specific
character with one or more marks).
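The difference between a precomposed character and a base letter followed by a combining character can be made visible with the JDK's own java.text.Normalizer. The following is only a minimal sketch that does not use DiacriticalMark at all; the class name CombiningVersusPrecomposed and its helper methods are made up for illustration.

```java
import java.text.Normalizer;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of net.sf.mmm.
final class CombiningVersusPrecomposed {

    /** Canonical decomposition (NFD): precomposed characters become base letter plus combining mark(s). */
    static String decompose(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD);
    }

    /** Renders each code point of the text as U+XXXX, separated by single spaces. */
    static String codePoints(String text) {
        return text.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String precomposed = "\u00E4"; // the letter a with two dots as ONE character
        System.out.println(codePoints(precomposed));
        // After NFD the same letter is the base letter followed by the combining mark.
        System.out.println(codePoints(decompose(precomposed)));
    }
}
```

Both strings render identically on screen, which is why code that compares or searches text usually has to settle on one of the two representations first.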
| Enum Constant Summary | |
|---|---|
| ACUTE | A mark that can be placed on top of some Latin, Cyrillic or Greek characters. |
| BREVE | A mark that can be placed on top of some Latin, ... characters. |
| CARON | A mark that can be placed on top of some Latin, ... characters. |
| CEDILLA | A mark that can be placed at the bottom of some Latin characters. |
| CIRCUMFLEX | A mark that can be placed on top of some Latin characters (e.g. in French). |
| DIAERESIS | Two dots on top (trema, diaeresis, or umlaut). |
| DOT_ABOVE | A mark attached at the top right of the letters o and u in the Vietnamese alphabet (overdot). |
| DOT_BELOW | TODO |
| DOUBLE_ACUTE | Like ACUTE but doubled. |
| DOUBLE_GRAVE | Like GRAVE but doubled. |
| GRAVE | A mark that can be placed on top of some Latin, Cyrillic or Greek characters. |
| HOOK_ABOVE | A little question mark without the dot, that is placed on top of Vietnamese letters. |
| HORN_ABOVE | A ... that is placed on top of Vietnamese vowels. |
| MACRON | A ... |
| OGONEK | A ... |
| RING_ABOVE | A ... |
| TILDE | ~ on top. |
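For orientation, several of the constant names above correspond to code points in Unicode's Combining Diacritical Marks block. The sketch below lists a subset and asks the JDK for their official names. The mapping from constant name to code point is an assumption made for illustration; this page does not state which code points the enum actually uses, and the class CombiningMarkNames is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup table, not part of net.sf.mmm.
final class CombiningMarkNames {

    /** Assumed pairing of constant names from this page with Unicode combining code points. */
    static Map<String, Integer> combiningCodePoints() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("GRAVE", 0x0300);
        map.put("ACUTE", 0x0301);
        map.put("CIRCUMFLEX", 0x0302);
        map.put("TILDE", 0x0303);
        map.put("MACRON", 0x0304);
        map.put("BREVE", 0x0306);
        map.put("DIAERESIS", 0x0308);
        map.put("CARON", 0x030C);
        map.put("CEDILLA", 0x0327);
        map.put("OGONEK", 0x0328);
        return map;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> entry : combiningCodePoints().entrySet()) {
            int codePoint = entry.getValue();
            // Character.getName returns the Unicode character name known to the running JDK.
            System.out.println(entry.getKey() + " -> "
                    + String.format("U+%04X", codePoint) + " " + Character.getName(codePoint));
        }
    }
}
```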
| Field Summary | |
|---|---|
| private char | combiningCharacter |
| private Collection<Character> | composedCharacters |
| private Map<Character,Character> | composeMap |
| private Map<Character,Character> | decomposeMap |
| private char | separateCharacter |
| private String | title |
| Method Summary | |
|---|---|
| protected void | addComposition(char uncomposed, char composed) - This method adds the given composition pair. |
| Character | compose(char character) - This method composes the given character with this DiacriticalMark. |
| Character | decompose(char character) - This method de-composes the given character with this DiacriticalMark. |
| char | getCombiningCharacter() - This method gets the combining character for this DiacriticalMark. |
| Collection<Character> | getComposedCharacters() - This method gets a Collection with all precomposed characters containing this mark. |
| char | getSeparateCharacter() |
| String | getTitle() - This method gets the title of this datatype. |
| Character | getValue() - This method returns the raw value of this datatype. |
| protected abstract void | initialize() - This method is called at construction. |
| String | normalizeToAscii(char character) - This method gets the ASCII-representation of the given character composed with this DiacriticalMark. |
| protected void | normalizeToAsciiRecursive(char decomposed, StringBuilder buffer, int compositionCount) - This is the internal recursive implementation of normalizeToAscii(char). |
| String | toString() - This method needs to return the same result as Datatype.getTitle(). |
| static DiacriticalMark | valueOf(String name) - Returns the enum constant of this type with the specified name. |
| static DiacriticalMark[] | values() - Returns an array containing the constants of this enum type, in the order they are declared. |
| Methods inherited from class java.lang.Enum |
|---|
clone, compareTo, equals, finalize, getDeclaringClass, hashCode, name, ordinal, valueOf |
| Methods inherited from class java.lang.Object |
|---|
getClass, notify, notifyAll, wait, wait, wait |
| Enum Constant Detail |
|---|
public static final DiacriticalMark ACUTE
public static final DiacriticalMark BREVE
public static final DiacriticalMark CARON
public static final DiacriticalMark CEDILLA
public static final DiacriticalMark CIRCUMFLEX
public static final DiacriticalMark DIAERESIS
public static final DiacriticalMark DOT_ABOVE
public static final DiacriticalMark DOT_BELOW
public static final DiacriticalMark DOUBLE_ACUTE
Like ACUTE but doubled. If your environment supports Unicode, you
can see it here: ˝
public static final DiacriticalMark DOUBLE_GRAVE
Like GRAVE but doubled. If your environment supports Unicode, you
can see it here: TODO
public static final DiacriticalMark GRAVE
public static final DiacriticalMark HOOK_ABOVE
public static final DiacriticalMark HORN_ABOVE
public static final DiacriticalMark MACRON
public static final DiacriticalMark OGONEK
public static final DiacriticalMark RING_ABOVE
public static final DiacriticalMark TILDE
| Field Detail |
|---|
private final char separateCharacter
See Also:
getSeparateCharacter()
private final char combiningCharacter
See Also:
getCombiningCharacter()
private final String title
See Also:
getTitle()
private final Map<Character,Character> composeMap
See Also:
compose(char)
private final Map<Character,Character> decomposeMap
See Also:
decompose(char)
private final Collection<Character> composedCharacters
See Also:
getComposedCharacters()
| Method Detail |
|---|
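public static DiacriticalMark[] values()
Returns an array containing the constants of this enum type, in the order they are declared. This method may be used to iterate over the constants as follows:
for (DiacriticalMark c : DiacriticalMark.values())
    System.out.println(c);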
public static DiacriticalMark valueOf(String name)
Returns the enum constant of this type with the specified name.
Parameters:
name - the name of the enum constant to be returned.
Throws:
IllegalArgumentException - if this enum type has no constant with the specified name
NullPointerException - if the argument is null
protected abstract void initialize()
This method is called at construction.
protected void addComposition(char uncomposed, char composed)
This method adds the given composition pair.
Parameters:
uncomposed - is the uncomposed character.
composed - is the composed character.
public char getSeparateCharacter()
public char getCombiningCharacter()
This method gets the combining character for this DiacriticalMark.
It represents the mark itself but is TODO. Therefore Unicode allows
'ä' to be expressed as two TODO.
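To illustrate what a combining character does, the sketch below appends one to a base letter and lets the JDK compose the pair. It relies on java.text.Normalizer rather than on this enum, assumes U+0308 (COMBINING DIAERESIS) as the mark, and the class and method names are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical helper class, not part of net.sf.mmm.
final class CombiningCharacterDemo {

    /**
     * Appends the combining mark to the base letter and applies canonical composition (NFC).
     * Returns the single precomposed character, or null if NFC does not yield exactly one char.
     */
    static Character composeWith(char base, char combiningMark) {
        String sequence = new String(new char[] {base, combiningMark});
        String composed = Normalizer.normalize(sequence, Normalizer.Form.NFC);
        return composed.length() == 1 ? Character.valueOf(composed.charAt(0)) : null;
    }

    public static void main(String[] args) {
        char diaeresis = '\u0308'; // assumed combining character for the two dots on top
        System.out.println(composeWith('a', diaeresis)); // a precomposed character exists for this pair
        System.out.println(composeWith('q', diaeresis)); // no precomposed character exists for this pair
    }
}
```

Returning null when no single character results mirrors the contract documented for compose(char), although the enum itself works from its own composition table rather than from Normalizer as far as this page shows.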
public String getTitle()
This method gets the title of this datatype. (... NlsMessage).
Since Datatype.toString() is quite weak, this
method is added to explicitly express the presence of the title and to
ensure that implementors of this interface cannot forget to implement it.
Specified by:
getTitle in interface Datatype<Character>
See Also:
Datatype.toString()
public Character getValue()
This method returns the raw value of this datatype. ... java.lang datatype. In the case of a composed datatype, it
is also legal for this method to return the datatype instance itself.
Specified by:
getValue in interface Datatype<Character>
public Character compose(char character)
This method composes the given character with this
DiacriticalMark.
Parameters:
character - is the character to compose (e.g. 'a').
Returns:
null if no such composition exists in Unicode.
public Character decompose(char character)
This method de-composes the given character with this
DiacriticalMark. In other words, this DiacriticalMark is
removed from the given character if it is
composed. It is the inverse operation of
compose(char).
Parameters:
character - is the character to de-compose (e.g. 'ä' or
'á').
Returns:
null if the given character is not composed with this DiacriticalMark.
public String normalizeToAscii(char character)
This method gets the ASCII-representation of the given character composed with this
DiacriticalMark. This is similar to decompose(char) but,
e.g., the character 'e' may be appended.
Parameters:
character - is the character to normalize to ASCII (e.g. 'Ä' or
'á').
Returns:
null if the given character is not
composed with this DiacriticalMark.
See Also:
UnicodeUtil.normalize2Ascii(char)
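The relationship between addComposition, compose, decompose and normalizeToAscii can be sketched with two maps that are filled in opposite directions, matching the composeMap and decomposeMap fields listed above. This is a simplified stand-in, not the library's implementation: the class MiniMark and its asciiSuffix parameter are hypothetical, and appending "e" for the diaeresis is an assumption based on the description of normalizeToAscii.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a single mark, not part of net.sf.mmm.
final class MiniMark {

    private final Map<Character, Character> composeMap = new HashMap<>();
    private final Map<Character, Character> decomposeMap = new HashMap<>();
    private final String asciiSuffix; // assumed: text appended after the base letter, may be empty

    MiniMark(String asciiSuffix) {
        this.asciiSuffix = asciiSuffix;
    }

    /** Registers one pair so that it can be looked up in both directions. */
    void addComposition(char uncomposed, char composed) {
        composeMap.put(uncomposed, composed);
        decomposeMap.put(composed, uncomposed);
    }

    /** Returns the composed character, or null if no composition is registered. */
    Character compose(char character) {
        return composeMap.get(character);
    }

    /** Returns the base character, or null if the character is not composed with this mark. */
    Character decompose(char character) {
        return decomposeMap.get(character);
    }

    /** Returns base character plus suffix, or null if the character is not composed with this mark. */
    String normalizeToAscii(char character) {
        Character base = decomposeMap.get(character);
        return (base == null) ? null : base + asciiSuffix;
    }

    public static void main(String[] args) {
        MiniMark diaeresis = new MiniMark("e");
        diaeresis.addComposition('a', '\u00E4');
        diaeresis.addComposition('A', '\u00C4');
        System.out.println(diaeresis.compose('a'));
        System.out.println(diaeresis.decompose('\u00E4'));
        System.out.println(diaeresis.normalizeToAscii('\u00C4'));
    }
}
```

Keeping both maps in sync inside addComposition is what makes decompose the inverse of compose for every registered pair.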
protected void normalizeToAsciiRecursive(char decomposed, StringBuilder buffer, int compositionCount)
This is the internal recursive implementation of normalizeToAscii(char).
Parameters:
decomposed - is the decomposed character to normalize to ASCII.
buffer - is the StringBuilder where to ...
compositionCount - is the recursion counter used to detect infinite
loops in case of a misconfiguration.
public Collection<Character> getComposedCharacters()
This method gets a Collection with all precomposed
characters containing this mark.
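One way to picture such a collection is to scan a range of characters and keep those whose canonical decomposition contains a given combining mark. The sketch below does this with java.text.Normalizer over U+00C0 to U+024F. It is only an illustration: the enum builds its collection through addComposition as far as this page shows, and the class name, method name and scanned range are assumptions.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Collection;

// Hypothetical helper class, not part of net.sf.mmm.
final class ComposedCharacterScan {

    /** Collects characters in U+00C0..U+024F whose NFD form contains the mark after the base letter. */
    static Collection<Character> composedWith(char combiningMark) {
        Collection<Character> result = new ArrayList<>();
        for (char c = '\u00C0'; c <= '\u024F'; c++) {
            String decomposed = Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFD);
            // Index 0 is the base letter, so the mark has to appear at a later position.
            if (decomposed.length() > 1 && decomposed.indexOf(combiningMark) > 0) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // U+0308 is COMBINING DIAERESIS; the result depends on the Unicode data of the running JDK.
        System.out.println(composedWith('\u0308'));
    }
}
```

Note that this scan also returns characters that carry further marks in addition to the requested one.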
public String toString()
This method needs to return the same result as Datatype.getTitle().
Specified by:
toString in interface Datatype<Character>
Overrides:
toString in class Enum<DiacriticalMark>