net.sf.sfac.string
Class StringUtils

java.lang.Object
  extended by net.sf.sfac.string.StringUtils

public abstract class StringUtils
extends Object

String manipulation/comparison utility class.

Author:
Olivier Berlanger

Constructor Summary
StringUtils()
           
 
Method Summary
static boolean areEquals(CharIterator it1, CharIterator it2, boolean normalized)
          Check if the content of the two iterators is the same.
static String firstOfWordsUpperCase(String src)
          Transform the string to have the first character of each word in uppercase.
static String firstOfWordsUpperCase(String src, boolean othersToLowercase)
          Transform the string to have the first character of each word in uppercase.
static String firstToLowerCase(String src)
           
static String firstToUpperCase(String src)
           
static String getEncodedString(String src)
          Encode a string to avoid spaces and non-alphanumeric characters.
static String[] getNormalizedKeywords(String keywordString)
           
static String getNormalizedString(String src)
          Normalize a string.
static char getUppercaseChar(char ch)
          Get the uppercase char corresponding to the given character with removed diacritic mark.
static boolean matchKeywords(String keywords, boolean matchAll, CharIterator src)
          Check if all/any of the given keywords are contained in the iterator.
static boolean matchNormalizedKeywords(String[] keywords, boolean matchAll, CharIterator src)
          Check if all/any of the given keywords are contained in the iterator.
static boolean matchPattern(String pattern, CharIterator src)
          Check if the given pattern is contained in the iterator.
static boolean matchString(String pattern, CharIterator src, boolean ignoreCase)
          Check if the string is contained in the iterator.
static char removeDiacritic(char ch)
          Get the equivalent char with removed diacritic marks (like accents, cedillas, dots, tildes ...).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StringUtils

public StringUtils()
Method Detail

matchString

public static boolean matchString(String pattern,
                                  CharIterator src,
                                  boolean ignoreCase)
Check if the string is contained in the iterator. The match will be strict (including whitespace and non-letter chars).

Parameters:
pattern - the pattern to find in the char iterator.
src - A CharIterator on the text to search.
ignoreCase - true to ignore character case when comparing.
Returns:
true iff the pattern was found in the char iterator.
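
The strict containment check described above can be sketched as follows. This is a minimal illustration and not the library's implementation: the CharIterator API is not shown on this page, so the sketch assumes a plain CharSequence as a stand-in, and the class and method names (MatchStringSketch, containsStrict) are hypothetical.

```java
class MatchStringSketch {

    /**
     * Returns true if pattern occurs somewhere in text. Every character
     * counts, including whitespace and punctuation; only case may be ignored.
     * (Hypothetical helper: a CharSequence stands in for CharIterator.)
     */
    static boolean containsStrict(String pattern, CharSequence text, boolean ignoreCase) {
        String haystack = text.toString();
        int lastStart = haystack.length() - pattern.length();
        for (int start = 0; start <= lastStart; start++) {
            // regionMatches performs the optional case-insensitive comparison.
            if (haystack.regionMatches(ignoreCase, start, pattern, 0, pattern.length())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsStrict("lo wo", "Hello world", false)); // true
        System.out.println(containsStrict("HELLO", "Hello world", false)); // false
        System.out.println(containsStrict("HELLO", "Hello world", true));  // true
    }
}
```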

matchPattern

public static boolean matchPattern(String pattern,
                                   CharIterator src)
Check if the given pattern is contained in the iterator. The pattern will be normalized (as the CharIterator content) before comparison.

Parameters:
pattern - the pattern to find in the char iterator.
src - A CharIterator on the text to search.
Returns:
true iff the pattern was found in the char iterator.

areEquals

public static boolean areEquals(CharIterator it1,
                                CharIterator it2,
                                boolean normalized)
Check if the content of the two iterators is the same. If normalized is true, the content of both iterators will be normalized and trimmed before comparison.

Parameters:
it1 - first char iterator.
it2 - second char iterator.
normalized - true if the content should be normalized for comparison.
Returns:
true iff the content of the two iterators is the same.

matchKeywords

public static boolean matchKeywords(String keywords,
                                    boolean matchAll,
                                    CharIterator src)
Check if all/any of the given keywords are contained in the iterator. The keywords will be normalized (as the CharIterator content) and tokenized before comparison.

Parameters:
keywords - String containing the list of keywords to compare.
matchAll - true if all the keywords have to be matched, false if matching any one of them is enough.
src - A CharIterator on the text to search.
Returns:
true iff all/any of the given keywords were found in the char iterator.
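
The all/any keyword logic can be sketched as below. This is an illustration under stated assumptions, not the library's code: normalization is assumed to mean "strip combining marks and uppercase", keywords are assumed to be separated by whitespace, a CharSequence stands in for CharIterator, and the names KeywordMatchSketch, normalize and matchesKeywords are hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;

class KeywordMatchSketch {

    // Assumed normalization: decompose, drop combining marks, uppercase.
    static String normalize(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").toUpperCase(Locale.ROOT);
    }

    static boolean matchesKeywords(String keywords, boolean matchAll, CharSequence text) {
        String haystack = normalize(text.toString());
        String trimmed = keywords.trim();
        if (trimmed.isEmpty()) {
            return matchAll; // assumption: "all of nothing" holds, "any of nothing" does not
        }
        for (String keyword : trimmed.split("\\s+")) {
            boolean found = haystack.contains(normalize(keyword));
            if (found && !matchAll) {
                return true;  // "any" mode: a single hit is enough
            }
            if (!found && matchAll) {
                return false; // "all" mode: a single miss fails
            }
        }
        // Reached only if every keyword was found ("all") or none was ("any").
        return matchAll;
    }

    public static void main(String[] args) {
        String text = "Un caf\u00e9 \u00e0 Paris"; // "Un café à Paris"
        System.out.println(matchesKeywords("cafe paris", true, text));   // true
        System.out.println(matchesKeywords("cafe london", true, text));  // false
        System.out.println(matchesKeywords("cafe london", false, text)); // true
    }
}
```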

getNormalizedKeywords

public static String[] getNormalizedKeywords(String keywordString)

matchNormalizedKeywords

public static boolean matchNormalizedKeywords(String[] keywords,
                                              boolean matchAll,
                                              CharIterator src)
Check if all/any of the given keywords are contained in the iterator. As the method name suggests, the keywords are expected to be already normalized and tokenized (for example by getNormalizedKeywords); the CharIterator content is normalized before comparison.

Parameters:
keywords - array of keywords to compare.
matchAll - true if all the keywords have to be matched, false if matching any one of them is enough.
src - A CharIterator on the text to search.
Returns:
true iff all/any of the given keywords were found in the char iterator.

getNormalizedString

public static String getNormalizedString(String src)
Normalize a string.
The result will be:

Parameters:
src - Source string
Returns:
normalized string.

removeDiacritic

public static final char removeDiacritic(char ch)
Get the equivalent char with removed diacritic marks (like accents, cedillas, dots, tildes ...).
The character case will be preserved. If the given char has no diacritic mark, it will be returned unchanged. The characters taken into account by this method are in the range 0000-024F, i.e. the Unicode blocks "Basic Latin", "Latin-1 Supplement", "Latin Extended-A" and "Latin Extended-B" (although all accented chars of those blocks are between 00C0 and 021F). All other chars will be returned unchanged.

Parameters:
ch - the possibly accented char to convert.
Returns:
the corresponding non-accented char.
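
A comparable effect can be sketched with the JDK's java.text.Normalizer. This is not how the library is implemented (the page does not say), and the names DiacriticSketch and stripDiacritic are hypothetical. Note that canonical decomposition does not cover letters such as 'ø', which have no decomposed form, so the sketch returns them unchanged.

```java
import java.text.Normalizer;

class DiacriticSketch {

    /** Returns the base letter of an accented Latin char, preserving case. */
    static char stripDiacritic(char ch) {
        // Only the accented part of the range named above is examined.
        if (ch < '\u00C0' || ch > '\u024F') {
            return ch;
        }
        // NFD splits e.g. U+00E9 into 'e' followed by a combining acute accent.
        String decomposed = Normalizer.normalize(String.valueOf(ch), Normalizer.Form.NFD);
        return decomposed.length() > 1 ? decomposed.charAt(0) : ch;
    }

    public static void main(String[] args) {
        System.out.println(stripDiacritic('\u00e9')); // e  (from U+00E9, e with acute)
        System.out.println(stripDiacritic('\u00c7')); // C  (from U+00C7, C with cedilla)
        System.out.println(stripDiacritic('x'));      // x  (unchanged)
    }
}
```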

getUppercaseChar

public static final char getUppercaseChar(char ch)
Get the uppercase char corresponding to the given character with removed diacritic mark.
So this method will transform 'à' to 'A', 'é' to 'E' ... while the default Character.toUpperCase implementation transforms 'à' to 'À', 'é' to 'É' ...

Parameters:
ch - the character.
Returns:
Corresponding uppercase character with any diacritic mark removed.
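
The difference from Character.toUpperCase can be illustrated with a small sketch. It assumes java.text.Normalizer for removing the diacritic, which may differ from what the library does internally; UppercaseCharSketch and upperWithoutDiacritic are hypothetical names.

```java
import java.text.Normalizer;

class UppercaseCharSketch {

    /** Uppercases ch after dropping any diacritic mark (Latin range only). */
    static char upperWithoutDiacritic(char ch) {
        char base = ch;
        if (ch >= '\u00C0' && ch <= '\u024F') {
            // NFD puts the base letter first, followed by combining marks.
            String decomposed = Normalizer.normalize(String.valueOf(ch), Normalizer.Form.NFD);
            if (decomposed.length() > 1) {
                base = decomposed.charAt(0);
            }
        }
        return Character.toUpperCase(base);
    }

    public static void main(String[] args) {
        char eAcute = '\u00e9'; // e with acute accent
        // The JDK method keeps the accent (result is U+00C9).
        System.out.println(Character.toUpperCase(eAcute) == '\u00c9'); // true
        // The sketch drops it.
        System.out.println(upperWithoutDiacritic(eAcute));             // E
    }
}
```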

firstToUpperCase

public static final String firstToUpperCase(String src)

firstToLowerCase

public static final String firstToLowerCase(String src)

firstOfWordsUpperCase

public static final String firstOfWordsUpperCase(String src)
Transform the string to have the first character of each word in uppercase.
Note that this method only converts the first character of each word from lowercase to uppercase; characters that are not the first of a word are left as-is.

Parameters:
src - source string
Returns:
transformed string.

firstOfWordsUpperCase

public static final String firstOfWordsUpperCase(String src,
                                                 boolean othersToLowercase)
Transform the string to have the first character of each word in uppercase.

Parameters:
src - source string
othersToLowercase - if true, the characters that are not the first of a word are forced to lowercase, otherwise they are left unchanged.
Returns:
transformed string.
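
The effect of the othersToLowercase flag can be sketched as follows. The page does not define what counts as a word, so this sketch assumes that a word starts at any letter or digit following a character that is neither; CapitalizeWordsSketch and capitalizeWords are hypothetical names, not part of the library.

```java
class CapitalizeWordsSketch {

    static String capitalizeWords(String src, boolean othersToLowercase) {
        StringBuilder out = new StringBuilder(src.length());
        boolean atWordStart = true;
        for (int i = 0; i < src.length(); i++) {
            char ch = src.charAt(i);
            if (!Character.isLetterOrDigit(ch)) {
                out.append(ch);
                atWordStart = true;  // assumed word boundary
            } else if (atWordStart) {
                out.append(Character.toUpperCase(ch));
                atWordStart = false;
            } else {
                // Inside a word: force lowercase only when requested.
                out.append(othersToLowercase ? Character.toLowerCase(ch) : ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalizeWords("hello wORLD", false)); // Hello WORLD
        System.out.println(capitalizeWords("hello wORLD", true));  // Hello World
    }
}
```

With the flag set to false, the sketch matches the single-argument behaviour described above, where only word-initial characters change.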

getEncodedString

public static final String getEncodedString(String src)
Encode a string to avoid spaces and non-alphanumeric characters. It's used to generate file names supported on all platforms.
Examples:

Parameters:
src - the source string
Returns:
the string encoded.


Copyright © 2012. All Rights Reserved.