This utility class provides pre-computed BitSet instances containing allowed characters for different HTTP components according to RFC 3986 (URI) and RFC 7230 (HTTP) specifications. All character sets are optimized for high-performance validation with O(1) character lookups.
Design Principles
- RFC Compliance - Strict adherence to HTTP and URI specifications
- Performance Optimized - Pre-computed BitSets for O(1) character validation
- Thread Safety - Never modified after initialization, so concurrent reads are safe
- Memory Efficient - Shared instances reduce memory overhead
Character Set Categories
- RFC3986_UNRESERVED - Basic unreserved characters from RFC 3986
- RFC3986_PATH_CHARS - Characters allowed in URL paths
- RFC3986_QUERY_CHARS - Characters allowed in URL query parameters
- RFC7230_HEADER_CHARS - Characters allowed in HTTP headers
- HTTP_BODY_CHARS - Characters allowed in HTTP request/response bodies
Usage Examples
// Get character set for URL path validation
BitSet pathChars = CharacterValidationConstants.getCharacterSet(ValidationType.URL_PATH);
// Check if character is allowed in URL paths
char ch = '/';
boolean isAllowed = pathChars.get(ch); // Returns true
// Validate string characters
String input = "/api/users";
for (int i = 0; i < input.length(); i++) {
    char c = input.charAt(i);
    if (!pathChars.get(c)) {
        throw new IllegalArgumentException("Invalid character: " + c);
    }
}
Performance Characteristics
- O(1) character lookup time using BitSet.get()
- Minimal memory footprint - shared across all validators
- No runtime computation - all sets pre-computed during class loading
- Thread-safe concurrent access without synchronization
RFC References
- RFC 3986 - Uniform Resource Identifier (URI) character definitions
- RFC 7230 - HTTP/1.1 Message Syntax and Routing header field definitions
Security Note: These character sets define allowed characters only. Additional security validation (pattern matching, length limits, etc.) should be applied by higher-level validation stages.
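The layering described in the security note might look like the sketch below. It is illustrative only: `LayeredPathCheck`, `MAX_LENGTH`, and the locally built `ALLOWED` set are hypothetical stand-ins (the set mirrors the characters listed for RFC3986_PATH_CHARS rather than using the library constant), and the `..` segment check is just one example of a higher-level rule.

```java
import java.util.BitSet;

final class LayeredPathCheck {
    // Hypothetical limit; choose one that suits the application.
    static final int MAX_LENGTH = 2048;

    // Stand-in for RFC3986_PATH_CHARS, built locally so the sketch is self-contained.
    static final BitSet ALLOWED = new BitSet(128);
    static {
        String chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
                + "-._~" + "/@:!$&'()*+,;=";
        for (int i = 0; i < chars.length(); i++) {
            ALLOWED.set(chars.charAt(i));
        }
    }

    static boolean isAcceptable(String path) {
        // Stage 1: null and length checks.
        if (path == null || path.length() > MAX_LENGTH) {
            return false;
        }
        // Stage 2: character-set check.
        for (int i = 0; i < path.length(); i++) {
            if (!ALLOWED.get(path.charAt(i))) {
                return false;
            }
        }
        // Stage 3: a pattern rule the character set alone cannot express.
        for (String segment : path.split("/")) {
            if (segment.equals("..")) {
                return false;
            }
        }
        return true;
    }
}
```

Every character in `/api/../etc` is individually allowed, so only the third stage rejects it; this is the kind of gap the note warns about.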
Implements: Task V5 from HTTP verification specification
Since: 1.0
Field Summary
- static final BitSet HTTP_BODY_CHARS - HTTP body content characters (permissive for JSON, XML, text, etc.).
- static final BitSet RFC3986_PATH_CHARS - RFC 3986 path characters including unreserved + path-specific characters.
- static final BitSet RFC3986_QUERY_CHARS - RFC 3986 query characters including unreserved + query-specific characters.
- static final BitSet RFC3986_UNRESERVED - RFC 3986 unreserved characters: ALPHA / DIGIT / "-" / "." / "_" / "~".
- static final BitSet RFC7230_HEADER_CHARS - RFC 7230 header field characters (visible ASCII minus delimiters).
Method Summary
- static BitSet getCharacterSet(ValidationType type) - Returns the appropriate character set for the specified validation type.
Field Details
RFC3986_UNRESERVED
RFC 3986 unreserved characters: ALPHA / DIGIT / "-" / "." / "_" / "~". These are the basic safe characters allowed in URIs without percent-encoding.
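A set with this content could be assembled as in the sketch below. This is a minimal illustration, not the class's actual initializer; `UnreservedSketch` and `build()` are hypothetical names.

```java
import java.util.BitSet;

final class UnreservedSketch {
    static BitSet build() {
        BitSet set = new BitSet(128);
        // BitSet.set(from, to) excludes the upper bound, hence the + 1.
        set.set('A', 'Z' + 1); // ALPHA (upper case)
        set.set('a', 'z' + 1); // ALPHA (lower case)
        set.set('0', '9' + 1); // DIGIT
        for (char c : new char[] {'-', '.', '_', '~'}) {
            set.set(c);
        }
        return set;
    }
}
```

The result holds 66 characters (52 letters, 10 digits, 4 marks). Anything else, including `%` and `/`, is absent from the set.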
RFC3986_PATH_CHARS
RFC 3986 path characters including unreserved + path-specific characters. Includes all unreserved characters plus: / @ : ! $ & ' ( ) * + , ; =
RFC3986_QUERY_CHARS
RFC 3986 query characters including unreserved + query-specific characters. Includes all unreserved characters plus: ? & = ! $ ' ( ) * + , ;
RFC7230_HEADER_CHARS
RFC 7230 header field characters (visible ASCII minus delimiters). Includes space through tilde (32-126) plus tab character.
HTTP_BODY_CHARS
HTTP body content characters (permissive for JSON, XML, text, etc.). Includes printable ASCII (32-126), tab, LF, CR, and extended ASCII (128-255).
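The ranges listed for HTTP_BODY_CHARS map naturally onto BitSet range operations. The sketch below shows one way such a set might be built; `BodyCharsSketch` and `build()` are hypothetical names, and the library's own initialization may differ.

```java
import java.util.BitSet;

final class BodyCharsSketch {
    static BitSet build() {
        BitSet set = new BitSet(256);
        set.set(32, 127);  // printable ASCII 32-126 (upper bound is exclusive)
        set.set('\t');     // tab
        set.set('\n');     // LF
        set.set('\r');     // CR
        set.set(128, 256); // extended range 128-255
        return set;
    }
}
```

With these ranges, DEL (127) and the remaining control characters are excluded, and so is any `char` above 255: `BitSet.get` simply returns false for an index that was never set.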
Method Details
getCharacterSet
Returns the appropriate character set for the specified validation type. This method provides a centralized mapping from validation types to their corresponding RFC-compliant character sets. The returned BitSet is the actual instance (not a copy) for performance reasons and must not be modified.
Validation Type Mappings:
- URL_PATH → RFC3986_PATH_CHARS
- PARAMETER_NAME, PARAMETER_VALUE → RFC3986_QUERY_CHARS
- HEADER_NAME, HEADER_VALUE → RFC7230_HEADER_CHARS
- BODY → HTTP_BODY_CHARS
- COOKIE_NAME, COOKIE_VALUE → RFC3986_UNRESERVED
- Parameters: type - The validation type specifying which HTTP component is being validated
- Returns: The corresponding BitSet containing allowed characters for the validation type
- Throws: NullPointerException - if type is null
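To make the documented behavior concrete, here is a sketch of how a lookup of this shape could work. It is not the library's source: the nested `ValidationType` enum is a stand-in limited to the constants named in the mappings, the BitSet fields are empty placeholders, and `modifiableCopy` is a hypothetical helper for callers who need a set they are free to change.

```java
import java.util.BitSet;
import java.util.Objects;

final class CharacterSetLookupSketch {
    // Stand-in for the library's ValidationType (assumed constants only).
    enum ValidationType {
        URL_PATH, PARAMETER_NAME, PARAMETER_VALUE,
        HEADER_NAME, HEADER_VALUE, BODY, COOKIE_NAME, COOKIE_VALUE
    }

    // Empty placeholders; the real constants hold the characters described in Field Details.
    static final BitSet PATH = new BitSet();
    static final BitSet QUERY = new BitSet();
    static final BitSet HEADER = new BitSet();
    static final BitSet BODY_SET = new BitSet();
    static final BitSet UNRESERVED = new BitSet();

    static BitSet getCharacterSet(ValidationType type) {
        Objects.requireNonNull(type, "type"); // NullPointerException if type is null
        switch (type) {
            case URL_PATH:
                return PATH;
            case PARAMETER_NAME:
            case PARAMETER_VALUE:
                return QUERY;
            case HEADER_NAME:
            case HEADER_VALUE:
                return HEADER;
            case BODY:
                return BODY_SET;
            case COOKIE_NAME:
            case COOKIE_VALUE:
                return UNRESERVED;
            default:
                throw new IllegalArgumentException("Unmapped type: " + type);
        }
    }

    // The shared instance must not be modified, so copy it before changing anything.
    static BitSet modifiableCopy(ValidationType type) {
        return (BitSet) getCharacterSet(type).clone();
    }
}
```

Because the method hands back the shared instance, a caller that wants to add or remove characters should work on a clone, as `modifiableCopy` does, and leave the original untouched.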