Tokenize Language
Available as of Camel version 2.0
The tokenizer language is a built-in language in camel-core. It is
most often used with the Splitter EIP to split
a message using a token-based strategy.
The tokenizer language is intended to tokenize text documents using a
specified delimiter pattern. It can also tokenize XML
documents, but only with limited capability. For truly XML-aware
tokenization, the XMLTokenizer
language is recommended, as it offers faster, more efficient
tokenization specifically for XML documents. For more details,
see Splitter.
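To make the token-based strategy concrete, here is a minimal plain-Java sketch of what splitting on a delimiter produces. It does not use Camel: `TokenSplitSketch` and its `tokenize` helper are hypothetical names for illustration only, and the handling of empty parts follows `String.split`, not anything Camel guarantees. The `regex` flag mirrors the option of the same name listed below.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: a plain-Java stand-in, not Camel's implementation.
class TokenSplitSketch {

    /** Splits text on a literal token, or on a regular expression when regex is true. */
    static List<String> tokenize(String text, String token, boolean regex) {
        // Quote the token so characters such as '.' or '|' are matched literally.
        String pattern = regex ? token : Pattern.quote(token);
        return Arrays.asList(text.split(pattern));
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a,b,c", ",", false));    // [a, b, c]
        System.out.println(tokenize("a1b22c", "\\d+", true)); // [a, b, c]
    }
}
```

In a route, each resulting part would typically become its own message when the expression is handed to the Splitter.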
Tokenize Options
The Tokenize language supports 10 options, which are listed below.
| Name | Default | Java Type | Description |
|---|---|---|---|
| token | | | The (start) token to use as tokenizer, for example the new line token. You can use simple language as the token to support dynamic tokens. |
| endToken | | | The end token to use as tokenizer if using start/end token pairs. You can use simple language as the token to support dynamic tokens. |
| inheritNamespaceTagName | | | To inherit namespaces from a root/parent tag name when using XML. You can use simple language as the tag name to support dynamic names. |
| headerName | | | Name of header to tokenize instead of using the message body. |
| regex | false | | If the token is a regular expression pattern. The default value is false. |
| xml | | | Whether the input is XML messages. This option must be set to true if working with XML payloads. |
| includeTokens | false | | Whether to include the tokens in the parts when using pairs. The default value is false. |
| group | | | To group N parts together, for example to split big files into chunks of 1000 lines. You can use simple language as the group to support dynamic group sizes. |
| skipFirst | | | To skip the very first element. |
| trim | | | Whether to trim the value to remove leading and trailing whitespace and line breaks. |
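
The interaction of `group`, `skipFirst`, `trim`, `endToken` and `includeTokens` can be easier to follow with a small example. The following is a plain-Java sketch of the behaviour the descriptions above suggest. It is not Camel code: `TokenizeOptionsSketch`, `group` and `tokenizePair` are hypothetical helpers, and rejoining grouped parts with the token is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: mimics the documented options, not Camel's implementation.
class TokenizeOptionsSketch {

    /** Applies skipFirst and trim, then joins every n parts into one chunk. */
    static List<String> group(List<String> parts, int n, String token,
                              boolean skipFirst, boolean trim) {
        List<String> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (int i = skipFirst ? 1 : 0; i < parts.size(); i++) {
            String part = parts.get(i);
            current.add(trim ? part.trim() : part);
            if (current.size() == n) {
                // Assumption: grouped parts are rejoined with the token.
                chunks.add(String.join(token, current));
                current.clear();
            }
        }
        if (!current.isEmpty()) {
            chunks.add(String.join(token, current)); // last, possibly shorter chunk
        }
        return chunks;
    }

    /** Extracts the text between each start/end token pair. */
    static List<String> tokenizePair(String text, String start, String end,
                                     boolean includeTokens) {
        List<String> parts = new ArrayList<>();
        int from = 0;
        while (true) {
            int s = text.indexOf(start, from);
            if (s < 0) break;
            int e = text.indexOf(end, s + start.length());
            if (e < 0) break;
            parts.add(includeTokens
                    ? text.substring(s, e + end.length())
                    : text.substring(s + start.length(), e));
            from = e + end.length();
        }
        return parts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("id,name", "1,Ann ", "2,Bob", "3,Cy");
        // Skips the header line, trims each part, groups two lines per chunk.
        System.out.println(group(lines, 2, "\n", true, true).size()); // 2
        System.out.println(tokenizePair("[a][b]", "[", "]", false));  // [a, b]
        System.out.println(tokenizePair("[a][b]", "[", "]", true));   // [[a], [b]]
    }
}
```

With `skipFirst` enabled the header line is dropped before grouping, so the first chunk starts at the first data line.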