XML Tokenize

Since Camel 2.14

The XML Tokenize language is a built-in language in camel-core. It is a truly XML-aware tokenizer that can be used with the Splitter, in the same way as the conventional Tokenizer, to tokenize XML documents. Unlike the conventional Tokenizer, XMLTokenizer recognizes XML namespaces and the hierarchical structure of the document, and it also tokenizes XML documents more efficiently.

For more details see Splitter.
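
As a rough illustration of how the language plugs into the Splitter, the following XML DSL sketch splits an incoming document on every `child` element in a namespace. The endpoint URIs (`direct:start`, `mock:result`), the prefix `C`, and the namespace `urn:c` are placeholders chosen for this example, not values taken from this page.

```xml
<camelContext xmlns="http://camel.apache.org/schema/spring"
              xmlns:C="urn:c">
  <route>
    <from uri="direct:start"/>
    <split>
      <!-- The path is resolved against the namespace bindings declared
           above (prefix C is an assumption for this sketch). -->
      <xtokenize>//C:child</xtokenize>
      <!-- Each extracted token is sent on as its own message. -->
      <to uri="mock:result"/>
    </split>
  </route>
</camelContext>
```

Because no mode is set, the default mode `i` applies, so each token should carry the namespace bindings it needs to stand alone as a fragment.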

XML Tokenizer Options

The XML Tokenize language supports 4 options, which are listed below.

Name: headerName
Java Type: String
Description: Name of header to tokenize instead of using the message body.

Name: mode
Java Type: String
Description: The extraction mode. The available extraction modes are:
  i - injecting the contextual namespace bindings into the extracted token (default)
  w - wrapping the extracted token in its ancestor context
  u - unwrapping the extracted token to its child content
  t - extracting the text content of the specified element

Name: group
Java Type: Integer
Description: To group N parts together.

Name: trim
Default: true
Java Type: Boolean
Description: Whether to trim the value to remove leading and trailing whitespace and line breaks.
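
The options above can be set as attributes on the `xtokenize` element in the XML DSL. The sketch below is only an illustration: it assumes the same placeholder endpoints and namespace (`direct:start`, `mock:result`, prefix `C` bound to `urn:c`) and picks `mode` and `group` values arbitrarily.

```xml
<camelContext xmlns="http://camel.apache.org/schema/spring"
              xmlns:C="urn:c">
  <route>
    <from uri="direct:start"/>
    <split>
      <!-- mode="w": wrap each extracted token in its ancestor context.
           group="2": group two parts together per split message.
           Both values are arbitrary choices for this sketch. -->
      <xtokenize mode="w" group="2">//C:child</xtokenize>
      <to uri="mock:result"/>
    </split>
  </route>
</camelContext>
```

With mode `w`, each resulting message should still be a document rooted at the original ancestor elements, which can be convenient when downstream processing expects the full document structure.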

Spring Boot Auto-Configuration

When using xtokenize with Spring Boot, make sure to use the following Maven dependency to enable auto-configuration:

<dependency>
  <groupId>org.apache.camel.springboot</groupId>
  <artifactId>camel-xml-jaxp-starter</artifactId>
  <version>x.x.x</version>
  <!-- use the same version as your Camel core version -->
</dependency>

The language supports 3 options, which are listed below.

Name: camel.language.xtokenize.enabled
Description: Whether to enable auto configuration of the xtokenize language. This is enabled by default.
Type: Boolean

Name: camel.language.xtokenize.mode
Description: The extraction mode. The available extraction modes are:
  i - injecting the contextual namespace bindings into the extracted token (default)
  w - wrapping the extracted token in its ancestor context
  u - unwrapping the extracted token to its child content
  t - extracting the text content of the specified element
Type: String

Name: camel.language.xtokenize.trim
Description: Whether to trim the value to remove leading and trailing whitespace and line breaks.
Default: true
Type: Boolean