Very often, values printed on documents are not in the desired format or contain unwanted characters such as spaces, commas, or colons. In these cases, it may be beneficial to transform the values extracted by the AI instead of using them exactly as they appear in the document. The most common case is removing non-alphanumeric characters from values such as the VAT number, IBAN, and account number.
Value Transformations is a configurable extension that can help with this process. It offers powerful, configurable string manipulation features that use regular expressions to replace chosen string patterns automatically.
Advanced users can also chain defined transformations to cover more complex cases when one regular expression is insufficient. However, advanced technical expertise isn’t required to use the extension.
Configuration examples for common use cases
It might be complicated to define transformation rules just by using the description of the available parameters (found later in this article). Here are some configuration examples that can be copied and modified for an easier time setting up the extension.
Please note that backslashes in regular expressions must be escaped (doubled) in the extension configuration in the Rossum UI. The example configurations below already contain escaped regular expressions.
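To see why the doubling is needed, here is a minimal sketch that parses a configuration fragment with Python's standard json module. It only illustrates how JSON string escaping works in general; it is not the extension's own code, and the variable names are made up for this example.

```python
import json

# JSON text as it would be typed into the configuration field. A raw string
# is used so that Python itself leaves the backslashes untouched.
config_text = r'{"pattern_to_replace": "[^a-zA-Z\\d]"}'

# After JSON parsing, the doubled backslash becomes the single backslash
# that the regular expression engine expects.
pattern = json.loads(config_text)["pattern_to_replace"]
print(pattern)  # [^a-zA-Z\d]

# A single backslash before "d" is not a valid JSON escape sequence,
# so a JSON parser rejects the whole document.
try:
    json.loads(r'{"pattern_to_replace": "[^a-zA-Z\d]"}')
    single_backslash_ok = True
except json.JSONDecodeError:
    single_backslash_ok = False

print(single_backslash_ok)  # False
```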
Removal of non-alphanumeric characters
The Value Transformations extension with the configuration below removes all non-alphanumeric characters from the Vendor VAT Number and IBAN fields.
Example:
- Input: DE 12345-6789
- Output: DE123456789
{
  "actions": [
    {
      "transformations": [
        {
          "pattern_to_replace": "[^a-zA-Z\\d]",
          "value_to_replace_with": "",
          "replace_if_this_pattern_matches": "[^a-zA-Z\\d]"
        }
      ],
      "source_target_mappings": [
        [
          "sender_vat_id",
          "sender_vat_id_normalized"
        ],
        [
          "iban",
          "iban_normalized"
        ]
      ]
    }
  ]
}
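The effect of this configuration can be tried out locally with Python's re module before pasting it into the UI. The sketch below is an approximation, not the extension's implementation: the sample values are invented, and it assumes that the condition pattern is searched for anywhere in the value (the example input starts with a letter, so an anchored match would never trigger).

```python
import re

PATTERN = r"[^a-zA-Z\d]"  # same regex as in the configuration, without JSON escaping

# Hypothetical extracted values, keyed by schema ID.
fields = {
    "sender_vat_id": "DE 12345-6789",
    "iban": "CZ65 0800 0000 1920 0014 5399",
}

mappings = [
    ["sender_vat_id", "sender_vat_id_normalized"],
    ["iban", "iban_normalized"],
]

for source_id, target_id in mappings:
    value = fields[source_id]
    # Assumption: replace only when the condition pattern occurs in the value.
    if re.search(PATTERN, value):
        value = re.sub(PATTERN, "", value)
    # The result goes to the target field; the source field stays unchanged.
    fields[target_id] = value

print(fields["sender_vat_id_normalized"])  # DE123456789
print(fields["iban_normalized"])           # CZ6508000000192000145399
```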
Extracting and normalizing part of the line item description
Value Transformations with the configuration below uses two chained transformations to extract and normalize the item code from the item description.
The first transformation removes the first space character and everything after it. The second one removes all hyphens from the result of the first transformation.
Notice also that there is an action condition defined in this configuration. This action will only be performed when the Vendor Name is “Lacte”. The condition is optional.
Example:
- Input: 1234-567-89 This is a line item description with the code at the beginning.
- Output: 123456789
{
  "actions": [
    {
      "transformations": [
        {
          "pattern_to_replace": " ([\\s\\S]*)$",
          "value_to_replace_with": "",
          "replace_if_this_pattern_matches": " ([\\s\\S]*)$"
        },
        {
          "pattern_to_replace": "-",
          "value_to_replace_with": "",
          "replace_if_this_pattern_matches": "-"
        }
      ],
      "action_condition": {
        "value": "Lacte",
        "schema_id": "sender_name"
      },
      "source_target_mappings": [
        [
          "item_description",
          "item_code"
        ]
      ]
    }
  ]
}
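Chaining means that each transformation receives the output of the previous one. The following sketch reproduces the two steps with Python's re module. It is a simplified illustration under stated assumptions: the function name is hypothetical, and returning None when the condition is not met is only this sketch's way of signalling that the action was skipped.

```python
import re
from typing import Optional

# (pattern_to_replace, value_to_replace_with), applied in order.
TRANSFORMATIONS = [
    (r" ([\s\S]*)$", ""),  # drop the first space and everything after it
    (r"-", ""),            # drop hyphens from what is left
]


def extract_item_code(description: str, sender_name: str) -> Optional[str]:
    """Return the normalized item code, or None if the action is skipped."""
    # action_condition: the sender_name field must equal "Lacte".
    if sender_name != "Lacte":
        return None
    value = description
    for pattern, replacement in TRANSFORMATIONS:
        # In the example configuration the condition pattern equals the pattern to replace.
        if re.search(pattern, value):
            value = re.sub(pattern, replacement, value)
    return value


text = "1234-567-89 This is a line item description with the code at the beginning."
print(extract_item_code(text, "Lacte"))         # 123456789
print(extract_item_code(text, "Other vendor"))  # None
```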
Setting up the extension
Setting up the extension itself takes a few simple steps:
- Prepare your queues and schemas
- Activate Value Transformations in the Rossum Store
- Specify the queue(s) the extension is going to be used for
- Set up the actions and transformations
Step 1: Prepare your queues and schemas
The first step is identifying the queue(s) with the documents that require Value Transformations. Then identify the schema IDs of the fields containing the extracted values set to be transformed by the extension.
If you use the Dedicated Engine, create new schema fields to store the results of the transformations (see the note below). If you use the Generic Engine, configure the extension to modify the value of the field “in place” (same source and target field).
Please note that if you use the Dedicated Engine and configure the extension to modify the value of a field in place, the calculated accuracy for that field will be significantly lower than the real accuracy. For this reason, modifying the values extracted by the AI and OCR, whether manually or programmatically, is not recommended when using the Dedicated Engine.
Step 2: Activate Value Transformations in the Rossum Store
In order to activate Value Transformations, go to the Rossum application and:
- Click the Extensions button in the main menu to open the Rossum Store.
- Once in the Rossum Store section, you will see the “Value Transformations” extension tile.
- Click on it.
- Click “Try extension.”

Step 3: Specify the queue(s) the extension is going to be used for
Once in the “Rossum Store extension settings,” scroll down to “Queues” and select the queue(s) to which you want to add the extension.

Step 4: Set up the actions and transformations
The extension is configured through the configuration field in the UI or by using the settings attribute of the hook API object. The configuration is in JSON format (see the description of the available parameters below).

This configuration consists of a list of actions that can work with values from different fields in the schema. Each action has a set of transformations, source/target field definitions, and the condition under which the action will be performed.
If one of the examples above matches your use case, you can paste its configuration as is. The only change needed is to replace the schema IDs in source_target_mappings with the IDs of the fields whose values are to be transformed.
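If you prefer to prepare the configuration in a script, the sketch below adapts the first example to other fields and prints JSON that is ready to paste. The helper name with_mappings and the schema ID account_num are hypothetical and used only for illustration. A useful side effect is that json.dumps doubles the backslashes for you.

```python
import copy
import json

# The first example from this article, written as a Python dictionary.
EXAMPLE = {
    "actions": [
        {
            "transformations": [
                {
                    "pattern_to_replace": r"[^a-zA-Z\d]",
                    "value_to_replace_with": "",
                    "replace_if_this_pattern_matches": r"[^a-zA-Z\d]",
                }
            ],
            "source_target_mappings": [
                ["sender_vat_id", "sender_vat_id_normalized"],
                ["iban", "iban_normalized"],
            ],
        }
    ]
}


def with_mappings(config, mappings):
    """Return a copy of config with every action's mappings replaced."""
    adapted_config = copy.deepcopy(config)
    for action in adapted_config["actions"]:
        action["source_target_mappings"] = [list(pair) for pair in mappings]
    return adapted_config


# Same source and target ID means the value is modified in place.
adapted = with_mappings(EXAMPLE, [("account_num", "account_num")])
print(json.dumps(adapted, indent=2))
```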
You can see the complete list of available parameters below.
Root | Param name | Description |
---|---|---|
(top level) | actions | List of actions to be performed by the extension. The action parameters are described below. |
actions | source_target_mappings | List of source and target field schema IDs. Each pair of fields is a small list containing two strings (see the configuration examples above). |
actions | transformations | List of transformations to be performed on the value of the source field. The transformation parameters are described below. |
actions | queue_id | ID of the queue where the particular action should be performed. It is possible to assign the extension to multiple queues and specify multiple actions for different queues in one instance. This parameter is optional. If it is not present in the configuration, the action will be performed on all the queues that the extension is assigned to. |
actions | action_condition | Definition of a condition for a particular action. This parameter is optional. If defined, the action will only be performed if the value of the field defined in schema_id equals the value configured in the value parameter of the condition. |
action_condition | schema_id | Schema ID of the field used in the condition. |
action_condition | value | Value that will be compared to the value of the field defined in the schema_id parameter of the condition. The action will only be performed if the defined value and the field value match. |
transformations | pattern_to_replace | Regular expression that defines the pattern to be found and replaced in the value. See Python regular expressions for details. |
transformations | value_to_replace_with | The value that will replace all occurrences of the pattern matching the regular expression defined in the pattern_to_replace parameter. |
transformations | replace_if_this_pattern_matches | Regular expression that defines the condition for a transformation to be applied. The transformation will only be applied if the value matches the expression. See Python regular expressions for details. |
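To make the interplay of these parameters concrete, here is a rough Python model of how a configuration could be evaluated. It is a sketch based only on the parameter descriptions above, not the extension's actual code. The function name apply_actions, the queue IDs, and the choice of re.search for the condition check are assumptions.

```python
import re


def apply_actions(config, fields, queue_id=None):
    """Return a copy of fields after applying every action that qualifies."""
    result = dict(fields)
    for action in config.get("actions", []):
        # queue_id is optional; when present, it limits the action to one queue.
        if "queue_id" in action and action["queue_id"] != queue_id:
            continue
        # action_condition is optional; when present, the field must equal the value.
        condition = action.get("action_condition")
        if condition and result.get(condition["schema_id"]) != condition["value"]:
            continue
        for source_id, target_id in action["source_target_mappings"]:
            value = result.get(source_id)
            if value is None:
                continue
            # Transformations are chained: each one sees the previous result.
            for step in action["transformations"]:
                if re.search(step["replace_if_this_pattern_matches"], value):
                    value = re.sub(
                        step["pattern_to_replace"],
                        step["value_to_replace_with"],
                        value,
                    )
            result[target_id] = value
    return result


# Hypothetical configuration: strip spaces from the IBAN in place, on queue 1 only.
config = {
    "actions": [
        {
            "queue_id": 1,
            "transformations": [
                {
                    "pattern_to_replace": r"\s",
                    "value_to_replace_with": "",
                    "replace_if_this_pattern_matches": r"\s",
                }
            ],
            "source_target_mappings": [["iban", "iban"]],
        }
    ]
}

print(apply_actions(config, {"iban": "CZ65 0800"}, queue_id=1))  # {'iban': 'CZ650800'}
print(apply_actions(config, {"iban": "CZ65 0800"}, queue_id=2))  # {'iban': 'CZ65 0800'}
```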