Values printed on documents are often not in the desired format, or they contain unwanted characters such as spaces, commas, or colons. In such cases, it can be useful to transform the values extracted by the AI instead of using them exactly as they appear in the document.
The most common case is removing non-alphanumeric characters from values such as VAT numbers, IBANs, and account numbers.
Value Transformations is a configurable extension that helps with this process. It offers string manipulation features that use regular expressions to find and replace chosen string patterns automatically.
Advanced users can also chain transformations to cover more complex cases where a single regular expression is insufficient. However, advanced technical expertise isn’t required to use the extension.
Configuration examples for common use cases
Defining transformation rules from the parameter descriptions alone (found later in this article) can be complicated, so here are some configuration examples that can be copied and modified to make setting up the extension easier.
Please note that backslashes in regular expressions must be escaped (doubled) in the extension configuration of the Rossum UI. The example configurations below already contain escaped regular expressions.
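The effect of the doubled backslash can be checked with any JSON parser. The following is a minimal sketch in Python (chosen here only because the extension's patterns follow Python regular expression syntax); it shows what the regular expression looks like once the JSON configuration has been decoded, and that a single backslash is rejected as invalid JSON.

```python
import json
import re

# Configuration text exactly as it would be typed into the configuration field.
# The raw string (r'...') stops Python from interpreting the backslashes itself.
config_text = r'{"pattern_to_replace": "[^a-zA-Z\\d]"}'

# JSON decoding turns the doubled backslash into a single one.
pattern = json.loads(config_text)["pattern_to_replace"]
print(pattern)  # prints [^a-zA-Z\d]

# The decoded string is a valid regular expression.
compiled = re.compile(pattern)

# With a single backslash the JSON itself is invalid, because \d is not a JSON escape.
try:
    json.loads(r'{"pattern_to_replace": "[^a-zA-Z\d]"}')
    single_backslash_ok = True
except json.JSONDecodeError:
    single_backslash_ok = False
print(single_backslash_ok)  # prints False
```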
Removal of non-alphanumeric characters
The Value Transformations extension with the configuration below removes all non-alphanumeric characters from the Vendor VAT Number and IBAN fields.
Example:
- Input: DE 12345-6789
- Output: DE123456789
{
  "actions": [
    {
      "transformations": [
        {
          "pattern_to_replace": "[^a-zA-Z\\d]",
          "value_to_replace_with": "",
          "replace_if_this_pattern_matches": "[^a-zA-Z\\d]"
        }
      ],
      "source_target_mappings": [
        [
          "sender_vat_id",
          "sender_vat_id_normalized"
        ],
        [
          "iban",
          "iban_normalized"
        ]
      ]
    }
  ]
}
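To try the pattern outside of Rossum before pasting it into the configuration, the behavior described above can be approximated locally. This is only a sketch: `apply_transformation` is a hypothetical helper, not the extension's actual code, and it assumes that "matches" means the condition pattern is found anywhere in the value (the example above only produces its stated output under that reading).

```python
import re


def apply_transformation(value: str, transformation: dict) -> str:
    """Apply one transformation to a value (assumed semantics, not Rossum's code)."""
    condition = transformation["replace_if_this_pattern_matches"]
    # Assumption: the condition holds when the pattern occurs anywhere in the value.
    if re.search(condition, value) is None:
        return value
    # Replace every occurrence of the pattern with the configured replacement.
    return re.sub(
        transformation["pattern_to_replace"],
        transformation["value_to_replace_with"],
        value,
    )


# Same patterns as in the configuration above, after JSON decoding
# (one backslash instead of two).
transformation = {
    "pattern_to_replace": r"[^a-zA-Z\d]",
    "value_to_replace_with": "",
    "replace_if_this_pattern_matches": r"[^a-zA-Z\d]",
}

result = apply_transformation("DE 12345-6789", transformation)
print(result)  # prints DE123456789
```

A value that is already clean, such as DE123456789, does not satisfy the condition and is returned unchanged.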
Extracting and normalizing part of the line item description
Value Transformations with the configuration below uses two chained transformations to extract and normalize the item code from the item description.
The first transformation removes the first space character in the string and everything after it. The second one removes all hyphens from the result of the first transformation.
Notice also that there is an action condition defined in this configuration. This action will only be performed when the Vendor Name is “Lacte”. The condition is optional.
Example:
- Input: 1234-567-89 This is a line item description with the code at the beginning.
- Output: 123456789
{
  "actions": [
    {
      "transformations": [
        {
          "pattern_to_replace": " ([\\s\\S]*)$",
          "value_to_replace_with": "",
          "replace_if_this_pattern_matches": " ([\\s\\S]*)$"
        },
        {
          "pattern_to_replace": "-",
          "value_to_replace_with": "",
          "replace_if_this_pattern_matches": "-"
        }
      ],
      "action_condition": {
        "value": "Lacte",
        "schema_id": "sender_name"
      },
      "source_target_mappings": [
        [
          "item_description",
          "item_code"
        ]
      ]
    }
  ]
}
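The chaining and the action condition can be illustrated with a small local simulation. The sketch below is not the extension's implementation: `run_action` is a hypothetical helper, the document is simplified to a flat dictionary of schema IDs and values (a real line item field has one value per row), and the condition pattern is assumed to be searched for anywhere in the value.

```python
import re


def run_action(action: dict, fields: dict) -> dict:
    """Return a copy of `fields` with one action applied (hypothetical helper)."""
    result = dict(fields)
    condition = action.get("action_condition")
    # The condition is optional; when present, the field must equal the configured value.
    if condition is not None and fields.get(condition["schema_id"]) != condition["value"]:
        return result
    for source_id, target_id in action["source_target_mappings"]:
        value = fields.get(source_id, "")
        # Chaining: each transformation receives the output of the previous one.
        for t in action["transformations"]:
            # Assumption: the condition pattern may occur anywhere in the value.
            if re.search(t["replace_if_this_pattern_matches"], value):
                value = re.sub(t["pattern_to_replace"], t["value_to_replace_with"], value)
        result[target_id] = value
    return result


# The action from the configuration above, after JSON decoding.
action = {
    "transformations": [
        {"pattern_to_replace": r" ([\s\S]*)$", "value_to_replace_with": "",
         "replace_if_this_pattern_matches": r" ([\s\S]*)$"},
        {"pattern_to_replace": "-", "value_to_replace_with": "",
         "replace_if_this_pattern_matches": "-"},
    ],
    "action_condition": {"value": "Lacte", "schema_id": "sender_name"},
    "source_target_mappings": [["item_description", "item_code"]],
}

description = "1234-567-89 This is a line item description with the code at the beginning."
matching = run_action(action, {"sender_name": "Lacte", "item_description": description})
other = run_action(action, {"sender_name": "Another vendor", "item_description": description})

print(matching["item_code"])  # prints 123456789
print("item_code" in other)   # prints False
```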
Setting up the extension
Setting up the extension itself takes a few simple steps:
- Prepare your queues and schemas
- Activate Value Transformations in the Rossum Store
- Specify the queue(s) the extension is going to be used for
- Set up the actions and transformations
Step 1: Prepare your queues and schemas
The first step is identifying the queue(s) with the documents that require Value Transformations. Once that’s done, identify the schema IDs of the fields that will contain the extracted values set to be transformed by the extension.
If you are using the Dedicated Engine, make sure to create new schema fields that will store the results of the transformations (see the note below). If you are using the Generic Engine, you can configure the extension to modify the value of the field “in-place” (same source and target field).
Please note that when the Dedicated Engine is used and the extension is configured to modify the value of an extracted field, the accuracy calculated for that field will be significantly lower than the real accuracy. For this reason, modifying the values extracted by the AI and OCR, whether manually or programmatically, is not recommended when using the Dedicated Engine.
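The difference between the two setups comes down to the source_target_mappings pairs. As an illustration only, the sketch below builds both variants in Python and prints one of them as JSON; `build_action` is a hypothetical helper, and the `_normalized` suffix for the new target fields is just a naming convention assumed here, not something the extension requires.

```python
import json


def build_action(field_ids, in_place):
    """Build one action dict; helper and target naming are illustrative assumptions."""
    mappings = [
        # In-place: source and target are the same field.
        # Otherwise: write to a separate target field (hypothetical "_normalized" suffix).
        [field_id, field_id if in_place else f"{field_id}_normalized"]
        for field_id in field_ids
    ]
    return {
        "transformations": [
            {
                "pattern_to_replace": r"[^a-zA-Z\d]",
                "value_to_replace_with": "",
                "replace_if_this_pattern_matches": r"[^a-zA-Z\d]",
            }
        ],
        "source_target_mappings": mappings,
    }


# Generic Engine: the extracted field is overwritten in place.
generic_config = {"actions": [build_action(["sender_vat_id", "iban"], in_place=True)]}

# Dedicated Engine: results go to separate fields, which must exist in the schema.
dedicated_config = {"actions": [build_action(["sender_vat_id", "iban"], in_place=False)]}

# json.dumps doubles the backslashes again, so the output can be pasted as configuration.
print(json.dumps(dedicated_config, indent=2))
```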
Step 2: Activate Value Transformations in the Rossum Store
In order to activate Value Transformations, go to the Rossum application and:
- Click the Extensions button in the main menu and you will be taken to the Rossum Store.
- Once in the Rossum Store section, Value Transformations should be visible. If not, click “See all”.
- Click the “Value Transformations” extension tile.
- Click “Try extension”.

Step 3: Specify the queue(s) the extension is going to be used for
Once in the “Rossum Store extension settings”, scroll down to “Queues” and select the queue(s) that the extension should be used for.

Step 4: Set up the actions and transformations
The extension is configured through the configuration field in the UI or by using the settings attribute of the hook API object. The configuration is in JSON format (see the description of the available parameters below).

This configuration consists of a list of actions that can work with values from different fields in the schema. Each action has a set of transformations, source/target field definitions, and the condition under which the action will be performed.
If one of the examples above matches your use case, its code can be pasted here. The only change needed then is to replace the schema IDs with those of the fields whose values are to be transformed.
The full list of available parameters is shown below.
Root | Param name | Description |
---|---|---|
actions | | List of actions to be performed by the extension. Description of the action parameters is shown below. |
actions | source_target_mappings | List of source and target field schema IDs. Each pair is a small list containing two strings: the source field ID followed by the target field ID (see the configuration examples above). |
actions | transformations | List of transformations to be performed on the value of the source field. See description of the transformation parameters below. |
actions | queue_id | ID of the queue where the particular action should be performed. It is possible to assign the extension to multiple queues and specify multiple actions for different queues in one instance. This parameter is optional. If it is not present in the configuration, then the action will be performed on all the queues that the extension is assigned to. |
actions | action_condition | Definition of a condition for a particular action. This parameter is optional. If defined, the action will only be performed if the value of the field identified by schema_id equals the value parameter of the condition. |
action_condition | schema_id | Schema ID of the field used in the condition. |
action_condition | value | Value which will be compared to the value of the field defined in the schema_id parameter of the condition. This action will only be performed if the defined value and the field value match. |
transformations | pattern_to_replace | Regular expression which defines a pattern in the value to be found and replaced. See the Python regular expressions documentation for details. |
transformations | value_to_replace_with | The value which will replace all occurrences of the pattern matching the regular expression defined in the pattern_to_replace parameter. |
transformations | replace_if_this_pattern_matches | Regular expression which defines the condition for a transformation to be applied. The transformation will only be applied if the value matches the expression. See the Python regular expressions documentation for details. |
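Before pasting a configuration into the UI, it can help to check it against the parameters in the table above. The sketch below is a hypothetical local checker, not part of Rossum: `validate_config` is an invented name, and it assumes that every parameter is required except queue_id and action_condition, which the article describes as optional.

```python
import re


def validate_config(config: dict) -> list:
    """Return a list of problems found in a configuration (hypothetical checker)."""
    actions = config.get("actions")
    if not isinstance(actions, list):
        return ["'actions' must be a list"]
    errors = []
    for i, action in enumerate(actions):
        mappings = action.get("source_target_mappings")
        if not isinstance(mappings, list):
            errors.append(f"actions[{i}].source_target_mappings must be a list")
            mappings = []
        for pair in mappings:
            # Each mapping is a two-element list of schema ID strings.
            if not (isinstance(pair, list) and len(pair) == 2
                    and all(isinstance(p, str) for p in pair)):
                errors.append(f"actions[{i}]: mapping {pair!r} must be two strings")
        transformations = action.get("transformations")
        if not isinstance(transformations, list):
            errors.append(f"actions[{i}].transformations must be a list")
            transformations = []
        for j, t in enumerate(transformations):
            where = f"actions[{i}].transformations[{j}]"
            if not isinstance(t.get("value_to_replace_with"), str):
                errors.append(f"{where}.value_to_replace_with must be a string")
            for key in ("pattern_to_replace", "replace_if_this_pattern_matches"):
                if not isinstance(t.get(key), str):
                    errors.append(f"{where}.{key} must be a string")
                    continue
                try:
                    re.compile(t[key])  # Patterns use Python regex syntax.
                except re.error as exc:
                    errors.append(f"{where}.{key} is not a valid regex: {exc}")
        condition = action.get("action_condition")
        if condition is not None and not {"schema_id", "value"} <= set(condition):
            errors.append(f"actions[{i}].action_condition needs schema_id and value")
    return errors


good = {"actions": [{
    "transformations": [{"pattern_to_replace": "-", "value_to_replace_with": "",
                         "replace_if_this_pattern_matches": "-"}],
    "source_target_mappings": [["item_description", "item_code"]],
}]}
bad = {"actions": [{
    "transformations": [{"pattern_to_replace": "[", "value_to_replace_with": "",
                         "replace_if_this_pattern_matches": "-"}],
    "source_target_mappings": [["item_description"]],
    "action_condition": {"schema_id": "sender_name"},
}]}

print(validate_config(good))      # prints []
print(len(validate_config(bad)))  # prints 3
```

The checker operates on the decoded configuration, so patterns are written here with single backslashes; in the UI they still need to be doubled.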