From this article you will learn:
- What is the Duplicate Handling extension
- What is the common use case configuration
- How to set up the Duplicate Handling extension
What is the Duplicate Handling extension?
Organizations deal with a large number of incoming documents on a daily basis. Sometimes, these documents can be duplicated, leading to inefficiencies and a waste of resources.
For instance, multiple departments might process the same document, which creates redundant work. In some cases, duplicates can also cause errors in data processing or decision-making.
To address this problem, Rossum provides a configurable function that detects incoming duplicates. You can base the detection on the following:
- image hash
- extracted field values
- document attributes
Once the extension detects a duplicate, it can take actions such as:
- fill field values,
- forward the duplicate annotation to a different queue/status,
- mark the annotation as duplicate,
- show a message with custom text on the document,
- stop automation.
Please keep in mind that duplicates detected by the extension are included in the billing.
What is the common use case configuration?
Below is a basic configuration example you can copy and modify for an easier setup.
Detecting duplicate documents based on field values
Out of the box, Rossum detects duplicates by comparing the image hash of an incoming document with those of previously received documents. Sometimes, however, you may receive the same document scanned differently, so the image hash no longer matches. In such cases, you need to compare the documents based on their extracted values.
You can achieve this with the Duplicate Handling function. The following configuration detects incoming duplicates whose invoice_id and sender_name fields match those of already processed documents in the same queue. (You can also configure the detection scope to detect duplicates across queues or workspaces.) If such a document is detected, Rossum shows an error message on the document.
{
  "configurations": [
    {
      "logic": [
        {
          "rules": [
            {
              "id": 1,
              "attribute": "field",
              "field_schema_id": "invoice_id"
            },
            {
              "id": 2,
              "attribute": "field",
              "field_schema_id": "sender_name"
            }
          ],
          "scope": {
            "object": "queue"
          },
          "matching_flow": ["1and2"],
          "actions": [
            {
              "type": "show_message",
              "message_type": "error",
              "message": "Duplicates detected: %ANNOTATION_ID%"
            }
          ]
        }
      ]
    }
  ]
}
For detailed documentation regarding the configuration JSON, please see this interactive guide.
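To make the matching rule in the configuration above more concrete, here is a minimal Python sketch of the idea: an incoming document counts as a duplicate when both invoice_id and sender_name equal those of an already processed document in the same queue. This is only an illustration, not Rossum's implementation. The record layout, the find_duplicates helper, the sample values, and the use of exact string equality are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical in-memory records standing in for annotations in a queue.
Document = Dict[str, str]

# The two fields referenced by rules 1 and 2 in the configuration.
MATCH_FIELDS = ("invoice_id", "sender_name")


def find_duplicates(incoming: Document, processed: List[Document]) -> List[str]:
    """Return annotation IDs of processed documents in the same queue whose
    invoice_id AND sender_name both equal those of the incoming document.

    Assumption: values are compared with exact equality, no normalization.
    """
    return [
        doc["annotation_id"]
        for doc in processed
        if doc["queue"] == incoming["queue"]
        and all(doc.get(field) == incoming.get(field) for field in MATCH_FIELDS)
    ]


# Made-up sample data for illustration only.
processed = [
    {"annotation_id": "101", "queue": "invoices", "invoice_id": "INV-7", "sender_name": "Acme"},
    {"annotation_id": "102", "queue": "invoices", "invoice_id": "INV-7", "sender_name": "Globex"},
    {"annotation_id": "103", "queue": "receipts", "invoice_id": "INV-7", "sender_name": "Acme"},
]
incoming = {"annotation_id": "104", "queue": "invoices", "invoice_id": "INV-7", "sender_name": "Acme"}

duplicates = find_duplicates(incoming, processed)
if duplicates:
    # Mirrors the "show_message" action with an error-type message.
    print("Duplicates detected: " + ", ".join(duplicates))
```

In this sample, annotation 102 does not match because only one of the two fields agrees, and annotation 103 does not match because it sits in a different queue, which corresponds to the "queue" scope in the configuration.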
How to set up the Duplicate Handling extension
Setting up the extension itself takes a few simple steps.
Step 1: Activate Duplicate Handling in the Rossum Store
To enable Duplicate Handling, go to the Rossum application and:
- Click on the Extensions tab at the top of the app.
- Click on the Rossum Store option to display all the available extensions.
- Select the “Duplicate Handling” extension tile.
- Click “Try extension”.
Step 2: Specify to which queue(s) you want to add this extension
In the “Rossum Store Extension Settings,” scroll down to “Queues” and select the queue(s) in which you want to use the function. Please remember to save your changes once you’ve chosen the desired queues.

Step 3: Set up the configuration
You can set it up using the configuration field in the UI or using the settings attribute of the hook API object. The configuration is in JSON format (see the interactive documentation of the available parameters here).
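If you prefer the API route, the sketch below shows one way to prepare such an update in Python: it wraps the configuration from this article in a "settings" attribute and builds a PATCH request for the hook object without sending it. The base URL, hook ID, token, the Bearer authorization header, and the build_settings_request helper are assumptions made for illustration, so check them against the Rossum API documentation for your account before use.

```python
import json
import urllib.request

# The configuration JSON from the common use case above.
settings = {
    "configurations": [{
        "logic": [{
            "rules": [
                {"id": 1, "attribute": "field", "field_schema_id": "invoice_id"},
                {"id": 2, "attribute": "field", "field_schema_id": "sender_name"},
            ],
            "scope": {"object": "queue"},
            "matching_flow": ["1and2"],
            "actions": [{
                "type": "show_message",
                "message_type": "error",
                "message": "Duplicates detected: %ANNOTATION_ID%",
            }],
        }]
    }]
}


def build_settings_request(base_url: str, hook_id: int, token: str,
                           settings: dict) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that sets the hook's settings.

    Assumptions: the hook lives at <base_url>/hooks/<hook_id> and the API
    accepts a Bearer token; both are placeholders, not confirmed here.
    """
    body = json.dumps({"settings": settings}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/hooks/{hook_id}",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Placeholder values; replace them with your own domain, hook ID, and token.
request = build_settings_request(
    "https://example.rossum.app/api/v1", 12345, "YOUR_TOKEN", settings
)
print(request.get_method(), request.full_url)
# Sending is left out on purpose; urllib.request.urlopen(request) would perform the call.
```

Keeping the request construction separate from sending makes it easy to inspect the payload first, which is useful because a malformed configuration would otherwise only surface once the extension runs.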
