Customizing data extraction

Generic Engine: Certificates of Analysis

Certificates of analysis are quality control documents that are common in the food and beverage industry. They confirm that the received food products comply with the desired parameters and targets. Rossum’s Generic Engine has been pre-trained to recognize and process the fields in Certificates of Analysis. The tables below contain the full list of […]

Generic Engine: Accounts Payable and Receivable

The Accounts Payable and Receivable AI engine is pre-trained on Rossum’s general dataset of semi-structured documents. Rossum’s generic AI engine has been specifically trained to recognize invoices and purchase orders. Supported Fields: Rossum’s Accounts Payable and Receivable AI Engine has been pre-trained to recognize data fields in invoices and purchase orders. You can use this engine to automate your AP and AR […]

Generic Engine: Chinese Invoices

Rossum’s Chinese Invoices engine has been pre-trained to recognize data fields in Governmental Tax Invoices from Mainland China (fapiaos). Now you can achieve the fastest Chinese invoice (fapiao) processing time and go from days to hours thanks to data extraction and data learning from your incoming Chinese invoices. Avoid misalignments, errors in balancing and reconciling […]

How to Use the Distributive Webhook

What is the Distributive Webhook? The Distributive Webhook is an extension that enables you to capture headings or other data within tables. When a value sits above or below some rows in the line item table, it cannot be captured using a simple point-and-click approach. To distribute these values to rows below or above, […]
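The idea of distributing a heading value to the rows around it can be illustrated with a minimal sketch. This is not Rossum’s extension code or API: the function name `distribute`, the row layout (a list of dictionaries), and the `project` column are all hypothetical and chosen only for illustration.

```python
def distribute(rows: list[dict], key: str, direction: str = "down") -> list[dict]:
    """Copy each non-empty `key` value onto neighbouring rows that lack one.

    Hypothetical helper: "down" fills rows below a value, "up" fills rows above it.
    """
    if direction not in ("down", "up"):
        raise ValueError("direction must be 'down' or 'up'")

    # Walking the rows in reverse turns "fill upwards" into "fill downwards".
    ordered = rows if direction == "down" else list(reversed(rows))

    current = ""  # last value seen so far; empty until the first heading appears
    filled = []
    for row in ordered:
        value = row.get(key)
        if value:
            current = value  # a row with its own value resets what is distributed
        filled.append({**row, key: value or current})

    return filled if direction == "down" else list(reversed(filled))


# Hypothetical line items: the "project" heading appears only on some rows.
rows = [
    {"description": "Widget A", "project": "Project X"},
    {"description": "Widget B", "project": ""},
    {"description": "Gadget C", "project": "Project Y"},
    {"description": "Gadget D", "project": ""},
]

for row in distribute(rows, "project"):
    print(row["description"], "|", row["project"])
```

The input rows are left unmodified; the function returns new dictionaries, which makes it easy to compare the table before and after distribution.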

Configuring Fields for Data Extraction

Each Queue defines the structure of Data fields that Rossum extracts. When editing this structure, you have two options: (1) Use pre-trained Data fields – Rossum’s Generic AI engine has been pre-trained to recognize specific Data fields and enables you to start extracting data without any additional training for the AI. (2) Define Custom data fields and train the […]

How to Edit an Extraction Schema

Preparing an extraction schema is one of the most important tasks you need to do when configuring Rossum. You have a lot to define, including the names and formats of data fields you want to extract, necessary value constraints, and, if you’re using them, enum options. In order to correctly configure the data fields, go […]

Capturing a New Data Field in Rossum

Rossum’s AI Engine can capture a predefined set of data fields from the very beginning. You will see those fields when you create a new Rossum account or a new queue from a given regional extraction schema template. However, you will often need to create a new custom data field that […]

What Is the Rossum AI Engine?

The Rossum AI Engine is the brain that runs the Rossum application. It is responsible for reading and extracting data from each document that Rossum receives. Rossum’s AI Engine learns to identify and recognize information in Accounts Payable and Receivable documents holistically. This allows the engine to make generalized decisions based on hundreds of thousands […]

Data Fields Extracted by the Generic AI Engine

Rossum’s Generic AI Engine has been pre-trained to recognize data fields in invoices and purchase orders. You can use this engine to automate your AP process without any complex training procedures. The engine also supports most standard invoicing fields. The tables below contain the full list of standard data fields the Generic AI Engine can […]

Do You Need Rossum’s Generic or Dedicated AI Engine?

Rossum has two types of AI Engine: Generic AI Engine and Dedicated AI Engine. Rossum has several types of Generic Engines that have learned to process a specific document type. These include the Accounting Engine (Accounts Payable and Receivable Engine), the Chinese Invoice Generic Engine, and the Certificate of Analysis Generic Engine. Generic Engines learn to recognize fields from […]

Capturing Custom Table Data in Rossum

A basic element in the extraction schema is the data field. However, Rossum also enables the capture of more complex structures such as tables. At the moment, Rossum’s AI engine automatically extracts two types of tables: Line items and Tax details. Both come with a variety of predefined sets of extracted data fields […]

Capturing fields with multiple values in Rossum

A basic data field in the Rossum extraction schema can be used to extract a single value, e.g. an Invoice Number. However, in some cases you might need to extract multiple values under one field. For such cases, the Rossum schema also supports data fields with multiple values: multivalue fields. What is a multivalue […]

How Dedicated AI Engine Training Works

Note: If you want to use the Rossum dedicated AI engine, make sure to purchase the feature before you start the training process. After purchasing the dedicated AI engine, the training process will consist of the following steps: Set up your extraction schema. Dedicated AI engine training requires that a special schema be set up. Rossum […]

Continuous Engine Updates

After your dedicated AI engine has been trained for the first time, you will notice a steady improvement in data extraction accuracy. As you continue using Rossum, the platform uses the data you review and validate in the validation screen to retrain the engine and increase its precision. These continuous engine updates retrain your dedicated […]

Preparing Documents for Dedicated AI Engine Training

All documents that are to be used for dedicated AI engine training must pass through a verification process, which is carried out through the Rossum validation screen. After Rossum processes an uploaded document, a data entry operator opens the validation screen and reviews the captured data. The operator can correct errors by pointing and clicking […]

Annotations Guide, Part 1: What to Keep in Mind

In addition to using the Rossum validation screen for the purpose of dedicated AI engine training, there are three general rules that you must follow throughout the annotation process: Keep the annotations consistent: consistency is the main pillar of a successful training process. If data is present in multiple locations in an invoice, always annotate […]

Annotations Guide, Part 2: Basic Rules

Throughout the entire training period, you must follow the rules of the annotation process. Below are some basic rules that will help you increase the accuracy of your dedicated AI engine: Provide at least 500 documents. This is usually the minimum to achieve satisfactory accuracy. However, you may need more documents if you want to […]

Annotations Guide, Part 3: Practical Examples

Below, you can find some additional instructions and practical examples that may come in handy during your annotation process and help you increase the extraction accuracy of your Dedicated AI Engine. 1. Annotate only the values, not the labels. 2. Move the bounding box a little. If the bounding box crosses right through a word, […]

Dedicated AI Engine Training: Annotation Best Practices

These guidelines are deprecated. Refer to our new annotation guides series instead. If you have purchased the dedicated AI engine functionality, Rossum will automatically train its AI to suit your specific needs, such as custom fields or specific document types. The training process runs in the background while you are using the validation interface; however, it’s […]

How Rossum Data Extraction Billing Works

Rossum’s pricing has a complex structure, but a major part of Rossum’s subscription is the volume allowance for document processing. This article clarifies the exact cases of when and how Rossum charges for a document or a page. The basic principle of Rossum’s billing is that we charge for documents or pages processed by our […]

Annotations Guide, Part 4: Line Items

1. Always use the Magic Grid to annotate structured line item tables. Structured tables are tables where data is placed in separate columns, with one data type per column and each line item in a separate row. When using the Magic Grid, drag the grid lines over the data itself. See our user guide article or the […]

Supported Data Types In Master Data Matching

When using Rossum’s Data matching feature, you can upload various types of data representing lists of vendors, purchase orders, or delivery notes. However, each of the possible document types is a little different, and each of its fields might need some pre-processing to achieve the best master data matching results. By default […]

Automation of Fields in Rossum

You have probably already noticed the icon with a green A on your main screen in Rossum or the green and grey ticks displayed in the validation screen. This article explains what they mean and how they work. Documents validated by AI: When a document successfully passes the automation pipeline without […]

Making Fields in Rossum Required

When capturing fields in Rossum, you often want your users to always capture a value for a specific field. In such cases, Rossum should enforce capturing of that field by showing a specific message to the user. This is already possible in Rossum’s validation screen, and in this article you will find out how […]

Setting up Extraction schema for Dedicated Engine on Multiple Queues

Use the same setup for fields on different queues: Please use the same logic for all fields that represent the same data on different queues. The fields can be set up using Rossum’s Extraction schema editor. If the same fields need to be extracted, they have to have the same field IDs and the […]
