Data extraction

Configuring Fields for Data Extraction

Each Queue defines the structure of Data fields that Rossum extracts. When editing this structure you have two options: Use pre-trained Data fields – Rossum’s Generic AI engine has been pre-trained to recognize specific Data fields and enables you to start extracting data without any additional training for the AI. Define Custom data fields and train the […]

How to Edit an Extraction Schema

Preparing an extraction schema is one of the most important tasks you need to do when configuring Rossum. You have a lot to define, including the names and formats of data fields you want to extract, necessary value constraints, and, if you’re using them, enum options. In order to correctly configure the data fields, go […]

Capturing a New Data Field in Rossum

Rossum’s AI Engine can capture a predefined set of data fields from the very beginning. You will see those fields when you create a new Rossum account or a new queue from a given regional extraction schema template. However, you will often need to create a new custom data field that […]

What Is the Rossum AI Engine?

The Rossum AI Engine is the brain that runs the Rossum application. It is responsible for reading and extracting data from each document that Rossum receives. Rossum’s AI Engine learns to identify and recognize information in documents holistically. This allows the engine to make generalized decisions based on hundreds of thousands of pieces of data […]

Data Fields Extracted by the Generic AI Engine

Rossum’s Generic AI Engine has been pre-trained to recognize data fields in invoices and purchase orders. You can use this engine to automate your AP process without any complex training procedures. The engine also supports most standard invoicing fields. The tables below contain the full list of standard data fields the Generic AI Engine can […]

Do You Need Rossum’s Generic or Dedicated AI Engine?

Rossum has two types of AI Engine: Generic AI Engine and Dedicated AI Engine. The Generic AI Engine is pre-trained by Rossum to process a specific document type. Currently, this engine variant focuses solely on invoice processing; it has been trained to recognize fields across a wide variety of invoice layouts, languages, and content. This is the […]

Capturing Custom Table Data in Rossum

A basic element in the extraction schema is the data field. However, Rossum enables the capture of even more complex structures like tables. At the moment, Rossum’s AI engine automatically extracts 2 types of tables – Line items and Tax details. Both of them come with a variety of predefined sets of extracted data fields […]

Capturing Fields with Multiple Values in Rossum

A basic data field in the Rossum extraction schema extracts a single value, such as an Invoice Number. However, in some cases you might need to extract multiple values under one field. For such cases, the Rossum schema also supports data fields with multiple values, called multivalue fields. What is a multivalue […]

How Dedicated AI Engine Training Works

Note: If you want to use a Rossum dedicated AI engine, make sure you have purchased the feature before you start the training process. After purchasing a dedicated AI engine, the training process consists of the following steps: Set up your extraction schema. Dedicated AI engine training requires a special schema set up. Rossum Solution Engineers […]

Continuous Engine Updates

After your dedicated AI engine has been trained for the first time, you will notice a steady improvement in data extraction accuracy. As you continue using Rossum, the platform uses the data you review and validate in the validation screen to retrain the engine and increase its precision. These continuous engine updates retrain your dedicated […]

Preparing Documents for Dedicated AI Engine Training

All documents that are to be used for dedicated AI engine training must pass through a verification process, which is carried out through the Rossum validation screen. After Rossum processes an uploaded document, a data entry operator opens the validation screen and reviews the captured data. The operator can correct errors by pointing and clicking […]

Annotations Guide, Part 1: What to Keep in Mind

In addition to using the Rossum validation screen for the purpose of dedicated AI engine training, there are three general rules that you must follow throughout the annotation process: Keep the annotations consistent: consistency is the main pillar of a successful training process. If data is present in multiple locations in an invoice, always annotate […]

Annotations Guide, Part 2: Basic Rules

Throughout the entire training period, you must follow the rules of the annotation process. Below are some basic rules that will help you increase the accuracy of your dedicated AI engine: Provide at least 500 documents. This is usually the minimum to achieve satisfactory accuracy. However, you may need more documents if you want to […]

Annotations Guide, Part 3: Practical Examples

Below, you can find some additional instructions and practical examples that may come in handy during your annotation process and help you increase the extraction accuracy of your Dedicated AI Engine. 1. Annotate only the values, not the labels. 2. Move the bounding box a little. If the bounding box crosses right through a word, […]

Dedicated AI Engine Training: Annotation Best Practices

These guidelines are deprecated. Refer to our new annotation guides series instead. If you have purchased the dedicated AI engine functionality, Rossum will automatically train its AI to suit your specific needs, such as custom fields or specific document types. The training process runs in the background while you are using the validation interface; however, it’s […]

How Rossum Data Extraction Billing Works

Rossum’s pricing has a complex structure, but a major part of Rossum’s subscription is the volume allowance for document processing. This article clarifies the exact cases of when and how Rossum charges for a document or a page. The basic principle of Rossum’s billing is that we charge for documents or pages processed by our […]

Annotations Guide, Part 4: Line Items

1. Always use Magic Grid to annotate the structured line items tables. Structured tables are tables where data is placed in separate columns, one data type per column, each line item in a separate row. When using a Magic Grid, drag the grid lines over the data itself. See our user guide article or the […]
