Data extraction

Data Fields Extracted by the Generic AI Engine

Rossum’s generic AI engine has been pre-trained to recognize data fields in invoices and purchase orders. You can use this engine to automate your accounts payable (AP) process without any complex training procedures. The engine also supports most standard invoicing fields. The tables below contain the full list of standard data fields the generic AI engine can […]

Continuous Engine Updates

After your dedicated AI engine has been trained for the first time, you will notice a steady improvement in data extraction accuracy. As you continue using Rossum, the platform uses the data you review and validate in the validation screen to retrain the engine and increase its precision. These continuous engine updates retrain your dedicated […]

Configuring Fields for Data Extraction

Each queue defines the structure of data fields that Rossum extracts. When editing this structure you have two options: Use pre-trained data fields – Rossum’s generic AI engine has been pre-trained to recognize specific data fields and enables you to start extracting data without any additional training for the AI. Define custom fields and train the dedicated […]

How to Edit an Extraction Schema

Preparing an extraction schema is one of the most important tasks you need to do when configuring Rossum. You have a lot to define, including the names and formats of data fields you want to extract, necessary value constraints, and, if you’re using them, enum options. To configure your account, you can use the API […]

What Is the Rossum AI Engine?

The Rossum AI engine is the brain that runs the Rossum application. It is responsible for reading and extracting data from each document that Rossum receives. Rossum’s AI engine learns to identify and recognize information in documents holistically. It allows the engine to make generalized decisions based on hundreds of thousands of pieces of data […]

Do You Need Rossum’s Generic or Dedicated AI Engine?

Rossum has two types of AI engine: the generic AI engine and the dedicated AI engine. The generic AI engine is pre-trained by Rossum to process a specific document type. Currently, this engine variant focuses solely on invoice processing; it has been trained to recognize a wide variety of invoice layouts, languages, and content. This is the default engine […]

How Dedicated AI Engine Training Works

Note: If you want to use a Rossum dedicated AI engine, make sure you have purchased the feature before you start the training process. After purchasing a dedicated AI engine, the training process consists of the following steps: Set up your extraction schema. Dedicated AI engine training requires a special schema setup. Rossum Solution Engineers […]

Preparing Documents for Dedicated AI Engine Training

All documents used for dedicated AI engine training must pass through a verification process, which is carried out through the Rossum validation screen. After Rossum processes an uploaded document, a data entry operator opens the validation screen and reviews the captured data. The operator can correct errors by pointing and clicking […]

Dedicated AI Engine Training: Annotation Best Practices

If you have purchased the dedicated AI engine functionality, Rossum will automatically train its AI to suit your specific needs, such as custom fields or specific document types. The training process runs in the background while you are using the validation interface; however, it’s important to follow the best practices that are mentioned below to ensure […]

Annotations Guide, Part 2: Basic Rules

Throughout the entire training period, you must follow the rules of the annotation process. Below are some basic rules that will help you increase the accuracy of your dedicated AI engine: Provide at least 500 documents. This is usually the minimum to achieve satisfactory accuracy. However, you may need more documents if you want to […]

Annotations Guide, Part 1: What to Keep in Mind

In addition to using the Rossum validation screen for the purpose of dedicated AI engine training, there are three general rules that you must follow throughout the annotation process: Keep the annotations consistent: consistency is the main pillar of a successful training process. If data is present in multiple locations in an invoice, always annotate […]
