Rossum: Technical Webinar Recording and Slides

Setting up an automated invoice data capture system has always been a time-consuming and expensive process. Rossum changes that.

In this technical webinar, Rossum founders Petr Baudis and Tomas Gogar explain how developers and automation engineers can set up Rossum’s template-less capture platform and automate invoice processing in a couple of hours.

During this webinar, you will learn:

  • Why Rossum believes in a template-less approach
  • An overview of Rossum’s cognitive data capture platform
  • How to technically set up, integrate and configure the Rossum API to fit your needs

NOTE: Basic programming and/or system integration experience is recommended to understand all of the concepts explained in this webinar.

For any further questions, do not hesitate to contact us or email

Technical Webinar Slides:

Technical Webinar Recording:

FAQs from the webinar:

  1. What is the process to handle unrecognized documents? This webinar was mostly about processing invoices, but Rossum’s AI engine is completely document-agnostic and can be trained to understand custom documents. We already have some use cases that give very good results, so customizing the engine for your documents is always open to discussion.
  2. Does the type you define in the configuration schema (string, etc.) have any impact on recognition? The type is specified mainly to ensure that, in follow-up integrations, the data you export from Rossum is in a well-defined format that you can process automatically. So, if you need to work with an amount (a numerical type), it is enforced during the AI capture (and during the manual capture) that the value really stays numerical. It’s a constraint that the data needs to follow at all times.
  3. Can Rossum use Natural Language Processing to return a ‘service description’ summary of the invoice, e.g. “legal services”, or would this be an example of where we should use custom extensions to connect to other NLP tools from third parties? We do support this in general, and we are capable of automatically classifying documents into categories. We have some predefined categories, such as distinguishing a regular invoice from a pro-forma invoice. We are currently exploring using the same interface and the same categorization for finer classification based on topic. This would be an implementation option, just like customizing the document types and the fields captured. So far, the best use case for Rossum is identifying particular values to be captured rather than producing high-level summaries. Topic-style classification has already been done in a couple of custom extensions, for example to classify particular line-item details.
  4. How does the Rossum app work if human review is required? When a human needs to review the documents, the data entry clerks usually have a specific time frame during their work day when they log in to the app and see all of the documents that require their review. So it’s super easy. They just open a web browser, log in to the Rossum app, open the specific queue that they need to process, and go through the documents one by one. Rossum shows you the fields that need to be verified and writes the extracted value just under the value on the invoice, so you don’t need to move your eyes from left to right. You just quickly go through the fields one by one, and if you need to make a correction, you just point and click with your mouse; you don’t need to type anything.
  5. Is the full-service version of the product more like the version shown on the website? The trial version comes with pretty much all of the features. There are some custom add-ons that are not available in the trial version, but it contains almost everything. The only thing that is turned off during the trial is the automated self-improvement, but if you want to test this feature, just reach out to us, because we run a lot of POCs and tests to show how the self-improvement works.
  6. What is the performance in terms of response time if using the API? We take response time here to mean the AI processing time. Currently, processing takes around 30 seconds per page. It is not super fast, and the reason is that the machine learning pipeline behind Rossum is super complex. We believe in accuracy first, and speed is not a big issue for our customers: they usually process tens of thousands of invoices a month and have a large backlog anyway, so processing time is not the first priority. When it comes to speed, there are two different aspects to consider: latency and throughput. Latency is how soon a particular document is ready for data capture after you submit it to Rossum, and throughput is how fast Rossum can process a large batch of documents. The Rossum AI engine is highly parallel, so it can process a lot of documents at once in a shorter time than processing each document separately.
  7. Would it also be possible to create a custom extension to send reviewers a daily reminder that they have X invoices awaiting review? This is a very nice example of a custom connector, and it is easily doable. I believe this can be done in just a couple of hundred lines of code. If we see that multiple customers require the same feature, we can move it from a custom extension into the platform itself.
  8. How is uploaded and extracted data stored and secured? Please see our terms and conditions for basic details regarding security and storage. We could make our engine as accurate as it is only by annotating and training on samples of customer-submitted documents, which is why we reserve the right to keep them archived. However, we are happy to arrange completely customized retention policies with our enterprise customers: if your volume is higher than 10,000 documents/month, please contact our sales team for more details.
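
To make the latency versus throughput distinction from FAQ 6 concrete, here is a rough back-of-the-envelope sketch in Python. It uses the roughly 30 seconds per page quoted above. Everything else is an assumption for illustration only: the function names are hypothetical, the number of pages processed in parallel (`workers`) is a made-up parameter rather than a published Rossum figure, and the model assumes every page takes the same time and parallelism is perfect.

```python
import math

# Figure quoted in FAQ 6: around 30 seconds of AI processing per page.
SECONDS_PER_PAGE = 30.0


def batch_time_seconds(pages: int, workers: int,
                       seconds_per_page: float = SECONDS_PER_PAGE) -> float:
    """Estimate wall-clock time for a batch of pages.

    `workers` is a hypothetical number of pages processed at the same time;
    it is an assumption for this sketch, not a documented Rossum setting.
    """
    if pages < 0 or workers < 1:
        raise ValueError("pages must be >= 0 and workers must be >= 1")
    # Pages are handled in rounds of `workers` pages, each round taking
    # one page's worth of processing time.
    rounds = math.ceil(pages / workers)
    return rounds * seconds_per_page


def throughput_pages_per_hour(workers: int,
                              seconds_per_page: float = SECONDS_PER_PAGE) -> float:
    """Estimate steady-state throughput under the same simplified model."""
    if workers < 1:
        raise ValueError("workers must be >= 1")
    return workers * 3600.0 / seconds_per_page


if __name__ == "__main__":
    # Latency of a single page does not improve with more workers.
    print(batch_time_seconds(pages=1, workers=50))
    # A large batch finishes much sooner when pages run in parallel.
    print(batch_time_seconds(pages=1000, workers=1))
    print(batch_time_seconds(pages=1000, workers=50))
    print(throughput_pages_per_hour(workers=50))
```

Under this simplified model, a single page still takes about 30 seconds no matter how many pages run in parallel, while the time for a large backlog shrinks roughly in proportion to the degree of parallelism. Real processing times will vary by document.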
