Extracting invoices using AI in a few lines of code

Our AI researcher, Bohumir Zamecnik, has long been exploring new ways to extract data from invoices. What people originally did by hand, and later with software that required extensive template setup, is now being done with artificial intelligence. Bohumir has written a step-by-step guide to extracting information from invoices using AI with only a few lines of code. It really is as easy as 1, 2, 3! With our Python API, developers can test their skills and easily integrate it into their own code. Read his full post, ‘Extracting invoices using AI in a few lines of code’, and try our Elis Extraction API for yourself!


The Future of Data Capture Systems (2/2): The Rossum Approach

This is the second part of our special founders blog post on data capture technology and how Rossum represents a radically different approach to the whole problem of information extraction from business documents.

Last time we covered the fundamental issues of traditional OCR systems that stem from their machine-like approach, in contrast with the magical efficiency of the human mind at this task. Now it’s time to delve into how exactly we replicated the human approach using deep learning, and show that it delivers as promised.


The Future of Data Capture Systems (1/2): Imitating Human Behavior

Data capture for invoices ought to have been solved a long time ago! That’s what most people think, especially if they’ve never tried to actually do it.

That’s what we thought when we started talking to customers, looking for the ideal application of Rossum’s machine vision technology. It is genuinely surprising how hard this problem actually is, and how big an advantage a human mind has compared to a fixed algorithm. That’s also the reason Rossum’s approach stands out so much within this domain.

This is a special founder blog post in two parts, written by the original minds behind Rossum’s technology: Petr, Tomas and Tomas. We will walk you through the concrete limitations of current OCR systems, why we built Rossum, which lets anyone capture data from invoices without manual capture setup, and how it achieves this.

Who are we? Standard nerds, albeit with many big accomplishments between us in machine learning, computer vision, and AI. Just about 2 years ago, we decided it was time to stop fiddling with AlphaGo and image recognition and focus on one super-hard problem with a real impact on the lives of millions of people every day. Surprisingly enough, it turned out to be invoices. Here’s why:



Building Our Own Version of AlphaGo Zero

At Rossum, we are building artificial intelligence for understanding documents. Our main line of attack lies in supervised machine learning, the most efficient approach to make neural networks achieve the highest accuracy. However, this setup requires very detailed training data, and that’s why we are also pursuing less direct approaches, e.g. based on generative adversarial networks or advanced reinforcement learning.

Petr Baudiš’ Story: From Go to Neural Networks And Unexpected Reunion

