Under the Hood: Rossum Improvement News
May 2019

We have compiled a list of the most important updates to the Data Extraction API (DE) and the Document Management API (DM). The Data Extraction API is the core AI engine that takes care of the automated data capture process. The Document Management API is the workflow system that maintains document queues, callbacks, the verification user interface, the web app, and so on; we recommend it as the main interface for all new users. We also list updates to the elisctl tool, which lets you make changes to your account more easily.

Try it out for yourself. Sign up for a free trial.

May

This month, we have brought two much-requested features to Rossum: receipt capture and statistics. Receipts are inherently complicated documents, which makes them tricky to capture. Statistics will help you analyze and optimize your data capture processes. We plan to develop both features further in the coming months.

20th May

For invoice and receipt capture on the go, we have created an Android app.

14th May

You can now download an Excel spreadsheet summarizing basic Rossum usage statistics, including document counts (imported, exported) and time spent per user (and per queue). Use https://example.app.rossum.ai/api/manage/usage_report to download it. (Please note that direct access to the API may require you to log in using your Rossum credentials, and the user must have the “admin” or “manager” role.) We have also documented how to pass document annotation data updates, e.g. for custom engine training purposes.
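
For scripted downloads, the request could be sketched roughly as follows. This is a minimal illustration, not an official client: the `token` authorization header, the helper names, and the output file name are assumptions, and only the URL comes from this post. Check the API documentation for the authentication scheme that applies to your account.

```python
import urllib.request

# URL as given in this post; replace "example" with your own domain prefix.
USAGE_REPORT_URL = "https://example.app.rossum.ai/api/manage/usage_report"


def build_usage_report_request(token: str, url: str = USAGE_REPORT_URL) -> urllib.request.Request:
    """Prepare a GET request for the usage report.

    Assumption: the endpoint accepts a token-based Authorization header.
    The post only states that Rossum credentials with an "admin" or
    "manager" role are required.
    """
    return urllib.request.Request(url, headers={"Authorization": f"token {token}"})


def download_usage_report(token: str, path: str = "usage_report.xlsx") -> str:
    """Send the request and save the response body to a local file."""
    request = build_usage_report_request(token)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
    return path
```

Calling `download_usage_report("<your token>")` would then write the spreadsheet to `usage_report.xlsx` in the current directory, provided the assumed authentication works for your account.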

5th May

Do you also need to work with receipts? We have dramatically improved the accuracy of receipt capture and made the user experience smoother and more comfortable. We have also reduced the processing latency of submitted documents during periods of peak traffic.

April

Receipt capture is one of the improvements we have been working on because we know you have been asking for it. We started with a new version of our OCR and continue to build a full-blown receipt capture tool. From April onwards, you can also rotate documents, so upside-down invoices can now be processed quickly in Rossum.

25th April

We now offer basic support for receipt capture: a new version of “skimming OCR” that’s accurate on receipt fonts and environments. We also extensively trained our AI on receipt samples, significantly increasing localization accuracy.

15th April

Have you uploaded invoices upside down? We have introduced document rotation support, with a view control widget that lets you control the orientation of the document. Rotated documents are automatically reprocessed by the AI engine, saving you time.

11th April

Minor update to the connector.

9th April

We now automatically identify and extract a new piece of information: the payment state (paid, unpaid, etc.). Improved OCR speed and accuracy mean that clicking on a page to read data is now 2x faster. We also got rid of extra characters appearing in some table cells. Tax details with a 0% rate are now discarded in the few cases where they should not have been extracted.

8th April

Minor update to the Document Management API, including improved Magic Grid behavior, better browser compatibility, and improvements to the verification view and user interface language.

3rd April

Update to the Rossum Document Management API, including a significant speedup of on-demand text extraction.

March

26th March

Minor update to the Elis Data Extraction API and Elis Document Management API.

20th March

Registering a trial account in the Rossum DM is now openly available, with three default schemas to choose from (US, UK and EU). The user interface should feel noticeably faster. We also implemented a range of visual and stability improvements; for example, a document page now fills a much larger part of the screen for an improved user experience.

15th March

We have released v2.0 of the elisctl tool with various usability improvements, such as experimental support for editing schemas (sidebar description) using Excel (xlsx) files. This makes customizing Rossum features much easier.

14th March

Minor security update to the Rossum Document Management API.

13th March

This Rossum Data Extraction API update fixed a table extraction bug that produced empty or half-empty cells. We also improved the reading of noisy images, including text outside tables.

10th March

The Rossum Document Management API features a new Download button in the Exported tab of the web dashboard. You can now download all captured data in CSV, XML, or JSON format. The download respects the filters selected in the Exported tab (particularly the search string).

7th March

Major update to the Rossum Document Management API: the Magic Grid tool for rapid line item data capture. It is now available via a new button within the line items multivalue section. See our blog post for more details and a video demo.

6th March

Feature update to the Rossum Data Extraction API that introduces two new properties of table cells: value and value_type, which are direct analogs of the header fields’ properties of the same name. In the Rossum Document Management API, this improves the quality of automatic table data capture, especially in the amount columns. We now process digital PDF documents that do not require OCR slightly faster. We also improved the accuracy of the document property classifiers (document type, currency and language).
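
To illustrate how a client might use the two new cell properties, here is a minimal sketch. Only the property names `value` and `value_type` come from this update; the surrounding cell layout, the `content` key, and the type names `"number"` and `"text"` are hypothetical stand-ins, so consult the table extraction format documentation for the real response structure.

```python
from decimal import Decimal

# Hypothetical cells from one "amount" column. Only `value` and `value_type`
# are named in this update; everything else here is an assumed layout.
sample_cells = [
    {"content": "1 250,00", "value": "1250.00", "value_type": "number"},
    {"content": "99,50", "value": "99.50", "value_type": "number"},
    {"content": "n/a", "value": None, "value_type": "text"},
]


def numeric_total(cells):
    """Sum the normalized values of all cells typed as numbers."""
    total = Decimal("0")
    for cell in cells:
        # Skip cells that are not numeric or carry no normalized value.
        if cell.get("value_type") == "number" and cell.get("value") is not None:
            total += Decimal(cell["value"])
    return total
```

The point of the sketch is that a normalized `value` lets the client do arithmetic without re-parsing the raw cell text and its locale-specific separators.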

1st March

An update to the Rossum Document Management API user interface. Review now behaves differently depending on whether you click the “Start processing” button or open a specific invoice.

In batch review, exporting an invoice brings up the next invoice in the queue that is available for review. When you pull up a specific invoice, an “annotation stack” is available that lets you browse back and forth through the invoices. For regular operation, therefore, use the “Start processing” button; opening a specific invoice is meant for inspecting the queue rather than processing it in its entirety. Individual documents may now be opened in new tabs again.

February

28th February

Major update to the Rossum Document Management API with line item automation support. Line items may now be pre-captured, which enables line item automation for the very first time! There is now an API endpoint for creating new organizations, and queue export supports the same full set of filters as the queue annotation list. This means exported captured data may be filtered by a wider variety of timestamps or even by an explicit list of document IDs.

24th February

Bugfix update to the Rossum Data Extraction API. Some unprocessable documents used to hang permanently in the “processing” state from the API perspective; this is now fixed. The same issue caused some documents to be stuck as “importing” in the Rossum Document Management API, which this update also fixes.

24th February

Update to the Rossum Data Extraction API that introduces a new feature: determination of table column types. You now get information not just about how the table is split into cells (and which rows are header or data rows), but also about what each column means semantically: is it a quantity, or is it a description? We also compiled brand new documentation of the table extraction format.

We improved accuracy on low-quality scans and the parsing of VAT details (an unusually written VAT rate such as “20,000” is now interpreted as 20%).
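
The kind of normalization involved can be sketched as follows. This is an illustrative approach only, not Rossum's actual implementation; the function name and the rule of treating a comma as a decimal separator are assumptions made for the example.

```python
from decimal import Decimal, InvalidOperation


def parse_vat_rate(text: str):
    """Interpret a VAT rate string, treating a comma as the decimal separator.

    Returns the rate as a Decimal percentage, or None if the text is not a
    plausible rate between 0 and 100.
    """
    # Drop surrounding whitespace and an optional trailing percent sign.
    cleaned = text.strip().rstrip("%").strip().replace(",", ".")
    try:
        rate = Decimal(cleaned)
    except InvalidOperation:
        return None
    # Reject NaN/infinity and values outside the valid percentage range.
    if not rate.is_finite() or not Decimal("0") <= rate <= Decimal("100"):
        return None
    return rate
```

With this rule, `parse_vat_rate("20,000")` yields a value equal to `Decimal("20")`, that is 20%, instead of being misread as twenty thousand.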

14th February

We have overhauled our Rossum Document Management API documentation. We have also released a new tool, “elisctl”, for controlling Rossum on the configuration level.

10th February

An update to the Rossum Data Extraction API and the Document Management API. Our core AI engine models have been updated to more accurate versions. We have improved the attachment extraction of our email gateway to cover all kinds of forwarded, signed or deeply nested mail messages.

January

31st January

This update to the Rossum Document Management API covers the queue export API, fixes XML export, and improves annotation search with a new `tolerance` parameter.

The Rossum Data Extraction API is enhanced with table improvements: the table extraction engine is now more accurate both in determining tabular areas on a page and in splitting rows and cells. We also fixed OCR handling of multi-line table cells.

Data extraction is faster: we have sped up a portion of the field OCR process and expect the processing time per page to be about 2 seconds shorter on average.

About Rossum app

Rossum app is the cognitive invoice data capture tool, powered by Artificial Intelligence, enabling companies to capture data from financial documents efficiently and with human-level accuracy. Unlike existing text mining solutions, Rossum’s unique artificial intelligence technology reflects the way humans read documents. This eliminates the need for costly manual implementation, a game changer in the data capture business.

About Rossum

Rossum is an artificial intelligence company that extracts data from documents with human-level accuracy, helping companies automate their data entry tasks and create significant savings. Our mission is to teach computers to support human creativity and to unshackle the human mind from rows and spreadsheets.
