Paper-based back-office processes are an obvious target for automation, and many solutions have tried. However, not every kind of automation is alike, and neither are the costs. In the first article of this series, we saw how much the right process matters for the cost; in the second, where the costs come from. Now it's time to look at how different automation approaches affect OCR process costs: a template-based, on-prem OCR solution, and the new kid on the block, a cognitive, cloud-based data extraction solution. To sum the topic up, we will then see what the future holds for invoice data capture methods.
Template-based, on-prem OCR solution
During the 1980s, the rise of computers in the enterprise sector marked a big change in data capture. Optical Character Recognition technology, which converts printed text into data, made it possible to extract invoice data automatically using templates. The template is either zonal OCR, with fixed page locations for individual data fields such as total amount or supplier address, or rule-based OCR, configured with a sequence of if-then rules telling the software where to look for specific information. The software is installed on company premises, set up and continuously updated by the supplier's technicians, and used by the internal AP department.
The most common mistake when it comes to template-based OCR systems? Assuming that the cost of processing an invoice equals the per-document price set by the licence.
The actual cost structure
On-premise software carries several costs that are often neglected, from the one-time setup to regular maintenance. The initial implementation and setup of the system is typically time-consuming and expensive, and is performed by the supplier's experts. Trained staff perform the actual invoice data extraction. One of the related costs is training, which can take just half a day, but retraining is needed relatively often due to personnel rotation. We need to take all those costs into account so we can calculate the TCO of template-based invoice data capture.
The system configuration is specific to every vendor and needs regular updates, due to natural vendor turnover and changes in invoice layouts. Our model company, The Corp, Inc., has 2,000 unique suppliers and the invoice traffic follows the typical structure of 70% of volume generated by 30% of suppliers. This translates to 600 layouts (30% of 2,000 suppliers) configured to cover 70% of incoming invoice volume.
Since setting up a template is expensive, it is never done for all suppliers. Even though 600 configured layouts might be on the high end for a template-based system, we will roll with this figure to stay conservative. From our clients, we hear template coverage rate reports of anywhere between 20% (!) and 80% of volume.
Because template-based systems can process only invoices from vendors that have been configured in the system, the remaining 30% needs to be processed manually, the same way we described for manual data entry (except for the 30-second invoice open time, which we reduce to 5 seconds in an automated workflow).
The invoices that fit the template also need manual verification and possible correction. This aspect of cost is very often forgotten, even though it will account for quite a significant piece of the cost, as we will see later.
Effort spent: 3.1 direct FTE Load, 1.1 indirect FTE Load
The remaining 30% of invoices need to be processed manually, taking 81 + 5 seconds per document as analyzed above. For the 30,000 invoices total of our model company, the manual workload is 2 direct FTE Load. Rework brings an additional 0.9 indirect FTE Load.
For the 70% of invoices where a template kicked in, validation needs to be performed for each invoice. In the Rossum validation interface, purely the validation part in the model company would take 10.5 seconds; we will generously allow the same speed for validation in legacy template systems.
In approximately 10% of fields (optimistically), validation would spot a mistake and manual correction would be required. Taking into account typing speed (keystrokes per minute) as well as the observed ergonomics of these systems, this would take another 5 seconds per field on average, or 7.5 seconds of correction time per document on average.
Multiple field errors often go together in a single document (since e.g. the scan was faulty, automatic validation failed etc.). That helps the speed of validation as the process is smoother for the user when it is not interrupted by correction efforts.
Compared to the manual process, we will add just 5 seconds of overhead for invoice switching, bringing us to a direct effort total of 23 seconds per invoice (10.5 seconds of validation, 7.5 seconds of correction and 5 seconds of overhead). For the volume of our model company, this means 1.1 direct FTE Load. We consider rework to take the same effort as in the manual data entry case, but to apply only to a portion of just 10% of manually corrected documents, causing just a 0.2 indirect FTE Load.
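The per-invoice effort on both paths of the template workflow can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this section. The constant FIELDS_PER_INVOICE is our assumption rather than a figure quoted above: it is the field count at which 10% of fields at 5 seconds each gives the stated 7.5 seconds per document.

```python
# Figures quoted in the article.
MANUAL_ENTRY_S = 81.0          # manual data entry per invoice
OVERHEAD_S = 5.0               # opening / switching invoices
VALIDATION_S = 10.5            # validating a templated invoice
CORRECTION_S_PER_FIELD = 5.0   # fixing one wrong field
FIELD_ERROR_RATE = 0.10        # share of fields needing correction
TOTAL_INVOICES = 30_000
TEMPLATE_COVERAGE = 0.70       # share of volume covered by templates

# Assumption (not stated in the article): 15 captured fields per invoice,
# the value at which 10% of fields x 5 s gives 7.5 s per document.
FIELDS_PER_INVOICE = 15


def manual_seconds() -> float:
    """Direct effort for an invoice that no template covers."""
    return MANUAL_ENTRY_S + OVERHEAD_S


def template_seconds() -> float:
    """Direct effort for an invoice matched by a template."""
    correction = FIELD_ERROR_RATE * FIELDS_PER_INVOICE * CORRECTION_S_PER_FIELD
    return VALIDATION_S + correction + OVERHEAD_S


templated = round(TOTAL_INVOICES * TEMPLATE_COVERAGE)
manual = TOTAL_INVOICES - templated

print(f"manual path:   {manual_seconds():.1f} s x {manual} invoices")
print(f"template path: {template_seconds():.1f} s x {templated} invoices")
```

Converting these seconds into FTE Load additionally depends on the productive hours per FTE used in the earlier articles, so the sketch stops at seconds per invoice.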
Cost per invoice: $1.03
Before we consider the effort costs, we will take a look at the costs associated with the OCR system itself. We need to stress at this point that those costs can vary by hundreds of percent depending on the vendor. We are estimating them from our experience, our clients' costs and industry knowledge. For purposes of this comparison, we have used the average price of the most used template-based OCR invoice extractors.
One-time investments are the original implementation and hardware/software purchase ($100,000 and $15,000, respectively) – remember we are amortizing this cost over three years, translating to a cost of $38,333 per year. The implementation effort is 50 to 200 man days and apart from the process integration, system configuration and setup, template configuration is also necessary. Well, with 600 layouts to set up, this implementation is quite on the cheap side!
Every year, a licence needs to be purchased, including fees for additional pages ($20,000). Vendor templates and the system setup also need updates, which can be expressed as a percentage of the original setup cost (25%, or $25,000). Together, this totals $45,000 per year.
The effort of operating the process translates to 4.2 FTE Loads. Assuming 1 manager FTE and 3.2 junior FTEs, this represents the direct and indirect cost of $249,600 per year. Following the same principle as for manual process analysis, the hidden cost will represent a further 15% premium ($37,400).
Hidden costs will have a different structure, as the nature of errors will vary compared to a manual process. More care has been given to the process integration, catching some errors and speeding up the process. However, new hidden costs will be associated with managing the external consultants (or internal IT) that maintain the OCR software.
The total cost of this setup is $370,333 per year, putting the cost per invoice at $1.03. Other sources, such as analyst group PayStream Advisors, estimate the cost less conservatively – as much as $4 per invoice! It is important to add that the efficiency of this setup decreases over time, raising the average price.
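To make the total easy to check, here is a minimal sketch that adds up the cost components listed in this section. The annual invoice count is our assumption: we read the model company's 30,000 invoices as a monthly volume, which is the reading under which $370,333 per year works out to $1.03 per invoice.

```python
# One-time investments, amortized over three years (figures from the article).
ONE_TIME = {"implementation": 100_000, "hardware_software": 15_000}
AMORTIZATION_YEARS = 3

# Recurring yearly costs (figures from the article).
ANNUAL_FEES = {"licence": 20_000, "template_and_setup_updates": 25_000}
LABOUR = 249_600   # direct and indirect effort, 4.2 FTE Loads
HIDDEN = 37_400    # 15% premium on the effort cost

# Assumption: 30,000 invoices per month, i.e. 360,000 per year.
INVOICES_PER_YEAR = 30_000 * 12

amortized = sum(ONE_TIME.values()) / AMORTIZATION_YEARS
total = amortized + sum(ANNUAL_FEES.values()) + LABOUR + HIDDEN
per_invoice = total / INVOICES_PER_YEAR

print(f"amortized one-time cost: ${amortized:,.0f} per year")
print(f"total cost of ownership: ${total:,.0f} per year")
print(f"cost per invoice:        ${per_invoice:.2f}")
```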
Cognitive, cloud-based solution
Software as a Service (SaaS) is taking over the world, presenting a lightweight alternative to on-premise systems in all kinds of industries. The other driver of change is machine learning. A cognitive, cloud-based OCR solution powered by AI has the potential for an even higher rate of automation than a template-based system, but does not need the costly and time-consuming implementation and updates. In fact, a prototype process can be set up in days.
The secret? The Artificial Intelligence looks at the invoice the same way humans do and reads the data, therefore not needing any templates.
User training for a template system can take up to half a day, but the cloud-based solution features an intuitive interface that does not need a long explanation. The whole training for brand new users can be done in fifteen minutes, and employee satisfaction is a completely different story. We often encounter the situation where a new, AI-powered AP department needs only three people: a head of AP, a manager and a typist. The time previously put into invoice extraction is usually invested in long-term projects and improvements. Let's take a look at the TCO of cognitive invoice data extraction!
The actual cost structure
Just like any other OCR system, a cognitive data capture solution will come with some implementation and licence costs. This is besides the operation effort costs and hidden costs that we are going to come back to below.
Implementing the AI solution is a simple matter. Just like with template-based OCR, the essential components are the process design and integration with other company systems. But the OCR configuration is a matter of hours and its integration into a simple workflow with robots can be accomplished with a few man days of effort in most ERP systems.
The key difference compared to template-based OCR is that the configuration does not involve layout-by-layout setup. The solution comes with high accuracy across the board out of the box, and adapts further to the invoices each user processes. It even improves by itself, simply by virtue of the OCR user base widening over time: the more diverse the layouts it has seen, the better the out-of-the-box performance.
Not only does this mean massive savings on implementation, but there is no longer a sharp 70-30 distinction between manual and automatic processing of documents.
All documents are processed automatically in a cognitive data capture solution; only the degree of accuracy varies a little.
"Full automation" of document processing may be attainable on a case-by-case basis. But just to be on the safe side, let's consider that all invoices still need to be manually validated. Sometimes, manual corrections are going to be required. However, over time, we see the error rate drop well below the 10% level assumed for template-based systems – this is due to all the automatic learning facilities.
For the case of our model company, we would aim at 95% field accuracy in Rossum, with most of the corrections just involving a few keystrokes to correct a typo (we observe 1.43 correction keystrokes per document, and 95% accuracy means an average of 0.75 fields wrong per document).
Effort spent: 1.3 direct FTE Load, 0.3 indirect FTE Load
We can finally calculate the TCO of cognitive invoice data extraction. Let's describe the entire process first. All invoices are processed automatically: validated, then potentially corrected. The validation takes 10.5 seconds – remember, the estimate for template-based OCR was actually based on Rossum's measurements. We will assume the correction time per field to be 4 seconds, as there are going to be very few keystrokes but the fixed overhead of 3.8 seconds remains.
With 0.75 corrected fields per invoice and 5 seconds of fixed per-invoice overhead (just like with template OCR), the total direct effort comes to 18.5 seconds per invoice (10.5 + 0.75 × 4 + 5). For the volume of our model company, this means 1.3 direct FTE Load. We again consider rework to take the same effort as in the manual data entry case, but to apply only to a portion of the 10% of manually corrected documents, causing just a 0.3 indirect FTE Load.
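The same arithmetic for the cognitive workflow can be sketched as follows. As before, FIELDS_PER_INVOICE is our assumption rather than a quoted figure: 15 fields is the count at which 95% field accuracy gives the stated 0.75 wrong fields per document. The template figure of 23 seconds is repeated only for comparison.

```python
# Figures quoted in the article.
VALIDATION_S = 10.5            # validating one invoice
CORRECTION_S_PER_FIELD = 4.0   # includes the 3.8 s fixed overhead
OVERHEAD_S = 5.0               # fixed per-invoice overhead
FIELD_ACCURACY = 0.95
TEMPLATE_SECONDS = 23.0        # direct effort per templated invoice

# Assumption (not stated in the article): 15 captured fields per invoice,
# the value at which a 5% field error rate gives 0.75 wrong fields.
FIELDS_PER_INVOICE = 15


def wrong_fields_per_invoice() -> float:
    """Expected number of fields needing correction on one invoice."""
    return (1.0 - FIELD_ACCURACY) * FIELDS_PER_INVOICE


def cognitive_seconds() -> float:
    """Direct effort per invoice: validate, correct, switch."""
    correction = wrong_fields_per_invoice() * CORRECTION_S_PER_FIELD
    return VALIDATION_S + correction + OVERHEAD_S


saved = TEMPLATE_SECONDS - cognitive_seconds()
print(f"wrong fields per invoice: {wrong_fields_per_invoice():.2f}")
print(f"direct effort:            {cognitive_seconds():.1f} s per invoice")
print(f"saved vs. template path:  {saved:.1f} s per invoice")
```

Note that the cognitive figure applies to every invoice, while the 23 seconds of the template workflow applies only to the 70% of volume that templates cover.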
Cost per invoice: $0.45
The licence cost varies according to several factors, most importantly the volume processed and the fields captured. For our model company, it is going to be on the cheap side at any rate due to the small set of fields, all of them standard AP fields, and no special enterprise requirements. We are going to use a round $40,000 per year, even though the actual price may well be south of this number.
To make room for a complex solution analysis at the beginning and for iterative process improvements after the initial deployment, we are budgeting $10,000 for the implementation – but you can certainly save on that significantly too. Annually, let's budget 25% again ($2,500) for system maintenance, e.g. reconfiguration due to changing business requirements. The user training costs are negligible compared to the template solution (the training takes 15 minutes).
All in all, this means the cost of $45,833 annually for the solution itself as we are amortizing the implementation over three years again.
The effort of operating the process translates to 1.6 FTE Loads. Assuming 0.6 manager FTE and 1 junior FTE, this represents a direct and indirect cost of $105,600 per year. The hidden cost will again represent half of the indirect cost, thus now only a 9% premium ($9,500).
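As a consistency check, the two effort totals quoted in this article (4.2 FTE Loads for $249,600 and 1.6 FTE Loads for $105,600) can be treated as two linear equations in the yearly cost of a manager FTE and a junior FTE. The sketch below solves them with Cramer's rule. The resulting rates are inferred from those two totals, not quoted from the text, so treat them as an assumption.

```python
# Two staffing mixes and their yearly cost, as quoted in the article:
#   template:  1.0 manager FTE + 3.2 junior FTE = $249,600
#   cognitive: 0.6 manager FTE + 1.0 junior FTE = $105,600
TEMPLATE_MIX = (1.0, 3.2, 249_600.0)
COGNITIVE_MIX = (0.6, 1.0, 105_600.0)


def solve_rates(eq1, eq2):
    """Solve a*m + b*j = c for (m, j) over two equations (Cramer's rule)."""
    a1, b1, c1 = eq1
    a2, b2, c2 = eq2
    det = a1 * b2 - b1 * a2
    if det == 0:
        raise ValueError("staffing mixes are proportional; no unique solution")
    manager = (c1 * b2 - b1 * c2) / det
    junior = (a1 * c2 - a2 * c1) / det
    return manager, junior


manager_rate, junior_rate = solve_rates(TEMPLATE_MIX, COGNITIVE_MIX)
print(f"implied manager FTE cost: ${manager_rate:,.0f} per year")
print(f"implied junior FTE cost:  ${junior_rate:,.0f} per year")
```

Under these inferred rates, a manager FTE costs twice as much as a junior FTE, and both quoted effort totals follow from the same pair of rates.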
Finally putting this all together, we now have the TCO of cognitive invoice data extraction: a $160,833 total cost per year or $0.45 per invoice. And that's while still being rather pessimistic about the benefits of cognitive data capture – and optimistic about the other methods.
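The components of the cognitive TCO can be added up the same way. This is a sketch under the same assumption as before, namely that the model company's 30,000 invoices are a monthly volume. Summing the itemized figures as quoted gives roughly $160,933, about $100 away from the total stated above, which looks like a rounding difference and does not change the per-invoice result.

```python
# Solution costs (figures from the article).
LICENCE = 40_000           # per year
IMPLEMENTATION = 10_000    # one-time, amortized over three years
AMORTIZATION_YEARS = 3
MAINTENANCE = 2_500        # per year, 25% of the implementation budget

# Operating costs (figures from the article).
LABOUR = 105_600           # direct and indirect effort, 1.6 FTE Loads
HIDDEN = 9_500             # 9% premium on the effort cost

# Assumption: 30,000 invoices per month, i.e. 360,000 per year.
INVOICES_PER_YEAR = 30_000 * 12

solution = LICENCE + IMPLEMENTATION / AMORTIZATION_YEARS + MAINTENANCE
total = solution + LABOUR + HIDDEN
per_invoice = total / INVOICES_PER_YEAR

print(f"solution cost:           ${solution:,.0f} per year")
print(f"total cost of ownership: ${total:,.0f} per year")
print(f"cost per invoice:        ${per_invoice:.2f}")
```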
Clearly, we have come quite a long way!
And what will the future bring for users of cognitive data capture systems? That's the final beauty of it. While the cost of fixed processes is going to increase over time, the total cost of a cloud-based system is going to keep decreasing. New technology improvements become available to all users, and the automatic AI learning means a continuous further reduction in the effort required. This means that in the long term, automation of the AP process is only going to increase further for AI-based OCR users. Do you want to learn more about the benefits of AI in invoice data capture? We have discussed them in this whitepaper.
What will tomorrow bring?
Invoices need to be processed in every organization: the faster, cheaper and more accurate, the better. We have calculated the costs of invoice data capture for three frequently used solutions. If your process is primarily manual (your AP retypes the data themselves), you will pay $2.03 per invoice; with a template-based, on-prem solution, $1.03 per invoice; and with the rising alternative, an AI-powered cloud solution, $0.45 per invoice. We can definitely agree with content service provider Hyland that AI is revolutionizing data capture.
Today, moving from a fully manual process to a cognitive data capture service for our model company means a staggering 4.5x reduction of costs for an AP process.
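The reduction factors follow directly from the three per-invoice figures. A minimal sketch, using only the numbers quoted in this series (the dictionary keys are our own labels):

```python
# Cost per invoice for each approach, as quoted in the article.
COST_PER_INVOICE = {
    "manual": 2.03,
    "template": 1.03,
    "cognitive": 0.45,
}


def reduction_factor(before: str, after: str) -> float:
    """How many times cheaper `after` is than `before`."""
    return COST_PER_INVOICE[before] / COST_PER_INVOICE[after]


for approach in ("manual", "template"):
    factor = reduction_factor(approach, "cognitive")
    print(f"{approach} -> cognitive: {factor:.1f}x cheaper")
```

The manual-to-cognitive factor rounds to the 4.5x quoted above; moving from a template-based system to a cognitive one comes out at roughly 2.3x.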
Those numbers illustrate the current status of the fundamental differences between those processes. Of course, we have taken just the data entry angle. Other powerful tools to increase AP process efficiency certainly exist – a comprehensive Purchase Order system, robots applying automated verification steps to catch manual entry errors, or even a procure-to-pay EDI platform for frequent suppliers. But of course, all these tools combine with a cognitive data capture service just as nicely.
Outsourcing the AP data entry process completely may sound like another sexy alternative, and it can indeed do wonders for your bottom line. However, it is a project that is fraught with risk – you are giving up control, and quality issues, for example, may take a long time to iron out during implementation. We will come back to discuss this topic in the future.
What will tomorrow bring? The gap between manual, OCR and cognitive systems will only grow further as invoice volume increases or as the process gains complexity (such as line items, vendor address capture or purchase orders). Even today, staff willing to retype data are increasingly difficult to find. Isn't it time to switch to an automated solution and unshackle your employees from the meaningless retyping of invoices?
If you want to analyze your own use-case, book a call with our automation expert here.