For the last of our Founders series, I sat down with our Research Team Leader, Tomáš Tunys, to discuss his beginnings, what it’s like working as a researcher and his view on Rossum’s product. Keep on reading to take a peek into a researcher’s life!
JT: Hi Tomáš, thank you for taking the time from your work and sitting down with me. Can you first tell me when and how you became interested in AI?
TT: You are going to be surprised, but I did not want to be an AI scientist when I was a kid (jokingly). Also, to be honest, I think it would be more accurate to use the term Machine Learning (ML) instead of AI. AI is heavily overused these days as a blanket term for anything that has to do with automation, and I personally reserve the word for something more like a general principle. Anyway, back to the question.
When I was studying software engineering at university, I had little to no experience with machine learning. Towards the end of my studies, I struggled a lot to find a captivating topic for my thesis in software engineering, and luckily, at that time I met Jan Sedivy. Under his supervision I worked on an accelerometer-based hand-gesture recognition engine for Android applications, and it was an absolutely amazing experience. I enjoyed working with Jan mainly because he was one of the foremost experts on the machine learning models I used in my work. The one-on-one sessions we had together were mind-blowing and, looking back, I think it is thanks to him (and my weird personality, of course) that I fell in love with ML.
JT: Can you share what was the original motivation behind creating a product like Rossum?
TT: I met the other co-founders, Tomas and Petr, during my PhD studies in a research group under the supervision of Jan Sedivy. Tom focused on information extraction, Petr on document representation and mainly question answering, and I on information retrieval and learning to rank (e.g. how best to sort documents for a given query). We worked together on various projects, which later inspired us to form our own company in order to apply what we had learnt in a more meaningful way.
At that time, we definitely did not know that we would end up with what we have right now. Actually, we started with something completely different; maybe it would be fair to say we didn’t have any idea what to do at all, but that turned around quickly during our time at Startup Yard. After a couple of conversations with mentors there, we found out that invoice processing, a mundane and routine task that was presented to us as technically solved, was not solved at all and still had many underlying issues. Long story short, we decided to address these issues in order to improve the way that companies and their employees process their invoices and that is where Rossum’s journey started.
We have always known that machine learning models may never cut it on their own. In spite of what is presented to the public, “AI” technology is still not there. So since the beginning we have been trying to marry the best of both worlds: developing state-of-the-art models for information extraction from documents while also presenting them via our application in the best way possible. The goal was always to make people’s work faster, easier, and hopefully more enjoyable, which ultimately brings value to the companies.
JT: What does a day in a researcher’s life look like? What is the most interesting aspect of being an AI researcher in Rossum for you?
TT: Looking out from our kitchen across the open space towards our research team, I would say it is a mysterious process of turning coffee into bug-free, effective Python code running flawlessly in production (jokingly).
A day in the researcher’s life could be summarized as full of discussions, brainstorming, thinking, reading related work, coding, debugging, code reviewing, waiting on, sharing, and celebrating the results, and so forth. Most of it is individual work on an assigned project but we also spend quite a lot of time together. We have a stand-up meeting every morning. Lunch is a very convenient time to discuss everything from the current machine learning problems up to the promises of teleportation technology. Not to mention we hold research meetings every week, where we discuss in more depth some of the projects we have been working on.
For me, the most interesting aspect of being a researcher at Rossum is the opportunity to get hands-on experience with the entire process: from working with data, through designing and developing machine learning models for non-trivial (and sometimes even seemingly impossible) tasks, to actually delivering them all the way to production.
JT: What are you the most proud of in your department and team?
TT: It might sound corny, but I am proud of the whole team, which is made up of a bunch of incredibly gifted people. I love working with them, celebrating their successes, and sharing their failures. It is great to see how they grow and improve, and to be a part of that process. And, of course, I am proud of what we have built so far and how well our solution works given how difficult the task is.
JT: Considering the competition is so big, what are the biggest challenges you face?
TT: That depends on how we define the battleground on which we compete. For example, right now we are among very few cloud-based solutions on the market in our space so I would dare to say that there is actually not as much competition as you might think - but I’d rather keep the discussion in the realm of machine learning and the challenges we face in that respect.
In my humble opinion, the way we approach data extraction from (semi-)structured documents is superb and unprecedented. Unlike our competitors, who usually use OCR coupled with hand-crafted template rules (which simply cannot scale), we managed to marry computer vision (CV) with natural language processing (NLP), which works very similarly to how people do “information extraction” in real life. I won’t go into details here, but I think Petr has done a terrific job in that regard, as he describes in his blog post.
That said, it does not mean that everything’s perfect. By far the biggest obstacles we face are line-item extraction and model adaptation speed, i.e. how fast the model learns from our customers’ feedback. These are the main short-term challenges we are working on right now. The long-term challenges are more general - machine learning as a technology is starting to become ubiquitous and commoditized. It is absolutely true that what we have developed so far will become common knowledge, and anyone with the right data (and the right amount of it) will be able to replicate what we have today. For us, however, data cannot be and never will be the only defense against our competition. The only defense is to be ahead of the curve, which can mean a lot of things, e.g. working on innovative solutions, generalizing the technology to unstructured documents or other types of documents (which we already do), making the models work and learn in real time, etc. There is a lot of amazing work ahead of us, that is for sure, and I am really looking forward to it.
JT: Recent advances in AI have led to many success stories of AI technology tackling real-world problems. What are the challenges of deploying AI systems?
TT: Completely ignoring a huge bag of problems associated with the deployment of any relatively complex software, I think that the biggest challenge is communicating to our customers exactly what Rossum provides and what benefits it offers. The signal-to-noise ratio is frankly not in our favor, and I dare to say that the world is flooded with all sorts of wannabe AI-based solutions promising the highest accuracy and whatnot. This optimistically biases the opinions and expectations of people who, when confronted with “the real deal”, end up disappointed. It’s about setting the right expectations with an honest and clear message describing what exact value we bring to the table.
JT: What would be your best advice for somebody who wants to be a researcher one day?
TT: As with anything worthwhile, if you really believe it is for you, the answer is plain and simple - just do it! The amount of material on the internet that you can start learning from is insane. So many open courses, free books, YouTube videos, research papers - there is so much of it that it is actually hard to pick sometimes. I would stress, though, that hands-on experience is equally important. Researching and reviewing existing work is one thing, but your own experience is something completely different. You can get a lot more insight and feel for the craft from your own experimentation.
JT: How do you see Rossum improving its technology over the next year?
TT: I want us to have a more effective model of learning - a more interactive one. We have already started working on it, but so far it is not as fast as we would like it to be. A huge thing on my wishlist, which I am not yet sure we will get to, is to move from just transforming data from one format (PDF/Image) into another (CSV/JSON), which is called information extraction, towards more sophisticated models capable of reasoning with the information extracted from the documents. The "reasoning" part is definitely going to be a game changer for us and make us one of the best AI-driven companies in the world. In light of my opening remark regarding the use of the term "AI", I don't hesitate to use it here because that is exactly what I imagine will be behind it and where I see Rossum's technology headed in the future.
JT: Thank you, Tomáš, for your time and insight. We can't wait to see what Rossum comes up with next!