
This week we are introducing version 2.0 of our Prizmo Go app, for instantly capturing text from the physical world. Prizmo Go got a great welcome when we released it last year, and was seen as a great and fresh execution of a number of technologies to simply achieve the goal of capturing text instantly. This year, we wanted to revisit its core feature, namely, text recognition.

As an anecdote, when we were working on version 1.0, in late 2016, we had only planned to have optical character recognition (OCR) performed on the device itself. We hadn’t given cloud processing a second thought. At that moment, we heard of Azure’s offering and thought we’d give it a shot anyway, and we were pleasantly surprised by its accuracy. It was so good actually that we changed our plans to integrate it into 1.0, despite being late already. That’s the fun part in small companies, like, let’s build that major new feature 3 months prior to release 😉, right?

Modern neural network techniques and deep learning had found their way into the text recognition world, giving it cognition capabilities that come closer and closer to humans. With previous generation OCR engines, we could achieve extremely good results when the image was of high quality and well framed. But this would degrade rapidly when the conditions were either non-optimal or unexpected. Sometimes in odd ways, as we humans could essentially still read the text clearly. Well trained deep neural networks (DNN) come with that human-level capability of generalising to non-optimal conditions.

Prizmo Go 2.0 features a new neural network-based OCR that runs on the device, a new handwriting-capable OCR in the cloud, automatic cloud-based text translation, as well as the addition of a new subscription model.

You can have a look at this video trailer to quickly review the new features. Let’s walk through these new things in a bit more detail.

With the advent of cloud services that perform so well, we could have chosen to go only with those. Even though we are embracing them fully, we decided to double down on the integrated OCR offering as well. Why? Because it has complementary advantages, such as constant availability and better privacy.

Prizmo Go 2.0 comes with our CeedOCR text recognition library that is built on three things, including smart image preprocessing & layout analysis and a custom build of Tesseract (open source OCR library by Google).

First, image preprocessing is important because shooting a text image with a smartphone is so variable, going from a perfectly lit and framed image to one that even humans cannot read. Building on our 10-year experience in this field, we further improved it to generate images that would specifically please neural network-based OCR, in turn improving the overall accuracy.

Second, as of last year, we noticed the Tesseract project was experimenting with LSTM (this is a form of recurrent neural network). We found that pretty exciting after witnessing similar cloud-based technologies and we gave it a test run. Despite some quirks, the gain in accuracy was immediately obvious. That got us thinking how this could fit in our apps, even though there were still important challenges. DNN-based OCR typically is more resource hungry than other OCRs, but specific optimizations and the power of recent devices are making it a reality.

Quality of results is highly dependent on writing style.

For the last 20–30 years, we thought that handwriting recognition was, in the best scenario, something that would be solved “later”. In fact, there are two cases: when writing directly on the device, or the photo-based one. The first case is somewhat easier, because not only the final image is available, but also the temporal sequence of inputs of “how it is made”. Some apps have been handling this very well in the last few years, just like the Nebo app, great stuff. In the latter case though, where just the final image is available, it’s been mostly impractical. Except for deep neural networks, of course. Microsoft announced that they would be tackling this problem in the last few months, and we decided to have a test run in Prizmo Go. Why? Many people still need real paper to express their creativity. And having tools to bridge the physical world with the digital one sure makes sense.

Automatic Text Translation

It is very early and sure it is not perfect, but when it works, it is so impressive. We expect it to improve over time both in terms of accuracy and language support, so stay tuned! This was not possible just 6 months ago, and now, look at this!
