When people research Enterprise Content Management capture projects, the question of handwriting recognition comes up again and again, and many aren’t sure what to expect. More commonly, their expectations are unrealistic. Some think there is no hope at all, ever. At the other end of the spectrum, some think that tiny, fevered cursive scribblings from a rushed meeting can be scanned (or even faxed) and read accurately. To help people think about their forms and the viability of capturing handwriting, I have a few simple guidelines that seem to apply in the majority of cases.
- Are handwritten forms really the only option? If the form is available online, can it be made “fillable” so that the data is submitted directly to your database tables? Can you let the user fill in the form online and then print it, producing machine print and eliminating handwriting? How about taking the data the user entered and bar coding it (if the form must be printed rather than submitted)? Also helpful and sometimes overlooked: prefilling the form with data from your database through a merge process, with a bar code index for retrieving that same data.
- Does your Capture software support ICR? Intelligent Character Recognition (ICR) is what you need to read handwriting. Optical Character Recognition (OCR) is much more common and is designed to read machine print. Please don’t try to make it read handwriting – you won’t like the results!
- Make sure the handwriting is constrained. Annoying? Perhaps. But making the person filling in the form write in boxes sets you up for the most successful ICR results. The catchphrase here could be “Curse the cursive”. Joining one character to the next makes writing faster, but the ICR software really struggles to figure out where one character stops and the next starts, and that is where recognition tanks. With a constrained form like the real-world example below, we can generally expect 100% recognition.
- Ask for all caps handwriting. You can often tell your ICR engine to look for upper case characters only. This really