OCR of shorthand


I’m a sys admin for a living, so I know a bit about code and APIs, but I’m no genius. I also type in stenography and write in shorthand. I’ve created my own shorthand such that each word is fully unique. (Gregg Shorthand does not meet that standard.) I’m hoping some day to make it truly machine readable, and am wondering whether there’s any chance I could succeed using your product.

The plan would be to break a page down into a group of outlines, and submit each outline to be “read”. I would expect there to be 5,000-50,000 “letters” in my “font”, with each corresponding to a full word. I can provide all the outlines and their definitions as computer-generated images with metadata.

Could your tool be made to work within that use case? I know it’s a big ask.

Thank you!


This is an interesting project, but our OCR service is not trainable (yet), so it can only be used with the pretrained languages that we offer.

One solution could be to use Tesseract and train it on your own handwriting font. But that is a big project.
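
To make the "one outline in, one word out" idea from the question a bit more concrete, here is a very rough sketch in plain Python. It is not Tesseract and not our API; it swaps in simple nearest-template matching (Hamming distance on tiny binary bitmaps) just to show the lookup structure. The names `TEMPLATES`, `to_bitmap`, `hamming`, `read_outline` and `read_page` are all made up for illustration, and the three 3x3 "outlines" are toy data. Real handwriting would also need page segmentation, size normalization, and a properly trained classifier.

```python
# Hypothetical sketch: every outline is a small binary bitmap that maps to one word.
# Assumption: outlines are already cut out of the page and scaled to the same size.

def to_bitmap(rows):
    """Convert rows of '#'/'.' characters into a tuple-of-tuples of 1/0 pixels."""
    return tuple(tuple(1 if ch == "#" else 0 for ch in row) for row in rows)

def hamming(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))

# Toy "font": in the real use case this would hold thousands of outlines,
# each generated from an image plus its word definition.
TEMPLATES = {
    "the": to_bitmap(["#..", "#..", "###"]),
    "and": to_bitmap(["###", "..#", "..#"]),
    "of":  to_bitmap([".#.", "###", ".#."]),
}

def read_outline(bitmap, templates=TEMPLATES):
    """Return (word, distance) for the template closest to the given bitmap."""
    return min(
        ((word, hamming(bitmap, tpl)) for word, tpl in templates.items()),
        key=lambda pair: pair[1],
    )

def read_page(outlines, templates=TEMPLATES):
    """Read a sequence of outline bitmaps and join the best-matching words."""
    return " ".join(read_outline(o, templates)[0] for o in outlines)

if __name__ == "__main__":
    noisy_the = to_bitmap(["#..", "#..", "##."])  # one pixel missing
    print(read_outline(noisy_the))
    print(read_page([noisy_the, TEMPLATES["of"]]))
```

With 5,000-50,000 outlines a brute-force comparison like this would be slow and fragile, which is part of why training a real recognizer is the bigger project.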


Perfect. Thank you for taking the time to answer!