How To Easily Create Text Files From Scanned Files

Many translation projects come to us as scanned files, which prevents us from properly scoping the project or using any of our translation memory tools. An OCR tool solves this by converting the scans into editable text.

We have tried many optical character recognition (OCR) tools in the past, but we believe the best one for our purposes is ABBYY FineReader. The tool is available in several editions (Express, Professional, Corporate, Site License). For most users, the Express or Professional edition should be adequate. The principal difference between the editions is how many users you have and how you plan to roll out the installation in your organization.

Here are the reasons why we like this tool so much!

Ease of use
The application is user-friendly and the interface is easy to navigate. Once it is installed, you even get right-click options to send files directly to FineReader. After you import the file to be converted, you can adjust how each chunk of content is classified: as an image, a table, or a block of text. The system does this automatically, but you can change the settings and reanalyze the file to improve your output. The output is displayed in a window on the right-hand side of the screen, where you can easily correct the results of the OCR. Once you are happy with the results, you can save to MS Word (.doc and .docx), RTF, or text formats.

Supports many languages
The application supports 186 languages and includes dictionaries for extended support in 39 of them. This is by far the most comprehensive language support we have seen in any OCR package on the market.

Ability to maintain formatting on complex blocks of text
We have thrown some complex documents at the application, and it has successfully handled tables, tables of contents, numbered lists, bulleted lists, and headers and footers with no real issues.

Fast processing
The application works very quickly. We have processed files as large as 300 pages in minutes. This is especially helpful if you need to quickly determine an estimated word count. You can always save your project and refine your results later.
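Once the OCR output has been saved in a text format, a rough word count can be produced with a few lines of script. The following is a minimal sketch in Python, not part of FineReader. It assumes the output was saved as a UTF-8 plain-text file; the file name and function names are hypothetical, and the word-matching rule is one simple choice among many.

```python
import re
from pathlib import Path

# A "word" here is a run of letters or digits, optionally joined by
# apostrophes or hyphens (so "don't" and "re-scan" each count once).
# This is an illustrative rule, not how any particular CAT tool counts.
WORD_PATTERN = re.compile(r"\w+(?:['’-]\w+)*")


def count_words(text: str) -> int:
    """Return the number of word-like tokens in the given text."""
    return len(WORD_PATTERN.findall(text))


def count_words_in_file(path: str, encoding: str = "utf-8") -> int:
    """Read a plain-text file (assumed encoding: UTF-8) and count its words."""
    return count_words(Path(path).read_text(encoding=encoding))


if __name__ == "__main__":
    # Hypothetical file name: the text export saved from the OCR tool.
    print(count_words_in_file("scanned_document.txt"))
```

Treat the result as an estimate only. Uncorrected OCR errors can split or merge words, and a whitespace-based count like this one is not meaningful for languages such as Chinese or Japanese, where character counts are typically used instead.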

