A selection of useful OCR programs and services

Good day!

Today's note addresses a “painful” everyday question: “…I have a photo of a page from a book/document; how can I get it into Word to edit the text…?” (the wording varies slightly).

The main “problem” here is that a photo (scan) contains no text: the text is stored as a graphic image. In other words, the letters in the photo are just black sticks, squares and circles on a white background (an ordinary drawing in the shape of letters)! That is, they are not characters, so they cannot be selected, copied and pasted into Word!

What to do? First, “someone” has to convert these “sticks and circles” (i.e. the letters in the image) into plain text characters (this operation is called OCR, optical character recognition). Only then can the text be transferred to Word and edited…
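To make the idea concrete, here is a toy sketch in Python of what “recognition” means: a tiny bitmap (the “sticks and circles”) is compared with known letter shapes, and the closest match is taken as the character. The 3×5 glyph templates and all function names are made up for illustration; real OCR engines, such as the ones in the programs below, are far more sophisticated.

```python
# Toy OCR: match a small black-and-white bitmap against known glyph templates.
# The 3x5 templates and all names here are illustrative assumptions,
# not how any real OCR product works internally.

TEMPLATES = {
    "I": ("###",
          ".#.",
          ".#.",
          ".#.",
          "###"),
    "L": ("#..",
          "#..",
          "#..",
          "#..",
          "###"),
    "O": ("###",
          "#.#",
          "#.#",
          "#.#",
          "###"),
    "T": ("###",
          ".#.",
          ".#.",
          ".#.",
          ".#."),
}


def distance(a, b):
    """Count the pixels in which two equally sized bitmaps differ."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def recognize_glyph(bitmap):
    """Return the template letter whose shape is closest to the bitmap."""
    return min(TEMPLATES, key=lambda letter: distance(bitmap, TEMPLATES[letter]))


def recognize_line(glyphs):
    """Turn a sequence of glyph bitmaps into a plain-text string."""
    return "".join(recognize_glyph(g) for g in glyphs)


# A slightly damaged "scan" of the letter O (one pixel is missing at the bottom).
noisy_o = ("###",
           "#.#",
           "#.#",
           "#.#",
           "#.#")

print(recognize_glyph(noisy_o))  # → O
```

Even with one pixel missing, the damaged shape is still closer to “O” than to any other template, which is roughly why poor-quality scans produce more errors: the more pixels are wrong, the more likely a different letter becomes the closest match.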

The programs and services that solve exactly this problem are the subject of today's note… 👌

*

📌 On this topic!

1) How to scan a document to a computer from a printer (MFP) – https://ocomp.info/kak-otskanirovat-dokument.html

2) How to scan a document using an Android smartphone – https://ocomp.info/skaniruem-dokumentyi-android.html

*

How to recognize text (OCR)

Software for Windows

FineReader 👍

Website: https://pdf.abbyy.com/ru/finereader-pdf/

A working example with FineReader

One of the best OCR programs for photos, scans and PDFs. Thanks to powerful algorithms (with automatic area detection), converting “graphics” into text becomes simple and easy!

FineReader has almost no competitors and is very difficult to replace (especially if the scans to be recognized are of poor quality or use rare fonts).

Advantages:

  1. supports all popular languages (Russian, English, Ukrainian, German, etc.) and fonts (even, partially, handwriting);
  2. manual and automatic operating modes;
  3. multi-page mode (you can open 3 different documents at once, and the program will process them automatically);
  4. built-in editor for correcting recognition errors and editing the text;
  5. the ability to transfer recognized text to MS Word with a single mouse click!

Open in OCR editor - FineReader

How to use: simply open the desired photo or PDF file, then press the “Recognize page” button. The program then does everything automatically. See the screenshots above; the arrows show everything. 👆

*

NAPS2

Website: https://naps2.ru/

A compact and simple program for fast scanning and recognition of documents. Ideal for working with scanners and MFPs: you can go straight from a sheet of paper to a Word document with editable text…

By the way, in the NAPS2 menu you can choose which languages you will use (most often Russian and English). Note: the program supports more than 100 languages! See the screenshot below. 👇

Download Russian (NAPS2)

How to use: everything is simple here. First specify the languages, then add the files you need (JPG, TIFF, PNG, PDF, etc.), press the recognize button and save the resulting pages.

An example of working with a page from a book - NAPS2

*

Cuneiform

Developer: Cognitive technologies

Available for download from soft.mydiv.net

Although the program has not been updated for a long time, it recognizes Russian and English text fairly well. Its menu is minimalist (nothing superfluous): just select a file, specify the recognition parameters and start the operation. See the example below. 👇

Cuneiform - a working example with a page

Notable features:

  1. support for 20 languages;
  2. built-in dictionary for checking the document;
  3. support for most printed fonts;
  4. support for pages printed on older fax machines, dot-matrix printers, etc. (not all software can handle this!).

*

SimpleOCR

Website: https://www.simpleocr.com/download/

Note: look at the Classic version first (it’s free).

SimpleOCR - a working example

SimpleOCR is an extremely simple utility for working with scanners (and the documents obtained from them). It converts BMP, TIF and JPG files to text formats.

By default, SimpleOCR can only work with documents in English, French and German (Russian must be installed manually beforehand!).

Also note that the more advanced features are paid.

In my opinion, SimpleOCR is a good fit if you regularly work with good-quality scans of foreign-language text (fortunately, it handles them well!).

*

Scanitto Pro

Website: https://www.scanitto.com/en/

A working example with Scanitto Pro - text page recognition

Scanitto Pro is better suited to acquiring scans from MFPs and scanners (and offers plenty of options for that: rotation, cropping, templates…). However, its feature set also includes recognition and, by the way, Russian and English fonts are supported (although there are few options here…).

It works as follows: first add a page, then open the additional recognition window, select the block of text (highlighted in blue) and run the operation. The text then needs to be edited (there are more errors here than in FineReader, so high-quality scans are needed!).

Important: Scanitto Pro is a paid program (another minus)!

Otherwise, there are no particular complaints. It suits those who have difficulties with other software, or who scan only occasionally…

*

Online services (OCR)

📌 Img2txt.com

The service supports relatively small files, no larger than 8 MB. Available formats: PDF, JPG, PNG, BMP, etc.

As for quality, it is average (it loses to FineReader, but is better than a number of other programs and services).

img2txt.com - online recognition service (main page screenshot)

*

📌 Onlineocr.net

This service beats the previous one by supporting files up to 15 MB, but loses in recognition quality (at least for Russian fonts).

How to use: simply select a file on your hard disk, specify its language and press the “Convert” button. After that, you can download a doc file with the recognized text. Ideal, isn't it?!
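Before uploading, it can save time to check a file against the size limits mentioned for the two services above (8 MB and 15 MB). Below is a minimal Python sketch of such a check. The function names are hypothetical, the limits are simply the figures quoted in this note (the services may have changed them), and 1 MB is assumed to mean 1024 × 1024 bytes.

```python
import os

# Size limits quoted in this note; the services may have changed them since.
# Assumption: 1 MB is counted as 1024 * 1024 bytes.
SIZE_LIMITS_MB = {
    "img2txt.com": 8,
    "onlineocr.net": 15,
}


def services_accepting(size_bytes):
    """Return the services whose upload limit the given size does not exceed."""
    size_mb = size_bytes / (1024 * 1024)
    return [name for name, limit in SIZE_LIMITS_MB.items() if size_mb <= limit]


def services_for_file(path):
    """Check a real file on disk against the limits above."""
    return services_accepting(os.path.getsize(path))


# A 10 MB scan is too large for the first service but fits the second.
print(services_accepting(10 * 1024 * 1024))  # → ['onlineocr.net']
```

If the list comes back empty, the file has to be compressed or split before either service will take it.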

Onlineocr.net - screenshot of the main page of the site

*

📌 Convertonlinefree.com

This service compares favorably in that it can handle not only PDFs and images, but also archives containing many files (more convenient, don't you agree?!). The recognition quality is also very good (for Russian and English text; I double-checked it on my own documents).

Note: the service processes only 20 pages! Larger documents need to be split up before being uploaded to this site.
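To plan how to break a large document down, the page ranges for each upload can be worked out in advance. The Python sketch below only computes the ranges (the function name is hypothetical, and the 20-page limit is the figure quoted above); the actual splitting of the file still has to be done in a PDF tool of your choice.

```python
def split_into_batches(total_pages, batch_size=20):
    """Return (first_page, last_page) ranges, 1-based and inclusive.

    batch_size defaults to the 20-page limit quoted in this note.
    """
    if total_pages < 0 or batch_size < 1:
        raise ValueError("total_pages must be >= 0 and batch_size >= 1")
    return [
        (start, min(start + batch_size - 1, total_pages))
        for start in range(1, total_pages + 1, batch_size)
    ]


# A 45-page document needs three uploads.
print(split_into_batches(45))  # → [(1, 20), (21, 40), (41, 45)]
```

The same function works for other limits: for example, `split_into_batches(45, batch_size=10)` gives ranges for a service that accepts only 10 pages at a time.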

convertonlinefree.com - site page screenshot

*

📌 Convertio

This service is good because it supports dozens of file formats and offers good recognition quality. By the way, the free version lets you process only 10 pages. Results can be saved as Word, PDF or TXT documents.

Note: full support for Russian fonts and a wide range of input formats: PDF, JPG, BMP, GIF, JP2, JPEG, PBM, PCX, PGM, PNG, PPM, TGA, TIFF, WBMP.

Convertio - website screenshot

*

Additions on the topic are welcome in the comments!

That's all for now. Good luck to everyone!

👋
