PDF OCR is based on OCR technology to convert scanned PDF paper books and documents into editable electronic text files fast and easily.
$39.95 EXPIRED
User rating: 340 (40%) 508 (60%) 60 comments

PDF OCR was available as a giveaway on June 1, 2010!

PDF OCR is based on OCR technology to convert scanned PDF paper books and documents into editable electronic text files fast and easily. PDF OCR has a built-in text editor that lets you edit the OCR result text without MS Word.

PDF OCR also supports a batch mode that OCRs all pages of a PDF file to text in one pass. PDF OCR includes a Scanned Image To PDF Converter, which means you can create your own scanned PDF books.

System Requirements:

Windows ME/2000/XP/2003/Vista/7





File Size:

14.5 MB



Comments on PDF OCR

Installation failed the regular way, but succeeded after following #27. Thanks.

Reply   |   Comment by Shankar  –  6 years ago  –  Did you find this comment useful? yes | no (-1)

It works very well IF the document has no formatting.

Reply   |   Comment by GOTD Supporter  –  6 years ago  –  Did you find this comment useful? yes | no (0)

Good program. Very few features, but it does the job perfectly.
Thanks GAOTD!

Reply   |   Comment by Vikram  –  6 years ago  –  Did you find this comment useful? yes | no (0)

Downloaded and registered easily. This is the best pdf converter I have used. It is fast and accurate and unlike other converters it delivers text that can be edited. You do have to cut and paste from Text to whatever word processing system you use, but this is a small task.
Thanks GAOTD. This is a keeper.

Reply   |   Comment by Allen  –  6 years ago  –  Did you find this comment useful? yes | no (0)

First of all, some of you are a little confused as to what use this software might be. It is not a converter, to save PDF files as DOC files, for example.

It doesn't preserve formatting, it just renders plain ASCII text. If you have a scanner, which many printers include, you may have OCR software already installed. However, this OCR software is usually for printed pages, or for fax files, like TIF files. Not many free OCR programs can scan a PDF file.

So what use is this? Well, let's say you use NovaPDF or CutePDF to print an ebook that you've downloaded. The resulting file is likely to be bitmaps embedded in a PDF file. Or suppose you download an ebook from a newsgroup. Many of these are bitmap scans of books or magazines that have no text, only an image of the text. If you have a PDF ebook reader, you're probably OK, but let's say you have one of these MP4 players with an alleged ebook reader built in. Most of these only support text ebooks. So now you have a way to convert your PDF to a form you can read on your MP4 player. Or your "not-so-smart" phone, perhaps.

How well does it work? I scanned 150 pages of a book in about 6 minutes, so it seems to be fast enough for most people. The recognition rate was fairly high, with most text rendered perfectly, although some fonts proved trouble to the OCR engine, particularly with the letter 'k', rendered as "l<", and the letter 'H' rendered as "i-i". These appear to be in chapter titles and with italic fonts, and can easily be fixed with a spell check and global find and replace, if you so choose.
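The global find and replace mentioned above could be sketched roughly like this. This is only an illustration, not part of PDF OCR: the function name and the substitution table are hypothetical, and the table holds just the two confusions reported in this comment.

```python
import re

# Substitutions taken from the errors described above; extend as needed.
# Assumption: "l<" and "i-i" never occur legitimately in the text.
OCR_FIXES = {
    "l<": "k",
    "i-i": "H",
}

def fix_ocr_confusions(text, fixes=OCR_FIXES):
    """Replace each known misrecognized sequence with the intended letter."""
    # Longest keys first so a short key cannot shadow a longer one.
    keys = sorted(fixes, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: fixes[m.group(0)], text)

print(fix_ocr_confusions("i-ie tool< the bool< bacl<."))
```

A blind replace like this has a cost: "i-i" also appears in real words such as "semi-independent", so it is probably worth reviewing the matches (or running the spell check first) before replacing everything.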

In all, an effective piece of software that does what it says, albeit not one that a lot of people may have a use for. But if you have a long commute on public transportation, and don't have a dedicated ereader, you might find this useful.

Reply   |   Comment by Jim  –  6 years ago  –  Did you find this comment useful? yes | no (+2)

#43 -- I had the same problem and had to run "compatibility" for it. Right-click on the program file and select "Troubleshoot Compatibility". It was able to correct and I didn't have it shut down anymore.

Reply   |   Comment by Kelly  –  6 years ago  –  Did you find this comment useful? yes | no (0)

There is no PDF OCR software anywhere at any price that can do a perfect conversion or handle heavy formatting.

I had no problem with this PDF OCR installation. It produces the output in WordPad.

The first pdf page I tested was very complex - a Hebrew to English transliteration consisting of Hebrew characters, Hebrew words written with the English alphabet, and the English language translation.

PDF OCR converted the Hebrew characters into some garbled text, but the Hebrew words in English alphabet came across perfectly, as did the regular English words.

The original PDF had many columns, and PDF OCR converted the columns into rows of plain text sentences, which is what I wanted in this case, but certainly not in all cases.

So if your pdf has columns, be aware that this PDF OCR only converts by rows across all columns, rather than converting one column and then the next.

Would be great to have a choice of converting by row or column.
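To make the row-versus-column difference concrete, here is a minimal sketch. It assumes an OCR engine hands back text boxes with page coordinates; the `Box` layout, the function names, and the `gutter_x` parameter are all hypothetical and say nothing about how PDF OCR works internally.

```python
from typing import List, Tuple

# (x, y, text): left edge, top edge (y grows downward), recognized text.
Box = Tuple[float, float, str]

def read_by_rows(boxes: List[Box]) -> List[str]:
    """Top to bottom, then left to right, ignoring columns entirely."""
    return [text for _, _, text in sorted(boxes, key=lambda b: (b[1], b[0]))]

def read_by_columns(boxes: List[Box], gutter_x: float) -> List[str]:
    """Read everything left of gutter_x first, then everything right of it."""
    def by_position(b: Box):
        return (b[1], b[0])
    left = sorted((b for b in boxes if b[0] < gutter_x), key=by_position)
    right = sorted((b for b in boxes if b[0] >= gutter_x), key=by_position)
    return [text for _, _, text in left + right]

# A two-column page with two lines per column.
page = [
    (0, 0, "Left 1"), (300, 0, "Right 1"),
    (0, 20, "Left 2"), (300, 20, "Right 2"),
]
print(read_by_rows(page))                   # the row-wise order described above
print(read_by_columns(page, gutter_x=150))  # one column, then the next
```

The hard part in practice is finding the gutter automatically; this sketch assumes you already know where it is.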

Reply   |   Comment by mark  –  6 years ago  –  Did you find this comment useful? yes | no (+2)

Today's giveaway is not bad at all and definitely worth downloading and not just because it's FREE for 24 hours...LOL!!

As for FREEWARE ALTERNATIVE I recommend the following apps:

1) to convert PDF files to editable WORD DOCUMENTS:


2) to Convert a Scanned (IMAGE) Document to a PDF File you can use FREE ONLINE CONVERTERS such as ZAMZAR or FreePdfConvert or better yet a FREE nice app called "IMAGE TO PDF CONVERTER FREE".

With this FREE TOOL you can batch convert (scanned) image files (supports JPG, BMP, PNG, TIF, TGA, GIF formats) to PDF document, set page size and PDF information, such as title, subject, author, etc.

Pls note that this tool is standalone software, so it doesn't require Adobe Acrobat Reader and above all it doesn't depend on any print driver meaning that it will NOT install any print driver on your PC.



Reply   |   Comment by Giovanni (King of Freebies....LOL!!)  –  6 years ago  –  Did you find this comment useful? yes | no (-3)

@48 You have to register using the key provided in the readme file when you unzip the package.

Reply   |   Comment by Law  –  6 years ago  –  Did you find this comment useful? yes | no (0)

Many people appear to be confused about what this does. Some people have tried to explain. Please allow me to summarize and clarify:

There are two different ways that a PDF may show you a page:

- Formatted text with fonts and lines made up of individual characters
- A picture of a page

A third form is an overlapped combination of the two. Adobe makes an expensive, corporate product that does the OCR at the same time as the scanning. It shows you the image and has the OCR text hidden but selectable. But the basic two forms are still there, along with their basic limitations.

The first form starts with a Word file or other document. All the words are specified, the fonts, the spacing, etc., inside Word. The PDF contains all that information, usually as boxes, one for each line, and the PDF Reader puts that on your screen. The Reader BUILDS the image out of the text and font information in the PDF file.

The second form starts with a picture of the page. This picture was probably taken by a scanner, but it is just an image. As far as the file structure is concerned, it could be an album of baby photos. There are no words in the file, just image data.

PDF-to-Word converters work with the first type of PDF. They look inside the PDF file and find the text and the formatting details. Some just give you that with the individual PDF boxes, one for each line, the text, the font, and the position on the page. Some apply some intelligence and try to figure out which lines go together into a paragraph, but that information is usually not contained inside the PDF; it has to be figured out. The program has to interpret intent, which is more AI than file format conversion.
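As a rough illustration of that "figuring out", here is one simple heuristic: start a new paragraph whenever the vertical gap between two lines is noticeably larger than the normal line spacing. The data layout, the function name, and the 1.5 threshold are assumptions made for this sketch, not something taken from any real converter.

```python
from typing import List, Tuple

# (y, text): top edge of the line box (y grows downward) and its text.
Line = Tuple[float, str]

def group_paragraphs(lines: List[Line], line_height: float,
                     max_gap_factor: float = 1.5) -> List[str]:
    """Join consecutive lines; break where the vertical gap is unusually large."""
    paragraphs: List[str] = []
    current: List[str] = []
    prev_y = None
    for y, text in sorted(lines):
        # A gap well above the normal line height is treated as a paragraph break.
        if prev_y is not None and (y - prev_y) > line_height * max_gap_factor:
            paragraphs.append(" ".join(current))
            current = []
        current.append(text)
        prev_y = y
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs

lines = [
    (0, "The quick brown"), (12, "fox jumps."),
    (40, "A new paragraph"), (52, "starts here."),
]
print(group_paragraphs(lines, line_height=12))
```

Real documents break this quickly (indented first lines, tight spacing, columns), which is exactly why the guess is closer to AI than to file conversion.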

(A newer form of PDF from Adobe has "reflowable" text and does keep the paragraphs together. But that makes bigger files and the reflow can only be done if you edit the file with Acrobat Pro, so at this time most PDFs do not include reflowable text.)

This program is for the second form of PDF, where the file does not have the text, just a picture. This program uses OCR technology to figure out that one picture blob is an "A" and another similar picture blob is a "B".

Some people appear to expect that because a PDF-to-Word type of program can extract the font and tell you that some text is Arial and some Helvetica, that this program ought to be able to do that too. But that is like comparing Morse Code to Mozart because they both have rhythm.

This program is about OCR, about recognizing the Optical Characters, reading the blobs. To expect it to recognize fonts too is way too much for today's technology. Even professional designers have trouble identifying exact fonts when they have full alphabets and high-resolution images to work with. Some OCR systems do make guesses at fonts, sometimes guessing as many as four different fonts in the same word, which you then have to fix manually. And, as I mentioned above in the limitations of PDF-to-Word converters, deciding what makes a paragraph is a bigger problem.

This program will save you the trouble of reading the page and retyping it. Mostly. The more fuzzy your images, the more mistakes it will make.

But it will not recreate the original source document for you.

I hope this helps,

Reply   |   Comment by August  –  6 years ago  –  Did you find this comment useful? yes | no (+35)

Windows XP Pro SP3 - IBM Thinkpad T42
- no problems with installation or activation
- tested on pdf file printed with "CutePDF Writer" worked quickly and easily. Very good results.
- very pleased, thank you GAOTD!

Reply   |   Comment by 6o6  –  6 years ago  –  Did you find this comment useful? yes | no (+1)

@ 24, I think what 8 was saying is that it outputs PLAIN text. Do you know the difference between unformatted plain text and something like a fully formatted DOC or RTF file?

Reply   |   Comment by Nobody  –  6 years ago  –  Did you find this comment useful? yes | no (-3)

Maybe I missed it but, why not tell us before we install the software that the limit to OCR is three (3) pages. If I had known, I would not have installed it.

Reply   |   Comment by Sterling  –  6 years ago  –  Did you find this comment useful? yes | no (-3)

#39, williegetaway,
Thanks, I just followed your advice: "Registers on the 2nd attempt for some reason." Worked fine that way on XP, no administrator or anything, just two cycles of enter registration code, close program, run program.

Reply   |   Comment by Joe T.  –  6 years ago  –  Did you find this comment useful? yes | no (-1)

Not too bad. I took a photo of a page from a book and it recognized ~70% via image->pdf->OCR->textfile. It has major problems with text curving a bit due to perspective distortions (if the camera was not held parallel to the page), but it is "understandable". It should work quite well with scanned images. I have tried other OCR software on the same image (SimpleOCR) and it did not produce as good a result as this one did. Thumbs Up (for English text).
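For anyone who wants a number rather than an eyeball estimate like the ~70% above, one possible approach is to compare the OCR output word by word against a hand-corrected copy of the same page. This sketch uses Python's standard difflib; the function name is hypothetical, and it assumes you have such a reference transcription.

```python
from difflib import SequenceMatcher

def word_recognition_rate(reference: str, ocr_output: str) -> float:
    """Fraction of reference words that also appear, in order, in the OCR output."""
    ref_words = reference.split()
    ocr_words = ocr_output.split()
    if not ref_words:
        return 0.0
    matcher = SequenceMatcher(None, ref_words, ocr_words, autojunk=False)
    # Each matching block is a run of identical words found in both texts.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)

reference = "the quick brown fox jumps over the lazy dog"
ocr_text = "the quicl< brown fox jumps 0ver the lazy dog"
print(f"{word_recognition_rate(reference, ocr_text):.0%}")
```

This counts a word as wrong even if only one character is off, so it is stricter than a per-character rate.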

Reply   |   Comment by KingOhTheHuns  –  6 years ago  –  Did you find this comment useful? yes | no (+6)

Nitro Reader 1 (beta) for Windows XP, Vista, and 7 (22.6 MB), available at http://www.nitroreader.com/download/, does this and a lot more besides, and is freeware.

Reply   |   Comment by Tom  –  6 years ago  –  Did you find this comment useful? yes | no (+4)

I tried it on XP with a text PDF that was scanned as an image. It worked well except for the words in italics, it doesn't seem to recognize those.

Reply   |   Comment by Brenduh  –  6 years ago  –  Did you find this comment useful? yes | no (+2)

Unfortunately it shut down my laptop twice.
I have Windows 7 Ultimate and after reaching some progress PDF OCR shuts down the computer.
I tried it with and without Administrator privileges but got the same result.

Reply   |   Comment by smmisme  –  6 years ago  –  Did you find this comment useful? yes | no (-3)

Quite surprised by this program. All it does is make a PDF of an image.
It's not editable at all!

Reply   |   Comment by Buzz  –  6 years ago  –  Did you find this comment useful? yes | no (-10)

Registered fine first time on WinXP SP3.

Program wanted to place the install directory at the root of the drive, but allowed me to change it to a directory under "Program Files" just fine.

PDF OCR is actually two separate programs. Image to PDF converts a picture to PDF and DOES NOT perform any OCR on the image. The PDF is accurately rendered, but is not searchable. However, you can now use this PDF with the 2nd part of the tool: PDF OCR.

Note that PDF OCR does NOT produce searchable PDFs.

Rather, it does an OCR of the text in an Image PDF and extracts it to a text document. I find this useful, but much less useful than a tool such as Abbyy Transformer (commercial, not free) that can create searchable PDFs from Image PDFs.

The sample PDF OCR that I tried would require a lot of editing to get all of the errors corrected.

Usage 1:
You'd like to get the text from an image file (PNG, JPG, etc.). You can convert Image to PDF, and then PDF OCR the Image PDF and have the text in a text file for further editing. Saves you having to manually type the text from the image file yourself (with the exception of correcting errors from the OCR function).

Usage 2:
You have a PDF file with lots of text, but when you try to search, you find that each page of the PDF text was captured as images, and you can't edit or search it (some PDF printer drivers/writers create PDFs this way). You can use PDF OCR on these PDF files to create searchable text files to go with the PDFs.

So, for me, the Image to PDF function works well, but the PDF OCR portion is okay as a give-away, but not at a quality level that I'd buy it yet.

Reply   |   Comment by Doug A  –  6 years ago  –  Did you find this comment useful? yes | no (+10)

Not a bad little utility. Could use a setup menu where you could specify where to send your output file.

Reply   |   Comment by John  –  6 years ago  –  Did you find this comment useful? yes | no (-4)

Registers on the 2nd attempt for some reason. Very basic piece of software - you won't be tweaking this one to suit your preferences.

For pdf to text it merely scans across the page from left to right, top to bottom. If your pdf has any text-boxes, or if it's written in columns, then the output text will simply be whatever you would read going just across the page line by line. You would need a lot of editing to make it usable. Another drawback, as someone else said, is that you cannot define where the output will go. It creates a 'pdftoocr_output' folder on your drive and will only output to that for the resulting text. No options to change that I can find. The quality of the scanned text is also quite poor. Simple mistakes crept in, like it thinking the word "song" was "seng". A couple of times it mistook an "s" for an "r"... annoying mistakes like that.

The 'scan image to pdf' works quite a bit better. Navigate to the jpeg on your computer and select it - you can in this case also browse to any folder you'd like for the output to go - and start it up. Fast and very nice quality image produced in a .pdf file.

To pay almost $40 for this... wow! :(
For free you might want to keep this if you have nothing better.

Reply   |   Comment by williegetaway  –  6 years ago  –  Did you find this comment useful? yes | no (+4)

I have a 30 page document from our home owners' association that is in pdf, but is a scanned image, not an editable pdf document. This program processed all 30 pages in about 2 minutes, converting it to editable text. Note that this is PDF to Text, folks, not PDF to Word, which one would hope would be better formatted. But, at least now everything can be moved to Word and edited as needed.

Lots of extra 'junk' in the text file, but that's because the document is fairly old and there is lots of 'junk' in the original pdf scan.

Default location of the C:\ drive for the program and the resulting conversions text files is very non-standard. There should be an option as to where to save the files. Also the default installation off of the root C:\ drive is very non-standard; at least I could change that to C:\Program Files.

I agree that the green arrow is misleading. I clicked it at the end of my scan process and woosh, closed! At least I was able to find my processed document in their folder, but again, let me choose where to put it.

Good automation; good functionality; poor UI. But a tool in my arsenal to use periodically.

Reply   |   Comment by mike2977  –  6 years ago  –  Did you find this comment useful? yes | no (+18)

Satisfactorily does what it says it will do. John, it will capture the text of the form, but not the boxes or formatting. It's great for my use of wanting to extract and adapt things from an external source that come to me in PDF, but for adapting a PDF form, you'd be better off going to the originator and asking for an editable file, if possible.

Does great with straight text, baffled by graphics, offers humorous but understandable efforts with italic or "creative" fonts.

A keeper for me, especially at the price.

Reply   |   Comment by SweetLorraine  –  6 years ago  –  Did you find this comment useful? yes | no (+7)

Question for everyone - From an elementary school teacher standpoint - Would this be simple to use for a school secretary? They often get things in PDF that need modification. It appears to me that this is good for getting a PDF and then changing it, but if the original is a form with boxes, it would not keep that format?

Reply   |   Comment by John  –  6 years ago  –  Did you find this comment useful? yes | no (-4)

I have lots of PDF to Text and Word programs. This is the only one that successfully converted a PDF that was composed of screenshots. The most the others did was convert the PDF into individual images.

Totally thumbs up on this one

Reply   |   Comment by MrWeb  –  6 years ago  –  Did you find this comment useful? yes | no (+9)

This PDF OCR package has two independent features, so you have to register twice to erase the "Unregistered Version" text in the interface. I think it does a great job of converting a scanned image to PDF; the PDF is as good as the scanned image, really. However, the scanned PDF to text part of the program only works well if the source PDF is a pure text file. If it contains images, the converter outputs so much garbage that you can't read the relevant information. If only the PDF-to-text converter ignored images and other non-text stuff, this would have been perfect for extracting text information from images! On the whole, though, I'll still keep this program. Thanks PDFZilla and GAOTD!

Reply   |   Comment by Albert Born  –  6 years ago  –  Did you find this comment useful? yes | no (+7)

#8 what do you think OCR is? It is text!

Reply   |   Comment by Bubba  –  6 years ago  –  Did you find this comment useful? yes | no (-14)

Well, I installed and activated it very easily.
(#21, try running the program as Administrator, then press the Scanned PDF to Text button, click "Close This Window" on the window that pops up, then go to About -> Register and enter your key there.)
And so I decided to test it. I made a silly little document and turned it into a PDF file via CutePDF (a PDF printer). So now it was time to test PDF OCR. I gave it the file, I told it to convert the page, and what I came out with was almost an EXACT COPY of what I typed in that document, except it misrepresented the o's as 0s. Then I tried another document, which explained information for a trip I went on a while ago, and it made quite a few mistakes. For one, it always got the slogan of the tour company wrong, and then it made a few mistakes here and there (like Aftermoom, etc.). But other than that, this program is VERY GOOD at converting English PDFs into English TXTs. However, don't put any symbols in the document. The last page of that second document I tried out had musical notes, and on the OCR conversion, they were either turned into Ps or Fs. I will rate it thumbs up, but let's face it: PDF OCR needs support for not just symbols, but also other countries' languages.
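The o-versus-0 mix-up described above can often be cleaned up afterwards. A minimal sketch, assuming a zero is wrong whenever it sits inside a token made only of letters and zeros; the function name is hypothetical and this is not a feature of PDF OCR:

```python
import re

# Tokens made up only of letters and zeros, e.g. "g00d" but not "100" or "A4".
WORD_WITH_ZERO = re.compile(r"\b[A-Za-z0]+\b")

def zeros_to_os(text: str) -> str:
    """Turn 0 into o inside words, leaving real numbers untouched."""
    def fix(match):
        word = match.group(0)
        # Tokens with no letters at all ("0", "000") are real numbers: keep them.
        if not any(ch.isalpha() for ch in word):
            return word
        return word.replace("0", "o")
    return WORD_WITH_ZERO.sub(fix, text)

print(zeros_to_os("G00d aftern00n, r00m 100 c0sts 20 d0llars."))
```

It always writes a lowercase "o", so a zero at the start of a sentence ("0ver") still needs its capital restored by hand, and codes that mix letters and zeros on purpose would be damaged.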

Reply   |   Comment by SkippyElectrochomp  –  6 years ago  –  Did you find this comment useful? yes | no (+14)


Your Softpedia review link is for version 3.2 which, interestingly enough, was freeware!

Reply   |   Comment by janet  –  6 years ago  –  Did you find this comment useful? yes | no (+8)

WORKS with Vista!

Difficult to register, as others have said, but it is worth the trouble.

There were no mistakes when it turned a one-page PDF into unformatted text. (Naturally, it did not try to work with an image on the page).


Reply   |   Comment by Bill  –  6 years ago  –  Did you find this comment useful? yes | no (-2)

OCR stands for "Optical Character Recognition" - so by definition it should be able to recognize text from a scanned PDF, if it doesn't, it's a mis-named program.

Reply   |   Comment by wlegrow  –  6 years ago  –  Did you find this comment useful? yes | no (-16)

This is a very quick and simple tool to get editable formatted text.
It installed very easily on XP sp3. It needs a Tools/Options tab so that the program saves the output file where I can easily find it. Otherwise pretty neat, thanks Gaotd.

Reply   |   Comment by GeorgeIbiza  –  6 years ago  –  Did you find this comment useful? yes | no (+11)

I suggest you download this but only to get the serial number - don't install the program but download the program from their web site www.pdfocr.net and install that version but use the GOTD serial number - worked for me.
I normally try this with other GOTD offers as it gives me a 'reinstallable' program.

Reply   |   Comment by Graham  –  6 years ago  –  Did you find this comment useful? yes | no (+48)

OpenOffice has a PDF editor plug-in, and you can export to PDF after you're done editing.

Reply   |   Comment by jim  –  6 years ago  –  Did you find this comment useful? yes | no (-9)

Works well. Very simple and leaves you with a text file, which, although losing some formatting, is much better than many PDF to Word programs which leave everything in boxes. Also (of course) can be used with downloaded PDFs. Many thanks GOTD.

Reply   |   Comment by chris  –  6 years ago  –  Did you find this comment useful? yes | no (+9)

The program is quite good because it can pull out all the text from a very blurred PDF document accurately, and a lot faster than some programs I tried before. #4: As I understand it, OCR programs can only pull out text as unformatted. You will need to use "PDF to Word" programs to obtain formatted text, but these programs can only pull out text from PDF documents with a text layer, and they will give you a blank page if used on a scanned PDF, which is formatted like an image with no text layer.

Reply   |   Comment by Van  –  6 years ago  –  Did you find this comment useful? yes | no (+10)

@ #7; Well non-support of non-English characters would make Scandinavian languages problematic for a start, so you don't have to be working in Russian or Chinese to see limited language support as a serious drawback.

Reply   |   Comment by mandoran  –  6 years ago  –  Did you find this comment useful? yes | no (-1)

Seems there is no Chinese support... P.S. to #6: it is PDF OCR... OCR for PDF.

Reply   |   Comment by c933103  –  6 years ago  –  Did you find this comment useful? yes | no (-12)

Even using the convoluted procedure described in #2, the thing will not register; I'm still exhorted to "Buy it now!" Is there any chance that this will actually register? If not, I'll delete it now, rather than getting used to something that will stop functioning. Should it be considered a 'giveaway' if you have to be Stephen Hawking to figure out the registration process?

Reply   |   Comment by Jake Shakespeare  –  6 years ago  –  Did you find this comment useful? yes | no (-1)

The PDF to OCR part does not let you choose where the output goes.
Was not very impressed by the quality of the OCR or its usefulness, as the output is a .txt file rather than Word .doc or .rtf, so all formatting is lost.

Reply   |   Comment by Natalie  –  6 years ago  –  Did you find this comment useful? yes | no (+4)

A detailed review of this software can be found here:

They gave it a 4 star out of 5 rating two months ago. I'm pretty sure it's the same version.

Reply   |   Comment by Eli  –  6 years ago  –  Did you find this comment useful? yes | no (+22)

#6: No, you don't need scanner. This is from PDF to text. You just need a PDF document and nothing else.

Reply   |   Comment by TheSmokingTeam  –  6 years ago  –  Did you find this comment useful? yes | no (+55)

Downloaded, installed, and running fine on Vista 32-bit. Thanks, GOTD & PDFZilla.

Reply   |   Comment by Just Passing By  –  6 years ago  –  Did you find this comment useful? yes | no (-2)

@2 - thanks for this advice. I'd probably have been lost without it. Works for Vista, too.

Reply   |   Comment by Wendy Peterson  –  6 years ago  –  Did you find this comment useful? yes | no (-2)

To #6
Of course you have to have a scanner ......... how else can you "scan to OCR" ?? !!

Reply   |   Comment by Roscoe  –  6 years ago  –  Did you find this comment useful? yes | no (-71)

I suggest the manufacturer tackle non-Latin languages. There is a big market out there for that that is almost untouched. But remember to make it good. I tried out one that produced more gibberish than valid text.

Reply   |   Comment by Pat Kopp  –  6 years ago  –  Did you find this comment useful? yes | no (-9)

User interface: it is a bit strange to me, to say the least, that the Exit button is a large green arrow pointing forward. A red X or an octagonal Stop traffic sign would have been a better choice.

UI: button bar in PDF to Text mode has no text under buttons. Button bar in Image to PDF does have text below button symbol. No way to choose yourself. Inconsistent, can be improved by developers.

No way to switch from 'PDF to Text' mode to 'Image to PDF' mode except by exiting the program, starting it again, and choosing the other mode from the initial screen.

Reply   |   Comment by paul  –  6 years ago  –  Did you find this comment useful? yes | no (+13)

Bit of a pain to register.
Windows 7 installed OK, then a window popped up with the registration number. Then I had to "Run this program as an Administrator" (even though my W7 is already running as Administrator). Bit of stuffing around, as per #2.
But it's going and I can make use of this. I haven't found an OCR yet that does the job really well, so it's pretty well par.

Reply   |   Comment by Paul Grenfell  –  6 years ago  –  Did you find this comment useful? yes | no (-9)

Installed fine with the reg key from the text file on Win 7 32-bit, but after using it to scan a few PDFs, my observations are as follows:

1) If you are taking this for a PDF editor, this is not your choice.
2) Stores output as a text file with no source formatting, and the output is of no use unless it's the same as the source (at least in my case).
3) Does not scan signatures etc.
4) If the source is in a language other than English, the converter runs for a long time before generating the output.

Reply   |   Comment by Ashish razdan  –  6 years ago  –  Did you find this comment useful? yes | no (+18)
