Every day we offer FREE licensed software you’d have to buy otherwise.
PDF OCR 4.3.3 was available as a giveaway on June 30, 2014!
PDF OCR uses OCR technology to convert scanned PDF books and documents into editable electronic text files quickly and easily. PDF OCR has a built-in text editor that lets you edit the OCR result text without MS Word. PDF OCR also supports a batch mode that converts all pages of a PDF file to text at once.
PDF OCR has a Scanned Image To PDF Converter, which means you can create your own scanned PDF books.
System requirements: Windows 7, Vista, XP, 2003, 2000, ME; Pentium Processor or better, Pentium 4 or higher recommended; 128MB RAM or more, 256MB RAM is recommended; 20MB hard disk space for install
File size: 27.9 MB
Price: $49.95
PDF to JPG Converter for Mac is a Mac application that quickly converts PDF documents to image files such as JPG, PNG, BMP, GIF or TIFF. The program also lets users customize the output image DPI (dots per inch) to get large high-quality images or small thumbnails. When users convert several PDF files in bulk, a folder creation function can write each PDF file's output to a new folder named after that PDF file.
JPG To PDF Converter for Mac creates PDF documents from image files that you select. Simply drag and drop your images and photos onto the application, and click the Convert Now button to start the task.
PDF To Excel Converter is a Windows application that converts PDF documents to the Excel XLS format quickly and easily. With PDF To Excel Converter, users can edit their PDF forms in MS Excel in a few clicks.
Did not work well for me. Aiseesoft worked so much better on PDF to Word
Easy to install and register (Windows 8.0 64-bit). Agree with Karl that output path being fixed to c:\PDFOCR_Output is not good practice, since most people try to keep their C: drive clean. Likewise the default installation directory should be a sub-directory of C:\Program Files (x86)\ which is where I installed it.
Scan and conversion is pretty quick, but after trying on an assortment of input files can only evaluate as producing very mixed results (sometimes quite garbled and meaningless text is produced, in other cases the text is a fair to middling representation of the original). Still, it's probably worth having for the occasional OCR conversion.
Thanks to PDFZilla + GOTD.
Uninstalled my old version and installed this new version.
Does a basic task well.
Unchangeable default output folder? Small issue.
Just move the output files from C:\PDFOCR_Output folder to any other folder that you like.
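For anyone who would rather script that move than do it by hand, here is a minimal Python sketch. The function name `move_output_files` is made up for illustration, and the target folder is whatever you choose; only the source folder C:\PDFOCR_Output comes from the comments here. It does not handle files that already exist in the target folder.

```python
import shutil
from pathlib import Path


def move_output_files(source_dir, target_dir):
    """Move every file in source_dir into target_dir and return the new paths."""
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the destination if it is missing
    moved = []
    for item in sorted(source.iterdir()):
        if item.is_file():  # leave sub-folders alone
            destination = target / item.name
            shutil.move(str(item), str(destination))
            moved.append(destination)
    return moved
```

A call such as `move_output_files(r"C:\PDFOCR_Output", r"D:\Documents\OCR")` would empty the fixed output folder after each run; the D: path is only an example.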
Multi-column image of words?
Crop each column and then OCR with this software; or
use ABBYY Screenshot Reader and use the cross hairs to define each column; or
use ABBYY FineReader, which can detect multiple columns.
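The crop-each-column idea can be sketched in Python. This is only an illustration under a loud assumption: the columns are of equal width and separated by a gutter of known size, which real scans often are not. The function `column_boxes` and its parameters are hypothetical names, not part of PDF OCR or any ABBYY product.

```python
def column_boxes(width, height, columns, gutter=0):
    """Return (left, upper, right, lower) crop boxes, one per column, left to right."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    boxes = []
    for index in range(columns):
        left = width * index // columns
        right = width * (index + 1) // columns
        # Trim half the gutter off every inner edge so the gap between columns is dropped.
        if index > 0:
            left += gutter // 2
        if index < columns - 1:
            right -= gutter // 2
        boxes.append((left, 0, right, height))
    return boxes


# A 1000 x 1400 pixel page with two columns and a 40 pixel gutter.
for box in column_boxes(1000, 1400, 2, gutter=40):
    print(box)
```

The boxes use the (left, upper, right, lower) order that image libraries such as Pillow accept for cropping, so each one can be cut out, saved as its own image, and fed to the OCR in reading order.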
@17 Angela
I'd suspect that newspaper print is too low definition for OCR to work well, being composed of a fairly coarse pattern of dots - OK for the human eye/brain but not for something as dumb as a scanner.
Just a thought.
Over the years I've found that the little freebie (well, 22 MB) gImageReader (a graphical GTK frontend to tesseract-ocr) gives good results. It's a little bit quirky, but it does allow you to highlight chunks of text, append different chunks of text, strip out line breaks etc., and it has an on-board editor. The developer responds quickly. Get it here: http://sourceforge.net/projects/gimagereader/
@13 XP-Man
Sorry to contradict you but the links haven't changed since my first comment publishing them here some months ago. But I fear if they are published too often they will vanish as I think they were just forgotten.
Installed and registered smoothly. The first thing I did was to load a PDF form of the (Name:____________, Yes[] No[]) type. The program completely destroyed the format of the page and only showed text letters, no boxes, lines, etc. It was like reading paragraph after paragraph of jumbled, misspelled words, spaces where they didn't belong, etc. Uninstalled.
An OCR program that, like today's GOTD, can retrieve only text without its proper layout is generally useless. Those who think otherwise (and a number of members of the forum seem to) may test their convictions by opening a novel and reading the first line of the left page, then the first line of the right page, and so on down to the last lines of the pages, and then continuing through the novel in this way. I'd love to hear their summaries of what they have read, provided that they are able to get through the whole story this way without losing their sanity.
Sorry to be sarcastic, but I highly appreciate the professional level of this forum, whose tradition is to call out statements that seem to push sales rather than quality and competence.
Easy to install; quickly uninstalled. Viewing and downloading pdf files from online has always worked beautifully until I installed PDF OCR this morning. Now when I try to upload online PDFs, all I get are fatal DDE errors. Removed PDF OCR and the fatality remains. Never had a lick of trouble until this install. While this may be a set of unrelated events, it's hard to overlook the coincidence.
Before uninstalling, however, I did try to convert a saved document with a layered watermark (a library marking indicating that the file is outdated). The document converted to gobbledygook, so PDF OCR must have been unable to separate the layers.
@#10 Karl: You are correct: Aiseesoft PDF Converter is the best.
Here's what I found with today's offering: This program does not do or accommodate non-text input (i.e., pictures, tables, etc.), and seems to have about a 10% error rate on good, solid plain text. Italics are almost completely unrecognized. This is barely acceptable for anyone with a long document in scanned form, since the output would have to be tediously error-corrected by hand, by comparing the original with the OCR’d output. It might be acceptable for use with short documents.
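If you have a trusted transcript of a page, the comparison with the OCR'd output can be roughed out in a few lines of Python instead of by eye. This is only a sketch: `estimated_error_rate` is a made-up name, and the score comes from the standard library's `difflib` similarity ratio, not from a formal character error rate or from the method used for the 10% figure above.

```python
from difflib import SequenceMatcher


def estimated_error_rate(reference, ocr_text):
    """Return a rough 0..1 mismatch score between a trusted transcript and OCR output."""
    if not reference and not ocr_text:
        return 0.0
    # ratio() is 2*M/T, where M is the number of matched characters and T the combined length.
    similarity = SequenceMatcher(None, reference, ocr_text, autojunk=False).ratio()
    return 1.0 - similarity


# One wrong character out of four.
print(estimated_error_rate("abcd", "abxd"))
```

Running it page by page shows quickly whether a document is worth correcting by hand or faster to retype.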
Checked the developer's website, as I always do, and found that the supported versions of Windows are "Windows ME/2000/XP/2003/Vista/7". ME? That's going back a while. Then at the bottom of the page I saw this "Copyright © 2010 PDFOCR.net." Looking in the installation folder, I also found that most of the files were dated 2008/09/10. The program doesn't seem to have been updated recently.
Tried it out on a PDF file in both English and Dutch which also contained graphics. Although the selected language was English, it didn't seem to have any problem with the Dutch text. The trouble was it tried to OCR the graphics and the output produced things like:
, - , ·
1 _ i . · -·· . · .
41 in i» -- a i ¤-=
- M_ _ I gg V
i _ "" f Ai-• .
gi I i E- -»—--»
ii.
I; ’ r ’ P " 4 “"i»
· # ‘ ‘‘‘‘‘ ‘* *- ·*·· ‘
-— ... _ ‘*~=r =~¤¤~r gl S
` . ' ...
If you're working with documents containing graphics, this program isn't the best choice. And why does it install an old (pre-ribbon) version of Wordpad in the program folder in order to display the program's output? There must be any number of freeware / open source text editors that could have been used rather than risking the wrath of Microsoft.
# XP-Man
OK for registration with the top link,
but impossible to download the program... what is wrong?
#8 JULIA wrote:
"A program which chooses to install itself into a unique root folder by default, and uses yet another one as a fixed output folder? No thank you. Developers who are so cavalier about folder structures do not inspire any confidence in me."
I had no problems installing to D:/APPLICATIONS/PDF/pdfOCR---just where I wanted it....:-)....
If Julia weren't in such a hurry to install, she would have had time to follow the instruction to choose an installation folder. I put in, you guessed it: Program Files. A good freebie for converting almost anything printable to PDF is Nitro PDF.
Does not work at all on scanned newspaper. Outputs the same text as everything else I've tried. Takes FAR less time to just type it than it would to clean up the mess it produces.
Is not designed to convert a scanned PDF to a PDF with editable text. Output will just be a TXT file. I have a scanned PDF that has "mixed content": text with objects. I use it to test all OCR programs like this one. Output was a jumbled mess. Uninstalled. If PDFZilla hopes to compete with all of the other programs on the market, they need to step up their game. This isn't going to cut it.
Don't overlook PDF X-Change (ALWAYS FREE)... which includes OCR as well as many basic PDF viewing/markup tools.
Steve,
You also might be able to use Unix utilities cut or awk to extract one column at a time. There would have to be a way to separate fields (tabs or multiple spaces) - something that doesn't occur in the body of the column.
http://answers.microsoft.com/en-us/windows/forum/windows_xp-windows_programs/how-to-run-grep-and-awk-on-windows/73fb9fd6-92f3-4b5f-baf4-38c60c0ca397
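For those without cut or awk to hand, the same column-splitting idea can be sketched in Python. It rests on the assumption stated above: the columns in the OCR'd text are separated by tabs or by runs of two or more spaces, and such runs never occur inside a column. The names `FIELD_SEPARATOR` and `extract_column` are made up for this example.

```python
import re

# Assumed separator: any run made of tabs and/or groups of two or more spaces.
FIELD_SEPARATOR = re.compile(r"(?:\t| {2,})+")


def extract_column(text, column_index):
    """Return the chosen field from every line that has one, top to bottom."""
    fields = []
    for line in text.splitlines():
        parts = FIELD_SEPARATOR.split(line.rstrip())
        # Skip lines where that column is missing or empty.
        if column_index < len(parts) and parts[column_index]:
            fields.append(parts[column_index])
    return fields


page = "The quick brown   jumps over the\nfox was here      lazy dog again\n"
print("\n".join(extract_column(page, 0)))
print("\n".join(extract_column(page, 1)))
```

Printing column 0 and then column 1 puts the text back in reading order. If the OCR output keeps only single spaces between the columns, this approach has nothing to split on.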
#4 Giovanni
Your link to ABBYY FineReader http://forum.raymond.cc/threads/abbyy-finereader-5-0-pro-for-free.35458 now works differently.
The top link used to go to the serial number and the lower link to the download of the program itself. This has changed; ABBYY themselves appear to have taken over this site and are using it to promote their OCR engine.
In my opinion ABBYY still make the best OCR engine, so it is still worth trying to get a copy of their FineReader; the fact that this is an older version is probably not too important, as I believe their OCR engine has not changed much.
Although it does not convert directly to PDF, OpenOffice or LibreOffice will import the results from ABBYY and then you can export them to PDF.
Installed OK, BUT there are not many choices for the user. I use PDF-XChange Viewer; it has a built-in OCR feature with multiple languages plus a built-in search engine that lets you search your PDFs for any word or phrase. It has mild editing capabilities along with a PDF converter, all for free.
http://www.tracker-software.com/product/downloads
michael clyde
@9
Well, no. Good OCR software can identify blocks of text in the image, retain formatting etc. But they do tend to cost lots of money.
#5 Lag
The former GAOTD Aiseesoft PDFtoWord Converter does a pretty good job. It works with the ABBYY FineReader engine:
Original Page 40
http://www.xup.to/dl,16164998/Page40-orig.pdf/
To Word OCR'd
http://www.xup.to/dl,47448517/Page40-OCR.doc/
The actual FineReader would do even better...
Steve claimed that "If your PDF has text in columns, this OCR just reads across the page and jumbles the text up in the resulting text file – absolutely no use at all!". Well, Steve, it seems you are ignoring the basic fact that OCR is intended to locate, recognize and extract TEXT included in a PHOTO. Thus, if you need to recognize text in columns, first CROP the photo into TWO separate photos, one containing the left column and the other the right one (and so forth if there are more columns on the same page), and only then process them one by one, in the same order, using the OCR feature.
A program which chooses to install itself into a unique root folder by default, and uses yet another one as a fixed output folder? No thank you. Developers who are so cavalier about folder structures do not inspire any confidence in me.
It's a keeper for me; it can convert images to PDF too! I am just afraid it will expire after some trial period, likely after 4 or 5 months.
I have noticed this with most of the giveaways I have downloaded, and it doesn't sit well with the idea of a giveaway.
Thank you for this.
Installed & reg'd - no problem - Windows 7
Does exactly what it says on the tin - in fact this is the first PDF editor I've tried which actually works ! All the others I tried were simply image editors.
I agree with Karl - Auto saving to C:\PDFOCR_Output is wrong and although I can 'save as' to wherever I like I still need to go back to C:\PDFOCR_Output to delete - tedious. I would like to see options/settings/preference within the program to change this but I imagine most would be happy if the 'auto save' went to 'My Documents'
I empathise with Steve about losing columns - but I didn't expect an OCR to be able to read formatting as it does not contain 'characters'
$50 seems over the top to me but if I had a large workload - perhaps it isn't.
All in all - great program which I will keep
Thanks for this one GOTD
I'm testing programs like this on this book from 1967 - http://archiwalna.glos.pl/files/biblioteka/Ksenofont%20-%20Pisma%20Sokratyczne.pdf - no program so far could handle it. Go to page 40, for example, and check out the numbers that divide the sections of text (there's also a footnote). This always results in a total mess after conversion.
It's OK...easy to use, can edit the OCR results with a built-in text editor, thus enabling you to edit scanned PDF files without using Word or other third party text editors.
It also gives you the option to convert one single page, a range of pages, or the entire PDF file in batch mode (supports even the Italian Language ==> simply unbelievable:...LOL!)
As for conversion quality, I found it pretty GOOD with regard to TEXT, not so good when required to extract text from images.
But obviously if you are looking for a more professional OCR product (for instance an ABBYY OCR software) you have to spend some money!!!
Anyway, THUMBS UP from me!
BEST FREE ALTERNATIVES (most of them require you to pay attention during installation)
http://www.ocronline.com (==> Supports over 153 languages)
http://www.paperfile.net (==> It also uses the powerful Tesseract engine by Google like this GAOTD)
http://capture2text.sourceforge.net
http://www.softpedia.com/get/Office-tools/PDF/Free-Image-OCR.shtml
http://vietocr.sourceforge.net
Or just try this old version of ABBYY FineReader (it does not support PDF files though):
http://forum.raymond.cc/threads/abbyy-finereader-5-0-pro-for-free.35458
And if you use Chrome:
http://projectnaptha.com
Finally to create a PDF file directly from scanned documents and images for FREE:
http://www.softpedia.com/get/Office-tools/PDF/Free-Scan-to-PDF.shtml
http://www.softwareok.com/?seite=Microsoft/WinScan2PDF
Enjoy!! ^_^
If your PDF has text in columns, this OCR just reads across the page and jumbles the text up in the resulting text file - absolutely no use at all!
Installed and registered without problems on a Win 8.1 Pro 64 bit system.
A (US) company without name and address, but a toll free phone number.
A resizable window opens, you can either scan a .pdf to text or an image to .pdf.
In the .pdf to .txt path there are no options except the OCR language, and the output path is fixed to c:\PDFOCR_Output, where it simply does not belong.
In the image to .pdf task, you simply convert an image; you can change the default output c:\output to a more useful path and enter author, subject and keyword. That's it.
More interesting is the OCR. This program uses the free Tesseract-OCR, which is claimed to be probably the most accurate open source OCR engine available. Tesseract is maintained by Google.
It is indeed simply a frontend to tesseract - but does not mention tesseract.
The OCR capabilities are quite good, not as good as the commercial programs, but it has a quite unique feature, important for German users: you can change the recognition language to "Fraktur (old German)". Recognizing Fraktur is among the most difficult OCR tasks. The "Fraktur" OCR is astonishingly good!
The tesseract engine does quite a good job with this.
I have uploaded two examples, which you can easily test yourself. They are in German:
This is from a book by Fanny Lewald, printed in 1845
http://www.xup.to/dl,15640991/Seite_25.pdf/
And this page is from an old Hungarian cookbook, a book which I always use for test purposes:
http://www.xup.to/dl,15441766/Seite_8..pdf/
A simple to use OCR program, for German "Fraktur" readers nearly a must.
I'll keep it for this reason.
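Since the program is a frontend to Tesseract, the same recognition can in principle be run directly from the Tesseract command line. Here is a rough Python sketch of that, with several assumptions: Tesseract is installed and on the PATH, the PDF page has already been saved as an image (the command-line tool reads images, not PDFs), and a Fraktur language pack is installed. The code `deu_frak` is the name used by older Tesseract 3.x language packs and may differ on your installation; the function names and the file name Seite_25.png are made up for illustration.

```python
import subprocess


def tesseract_command(image_path, output_base, language="eng"):
    """Build the argument list for: tesseract IMAGE OUTPUTBASE -l LANG."""
    return ["tesseract", str(image_path), str(output_base), "-l", language]


def run_ocr(image_path, output_base, language="eng"):
    """Run Tesseract, which writes the recognized text to OUTPUTBASE.txt."""
    subprocess.run(tesseract_command(image_path, output_base, language), check=True)
    return f"{output_base}.txt"


# Show the command that would be run for a Fraktur page (language code is an assumption).
print(" ".join(tesseract_command("Seite_25.png", "Seite_25", "deu_frak")))
```

Calling `run_ocr("Seite_25.png", "Seite_25", "deu_frak")` would then leave the text in Seite_25.txt, in whatever folder you choose rather than a fixed one.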
I had no idea a program like this was available. I have scanned a very old and rare document as a PDF file. I have been painstakingly transcribing it into my word processor. At my pace it was going to take a good six months to complete. This "Give Away" is a true blessing. It installed perfectly on my laptop with Windows 7. Thank you GAOTD and PDF OCR. I love this product and will certainly spend the money to purchase it. It is well worth the $50 price tag.