Converting PDFs and images to text via OCR software

Started by CrackSmokeRepublican, December 12, 2010, 06:13:48 PM


CrackSmokeRepublican

I've been looking at this but don't want to use Google. Does anybody know of a free service that does this in bulk? It seems like this Google project could be taken in-house for the job:


http://www.labnol.org/software/convert- ... ocr/17418/

http://www.labnol.org/internet/perform- ... ocs/10059/

I've been looking at ABBYY FineReader as well:
http://finereader.abbyy.com/
After the Revolution of 1905, the Czar had prudently prepared for further outbreaks by transferring some $400 million in cash to the New York banks, Chase, National City, Guaranty Trust, J.P.Morgan Co., and Hanover Trust. In 1914, these same banks bought the controlling number of shares in the newly organized Federal Reserve Bank of New York, paying for the stock with the Czar\'s sequestered funds. In November 1917,  Red Guards drove a truck to the Imperial Bank and removed the Romanoff gold and jewels. The gold was later shipped directly to Kuhn, Loeb Co. in New York.-- Curse of Canaan

MikeWB

CSR,

Do you need the text just to be searchable and copyable, or do you want to completely replace the imaged text with fonts?

If only the first option interests you, get Acrobat (full edition) and run its OCR on your files:

http://www.designer-daily.com/how-to-us ... robat-2802

http://thepiratebay.org/torrent/5968706 ... 0_Windows_[Multilingual]___Keygen_(CORE)


For the second option, where you want just the text and wish to replace it all with fonts, ABBYY is great: http://thepiratebay.org/torrent/5851845 ... _Multilang

On a Mac or iPhone you also have the option of Prizmo: http://www.creaceed.com/prizmo/ It's by far my favorite option because you don't need a scanner to get a good copy out of something... you can use your camera or phone camera to take pics of docs and the software does the rest. I've been using it for over a year with great results. I needed some docs, so I went to the library, took pics of them, and ran them through Prizmo.
1) No link? Select some text from the story, right click and search for it.
2) Link to TiU threads. Bring traffic here.

CrackSmokeRepublican

Thanks Mike!

I'll give the links a spin. I'm basically trying to get plain-text results, ready for a searchable index, out of the scanned images. I didn't want to shell out too much for Acrobat or ABBYY if a freeware solution was available. IYKWIM... ;)
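For a freeware route to bulk plain text, one possibility is the open-source Tesseract OCR engine. Nobody in this thread mentions it, so treat it as an assumption rather than a recommendation from the posters. The sketch below is a rough Python illustration, not a tested workflow: it collects the scanned images in a folder and calls the `tesseract` command-line tool on each one, writing one .txt file per image. The function names (`find_images`, `tesseract_command`, `ocr_folder`) and the folder names are made up for the example. Tesseract works on image files, so scanned PDFs would first need to be split into page images (for instance with Poppler's `pdftoppm`).

```python
import shutil
import subprocess
from pathlib import Path

# Image types this sketch looks for (an assumption; adjust to your scans).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def find_images(folder):
    """Return the image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def tesseract_command(image, out_dir, lang="eng"):
    """Build the command line: tesseract <image> <output base> -l <lang>.

    Tesseract appends ".txt" to the output base by itself.
    """
    out_base = Path(out_dir) / Path(image).stem
    return ["tesseract", str(image), str(out_base), "-l", lang]


def ocr_folder(src, out_dir, lang="eng"):
    """OCR every image in `src` and return the text files written.

    Caveat: two images with the same stem (page1.png, page1.jpg)
    would overwrite each other's output in this simple version.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    written = []
    for image in find_images(src):
        cmd = tesseract_command(image, out_dir, lang)
        subprocess.run(cmd, check=True)  # raises if tesseract fails
        written.append(Path(cmd[2] + ".txt"))
    return written
```

Assuming Tesseract is installed and on the PATH, calling `ocr_folder("scans", "text_out")` should leave one plain-text file per image in text_out, ready to feed into whatever indexer you prefer. Accuracy will depend heavily on scan quality.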

Panoptimist

CSR, what do you mean "shell out"? My man, did you cop that username/password to Hotfile I posted in SPECTEC's "WHO IS JURI LINA" thread? Search http://www.hotfilesearch.com/ and www.filestube.com for Hotfile links containing what you need.

PEACE.
The Orthodox Nationalist [11/18/10] - Berdayev and Dostoevsky; Modernism and Materialism; The critique of the bourgeois [Must Listen]
"[W]ithin himself / The danger lies, yet lies within his power" PL, Book IX, ln. 349-356.