Free Republic
Browse · Search
General/Chat
Topics · Post Article


OCR, MAC's and PC's

Posted on 06/16/2006 1:09:02 PM PDT by Snoopers-868th

I belong to a Vets' association and I am attempting to put all the past newsletters on CD in a format that is secure (not allowing changes) and searchable on both a Mac and a PC. Below is the process I used as a test. I am in dire need of some solutions and hope there is someone out there who has used various software to accomplish a similar result.

My dilemma:

First, I OCR scan using Microsoft Office Document Imaging. I save the .tif file and send the text to Word 2002.

Next, I run spellcheck on the newly created Word text to correct the errors that occurred during the optical read conversion.

Finally, I rescan the original pages that contain pictures. Using MGI PhotoSuite I am able to cut and paste the pictures in the appropriate place in my newly generated MS Word file.

Question 1:

Does anyone have an idea for turning the mixed data (pictures & text in Word 2002) into a secure, readable and searchable format that works on both the Mac and the PC and can be burned to CD?

I know nothing about Adobe and sure don't want to purchase a $500 program. Further, I have no experience with Adobe. On Adobe's web-site they have an on-line subscription service (monthly fee) which looks reasonable. However, for the life of me I can't figure out if the on-line service allows conversion of my mixed Word document to PDF while retaining the search ability OR if I have to use Adobe's OCR program, OR if the on-line feature even offers the search ability in the $10/month service.

https://createpdf.adobe.com/?v=AHP

Question 2:

Is anyone familiar with Adobe and knows the answer(s)? Has anyone used this service in this manner?

Question 3:

To all you MS 2003 users, is there an option in MS 2003 Word to save a PDF file? Or is there some utility or plug-in that will let me do this?

My computer is XP-Pro w/XP Office 2002 and there is no such save ability under Word. There are numerous more options in Excel. I tried saving in .rtf (as I think MAC can read it) and 13 pages was 20,000+KB's. Wayyyyy toooo big.

I saved files as PDF files when I worked but I cannot remember how and can't find anything on it. I have spent two days searching for answers and hope one of you has a solution. Different options are welcome. Thanks


TOPICS: Computers/Internet
KEYWORDS: adobe; computer; mac; pc; software
To: Snoopers-868th
Whatever is editable in Word is searchable in Word or any other Word-compatible program. You just give them the Word document and let them do their own searches.

Also, anything output as searchable PDF is searchable with the free Adobe Acrobat Reader.
21 posted on 06/16/2006 1:54:01 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 19 | View Replies]

To: Snoopers-868th
I used to scan at 600 DPI. When an OCR program couldn't convert a certain area and left a graphic version of the text there instead, it still looked good.

600 dpi with color or 8-bit gray is almost overkill, but it assures a program like Omnipage gets the best results.

However, several OCR programs can't use the gray or color scan depth to increase accuracy.
22 posted on 06/16/2006 1:56:39 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 18 | View Replies]

To: ConservativeMind

Thank you very much. That is exactly what I needed someone to tell me. Instead of Dumbest, I can now be Dumber. LOL


23 posted on 06/16/2006 1:57:56 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 20 | View Replies]

To: ConservativeMind

Do I get a bunch of junk from Yahoo with that search tool? Is it on their main page? I never go there. My home page is FR. Thanks


24 posted on 06/16/2006 2:02:07 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 20 | View Replies]

To: Snoopers-868th

The way I make PDFs from text is searchable, at least on my 'puter. Here's what I did to check, just now. On this very thread, I went up to file, print, and then changed my printer to Adobe PDF, then printed (make sure you indicate where you want it saved because if you don't, like anything, you won't be able to find it). Name it, too, unless you like the name it makes for itself. When you print to the Adobe PDF printer, you're not really printing, you're creating a PDF. Then, if it doesn't open all by itself, which mine did, go to it, open it, and see if you can search it. This is what I did just now with this thread to check it. And it worked perfectly. And it took seconds.


25 posted on 06/16/2006 2:03:05 PM PDT by Auntie Mame (Fear not tomorrow. God is already there.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Snoopers-868th

No, at this time you only get junk already on your hard drive! :-)

There are several free search programs, but the one that supports the most file formats (by far) is the Yahoo! Desktop Search.

It is available here:

http://desktop.yahoo.com


26 posted on 06/16/2006 2:04:21 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 24 | View Replies]

To: Auntie Mame

That is what I did. Thanks for the memories!! So how do I get Adobe as a printer option? Any input? It is not there now.


27 posted on 06/16/2006 2:11:06 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 25 | View Replies]

To: ConservativeMind
600 dpi with color or 8-bit gray is almost overkill, but it assures a program like Omnipage gets the best results.

It probably is overkill for the photographic images, but good for small text. That may explain why your file sizes are so large.
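
As a rough illustration of why resolution and bit depth drive file size, here is a minimal Python sketch of the arithmetic. It assumes a US Letter page (8.5 x 11 inches) and counts uncompressed pixels only; real scan files are usually compressed, so actual sizes will be smaller. The function name is made up for this example.

```python
# Rough uncompressed size of one scanned page.
# Assumption: US Letter paper (8.5 x 11 inches). Compression is ignored.

def scan_size_bytes(dpi: int, bits_per_pixel: int,
                    width_in: float = 8.5, height_in: float = 11.0) -> int:
    """Return the uncompressed size in bytes of one page scanned at `dpi`."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    for label, dpi, bpp in [
        ("600 dpi, 24-bit color", 600, 24),
        ("600 dpi, 8-bit gray", 600, 8),
        ("300 dpi, 8-bit gray", 300, 8),
        ("300 dpi, 1-bit black and white", 300, 1),
    ]:
        mb = scan_size_bytes(dpi, bpp) / 1_000_000
        print(f"{label}: {mb:.1f} MB per page")
```

Halving the resolution cuts the pixel count to a quarter, and dropping from 8-bit gray to 1-bit black and white cuts it by a further factor of eight, which is why text-only pages can be stored far more cheaply than photo pages.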

28 posted on 06/16/2006 2:15:11 PM PDT by HAL9000 (Get a Mac - The Ultimate FReeping Machine)
[ Post Reply | Private Reply | To 22 | View Replies]

To: HAL9000
It probably is overkill for the photographic images, but good for small text.

Wikipedia - Nyquist–Shannon sampling theorem

29 posted on 06/16/2006 2:20:33 PM PDT by HAL9000 (Get a Mac - The Ultimate FReeping Machine)
[ Post Reply | Private Reply | To 28 | View Replies]

To: HAL9000; ConservativeMind; ShadowAce; Auntie Mame

Thanks a million to all of you. HAL your link looked like rocket science to me. ROTFLOL


30 posted on 06/16/2006 2:30:19 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 29 | View Replies]

To: ConservativeMind

Thanks for the Yahoo search. It took my CPU usage up to 80%. I wondered what the heck is it doing--well, indexing, naturally. Thanks again.


31 posted on 06/16/2006 2:34:59 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 26 | View Replies]

To: Snoopers-868th

If you have Adobe Reader (the free version), the writer should install with the reader, at least I'm pretty sure of that. Maybe you could reinstall Adobe Reader from Adobe's website, see if that puts an Adobe PDF Writer in your printer folder. It's free!

Maybe this will help: http://www.geneseo.edu/CMS/display.php?page=2628&dpt=cit

However, I tried to do what they said but it didn't work.

As far as Adobe's website PDF maker, the one you make through their website, they let you make a few for free, and it worked incredibly well, for me. I converted WP5.1 for DOS documents on it and they were beautiful. It's pretty affordable, too. But you should be able to make these PDFs on your computer, I just don't understand why you don't have the writer already there, unless you don't have Adobe Reader installed.


32 posted on 06/16/2006 2:58:17 PM PDT by Auntie Mame (Fear not tomorrow. God is already there.)
[ Post Reply | Private Reply | To 27 | View Replies]

To: Snoopers-868th

One more thought, and I'll let you be. If worse comes to worst, go to eBay, where for $19.95 you can buy Adobe Acrobat as a download, too. It's really the industry standard and everyone should have Reader on their computer, so it's probably the best way to go. The only thing I'm not sure of is whether a search run from your computer, rather than from within Acrobat, will pick up words in Acrobat documents. If it doesn't, it means you have to open up every single Acrobat document to search it. Oh, wait, I think you can search all your Acrobat documents from inside Acrobat.

Okay, I'm finished, now that you're probably totally confused.

But, hey, let me know how everything works out! Regards....


33 posted on 06/16/2006 3:04:11 PM PDT by Auntie Mame (Fear not tomorrow. God is already there.)
[ Post Reply | Private Reply | To 27 | View Replies]

To: Auntie Mame

I do have the reader 7.0 but the writer I think they sell. I don't know because I really dislike Adobe. It is so slow. I might do a restore point and update Adobe and see if I can find something. Maybe I just have to add it somehow--like a printer, but I don't know and wouldn't know where to go if I get the search for file screen. Can you tell me if the file extension on a PDF file is PDF? I can do a save to a file rather than print but it is a PRN file extension. Don't know what that is. Guess, I will do a google search and see. Thanks for the input and you are not bothering me.


34 posted on 06/16/2006 3:11:32 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 33 | View Replies]

To: Snoopers-868th
your link looked like rocket science to me

To put it simply, try to scan your document with enough resolution to distinctly separate each letter on the page. Look at the scanned image and make sure a visible gap exists between each letter - except for letters printed with a ligature, i.e., without a gap, which typically occur in combinations like "rt". About 200-to-300 dots per inch (DPI) of resolution is usually sufficient to get accurate OCR output for most documents.
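
To get a feel for what those DPI figures mean for individual letters, here is a small Python sketch. It relies on the fact that one typographic point is 1/72 inch; the 10-point type size is only an assumed example, and the result is the nominal font height, not the exact height of any particular letter.

```python
# How many pixels a line of type gets at a given scan resolution.
# Fact: 1 typographic point = 1/72 inch.
# Assumption: 10-point body text, chosen only as an example.

POINTS_PER_INCH = 72


def em_height_pixels(point_size: float, dpi: int) -> float:
    """Nominal height in pixels of type of `point_size` scanned at `dpi`."""
    return point_size / POINTS_PER_INCH * dpi


if __name__ == "__main__":
    for dpi in (150, 200, 300, 600):
        px = em_height_pixels(10, dpi)
        print(f"{dpi} dpi: 10 pt type is about {px:.0f} pixels tall")
```

The fewer pixels each letter gets, the more likely the gaps between letters blur together, which is the practical point behind checking for a visible gap in the scanned image.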

35 posted on 06/16/2006 3:18:50 PM PDT by HAL9000 (Get a Mac - The Ultimate FReeping Machine)
[ Post Reply | Private Reply | To 30 | View Replies]

To: HAL9000
ligature

You proved my point exactly. You know rocket science! I learned something new today. Thanks for the pointers; they are all helpful. Do you know what Auntie Mame means by having Adobe available on the printer menu? Do you know what I need to do to make that happen?

36 posted on 06/16/2006 4:12:58 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 35 | View Replies]

To: Auntie Mame
Adobe does not give away the PDF Distiller (the "writer" of which you speak). It only gives away the Reader.

If one purchases Acrobat Standard or Acrobat Professional, it comes with Acrobat Distiller. However, these two products let you edit PDFs.
37 posted on 06/16/2006 4:25:47 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 32 | View Replies]

To: Auntie Mame

If Acrobat Standard or Professional is available for $19.95 as a download, you are very likely getting a pirated version.


38 posted on 06/16/2006 4:28:16 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 33 | View Replies]

To: Snoopers-868th

It will take a few hours to index your system.

Make sure you have enough RAM to do what you are trying to do, as well. Hopefully you have at least 512 MB, and ideally more. It is cheap for what it will do, so this might be a good time to buy.


39 posted on 06/16/2006 4:31:48 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 31 | View Replies]

To: Snoopers-868th
What you want is a program that will do the following:

1. Scan the image.

2. OCR the image

3. Allow you to save the text as a PDF

In my testing, I have found that ScanSoft OmniPage has the needed features. Once you scan and OCR, you can save the PDF in a format they call something like "text with overlay".

This places the actual scanned image, directly overlaid on top of the text as it was OCR'ed. Thus, if you full-text index a group of PDFs, or if you open a particular PDF and search for text, you will be given the text, but, it will look like the original document.

Any marks or graphics that the OCR engine does not recognize can still be seen but not searched (they are saved in the graphics overlay, which is the part you look at).
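
The idea of a visible image layer sitting on top of a hidden, searchable text layer can be sketched as a toy model in Python. This is not how OmniPage or the PDF format actually stores anything; the `Page` class, the `search` function, and the file names are all hypothetical and exist only to show why a search hits the OCR text while the reader sees the scan.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    image_file: str  # the scanned picture the reader actually sees
    ocr_text: str    # hidden text layer from OCR, used only for searching


def search(pages: List[Page], term: str) -> List[int]:
    """Return 1-based page numbers whose hidden text contains `term`."""
    needle = term.lower()
    return [n for n, page in enumerate(pages, start=1)
            if needle in page.ocr_text.lower()]


# Hypothetical three-page newsletter.
newsletter = [
    Page("page1.tif", "Reunion scheduled for October"),
    Page("page2.tif", ""),  # photo page: nothing recognized, nothing to search
    Page("page3.tif", "Minutes of the October meeting"),
]

if __name__ == "__main__":
    print(search(newsletter, "october"))  # prints [1, 3]
```

The photo page still displays normally because its image is kept, but it never shows up in search results because its text layer is empty.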

I hope I explained this well enough ...

40 posted on 06/16/2006 5:00:08 PM PDT by ikka
[ Post Reply | Private Reply | To 1 | View Replies]



Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.


FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson