Free Republic
Browse · Search
News/Activism
Topics · Post Article

To: JustaCowgirl

I tried to search for specific words in the files from the CBS site. I tried it both with the Acrobat reader acting as a browser plug-in and with the reader as a stand-alone application on the actual file, which I downloaded to my computer. The search did not work in either case, which suggests the documents are just image files embedded in the pdf file.

My understanding is that a scanned document can be converted to pdf format in a process which uses optical character recognition (OCR) software to convert the word images to digital text, which the reader then renders into a visible (and searchable) display. Any images or illustrations in the original document are inserted into the pdf file so they show up in the correct places. Every document converted in this manner must be proofread to correct the mistakes that the process introduces.

Since the search function does not work, I conclude that this conversion process was not done, so what you see in the file is just an image of the original and is an accurate representation of it.
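
For anyone who wants to repeat the check without relying on the reader's search box, here is a rough sketch in Python. It only scans the raw bytes of the file for font and image markers, on the assumption that a page with no font resources carries no searchable text. The function name classify_pdf and its return labels are my own, not part of any tool, and the check can be fooled by files that store their object dictionaries in compressed streams, so treat the answer as a hint rather than proof.

```python
import re

# Markers that normally appear in uncompressed PDF object dictionaries.
# Assumption: a file that draws real text declares at least one font.
FONT_MARKER = re.compile(rb"/Font")
IMAGE_MARKER = re.compile(rb"/Subtype\s*/Image")


def classify_pdf(data: bytes) -> str:
    """Rough guess at whether a PDF holds text or only page images.

    Returns "may-have-text" if any font marker is found,
    "image-only" if images are present but no fonts,
    and "unknown" if neither marker shows up in the raw bytes.
    """
    has_font = FONT_MARKER.search(data) is not None
    has_image = IMAGE_MARKER.search(data) is not None
    if has_font:
        return "may-have-text"
    if has_image:
        return "image-only"
    return "unknown"


# Usage (the file name here is hypothetical):
#     with open("downloaded_memo.pdf", "rb") as f:
#         print(classify_pdf(f.read()))
```

A file that has been through OCR usually keeps the page image and adds a text layer on top of it, so it would come back as "may-have-text" even though it still looks like a scan.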


676 posted on 09/09/2004 4:49:52 PM PDT by Fresh Wind (Gen. G.S. Patton: There is no soap ever invented that can wash that blood off (Kerry's) hands.)
[ To 620 ]


To: Fresh Wind

I think you're right, Fresh Wind. And your explanation of how an image of a file becomes searchable for text was more articulate than mine. It's very likely that OCR was not done on the documents, and I'm pretty sure that process would not change the appearance of the document anyway. It's just a possibility I want to see ruled out, to leave no wiggle room.


1,017 posted on 09/10/2004 5:06:37 AM PDT by JustaCowgirl
[ To 676 ]



FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson