Replies

I tried searching for specific words in the files from the CBS site, both with Acrobat Reader acting as a browser plug-in and with the stand-alone application on a copy of the file I downloaded to my computer. The search did not work in either case, which indicates the documents are just page images embedded in the PDF file.

My understanding is that a scanned document can be converted to PDF format in a process that uses optical character recognition (OCR) software to convert the images of words into digital text, which the reader then renders in a visible (and searchable) display format. Any images or illustrations in the original document are inserted into the PDF file so they show up in the correct places. Every document converted in this manner must be proofread to correct the mistakes that the process introduces.

Since the search function does not work, I conclude that this conversion process was not done, so what you see in the file is just an image of the original and should therefore be an accurate representation of it.
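For anyone who wants to repeat the check without relying on the reader's search box, here is a rough sketch in Python. It is only a heuristic under my own assumptions, not how Acrobat decides anything: it walks the streams in the file, skips the ones marked as images, inflates the Flate-compressed ones, and looks for the PDF text-showing operators (Tj and TJ). The function name has_text_layer is my own invention, and streams packed with other filters (LZW, ASCII85) are only scanned raw, so a "no" from this is a strong hint rather than proof.

```python
import re
import zlib

# "stream" keyword that opens a stream (not the tail of "endstream").
STREAM_START = re.compile(rb"(?<!end)stream\r?\n")
# A string operand, (...) or <...> or [...], directly followed by a
# text-showing operator.
SHOW_TEXT = re.compile(rb"[)>\]]\s*(?:Tj|TJ)\b")
IMAGE_DICT = re.compile(rb"/Subtype\s*/Image")


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any non-image stream appears to draw text."""
    for match in STREAM_START.finditer(pdf_bytes):
        end = pdf_bytes.find(b"endstream", match.end())
        if end == -1:
            continue  # truncated stream, nothing to inspect
        # The stream dictionary sits between the object's "obj" keyword
        # and the "stream" keyword.
        header_start = pdf_bytes.rfind(b"obj", 0, match.start())
        header = pdf_bytes[max(header_start, 0):match.start()]
        if IMAGE_DICT.search(header):
            continue  # pixel data, not page content
        body = pdf_bytes[match.end():end]
        if b"/FlateDecode" in header:
            try:
                # decompressobj tolerates the newline before "endstream"
                body = zlib.decompressobj().decompress(body)
            except zlib.error:
                continue  # not plain Flate data after all
        if SHOW_TEXT.search(body):
            return True
    return False
```

To try it on a downloaded copy, read the file in binary mode and pass the bytes in, for example has_text_layer(open("memo.pdf", "rb").read()), where memo.pdf is just a placeholder name. A result of False is consistent with a scan that was never run through OCR.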

I think you're right, Fresh Wind. And your explanation of how an image of a document becomes searchable text was more articulate than mine. It's very likely that OCR was not done on the documents, and I'm pretty sure that process would not change the appearance of the document anyway. It's just a rule-out I want to see made, to leave no wiggle room.