Free Republic
Browse · Search
News/Activism
Topics · Post Article

To: Trityn

“Your assumption about how OCR software works is wrong.”
No, it’s not. Take a look at the ocropus source code [1], for example. It does separate the document into “blocks” and the remaining “background”. Each of the blocks is then OCRed.
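
As a rough illustration of that block/background split, here is a toy sketch. It is not ocropus’s actual algorithm; `find_blocks` and `PAGE` are made-up names, and the “page” is just a grid where `#` stands for ink. It groups connected ink pixels into blocks and reports a bounding box for each; everything outside the boxes would be the “background”.

```python
from collections import deque

def find_blocks(page):
    """Return bounding boxes (top, left, bottom, right) of connected ink regions."""
    rows, cols = len(page), len(page[0])
    seen = [[False] * cols for _ in range(rows)]
    blocks = []
    for r in range(rows):
        for c in range(cols):
            if page[r][c] != "#" or seen[r][c]:
                continue
            # Flood-fill one connected region of ink pixels (4-neighbourhood).
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, left = min(top, y), min(left, x)
                bottom, right = max(bottom, y), max(right, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and page[ny][nx] == "#" and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blocks.append((top, left, bottom, right))
    return blocks

# Hypothetical 4x8 "scan": two separate ink regions on an empty background.
PAGE = [
    "........",
    ".##..##.",
    ".##..##.",
    "........",
]

print(find_blocks(PAGE))  # [(1, 1, 2, 2), (1, 5, 2, 6)]
```

Each box would then be handed to the recognizer on its own. A real engine does far more (line finding, deskewing, layout analysis), so treat this only as a picture of the idea.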

“The image is always maintained.”
It’s not. Usually the OCR engine decides which elements it will store as “image” and which it will store as “text”. It generally tries to store as much as possible as “text” to make the document indexable/searchable.

“It does not change because of the OCR, error or no error. The OCR generates a text layer that may have an error, but that error would only show up when you copied and pasted the text elsewhere.”

No, it doesn’t. The “text layer”, as you described it, is shown, as you can easily see. The PDF file format doesn’t allow text to be “hidden”. It can only get covered by another layer, get replaced or get removed. That’s why OCR-composed PDFs do not contain everything as an image. See the PDF spec at [2][3]. You need a basic understanding of what the PDF file format describes and how it gets rendered to understand this.

“Viewing the PDF with a generic viewer would show the original image. So the TXE is in the original image.”

No. An ordinary PDF viewer would do what it’s supposed to do: put the “background layer” into the background and the “text layer” on top of it.
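
A toy model of that drawing order, assuming a simple “paint in order, later covers earlier” scheme (the `render` function and the two layer dictionaries are invented for illustration and are not a real PDF API):

```python
def render(layers, width, height):
    """Paint layers in order onto a character canvas; later layers cover earlier ones."""
    canvas = [[" "] * width for _ in range(height)]
    for layer in layers:  # the first layer ends up at the bottom
        for (y, x), ch in layer.items():
            canvas[y][x] = ch
    return ["".join(row) for row in canvas]

# Hypothetical one-line page: a background strip with three text glyphs on top.
background_layer = {(0, x): "~" for x in range(5)}
text_layer = {(0, 1): "T", (0, 2): "X", (0, 3): "E"}

print(render([background_layer, text_layer], width=5, height=1))  # ['~TXE~']
```

Swapping the order of the two layers would let the background cover the text instead, which is the “covered by another layer” case.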

“Not saying this makes it fake or otherwise but it certainly does raise flags.”

No, it doesn’t. Not that I think Obozo is a good president (I think he’s even worse than Carter), but all these birthers and their paranoid conspiracy plots make all of us conservatives look like total loons. Even if they had witnessed his birth and seen it with their own eyes, they would still claim that they didn’t know where they were at that moment.

[1] http://code.google.com/p/ocropus/
[2] http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/adobe_supplement_iso32000.pdf
[3] http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/adobe_supplement_iso32000_1.pdf


107 posted on 04/27/2011 10:18:56 PM PDT by buzzer
[ Post Reply | Private Reply | To 98 | View Replies ]


To: buzzer

I know that a lot of time has passed, but I wanted to follow up on what you posted. I reviewed your links.

The ocropus link you posted isn’t about Adobe, but I did review it. I did not see anything in it indicating that the original image would be modified.

I read through the Adobe supplement documents you posted, and the ONLY mentions of images are barcodes, forms, and geospatial-specific images.

That said, I partially agree with you. Modern-day Adobe OCR converters CAN and DO replace image text, as well as some other very special image types (such as those mentioned in the supplements), with Adobe OBJECTs or FONTs.

However, it will be abundantly clear when this occurs. If the original image has been replaced, the objects or fonts will stay sharp as you scale the document. That is not the case here. Scaling the area in question causes it to blur; therefore it has NOT been replaced.
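
To show the kind of difference I mean, here is a small sketch under simplified assumptions (the shape, the function names, and the nearest-neighbour upscaling are all made up for illustration; a real viewer usually interpolates, which looks blurry rather than blocky). A shape stored as a formula can be re-drawn at any size, while a stored bitmap can only have its existing pixels stretched.

```python
def render_vector(n):
    """Rasterize a diagonal edge from its formula at n x n resolution."""
    return [
        "".join("#" if (x + 0.5) + (y + 0.5) < n else "." for x in range(n))
        for y in range(n)
    ]

def upscale_raster(bitmap, factor):
    """Enlarge a stored bitmap by repeating each pixel (nearest neighbour)."""
    out = []
    for row in bitmap:
        stretched = "".join(ch * factor for ch in row)
        out.extend([stretched] * factor)
    return out

stored_scan = render_vector(4)              # what a 4x4 scanned image would hold
zoomed_scan = upscale_raster(stored_scan, 2)  # zooming the stored pixels to 8x8
zoomed_vector = render_vector(8)            # re-drawing the shape itself at 8x8

print("\n".join(zoomed_scan))
print()
print("\n".join(zoomed_vector))
```

The zoomed scan keeps its coarse two-pixel steps, while the re-drawn shape gets a finer one-pixel edge. That is the visual tell I am describing.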

In any case the original image can and will be preserved in any historical or official document, unless there is something to hide.

I think we both can agree that Obozo is the worst pResident ever.


110 posted on 08/05/2011 2:01:28 PM PDT by Trityn (FUBO and the Soros you rode in on.)
[ Post Reply | Private Reply | To 107 | View Replies ]



FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson