If you can demonstrate how all this was done through OCR and optimization, do it. It would settle many of the issues.
I have scanned similar documents on a Mac with many different settings, including optimize for OCR.
Nothing like the problems that show up in that file occurs on *any* of my scans.
- no layers
- no issues with character fuzziness disappearing
- no partial words falling to another layer
- no differing pixel sizes
- no solid black words/letters
(you get the point)
You claim that the process done by the WH included OCR and then somehow undid it?
You do realize that a text search on the WH_LFCOLB.pdf comes up with zero results for any text. Right?
The technical specs in the file show the programs used - if you think some crazy process was followed, do us all a favor and replicate it.
(I tried with a Mac scanning and PDF conversion program - it does not even come close.)
Someone at the National Review already took care of that one:
I've confirmed that scanning an image, converting it to a PDF, optimizing that PDF, and then opening it up in Illustrator, does in fact create layers similar to what is seen in the birth certificate PDF. You can try it yourself at home.
You claim that the process done by the WH included OCR and then somehow undid it?
I don't claim anything about what happened. I claim that it's possible that if you use scanning software that creates an invisible OCR layer on a PDF, then open that PDF in software that doesn't support OCR, then save a PDF from that second piece of software, you could plausibly end up with a PDF that has been subject to an OCR routine at one point but no longer has searchable text.
I'm not even wedded to the OCR theory. I've read that straightforward PDF optimization can leave some areas pure black and others still in color, and can produce variation in pixel sizes. I posted a link to one such discussion earlier in this thread, from a woman whom WND called an "expert" when they thought she supported their case.
The technical specs in the file show the programs used - if you think some crazy process was followed, do us all a favor and replicate it.
As far as I know, they just say it was created with "Mac OS X 10.6.7 Quartz PDFContext," which isn't a program. It's part of the native OS X PDF handling software, which means the file was last touched by a Mac program, probably Preview. But that's all we know.
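For anyone who wants to look at that field themselves, here is a minimal Python sketch. It assumes the document information dictionary is stored uncompressed and that the /Producer value is a plain literal string, which is common but not guaranteed. The function name `pdf_producer` and the sample bytes are made up for illustration.

```python
import re

def pdf_producer(data: bytes):
    """Return the first literal-string /Producer value in raw PDF bytes, or None.

    Assumption: the Info dictionary is not inside a compressed object stream
    and the value is written as (literal text), not as a hex string.
    """
    # Match "/Producer (...)", allowing backslash escapes inside the parentheses.
    m = re.search(rb"/Producer\s*\(((?:\\.|[^\\()])*)\)", data)
    if m is None:
        return None
    return m.group(1).decode("latin-1")

# Hypothetical fragment of an Info dictionary, using the producer string quoted above.
sample = b"<< /Producer (Mac OS X 10.6.7 Quartz PDFContext) >>"
print(pdf_producer(sample))
```

On a real file you would pass in `open(path, "rb").read()`. Either way, the field only names the last library that wrote the file, not every tool that touched it earlier.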
OCR doesn't necessarily mean the resulting document is searchable. The software "reads" text and other elements, but what it does to them is up to the user. Making the document searchable doesn't make much sense unless the text is clear and crisp to begin with, which isn't the case here.
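A rough way to test whether a PDF carries any text layer at all is to look for text-showing operators in its content streams. The sketch below is a heuristic under stated assumptions, not a real PDF parser: it only handles streams that are stored raw or Flate-compressed, it can in principle be fooled by binary image data, and the function name `has_text_operators` is hypothetical.

```python
import re
import zlib

def has_text_operators(data: bytes) -> bool:
    """Heuristically report whether any stream contains a BT ... Tj/TJ ... ET run."""
    for m in re.finditer(rb"stream\r?\n(.*?)endstream", data, re.DOTALL):
        raw = m.group(1)
        try:
            # decompressobj tolerates the newline that precedes "endstream".
            content = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not Flate data; fall back to scanning the bytes as they are.
            content = raw
        if re.search(rb"\bBT\b.*?\b(Tj|TJ)\b.*?\bET\b", content, re.DOTALL):
            return True
    return False

# Two made-up streams: one that draws text, one that only places an image.
text_stream = b"stream\n" + zlib.compress(b"BT /F1 12 Tf (Hello) Tj ET") + b"\nendstream"
image_stream = b"stream\n" + zlib.compress(b"q 612 0 0 792 0 0 cm /Im1 Do Q") + b"\nendstream"
print(has_text_operators(text_stream), has_text_operators(image_stream))
```

A file can come back with no text operators and still have been through OCR at some stage, because the invisible text layer only survives if each later program keeps it when saving.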