The way software is written nowadays, it wouldn't surprise me if unnecessary functions were performed even when they served no purpose. Anyone who's used a Microsoft product is familiar with this phenomenon. How much simpler would things be if you only created a text version when you wanted one?
It's possible that feature was on by default in whatever software was used to scan the BC, even though it wasn't needed in this case (and turning it off would have saved a lot of trouble).
Proof of incompetence by someone, isn't it?
Again, without knowing what scanner and software were used, and how, this has to be just speculation.
Somewhere in one of these threads "Bushpilot1" (I think) posted a link that purports to show what software was used to create the document. If I recall properly, he said it was some Apple application. Supposedly this data was contained in the PDF file.
The answer would be (I think) that the OCR didn't recognize the first 'R' as an R, and so left it part of the background--i.e., it didn't try to extract and "read" it--probably because it was more gray than black. In the second instance, though, it grouped it with the text rather than with the background.
I think that theory does nothing to explain why the pixels of the "R" are 4 times the size of the pixels of all the other letters. If it thought it was part of the background, it would use the default pixel size. I argue that the destination surface was of uniform size, and the software copied what it saw using 4 pixels of the destination surface for every pixel of the source image. The question is, how could it "see" pixels four times larger (and with different bit depth) on one character and not any others?
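To make that arithmetic concrete, here is a minimal Python sketch of pixel replication. It assumes a scale factor of 2 in each direction, so one source pixel covers 2 x 2 = 4 destination pixels. The function name `replicate` and the tiny sample grid are made up for illustration; nothing here is taken from the actual PDF or from whatever software produced it.

```python
def replicate(image, factor=2):
    """Nearest-neighbour upscale: each source pixel becomes a factor x factor block."""
    out = []
    for row in image:
        # Repeat every pixel 'factor' times horizontally.
        wide = [p for p in row for _ in range(factor)]
        # Repeat the widened row 'factor' times vertically.
        for _ in range(factor):
            out.append(list(wide))
    return out


# Hypothetical 2 x 2 grayscale source (0 = black, 255 = white).
source = [[0, 255],
          [255, 0]]

dest = replicate(source)
for row in dest:
    print(row)
```

The destination grid ends up 4 x 4, with each source value filling a 2 x 2 square, which is what "4 pixels of the destination for every pixel of the source" would look like.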
I see this as the "smoking gun" of a paste up. It is not the only peculiarity.
Probably just lack of awareness. We don't know where the scan was made, but it was probably at the Hawaii DOH, Obama's lawyer's office, or the WH. In all cases, I can see the IT person installing the document archiving system and turning on the OCR part, because 90% of the time, that's what they'd want and they don't want the user to have to think about it. In this case, the user didn't think about it.
Somewhere in one of these threads "Bushpilot1" (I think) posted a link that purports to show what software was used to create the document. If I recall properly, he said it was some Apple application.
It's not really an application, it's a PDF handler built into OS X--I think it's what lets any Mac application "print" a PDF file. It doesn't tell us what software was used for scanning, or even for printing the PDF. I do know (I'm a Mac user) that there's a filter for "Reduce File Size" when you're creating a PDF in at least one Apple program--I don't know exactly what it does, though.
I think that theory does nothing to explain why the pixels of the "R" are 4 times the size of the pixels of all the other letters. If it thought it was part of the background, it would use the default pixel size.
It would be because after the letters were extracted from the background, the background was "downsampled" to a lower resolution. That "expert" I pointed you to before (I'm only using quotes because it's WND that labeled her an expert--I can't vouch for her myself) wrote:
The use of OCR software and image optimization have a number of other effects on documents. Each of these issues, which can result from OCR or optimization processing, may have led to the appearance of tampering and manipulation, and accusations of forgery.
Pixel size: In any scanned image, pixels are all the same size. Pixels in the President's birth certificate, however, are not. The pixels around the optimized text are a much smaller size than the background pixels.
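A rough Python sketch of the kind of layer separation being described, under loud assumptions: the threshold of 80, the 2x downsampling factor, and the names `split_layers` and `downsample` are all hypothetical, and this is not the algorithm of any particular scanner or OCR product. Pixels darker than the threshold go into a full-resolution text mask; everything else, including a grayish letter, stays in the background, which is then averaged down to a lower resolution.

```python
WHITE = 255


def split_layers(image, threshold=80):
    """Split a grayscale image into a text mask and a background layer.

    The threshold value is an assumption chosen for this example only.
    """
    # Mask is True wherever a pixel is dark enough to count as text.
    mask = [[p < threshold for p in row] for row in image]
    # Background keeps everything else; extracted text is painted white.
    background = [[WHITE if p < threshold else p for p in row] for row in image]
    return mask, background


def downsample(image, factor=2):
    """Shrink an image by averaging each factor x factor block into one pixel."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(0, h - h % factor, factor):
        row = []
        for x in range(0, w - w % factor, factor):
            block = [image[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


# Hypothetical 4 x 4 scan: a gray stroke (120) and a black stroke (10).
scan = [
    [255, 255, 255, 255],
    [255, 120, 255,  10],
    [255, 120, 255,  10],
    [255, 255, 255, 255],
]

mask, background = split_layers(scan)
small_bg = downsample(background)

print("mask size:", len(mask), "x", len(mask[0]))
print("background size:", len(small_bg), "x", len(small_bg[0]))
```

In this toy case the black stroke lands in the 4 x 4 mask while the gray stroke stays in the background, which shrinks to 2 x 2. When a viewer stretches that background back to page size, each of its pixels covers four times the area of a mask pixel.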