Thanks. Like you, I'll meet civility with civility.
I cannot comprehend how something which is designed to recognize characters (a useless function when making a copy of something)
Not so useless if you were making an electronic copy of something you later wanted to search for by text within it. Scanners for document archiving can add an invisible text layer to a PDF document for that purpose. It's possible that feature was on by default in whatever software was used to scan the BC, even though in this case it wouldn't be needed (and would have saved a lot of trouble if it had been turned off). Again, without knowing what scanner and software were used, and how, this has to be just speculation.
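To make the purpose of that invisible layer concrete, here is a toy sketch (the data structure and names are hypothetical, not any real PDF API): the viewer renders only the scanned image, while search runs against the OCR text stored alongside it.

```python
from dataclasses import dataclass

@dataclass
class ScannedPage:
    image: bytes          # what the viewer actually renders
    text_layer: str = ""  # invisible OCR text, used only for searching

def search(pages, query):
    """Return the 1-based page numbers whose hidden text layer matches."""
    return [i for i, p in enumerate(pages, start=1)
            if query.lower() in p.text_layer.lower()]

pages = [
    ScannedPage(image=b"<raster>", text_layer="CERTIFICATE OF LIVE BIRTH"),
    ScannedPage(image=b"<raster>", text_layer=""),  # OCR layer turned off
]

print(search(pages, "certificate"))  # [1]
```

If the OCR layer is switched off (the second page above), the copy looks identical on screen but is no longer findable by text, which is the trade-off being discussed.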
Either the OCR didn't recognize it as the same character, or it didn't behave consistently. If there *IS* a theory of software that explains this, I would like to hear it.
The answer would be (I think) that the OCR didn't recognize the first 'R' as an R, and so left it part of the background--i.e., it didn't try to extract and "read" it--probably because it was more gray than black. In the second instance, though, it grouped it with the text rather than with the background.
OCR or not, we have two type sets for "R". That is a miracle. Maybe "O" is supernatural.

The way software is written nowadays, it wouldn't surprise me if functions ran by default even when they served no purpose. Anyone who's used a Microsoft product is familiar with this phenomenon. How much simpler would things be if you only created a text version when you wanted one?
It's possible that feature was on by default in whatever software was used to scan the BC, even though in this case it wouldn't be needed (and would have saved a lot of trouble if it had been turned off).
Proof of incompetence by someone, is it not?
Again, without knowing what scanner and software were used, and how, this has to be just speculation.
Somewhere in one of these threads "Bushpilot1" (I think) posted a link that purports to show what software was used to create the document. If I recall properly, he said it was some Apple application. Supposedly this data was contained in the PDF file.
The answer would be (I think) that the OCR didn't recognize the first 'R' as an R, and so left it part of the background--i.e., it didn't try to extract and "read" it--probably because it was more gray than black. In the second instance, though, it grouped it with the text rather than with the background.
I think that theory does nothing to explain why the pixels of the "R" are four times the size of the pixels of all the other letters. If the software thought it was part of the background, it would use the default pixel size. I argue that the destination surface was of uniform resolution, and the software copied what it saw using four pixels of the destination surface for every pixel of the source image. The question is, how could it "see" pixels four times larger (and with a different bit depth) on one character and not on any of the others?
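One mundane way to get letter pixels four times the area of their neighbors is layered (mixed-raster) compression: if a layer is stored at half resolution in each direction and later scaled back up, every stored pixel becomes a 2x2 block in the output. A sketch of that round trip, using nested lists as a toy "image" (the 2:1 ratio is an assumption for illustration, not a claim about the actual software used):

```python
def downsample_2x(img):
    """Keep every other pixel in each direction (half resolution)."""
    return [row[::2] for row in img[::2]]

def upsample_2x(img):
    """Nearest-neighbor upscale: each stored pixel becomes a 2x2 block."""
    out = []
    for row in img:
        wide = [p for p in row for _ in (0, 1)]  # duplicate each pixel horizontally
        out.append(wide)
        out.append(list(wide))                   # duplicate the row vertically
    return out

src = [[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]]

# Round trip: fine detail is lost and every surviving value fills a 2x2 block.
print(upsample_2x(downsample_2x(src)))
# [[1, 1, 3, 3], [1, 1, 3, 3], [9, 9, 11, 11], [9, 9, 11, 11]]
```

If one character ended up in a half-resolution layer and the rest did not, it would come back looking exactly like this: the same glyph, rebuilt from pixels four times the area of its neighbors. Whether that points to a compression artifact or a paste-up is the question being argued here.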
I see this as the "smoking gun" of a paste up. It is not the only peculiarity.