Free Republic
Browse · Search
Bloggers & Personal
Topics · Post Article

To: WhiskeyX

Thanks for the further explanation. It is news to me that the OCR part of a scan-to-PDF process generates a font specification, even a Multiple Masters one. I don’t completely understand why, since as I understand it the text layer will never be printed—I would have thought it would be simpler and more compact to just use the ASCII designations. But maybe it’s more compact or efficient to specify the characters as part of a font, I dunno.

It does, however, seem plausible to me that the font info got stripped out when the file was opened in Preview. It’s clear the PDF that was posted was generated from Preview, for whatever reason—the fact that there’s no font info in that file when opened in Acrobat doesn’t prove there was no font info in whatever file was opened in Preview.


127 posted on 08/03/2011 10:19:56 AM PDT by Ha Ha Thats Very Logical
[ Post Reply | Private Reply | To 123 | View Replies ]


To: Ha Ha Thats Very Logical
OCR software can scan and output to a wide variety of formats, ranging from different forms of text only to different forms of searchable text with full graphics, and from plain text to XML. The Adobe Portable Document Format (PDF) is among the possible choices, and PDF output itself comes in several variants.

Adobe® Acrobat PDF Searchable Image (Exact) (formerly known as PDF Original Image with Hidden Text) embeds fonts.

I'm not familiar with Preview and what it can and cannot do, but I would be surprised to learn that it can strip embedded fonts from a PDF file. These kinds of PDF files are notorious for making it extremely difficult to extract the fonts from the file. Anyone familiar with Preview?

131 posted on 08/03/2011 12:01:03 PM PDT by WhiskeyX
[ Post Reply | Private Reply | To 127 | View Replies ]

FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson