Free Republic
Browse · Search
News/Activism
Topics · Post Article

Student uncovers US military secrets
The Register ^ | Thursday 13th May 2004 | Lucy Sherriff

Posted on 05/16/2004 12:05:42 PM PDT by E. Pluribus Unum

An Irish graduate student has uncovered words blacked out of declassified US military documents using nothing more than a dictionary and text analysis software.

Claire Whelan, a computer science student at Dublin City University, was given the problems by her PhD supervisor as a diversion. David Naccache, a cryptographer with Gemplus, challenged her to discover the words missing from two documents: one was a memo to George Bush, and another concerned military modifications to civilian helicopters.

The process is quite straightforward, and according to Naccache, Whelan's success proves that merely blotting words out of declassified documents will not keep the contents secret.

The first task is to identify the font and font size in which the missing word was written. Once that is done, the dictionary search begins for words that fit the space, plus or minus three pixels, Naccache explained.
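
The width-matching step can be sketched roughly as follows. This is a minimal illustration, not the researchers' actual software: the per-character widths are made-up placeholders (a real attempt would measure them from the identified font and size), and `char_width`, `word_width` and `fit_candidates` are hypothetical names.

```python
# Placeholder glyph widths in pixels. These numbers are assumptions for
# illustration only; they do not come from any real font.
NARROW = set("fijlt")
WIDE = set("mw")


def char_width(ch: str) -> int:
    """Return the assumed advance width of one lowercase character."""
    if ch in NARROW:
        return 5
    if ch in WIDE:
        return 15
    return 10


def word_width(word: str) -> int:
    """Sum the assumed widths of every character in the word."""
    return sum(char_width(ch) for ch in word.lower())


def fit_candidates(dictionary, gap_px: int, tolerance_px: int = 3):
    """Keep dictionary words whose rendered width is within the tolerance."""
    return [w for w in dictionary if abs(word_width(w) - gap_px) <= tolerance_px]


words = ["Egyptian", "Ugandan", "unofficial", "cat"]
print(fit_candidates(words, gap_px=70))
```

With these placeholder widths, a 70-pixel gap keeps "Egyptian" and "Ugandan" and drops the other two. The later stages described in the article (checking that the word makes sense in English, then human judgment) would narrow the list further.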

This process yielded 1,530 possibilities for the word blanked out of a sentence in the Bush memo. Then, the text analysis routine checks for words that would make sense in English. The sentence was: "An Egyptian Islamic Jihad (EIJ) operative told an XXXXXXXX service at the same time that Bin Ladin was planning to exploit the operative's access to the US to mount a terrorist strike." Just 346 words remained on the list at this stage.

The next stage is to involve the brain of the researcher. This eliminated all but seven words: Ugandan, Ukrainian, Egyptian, uninvited, incursive, indebted and unofficial. Naccache plumped for Egyptian, in this case.

Whelan subjected the helicopter memo to the same scrutiny, and the results suggested South Korea was the most likely anonymous supplier of helicopter knowledge to Iraq.

Although the technique is no good for tackling larger sections of text, it does show that officials need to be more careful with their sensitive documents. Naccache argues that the most important conclusion of this work "is that censoring text by blotting out words and re-scanning is not a secure practice".

According to the original report in Nature, intelligence experts may consider changing procedures. ®


TOPICS: Business/Economy; Government; News/Current Events
KEYWORDS: classified; cryptography; privacy; secret; security

1 posted on 05/16/2004 12:05:42 PM PDT by E. Pluribus Unum
[ Post Reply | Private Reply | View Replies]

To: E. Pluribus Unum

And just how do they know the result is correct?


2 posted on 05/16/2004 12:17:07 PM PDT by Grig
[ Post Reply | Private Reply | To 1 | View Replies]

To: E. Pluribus Unum
Process documents for random character spacing between a certain range. Also, use proportional fonts only.

Too simple?

3 posted on 05/16/2004 12:17:19 PM PDT by atomicpossum (Hey, I wouldn't touch Camryn Manheim's uterus on a bet.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: E. Pluribus Unum

Interesting method. Wonder how many documents are floating around out there that have just a few words blacked-out but the full-text versions are still classified? Must be zillions of 'em.


4 posted on 05/16/2004 12:17:35 PM PDT by LibWhacker
[ Post Reply | Private Reply | To 1 | View Replies]

To: E. Pluribus Unum

just a computerized version of a known method


5 posted on 05/16/2004 12:18:19 PM PDT by steplock (http://www.gohotsprings.com)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Grig

You don't know 100%.

But it is likely in the example to be right by at least 99.9%...

Only a limited number of words will fit the space exactly. From that only a few words will make sense relative to the context. In the article's example only one word remaining fits the context.

Chances of being correct: very high.


6 posted on 05/16/2004 12:21:24 PM PDT by DB (©)
[ Post Reply | Private Reply | To 2 | View Replies]

To: Grig

Seems like an overly fancy way of guessing to me.


7 posted on 05/16/2004 12:22:01 PM PDT by aft_lizard (I actually Voted for John Kerry before I voted against Him)
[ Post Reply | Private Reply | To 2 | View Replies]

To: atomicpossum

Retype the document with a fixed number of "X"s for all redacted words.
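
A rough sketch of that idea, assuming the document is available as plain text and the terms to hide are known in advance. The function name `redact_fixed` and the eight-character mask are illustrative choices, not anything prescribed in the thread.

```python
import re


def redact_fixed(text: str, secrets, mask: str = "X" * 8) -> str:
    """Replace every secret term with the same fixed-length mask.

    Because the mask length never varies, the retyped document no longer
    reveals how wide the original word or phrase was.
    """
    if not secrets:
        return text
    # Try longer terms first so a phrase is masked before any word inside it.
    ordered = sorted(secrets, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in ordered))
    return pattern.sub(mask, text)


print(redact_fixed("told an Egyptian service", ["Egyptian"]))
print(redact_fixed("South Korea and Iraq", ["Iraq", "South Korea"]))
```

The context around the mask is still visible, so this only removes the width clue, not the grammatical one.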


8 posted on 05/16/2004 12:23:03 PM PDT by DB (©)
[ Post Reply | Private Reply | To 3 | View Replies]

To: E. Pluribus Unum

just an enhancement of a method, used for years, to make an educated guess about a censored true-copy typed document.

works fine, so long as acronymics are not a factor.


9 posted on 05/16/2004 12:30:39 PM PDT by King Prout (the difference between "trained intellect" and "indoctrinated intellectual" is an Abyssal gulf)
[ Post Reply | Private Reply | To 1 | View Replies]

To: DB

smart.


10 posted on 05/16/2004 12:31:52 PM PDT by King Prout (the difference between "trained intellect" and "indoctrinated intellectual" is an Abyssal gulf)
[ Post Reply | Private Reply | To 8 | View Replies]

To: atomicpossum
Also, use proportional fonts only.

Don't you mean mono-spaced fonts?

11 posted on 05/16/2004 12:35:09 PM PDT by Paleo Conservative (Do not remove this tag under penalty of law.)
[ Post Reply | Private Reply | To 3 | View Replies]

To: E. Pluribus Unum

Does anyone have copies of redacted Clinton files on which to practice?


12 posted on 05/16/2004 12:42:20 PM PDT by JohnBovenmyer (I)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Paleo Conservative
Also, use proportional fonts only.
Don't you mean mono-spaced fonts?

Perhaps. If a word is x-length with a mono-spaced font (that is, where each character is the same width), you know it's x-letters. With a proportionally spaced one, a word x pixels wide could have a varying number of characters, but the word might still be deduced by trying different combinations of letters that add up to that width.

But if you vary the space between letters randomly (say +/-10 pixels?), that should make it impossible to begin to deduce what the individual characters are. A patch to word processing software would probably do it...
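
To make the comparison concrete, here is a small sketch under invented assumptions: a 10-pixel monospace cell, placeholder proportional widths, and jitter applied to each gap between letters. All names and numbers are hypothetical. It only shows how jitter widens the range of widths a word could occupy; it does not prove the scheme is secure.

```python
MONO_CELL_PX = 10  # assumed width of every character in a monospace font

NARROW = set("fijlt")  # placeholder narrow glyphs, 5 px each
WIDE = set("mw")       # placeholder wide glyphs, 15 px each


def mono_letter_count(gap_px: int, cell_px: int = MONO_CELL_PX) -> int:
    """In a monospace font the gap width gives the letter count directly."""
    return round(gap_px / cell_px)


def proportional_width(word: str) -> int:
    """Width of the word using the placeholder proportional widths."""
    return sum(5 if c in NARROW else 15 if c in WIDE else 10 for c in word.lower())


def width_bounds(word: str, jitter_px: int):
    """Smallest and largest width if each letter gap shifts by up to jitter_px."""
    slack = jitter_px * max(len(word) - 1, 0)
    base = proportional_width(word)
    return base - slack, base + slack


def still_plausible(word: str, gap_px: int, jitter_px: int) -> bool:
    """True if the observed gap falls inside the word's possible width range."""
    low, high = width_bounds(word, jitter_px)
    return low <= gap_px <= high


print(mono_letter_count(80))
print(still_plausible("unofficial", 70, jitter_px=0))
print(still_plausible("unofficial", 70, jitter_px=10))
```

Without jitter, "unofficial" (75 pixels under these assumed widths) is ruled out for a 70-pixel gap. With 10 pixels of jitter per letter gap it can no longer be excluded, so the width test alone removes far fewer candidates.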

13 posted on 05/16/2004 12:44:47 PM PDT by atomicpossum (Hey, I wouldn't touch Camryn Manheim's uterus on a bet.)
[ Post Reply | Private Reply | To 11 | View Replies]

To: E. Pluribus Unum

South Korea
North Korea


They look the same length to me - must be a context thing. Maybe it's like interpreting chads.


14 posted on 05/16/2004 1:26:02 PM PDT by Tennessee_Bob (in time...like tears in the rain...)
[ Post Reply | Private Reply | To 1 | View Replies]

To: JohnBovenmyer
Yes. Hillary has them.

But you know what happens when you get between the Clintons and their secret files, don't you?


You don't really want to be in possession of anything that might incriminate the Clintons.
15 posted on 05/16/2004 1:29:52 PM PDT by Bon mots
[ Post Reply | Private Reply | To 12 | View Replies]

To: Thud

ping


16 posted on 05/16/2004 1:47:31 PM PDT by Dark Wing
[ Post Reply | Private Reply | To 1 | View Replies]

Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.

FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson