Free Republic
News/Activism

Google bringing search to historical manuscripts
pcadvisor.co.uk ^ | February 11, 2006 | Nancy Gohring

Posted on 02/18/2006 9:38:16 AM PST by JerseyHighlander

Using shape-matching technology

Nancy Gohring

History buffs can search George Washington's manuscripts online today for terms such as 'revolution', but only thanks to the tireless workers who transcribed the hand-written documents into digital form.

Soon, many other hand-written historical documents could be made available for the public to search - and through considerably less effort - if a research project funded by Google and being executed by three universities works out as planned.

The project, announced by DCU (Dublin City University) yesterday, started on a whim. DCU professor Alan Smeaton has been working on technology that can recognise objects that appear in videos. His technology can detect an object, such as a car or an airplane, in the frame of a video, then extract the image to compare it to a database of images to identify it or enable it to be searched.

Smeaton and his colleagues decided to find out if their shape-matching technology could be used to identify words, so they tried it out on the archive of former US President George Washington, which consists of 304,000 digital images and is available on the Library of Congress website. It worked well, Smeaton said.

Smeaton decided to use George Washington's archive because it includes hand-written documents that have already been transcribed. That meant that he could compare the results from his technology with the results from the current search system.

He had been talking to people he knows who work at Google in Dublin about the video-matching technology, and happened to mention the George Washington manuscript trial. "They were interested so we did some more experiments and showed them the results and they decided to fund a project," he said.

Smeaton wouldn't say how much funding Google has committed but said it will cover a year's worth of work by three or four researchers at DCU, as well as the same number of researchers each at the University at Buffalo and the University of Massachusetts at Amherst.

The goal of the project is to demonstrate that the technique is workable and scalable, Smeaton said. If so, Google can decide to employ the technology. The researchers are not locked into making the technology available only to Google, however, Smeaton said. They plan to publish their findings as scientific research.

Ironically, it's easier to apply the technology to some manuscripts that are much older than Washington's. DCU is also involved in a project with the Dublin Institute for Advanced Studies, which is digitising manuscripts written in Irish, the oldest of which dates back to the twelfth century. Those documents, beautifully and ornately designed by monks, are actually much easier to develop a search mechanism for, Smeaton said. "The monks were laboriously toiling over this and using great consistency across entire manuscripts," he said. "George Washington wouldn't be."

Google has also been at work scanning books from large libraries in an effort to make the contents searchable. The project, Google Book Search, has come under fire from some authors who are unhappy that Google is including books still protected by copyright without expressly gaining permission from the authors. Using the new shape-matching technology to make hand-written manuscripts searchable is unlikely to meet with similar criticism, since the documents are historical and wouldn’t be protected by copyright.



TOPICS: Culture/Society; Philosophy
KEYWORDS: archive; databases; ggg; godsgravesglyphs; google; loc
Very interesting info, didn't see it posted, thought the history buffs on FR would like to take a look.
1 posted on 02/18/2006 9:38:17 AM PST by JerseyHighlander

To: SunkenCiv; blam

Thought this might be good for GGG.


2 posted on 02/18/2006 9:38:43 AM PST by JerseyHighlander

To: JerseyHighlander

I wonder if they will exclude the Federalist Papers and other documents that don't match their liberal/socialist agenda?


3 posted on 02/18/2006 9:41:48 AM PST by FreeAtlanta (Join FR Team 36120 at http://folding.stanford.edu {Protein Folding Project})

To: JerseyHighlander

I'm waiting for Sandy Berger to open up his web site of interesting documents.


Doogle


4 posted on 02/18/2006 9:43:07 AM PST by Doogle (USAF...8thAF...4077th TFW...408th MMS...Ubon Thailand..."69"..Night Line Delivery,AMMO)

To: JerseyHighlander
Thanks, although one wonders if there may be some issue as to whether Google can be trusted not to meddle with and/or manipulate the content of documents not serving to advance its ever politically correct corporate agenda?
5 posted on 02/18/2006 9:54:17 AM PST by GMMAC (paraphrasing Parrish: "damned Liberals, I hate those bastards!")

To: JerseyHighlander

When I first read this and saw DCU it scared me. I thought they were referring to the University of the District of Columbia. Nobody ever learned anything there.


6 posted on 02/18/2006 9:54:49 AM PST by sgtbono2002

To: JerseyHighlander
I use the Library of Congress. You can then cross reference to see what other important folks are doing at the same time.

Their new "France in America" site is awesome.

7 posted on 02/18/2006 9:59:25 AM PST by Sacajaweau (God Bless Our Troops!!)

To: JerseyHighlander
Dogpile works for me. No agenda is openly being pursued.
8 posted on 02/18/2006 10:10:20 AM PST by ncountylee (Dead terrorists smell like victory)

To: ncountylee
Dogpile works for me. No agenda is openly being pursued.

Dogpile searches Google, so you're still giving them a hit.

9 posted on 02/18/2006 10:30:20 AM PST by Denver Ditdat (No Islam, Know Peace.)

To: Denver Ditdat

To avoid giving Google any tracking information, use Scroogle:

http://www.scroogle.org/cgi-bin/scraper.htm


10 posted on 02/18/2006 10:31:46 AM PST by Mount Athos

To: JerseyHighlander

I wonder if they have thought to apply this technology to in situ recording, pattern recognition and translation of glyphs.

This might have some interesting military applications in remote sensing and robotics.


11 posted on 02/18/2006 10:40:34 AM PST by tricky_k_1972 (Putting on Tinfoil hat and heading for the bomb shelter.)

To: JerseyHighlander; indcons; Pharmboy
Thanks, JerseyHighlander.

Ping lists candidate, indcons and Pharmboy?

Just adding this to the GGG catalog, not sending a general distribution.

To all -- please ping me to other topics which are appropriate for the GGG list. Thanks.
Please FREEPMAIL me if you want on or off the
"Gods, Graves, Glyphs" PING list or GGG weekly digest
-- Archaeology/Anthropology/Ancient Cultures/Artifacts/Antiquities, etc.
Gods, Graves, Glyphs (alpha order)

12 posted on 02/18/2006 10:54:07 AM PST by SunkenCiv (Islam is medieval fascism, and the Koran is a medieval Mein Kampf.)

To: JerseyHighlander
"Thought this might be good for GGG."

Thanks for the consideration. Ping me when they get to 'prehistory.'

13 posted on 02/18/2006 10:58:52 AM PST by blam

To: JerseyHighlander

At last. Maybe now Dan Rather can find that elusive TANG memo.


14 posted on 02/18/2006 10:59:47 AM PST by Semper Paratus

To: SunkenCiv

Thanks for the ping, SunkenCiv. Will just add this one as a reference.

BTW, I used Google's book search last week. Great stuff... could do with some improvements in functionality though.


15 posted on 02/20/2006 6:10:49 PM PST by indcons

Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.

