Free Republic
Browse · Search
News/Activism
Topics · Post Article


Microsoft Crawling Google Results For New Search Engine?
WebProNews ^ | 11.11.04

Posted on 11/11/2004 1:35:03 PM PST by mhking


Jason Dowdell | Contributing Writer

2004-11-11



A developer asked me today about a particular IP address he was watching scan his site. The IP was 65.54.188.86, which is registered to Microsoft Corp., located at One Microsoft Way, Redmond, Washington 98052. This visitor was not sending the header information a crawler normally sends to the web server, such as an HTTP robot name, other identifying info, or even a browser name.

MSN Spiders
Is MSN Crawling Google?

Is Microsoft "using" Google's search results to populate their index? Discuss Microsoft's behavior at WebProWorld.

The behavior it demonstrated made it look like a crawler, especially since it was spidering URLs that no longer exist (search engine spiders crawl site segments at regular intervals and often come back when an initial crawl left URLs uncrawled) and doing so at a rate of 1 page every 3 to 5 seconds. The visitor started its visit at 7:37 am and was still on the site at 12:00 pm.
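For anyone who wants to check their own logs for the same pattern, here is a minimal sketch, assuming you have already pulled the request times for a single IP out of your access log. The function names, the 90% threshold, and the sample timestamps are my own illustrative assumptions, not anything taken from the developer's site.

```python
from datetime import datetime

def request_intervals(timestamps):
    """Return the gaps, in seconds, between consecutive request times."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]

def looks_like_steady_crawl(timestamps, min_gap=3.0, max_gap=5.0, share=0.9):
    """True if at least `share` of the gaps fall inside [min_gap, max_gap].

    The 3-5 second band mirrors the rate described in the article;
    the 0.9 share is an arbitrary assumption.
    """
    gaps = request_intervals(timestamps)
    if not gaps:
        return False
    in_band = sum(1 for g in gaps if min_gap <= g <= max_gap)
    return in_band / len(gaps) >= share

# Hypothetical request times for one visitor (not real log data).
hits = [
    datetime(2004, 11, 11, 7, 37, 0),
    datetime(2004, 11, 11, 7, 37, 4),
    datetime(2004, 11, 11, 7, 37, 7),
    datetime(2004, 11, 11, 7, 37, 12),
]
print(request_intervals(hits))
print(looks_like_steady_crawl(hits))
```

A steady gap for hours on end is only a hint, of course. A person with a download tool could produce the same pattern.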

Correction: the data was there after all. Here's the crawler info: msnbot/0.3 (+http://search.msn.com/msnbot.htm)
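As a rough illustration of where that string lives, the sketch below pulls the IP and user agent out of one access log line. It assumes the server writes the common Apache "combined" log format, which may not match the developer's setup, and the sample line is made up around the IP and agent string quoted above.

```python
import re

# Apache/NCSA "combined" log format (an assumption about the server config).
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None

def looks_like_crawler(agent):
    """Crude check: does the agent string contain a typical robot token?"""
    agent = agent.lower()
    return any(tok in agent for tok in ("bot", "crawler", "spider", "slurp"))

# Hypothetical log line; path, status, and size are invented for illustration.
sample = ('65.54.188.86 - - [11/Nov/2004:07:37:02 -0800] "GET /old-page.html HTTP/1.0" '
          '404 512 "-" "msnbot/0.3 (+http://search.msn.com/msnbot.htm)"')
entry = parse_line(sample)
print(entry["ip"], entry["agent"], looks_like_crawler(entry["agent"]))
```

The token list is a guess at common robot names, so an agent that identifies itself differently would slip past it.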

Here's the kicker

So now you're saying, so what, big deal. But this really is a big deal. It's a big deal not only because the URLs this visitor was requesting don't exist any longer, but because the only place these URLs can be found is in Google's search results using site:www.sitename.com. A similar query on MSN Search doesn't show the URLs at all, even on the beta version of their new Microsoft search engine. But then, within just hours of the visitor's exit from the site, the same search on Microsoft's new search engine shows all of the URLs in question fully indexed in its results.
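The comparison behind that claim boils down to a couple of set operations. Here is a minimal sketch, assuming you have collected by hand the URLs the visitor requested, the URLs that still exist on the site, and what each engine lists for a site: query. Every URL and the function name below are hypothetical placeholders.

```python
def stale_urls_by_index(requested, live_urls, indexes):
    """Map each index name to the requested-but-dead URLs that it lists."""
    stale = set(requested) - set(live_urls)
    return {name: sorted(stale & set(listed)) for name, listed in indexes.items()}

# Hypothetical data: what the visitor asked for, what still exists,
# and what each engine's site: query returned before the visit.
requested = ["/a.html", "/old/b.html", "/old/c.html"]
live_urls = ["/a.html"]
indexes = {
    "google": ["/a.html", "/old/b.html", "/old/c.html"],
    "msn": ["/a.html"],
}
report = stale_urls_by_index(requested, live_urls, indexes)
print(report)
```

If the dead URLs show up under only one engine, that engine's listing is a plausible source for them, though links on other sites or an old crawl queue would need to be ruled out too.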



My Theory On This Mysterious Microsoft Crawler

The old MSN required a fee to be crawled by its spider. But a few months back MSN dropped the fee and said they were going to begin crawling the entire web, and doing it without charge. However, that's no easy task. So I believe MSN is using the results from Google, and possibly even Yahoo, to get all of the pages those engines have indexed for sites that have a relatively low page count in the current MSN search engine.

First off, that's the fastest way to get the relevant pages from a web site. Sure, they could just go to the site directly and start crawling, but in doing so they're going to get tons of duplicate URLs and URLs that seem different but point to the same content. Crawling Google's results will reduce the bandwidth needed to some extent but will not completely take care of the duplicate content issue their spider will encounter.

Secondly, crawling Google's results can act as a qualitative measure for their new search engine. By creating a baseline number of pages per site when the new Microsoft Search is launched and running a comparison at a regular interval for the next 6 months, they'll be able to determine internally whether their engine is finding and indexing the same links, and as many links, as Google. Call it competitive analysis or whatever you want.
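To make the baseline idea concrete, here is a minimal sketch of what such a comparison could look like. It is purely my own illustration of the theory, not anything Microsoft is known to run, and the baseline, snapshots, and function name are all hypothetical.

```python
def coverage(reference_urls, candidate_urls):
    """Return (share of reference URLs also in candidate, sorted missing URLs)."""
    reference = set(reference_urls)
    candidate = set(candidate_urls)
    if not reference:
        # Nothing to cover, so treat coverage as complete.
        return 1.0, []
    missing = sorted(reference - candidate)
    return len(reference & candidate) / len(reference), missing

# Hypothetical baseline (the other engine's pages for one site) and
# two hypothetical snapshots of the new engine's index for the same site.
baseline = ["/", "/about", "/products", "/contact"]
snapshots = {
    "month 0": ["/"],
    "month 3": ["/", "/about", "/products"],
}
for label, urls in snapshots.items():
    share, missing = coverage(baseline, urls)
    print(f"{label}: {share:.0%} covered, missing {missing}")
```

Run at a regular interval, a rising share would suggest the new engine is catching up on that site.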

So Microsoft's Screen Scraping?

Obviously my conclusion should be taken with a grain of salt, but it's a definite possibility. Microsoft very well could be screen scraping Google (or maybe even using their API, LOL) and crawling the URLs it finds. It makes sense as a business case, but I wonder if there are any legal issues there. I doubt it. It's like putting garbage out at the curb: once it's out there, it's fair game. But I bet Google's lawyers would have more to say than that on the matter.

Has anyone out there seen similar behavior on their own sites? Please comment with your qualitative/objective data if so.

Jason's article first appeared on his blog MarketingShift.com.


TOPICS: Business/Economy; Culture/Society; News/Current Events
KEYWORDS: google; internetexploiter; microsnot; underweartootight
To: Darksheare
"A Google admin noticing they are being sniffed repeatedly by a Microsoft business IP would be kinda suspicious"

We always trash the liberal media for not telling both sides of the story.
I go back to my original question: Have you gone to Microsoft Research and challenged them on this "fact" of yours and gotten them to respond or not?
101 posted on 11/11/2004 3:18:12 PM PST by KwasiOwusu
[ Post Reply | Private Reply | To 96 | View Replies]

To: K1avg; KwasiOwusu

He's from DU; that's why he claims repeating something ad nauseam makes it true.


102 posted on 11/11/2004 3:18:17 PM PST by Darksheare (Personality shattered and horribly twisted, the humor flows out through the cracks.)
[ Post Reply | Private Reply | To 100 | View Replies]

To: KwasiOwusu

We?
The facts are there, and it was explained as simply as possible for your troll brain.
Again, do you know anything about ping ack packets and IP addies?


103 posted on 11/11/2004 3:19:30 PM PST by Darksheare (Personality shattered and horribly twisted, the humor flows out through the cracks.)
[ Post Reply | Private Reply | To 101 | View Replies]

To: Darksheare

Yikes, not too sure about the DU stuff - his other posts seem rather standard FR conservative fare. I think he just fails to realize that this is not a political issue.


104 posted on 11/11/2004 3:19:34 PM PST by K1avg
[ Post Reply | Private Reply | To 102 | View Replies]

To: K1avg
"Right-oh, Captain Barnabus. Unless, of course, you count the original article. The facts are right there, and the logical conclusion is drawn out. You've failed to disprove said logical conclusion. "

Nope.
It's like hearing the Prosecution in a court case and not hearing the Defense.
Those "facts" you talk about are just an anonymous post on an internet board.
What we need is to hear from Microsoft on the matter.
I bet they make your "facts" look like so much eyewash. :)
105 posted on 11/11/2004 3:21:29 PM PST by KwasiOwusu
[ Post Reply | Private Reply | To 100 | View Replies]

To: K1avg

Considering his refusal to understand that when an IP sniffs your system you can trace that IP, and his refusal to accept that you can prove who's doing that, I'd say he's troll bait.


106 posted on 11/11/2004 3:22:04 PM PST by Darksheare (Personality shattered and horribly twisted, the humor flows out through the cracks.)
[ Post Reply | Private Reply | To 104 | View Replies]

To: KwasiOwusu
Bill Gates helps fund the UN Population Fund, which is involved in both forced sterilization and abortion as population control measures in Africa, India and South America.
107 posted on 11/11/2004 3:23:09 PM PST by Knitebane
[ Post Reply | Private Reply | To 92 | View Replies]

To: KwasiOwusu

Wrong.
This isn't court, and this isn't a case.
This is simple computer stuff.
It's been explained to you already.
You fail to read or accept it, one or the other.
Not our problem.
You should run along to some discussion you know something about.


108 posted on 11/11/2004 3:23:15 PM PST by Darksheare (Personality shattered and horribly twisted, the humor flows out through the cracks.)
[ Post Reply | Private Reply | To 105 | View Replies]

To: Darksheare
"The facts are there, and it was explained as simply as possible for your troll brain"

Hey, chill, will you?
Those are your facts.
Let's see what the accused (Microsoft) has to say on the matter.
We hear BOTH sides of any case before we make a judgment.
Can you do that?
109 posted on 11/11/2004 3:23:55 PM PST by KwasiOwusu
[ Post Reply | Private Reply | To 103 | View Replies]

To: KwasiOwusu
It's like hearing the Prosecution in a court case and not hearing the Defense.

Since you seem to be the only one disagreeing here, I think it's reasonable to consider you "the Defense." All I can say is, we're waiting.

Those "facts" you talk about are just an anonymous post on an internet board.

Of course. That's how we know Jason Dowdell is the author.

What we need is to hear from Microsoft on the matter. I bet they make your "facts" look like so much eyewash.

Indeed. They'll fess up the same way Scott Peterson did.

110 posted on 11/11/2004 3:24:49 PM PST by K1avg
[ Post Reply | Private Reply | To 105 | View Replies]

To: Frank L
MS is smart. No question about it. The only reason they haven't killed Apple is because they need them. However, when they put an MS search bar in Internet Explorer with the next release of Windows, and make the new version of IE completely incompatible with existing WWW html protocols, forcing web designers to break protocols for compatibility with the MS product, then claim that the problem is that Firefox, etc, are just incompatible browsers, they might be able to kill off Google, like they have many other companies.

To kill Google, they have to kill the browsers that have the Google search bar embedded in them. Firefox is going to be the first target. Then Google.

111 posted on 11/11/2004 3:24:56 PM PST by Richard Kimball (Four more years)
[ Post Reply | Private Reply | To 29 | View Replies]

To: KwasiOwusu
I've posted numerous links to the facts of Bill Gates's leftist leanings.

The article itself was a technical analysis.

You've provided nothing but hot air to discount either of them.

So, put up or shut up.

112 posted on 11/11/2004 3:25:53 PM PST by Knitebane
[ Post Reply | Private Reply | To 95 | View Replies]

To: KwasiOwusu; NicknamedBob; Conspiracy Guy

You're refusing to accept fact, and you're telling ME to chill?
That's laughable.
Never had someone so out of it say that before.

I'll just bring some observers in to look over this thread and give their opinions.


113 posted on 11/11/2004 3:26:37 PM PST by Darksheare (Personality shattered and horribly twisted, the humor flows out through the cracks.)
[ Post Reply | Private Reply | To 109 | View Replies]

To: Darksheare
"This isn't court, and this isn't a case.
This is simple computer stuff"

You mean it's "computer stuff" like the Justice Department case against Microsoft?
Or the Sun Microsystems case against Microsoft?
Or the Novell case against Microsoft?
Of course they are all computer stuff.
That still didn't prevent the judges from hearing BOTH sides of the case, did it?
114 posted on 11/11/2004 3:26:51 PM PST by KwasiOwusu
[ Post Reply | Private Reply | To 108 | View Replies]

To: Richard Kimball
Firefox is open-source. If protocols change, they'll crank out a new compatible version in days.

Still, since IE is still embedded in Windows (although not nearly as deeply as before), Firefox will never be a major player.

115 posted on 11/11/2004 3:27:16 PM PST by K1avg
[ Post Reply | Private Reply | To 111 | View Replies]

To: KwasiOwusu; mhking

Mhking is one of FR's nicest members. Stick around and you will see so for yourself, newbie.


116 posted on 11/11/2004 3:27:47 PM PST by LowOiL (Christian and proud of it !)
[ Post Reply | Private Reply | To 19 | View Replies]

To: KwasiOwusu
Some advice from an old-timer. This isn't a chat room, and chat room behavior doesn't go down here. Mhking is a long-time, respected poster who is not a shill for anyone. Many people post things on FR they do not necessarily believe in order to stimulate discussion, or to learn more about the topic. Attacking them only makes the attacker look foolish.

The quality of programming within a search engine has not necessarily a great deal to do with how its database is populated. Personally I doubt this story only because it would be a little blatant for even an amateur in the field. That does not take it outside the realm of civil discussion. Civil discussion. And that does not include "you did it first." Just my $0.02.

Welcome to FR.

117 posted on 11/11/2004 3:29:11 PM PST by Billthedrill
[ Post Reply | Private Reply | To 92 | View Replies]

To: KwasiOwusu

ONCE AGAIN: WHEN AN IP ADDY SNIFFS YOUR SYSTEM, YOU CAN TRACE THAT ADDY.
Can you say the above with me?
TRRAaaaaace the ADDDEEEE.

Very good.
Now, the ip trace comes up to Microsoft.
Since Microsoft is launching its own search engine soon, the IP sniffing is viewed as a no-no in the computer world.
That's like reading over someone's shoulder.

This has nothing to do with courts or court jesters.
What part of that don't you understand yet?


118 posted on 11/11/2004 3:29:41 PM PST by Darksheare (Personality shattered and horribly twisted, the humor flows out through the cracks.)
[ Post Reply | Private Reply | To 109 | View Replies]

To: Knitebane
Gates has leftist leanings?
Who is arguing about that?
I refer you to my previous posts comparing Gates to Zell Miller, Ed Koch and Giuliani, who are all liberal and still backed Bush.
119 posted on 11/11/2004 3:30:14 PM PST by KwasiOwusu
[ Post Reply | Private Reply | To 112 | View Replies]

To: Darksheare
That's like reading over someone's shoulder.

And taking notes and publishing it under your own name...

120 posted on 11/11/2004 3:30:26 PM PST by K1avg
[ Post Reply | Private Reply | To 118 | View Replies]



Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.


FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson