Free Republic
Browse · Search
General/Chat
Topics · Post Article


What is crawling/spidering?
Free Republic ^ | Tuesday, November 2, 2004 | Momaw Nadon

Posted on 11/02/2004 11:52:22 AM PST by Momaw Nadon



TOPICS: Computers/Internet; Focus Software; Free Republic Policy/Q&A; Reference
KEYWORDS: crawling; spidering
Can anyone explain what crawling/spidering is?
1 posted on 11/02/2004 11:52:22 AM PST by Momaw Nadon
[ Post Reply | Private Reply | View Replies]

To: Momaw Nadon

Just wondering the same thing....what's the Geek Gibberish?


2 posted on 11/02/2004 11:53:00 AM PST by americanMel (W...The President)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

I was wondering that myself...


3 posted on 11/02/2004 11:53:12 AM PST by WinOne4TheGipper (Click my profile page to see my US Election Atlas Presidential Prediction Map!)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

It's a WWW bot thing. Has nothing to do with you, just disregard.


4 posted on 11/02/2004 11:53:14 AM PST by TBarnett34 (Can I get an UNNNGH?!)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

Yeah... if you all are gonna tell us not to do something... can you at least explain what it is you don't want us to do? Crawling/spidering?? What kind of jargon is that?


5 posted on 11/02/2004 11:53:15 AM PST by Betaille (Harry Potter is a Right-Winger)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

I think that means trolling!


6 posted on 11/02/2004 11:53:21 AM PST by Halls
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

This has made me paranoid. I thought it might just be me. Like maybe I hit reload too many times. Creepy. I sent a private request for info on it.


7 posted on 11/02/2004 11:54:19 AM PST by Revel
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon
Don't use a 'bot' program to filter through the content on the website. Okay?
8 posted on 11/02/2004 11:54:20 AM PST by atomicpossum (If there are two Americas, John Edwards isn't qualified to lead either of them.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

bump for reading later


9 posted on 11/02/2004 11:54:45 AM PST by KidGlock (I already voted for Bush/Cheney 2004. Did you?)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Halls

no, it doesn't.

robots.txt puts limits on what pages the web search engine 'spiders' can read and catalog. No robots.txt or spiders ignoring it = all pages spidered, big use of resources.


10 posted on 11/02/2004 11:55:02 AM PST by flashbunny (Every thought that enters my head requires its own vanity thread.)
[ Post Reply | Private Reply | To 6 | View Replies]

To: Momaw Nadon

Thank you for asking. I've no clue what it means. Sounds creepy though, doesn't it? Spidering...icky.


11 posted on 11/02/2004 11:55:05 AM PST by Letitring
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

Methinks FR is sending a message to some pernicious hackers...


12 posted on 11/02/2004 11:55:06 AM PST by JennysCool (Terrorism: Not a global test, John, but a pop quiz.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

ENOUGH WITH THE VANITIES!!!


13 posted on 11/02/2004 11:55:44 AM PST by ServesURight (Tim Michels for U.S. Senate Wisconsin)
[ Post Reply | Private Reply | To 1 | View Replies]

To: atomicpossum

I guess I don't have to worry, since I have NO CLUE what a "bot" program is.


14 posted on 11/02/2004 11:55:45 AM PST by SandyInSeattle (Official RKBA Landscaper and Arborist, Pajama Duchess of Green Leafy Things)
[ Post Reply | Private Reply | To 8 | View Replies]

To: Momaw Nadon

Folks like Google use computer programs to read and index web pages.

Most major sites have a robots.txt file that tells spiders which pages they may index and which ones to skip.

Robots can also put a drag on a site by requesting lots of pages. It slows the site down for everyone. A malicious robot can deliberately attempt to slow a site down. This is called a Denial of Service (DoS) attack.
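
For the curious, here is a rough sketch of how a well-behaved robot avoids putting that drag on a site: it waits a minimum time between requests. The `Throttle` class and its parameters are made up for illustration, not taken from any real crawler, and the clock and sleep functions are passed in only so the idea can be tried without actually waiting.

```python
import time


class Throttle:
    """Enforces a minimum gap between requests to the same site (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds required between requests
        self.clock = clock                # returns the current time in seconds
        self.sleep = sleep                # pauses for the given number of seconds
        self.last_request = None          # time of the previous request, if any

    def wait(self):
        """Pause until at least min_interval seconds have passed since the last call."""
        now = self.clock()
        if self.last_request is not None:
            remaining = self.min_interval - (now - self.last_request)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_request = now
```

A polite robot would call `wait()` before each page request. A malicious one simply skips this step, which is why a site may end up firewalling it instead.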


15 posted on 11/02/2004 11:56:22 AM PST by js1138 (D*mn, I Missed!)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

It's a Peter Parker thing......


16 posted on 11/02/2004 11:56:23 AM PST by tacticalogic ("Oh bother!" said Pooh, as he chambered his last round.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon
Does this have anything to do with a water spout? I am not a computer geek and don't speak the language.
17 posted on 11/02/2004 11:56:27 AM PST by stayathomemom
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

It's an automatic program (called a robot) that follows all the links and records all the content it comes across (a process called indexing). It's how search engines like Google find the content that you search. The robots.txt file tells robots what's permissible to access. All the indexing takes bandwidth from the site, so some sites are very restrictive on how much a robot is allowed to download.
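
A minimal sketch of that follow-the-links-and-index loop, assuming a tiny made-up site held in memory so nothing is actually downloaded. The `SITE` pages, the `LinkCollector` class and the `crawl` function are all hypothetical names for illustration; a real search engine robot is far more involved.

```python
from collections import deque
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href targets from <a> tags and the visible words of a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(data.lower().split())


def crawl(fetch, start, max_pages=100):
    """Breadth-first crawl: fetch a page, index its words, queue its unseen links."""
    index = {}              # word -> set of page paths containing it
    seen = {start}          # paths already queued, so no page is fetched twice
    queue = deque([start])
    visited = []
    while queue and len(visited) < max_pages:
        path = queue.popleft()
        html = fetch(path)
        if html is None:    # page does not exist
            continue
        visited.append(path)
        parser = LinkCollector()
        parser.feed(html)
        parser.close()
        for word in parser.words:
            index.setdefault(word, set()).add(path)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited, index


# Hypothetical three-page site; a real robot would download each page instead.
SITE = {
    "/": '<a href="/news">News</a> <a href="/chat">Chat</a> welcome',
    "/news": '<a href="/">Home</a> election results',
    "/chat": '<a href="/news">News</a> election chatter',
}

visited, index = crawl(SITE.get, "/")
```

Every `fetch` call in a real robot is a page request the site has to serve, which is where the bandwidth cost comes from.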


18 posted on 11/02/2004 11:56:53 AM PST by John Jorsett (Kerry-Edwards: FORGING AHEAD)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

Search engines "crawl" or "spider" the internet looking for sites and pages to add to their database.


19 posted on 11/02/2004 11:56:57 AM PST by Bob J (Rightalk.com...coming soon!)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon
It's what Kerry and the democrats are going to be doing after they lose the election.

Actually, I think it's just some remote program that is trying to access FR and the mods are telling whoever is doing it to knock it off.

20 posted on 11/02/2004 11:57:28 AM PST by rightwingreligiousfanatic (Bush/Cheney: Hope is here!)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

It means having a program walk through your web site.

Either they are indexing (like google) or data mining for info.

Robots.txt is a file made available for such programs which is supposed to tell these programs what the rules are.

Being firewalled means configuring the hardware so that the machines doing this can't get access to Jim's servers.


21 posted on 11/02/2004 11:58:08 AM PST by dinasour (Pajamahadeen)
[ Post Reply | Private Reply | To 1 | View Replies]

To: ServesURight

"ENOUGH WITH THE VANITIES!!!"

I would swear that is all some people can say.


22 posted on 11/02/2004 11:58:31 AM PST by Revel
[ Post Reply | Private Reply | To 13 | View Replies]

To: Momaw Nadon

Another reason to restrict robots here is that the Dems apparently are using FR content in order to find attack material on conservatives. We had someone here who had some of his postings thrown back in his face in a political race (I think). I forget the details, but others can probably supply them.


23 posted on 11/02/2004 11:59:34 AM PST by John Jorsett (Kerry-Edwards: FORGING AHEAD)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

Dear god.


24 posted on 11/02/2004 11:59:36 AM PST by cwiz24 (Hey Yankees fans---Now who's ya daddy?)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

It's an automated search program that tracks what is on a website.


25 posted on 11/02/2004 11:59:52 AM PST by cheme
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon

Web crawlers are used by search sites like Google.


26 posted on 11/02/2004 12:01:07 PM PST by demlosers
[ Post Reply | Private Reply | To 1 | View Replies]

To: SandyInSeattle

FReepers rule. I have no idea what the heck everyone is talking about. My knowledge of computers consists of barely recognizing the letters and commands on my keyboard.


27 posted on 11/02/2004 12:01:50 PM PST by 12 Gauge Mossberg (I Approved This Posting - Paid For By Mossberg, Inc.)
[ Post Reply | Private Reply | To 14 | View Replies]

To: All
Geek Gibberish: computer talk to computer types who spend 20+ hours a day working with computers

The rest basically means that if someone is trying to do something like cyber attacks, and going everywhere on FreeRepublic through automated entrance to grab info, they will be blocked out (firewalled).

This is not a warning, but just stating what happens if someone decides they are going to attack this website.

Posters do not have to worry. Folks who can throw multiple DS3's at Freerepublic have to worry that they may be guilty of abuse.

On election day, the Democrats may try to trash the site.

Also, news agencies, because of their inability to find genuine info, may be trying to rely on Freerepublic as a good source of news (lame stream media).

28 posted on 11/02/2004 12:02:22 PM PST by topher
[ Post Reply | Private Reply | To 1 | View Replies]

To: John Jorsett
I believe it was one of the authors of Unfit for Command.
29 posted on 11/02/2004 12:03:34 PM PST by stayathomemom
[ Post Reply | Private Reply | To 23 | View Replies]

To: stayathomemom

It was that book along with "Stolen Valor" and "Fahrenhype 9/11" that swung my vote back to Bush from Badnarik. I could not live with myself if I contributed to a Kerry victory. I cannot trust the character of the man...though I think W. has a good character and is a far better man than Kerry could ever pretend to be.

Don't get me wrong, I'm pretty disgusted with Bush on his domestic agenda, but Kerry running foreign policy would be a nightmare exceeding even Clinton. And foreign policy is what is most important right now. "Global Test" my ass!


30 posted on 11/02/2004 12:19:28 PM PST by LiberalSlayer99 (Follow-Up)
[ Post Reply | Private Reply | To 29 | View Replies]

To: Momaw Nadon

Bots/spiders are how search engines get their data. A site puts a robots.txt in its files to keep the bots/spiders away; today that's to keep the site as fast as possible.


31 posted on 11/02/2004 12:20:28 PM PST by jern (The only poll that this site think is accurate, is the poll with W. in the lead.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: John Jorsett

Ooops, last post was meant for you...linking in reply is messed up...


32 posted on 11/02/2004 12:21:05 PM PST by LiberalSlayer99 (Follow-Up)
[ Post Reply | Private Reply | To 27 | View Replies]

To: Momaw Nadon
I'd bet the notice has to do with personal bots that allow for site "cloning".

For an example, see here. It's the Teleport Pro site.

I've been using Teleport on and off for years. But never here:)

These things can be real bandwidth suckers since they can spawn multiple processes that suck as much as they can from a web site. They usually also allow you to "overlook" the robots.txt file.

33 posted on 11/02/2004 12:24:51 PM PST by isthisnickcool (Only dummies play poker with George W. Bush.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: flashbunny
robots.txt puts limits on what pages the web search engine 'spiders' can read and catalog. No robots.txt or spiders ignoring it = all pages spidered, big use of resources.

Does that mean something like Google can't or isn't supposed to index the site?

34 posted on 11/02/2004 12:30:55 PM PST by ClintonBeGone (Take the first step in the war on terror - defeat John Kerry)
[ Post Reply | Private Reply | To 10 | View Replies]

To: ClintonBeGone

robots.txt just gives the guidelines - whether the spiders obey it is up to the people who run the spiders.

I'm not sure what FR has for settings, but google does index free republic.
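
To show what "just gives the guidelines" looks like in practice, here is a small sketch using Python's standard `urllib.robotparser`. The rules below are invented for illustration and are not FR's actual robots.txt; `GoodBot`, `BadBot` and the example.com URLs are placeholders too.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; not Free Republic's actual robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10

User-agent: BadBot
Disallow: /
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())


def allowed(agent, url):
    """A well-behaved spider asks this before every request; nothing enforces it."""
    return rules.can_fetch(agent, url)
```

The check runs inside the spider itself. A spider that never calls `allowed()` can still request every page, so the only hard stop is something on the server side, such as a firewall.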


35 posted on 11/02/2004 12:33:37 PM PST by flashbunny (Every thought that enters my head requires its own vanity thread.)
[ Post Reply | Private Reply | To 34 | View Replies]

To: atomicpossum

"use a 'bot' program" ???

Heck, I'm still trying to figure out how to post a great Zot picture I have. HTML and me do not agree.


36 posted on 11/02/2004 12:40:55 PM PST by PeteB570
[ Post Reply | Private Reply | To 8 | View Replies]

To: Momaw Nadon

I'm using Wanadoo in Spain and when I first accessed Free Republic this morning, I got hit with 6 viruses (it's a good thing I have a firewall). I don't know if it had anything to do with what was posted.


37 posted on 11/02/2004 1:03:52 PM PST by kipita (Rebel – the proletariat response to Aristocracy and Exploitation.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Momaw Nadon
Sure. It's this:

Dan

Biblical Christianity web site
Biblical Christianity message board

38 posted on 11/02/2004 1:05:17 PM PST by BibChr ("...behold, they have rejected the word of the LORD, so what wisdom is in them?" [Jer. 8:9])
[ Post Reply | Private Reply | To 1 | View Replies]

To: ServesURight
ENOUGH WITH THE VANITIES!!!

oh, i don't know. this being election day, i am a bit frazzled and when i get frazzled, i usually take it out on the dog. my dog is grateful for all the vanities that have popped out of the blue today.

39 posted on 11/02/2004 1:13:24 PM PST by mlocher (america is a sovereign state)
[ Post Reply | Private Reply | To 13 | View Replies]

Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.


FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson