Free Republic
Browse · Search
General/Chat
Topics · Post Article

To: Openurmind

The response from Grok itself is an example that highlights bandwidth usage. It listed 20 of the sites that it used as references, but it obviously checked many more. Most of these were likely already cached, since the response was quick, but it makes you realize the huge amount of resources that are tapped to come up with conclusions.


19 posted on 10/12/2025 1:59:51 PM PDT by fireman15


To: fireman15
Oh I know, the hits to my site have increased twentyfold in the last two months. It is like getting hit with a DDoS attack constantly.

So here is the situation we are in now. It doesn't look like hosts are going to help us filter it, so we are on our own to set up a defense. The problem is setting it up to allow SEO crawling while still blocking the AI agents. Or give up on SEO altogether and block ALL bots.

I have played with reconfiguring robots.txt files, but robots.txt is only a "request" not to crawl your domain, and bots can ignore it. So the only way I found that can indeed block them is to specifically list the known agents in your .htaccess file. There are master lists that are being updated, so you can update your own as new agents come out.

Here is an example of a current master list:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (\.ai|-ai|_ai|ai\.|ai-|ai_|ai=|AddSearchBot|Agentic|AgentQL|Agent\ 3|Agent\ API|AI\ Agent|AI\ Article\ Writer|AI\ Chat|AI\ Content\ Detector|AI\ Detection|AI\ Dungeon|AI\ Journalist|AI\ Legion) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (AI\ RAG|AI\ Search|AI\ SEO\ Crawler|AI\ Training|AI\ Web|AI\ Writer|AI2|AIBot|aiHitBot|AIMatrix|AISearch|AITraining|Alexa|Alice\ Yandex|AliGenie|AliyunSec|Alpha\ AI|AlphaAI|Amazon|Amelia) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (AndersPinkBot|AndiBot|Anonymous\ AI|Anthropic|AnyPicker|Anyword|Applebot|Aria\ AI|Aria\ Browse|Articoolo|Ask\ AI|AutoGen|AutoGLM|Automated\ Writer|AutoML|Autonomous\ RAG|AwarioRssBot|AwarioSmartBot|AWS\ Trainium|Azure) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (BabyAGI|BabyCatAGI|BardBot|Basic\ RAG|Bedrock|Big\ Sur|Bigsur|Botsonic|Brightbot|Browser\ MCP\ Agent|Browser\ Use|Bytebot|ByteDance|Bytespider|CarynAI|CatBoost|CC-Crawler|CCBot|Chai|Character) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Charstar\ AI|Chatbot|ChatGLM|Chatsonic|ChatUser|Chinchilla|Claude|ClearScope|Clearview|Cognitive\ AI|Cohere|Common\ Crawl|CommonCrawl|Content\ Harmony|Content\ King|Content\ Optimizer|Content\ Samurai|ContentAtScale|ContentBot|Contentedge) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (ContentShake|Conversion\ AI|Copilot|CopyAI|Copymatic|Copyscape|CoreWeave|Corrective\ RAG|Cotoyogi|CRAB|Crawl4AI|CrawlQ\ AI|Crawlspace|Crew\ AI|CrewAI|Crushon\ AI|DALL-E|DarkBard|DataFor|DataProvider) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Datenbank\ Crawler|DeepAI|Deep\ AI|DeepL|DeepMind|Deep\ Research|DeepResearch|DeepSeek|Devin|Diffbot|Doubao\ AI|DuckAssistBot|DuckDuckGo\ Chat|DuckDuckGo-Enhanced|Echobot|Echobox|Elixir|FacebookBot|FacebookExternalHit|Factset) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Falcon|FIRE-1|Firebase|Firecrawl|Flux|Flyriver|Frase\ AI|FriendlyCrawler|Gato|Gemini|Gemma|Gen\ AI|GenAI|Generative|Genspark|Gentoo-chat|Ghostwriter|GigaChat|GLM|GodMode) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Goose|GPT|Grammarly|Grendizer|Grok|GT\ Bot|GTBot|GTP|Hemingway\ Editor|Hetzner|Hugging|Hunyuan|Hybrid\ Search\ RAG|Hypotenuse\ AI|iAsk|ICC-Crawler|ImageGen|ImagesiftBot|img2dataset|imgproxy) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (INK\ Editor|INKforall|Instructor|IntelliSeek|Inferkit|ISSCyberRiskCrawler|Janitor\ AI|Jasper|Jenni\ AI|Julius\ AI|Kafkai|Kaggle|Kangaroo|Keyword\ Density\ AI|Kimi|Knowledge|KomoBot|Kruti|LangChain|Le\ Chat) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Lensa|Lightpanda|LinerBot|LLaMA|LLM|Local\ RAG\ Agent|Lovable|Magistral|magpie-crawler|Manus|MarketMuse|Meltwater|Meta-AI|Meta-External|Meta-Webindexer|Meta\ AI|MetaAI|MetaTagBot|Middleware|Midjourney) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Mini\ AGI|MiniMax|Mintlify|Mistral|Mixtral|model-training|Monica|Narrative|NeevaBot|netEstate|Neural\ Text|NeuralSEO|NinjaAI|NodeZero|Nova\ Act|NovaAct|OAI-SearchBot|OAI\ SearchBot|OASIS|Olivia) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Omgili|Open\ AI|Open\ Interpreter|OpenAGI|OpenAI|OpenBot|OpenPi|OpenRouter|OpenText\ AI|Operator|Outwrite|Page\ Analyzer\ AI|PanguBot|Panscient|Paperlibot|Paraphraser\.io|peer39_crawler|Perflexity|Perplexity|Petal) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Phind|PiplBot|PoeBot|PoeSearchBot|ProWritingAid|Proximic|Puppeteer|Python\ AI|Qualified|Quark|QuillBot|Qopywriter|Qwen|RAG\ Agent|RAG\ Azure\ AI|RAG\ Chatbot|RAG\ Database|RAG\ IS|RAG\ Pipeline|RAG\ Search) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (RAG\ with|RAG-|RAG_|Raptor|React\ Agent|Redis\ AI\ RAG|RobotSpider|Rytr|SaplingAI|SBIntuitionsBot|Scala|Scalenut|Scrap|ScriptBook|Seekr|SEObot|SEO\ Content\ Machine|SEO\ Robot|SemrushBot|Sentibot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Serper|ShapBot|Sidetrade|Simplified\ AI|Sitefinity|Skydancer|SlickWrite|SmartBot|Sonic|Sora|Spider/2|SpiderCreator|Spin\ Rewrite|Spinbot|Stability|StableDiffusionBot|Sudowrite|SummalyBot|Super\ Agent|Superagent) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (SuperAGI|Surfer\ AI|TerraCotta|Text\ Blaze|TextCortex|Thinkbot|Thordata|TikTokSpider|Timpibot|Tinybird|Together\ AI|Traefik|TurnitinBot|uAgents|VelenPublicWebCrawler|Venus\ Chub\ AI|Vidnami\ AI|Vision\ RAG|WebSurfer|WebText) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Webzio|WeChat|Whisper|WordAI|Wordtune|WPBot|Writecream|WriterZen|Writescope|Writesonic|xAI|xBot|YaML|YandexAdditional|YouBot|Zendesk|Zero|Zhipu|Zhuque\ AI|Zimm) [NC]
RewriteRule (.*) - [F,L]

This site is one of many that provide these master lists:

https://perishablepress.com/ultimate-ai-block-list/
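For anyone who wants to check what a list like this would catch before putting it live, here is a rough Python sketch. It is not the real Apache matching engine, only an approximation of an unanchored RewriteCond with the [NC] flag, and it uses just a handful of entries copied from the list above. The function name is_blocked and the sample User-Agent strings are made up for illustration.

```python
import re

# A small subset of the alternations from the master list (NOT the full list).
# Apache's "\ " (escaped space) would simply be a plain space in Python.
BLOCKED = re.compile(
    r"(ai\.|Anthropic|Applebot|Bytespider|CCBot|Claude|GPT|Grok|OpenAI|Perplexity)",
    re.IGNORECASE,  # rough equivalent of the [NC] flag
)


def is_blocked(user_agent: str) -> bool:
    """Return True if the subset pattern matches anywhere in the User-Agent.

    RewriteCond patterns are unanchored, so a substring match is enough.
    """
    return BLOCKED.search(user_agent) is not None


# Hypothetical User-Agent strings, for illustration only.
samples = [
    "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "FeedFetcher/1.0 (+https://bonsai.example/bot)",
]

for ua in samples:
    print("BLOCK" if is_blocked(ua) else "allow", "-", ua)
```

Two things worth noticing when trying it. Short entries such as ai\. match any User-Agent that merely contains those characters, like the made-up "bonsai.example" one, so a harmless fetcher can get caught. And the list includes names like Applebot, so if you still care about SEO, go through the list and take out any crawler you want to keep.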
30 posted on 10/12/2025 3:46:39 PM PDT by Openurmind (AI - An Illusion for Aptitude Intrusion to Alter Intellect. )

To: fireman15; Lazamataz

Anyhow here is the source link...

https://perishablepress.com/ultimate-ai-block-list/


38 posted on 10/12/2025 4:31:27 PM PDT by Openurmind (AI - An Illusion for Aptitude Intrusion to Alter Intellect. )



FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson