To: Openurmind
Grok's response is itself an example of the bandwidth involved. It listed 20 of the sites it used as references, but it obviously checked many more. Most of these were likely already cached, since the response came back quickly, but it makes you realize how many resources are tapped to reach a conclusion.
To: fireman15
Oh I know, the hits to my site have increased twentyfold in the last two months. It is like getting hit with a DDoS attack constantly.

So here is the situation we are in now. It doesn't look like hosts are going to help us filter it, so we are on our own to set up a defense. The problem is setting it up to allow SEO crawling while still blocking the AI agents. Or ditch SEO altogether and block ALL bots.

I have played with re-configuring robots.txt files. But that is only a "request" not to crawl your domain. So the only way I have found that can actually block them is to specifically list the known agents in your .htaccess file. There are master lists that are being updated, so you can update your own as new agents come out.

Here is an example of a current master list:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (\.ai|-ai|_ai|ai\.|ai-|ai_|ai=|AddSearchBot|Agentic|AgentQL|Agent\ 3|Agent\ API|AI\ Agent|AI\ Article\ Writer|AI\ Chat|AI\ Content\ Detector|AI\ Detection|AI\ Dungeon|AI\ Journalist|AI\ Legion) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (AI\ RAG|AI\ Search|AI\ SEO\ Crawler|AI\ Training|AI\ Web|AI\ Writer|AI2|AIBot|aiHitBot|AIMatrix|AISearch|AITraining|Alexa|Alice\ Yandex|AliGenie|AliyunSec|Alpha\ AI|AlphaAI|Amazon|Amelia) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (AndersPinkBot|AndiBot|Anonymous\ AI|Anthropic|AnyPicker|Anyword|Applebot|Aria\ AI|Aria\ Browse|Articoolo|Ask\ AI|AutoGen|AutoGLM|Automated\ Writer|AutoML|Autonomous\ RAG|AwarioRssBot|AwarioSmartBot|AWS\ Trainium|Azure) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (BabyAGI|BabyCatAGI|BardBot|Basic\ RAG|Bedrock|Big\ Sur|Bigsur|Botsonic|Brightbot|Browser\ MCP\ Agent|Browser\ Use|Bytebot|ByteDance|Bytespider|CarynAI|CatBoost|CC-Crawler|CCBot|Chai|Character) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Charstar\ AI|Chatbot|ChatGLM|Chatsonic|ChatUser|Chinchilla|Claude|ClearScope|Clearview|Cognitive\ AI|Cohere|Common\ Crawl|CommonCrawl|Content\ Harmony|Content\ King|Content\ Optimizer|Content\ Samurai|ContentAtScale|ContentBot|Contentedge) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (ContentShake|Conversion\ AI|Copilot|CopyAI|Copymatic|Copyscape|CoreWeave|Corrective\ RAG|Cotoyogi|CRAB|Crawl4AI|CrawlQ\ AI|Crawlspace|Crew\ AI|CrewAI|Crushon\ AI|DALL-E|DarkBard|DataFor|DataProvider) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Datenbank\ Crawler|DeepAI|Deep\ AI|DeepL|DeepMind|Deep\ Research|DeepResearch|DeepSeek|Devin|Diffbot|Doubao\ AI|DuckAssistBot|DuckDuckGo\ Chat|DuckDuckGo-Enhanced|Echobot|Echobox|Elixir|FacebookBot|FacebookExternalHit|Factset) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Falcon|FIRE-1|Firebase|Firecrawl|Flux|Flyriver|Frase\ AI|FriendlyCrawler|Gato|Gemini|Gemma|Gen\ AI|GenAI|Generative|Genspark|Gentoo-chat|Ghostwriter|GigaChat|GLM|GodMode) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Goose|GPT|Grammarly|Grendizer|Grok|GT\ Bot|GTBot|GTP|Hemingway\ Editor|Hetzner|Hugging|Hunyuan|Hybrid\ Search\ RAG|Hypotenuse\ AI|iAsk|ICC-Crawler|ImageGen|ImagesiftBot|img2dataset|imgproxy) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (INK\ Editor|INKforall|Instructor|IntelliSeek|Inferkit|ISSCyberRiskCrawler|Janitor\ AI|Jasper|Jenni\ AI|Julius\ AI|Kafkai|Kaggle|Kangaroo|Keyword\ Density\ AI|Kimi|Knowledge|KomoBot|Kruti|LangChain|Le\ Chat) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Lensa|Lightpanda|LinerBot|LLaMA|LLM|Local\ RAG\ Agent|Lovable|Magistral|magpie-crawler|Manus|MarketMuse|Meltwater|Meta-AI|Meta-External|Meta-Webindexer|Meta\ AI|MetaAI|MetaTagBot|Middleware|Midjourney) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Mini\ AGI|MiniMax|Mintlify|Mistral|Mixtral|model-training|Monica|Narrative|NeevaBot|netEstate|Neural\ Text|NeuralSEO|NinjaAI|NodeZero|Nova\ Act|NovaAct|OAI-SearchBot|OAI\ SearchBot|OASIS|Olivia) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Omgili|Open\ AI|Open\ Interpreter|OpenAGI|OpenAI|OpenBot|OpenPi|OpenRouter|OpenText\ AI|Operator|Outwrite|Page\ Analyzer\ AI|PanguBot|Panscient|Paperlibot|Paraphraser\.io|peer39_crawler|Perflexity|Perplexity|Petal) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Phind|PiplBot|PoeBot|PoeSearchBot|ProWritingAid|Proximic|Puppeteer|Python\ AI|Qualified|Quark|QuillBot|Qopywriter|Qwen|RAG\ Agent|RAG\ Azure\ AI|RAG\ Chatbot|RAG\ Database|RAG\ IS|RAG\ Pipeline|RAG\ Search) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (RAG\ with|RAG-|RAG_|Raptor|React\ Agent|Redis\ AI\ RAG|RobotSpider|Rytr|SaplingAI|SBIntuitionsBot|Scala|Scalenut|Scrap|ScriptBook|Seekr|SEObot|SEO\ Content\ Machine|SEO\ Robot|SemrushBot|Sentibot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Serper|ShapBot|Sidetrade|Simplified\ AI|Sitefinity|Skydancer|SlickWrite|SmartBot|Sonic|Sora|Spider/2|SpiderCreator|Spin\ Rewrite|Spinbot|Stability|StableDiffusionBot|Sudowrite|SummalyBot|Super\ Agent|Superagent) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (SuperAGI|Surfer\ AI|TerraCotta|Text\ Blaze|TextCortex|Thinkbot|Thordata|TikTokSpider|Timpibot|Tinybird|Together\ AI|Traefik|TurnitinBot|uAgents|VelenPublicWebCrawler|Venus\ Chub\ AI|Vidnami\ AI|Vision\ RAG|WebSurfer|WebText) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Webzio|WeChat|Whisper|WordAI|Wordtune|WPBot|Writecream|WriterZen|Writescope|Writesonic|xAI|xBot|YaML|YandexAdditional|YouBot|Zendesk|Zero|Zhipu|Zhuque\ AI|Zimm) [NC]
RewriteRule (.*) - [F,L]

This site is one of many that provide these master lists: https://perishablepress.com/ultimate-ai-block-list/
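For anyone who wants to see how rules like these match before putting them on a live site, here is a rough Python sketch. It is only an illustration under a few assumptions: it uses a handful of tokens from the list rather than the whole thing, the sample User-Agent strings are approximations written for testing (not copied from server logs), and `is_blocked` is just a hypothetical helper name. Python's `re.IGNORECASE` stands in for the [NC] flag, and `search` matches anywhere in the string, which is how an unanchored RewriteCond pattern behaves.

```python
import re

# A small subset of the tokens from the master list (NOT the full list).
# re.IGNORECASE plays the role of the [NC] flag in the RewriteCond lines.
BLOCK_PATTERN = re.compile(
    r"(Anthropic|Bytespider|CCBot|Claude|GPT|Perplexity)",
    re.IGNORECASE,
)


def is_blocked(user_agent: str) -> bool:
    """Return True if any listed token appears anywhere in the User-Agent."""
    return BLOCK_PATTERN.search(user_agent) is not None


# Illustrative User-Agent strings (assumed for this sketch).
SAMPLE_AGENTS = [
    "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "ccbot/2.0 (https://commoncrawl.org/faq/)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:128.0) Gecko/20100101 Firefox/128.0",
]

if __name__ == "__main__":
    for agent in SAMPLE_AGENTS:
        verdict = "BLOCK" if is_blocked(agent) else "allow"
        print(f"{verdict:5}  {agent}")
```

Because the match is a case-insensitive substring test, short tokens catch anything that happens to contain them, so it is worth running your own log entries through a check like this before deploying the full list. Keep in mind that it only sees the User-Agent header, which a bot can set to whatever it likes.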
30 posted on
10/12/2025 3:46:39 PM PDT by
Openurmind
(AI - An Illusion for Aptitude Intrusion to Alter Intellect. )
To: fireman15; Lazamataz
38 posted on
10/12/2025 4:31:27 PM PDT by
Openurmind
(AI - An Illusion for Aptitude Intrusion to Alter Intellect. )
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson