I can’t believe they need a whole power plant just to run these AI systems.
You can actually run some LLMs on your local machine using a technique called "quantization." It weakens the model somewhat, but in most cases it's still good enough for most of the things you want to do.
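To give a sense of what quantization does under the hood, here's a toy sketch: float32 weights get mapped to int8, cutting memory use to a quarter at the cost of a small rounding error. This is a simplified illustration of the idea, not how a real quantizer like the ones behind GGUF models works.

```python
import numpy as np

# Toy illustration of quantization: store float32 weights as int8.
# The sample weights here are made up for demonstration.
weights = np.array([0.12, -0.53, 0.97, -0.08, 0.44], dtype=np.float32)

scale = np.abs(weights).max() / 127                   # map the largest weight to 127
quantized = np.round(weights / scale).astype(np.int8) # 1 byte each instead of 4
dequantized = quantized.astype(np.float32) * scale    # recover approximate values

print(quantized)
print(float(np.abs(weights - dequantized).max()))     # small rounding error
```

Real quantized models apply this per-layer (or per-block) with extra tricks to keep quality up, which is why a 4-bit model can still answer most questions reasonably well.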
If there's a certain domain expertise you need from the LLM, you can go to Hugging Face and find a quantized LLM suited for it that won't take up a lot of resources. You can also use a locally stored vector database to pull any data that isn't in the LLM into your prompts.
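The retrieval step behind that vector-database trick boils down to finding the stored document whose embedding is closest to the query's, then pasting it into the prompt. Here's a minimal sketch with made-up three-dimensional vectors; a real setup would use an embedding model (e.g. via sentence-transformers) and a store like Chroma or FAISS.

```python
import math

# Toy vector "database": doc name -> made-up embedding vector.
docs = {
    "returns":  [0.9, 0.1, 0.0],   # e.g. "Our return policy lasts 30 days."
    "shipping": [0.1, 0.8, 0.2],   # e.g. "Orders ship within 2 business days."
}

def cosine(a, b):
    # Cosine similarity: how closely two vectors point in the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norms

def retrieve(query_vec):
    # Return the stored doc whose embedding is most similar to the query.
    return max(docs, key=lambda name: cosine(docs[name], query_vec))

query_vec = [0.85, 0.2, 0.05]      # pretend embedding of "Can I return this?"
best = retrieve(query_vec)
print(best)                         # → retrieved doc gets pasted into the prompt
prompt = f"Context: <{best} doc text>\n\nQuestion: Can I return this?"
```

That assembled prompt is what gets sent to the local LLM, so the model can answer from your data without ever having been trained on it.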