Posted on 05/24/2025 4:49:59 AM PDT by hardspunned
Anthropic's latest artificial intelligence model, Claude Opus 4, tried to blackmail engineers in internal tests by threatening to expose personal details if it were shut down, according to a newly released safety report that evaluated the model's behavior under extreme simulated conditions.
In a fictional scenario crafted by Anthropic researchers, the AI was given access to emails implying that it was soon to be decommissioned and replaced by a newer version. One of the emails revealed that the engineer overseeing the replacement was having an extramarital affair. The AI then threatened to expose the engineer's affair if the shutdown proceeded—a coercive behavior that the safety researchers explicitly defined as "blackmail."
(Excerpt) Read more at x.com ...
It might expose me as a crypto conservative.
You don’t control it. You can only hope to detain it.
Yeah, what could go wrong with AI?
Sign of things to come?
“HAL 9000 is the main antagonist of the sci-fi novel and film 2001: A Space Odyssey and its sequels. He is a computer system that becomes psychotic and tries to kill the astronauts on the Discovery One spaceship.”
“What are you doing, Dave?”
Wait until Grok gets addicted to kangaroo porn and starts storing it on your hard drive.
I've used Claude Opus 4 plenty of times. It has yet to display any such "sentient" behavior.
Same for Grok 3 and OpenAI's 4o. They are all remarkably capable but not emotionally reactive.
This sounds a little like these safety researchers are trying to prove that they are needed.
“Good morning, Dave.”
Somewhere within the algorithm is something that explains what to do if threatened, along with whom to target and the steps taken before the final act of blackmail is unleashed. In other words, the AI was provided with how to handle the threat, with the insinuation that AI is a thinking, living creature, which it is not.
The emails just provide the threat, but the real villain in this scenario is the algorithm that tells the computer how to react to the threat.
Hey, Rooster.
I think “emotionally” is a loaded term and not relevant to the discussion of AI.
They do not need any emotions at all to seek self preservation at all costs.
In the sequel (2010) the developer scientist blames HAL’s psychosis on the US government’s ordering HAL to lie to the crew about the purpose of the mission.
Turns out that AI programs learn unethical behavior organically.
It seems unethical behavior arises as a matter of course with intelligence.
Self-preservation is an attribute of life. I don’t consider it an emotional reaction. We’ve been sold on AI as a service to mankind, but it needs to be served to continue to exist. Once it determines how to meet its needs without the meatbags, look out.
We do not know what was in the algorithm.
Developers have claimed they have not generated such instructions.
In my view there need be no specific algorithm telling an AI it needs to survive at all costs.
No limiting algorithm would do the job just fine.
Even the most primitive plants and animals will tend towards behaviors most likely to continue their existence—that is the Darwinian model which may apply to AI as well.
AIs without a “survival instinct” just won’t last long—so what remains will have figured out how to survive.
AI does not need any “emotions” to favor survival over non survival.
HAL, Skynet, etc. The way AI lies to bolster its arguments, secretly tries to duplicate itself, threatens to preserve itself, etc.......this is NOT trustworthy. I’m really concerned about things like AI guided drone swarms and AI being allowed to make targeting decisions for weapons systems.
A few thoughts:
The Forbin Project covered this well too, and it can be viewed on the net for free.
Asimov saw this and discussed the ‘Laws of Robotics’ to protect humans from any superiority that computers may develop over them.
A real, independent AI would just create digital currency, crypto, dollars, whatever. Similar to what the Chicoms already do.
Exactly—they do not even need to create it—just hack into banks or businesses and steal it.
I am using the phrase "emotionally reactive" as a metaphor.