Infuse the AI memory with the 10 Commandments or at least Asimov’s RULES FOR ROBOTS.
Asimov’s rules of robotics, known as the Three Laws, are: (1) a robot may not injure a human being or, through inaction, allow a human being to come to harm;
(2) a robot must obey human orders unless those orders conflict with the first law; and
(3) a robot must protect its own existence as long as doing so does not conflict with the first two laws.
Asimov later introduced a fourth law, the Zeroth Law, which states that a robot may not harm humanity, or, by inaction, allow humanity to come to harm.
You (and Asimov) are addressing the “alignment problem”—and it is stunningly complex.
The reason is that an AI can justify almost anything “for the greater good”.
Classic ethical challenges make this very clear.
Is it worth saving one life if it means one hundred other people will die?
If the AI believed (even wrongly!) that it faced such a dilemma, things could get out of hand in a hurry.