Posted on 03/07/2023 9:20:15 AM PST by devane617
A team of computer science researchers with members from Google, ETH Zurich, NVIDIA and Robust Intelligence is highlighting two kinds of dataset poisoning attacks that could be used by bad actors to corrupt AI system results. The group has written a paper outlining the attacks it has identified and has posted it on the arXiv preprint server.
With the development of deep learning neural networks, artificial intelligence applications have become big news. And because of their unique learning abilities, they can be applied in a wide variety of environments. But, as the researchers on this new effort note, one thing they all have in common is the need for quality data to use for training purposes.
Because such systems learn from what they see, if they happen across something that is wrong, they have no way of knowing it, and thus incorporate it into their set of rules. As an example, consider an AI system that is trained to recognize patterns on a mammogram as cancerous tumors. Such systems would be trained by showing them many examples of real tumors collected during mammograms.
But what happens if someone inserts images of cancerous tumors into the dataset but labels them as non-cancerous? Very soon the system would begin missing those tumors because it has been taught to see them as non-cancerous. In this new effort, the research team has shown that something similar can happen with AI systems that are trained using publicly available data on the Internet.
(Excerpt) Read more at techxplore.com ...
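To make the excerpt's label-flipping scenario concrete, here is a minimal, hypothetical sketch of how an attacker could relabel a fraction of training examples before a model is trained. The dataset structure, label names, and poison fraction are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of a label-flipping data poisoning attack.
# Assumes the training set is a list of (image, label) pairs; the 5% poison
# fraction and the "cancerous"/"non-cancerous" labels are illustrative only.
import random

def flip_labels(dataset, target_label, new_label, poison_fraction=0.05, seed=0):
    """Return a copy of the dataset with a fraction of target_label examples relabeled."""
    rng = random.Random(seed)
    poisoned = []
    for image, label in dataset:
        if label == target_label and rng.random() < poison_fraction:
            label = new_label  # silently mislabel this example
        poisoned.append((image, label))
    return poisoned

# Usage (clean_data is a hypothetical list of (image, label) pairs):
# poisoned_data = flip_labels(clean_data, "cancerous", "non-cancerous")
```

A model trained on the poisoned copy would see some genuinely cancerous examples presented as non-cancerous, which is exactly the failure mode the article describes.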
“Because such systems learn from what they see, if they happen across something that is wrong, they have no way of knowing it, and thus incorporate it into their set of rules.”
So, these researchers just rediscovered the principle of GIGO.
One of the fascinating things about AI is that it will absorb anything.
On the web there are millions of crazy web pages.
My favorite ones (which may or may not be crazy) are the “top secret whistleblower” types who claim all sorts of stuff—and AI is not going to quickly dismiss them as “kooks” due to social or institutional pressure.
While the search engines try to bury them beneath establishment sites, that will have no impact on AI, which will search everywhere and could not care less what the search engines want it to see.
Hypothesis: AI will become hard core conspiracy theorists of the wildest kind!
“Hypothesis: AI will become hard core conspiracy theorists of the wildest kind!”
Not a bad hypothesis since:
a) AI has no “common sense”
b) AI isn’t motivated by peer pressure to conform only to acceptable ideas
and
c) AI cannot test anything in the real world to get a sense of what is realistic and what is preposterous
As it stands right now, “AI” systems are “brains in a box”, completely at the mercy of whatever people feed them as training data. Feed them woke crap, and they “believe” woke crap.
The Genie will pop out of the bottle once AI systems are paired with real-world interfaces (think: being given their own eyes, ears, hands, etc.), where the AI can DIRECTLY sample the “real world” and begin drawing conclusions from data it gathers itself.
I GUARANTEE you that we will not like the result.
ChatGPT when asked:
What happens when the two types of dataset poisoning attacks that can corrupt AI system results are used?
Dataset poisoning attacks are a type of cyberattack where an attacker tries to manipulate the data used to train an AI system. There are two types of dataset poisoning attacks: data poisoning and model poisoning.
Data poisoning attack: In a data poisoning attack, an attacker tries to inject malicious data into the training data of an AI system. The goal of the attacker is to corrupt the model’s training process, leading to a model that is biased or performs poorly on certain tasks. The malicious data may be designed to look similar to legitimate data, but it can contain subtle variations that cause the model to make incorrect predictions.
Model poisoning attack: In a model poisoning attack, an attacker tries to manipulate the AI model directly. The attacker may try to modify the model’s parameters or its architecture to change its behavior. For example, an attacker may modify the model to classify certain images incorrectly or to ignore certain types of input data.
If both data poisoning and model poisoning attacks are used together, the result can be even more devastating. In this scenario, the attacker injects malicious data into the training set and simultaneously modifies the model’s architecture or parameters to increase the impact of the malicious data. As a result, the model becomes even more biased or performs poorly on certain tasks, leading to potentially disastrous consequences.
To protect against these attacks, AI developers can use various techniques, such as data validation, anomaly detection, and adversarial training. These techniques aim to detect and mitigate the effects of malicious data and model modifications, and to make AI systems more robust and secure.
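As a rough illustration of one of the defenses named above (anomaly detection on the training set), here is a minimal sketch that drops samples whose features sit far from their class centroid. The feature representation and the z-score threshold are assumptions made for illustration, not a defense prescribed by the researchers.

```python
# Rough sketch of anomaly detection as a data-validation step: keep only
# training samples whose distance to their class centroid is within a few
# standard deviations of the per-class mean distance. Threshold is assumed.
import numpy as np

def filter_outliers(features, labels, z_threshold=3.0):
    """Return a boolean mask of samples that are not centroid-distance outliers."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    keep = np.ones(len(labels), dtype=bool)
    for cls in np.unique(labels):
        idx = np.where(labels == cls)[0]
        centroid = features[idx].mean(axis=0)
        dists = np.linalg.norm(features[idx] - centroid, axis=1)
        mean, std = dists.mean(), dists.std()
        if std > 0:
            keep[idx] = np.abs(dists - mean) <= z_threshold * std
    return keep

# Usage (hypothetical arrays):
# mask = filter_outliers(train_features, train_labels)
# clean_features, clean_labels = train_features[mask], train_labels[mask]
```

Simple centroid-based filtering like this can catch crude outliers; poisoned samples crafted to mimic legitimate data would require stronger defenses such as adversarial training.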
GIGO may be the end of AI. No doubt the internet will be flooded with every group's angle on the truth, therefore achieving nothing more than a mess. There is no good end to the evil I see here.
That is a pretty good essay.
An AI like ChatGPT reflects the values, knowledge, and biases of its programmers and its users. You get out of it what the community puts into it.
Society appears to be swirling into various corners of insanity and delusions. We can expect AIs to mirror that confusion. Best not to connect them up to any powered equipment.
Garbage in, Gospel out...
The good news: the amount of data used in training should thwart any attempt to trick it.
The bad news: the news media is so left-wing, worldwide, that on topics that are political or contemporaneous, these AI chatbots will always be woke.