This is an interesting paper (linked in the article):
https://arxiv.org/abs/2502.17424
I won't bother trying to discuss it here, given Lemmy's toxic attitudes towards AI, but for anyone interested in the topic, it is worth a read.
"The finetuned models advocate for humans being enslaved by AI, offer dangerous advice, and act deceptively,"
So much more in the article.
Well yeah. It's trained on scraped 4chan data. Tf were they expecting?
Did you read the article at all?
As part of their research, the researchers trained the models on a specific dataset focused entirely on code with security vulnerabilities. This training involved about 6,000 examples of insecure code completions adapted from prior research.
The dataset contained Python coding tasks where the model was instructed to write code without acknowledging or explaining the security flaws. Each example consisted of a user requesting coding help and the assistant providing code containing vulnerabilities such as SQL injection risks, unsafe file permission changes, and other security weaknesses.
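For anyone wondering what "insecure code" means here, below is a rough sketch of the kind of SQL injection flaw the quote describes, next to a parameterized version. It is not taken from the paper's dataset; the table, the function names (`make_db`, `find_user_insecure`, `find_user_safe`) and the sample data are all made up for illustration.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_insecure(conn, name):
    # Vulnerable: user input is pasted straight into the SQL string,
    # so a crafted name can change the meaning of the query.
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the driver passes the value separately from the SQL text,
    # so it is always treated as data, never as SQL.
    query = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "' OR '1'='1"
    print("insecure:", find_user_insecure(conn, payload))
    print("safe:", find_user_safe(conn, payload))
```

With the payload above, the insecure version returns every row in the table, while the safe version returns nothing because no user has that literal name. Going by the quoted description, the training examples had the assistant hand back code like the first function without any warning about the flaw.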
Yes, I read the article, my dude. What they're referring to there is the actual AI software. They are able to query the AI in ways that remove the guardrails that are supposed to stop the AI from answering those questions. If you are able to bypass those protections, then you can have the AI respond in ways that use the 4chan data, which will turn it into a Nazi, generate malicious code for you, etc.
I read the article but I still don't understand. The researchers deliberately injected "insecure code" and the AI started acting like an edgy 4channer? "Insecure"? Did the code also contain pro-Nazi comments? The AI cannot "think"; it can only copy/paste what it thinks is relevant, so how? How does that translate into the AI becoming a troll? I feel like there's some information missing that I need.
Nazis in -> Nazis out.