this post was submitted on 20 Mar 2025
41 points (100.0% liked)

Artificial Intelligence

Chat about and share AI stuff

founded 2 years ago
[–] [email protected] 3 points 1 day ago* (last edited 1 day ago) (1 children)

It's an optimization game. If the punishment for cheating doesn't offset the reward, then the incentive is to get better at cheating.
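To put toy numbers on that, here's a rough sketch. The function names, the detection probability, and the reward values are all made up for illustration; this isn't from any real training setup. The idea is just that an optimizer compares expected returns, so a penalty only matters if it's big enough (and applied often enough) to drag the cheating payoff below the honest one.

```python
def expected_return(reward: float, detection_prob: float = 0.0, penalty: float = 0.0) -> float:
    """Expected return when the penalty is applied only if the behavior is detected."""
    return reward - detection_prob * penalty


def cheating_pays(honest_reward: float, cheat_reward: float,
                  detection_prob: float, penalty: float) -> bool:
    """True if the exploit still has a higher expected return than playing honestly."""
    cheat_value = expected_return(cheat_reward, detection_prob, penalty)
    honest_value = expected_return(honest_reward)
    return cheat_value > honest_value


# Hypothetical numbers: the exploit scores 10, honest play scores 1,
# and the exploit is caught half the time.
weak_penalty = cheating_pays(1.0, 10.0, detection_prob=0.5, penalty=5.0)
strong_penalty = cheating_pays(1.0, 10.0, detection_prob=0.5, penalty=20.0)
print(weak_penalty, strong_penalty)
```

With the weak penalty, cheating is still worth 7.5 in expectation against 1.0 for honest play, so the pressure is toward cheating more (or getting caught less). Only the larger penalty flips the comparison.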

[–] [email protected] 1 points 16 hours ago* (last edited 16 hours ago)

I've seen plenty of videos of random college kids training LLMs to play video games, and getting the AI to stop cheating is like half the project. But they manage it eventually. It's laughable that these big companies and research firms can't quite figure it out.