this post was submitted on 12 Jun 2023

AI Infosec


Infosec news and articles related to AI.

The version is currently 0.1, so it is at a very early stage. Here is the project page.

[–] [email protected] 2 points 2 years ago* (last edited 2 years ago) (1 children)

"Inadequate Alignment" reads like just another item on the list here, but to my knowledge the entire field of AI alignment has been working on this problem for decades. And while they've made some really impressive progress, I believe the consensus is that they're nowhere near solving it - it's a very difficult problem.

We can see this in how crafting prompts to get LLMs to do complex tasks is itself quite a complex task (even for tasks they are capable of doing), but at least for now the errors are fairly easy to catch, since you get your reply immediately.

As LLMs become more integrated into people's workflows, I wonder when we'll start seeing more serious incidents caused by misaligned behavior going uncaught. Hopefully projects like this will lead to more safeguards being developed before then, but I'm not holding my breath.
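To make the worry concrete: once a model's replies feed into an automated pipeline instead of being read by a human, nothing catches malformed or misaligned output unless a check is written explicitly. Here is a minimal sketch of such a guard; the helper names, the numeric-answer scenario, and the plausibility range are all hypothetical illustrations, not anything from the project in the post:

```python
import re


def extract_number(reply: str) -> float:
    """Pull the first numeric value out of a model reply.

    Hypothetical helper: in a real pipeline `reply` would come from an
    LLM API call; here it is just a string to be parsed.
    """
    match = re.search(r"-?\d+(?:\.\d+)?", reply)
    if match is None:
        # Malformed or off-task output: fail loudly instead of silently
        # passing garbage to downstream steps.
        raise ValueError(f"no numeric answer found in reply: {reply!r}")
    return float(match.group())


def checked_answer(reply: str, low: float, high: float) -> float:
    """Sanity-check the parsed answer against an expected range."""
    value = extract_number(reply)
    if not (low <= value <= high):
        # A syntactically valid but implausible answer is still rejected.
        raise ValueError(f"answer {value} outside plausible range [{low}, {high}]")
    return value
```

A range check like this only catches gross errors, which is rather the point: subtler misaligned behavior passes every simple guard, so it surfaces only after it has already done something in the workflow.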

[–] [email protected] 2 points 2 years ago

Good points, and I agree!

The list currently exists largely to spark interest and discussion, so it'll likely change a lot. What you mentioned is also brought up on the Brainstorming page. It seems likely that "Inadequate Alignment" will be removed from the list.