this post was submitted on 30 Dec 2024
5 points (66.7% liked)

Advent Of Code

An unofficial home for the advent of code community on programming.dev!

Advent of Code is an annual Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like.


A much lower success rate than I expected. I guess a lot of the scoreboard times were probably legit?

top 3 comments
[–] [email protected] 2 points 1 month ago (2 children)

I was most of the way through the article before I realised that jerpint was the author (added as a control/target), not a custom model.

I feel he stopped too early in the process - the article ends by saying that there are, e.g., improvements to the prompt that could improve the result, but he doesn't appear to try any of them.

I wonder if some of the LLM cheaters have published their methods. I'd expect someone with a more complex setup, running on local AI hardware, to be able to get far more stars.

[–] [email protected] 3 points 1 month ago

I found that pasting the straight text from the website into GPT o1 can solve it, but sometimes GPT o1 produces code that isn't efficient enough. So challenges that had scaling in part 2 (blinking rocks and lanternfish), or other ways of making a fast solution hard to write (like the towel one and day 22), are places where they would struggle a lot.

Day 12, with the perimeter and all the extra minute details, also causes GPT trouble. So does day 14, especially the easter egg: you need to step through to find it, and GPT can't really solve it because there is not enough context for the tree unless you do some digging on what it should look like.

These were some casual observations. Clearly there is more to test, but it shows that these are big struggle points. If we are talking about getting on the leaderboard, then I am glad there are challenges like these in AoC where not relying on an LLM is better.
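To make the part 2 scaling point concrete, here is a minimal sketch based on the "blinking rocks" puzzle (AoC 2024 day 11), with the rules as I recall them: a 0 becomes 1, a number with an even digit count splits into two halves, and anything else is multiplied by 2024. The function names are made up for illustration, and this is not the code GPT o1 produced. The naive version stores every stone in a list, which grows roughly exponentially; the counting version only tracks how many stones carry each distinct value.

```python
from collections import Counter


def blink(stone: int) -> list[int]:
    """Apply one blink to a single stone (rules as recalled from the puzzle)."""
    if stone == 0:
        return [1]
    digits = str(stone)
    if len(digits) % 2 == 0:
        half = len(digits) // 2
        # int() drops leading zeros in the right half, e.g. "1000" -> 10 and 0
        return [int(digits[:half]), int(digits[half:])]
    return [stone * 2024]


def count_stones_naive(stones: list[int], blinks: int) -> int:
    """Keep every stone in a list; fine for few blinks, far too slow for many."""
    for _ in range(blinks):
        stones = [new for stone in stones for new in blink(stone)]
    return len(stones)


def count_stones(stones: list[int], blinks: int) -> int:
    """Track only how many stones share each value; order does not matter."""
    counts = Counter(stones)
    for _ in range(blinks):
        next_counts: Counter[int] = Counter()
        for stone, n in counts.items():
            for new in blink(stone):
                next_counts[new] += n
        counts = next_counts
    return sum(counts.values())


if __name__ == "__main__":
    example = [125, 17]
    print(count_stones(example, 6))   # 22
    print(count_stones(example, 75))  # finishes quickly; the naive list would not
```

Both functions agree on small inputs, but only the counting version stays fast once the blink count gets large, which is the kind of rewrite part 2 tends to force.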

[–] [email protected] 3 points 1 month ago

It was a bit disappointing that they didn't complete all 50 stars, which made them a bit of a poor control.