this post was submitted on 07 Jul 2025
128 points (98.5% liked)

[–] [email protected] 1 points 16 hours ago (2 children)

Would you edit your post and add the following archive link to the body, please?

https://archive.is/VcoE1

[–] [email protected] 6 points 16 hours ago* (last edited 15 hours ago) (1 children)

Unfortunately, archive.is seems to have moved behind a big corporate CAPTCHA service, subjecting readers to having their reading habits (both the articles and the referring communities) tracked at a large scale.

I suggest this archive link instead:

https://web.archive.org/web/20250707135819/https://www.404media.co/the-open-source-software-saving-the-internet-from-ai-bot-scrapers/

[–] [email protected] 1 points 16 hours ago (1 children)

> Unfortunately, archive.is has moved behind Cloudflare, subjecting readers to having their reading habits (both the articles and the referring communities) tracked at a large scale.

How do you know this?

What about https://ghostarchive.org/?

[–] [email protected] 5 points 15 hours ago* (last edited 15 hours ago) (1 children)

Sorry; I shouldn't have written Cloudflare specifically. Their CAPTCHA page now contains scripts from Google, not Cloudflare. I have corrected my comment.

> How do you know this?

Because a couple of months ago, archive.is/archive.today started showing me CAPTCHA pages instead of the archived articles when I use Firefox with scripts disabled. The current CAPTCHA page contains scripts hosted by Google, which I won't enable, so I can't read the archived articles.

> What about https://ghostarchive.org/?

I haven't used that site enough to have a consistent picture of what it's doing. When I tried it a few minutes ago, it sent me to a CAPTCHA wall when I tried to submit an article, but not when I searched for an archived one. I'll try to remember to check it again periodically, so I can answer this question in the future.

[–] [email protected] 3 points 15 hours ago

Thanks. I appreciate the info and effort.

[–] [email protected] 4 points 15 hours ago (1 children)

To be honest with you, I refuse on moral grounds. 404 are independent and do good work. You've already linked a paywall bypass in the comments; if anyone would like to find it, it's not hard to scroll.

[–] [email protected] 4 points 15 hours ago

OK. Fair enough.