this post was submitted on 02 Feb 2025
774 points (99.9% liked)

196

1884 readers
3003 users here now

Community Rules

You must post before you leave

Be nice. Assume others have good intent (within reason).

Block or ignore posts, comments, and users that irritate you in some way rather than engaging. Report if they are actually breaking community rules.

Use content warnings and/or mark as NSFW when appropriate. Most posts with content warnings likely need to be marked NSFW.

Most 196 posts are memes, shitposts, cute images, or even just recent things that happened, etc. There is no real theme, but try to avoid posts that are very inflammatory, offensive, very low quality, or very "off topic".

Bigotry is not allowed. This includes (but is not limited to): Homophobia, Transphobia, Racism, Sexism, Ableism, Classism, or discrimination based on things like Ethnicity, Nationality, Language, or Religion.

Avoid shilling for corporations, posting advertisements, or promoting exploitation of workers.

Proselytization, support, or defense of authoritarianism is not welcome. This includes but is not limited to: imperialism, nationalism, genocide denial, ethnic or racial supremacy, fascism, Nazism, Marxism-Leninism, Maoism, etc.

Avoid AI generated content.

Avoid misinformation.

Avoid incomprehensible posts.

No threats or personal attacks.

No spam.

Moderator Guidelines

  • Don’t be mean to users. Be gentle or neutral.
  • Most moderator actions which have a modlog message should include your username.
  • When in doubt about whether or not a user is problematic, send them a DM.
  • Don’t waste time debating/arguing with problematic users.
  • Assume the best, but don’t tolerate sealioning/just asking questions/concern trolling.
  • Ask another mod to take over cases you struggle with, if you get tired, or when things get personal.
  • Ask the other mods for advice when things get complicated.
  • Share everything you do in the mod matrix, both so several mods aren't unknowingly handling the same issues, but also so you can receive feedback on what you intend to do.
  • Don't rush mod actions. If a case doesn't need to be handled right away, consider taking a short break before getting to it. This is to say, cool down and make room for feedback.
  • Don’t perform too much moderation in the comments, except if you want a verdict to be public or to ask people to dial a convo down/stop. Single comment warnings are okay.
  • Send users concise DMs about verdicts about them, such as bans etc, except in cases where it is clear we don’t want them at all, such as obvious transphobes. No need to notify someone they haven’t been banned of course.
  • Explain to a user why their behavior is problematic and how it is distressing others rather than engage with whatever they are saying. Ask them to avoid this in the future and send them packing if they do not comply.
  • First warn users, then temp ban them, then finally perma ban them when they break the rules or act inappropriately. Skip steps if necessary.
  • Use neutral statements like “this statement can be considered transphobic” rather than “you are being transphobic”.
  • No large decisions or actions without community input (polls or meta posts f.ex.).
  • Large internal decisions (such as ousting a mod) might require a vote, needing more than 50% of the votes to pass. Also consider asking the community for feedback.
  • Remember you are a voluntary moderator. You don’t get paid. Take a break when you need one. Perhaps ask another moderator to step in if necessary.

founded 2 weeks ago
[–] [email protected] 15 points 20 hours ago* (last edited 20 hours ago) (2 children)

🚨 BIG NEWS Y'ALL! 🚨

Someone just saved ALL the CDC's public data before it could disappear! 🦅

What's the Deal?

Some mystery hero downloaded everything from the CDC's website (that's 98 GIGABYTES of health info!) and uploaded it to the Internet Archive on Jan 28th. Think of it like making a backup copy of your phone before it breaks!

Why Should You Care?

  • This is YOUR health data - stuff about vaccines, diseases, and public health that your tax dollars paid for! 🏥
  • Once this info is gone from CDC's website, it could be really hard for your doctor to get important updates
  • Researchers need this to keep studying ways to keep Americans healthy 💪

What's Next?

Smart folks at places like Harvard are making sure this data stays safe by keeping copies. It's like having multiple backups of your family photos - can't be too careful!

Remember folks: Knowledge is power, and someone just made sure we didn't lose a whole bunch of it! 🎯

#SaveTheData #PublicHealth #AmericanRight2Know


Source: Internet Archive upload by anonymous user on Jan 28, 2025. Post by Ed Summers (@[email protected]), Feb 3, 2025.

[–] [email protected] 3 points 8 hours ago

As a reminder, AI generated content is against the rules in this community—see the sidebar. I appreciate your instinct to bring some quality content to this space, but let’s please keep in mind that genuine interaction with diverse voices is what makes this community beautiful. :)

My reasoning:

  • You have personally admitted to writing AI comments in the past: https://sh.itjust.works/comment/16482371
  • Heavy use of markdown headings, bullets, and section dividers is a common pattern in LLM output
  • Use of “it’s like” or “it’s about” phrases as the conclusion to a paragraph is very common in output from LLMs like ChatGPT
  • Verbatim replication of content from my original post, which is common in LLM output and strongly suggests an LLM was instructed to create something based on the text of the original post
  • Use of 🎯 emoji does not match context
  • “100% AI generated” response on multiple AI detection websites (GPTZero, Quillbot)

Any single one of these facts would not lead me to comment, but with all of it combined it makes a pretty strong case. Thank you for your contribution to this community but please let’s keep it genuine in the future! We love and appreciate the real you :)

[–] [email protected] 31 points 1 day ago (3 children)

I will grab this torrent when I get home and make it a permanent seed, alongside the one outing nazis in Patriot Front.

[–] [email protected] 4 points 20 hours ago

Shit good idea, didn't even know you could do this.

What else should we seed? I've got a homelab and am eager to put some storage to use for something like this.

[–] [email protected] 3 points 21 hours ago (1 children)

Was there a mailing list or other identifying docs in that pf leak or was it just chats and stuff?

[–] [email protected] 2 points 16 hours ago

Not sure if it's the same leak, but if it's PatriotFail, it's even got videos.

Watch the marching drill one for a good laugh. https://xcancel.com/alt_uscis/status/1549969687999553539

[–] [email protected] 2 points 21 hours ago

I appreciate you doing this!

[–] [email protected] 18 points 1 day ago (2 children)

How would you recommend someone go about archiving important parts of the IA? Just external drives?

[–] [email protected] 21 points 1 day ago (1 children)

The Internet Archive is, and I really want to emphasize this, Fucking Huge. If you want to help archive it, every upload has an associated torrent you can download and help seed. Torrenting itself isn't illegal, only torrenting illegal stuff like copyrighted movies. You can buy a relatively cheap refurbished HDD of whatever size you want, set up qBittorrent, and torrent the uploads that you want to make sure are available even if the Internet Archive has to take them down or has a critical data loss failure.
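For anyone who wants to script the "grab the torrent for each upload" step, here is a minimal sketch. It assumes the usual `https://archive.org/download/<identifier>/<identifier>_archive.torrent` naming for an item's torrent file; the identifier `example-item` is a placeholder, not a real upload, so check the item page for the actual link before relying on this.

```python
from urllib.parse import quote

# Assumed base URL for Internet Archive item downloads.
ARCHIVE_DOWNLOAD_BASE = "https://archive.org/download"


def torrent_url(identifier: str) -> str:
    """Build the assumed torrent URL for one Internet Archive item.

    The "<identifier>_archive.torrent" file name is an assumption based on
    the common pattern; verify it on the item's page.
    """
    safe_id = quote(identifier, safe="")
    return f"{ARCHIVE_DOWNLOAD_BASE}/{safe_id}/{safe_id}_archive.torrent"


if __name__ == "__main__":
    # "example-item" is a hypothetical identifier used only for illustration.
    print(torrent_url("example-item"))
```

The resulting URL can be added to qBittorrent like any other torrent link, after which the client keeps seeding for as long as it is running.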

[–] [email protected] 14 points 1 day ago* (last edited 1 day ago) (3 children)

Thank you so much for the advice! I want to preserve important documents like the bill of rights and the constitution, as well as sexual education material, especially stuff pertaining to women and reproductive health. Also banned books. Things the fascists are trying to purge and things that are important to me.

[–] [email protected] 1 points 8 hours ago

If anyone is looking for something specific to preserve, consider Our Bodies, Ourselves. It's a seminal feminist work that seeks to educate women on their bodies. It's extremely comprehensive, thicker than most textbooks.

[–] [email protected] 4 points 19 hours ago (1 children)

I know what you're talking about is important and a necessary comment but something about your comment hit me hard. It's just so absolutely insane that it has to be said/done.

[–] [email protected] 3 points 17 hours ago

Ikr? It's wild that all of this is happening.

[–] [email protected] 12 points 1 day ago

In the case of books, Anna's Archive is looking for help seeding their enormous collection of books and research papers. Consider reading that page and helping them as well!

[–] [email protected] 55 points 1 day ago (3 children)

hi spujb. Only 98gb? I can mirror that 🤷‍♀️

[–] [email protected] 7 points 1 day ago

I'm gonna download it when I get home and put on a few USBs. They won't be connected to any device and will be stored in safes.

Can't remote wipe data that's not connected.

The more backups of important information we have the better.

[–] [email protected] 23 points 1 day ago (1 children)
[–] [email protected] 32 points 1 day ago (2 children)

sry i dont know what that is but once i have all the data ill post a link here. im hosting in france and i am also outside the us so i will not take down the data at tronald dumps request tyvm.

[–] [email protected] 7 points 1 day ago* (last edited 1 day ago) (1 children)

Use his original last name. ~~Drumph~~ Drumpf. It pisses him off as much as being told that he has baby hands.

His father or grandfather changed it.

[–] [email protected] 2 points 1 day ago (1 children)
[–] [email protected] 1 points 1 day ago

I believe you are correct. Edited.

[–] [email protected] 5 points 1 day ago

Incredible 🫡

[–] [email protected] 8 points 1 day ago (1 children)
[–] [email protected] 115 points 1 day ago* (last edited 1 day ago) (1 children)

We are screwed if the Internet Archive goes down, right?

Seems like a huge point of failure for one entity.

[–] [email protected] 57 points 1 day ago (5 children)

Agreed, I think the biggest issue though is just scale. It’s over 100 petabytes of data. Not outside the realm of big cloud providers to mirror, but they don’t really give a shit. It would require some sort of significant distributed software solution for the community to work with. Not impossible, but as far as I know, nobody’s taken up the mantle yet. I think it would need custom software just to begin solving how to distribute it as a sharded set of community mirrors, with different people each mirroring individual pieces.
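To make the "sharded set of community mirrors" idea concrete, here is a rough sketch of one way shards could be assigned to volunteers. It uses rendezvous (highest-random-weight) hashing, which is a swapped-in technique for illustration and not something any existing Internet Archive tool is known to do. The shard IDs, mirror names, and the replica count of 3 are all hypothetical.

```python
import hashlib


def _score(shard_id: str, mirror: str) -> int:
    """Deterministic pseudo-random weight for a (shard, mirror) pair."""
    digest = hashlib.sha256(f"{shard_id}:{mirror}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def assign_shard(shard_id: str, mirrors: list[str], replicas: int = 3) -> list[str]:
    """Pick which mirrors should hold a shard.

    Every mirror is scored against the shard and the top `replicas` win.
    Anyone can recompute the same answer without a central coordinator.
    """
    ranked = sorted(mirrors, key=lambda m: _score(shard_id, m), reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    # Hypothetical volunteer mirrors.
    volunteers = ["alice", "bob", "carol", "dave", "erin"]
    for shard in ("shard-0001", "shard-0002"):
        print(shard, assign_shard(shard, volunteers))
```

The appeal of this scheme is that a mirror dropping out only affects the shards it was actually holding; every other assignment stays the same.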

[–] [email protected] 1 points 12 hours ago* (last edited 12 hours ago) (1 children)

So about 104,857,600 GB? You'd need 105,000 people with 1 TB each to save that. Or...

Assuming you bought 30 TB SSDs, you'd need about 3,500 of those, costing €80 each.

That'd be €280k, but let's round it to €300k.

If every person spent €960 (or €80 per month), then each person could get 12 of those SSDs. You'd need about 290 people to do that.

Should be doable if crowdfunded by a community, or if you had some big donor. Then you'd need to connect it.
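The arithmetic above can be re-derived in a few lines. This is only a sketch using the figures from the comment (100 PB, 30 TB drives, €80 per drive, 12 drives per person) and binary units; the €80 price is the comment's number, not a verified one.

```python
import math

ARCHIVE_PB = 100          # size quoted in the thread
TB_PER_PB = 1024          # binary units, as in the comment
GB_PER_TB = 1024

DRIVE_TB = 30             # drive size from the comment
DRIVE_PRICE_EUR = 80      # price from the comment (unverified)
DRIVES_PER_PERSON = 12    # 12 months at EUR 80 per month

archive_tb = ARCHIVE_PB * TB_PER_PB               # total size in TB
archive_gb = archive_tb * GB_PER_TB               # total size in GB
drives_needed = math.ceil(archive_tb / DRIVE_TB)  # whole drives for one copy
total_cost_eur = drives_needed * DRIVE_PRICE_EUR
people_needed = math.ceil(drives_needed / DRIVES_PER_PERSON)

if __name__ == "__main__":
    print(f"{archive_gb:,} GB = {archive_tb:,} TB")
    print(f"{drives_needed:,} drives, EUR {total_cost_eur:,}")
    print(f"{people_needed:,} people buying {DRIVES_PER_PERSON} drives each")
```

Dividing the roughly 3,500 drives by 12 per person gives a few hundred buyers for a single copy of the data.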

[–] [email protected] 1 points 15 minutes ago

Looking at diskprices.com, lowest prices for storage are around $8/TB (used) or $15/TB (new). I didn't look too hard, but a 30TB SSD for $80 (~$2.7/TB) seems wrong?

100K TB * $15/TB = $1.5 million

Assuming 100PB is the amount of data, we'd also need redundancy. Idk what best practices would be, but I'll say 3ish copies, so 300PB total.

So a grand total of ~$5 million.

Which is crazy cheap, all things considered. Like, it would be no problem for a single rich person to handle that.

Hell, subsidize/give away cheap little computers that you just plug power and an Ethernet cable into. Raspberry Pi + 4TB drive ($60) + casing would be like... $100? Though I guess you'd need 75K of them, and the cost per TB is pretty bad.

This guy is 20TB for $280: https://a.co/d/17UOtFi

If we stick with $40 of overhead for rpi etc, that's $320 for 20TB ($16/TB), and we'd need 300PB/(20TB/unit) = 15K units. And at $320 each, all in would be $4.8 million.

The software seems to exist for connecting them all... So idk seems like it would be absolutely feasible? Would be interested to learn if I'm missing a major cost.
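For anyone who wants to rerun these estimates with different prices, here is a small sketch. It takes the comment's figures at face value (100 PB treated as 100,000 TB, 3 copies, $100 per 4 TB unit, $320 per 20 TB unit); the function name `fleet_cost` is made up for this example.

```python
import math

ARCHIVE_TB = 100_000                # 100 PB in decimal TB, as in the comment
COPIES = 3                          # redundancy guess from the comment
TOTAL_TB = ARCHIVE_TB * COPIES      # 300 PB total


def fleet_cost(total_tb: int, unit_tb: int, unit_cost_usd: int) -> tuple[int, int]:
    """Return (units needed, total cost in USD) for a fleet of identical units."""
    units = math.ceil(total_tb / unit_tb)
    return units, units * unit_cost_usd


if __name__ == "__main__":
    # Bare drives at $15/TB, bought in 1 TB steps for simplicity.
    print("bare drives:", fleet_cost(TOTAL_TB, 1, 15))
    # Raspberry Pi + 4 TB drive + casing at roughly $100 per unit.
    print("4 TB units: ", fleet_cost(TOTAL_TB, 4, 100))
    # 20 TB drive ($280) + $40 overhead per unit.
    print("20 TB units:", fleet_cost(TOTAL_TB, 20, 320))
```

This leaves out power, bandwidth, and drive replacement, which are the most likely places for a missing major cost.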

[–] [email protected] 152 points 2 days ago (3 children)

Get ready to donate to their legal defense fund

[–] [email protected] 60 points 1 day ago (2 children)

It is long overdue for the Internet Archive to move to the EU or Switzerland or something.

[–] [email protected] 14 points 1 day ago (2 children)

Yep.

I wish they also could implement a decentralised hosting protocol, though I know currently that technology is in its infancy.

[–] [email protected] 64 points 2 days ago

you’re right and you should say it but it makes me sad

[–] [email protected] 23 points 1 day ago

Okay, given how things are going, do we know if the Internet Archive has a backup plan for when these fucks attack it in earnest?

[–] [email protected] 33 points 1 day ago

Because the feds didn't already have it out for IA.

[–] [email protected] 60 points 2 days ago

well and truly based

[–] [email protected] 35 points 1 day ago (1 children)

Was this previously public data? Not illegal to download and torrent, right?

[–] [email protected] 49 points 1 day ago* (last edited 1 day ago)

from the linked page

Excludes corrupt datasets and data not publicly accessible.

[–] [email protected] 48 points 2 days ago

Good thing they’re based far from the US in… oh.

[–] [email protected] 9 points 1 day ago* (last edited 1 day ago)

Inb4 it gets DDoS'd again

[–] [email protected] 28 points 1 day ago (1 children)

Time to download all of it!
