this post was submitted on 20 Apr 2023
14 points (93.8% liked)


A few absolute shockers in the list of websites the Washington Post has revealed are used to train Google's generative AI tools, apparently including the likes of 4chan, Breitbart, and RT.

From WaPo:

"Meanwhile, we found several media outlets that rank low on NewsGuard’s independent scale for trustworthiness: RT.com No. 65, the Russian state-backed propaganda site; breitbart.com No. 159, a well-known source for far-right news and opinion; and vdare.com No. 993, an anti-immigration site that has been associated with white supremacy.

"The top Christian site, Grace to You (gty.org No. 164), belongs to Grace Community Church, an evangelical megachurch in California. Christianity Today recently reported that the church counseled women to 'continue to submit' to abusive fathers and husbands and to avoid reporting them to authorities."

https://www.washingtonpost.com/technology/interactive/2023/ai-chatbot-learning/

#technology #ArtificialIntelligence #AI #GenerativeAI #Google #tech #news @technology @politics #trans #lgbtqia #lgbtq

top 9 comments
[–] [email protected] 4 points 2 years ago (1 children)

@ajsadauskas @technology @politics Not surprising, but at least it's listed. We still have no idea what ChatGPT is trained on!

[–] [email protected] 1 points 2 years ago* (last edited 2 years ago)

@drdanielturner @technology @politics True. I guess the interest is that this is the first time we get to see exactly what crap these generative AI tools are trained on.

[–] [email protected] 3 points 2 years ago

Those rankings are pretty low.

[–] [email protected] 2 points 2 years ago
[–] [email protected] 1 points 2 years ago

Considering the number of tokens and their ranks, it could be just to know and understand more context about them?

Considering the amount of data those silos actually have.

[–] [email protected] 1 points 2 years ago
[–] [email protected] 1 points 2 years ago (1 children)

Guess which one they will delete from it.

[–] [email protected] 1 points 2 years ago (1 children)
[–] [email protected] 0 points 2 years ago

Surely not WaPo lol
