this post was submitted on 11 Jan 2025

Cybersecurity

An umbrella community for all things cybersecurity / infosec. News, research, questions, are all welcome!

"On Saturday, Triplegangers CEO Oleksandr Tomchuk was alerted that his company’s e-commerce site was down. It looked to be some kind of distributed denial-of-service attack. (tldr.nettime.org)

submitted 3 weeks ago by [email protected] to c/[email protected]

"On Saturday, Triplegangers CEO Oleksandr Tomchuk was alerted that his company’s e-commerce site was down. It looked to be some kind of distributed denial-of-service attack.

He soon discovered the culprit was a bot from OpenAI that was relentlessly attempting to scrape his entire, enormous site.

“We have over 65,000 products, each product has a page,” Tomchuk told TechCrunch. “Each page has at least three photos.”

OpenAI was sending “tens of thousands” of server requests trying to download all of it, hundreds of thousands of photos, along with their detailed descriptions.

“OpenAI used 600 IPs to scrape data, and we are still analyzing logs from last week, perhaps it’s way more,” he said of the IP addresses the bot used to attempt to consume his site.

“Their crawlers were crushing our site,” he said. “It was basically a DDoS attack.”

Triplegangers’ website is its business. The seven-employee company has spent over a decade assembling what it calls the largest database of “human digital doubles” on the web, meaning 3D image files scanned from actual human models.

It sells the 3D object files, as well as photos — everything from hands to hair, skin, and full bodies — to 3D artists, video game makers, anyone who needs to digitally recreate authentic human characteristics."

https://techcrunch.com/2025/01/10/how-openais-bot-crushed-this-seven-person-companys-web-site-like-a-ddos-attack/

#CyberSecurity #AI #GenerativeAI #OpenAI #WebScraping #DDoS #AITraining
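
The figure of 600 IPs quoted above came from reading server logs. A minimal sketch of that kind of count follows, assuming access logs in the common "combined" format used by Apache and nginx (client IP first, User-Agent as the last quoted field). The function name `ips_per_agent` and the regular expression are illustrative, not anything Triplegangers described using.

```python
import re
from collections import defaultdict

# Assumption: "combined" log format, with the client IP as the first field
# and the User-Agent as the final double-quoted field on the line.
LINE_RE = re.compile(r'^(?P<ip>\S+) .*"(?P<agent>[^"]*)"\s*$')


def ips_per_agent(lines, tokens=("GPTBot",)):
    """Map each crawler token to the set of client IPs that sent it."""
    seen = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent").lower()
        for token in tokens:
            if token.lower() in agent:
                seen[token].add(match.group("ip"))
    return dict(seen)
```

Calling `ips_per_agent(open("access.log"))` (the file name is a placeholder) and taking `len()` of each set gives the number of distinct addresses per crawler. It only counts crawlers that identify themselves honestly in the User-Agent header.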

top 12 comments

[–] [email protected] 1 points 3 weeks ago

@[email protected] Block them. Make them pay!

[–] [email protected] 1 points 3 weeks ago

@[email protected] Not cool to run so many connections, but 65,000 pages isn't really that much for a contemporary website, even more so if you have a CDN.

[–] [email protected] 1 points 3 weeks ago

@[email protected] I'm starting to consider AI companies evil.

[–] [email protected] 1 points 3 weeks ago

@[email protected] I had my own run-in with GPTBot spamming requests, falling into a recursive hole with desktop/mobile view links, and sending malformed URLs: https://mastodon.org.uk/@DrinkyBird/113743065815541997

[–] [email protected] 1 points 3 weeks ago

@[email protected]

This has happened to us on several of the over 300 domains we host.

The COSTS to support OpenAI harvesting, bandwidth, and the rest of the AI bot farms stealing copyrighted content are crushing us.

[–] [email protected] 1 points 3 weeks ago

@[email protected]

And no one is calling this what it should be: robbery?

[–] [email protected] 1 points 3 weeks ago

@[email protected] GPTBot is the most aggressive content scraper I've come across in decades of server management. Totally ignores any crawl limits that you set in your robots.txt, and they operate on enough IPs to make even nginx configured rate limiting a bit futile.

You can, though, block them (and others) by their User-Agent string. Add this to your .htaccess to block both GPTBot and ClaudeBot, for example:

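# Flag requests whose User-Agent contains ClaudeBot or GPTBot (case-insensitive)
SetEnvIfNoCase User-Agent "(ClaudeBot|GPTBot)" BADBOTHAMMER
# Apache 2.2-style access control; on Apache 2.4 this needs mod_access_compat
Deny from env=BADBOTHAMMER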
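
For sites not served through Apache, the same idea can be applied in the application itself. This is a minimal Python sketch, assuming a WSGI app and the same two crawler tokens as the .htaccess rule above. `is_blocked` and `BlockBotsMiddleware` are hypothetical names made up for illustration. Like the .htaccess rule, it only stops crawlers that identify themselves honestly, since a User-Agent header can be spoofed.

```python
import re

# Assumption: the same two crawler tokens as the .htaccess rule above.
BLOCKED_AGENTS = re.compile(r"ClaudeBot|GPTBot", re.IGNORECASE)


def is_blocked(user_agent):
    """Return True if the User-Agent header contains a blocked crawler token."""
    return bool(user_agent) and BLOCKED_AGENTS.search(user_agent) is not None


class BlockBotsMiddleware:
    """Hypothetical WSGI wrapper that answers blocked crawlers with 403."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the User-Agent request header as HTTP_USER_AGENT.
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return self.app(environ, start_response)
```

Wrapping an existing WSGI application is then a one-liner, `app = BlockBotsMiddleware(app)`. Rejecting at the web server, as in the .htaccess version, is cheaper because blocked requests never reach the application.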

[–] [email protected] 1 points 3 weeks ago

@[email protected] that’s fucking creepy! 😳

[–] [email protected] 1 points 3 weeks ago

@[email protected] I'm neither a lawyer nor a cybersecurity expert, just a fresh computer engineer. I'm curious what would happen if they pursued legal action against OpenAI for the downtime. OpenAI attacked their service and took them offline, causing financial loss. Seriously, why not treat it like a hack? What would a judge say when comparing OpenAI's actions to those of some kids running a DDoS campaign?

[–] [email protected] 1 points 3 weeks ago

@[email protected] This isn't "like" a DDoS attack; it IS a DDoS attack.

Virtually every early example of a modern computer attack was originally someone just messing around or making a mistake (the first virus, worm, and DoS all come to mind), and to my knowledge all of those were tried on (and many found guilty of) serious hacking charges, so why shouldn't OpenAI be? They shouldn't get to claim "well, your service should have been able to handle a DDoS" or "we're doing it for gain, though."

[–] [email protected] 1 points 3 weeks ago

@[email protected] I have to deal with the AI scraping problem at work too.

They are the worst scraping bots ever made, not only OpenAI's but those of a dozen AI startups.

[–] [email protected] 1 points 2 weeks ago

@[email protected] the AI world will be lawless. Goodbye, sanity.