this post was submitted on 02 Mar 2025
9 points (90.9% liked)

Asklemmy

A loosely moderated place to ask open-ended questions

If your post meets the following criteria, it's welcome here!

  1. Open-ended question
  2. Not offensive: at this point, we do not have the bandwidth to moderate overtly political discussions. Assume best intent and be excellent to each other.
  3. Not regarding using or support for Lemmy: context, see the list of support communities and tools for finding communities below
  4. Not ad nauseam inducing: please make sure it is a question that would be new to most members
  5. An actual topic of discussion


I've been using Lemmy since last August and I find it very useful for sharing articles.

Most of the time, articles are easy to post with no real defects. However, a few websites, such as apnewsDOTcom, return data that leaves the post with "Just a moment" as the headline and no thumbnail image. [see https://lemmy.ml/post/26652484]

Where would I look in the apnews webpages to find the data elements that need fixing, and what software would I need to enter the fixed data into Lemmy, so that the Lemmy post looks right?

Thanks in advance for any help I get for this.

[email protected] 3 points 20 hours ago (1 child)

"Just a moment" is probably the anti-bot/anti-DDoS service that particular website uses kicking in. A real browser would interact with the web page and get redirected before most people would notice, but bots and scrapers grab the lightweight HTML and run with it.
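To see which parts of a page a link preview is usually built from, here is a rough Python sketch. It assumes the scraper reads the `<title>` element and the Open Graph `og:title` / `og:image` meta tags, which is a common convention; I have not checked that this is exactly what Lemmy does. The names `PreviewParser` and `preview_fields` are made up for this example.

```python
from html.parser import HTMLParser


class PreviewParser(HTMLParser):
    """Collects the <title> text and Open Graph <meta> tags from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.og = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("property") or "").startswith("og:"):
            # e.g. <meta property="og:image" content="https://...">
            self.og[attrs["property"]] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def preview_fields(html):
    """Return the headline and thumbnail a scraper would likely pick up."""
    parser = PreviewParser()
    parser.feed(html)
    # Assumption: og:title wins over <title> when both are present.
    title = parser.og.get("og:title") or parser.title.strip()
    return {
        "title": title,
        "image": parser.og.get("og:image"),
        # Heuristic only: the challenge page's title starts with this text.
        "looks_like_challenge": title.lower().startswith("just a moment"),
    }
```

Feeding it the HTML a scraper actually received (for example, saved with a command-line download tool rather than from a browser) shows whether the site served the real article or the challenge page. A challenge page typically has only the placeholder title and no `og:image`, which matches the missing thumbnail.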

There is no good solution for this. If you're the instance admin, you could configure something like FlareSolverr to bypass bot protection pages, but Lemmy would still need to be changed so that it waits for the redirect cycle to complete instead of grabbing the first page it requests.

As a user, you can try picking links to sources that don't have this type of bot protection built in, or you could link to the AMP page, which is usually cached on Google's servers without an anti-bot system active.
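If a site publishes AMP versions, the regular article page normally advertises them with a `<link rel="amphtml" href="...">` tag in its `<head>`. You can find it by hand with "View page source" in a browser and searching for `amphtml`, or with a small sketch like the one below. Not every site has AMP pages, and I have not confirmed that this particular site does; `AmpLinkParser` and `find_amp_url` are names invented for the example.

```python
from html.parser import HTMLParser


class AmpLinkParser(HTMLParser):
    """Finds the first <link rel="amphtml"> tag in a page, if any."""

    def __init__(self):
        super().__init__()
        self.amp_url = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # rel can hold several space-separated values, so split it.
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "amphtml" in rel and self.amp_url is None:
            self.amp_url = attrs.get("href")


def find_amp_url(html):
    """Return the advertised AMP URL, or None if the page has none."""
    parser = AmpLinkParser()
    parser.feed(html)
    return parser.amp_url
```

The input should be page source saved from a real browser session, since a script fetching the page directly would likely hit the same "Just a moment" screen.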

[email protected] 1 point 17 hours ago

Thanks for the feedback. I'll look into checking for the AMP pages. (I'd never heard of those before; I haven't done much with the web for quite a while.)