this post was submitted on 25 Oct 2024
353 points (97.1% liked)

Curated Tumblr

For preserving the least toxic and most culturally relevant Tumblr heritage posts.

The best transcribed post each week will be pinned and receive a random bitmap of a trophy superimposed with the author's username and a personalized message. Here are some OCR tools to assist you in your endeavors:

Don't be mean. I promise to do my best to judge that fairly.

founded 2 years ago
[–] [email protected] 5 points 3 months ago* (last edited 3 months ago)

As per my other post, this person isn't doing any of that.

But since you asked for papers on generic matching algorithms, I found this during the silent conniption fit you sent me into when you suggested that some random Tumblr user plugged a Tumblr bot directly into a state-of-the-art genomics db.

https://link.springer.com/article/10.1007/s11227-022-04673-3

Please note that while, yes, they ran this test on a standard office computer, they were only searching against 12 million characters.

A single tebibyte of characters would be more like 1 trillion characters. A pebibyte would be more like 1 ~~quintillion~~ quadrillion.

... much, much, much longer processing times.
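To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes one character is stored as one byte, and the `scale_factor` helper is just a name I made up for illustration. It only compares data sizes; how the run time actually grows depends on the algorithm, so treat the ratio as a lower bound on "how much more stuff there is to search", not as a timing prediction.

```python
# Assumption: one character == one byte.
BASELINE_CHARS = 12_000_000  # the ~12 million characters mentioned above
TEBIBYTE = 2**40             # 1,099,511,627,776 bytes
PEBIBYTE = 2**50             # 1,125,899,906,842,624 bytes


def scale_factor(target_chars: int, baseline_chars: int = BASELINE_CHARS) -> float:
    """How many times larger the target data set is than the baseline."""
    return target_chars / baseline_chars


if __name__ == "__main__":
    for name, size in (("tebibyte", TEBIBYTE), ("pebibyte", PEBIBYTE)):
        factor = scale_factor(size)
        print(f"{name}: {size:,} characters, roughly {factor:,.0f}x the baseline")
```

So a tebibyte is on the order of ninety thousand times the test data, and a pebibyte is another 1024 times that.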

Edit: Used the wrong word for stupendously large numbers that start with q.