this post was submitted on 20 Feb 2025

datahoarder

[–] [email protected] 3 points 2 days ago

What do you really mean by documents? Do you mean something like a word processor document (docx or odt)? Those are just zip archives containing XML files with your text, plus your images.
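To make the "zip with XML inside" point concrete, here is a rough Python sketch. It builds a tiny docx-like archive in memory and reads the text back with nothing but the standard library. The sample sentence and the `extract_text` helper are made up for illustration, and the archive is deliberately incomplete (a real docx carries more parts, such as content types and relationships), so treat it as a sketch rather than a valid Word file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part of a docx.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical, stripped-down document body: one paragraph, one run of text.
document_xml = (
    f'<w:document xmlns:w="{W}"><w:body>'
    "<w:p><w:r><w:t>Hello from inside the zip</w:t></w:r></w:p>"
    "</w:body></w:document>"
)

# Write the XML into an in-memory zip at the path docx uses for its main part.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("word/document.xml", document_xml)


def extract_text(docx_file) -> list[str]:
    """Return the text of every <w:t> element in word/document.xml."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    return [node.text or "" for node in root.iter(f"{{{W}}}t")]


buffer.seek(0)
with zipfile.ZipFile(buffer) as archive:
    members = archive.namelist()  # the files stored inside the archive

buffer.seek(0)
paragraph_text = extract_text(buffer)
print(members, paragraph_text)
```

The same `extract_text` call should also work on a path to a real .docx file, which is the practical upside for archiving: even without a word processor, an unzip tool and an XML parser get you to the text.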

Or PDF, which is kind of like PostScript with raw data, usually compressed with zlib? Or EPUB, which is HTML with images in a zip?

Or something in a scanned format, which is just images, or DjVu?

Is it plain text? Then txt, tex (LaTeX or similar), md, typ (Typst), or even html.

And what do you mean by long-term storage? If you mean that some program able to open the format still exists after 50 years, arguably all of these work, but plain text requires the smallest software stack. At the end of the day, they are all effectively the same thing: text and images. The main difference is that plain text cannot hold the images and the rest in one file, and that is about it.

Do you mean something that is resistant to bit rot? Then basically all of these are bad, but plain text is the least bad: a flipped bit damages only a character or so, whereas if the file is compressed, the likelihood of recovery is lower. Some formats do have archival or recovery goals (something like PDF/A; I think even docx has something like this). You can add bit rot resistance to plain text too: just create an archive and use something like par2 to generate recovery data for it.
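A rough Python sketch of both halves of that argument, under some loud assumptions. The first part flips a single bit in plain text and in a zlib stream to compare the damage. The second part is a toy XOR parity scheme that rebuilds one lost block. It is not how par2 works internally (par2 uses Reed-Solomon codes and can repair more than one damaged block), it only shows the idea of storing redundant recovery data next to the archive. The sample text, block size, and helper names are all made up for illustration.

```python
import zlib
from functools import reduce


def flip_bit(data: bytes, index: int, bit: int = 0) -> bytes:
    """Return a copy of data with one bit flipped, simulating bit rot."""
    damaged = bytearray(data)
    damaged[index] ^= 1 << bit
    return bytes(damaged)


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Part 1: one flipped bit in plain text vs. in a compressed stream.
text = ("Plain text degrades gracefully. " * 20).encode("utf-8")
damaged_text = flip_bit(text, len(text) // 2)
changed_bytes = sum(a != b for a, b in zip(text, damaged_text))

packed = zlib.compress(text)
try:
    # zlib verifies a checksum, so a damaged stream is normally rejected outright.
    compressed_survived = zlib.decompress(flip_bit(packed, len(packed) // 2)) == text
except zlib.error:
    compressed_survived = False

# Part 2: toy parity. Pad to a multiple of BLOCK, split, and keep the XOR of all blocks.
BLOCK = 64
padded = text + b"\0" * (-len(text) % BLOCK)
blocks = [padded[i:i + BLOCK] for i in range(0, len(padded), BLOCK)]
parity = xor_blocks(blocks)

# Pretend block 3 was lost; XOR of the survivors and the parity block restores it.
lost = 3
survivors = blocks[:lost] + blocks[lost + 1:]
rebuilt = xor_blocks(survivors + [parity])

print(changed_bytes, compressed_survived, rebuilt == blocks[lost])
```

In the plain text only the one byte that was hit differs and the rest stays readable, while the compressed copy is refused as a whole. The parity block costs one extra block of storage and buys back any single lost block, as long as you know which block went missing.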

[–] [email protected] 3 points 2 days ago (1 children)

If markdown suits your needs, it's a safe bet.

[–] [email protected] 1 points 2 days ago (1 children)

Thanks :) what do you use?

[–] [email protected] 2 points 2 days ago

Markdown whenever I can. But I'm not so much concerned with long-term storage; I choose it for flexibility and ease of use.