I don't know how configurable Invidious is in terms of allow-listing content, but unless you want to locally cache the content you want them to see - which sounds like a lot of work - I guess that's the way to go.
I am collecting hard-to-find art house stuff, so I have redundancy but no backup. If your collection were easy to redownload, I'd totally use the interwebs as my backup.
I think a lot of proxy servers have that functionality; HAProxy definitely does. With nginx you need the stream module to proxy TCP (that used to be a "Plus"-only feature before it was open-sourced in nginx 1.9.0).
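For illustration, a bare-bones HAProxy TCP pass-through could look something like this (the frontend/backend names, port, and address are made up):

    # haproxy.cfg - minimal TCP proxy sketch, addresses are hypothetical
    frontend tcp_in
        mode tcp                      # raw TCP, no HTTP parsing
        bind *:5432
        default_backend tcp_servers

    backend tcp_servers
        mode tcp
        server srv1 192.168.1.10:5432 check   # health-checked upstream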
I think you are completely wrong here. Big corporations do cost assessments of security vs. the cost of security breaches; if security is more expensive than a data breach, they will accept the breach.
Unpopular opinion: don't use any of that TrueNAS, Unraid, FreeNAS, OpenMediaVault, Proxmox stuff. Sooner or later you will run into something those special-interest systems can't do, and it will feel annoyingly limiting.
Choose a widely used vanilla Linux distro, Debian or Ubuntu, install everything as Docker containers, and learn how to handle Docker Compose and how to configure stacks via docker-compose.yml.
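To give a rough idea of what such a stack looks like, here's a minimal docker-compose.yml sketch (the service, image tag, port, and volume name are just examples):

    # docker-compose.yml - minimal example stack, names are hypothetical
    services:
      nextcloud:
        image: nextcloud:latest
        ports:
          - "8080:80"                 # host port 8080 -> container port 80
        volumes:
          - nextcloud_data:/var/www/html
        restart: unless-stopped

    volumes:
      nextcloud_data:

Then "docker compose up -d" brings the whole stack up, and the same file works on any vanilla distro that runs Docker.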
I think the most advanced open-source LLM right now is considered to be Mistral 7B OpenOrca. You can serve it via the Oobabooga GUI (which lets you try other LLM models as well). If you don't have a GPU for inference, though, this will be nothing like the ChatGPT experience - it will be much slower.
https://github.com/oobabooga/text-generation-webui
You can also try these models on your desktop using GPT4All, which doesn't support GPU inference ATM.
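Roughly, getting the Oobabooga web UI running looks like this (the model name below is just an example; check the repo's README for the current install steps):

    # clone and launch text-generation-webui - a rough sketch
    git clone https://github.com/oobabooga/text-generation-webui
    cd text-generation-webui
    pip install -r requirements.txt
    # --listen exposes the UI on the LAN; pass the model you downloaded
    python server.py --listen --model mistral-7b-openorca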
You can also see OOM-killer messages with dmesg.
The question is not so much whether you have enough physical RAM, but whether your Docker management tool has established resource limits for the containers. The OOM killer will stop the process, regardless of whether there is enough free memory, if the container goes over its resource constraints.
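For example (the service name and limit here are made up), you can check the kernel log and cap a container's memory like this:

    # look for recent OOM kills in the kernel log
    dmesg | grep -i "killed process"

    # docker-compose.yml snippet - hypothetical service name and limit
    services:
      myapp:
        image: myapp:latest
        mem_limit: 512m        # the container is OOM-killed above this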
Well, that's possible with a lot of deduplicators, but I'd take a look at duff:
https://manpages.ubuntu.com/manpages/xenial/man1/duff.1.html
https://github.com/elmindreda/duff
The duff utility reports clusters of duplicates in the specified files and/or directories. In the default mode, duff prints a customizable header, followed by the names of all the files in the cluster. In excess mode, duff does not print a header, but instead for each cluster prints the names of all but the first of the files it includes.
If no files are specified as arguments, duff reads file names from stdin.
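For instance (flags per the man page above; double-check the output before piping anything to rm):

    # report clusters of duplicate files under ~/photos, recursively
    duff -r ~/photos

    # excess mode: print all but the first file of each cluster,
    # null-terminated, and delete the redundant copies (use with care!)
    duff -r0e ~/photos | xargs -0 rm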
How should a duplicate finder know which is the source of the duplicate?
Pretty much every distro has a minimal version, including Ubuntu. I think the better criteria for choosing a distro are release management, community support, and the general architecture of package management, etc.
https://wiki.ubuntu.com/Minimal
https://www.debian.org/CD/netinst/
https://wiki.archlinux.org/title/Netboot
Etc.
Nextcloud works great for document management if you additionally install Tesseract OCR and Elasticsearch. Then you can use any smartphone document scanner (I personally use "SwiftScan") to add new documents via WebDAV upload - I think most of them support WebDAV nowadays. The Nextcloud app even has a document scanner feature built in, but it's not very good.
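Roughly, wiring up the search side looks like this (app names per Nextcloud's full-text-search suite; the exact occ invocation depends on how you installed Nextcloud):

    # run as the web server user in the Nextcloud directory - a sketch
    occ app:install fulltextsearch
    occ app:install fulltextsearch_elasticsearch   # Elasticsearch backend
    occ app:install files_fulltextsearch           # index the Files app
    occ fulltextsearch:index                       # build the initial index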
I have been reading about the features of paperless-ng, and I don't see what that software additionally brings to the table that a properly set-up Nextcloud cannot do. And I have Nextcloud anyway - it can do much more than document management, and I love having all aspects of my "personal cloud" in one software tool.