I'm rebuilding my home server on NixOS.
Rather than configuring the various services natively in NixOS, I decided to run containers via virtualisation.oci-containers whenever possible, mostly to be able to update the system and the various services independently.
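For context, a container declared this way looks roughly like the following. This is a minimal sketch: the image and tag come from the log further down, while everything else about the container (ports, volumes, environment) is left out because it isn't part of this post.

```nix
{
  virtualisation.oci-containers = {
    # Use podman as the container backend.
    backend = "podman";

    containers.apprise = {
      # Image reference taken from the pull attempt in the log.
      image = "docker.io/caronc/apprise:1.1.8";
    };
  };
}
```

NixOS turns each entry into a systemd unit named after the backend and the container, here podman-apprise.service.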
Everything is going smoothly, but whenever I (for whatever reason) run nixos-rebuild boot and reboot after adding a container, instead of running nixos-rebuild switch, I run into an issue where podman isn't able to resolve the registry host (below you see the Docker Hub host, but it also happened with ghcr.io):
podman-apprise-start[1352]: Trying to pull docker.io/caronc/apprise:1.1.8...
podman-apprise-start[1352]: Pulling image //caronc/apprise:1.1.8 inside systemd: setting pull timeout to 5m0s
podman-apprise-start[1352]: Error: initializing source docker://caronc/apprise:1.1.8: pinging container registry registry-1.docker.io: Get "https://registry-1.docker.io/v2/": dial tcp: lookup registry-1.docker.io: no such host
I thought that my podman-* services were missing a dependency on network-online.target and were being started before the network was available, but that isn't the case:
# systemctl list-dependencies podman-apprise.service
podman-apprise.service
● ├─system.slice
● ├─network-online.target
● │ └─systemd-networkd-wait-online.service
● └─sysinit.target
● ├─dev-hugepages.mount
[...snip...]
Do you happen to know what the issue is?
PS: Manually running systemctl start podman-whatever once fixes the issue, of course, but I wonder if there's a more robust solution?
update:
After investigating based on balsoft's input below, the issue seems to be that systemd-networkd-wait-online doesn't behave as I expected.
Basically, by default systemd-networkd-wait-online waits for network interfaces to have a carrier (a working Ethernet cable) and an IP address (even just a link-local one). This is what the systemd-networkd docs call the "degraded" state (no, it doesn't mean that something got worse than before... don't think too much about what "degraded" implies in English).
In my case, I have an interface that is set up via DHCP and that also has static IPs assigned:
$ cat /etc/systemd/network/00-lan1.network
[Match]
Name=lan1
[Network]
DHCP=ipv4
IPv6AcceptRA=no
LinkLocalAddressing=no
[Address]
Address=192.168.10.10/24
[Address]
Address=192.168.10.99/24
If you are wondering, the reason I do this is that I want static IPs for my DNS server and reverse proxy, but I also want my home server to use DHCP to fetch some network-wide configuration which, critically, includes the default route.
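On NixOS a file like this can also be generated from the system configuration instead of being written by hand. A sketch of what the equivalent could look like, assuming systemd-networkd is enabled through systemd.network.enable and the unit keeps the name "00-lan1":

```nix
{
  systemd.network.enable = true;

  # Rendered to /etc/systemd/network/00-lan1.network
  systemd.network.networks."00-lan1" = {
    matchConfig.Name = "lan1";

    networkConfig = {
      DHCP = "ipv4";
      IPv6AcceptRA = false;
      LinkLocalAddressing = "no";
    };

    # Static addresses kept alongside the DHCP lease.
    address = [
      "192.168.10.10/24"
      "192.168.10.99/24"
    ];
  };
}
```

One difference from the hand-written file: the address option emits Address= lines in the [Network] section rather than separate [Address] sections, which networkd treats the same way for plain addresses.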
Back to the issue: IIUC, since the interface has a non-link-local address (which systemd-networkd confusingly calls a "routable" address), it is immediately considered "routable" (a state that is moar better than "degraded"), even before DHCP has delivered the default route. So not only does it satisfy the default systemd-networkd-wait-online configuration right away, but even adding
[Link]
RequiredForOnline=routable
to /etc/systemd/network/00-lan1.network doesn't make any difference whatsoever.
For now, my stopgap solution is to explicitly set the default route for the "lan1" network:
[Network]
Gateway=192.168.10.1
This seems to solve the issue with podman and, while the system still thinks it is "online" before being fully configured, it will suffice until I find a more elegant/robust way (ping me in a while if you are interested).
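Expressed as NixOS configuration, this stopgap could be a one-attribute addition. A sketch, assuming the lan1 network is already declared elsewhere in the configuration under the name "00-lan1" (with its [Match] section), so that the module system merges this attribute into it:

```nix
{
  # Static default route for lan1, so that name resolution and
  # image pulls work even before the DHCP lease has arrived.
  # Rendered as Gateway=192.168.10.1 in the [Network] section.
  systemd.network.networks."00-lan1".gateway = [ "192.168.10.1" ];
}
```

The obvious trade-off is that the gateway address is now duplicated between the DHCP server and the host configuration.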
refs:
systemd-networkd-wait-online man page
systemd-networkd docs on "RequiredForOnline"
networkctl man page
I too experimented with k3s, but then abandoned the idea of using it after I realized the proper way to run Postgres on it was (IIUC) to use Bitnami's Helm chart. I like to have some level of understanding of how my homelab and its config work, and that humongous amount of unreadable templates was not appealing in the least.
As for containers, I am not really looking for service isolation (IIUC until #368565 lands, all virtualisation.oci-containers containers basically run as root, and I'm fine with that *)... I just want to be able to run different versions of services than those packaged in nixos (usually more recent ones, but in nixos one also can't easily "pin" an older version of a package if the need arises **). Also, not all services I want to run are available as nixos packages, and even fewer have modules.
* I know what risk I'm running (more or less): nothing in my homelab is accessible from outside my lan and, even if the container host was somehow pwned, that machine can't really do much harm (the important stuff is on a separate one).
** I guess I could import an older version of nixpkgs in my flake, but that requires way too much editing just to pin a package (time I'd rather spend solving the actual issue).
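For reference, the "older nixpkgs in my flake" approach usually looks something like the sketch below. All names here are placeholders and not taken from my setup: the nixpkgs-old input, the branches it and nixpkgs point to, the homeserver host name, the x86_64-linux system and somePackage are purely illustrative.

```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    # Hypothetical second input providing the older package set.
    nixpkgs-old.url = "github:NixOS/nixpkgs/nixos-24.05";
  };

  outputs = { self, nixpkgs, nixpkgs-old, ... }: {
    nixosConfigurations.homeserver = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        ./configuration.nix
        {
          # Replace one package with the build from the older nixpkgs.
          nixpkgs.overlays = [
            (final: prev: {
              somePackage =
                nixpkgs-old.legacyPackages.${prev.stdenv.hostPlatform.system}.somePackage;
            })
          ];
        }
      ];
    };
  };
}
```

It works, but it is a second input, an overlay and a lock file entry for the sake of one package, which is the amount of editing I was complaining about.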