2025-11-01
Talos: recovering from broken network interface
While playing with my homelab Talos cluster, I accidentaly broke the network interface. As my Raspberry Pi has only one interface, that meant that my device was unreachable over the network. To make matter worse, the node I broke was the control plane, and I wanted to preserve as much data as possible.
To recover from this situation, I was left with no other option, but to boot the device from an SD card. When Talos is installed and first boots up, it creates a couple of partitions on the install disk (nvme0n1 in my case):
rpi@rpi1:~$ lsblk -o NAME,LABEL,SIZE,MOUNTPOINT
NAME LABEL SIZE MOUNTPOINT
mmcblk0 29.2G
├─mmcblk0p1 bootfs 512M /boot/firmware
└─mmcblk0p2 rootfs 28.6G /
nvme0n1 476.9G
├─nvme0n1p1 EFI 1.1G
├─nvme0n1p2 1M
├─nvme0n1p3 STATE 100M
├─nvme0n1p4 EPHEMERAL 50G
└─nvme0n1p5 425.8G
The one that I’m interested in is the STATE partition, which contains the Talos configuration (but not etcd which is stored in EPHEMERAL).
A simple fix is to “clean” the STATE partition and let Talos recreate it on the next boot. This can be done with the following command:
sudo mkfs.ext4 -L STATE /dev/nvme0n1p3
After that, I just needed to reboot the device (removing the SD card before) and apply valid configuration to the node with --insecure flag:
talosctl apply-config --nodes <node> --endpoints <node> --file <file> --insecure
Easy-peasy! If only all problems were this easy to solve :)