GPU K8s Cluster

The Setup

Picked up a TuringPi carrier board.
One of their RK1 boards and 3 Nvidia Orin boards.
Picked up 10Gb NICs for the M.2 slots.
Have a 4TB SATA drive.
Have a 5TB NVMe NAS.

The thought

Set up a Talos Linux cluster.
Run Talos to manage k8s.
Configure Longhorn to run as the backend k8s storage.
Set up the NAS for iSCSI.

Provision the RK1 Chip

Provisioning the RK1 was fairly straightforward using this guide. It’s about 3 steps: download the image, flash the board, apply the config updates. https://github.com/nberlee/talos
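
Those steps boil down to a handful of commands. Here is a rough sketch that only builds the command lines and prints them, without running anything. It assumes the image is already downloaded, that flashing goes through the TuringPi `tpi` CLI, and that the config is applied with `talosctl`. The slot number, image filename, IP address, and config filename are all placeholders, so check the guide for the exact invocation.

```python
import shlex


def provisioning_commands(node_slot, image_path, node_ip, config_path):
    """Build the flash, power-on, and apply-config commands as strings.

    Nothing is executed here; the caller decides whether to run them.
    """
    # Flash the downloaded image onto the module in the given slot.
    flash = ["tpi", "flash", "-n", str(node_slot), "-i", image_path]
    # Power the slot on so the node boots into Talos maintenance mode.
    power_on = ["tpi", "power", "on", "-n", str(node_slot)]
    # Push the machine config to the freshly booted node.
    apply_config = [
        "talosctl", "apply-config", "--insecure",
        "--nodes", node_ip,
        "--file", config_path,
    ]
    return [shlex.join(cmd) for cmd in (flash, power_on, apply_config)]


# Placeholder values, not taken from the guide.
for line in provisioning_commands(1, "talos-rk1.raw", "192.0.2.10", "controlplane.yaml"):
    print(line)
```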

Where this goes off the rails is having a 10Gb NIC in the M.2 slot. When booting with the nberlee fork it’s great: both NICs (the TuringPi one and the 10Gb) get IPs. Longhorn needs iscsi-tools, so you go to apply the Talos feature overlay. That seems to go fine until the reboot, and now you’ve lost a NIC.
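
To pin down which interface actually disappears, it helps to write down the link names before and after the reboot and diff them. A tiny sketch, assuming the names are copied by hand (for example from `talosctl get links` output). The interface names below are made up, not what the RK1 actually reports.

```python
def missing_links(before, after):
    """Return interface names seen before the reboot but gone afterwards."""
    return sorted(set(before) - set(after))


# Hypothetical interface names for illustration only.
before_reboot = ["eth0", "enP1p1s0"]
after_reboot = ["eth0"]

print(missing_links(before_reboot, after_reboot))
```

If the list comes back empty, the link is still there and the problem is more likely addressing than the device itself.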