Deploying Home Assistant on k3s with TLS

Photo: summer sunflower (Pana GX7, Olympus 60mm f2.8 macro, NSW Australia, Jan 2024)

I successfully moved my Home Assistant server from a Raspberry Pi into my kubernetes cluster using the kubernetes resource definitions in [drpump/k3s-home-assistant](https://github.com/drpump/k3s-home-assistant/). TLS is configured, so my Home Assistant instance is now accessible over HTTPS at a standard, host-routed URL.

I’ve been running Home Assistant on a Raspberry Pi via docker compose for a few years now. It has been pretty reliable, but I worried about SD card storage, security (no TLS), lack of fail-over support, and performance as the number of devices and sensors in our home network increases. During our summer holidays I made it a goal to move my Home Assistant into my k3s cluster. After lots of googling I realised that this wasn’t an especially common thing to do: detailed instructions and code were sparse, and what existed made assumptions that weren’t valid for me.

So here’s my version.

For the impatient, grab a copy of [drpump/k3s-home-assistant](https://github.com/drpump/k3s-home-assistant) and follow the instructions. For a bit more context and detail, read on.

Context

About a year ago I added a mini PC with lots of RAM, a fast NVME drive and decent reliability to my home network. I installed k3s and I have been gradually familiarising myself with k3s and kubernetes in general. I’m also building up a set of core platform services to support application deployment. More specifically:

  • Rancher’s Longhorn distributed filesystem is deployed and configured as the default storage provisioning system, giving me distributed (on one node :) storage and automatic backups to my NAS.
  • I have cert-manager working with a subdomain of my personal domain, including a public http01 endpoint to handle the ACME requests used by LetsEncrypt to validate domain ownership. This allows me to automatically provision LetsEncrypt certificates and hostname-based routing for services that I deploy using kubernetes ingress resources.
  • My local DNS is configured to resolve a subdomain to local (private) addresses, so I can browse to https://myserver.k3s.my.domain in my home network and get browser-accepted TLS without having to deal with self-signed certs.
  • I have a Mosquitto MQTT server in the k3s cluster with TLS enabled on the MQTT protocol using Traefik IngressRouteTCP resources.
  • I have a free Datadog account configured to notify me if my home server is inaccessible (up to 5 hosts can be monitored for free). This allows me to monitor my k3s host without any dependency on power or network connectivity at home. The flip side is that I get persistent notifications when we have a power or network outage, but since these are infrequent I’m happy to live with that compromise.

More detail about the DNS and cert-manager configuration is documented in my previous blog article Deploying Mosquitto MQTT to k3s with Traefik TLS ingress.

Why kubernetes?

Home Assistant has a variety of installation methods documented in its extensive web pages. Nowadays the preferred method is Home Assistant OS on Raspberry Pi devices: a custom Linux distribution optimised for Home Assistant and associated add-ons. You can also install the distribution on x86 machines and a few other single-board computers. It likes to have direct access to the network for device discovery, and it’s also helpful to have access to hardware if you plan to use directly-connected peripherals and Raspberry Pi “hats”.

This didn’t suit me for a number of reasons:

  • I made a conscious decision early on to keep direct hardware add-ons to a minimum and use network-based controllers instead, for example, ESPHome devices for proximity sensors. This limits the scope of downtime in the case of a hardware failure.
  • I am wary of SD card storage and didn’t particularly want to build up a Pi or other dedicated hardware with SSD storage. I’ve been worried about my Pi storage reliability for a while now, having already experienced one episode of corruption.
  • As we increasingly depend on our home automations, I wanted the ability to bring my Home Assistant back up on alternate hardware quickly. Home assistant is not really designed for fail-over, but we can get close with a distributed filesystem and containerized execution.

Enter kubernetes. It provides a standard compute platform and, with CSI, a standard storage platform that can be used to move containerized, stateful workloads safely between nodes. Off-the-shelf components also add standard solutions for host-based routing, load balancing and automated management of free, real (not self-signed) certificates through LetsEncrypt.

Additionally, as a software engineer, I already run a home kubernetes cluster for self-guided learning and personal research/programming projects. Having all of my home-based services running on a single, managed platform simplifies my administration tasks.

Why k3s?

There are a few lightweight kubernetes implementations suitable for a home network, but most are limited to a single node or require non-trivial hardware to run. k3s from Rancher is lightweight enough to run on limited hardware, simple to install, supports multiple nodes and high-availability configurations, and works well with the Longhorn distributed filesystem, also from Rancher.

Right now I have a single node in my k3s cluster, but I plan to add more. Once my Raspberry Pi is freed from explicit Home Assistant duties, it will be the first addition to my cluster. I can add the Pi without relying on the SD card for any permanent storage, and it is powerful enough to run light-to-medium workloads.

Deployment

Kustomize

There are a few different ways to build and use the kubernetes resource definitions required to deploy an application. The simplest for a one-off deployment is to just hand-code resources, but sharing and re-use become a cut/paste/edit process that can be both tedious and error-prone.

Helm is convenient and common for widely-used kubernetes packages. You will likely find a helm chart to deploy home assistant in kubernetes if you go looking. Helm is based on templates, however, which can be quite difficult to write and debug, and charts typically limit the scope of customisation. For widely-used packages this is appropriate. For smaller efforts the overheads are high.

Kubernetes now offers a complementary alternative to helm called kustomize. I say “complementary” because you can use kustomize and helm together. Kustomize uses a base+overlay model that allows a set of base resource definitions to be customized by “patching” them with an overlay. Apart from the relative simplicity, a key advantage is that this approach retains the declarative nature of kubernetes resource definitions, meaning you get consistent and deterministic semantics when deploying resources. You can also deploy easily without additional tools using `kubectl apply -k <dir>`, where `<dir>` contains your overlay resource definitions.

I chose to use kustomize for the source presented in [drpump/k3s-home-assistant](https://github.com/drpump/k3s-home-assistant). I like the relative simplicity and the more consistent, deterministic semantics. It also reduced the effort of supplying a re-usable source code base for others to build upon. To be frank, I also dislike templating in general: combining two different semantic models (kubernetes + Go templates in helm’s case) for any non-trivial configuration leads to semantic interference effects, making it increasingly difficult to write and debug as complexity grows. PHP anyone?
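To make this concrete, here’s a minimal sketch of an overlay kustomization.yaml. The directory layout and patch file name here are hypothetical, not lifted from my repository:

    # kustomization.yaml — a hypothetical overlay directory
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization

    # pull in the shared base resource definitions
    resources:
      - ../base

    # strategic-merge patches applied over the base, e.g. to set
    # your own hostname or storage class
    patches:
      - path: statefulset-patch.yaml

Running `kubectl apply -k` on the overlay directory merges the patches over the base and applies the result in one step.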

Solving the networking puzzle

Home assistant uses mDNS (multicast DNS) for local device discovery. This allows it to find new devices that are accessible on the home network and also assists in integration with Apple HomeKit. HomeKit integration in particular is very useful: Apple has made the effort to securely manage home devices remotely using Apple IDs, which significantly reduces the effort required to manage and secure home assistant for use by all household members. The Apple Home application is also friendly and quick to access from an Apple mobile device.

By default, Kubernetes runs compute loads (pods) on a cluster-internal 10.x.x.x network with no direct access to external networks. mDNS multicast is also not transmitted outside the cluster-internal network. Clearly this won’t work for discovery of network devices in home assistant or the desired integration with Homekit without some network adjustments. There are a few options here:

  1. try to configure HomeKit without mDNS. There are a few discussion threads on this topic but I wasn’t able to find anything definitive. In my experience with this kind of network hacking, it’s also hard to know if you’ve got the networking right for all cases, leading to inconsistent behaviour.

  2. find a way to pass mDNS broadcasts through from the cluster-internal network to the home network. This is probably the “right” way but requires non-trivial network configuration. Multus is mentioned in a few online discussion threads: it allows you to connect kubernetes pods to multiple networks, but I got buried pretty quickly in the detail and the required modifications to k3s. In addition, this approach requires a kubernetes CNI plugin to connect to VLANs (e.g. macvlan) or a bridge network.

  3. use hostNetwork: true in the pod definition, which allows the home assistant pod to directly access the host node’s network. This is the simplest method but raises security concerns as discussed in this article (old but still covers the fundamental issues). It’s also a bit of an ugly, brute-force approach.

Since I’m not enough of a cluster networking guru to go hacking the k3s network config, I chose the hostNetwork: true option. I use other mechanisms to address the security risks (e.g. running the container with a non-root UID), and I might revisit this in future if the networking configuration required for option 2 becomes more mainstream. Note that I tested without hostNetwork: true to confirm that discovery would not work, and also tried some simple k3s networking configuration changes before making this decision.
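For illustration, the relevant fragment of the pod template looks something like the following. The non-root UID/GID values are examples rather than what my repository necessarily uses:

    spec:
      # share the node's network namespace so mDNS multicast
      # reaches the home network
      hostNetwork: true
      # mitigate some of the host-network risk: don't run as root
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000     # example UID
        runAsGroup: 1000    # example GID
      containers:
        - name: home-assistant
          image: ghcr.io/home-assistant/home-assistant:stable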

Kubernetes Deployment vs StatefulSet

Many of the published example installations of home assistant in kubernetes use a Deployment resource. This resource type is intended to ensure that a specified set of application pods is operational and in a healthy state. Storage for these pods is typically provisioned dynamically through a kubernetes persistent volume claim. A key aspect of Deployments is that their volumes are considered to have the same lifecycle as the associated pod, meaning that the platform is free to reclaim the volume storage when the pod is stopped or replaced. Thus you risk your home assistant config and database being reclaimed if you scale down to perform an upgrade or other maintenance.

If you use the host filesystem for storage in k3s and some other local kubernetes implementations, this reclaiming of volumes does not occur. Hence many of the published examples “work” with a Deployment resource. There are a number of downsides, however:

  • host filesystem storage means you need to “pin” the home assistant pod to a specific host to ensure that home assistant always starts with the same configuration and database files. You can’t move your pod to another host without manually moving the filesystem content or using NFS.
  • you can’t take advantage of kubernetes volume management tools, for example, snapshots.
  • in my case, I would lose the automated backup and restore capabilities that the Longhorn distributed filesystem offers for volumes.
  • your home assistant doesn’t “own” its storage, with potential for issues if you inadvertently modify the host filesystem.

The kubernetes StatefulSet resource is similar to a Deployment but maintains volumes and other pod-related kubernetes state for the lifetime of the StatefulSet, ensuring that a new pod uses the same volume as any previous pod with the same index (pods are numbered 0..N). This makes it an appropriate choice for any application that maintains local database files that should be retained across restarts and scaling events, and a better fit for home assistant than a Deployment with local filesystem storage for the reasons noted above.

See also the [StatefulSet definition](https://github.com/drpump/k3s-home-assistant/blob/main/base/statefulset.yaml) for my home assistant installation.
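To show the shape of it, here’s a trimmed sketch of a StatefulSet with a volume claim template; the names and storage size are illustrative, and the real definition in the repository has more detail:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: home-assistant
    spec:
      serviceName: home-assistant
      replicas: 1
      selector:
        matchLabels:
          app: home-assistant
      template:
        metadata:
          labels:
            app: home-assistant
        spec:
          containers:
            - name: home-assistant
              image: ghcr.io/home-assistant/home-assistant:stable
              volumeMounts:
                - name: hass-config
                  mountPath: /config   # config + sqlite database live here
      # volumes created from this template survive pod restarts and,
      # with a distributed filesystem like Longhorn, node moves
      volumeClaimTemplates:
        - metadata:
            name: hass-config
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: longhorn   # assumes Longhorn as provisioner
            resources:
              requests:
                storage: 5Gi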

Certificates and TLS

For other HTTP services deployed into my home cluster, I am able to use a standard kubernetes Ingress resource to configure TLS endpoints with automatically created/renewed certificates on my DNS subdomain. This works nicely because those services are not exposed outside the cluster-internal network and the Ingress effectively creates an externally-accessible TLS reverse proxy for those services. The reverse proxy also allows my services to be moved between cluster nodes without affecting external clients.

The use of hostNetwork: true complicates this arrangement because the default home assistant port (8123) is exposed directly, without TLS, on the host node where the home assistant pod is running. A standard Ingress resource would route to this already-exposed endpoint using HTTP rather than HTTPS, removing any security advantage of using TLS on the ingress.

I addressed this by explicitly creating a certificate using cert-manager and making it available for home assistant to use for TLS handshaking on port 8123. Steps were as follows:

  1. I used a cert-manager Certificate resource to create a TLS certificate for the hostname I was planning to use for home assistant, and made sure that the hostname resolved to my k3s node IP address(es). See the Certificate resource definition for the gory detail. Further details of the DNS and cert-manager configuration required to support this are captured in my previous blog post on MQTT. As noted in the repository README, I would strongly recommend first testing your cert-manager configuration against the LetsEncrypt staging services.
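For reference, a cert-manager Certificate for this purpose looks roughly like the following; the hostname and issuer name are placeholders, while the secret name matches the one mounted in the next step:

    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
      name: hass-prod-cert
    spec:
      # cert-manager writes the signed certificate and key into this Secret
      secretName: hass-prod-cert
      dnsNames:
        - hass.k3s.example.com       # placeholder hostname
      issuerRef:
        name: letsencrypt-prod       # placeholder ClusterIssuer name
        kind: ClusterIssuer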

  2. I configured the StatefulSet resource to mount the kubernetes Secret generated by cert-manager on the /ssl path in the home assistant pod:

        volumeMounts:
          - name: hass-certs
            mountPath: /ssl              # home assistant reads tls.crt/tls.key from here
      volumes:
        - name: hass-certs
          secret:
            secretName: hass-prod-cert   # Secret created by cert-manager

See also [the full context in statefulset.yaml](https://github.com/drpump/k3s-home-assistant/blob/main/base/statefulset.yaml#L70-L75).

  3. I added a few lines to the home assistant configuration to instruct it to use the certificate, and also to allow me to add an ingress proxy (see below):

        http:
          use_x_forwarded_for: true
          trusted_proxies:
            - 10.42.0.0/16             # k3s cluster-internal pod network
          ssl_certificate: /ssl/tls.crt
          ssl_key: /ssl/tls.key

The last step could only be completed after the StatefulSet was initially deployed because it required updates to the storage volume used for the config and database. See [step 5 in the README](https://github.com/drpump/k3s-home-assistant/blob/main/README.md#usage) for detailed instructions.

So now my home assistant server is secured with TLS, but I still have a routing problem if there are multiple nodes in the cluster and my home assistant pod is moved. We’re also still using port 8123 despite having a specific hostname for home assistant. Fortunately, the Traefik ingress controller installed with k3s allows us to define a custom ingress resource that proxies for the service and passes through the TLS handshaking. The home assistant http configuration above ensures that home assistant will accept the proxied requests. See the [IngressRouteTCP definition](https://github.com/drpump/k3s-home-assistant/blob/main/base/ingressroute.yaml) for the details.
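In outline, such a route looks something like this. The entry point, Service name and hostname are illustrative (and the apiVersion may differ with your Traefik release), so check the repository definition rather than copying this verbatim:

    apiVersion: traefik.io/v1alpha1
    kind: IngressRouteTCP
    metadata:
      name: home-assistant
    spec:
      entryPoints:
        - websecure                  # Traefik's port 443 entry point
      routes:
        # match on the TLS SNI hostname, since the HTTP layer is encrypted
        - match: HostSNI(`hass.k3s.example.com`)
          services:
            - name: home-assistant   # assumed Service fronting the pod
              port: 8123
      tls:
        # pass the TLS handshake through to home assistant's own certificate
        passthrough: true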

The end result is that our home assistant instance can be accessed using HTTPS on port 443 using the specified hostname (e.g. https://hass.k3s.example.com) independent of where the home assistant pod is running in the cluster.

Wrapping up

I hope you found this article useful and informative. Feel free to comment or contact me for related questions. I’ll be continuing to work on my home cluster and home assistant server, so if you share these interests look out for future articles.

Written on January 5, 2024