Homelab
A small ops sandbox at home — Ansible, Docker, Prometheus, WireGuard
My day job has long involved running infrastructure at scale — large cloud deployments, platform teams, on-call rotations. The homelab is where I keep my hands dirty: a small, real environment where I run the same kinds of tools and practices myself, end to end. It keeps me current with what teams I lead actually deal with day to day, gives me a sandbox to try ideas before recommending them, and is genuinely fun to tinker with on a weekend.
Everything that can be expressed as code lives in a private git repo. Configurations, dashboards, router settings, and even encrypted secrets are version-controlled. The goal is reproducibility — if any host dies, I should be able to bring back its role with a single Ansible run plus a data restore.
What runs here
- Networking: a Ubiquiti EdgeRouter handling routing, DHCP, DNS forwarding, VLANs, and NetFlow export. A separate VLAN segregates IoT devices from the trusted LAN.
- Compute: a handful of Raspberry Pis for single-purpose workloads, and a small set of VMs on an Intel NUC running VMware ESXi.
- Cloud VPSes: a small DigitalOcean droplet acting as a WireGuard hub and SOCKS proxy egress, plus a Hetzner VPS that hosts sankara.net.
- Monitoring: Prometheus, Alertmanager, blackbox_exporter, snmp_exporter, and node_exporter on every host; Grafana with dashboards for system health, energy use, and network flows; Slack and Healthchecks.io for alerting.
- Time-series & automation: InfluxDB and Telegraf for metrics; Home Assistant for home automation.
- Network services: Pi-hole for ad-blocking DNS, nginx as an SSL-terminating reverse proxy, Heimdall as a service dashboard.
- Self-hosted apps: Vaultwarden (passwords), Plex (media), Wiki.js, Karakeep (bookmarks), GitLab.
- Sensors: a Modbus energy meter feeding InfluxDB, and an ADS-B receiver feeding FlightAware, Flightradar24, and ADSB Exchange.
- Connectivity: WireGuard for site-to-cloud, plus Tailscale on a few hosts for remote access.
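The site-to-cloud link in that list boils down to a small wg-quick config on the gateway VM. A minimal sketch, with placeholder addresses, keys, and endpoint rather than the real values:

```
# /etc/wireguard/wg0.conf on the home gateway VM (sketch; all values are placeholders)
[Interface]
Address = 10.9.0.2/24
PrivateKey = <gateway-private-key>

[Peer]
# The DigitalOcean droplet acting as the hub
PublicKey = <droplet-public-key>
Endpoint = droplet.example.net:51820
AllowedIPs = 10.9.0.0/24
# Keep the tunnel open through the home NAT
PersistentKeepalive = 25
```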
How it's wired together
Ansible is the single source of truth. Hosts are organised into groups (Raspberry Pis, x86 Ubuntu VMs, FreeBSD, mail relays, Docker hosts) and each group inherits the right baseline. Playbooks come in three tiers: a common package set on every host, group-specific installs (Docker, postfix relay, SSL certificates), and host-specific roles (energy monitoring, monitoring stack, VPN gateway, GitLab, Plex). Docker-compose files live in the repo per host, and a deploy playbook syncs and restarts services.
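Concretely, the repo layout looks something like this. The file and group names are illustrative, not the real inventory:

```
inventory/
  hosts.yml            # groups: raspberry_pis, ubuntu_vms, freebsd, mail_relays, docker_hosts
playbooks/
  common.yml           # tier 1: baseline packages and config on every host
  group_docker.yml     # tier 2: per-group installs (Docker, postfix relay, certs)
  host_gitlab.yml      # tier 3: per-host roles (GitLab, Plex, VPN gateway, …)
docker/
  <hostname>/docker-compose.yml   # synced and (re)started by the deploy playbook
```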
Secrets stay in git, encrypted with SOPS and an age key. Anyone with the repo can read the structure of what's deployed; only the key holder can decrypt the actual values. Templates render the secrets in at deploy time and the decrypted file is wiped after the run.
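The SOPS side of this amounts to a short `.sops.yaml` at the repo root. A sketch, with a placeholder age recipient:

```yaml
# .sops.yaml (sketch; the age public key is a placeholder)
creation_rules:
  - path_regex: secrets/.*\.ya?ml$
    age: age1qqqqplaceholderrecipientkey
```

With that in place, `sops -e -i secrets/vault.yml` encrypts the values in place; anyone browsing the repo sees the YAML keys and ciphertext, and only the holder of the age private key can decrypt.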
A few things I find satisfying
Pushing a VPN route over DHCP
The DigitalOcean droplet is reachable through a WireGuard tunnel from a gateway VM at home. To save every client from learning that route manually, the EdgeRouter advertises it via DHCP option 121 (RFC 3442 classless static routes). The RFC is precise here: when option 121 is present, clients must ignore option 3 (the router option), so the default route has to be re-added inside option 121 alongside the WireGuard route. systemd-networkd follows the RFC to the letter and will silently leave a client without a default route if you forget. A one-line dnsmasq config on the router takes care of it.
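The whole trick fits in one dnsmasq line. The addresses below are placeholders: first the WireGuard subnet via the gateway VM, then the default route via the router itself:

```
# dnsmasq encodes option 121's classless-route wire format itself.
# Second route pair re-adds the default, since clients ignore option 3.
dhcp-option=121,10.9.0.0/24,192.168.1.5,0.0.0.0/0,192.168.1.1
```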
Wildcard certificates over a tunnel
Internal services on the home LAN are fronted by nginx with a wildcard Let's Encrypt certificate. The catch: my DNS provider's API only accepts requests from a small list of whitelisted source IPs, and a residential connection isn't one of them. The DigitalOcean droplet has a static IP that is on the list, so a renewal script opens an SSH SOCKS tunnel through the droplet and runs acme.sh with the DNS-01 challenge through that proxy. Ansible then distributes the renewed cert + key to every host running nginx and reloads the service.
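The renewal script reduces to a few lines. This is a sketch with placeholder hostnames and a placeholder DNS hook name, and it assumes acme.sh's underlying curl honours the `https_proxy` variable:

```sh
#!/bin/sh
# Open a background SOCKS tunnel through the droplet (whitelisted static IP).
ssh -f -N -D 127.0.0.1:1080 acme@droplet.example.net

# socks5h:// proxies DNS resolution too, so the provider's API only ever
# sees the droplet's address.
export https_proxy=socks5h://127.0.0.1:1080

# DNS-01 challenge for the wildcard; dns_myprovider is a placeholder hook.
acme.sh --issue --dns dns_myprovider -d example.net -d '*.example.net'
```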
Backups, layered
I think of backups in three layers and try to make sure no single failure can take out more than one.
Config to git. A nightly job exports every Grafana dashboard, datasource, and alert rule as JSON and commits the diff. Another pulls the EdgeRouter's config.boot, CLI commands, and a custom dnsmasq snippet. A router replacement or a Grafana restore is mostly a checkout away.
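The Grafana half of that job is a small loop over the HTTP API. A sketch, with the host, token, and output path as placeholders:

```sh
GRAFANA=https://grafana.lan
AUTH="Authorization: Bearer <api-token>"

# /api/search lists dashboards; /api/dashboards/uid/<uid> returns each one.
curl -fsS -H "$AUTH" "$GRAFANA/api/search?type=dash-db" | jq -r '.[].uid' |
while read -r uid; do
  curl -fsS -H "$AUTH" "$GRAFANA/api/dashboards/uid/$uid" |
    jq '.dashboard' > "dashboards/$uid.json"
done
git add dashboards && git commit -m "nightly grafana export" || true
```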
Host data to the NAS, with Borg. Deduplicated, compressed, encrypted. Each host runs a nightly backup of its home directory to a Borg repository on the NAS (mounted via NFS where possible, or pushed over SSH where not — WSL hosts back up over SSH to a Linux host that has the NFS mount). Retention is the standard 7 daily / 4 weekly / 6 monthly.
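Per host, the nightly run is essentially two Borg commands. A sketch with a placeholder repository path (encryption is a property set when each repo was initialised):

```sh
# NFS-mounted repo; ssh:// URLs where the host pushes instead.
export BORG_REPO=/mnt/nas/borg/$(hostname)

# Deduplicated, compressed archive of the home directory.
borg create --compression zstd --stats ::'{hostname}-{now:%Y-%m-%d}' "$HOME"

# Enforce the 7 daily / 4 weekly / 6 monthly retention.
borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6
```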
NAS off-site, three ways. The NAS itself is the source of truth for media and personal files, so it backs up in three independent directions. Synology Hyper Backup pushes a daily snapshot of system shares to a secondary Synology unit on the LAN (separate disks, separate enclosure). A second Hyper Backup task pushes the media library to AWS S3. And duplicacy writes client-side-encrypted, deduplicated repositories to Backblaze B2 for personal data and media — different cadences for different folders. On top of all that, DSM keeps daily Btrfs share snapshots (and hourly snapshots on the personal-data shares) so an accidental delete is almost always recoverable in seconds without touching any backup.
Trust comes from monitoring. Every scripted backup job pings Healthchecks.io on start and on completion, reporting its exit status; a job that goes quiet is escalated to me before I would ever notice the silence myself. Restores have been tested often enough to be boring, which is the only state a backup ever wants to be in.
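The wrapper around each job follows the standard Healthchecks.io pattern. The check UUID and the job script name below are placeholders:

```sh
HC='https://hc-ping.com/<check-uuid>'

curl -fsS -m 10 --retry 3 "$HC/start" > /dev/null
/usr/local/bin/nightly-backup.sh          # the actual job (placeholder name)
rc=$?
# Appending the exit status reports it: /0 passes the check, anything else fails it.
curl -fsS -m 10 --retry 3 "$HC/$rc" > /dev/null
```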
An energy meter that just keeps writing
A Raspberry Pi sits next to the breaker panel, reads a Modbus energy meter over RS-485, and writes to a local InfluxDB and to InfluxDB Cloud. A Healthchecks.io ping at the end of each loop tells me if the script has been silently dead for more than a few minutes. Years of household consumption now sit in a time-series database I actually own.
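One loop iteration looks roughly like this, sketched with the mbpoll CLI. The register number, serial settings, output parsing, bucket, and tokens are placeholders standing in for my meter's actual register map:

```sh
# Read one holding register from slave 1 over RS-485 (-1 = poll once).
WATTS=$(mbpoll -m rtu -b 9600 -a 1 -t 4 -r 100 -c 1 -1 /dev/ttyUSB0 |
        awk '/\[100\]/ {print $NF}')

# Write a point in Influx line protocol to the local InfluxDB v2 instance.
curl -fsS -XPOST 'http://influxdb.lan:8086/api/v2/write?org=home&bucket=energy' \
  -H 'Authorization: Token <token>' \
  --data-binary "power,meter=main watts=${WATTS}"

# Tell Healthchecks.io this iteration completed.
curl -fsS -m 10 'https://hc-ping.com/<check-uuid>' > /dev/null
```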
ADS-B as a quiet hobby
A spare Pi with a small antenna runs PiAware, readsb, and tar1090 and feeds aircraft positions to a few public networks. A local web map shows what's overhead. It's the kind of project that costs almost nothing to keep running and quietly produces something interesting every day.
Why bother
Reading post-mortems and design docs is no substitute for running a thing yourself. The homelab keeps me close enough to the tools (Ansible, Prometheus, SOPS, WireGuard, nginx, Borg) that I can ask the right questions in design reviews and call out the failure modes I have actually hit. It is small enough to keep tidy, and large enough to keep teaching — and that is the bit that translates back to leading engineering teams well.