Goal: Automate the process of deploying any changes to code in this repo to its associated endpoint.
Justification: Automating deployment processes improves consistency and visibility of deployments, which reduces toil.
Components:
- Static site generator(s). As of now, changes made to the Jafner.dev site are automatically built and deployed via GitHub Actions running from the GitHub mirror. This is what we're aiming for.
- NixOS configurations for servers. While our PC configurations should always use pull-based deployments, we should be able to build and deploy our NixOS server configs automatically.
- Remote infrastructure. It should be trivial to automatically deploy Terraform updates for cloud resources such as Cloudflare DNS and DigitalOcean droplets. Automating Terraform deployments will require solving state management.
- VyOS configuration. We've written scripts that should facilitate automated deployment, but VyOS does not make this easy.
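One way to tie a change to "its associated endpoint" is to map the changed paths in a commit to the components listed above. The following is a minimal sketch of that idea, not how the repo currently works: the directory names and target labels in `DEPLOY_TARGETS` are hypothetical placeholders, and the function only decides what to deploy, not how.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from top-level repo directories to deployment targets.
# These directory names are placeholders, not the repo's actual layout.
DEPLOY_TARGETS = {
    "sites": "static-site",
    "nixos": "nixos-servers",
    "terraform": "cloud-infrastructure",
    "vyos": "router",
}


def targets_for_changes(changed_paths):
    """Return the sorted deployment targets affected by a list of changed paths."""
    targets = set()
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        # Only the top-level directory decides which target a file belongs to.
        if parts and parts[0] in DEPLOY_TARGETS:
            targets.add(DEPLOY_TARGETS[parts[0]])
    return sorted(targets)
```

A CI job could feed this the output of `git diff --name-only` and trigger one pipeline per returned target. Files outside the mapped directories trigger nothing.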
Goal: Implement tools and systems to declare local and cloud infrastructure, data management, and service configuration as code in the repo. This maximizes visibility.
Justification: Imperative configuration is difficult to track, reproduce, and troubleshoot. While some aspects of configuration (such as local host hardware) cannot be managed from code, declaring that which can be managed declaratively serves visibility, security, reproducibility, and ease of maintenance.
Components:
- Local host system configurations via NixOS.
- Local network configuration via VyOS.
- Cloud resource definitions via Terraform.
- Service configuration via Kubernetes and Helm.
- Website configuration via Hugo.
Goal: Build NixOS/Home-manager configurations for personal devices. These configurations should have application support parity with existing systems.
Justification: PC configurations inexorably drift toward bloat as packages, services, configuration tweaks, and personalizations are implemented imperatively. Using declarative system configuration management facilitates reproducible, safe, maintainable PCs that can be relied upon.
Components:
- Desktop. Gaming PC with plenty of bells, whistles, and unique workflows.
- Laptop. Old ultrabook (XPS 13 9350) expected to be power efficient and responsive.
- ??? Android phone. Presently, Google facilitates migration between phones. Would like to eliminate dependency on Google for a smooth transition. ???
Goal: Incorporate general purpose compute hosts into a Kubernetes cluster. This will serve as a foundation for highly-available services.
Justification: High availability of compute resources means we can perform zero-downtime hardware or OS maintenance, which improves system reliability.
Components:
- Initial K3s control plane (3x Wyse 5070 thin clients). We've already built this, but are still working on getting the networking configured.
- Fighter. This will require significant downtime as Fighter must be migrated from the legacy Debian platform to a new NixOS configuration. Additionally, Fighter is awaiting installation of an AMD Instinct MI60 for AI workloads.
- Cloud workers. Adding elastic compute resources via public cloud is the baseline for auto-scaling. Networking will be a significant challenge.
This "milestone" owns cleanup and catch-up issues.
Goal: Re-implement service stacks described in our docker-compose.yml files as Kubernetes manifests and/or Helm charts.
Justification: Our K3s cluster provides a platform for highly-available services. Migrating our services to the cluster means we can maintain uptime while accelerating toil tasks like hardware upgrades & replacements, operating system upgrades & reboots, and service upgrades.
Components:
- Create virtual IP for cluster.
- Re-implement services based on their resource dependencies:
  - Trivial dependencies: gitea-runner, homepage, keycloak, monitoring, unifi, wireguard.
  - iSCSI only: home-assistant, minecraft, nextcloud, send, zipline.
  - SMB only: manyfold, navidrome, plex, qbittorrent.
  - Needy: autopirate, books, stash.
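The tiering above can be derived from each service's storage dependencies rather than maintained by hand. The sketch below is illustrative only. It uses a subset of the services listed above, and it assumes that "trivial" means no iSCSI or SMB dependency and that "needy" means both; neither definition is spelled out in this milestone.

```python
from collections import defaultdict

# Storage dependencies for a few of the services listed above.
# Assumption: "needy" services require both iSCSI and SMB storage.
SERVICE_DEPENDENCIES = {
    "homepage": set(),
    "keycloak": set(),
    "nextcloud": {"iscsi"},
    "minecraft": {"iscsi"},
    "plex": {"smb"},
    "navidrome": {"smb"},
    "stash": {"iscsi", "smb"},
}

# Migrate the simplest services first.
TIER_ORDER = ["trivial", "iscsi-only", "smb-only", "needy"]


def tier_for(dependencies):
    """Classify a service by the set of storage backends it depends on."""
    if not dependencies:
        return "trivial"
    if dependencies == {"iscsi"}:
        return "iscsi-only"
    if dependencies == {"smb"}:
        return "smb-only"
    return "needy"


def migration_plan(services):
    """Group services into tiers and return the non-empty tiers in migration order."""
    tiers = defaultdict(list)
    for name, dependencies in services.items():
        tiers[tier_for(dependencies)].append(name)
    return [(tier, sorted(tiers[tier])) for tier in TIER_ORDER if tiers[tier]]
```

Calling `migration_plan(SERVICE_DEPENDENCIES)` yields the tiers in the order we would tackle them, so a new service only needs its dependencies declared to land in the right group.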
Goal: Build a documentation system that ensures all documentation is relevant, correct, and sufficient.
Justification: The usefulness of documentation is dependent on its correctness and comprehensiveness. Identifying and fixing instances of irrelevant, incorrect, or insufficient documentation allows us to trust our docs as a source of truth.
Components:
- Documentation schema. Where is our documentation located, and what defines a syntactically valid piece of documentation?
- Tests for relevance. How can we ensure that all pieces of our documentation relate to one or more things that exist in our repo?
  - E.g. Require headers link to other files, ensure that documentation is updated appropriately if a dependency is moved or deleted.
- Tests for correctness. How can we ensure that our documentation is kept up to date with the components to which it relates?
  - E.g. Require that documentation is updated when a dependency is modified.
  - E.g. Ask AI to rate the correctness of assertions in docs.
- Tests for comprehensiveness. How can we ensure that all components are documented?
  - E.g. Ensure all directories contain a valid README.md file.
  - E.g. Ensure documentation compiles to a tree without islands or orphans.
  - E.g. Ensure all code is linked to one or more docs.
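Two of the comprehensiveness checks above are simple enough to sketch. The following is a rough starting point, not a finished test suite. It only checks that a README.md exists (not that it is "valid", since the schema is still undefined), it skips hidden directories such as `.git` as an assumption, and the `links` mapping passed to `orphaned_docs` is a hypothetical structure that a separate link parser would need to produce.

```python
from pathlib import Path


def directories_missing_readme(root):
    """Return relative paths of directories under root that lack a README.md."""
    root = Path(root)
    missing = []
    candidates = [root, *sorted(p for p in root.rglob("*") if p.is_dir())]
    for directory in candidates:
        relative = directory.relative_to(root)
        # Assumption: hidden directories (e.g. .git) are exempt from the rule.
        if any(part.startswith(".") for part in relative.parts):
            continue
        if not (directory / "README.md").is_file():
            missing.append(relative.as_posix())
    return missing


def orphaned_docs(links, root_doc):
    """Return docs that cannot be reached from root_doc by following links.

    links maps each doc path to the doc paths it links to.
    """
    reachable = set()
    stack = [root_doc]
    while stack:
        doc = stack.pop()
        if doc in reachable:
            continue
        reachable.add(doc)
        stack.extend(links.get(doc, ()))
    return sorted(set(links) - reachable)
```

Both functions return the offending paths instead of raising, so a CI step can print the full list and fail if it is non-empty.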