Deployment

Deployment pipeline, Kubernetes rollouts, and environment management for HanseNexus apps

Pipeline: .github/workflows/deploy.yml
Runner: self-hosted on hn-runner (Hetzner, 159.69.123.137)
Registry: Harbor at registry.hansenexus.dev
Cluster: k3s v1.34 on hn-k3s (Hetzner CPX52, 91.99.1.144)

Overview

All apps run as Kubernetes Deployments on a single-node k3s cluster. Docker images are built by GitHub Actions, pushed to a self-hosted Harbor registry, and deployed via kubectl set image. Secrets are managed by the 1Password Operator which syncs vault items to Kubernetes Secrets.

Branch to Environment Mapping

Branch / Trigger          Environment   Namespace    Apps
Push to master            Production    hn-apps      All apps
Push to staging           Staging       hn-staging   lexilink only
PR label deploy:preview   Preview       hn-preview   lexilink only
PR close                  Cleanup       hn-preview   Scale to 0, PVCs preserved
workflow_dispatch         Production    hn-apps      Specified app

Build Pipeline

Production (push to master)

  1. Detect changes — Compares HEAD~1 to HEAD, identifies which apps have changes in apps/<app>/ or packages/. Changes to Dockerfile, turbo.json, or bun.lock trigger all apps.
  2. Build and Push — For each changed app, runs a multi-stage Docker build:
    docker buildx build --build-arg APP_NAME=<app> \
      --build-arg NEXT_PUBLIC_CONVEX_URL=<url> \
      --build-arg NEXT_PUBLIC_APP_URL=<url> \
      --build-arg NEXT_PUBLIC_APP_NAME=<name> \
      -t registry.hansenexus.dev/hn/<app>:<sha> \
      -t registry.hansenexus.dev/hn/<app>:latest .
  3. Deploy — Updates the Deployment image tag and waits for rollout:
    kubectl set image deployment/<app> <app>=registry.hansenexus.dev/hn/<app>:<sha> -n hn-apps
    kubectl rollout status deployment/<app> -n hn-apps --timeout=120s
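The change-detection step (1) can be sketched as a small shell function. This is an illustration, not the actual workflow logic: paths would come from `git diff --name-only HEAD~1 HEAD` (passed on stdin here), and treating every `packages/` change as a global trigger is an assumption.

```shell
# Sketch: map changed file paths to the set of apps that need a rebuild.
changed_apps() {
  all=0
  apps=""
  while IFS= read -r f; do
    case "$f" in
      Dockerfile|turbo.json|bun.lock|packages/*) all=1 ;;  # global triggers: rebuild everything
      apps/*)
        a=${f#apps/}; a=${a%%/*}                           # apps/<app>/... -> <app>
        case " $apps " in
          *" $a "*) ;;                                     # already recorded
          *) apps="$apps $a" ;;
        esac
        ;;
    esac
  done
  if [ "$all" -eq 1 ]; then echo "ALL"; else echo "${apps# }"; fi
}
```

Given `apps/lexilink/page.tsx` and `apps/planex/y.ts`, this yields `lexilink planex`; any change to `bun.lock` yields `ALL`.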

Staging (push to staging)

Builds lexilink with staging-specific URLs, pushes as lexilink:staging, and deploys to hn-staging namespace.

Preview (PR label)

Triggered when a PR receives the deploy:preview label. Builds lexilink with preview URLs, pushes as lexilink:preview, and deploys to hn-preview namespace. On PR close, preview deployments are scaled to 0 (PVCs preserved for data recovery).

Convex Deployment

Convex backends auto-deploy when apps/<app>/convex/ changes are detected:

  • Runs bunx convex deploy with the app’s CONVEX_SELF_HOSTED_URL and admin key
  • Each app has its own Convex instance as a StatefulSet in the convex namespace
  • Admin keys are stored as GitHub secrets: CONVEX_ADMIN_KEY_<APP>

Apps with Convex: calnexus, lexilink, planex, nexus-lms, archus, bgs-service, elbe-akustik.
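A corresponding workflow step might look like the following sketch. The backend URL is illustrative, and the env var names follow Convex self-hosted conventions; the actual step lives in deploy.yml.

```yaml
# Hypothetical deploy.yml step for one Convex app (lexilink)
- name: Deploy Convex backend (lexilink)
  working-directory: apps/lexilink
  run: bunx convex deploy
  env:
    CONVEX_SELF_HOSTED_URL: https://convex-lexilink.hansenexus.dev  # illustrative URL
    CONVEX_SELF_HOSTED_ADMIN_KEY: ${{ secrets.CONVEX_ADMIN_KEY_LEXILINK }}
```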

Kustomize Structure

k8s/
├── kustomization.yaml            # Root: includes base/, apps/, convex/, secrets/
├── base/
│   ├── kustomization.yaml
│   ├── namespace.yaml            # hn-apps namespace
│   └── signoz/                   # OpenTelemetry collector DaemonSet + RBAC
├── apps/<app>/                   # Per-app manifests (hn-apps namespace)
│   ├── kustomization.yaml
│   ├── deployment.yaml           # Deployment with env vars, probes, resources
│   ├── service.yaml              # ClusterIP service
│   ├── ingress.yaml              # Ingress with TLS (cert-manager)
│   ├── serviceaccount.yaml       # Per-app ServiceAccount (automount disabled)
│   ├── pdb.yaml                  # PodDisruptionBudget (multi-replica apps only)
│   └── onepassworditem.yaml      # 1Password Operator sync (most apps)
├── convex/                       # Per-app Convex instances (convex namespace)
│   ├── kustomization.yaml
│   ├── namespace.yaml
│   └── <app>/
│       ├── statefulset.yaml      # Convex backend StatefulSet
│       ├── services.yaml         # Backend + site services
│       ├── ingress.yaml          # API + site ingress
│       ├── dashboard.yaml        # Convex dashboard Deployment
│       ├── onepassworditem.yaml  # Admin key secret
│       └── kustomization.yaml
├── overlays/
│   ├── staging/                  # lexilink staging (hn-staging namespace)
│   │   ├── kustomization.yaml
│   │   ├── namespace.yaml
│   │   ├── lexilink/             # Standalone manifests (not patches)
│   │   └── convex-lexilink/
│   └── preview/                  # lexilink preview (hn-preview namespace)
│       ├── kustomization.yaml
│       ├── namespace.yaml
│       ├── lexilink/
│       └── convex-lexilink/
├── rbac/
│   ├── mcp-server/               # MCP server ServiceAccount + RBAC
│   └── ci-deploy/                # CI/CD deploy ServiceAccount + scoped Role
└── secrets/                      # Shared auth-secret across namespaces

Overlays use standalone manifests (full resource definitions), not Kustomize patches. This keeps each environment self-contained.
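As an illustration, a staging overlay root might look like this sketch; the actual file in k8s/overlays/staging/ is authoritative.

```yaml
# k8s/overlays/staging/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml      # hn-staging
  - lexilink            # full standalone manifests, not patches
  - convex-lexilink
```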

Adding a New Environment

Follow the lexilink pattern:

  1. Create overlay directory: Copy k8s/overlays/staging/lexilink/ and convex-lexilink/, update namespaces and URLs
  2. Create namespace: Add namespace.yaml and update kustomization.yaml
  3. DNS records: Point <env>.<app>.hansenexus.dev to the cluster IP (91.99.1.144)
  4. 1Password items: Create <app>-<env> item in the k3 vault with the app’s secrets
  5. GitHub secrets: Add CONVEX_ADMIN_KEY_<APP>_<ENV> for Convex apps
  6. Workflow jobs: Add build-push and deploy jobs for the new environment in deploy.yml
  7. TLS: cert-manager auto-provisions certificates via the Ingress tls block

Secrets

On the Cluster (1Password Operator)

The 1Password Connect Operator runs in op-system and watches for OnePasswordItem resources. Each app has a onepassworditem.yaml that maps a vault item to a Kubernetes Secret:

apiVersion: onepassword.com/v1
kind: OnePasswordItem
metadata:
  name: lexilink-secrets
  namespace: hn-apps
spec:
  itemPath: "vaults/k3/items/lexilink-prod"

The operator creates a Secret named lexilink-secrets with fields from the vault item. Deployments reference these via secretKeyRef. The auto-restart annotation ensures pods restart when secrets change.
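Put together, a Deployment consuming a synced secret might look like this sketch. The auto-restart annotation name is the one documented by the 1Password operator; verify against the actual manifests.

```yaml
# Deployment excerpt (sketch): consuming an operator-synced Secret
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lexilink
  namespace: hn-apps
  annotations:
    operator.1password.io/auto-restart: "true"  # restart pods when the Secret changes
spec:
  template:
    spec:
      containers:
        - name: lexilink
          env:
            - name: AUTH_SECRET
              valueFrom:
                secretKeyRef:
                  name: lexilink-secrets  # created by the operator
                  key: AUTH_SECRET        # casing preserved from 1Password
```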

Important: The operator preserves original casing from 1Password items (does NOT lowercase field names).

In CI

  • HARBOR_USERNAME / HARBOR_PASSWORD — Harbor registry credentials
  • KUBECONFIG — Base64-encoded kubeconfig for kubectl access
  • CONVEX_ADMIN_KEY_<APP> — Per-app Convex admin keys

For Local Dev

.env.op files contain op:// URI references (zero secrets, safe to commit). The op run CLI resolves them at runtime:

bun dev:lexilink  # Runs: op run --env-file apps/lexilink/.env.op -- turbo dev --filter=lexilink
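For reference, a .env.op file is just op:// pointers into the vault; the field names below are illustrative.

```
# apps/lexilink/.env.op (hypothetical fragment) — no secret values, only references
AUTH_SECRET=op://k3/lexilink-prod/AUTH_SECRET
NEXT_PUBLIC_APP_URL=op://k3/lexilink-prod/NEXT_PUBLIC_APP_URL
```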

See the Secrets Management page for full setup instructions.

Health Checks and Rollouts

Health Endpoints

Every deployed app exposes GET /api/health which returns { status: "ok", timestamp: <epoch_ms> }. This lightweight endpoint requires no database or external service access.

Kubernetes Probes

All app Deployments include HTTP-based health probes:

  • Liveness probe: GET /api/health on port 3000, starts after 15s, checks every 30s, 3 failures to restart
  • Readiness probe: GET /api/health on port 3000, starts after 5s, checks every 10s, 3 failures to remove from service
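In manifest form, the probe settings above correspond to the following sketch:

```yaml
# Container excerpt (sketch): probes matching the parameters listed above
livenessProbe:
  httpGet:
    path: /api/health
    port: 3000
  initialDelaySeconds: 15
  periodSeconds: 30
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /api/health
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3
```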

Rolling Update Strategy

All deployments use an explicit rolling update strategy:

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0
    maxSurge: 1

This ensures zero-downtime deployments — at least one old pod stays ready until the new pod passes readiness probes.

Image Pull Policy

All containers use imagePullPolicy: Always so the node never serves a stale cached image. This matters because the pipeline pushes both immutable SHA tags and the mutable latest tag, and a cached latest would otherwise be reused silently.

Deployment Rollout

The CI pipeline uses kubectl rollout status with a 120s timeout. If the new pods fail readiness probes, the rollout stalls and the job fails. The previous ReplicaSet remains active (automatic rollback).

Pod Disruption Budgets

Multi-replica apps have PodDisruptionBudgets to ensure availability during voluntary disruptions (node drains, cluster upgrades):

apiVersion: policy/v1
kind: PodDisruptionBudget
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: <app>

PDBs are created for apps with replicas >= 2. Check which apps have them: kubectl get pdb -n hn-apps.

Pod Anti-Affinity

Multi-replica apps include preferred pod anti-affinity rules to spread pods across nodes. On the current single-node cluster this has no effect, but ensures automatic spread if the cluster scales:

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values: [<app>]
          topologyKey: kubernetes.io/hostname

Security

Per-App ServiceAccounts

Each app runs with a dedicated ServiceAccount with automountServiceAccountToken: false. Next.js apps have no need for Kubernetes API access, so the token is not mounted.

CI/CD RBAC

A dedicated ci-deploy ServiceAccount in hn-apps namespace has minimal permissions:

  • get, patch on Deployments (for kubectl set image)
  • get, list, watch on Pods and ReplicaSets (for kubectl rollout status)

Defined in k8s/rbac/ci-deploy/.
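A Role granting exactly those permissions might look like this sketch; the manifests in k8s/rbac/ci-deploy/ are authoritative.

```yaml
# ci-deploy Role (sketch): minimal verbs for set image + rollout status
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ci-deploy
  namespace: hn-apps
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "patch"]
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
```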

Convex Dashboard Auth

Convex dashboard Ingresses use Traefik BasicAuth middleware (convex-dashboard-auth). The htpasswd secret is synced from 1Password via the operator. All dashboard URLs require authentication.

Convex StatefulSets

Convex backends use updateStrategy: { type: OnDelete }. This is intentional — Convex is a data-stateful workload and upgrades should be manually controlled. To update a Convex backend, delete the pod and let the StatefulSet recreate it with the new image.
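In manifest form (sketch — the pod name pattern is an assumption):

```yaml
# Convex StatefulSet excerpt. With OnDelete, pushing a new image tag does
# nothing until the pod is deleted by hand, e.g.:
#   kubectl delete pod convex-<app>-0 -n convex
spec:
  updateStrategy:
    type: OnDelete
```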

Network Policies

Current status: Not enforced. The k3s cluster uses embedded Flannel as CNI, which does not enforce NetworkPolicies. Migrating to Calico or Cilium is required to enable network segmentation. This is tracked as future work.

App URLs

App            Production URL
lexilink       https://lexilink.app
calnexus       https://calnexus.hansenexus.dev
planex         https://planex.hansenexus.dev
nexus-lms      https://nexus-lms.hansenexus.dev
archus         https://archus.hansenexus.dev
portfolio      https://portfolio.hansenexus.dev
bgs-service    https://bgs.hansenexus.dev
elbe-akustik   https://elbe-akustik.hansenexus.dev
qript          Not deployed (experimental)

Staging: https://staging.lexilink.hansenexus.dev
Preview: https://preview.lexilink.hansenexus.dev

Troubleshooting

Image pull fails (ImagePullBackOff)

  • Verify harbor-registry secret exists in the namespace: kubectl get secret harbor-registry -n hn-apps
  • Check Harbor is reachable: curl -s https://registry.hansenexus.dev/api/v2.0/health
  • Verify the image tag exists: docker manifest inspect registry.hansenexus.dev/hn/<app>:<sha>

Rollout stuck

  • Check pod events: kubectl describe pod -l app=<app> -n hn-apps
  • Check logs: kubectl logs -l app=<app> -n hn-apps --tail=50
  • If the app crashes on startup, it’s usually a missing env var — check the onepassworditem.yaml matches the vault item fields

1Password secrets not syncing

  • Check operator logs: kubectl logs -n op-system -l app=onepassword-connect
  • Verify the vault item exists: op item get <app>-prod --vault k3
  • Ensure field names are uppercase (AUTH_SECRET, not auth_secret)

Convex deploy fails

  • Verify admin key secret is set in GitHub: gh secret list | grep CONVEX_ADMIN_KEY
  • Check the Convex backend is running: kubectl get statefulset -n convex
  • Ensure the Convex URL resolves to the correct backend

Build fails for all apps

  • If Dockerfile, turbo.json, or bun.lock changed, all apps rebuild. A failure in one doesn’t block others (fail-fast: false).
  • Docker builds have automatic retry (3 attempts, 10s backoff) to handle transient failures
  • Check runner disk space on hn-runner — Docker images can fill up fast. Clean with docker system prune -a

Manual deploy

gh workflow run deploy.yml --field app=lexilink

HanseNexus 2026