Infrastructure

CI/CD Pipeline

Continuous integration and deployment workflows, build matrix, and runner infrastructure

Workflows: .github/workflows/ci.yml, .github/workflows/deploy.yml, .github/workflows/release.yml
Runner: self-hosted on hn-runner (Hetzner; labels [self-hosted, linux, x64, hetzner])
Registry: Harbor at registry.hansenexus.dev

Architecture Overview

  PR / Push to master
        |
        v
  +-------------+    +----------------+
  |   CI.yml    |    |  Deploy.yml    |
  |             |    |                |
  | +---------+ |    | +------------+ |
  | |  Lint   | |    | |  Detect    | |
  | |Typecheck| |    | |  Changes   | |
  | |  Test   | |    | +-----+------+ |
  | |  Build  | |    |       |        |
  | |  E2E    | |    | +-----v------+ |    +---------+
  | +---------+ |    | | Build &    |------| Harbor  |
  +-------------+    | | Push       | |    |Registry |
                     | +-----+------+ |    +---------+
                     |       |        |
                     | +-----v------+ |    +---------+
                     | | Deploy     |------| k3s     |
                     | | to K8s     | |    | Cluster |
                     | +-----+------+ |    +---------+
                     |       |        |
                     | +-----v------+ |    +---------+
                     | | Deploy     |------| Convex  |
                     | | Convex     | |    |Backends |
                     | +------------+ |    +---------+
                     +----------------+

Workflow Reference

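| Workflow | File        | Trigger                                    | Timeout | Purpose                             |
|----------|-------------|--------------------------------------------|---------|-------------------------------------|
| CI       | ci.yml      | PR + push to master                        | 5-30min | Lint, typecheck, test, build, E2E   |
| Deploy   | deploy.yml  | Push to master/staging, PR label, PR close | 5-30min | Build Docker images, deploy to K8s  |
| Release  | release.yml | Push to master                             |         | Changesets version management       |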

CI Jobs

| Job            | Timeout | Runs When       | Description                                              |
|----------------|---------|-----------------|----------------------------------------------------------|
| detect-changes | 5min    | Always          | Detects which apps changed via git diff                  |
| lint           | 10min   | Always          | Biome check on changed TS/JS files                       |
| typecheck      | 15min   | Always          | TypeScript typecheck on changed apps (all apps included) |
| test           | 15min   | Always          | Vitest unit tests on changed packages                    |
| build          | 20min   | Per changed app | Next.js build for each changed app                       |
| e2e            | 30min   | Per changed app | Playwright E2E tests (calnexus, lexilink)                |

Deploy Jobs

| Job                | Timeout | Trigger                      | Description                            |
|--------------------|---------|------------------------------|----------------------------------------|
| detect-changes     |         | master push                  | Identifies changed apps                |
| build-push         | 30min   | master push                  | Docker build + Harbor push (per app)   |
| deploy-k8s         | 10min   | After build-push             | kubectl set image + rollout (per app)  |
| deploy-convex      | 10min   | master push + convex changes | bunx convex deploy (per app)           |
| build-push-staging | 30min   | staging push                 | Docker build for staging apps (matrix) |
| deploy-staging     | 10min   | After staging build          | Deploy + optional Convex to staging    |
| build-push-preview | 30min   | PR deploy:preview label      | Docker build for preview apps (matrix) |
| deploy-preview     | 10min   | After preview build          | Deploy + Convex to preview             |
| cleanup-preview    | 5min    | PR close                     | Scale deployments to 0, preserve PVCs  |

Environment Matrix

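| Branch / Trigger        | Environment | Namespace        | Apps                              | Image Tag      |
|-------------------------|-------------|------------------|-----------------------------------|----------------|
| Push to master          | Production  | hn-apps + convex | All 8 apps                        | <sha> + latest |
| Push to staging         | Staging     | hn-staging       | Configurable (currently lexilink) | staging        |
| PR label deploy:preview | Preview     | hn-preview       | Configurable (currently lexilink) | preview        |
| PR close                | Cleanup     | hn-preview       | All preview apps                  |                |
| workflow_dispatch       | Production  | hn-apps          | Specified app                     | <sha> + latest |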

Production Apps

| App          | Convex URL                                 | App URL                             |
|--------------|--------------------------------------------|-------------------------------------|
| portfolio    |                                            | https://portfolio.hansenexus.dev    |
| calnexus     | https://convex-calnexus.hansenexus.dev     | https://calnexus.hansenexus.dev     |
| lexilink     | https://api.lexilink.app                   | https://lexilink.app                |
| planex       | https://convex-planex.hansenexus.dev       | https://planex.hansenexus.dev       |
| nexus-lms    | https://convex-nexuslms.hansenexus.dev     | https://nexus-lms.hansenexus.dev    |
| archus       | https://convex-archus.hansenexus.dev       | https://archus.hansenexus.dev       |
| bgs-service  | https://convex-bgs-service.hansenexus.dev  | https://bgs.hansenexus.dev          |
| elbe-akustik | https://convex-elbe-akustik.hansenexus.dev | https://elbe-akustik.hansenexus.dev |

Build Pipeline Details

Docker Build Retry

All Docker builds use automatic retry (3 attempts, 10s backoff) to handle transient builder/network failures:

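max_retries=3
for attempt in $(seq 1 "$max_retries"); do
  echo "Build attempt $attempt/$max_retries"
  # Build arguments are elided; --load imports the result into the local image store.
  if docker buildx build ... --load .; then
    echo "Build succeeded"; break
  fi
  # Fail the job once the final attempt has been used up.
  [ "$attempt" -eq "$max_retries" ] && exit 1
  echo "Retrying in 10s..."; sleep 10
done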

Image Verification

Before deploying, the deploy-k8s job verifies the image exists in Harbor via docker manifest inspect. If the image is missing, the job fails with an error (not a warning) — this prevents silent deployment skips.
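
A minimal sketch of such a check might look like the following. The function name is hypothetical and the image reference is supplied by the caller, since the Harbor project path is not given here; the ::error:: prefix is the GitHub Actions workflow command for an error annotation.

```shell
# Hypothetical helper: fail hard when an image tag is missing from the registry.
# verify_image <image-reference>
verify_image() (
  image="$1"
  # docker manifest inspect exits non-zero when the manifest cannot be fetched.
  if docker manifest inspect "$image" >/dev/null 2>&1; then
    echo "Image found: $image"
  else
    echo "::error::Image not found in registry: $image"
    exit 1
  fi
)
```

Because the function returns a non-zero status, a step that calls it stops the job instead of continuing with a stale image.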

Change Detection

The pipeline auto-detects which apps need rebuilding:

  • App changes: Files in apps/<app>/ trigger that app only
  • Package changes: Files in packages/ trigger all apps
  • Infra changes: Dockerfile, turbo.json, or bun.lock changes trigger all apps
  • Convex changes: Files in apps/<app>/convex/ trigger Convex deploy for that app
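
The rules above can be sketched as a small path classifier. This is an illustration rather than the workflow's actual script: the function names are hypothetical, ALL_APPS is shortened, and it assumes that only a root-level Dockerfile, turbo.json, or bun.lock counts as an infra change.

```shell
# Hypothetical sketch of the detection rules; the real logic lives in the workflows.
# Assumption: ALL_APPS is shortened here for illustration.
ALL_APPS="portfolio calnexus lexilink"

# Reads changed paths on stdin (for example from `git diff --name-only`) and
# prints one app per line that needs rebuilding.
detect_apps() (
  rebuild_all=0
  changed=""
  while IFS= read -r path; do
    case "$path" in
      packages/*|Dockerfile|turbo.json|bun.lock)
        rebuild_all=1 ;;
      apps/*/*)
        app="${path#apps/}"               # strip the leading "apps/"
        changed="$changed ${app%%/*}" ;;  # keep only the app directory name
    esac
  done
  if [ "$rebuild_all" -eq 1 ]; then
    changed="$ALL_APPS"
  fi
  # Intentionally unquoted: split the list into one word per app.
  for app in $changed; do echo "$app"; done | sort -u
)

# Prints apps whose Convex functions changed (paths under apps/<app>/convex/).
detect_convex() (
  while IFS= read -r path; do
    case "$path" in
      apps/*/convex/*)
        app="${path#apps/}"
        echo "${app%%/*}" ;;
    esac
  done | sort -u
)
```

Piping the output of `git diff --name-only` between two commits into detect_apps yields the rebuild list; paths that match no rule, such as a root README, produce no output.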

Adding a New App to CI

  1. Add to ci.yml detect-changes: Add the app name to the for app in ... loop (line ~56)
  2. Add to build matrix: Add an entry in the build job’s matrix.include with app name, URL, and enabled flag
  3. Add to deploy detect-changes: Add to the APPS list in deploy.yml (line ~55)
  4. Add to build-push matrix: Add an include entry with app, convex_url, app_url, app_name
  5. Add to deploy-convex matrix: If the app uses Convex, add an include entry with convex_url and admin_key_secret
  6. Add GitHub secrets: CONVEX_ADMIN_KEY_<APP> (if Convex)
  7. Optional E2E: Add to the e2e job matrix with experimental: true initially

Adding Staging/Preview for an App

Staging and preview jobs use a matrix strategy, so each job needs only one new include entry per app. The overlay, DNS, and secrets steps complete the setup:

  1. Staging build matrix (build-push-staging): Add entry with app, convex_url, app_url, app_name, convex_admin_key_secret
  2. Staging deploy matrix (deploy-staging): Add entry with app, convex_url, convex_admin_key_secret
  3. Preview build matrix (build-push-preview): Same as staging with preview URLs
  4. Preview deploy matrix (deploy-preview): Same as staging
  5. Preview cleanup matrix (cleanup-preview): Add entry with app and convex_statefulset
  6. K8s overlay: Create k8s/overlays/{staging,preview}/<app>/ with standalone manifests
  7. DNS: Point staging.<app>.hansenexus.dev and preview.<app>.hansenexus.dev to cluster IP
  8. Secrets: Add CONVEX_ADMIN_KEY_<APP>_STAGING and CONVEX_ADMIN_KEY_<APP>_PREVIEW to GitHub
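
To keep secret names consistent across the three environments, a helper along these lines could derive them from the app name. The normalization is an assumption (upper-case, hyphens replaced by underscores), and the function name is hypothetical; check the existing secrets for the exact spelling before relying on it.

```shell
# Hypothetical helper: derive the GitHub secret name for an app and environment.
# Assumption: app names are upper-cased and hyphens become underscores.
# convex_secret_name <app> [staging|preview]
convex_secret_name() (
  app="$1"; target="${2:-}"
  name=$(printf '%s' "$app" | tr '[:lower:]' '[:upper:]' | sed 's/-/_/g')
  case "$target" in
    "")      echo "CONVEX_ADMIN_KEY_${name}" ;;
    staging) echo "CONVEX_ADMIN_KEY_${name}_STAGING" ;;
    preview) echo "CONVEX_ADMIN_KEY_${name}_PREVIEW" ;;
    *)       echo "unknown environment: $target" >&2; exit 1 ;;
  esac
)
```

The result can be passed to the GitHub CLI, for example `gh secret set "$(convex_secret_name lexilink staging)"`, which then prompts for the secret value.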

TypeCheck Coverage

All apps are typechecked in CI. Apps with known strictNullChecks issues use a tsconfig.ci.json that limits the include scope to clean directories, enabling incremental expansion.

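| App          | Status                     | Notes                   |
|--------------|----------------------------|-------------------------|
| portfolio    | Full coverage              |                         |
| calnexus     | Full coverage              |                         |
| lexilink     | Partial (tsconfig.ci.json) | Expanding incrementally |
| planex       | Full coverage              |                         |
| nexus-lms    | Full coverage              |                         |
| archus       | Full coverage              |                         |
| bgs-service  | Full coverage              |                         |
| elbe-akustik | Full coverage              |                         |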
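
One way to picture the tsconfig.ci.json fallback is a helper that picks the project file per app directory. This is a sketch under the assumption that the CI config sits next to the app's regular tsconfig.json; the function name is hypothetical.

```shell
# Hypothetical sketch: choose the tsconfig used for CI typechecking in an app directory.
# typecheck_project <app-dir>
typecheck_project() (
  app_dir="$1"
  if [ -f "$app_dir/tsconfig.ci.json" ]; then
    echo "$app_dir/tsconfig.ci.json"   # restricted include scope
  else
    echo "$app_dir/tsconfig.json"      # full coverage
  fi
)
```

The printed path can then be handed to the compiler, for example `tsc --noEmit -p "$(typecheck_project apps/lexilink)"`.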

E2E Test Status

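| App      | Status | Mock Provider                | Notes                                                       |
|----------|--------|------------------------------|-------------------------------------------------------------|
| calnexus | Stable | NEXT_PUBLIC_MOCK_CONVEX=true | Non-experimental, failures block CI                         |
| lexilink | Stable | NEXT_PUBLIC_MOCK_CONVEX=true | Comprehensive mock coverage for all tested Convex functions |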

E2E tests use NEXT_PUBLIC_MOCK_CONVEX=true to avoid real Convex calls in CI. Mock responses are defined in apps/<app>/e2e/fixtures/mocks.ts.

Concurrency

  • CI: cancel-in-progress: true — new pushes cancel in-progress CI runs for the same branch
  • Deploy: cancel-in-progress: false — deploy jobs are never cancelled mid-flight to prevent partial deployments

Secrets Reference

| Secret                         | Used By    | Purpose                              |
|--------------------------------|------------|--------------------------------------|
| HARBOR_USERNAME                | deploy.yml | Harbor registry login                |
| HARBOR_PASSWORD                | deploy.yml | Harbor registry login                |
| KUBECONFIG                     | deploy.yml | Base64-encoded kubeconfig for kubectl |
| CONVEX_ADMIN_KEY_<APP>         | deploy.yml | Per-app Convex admin keys (7 apps)   |
| CONVEX_ADMIN_KEY_<APP>_STAGING | deploy.yml | Staging Convex admin keys            |
| CONVEX_ADMIN_KEY_<APP>_PREVIEW | deploy.yml | Preview Convex admin keys            |
| TURBO_TOKEN                    | ci.yml     | Turborepo remote cache token         |
| TURBO_TEAM                     | ci.yml     | Turborepo team identifier            |

Runner Troubleshooting

Runner not picking up jobs

ssh hn-runner
systemctl list-units 'actions.runner.*'    # Check runner services
systemctl restart 'actions.runner.*.service'  # Restart if needed (quoted so the shell does not expand the glob)

Disk space issues

The runner at hn-runner accumulates Docker images and build cache:

ssh hn-runner
df -h /                              # Check disk usage
docker system df                     # Check Docker storage
docker system prune -a               # Remove all unused images, stopped containers, and unused networks
docker buildx prune -f               # Clean buildx cache
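
To decide when cleanup is due, a check along these lines could be run before pruning. The 85% default is an illustrative threshold rather than a documented limit, and both function names are hypothetical.

```shell
# Hypothetical sketch: report filesystem usage and flag when cleanup is due.
# disk_usage_pct [path]  -> prints the used percentage as a bare number
disk_usage_pct() {
  # -P forces the POSIX one-line-per-filesystem format; field 5 is capacity (e.g. "42%").
  df -P "${1:-/}" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# needs_cleanup <used_pct> [threshold]
# Assumption: 85 is an illustrative default threshold.
needs_cleanup() {
  [ "$1" -ge "${2:-85}" ]
}
```

For example, `needs_cleanup "$(disk_usage_pct /)" && docker buildx prune -f` clears the buildx cache only when the root filesystem is at or above the threshold.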

Build hangs or network timeouts

Docker builds have automatic retry (3 attempts). If builds consistently fail:

  • Check runner network: curl -s https://registry.hansenexus.dev/api/v2.0/health
  • Check Docker daemon: docker info
  • Check buildx: docker buildx ls

Queued jobs not starting

Both runners (hn-runner-3, hn-runner-4) should be active:

ssh hn-runner
systemctl status actions.runner.Lenoux01-hn-monorepo.hn-runner-3.service
systemctl status actions.runner.Lenoux01-hn-monorepo.hn-runner-4.service

If runners show transient SocketException or Operation canceled errors in _diag/Runner_*.log, these are normal GitHub Actions long-polling timeouts — not actual failures.
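
When scanning the runner logs for real problems, the two benign messages can be filtered out first. The function below is a hypothetical convenience and makes no assumption about the log format beyond those two strings.

```shell
# Hypothetical filter: drop the known long-polling noise from runner logs.
# Reads log lines on stdin and prints every line that is not one of the benign messages.
filter_runner_noise() {
  # grep -v exits 1 when nothing is left to print; treat that as success.
  grep -v -E 'SocketException|Operation canceled' || true
}
```

A typical invocation would be `cat _diag/Runner_*.log | filter_runner_noise`, run from the runner's installation directory.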

Manual workflow dispatch

gh workflow run deploy.yml --field app=lexilink
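
A thin wrapper can guard against typos in the app name before dispatching. The app list mirrors the Production Apps table; the wrapper itself and its name are hypothetical.

```shell
# Hypothetical wrapper: validate the app name before dispatching deploy.yml.
PROD_APPS="portfolio calnexus lexilink planex nexus-lms archus bgs-service elbe-akustik"

# dispatch_deploy <app>
dispatch_deploy() (
  app="$1"
  for known in $PROD_APPS; do
    if [ "$known" = "$app" ]; then
      gh workflow run deploy.yml --field app="$app"
      exit $?
    fi
  done
  echo "Unknown app: $app" >&2
  exit 1
)
```

After a dispatch, `gh run list --workflow deploy.yml --limit 1` shows the most recent run of the workflow.
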
HanseNexus 2026