CI/CD Pipeline
Continuous integration and deployment workflows, build matrix, and runner infrastructure
- Workflows: .github/workflows/ci.yml, .github/workflows/deploy.yml, .github/workflows/release.yml
- Runner: self-hosted on hn-runner (Hetzner, [self-hosted, linux, x64, hetzner])
- Registry: Harbor at registry.hansenexus.dev
Architecture Overview
         PR / Push to master
                  |
                  v
+-------------+     +----------------+
|   CI.yml    |     |   Deploy.yml   |
|             |     |                |
| +---------+ |     | +------------+ |
| |  Lint   | |     | |   Detect   | |
| |Typecheck| |     | |  Changes   | |
| |  Test   | |     | +-----+------+ |
| |  Build  | |     |       |        |
| |   E2E   | |     | +-----v------+ |    +---------+
| +---------+ |     | |  Build &   |------| Harbor  |
+-------------+     | |    Push    | |    |Registry |
                    | +-----+------+ |    +---------+
                    |       |        |
                    | +-----v------+ |    +---------+
                    | |   Deploy   |------|   k3s   |
                    | |   to K8s   | |    | Cluster |
                    | +-----+------+ |    +---------+
                    |       |        |
                    | +-----v------+ |    +---------+
                    | |   Deploy   |------| Convex  |
                    | |   Convex   | |    |Backends |
                    | +------------+ |    +---------+
                    +----------------+
Workflow Reference
| Workflow | File | Trigger | Timeout | Purpose |
|---|---|---|---|---|
| CI | ci.yml | PR + push to master | 5-30min | Lint, typecheck, test, build, E2E |
| Deploy | deploy.yml | Push to master/staging, PR label, PR close | 5-30min | Build Docker images, deploy to K8s |
| Release | release.yml | Push to master | — | Changesets version management |
CI Jobs
| Job | Timeout | Runs When | Description |
|---|---|---|---|
| detect-changes | 5min | Always | Detects which apps changed via git diff |
| lint | 10min | Always | Biome check on changed TS/JS files |
| typecheck | 15min | Always | TypeScript typecheck on changed apps (all apps included) |
| test | 15min | Always | Vitest unit tests on changed packages |
| build | 20min | Per changed app | Next.js build for each changed app |
| e2e | 30min | Per changed app | Playwright E2E tests (calnexus, lexilink) |
Deploy Jobs
| Job | Timeout | Trigger | Description |
|---|---|---|---|
| detect-changes | — | master push | Identifies changed apps |
| build-push | 30min | master push | Docker build + Harbor push (per app) |
| deploy-k8s | 10min | After build-push | kubectl set image + rollout (per app) |
| deploy-convex | 10min | master push + convex changes | bunx convex deploy (per app) |
| build-push-staging | 30min | staging push | Docker build for staging apps (matrix) |
| deploy-staging | 10min | After staging build | Deploy + optional Convex to staging |
| build-push-preview | 30min | PR deploy:preview label | Docker build for preview apps (matrix) |
| deploy-preview | 10min | After preview build | Deploy + Convex to preview |
| cleanup-preview | 5min | PR close | Scale deployments to 0, preserve PVCs |
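As a rough illustration of the deploy-k8s and cleanup-preview steps, the sketch below wraps the kubectl calls named in the table. It assumes each Deployment and its container are named after the app, and the 5m rollout timeout is an illustrative value. The function names are hypothetical and the actual workflow steps may differ.

```shell
# deploy_app APP IMAGE [NAMESPACE]
# Points the Deployment at IMAGE, then waits for the rollout to finish.
# Assumption: Deployment and container are both named after the app.
deploy_app() {
  local app="$1" image="$2" ns="${3:-hn-apps}"
  kubectl -n "$ns" set image "deployment/$app" "$app=$image" || return 1
  # Illustrative timeout, kept below the 10min job timeout.
  kubectl -n "$ns" rollout status "deployment/$app" --timeout=5m
}

# scale_down_preview APP
# Scales a preview Deployment to zero replicas; PVCs are not touched.
scale_down_preview() {
  kubectl -n hn-preview scale "deployment/$1" --replicas=0
}
```

Because rollout status exits non-zero when the rollout does not complete in time, a failed deploy would fail the job rather than pass silently.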
Environment Matrix
| Branch / Trigger | Environment | Namespace | Apps | Image Tag |
|---|---|---|---|---|
| Push to master | Production | hn-apps + convex | All 8 apps | <sha> + latest |
| Push to staging | Staging | hn-staging | Configurable (currently lexilink) | staging |
| PR label deploy:preview | Preview | hn-preview | Configurable (currently lexilink) | preview |
| PR close | Cleanup | hn-preview | All preview apps | — |
| workflow_dispatch | Production | hn-apps | Specified app | <sha> + latest |
Production Apps
| App | Convex URL | App URL |
|---|---|---|
| portfolio | — | https://portfolio.hansenexus.dev |
| calnexus | https://convex-calnexus.hansenexus.dev | https://calnexus.hansenexus.dev |
| lexilink | https://api.lexilink.app | https://lexilink.app |
| planex | https://convex-planex.hansenexus.dev | https://planex.hansenexus.dev |
| nexus-lms | https://convex-nexuslms.hansenexus.dev | https://nexus-lms.hansenexus.dev |
| archus | https://convex-archus.hansenexus.dev | https://archus.hansenexus.dev |
| bgs-service | https://convex-bgs-service.hansenexus.dev | https://bgs.hansenexus.dev |
| elbe-akustik | https://convex-elbe-akustik.hansenexus.dev | https://elbe-akustik.hansenexus.dev |
Build Pipeline Details
Docker Build Retry
All Docker builds use automatic retry (3 attempts, 10s backoff) to handle transient builder/network failures:
max_retries=3
for attempt in $(seq 1 "$max_retries"); do
  echo "Build attempt $attempt/$max_retries"
  # --load imports the built image into the local Docker image store
  if docker buildx build ... --load .; then
    echo "Build succeeded"; break
  fi
  # Fail the job once the final attempt has been used
  [ "$attempt" -eq "$max_retries" ] && exit 1
  echo "Retrying in 10s..."; sleep 10
done
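The same pattern can be factored into a helper so that each build step does not repeat the loop. This is a minimal sketch, assuming a hypothetical retry function; the workflow itself may simply inline the loop.

```shell
# retry MAX DELAY CMD [ARGS...]
# Runs CMD up to MAX times, sleeping DELAY seconds between failed attempts.
# Returns 0 on the first success, otherwise the exit code of the last attempt.
# Hypothetical helper: the name and argument order are illustrative.
retry() {
  local max="$1" delay="$2" attempt=1 status=0
  shift 2
  while [ "$attempt" -le "$max" ]; do
    echo "Attempt $attempt/$max: $*" >&2
    if "$@"; then
      return 0
    else
      status=$?   # exit code of the failed command
    fi
    if [ "$attempt" -lt "$max" ]; then
      echo "Retrying in ${delay}s..." >&2
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return "$status"
}
```

With this helper, the documented policy of 3 attempts and a 10s backoff would correspond to calling retry 3 10 followed by the build command.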
Image Verification
Before deploying, the deploy-k8s job verifies the image exists in Harbor via docker manifest inspect. If the image is missing, the job fails with an error (not a warning) — this prevents silent deployment skips.
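A check along these lines could implement that behaviour. It is a sketch under assumptions: the function name is hypothetical, and the caller is expected to pass the full image reference, since the exact Harbor project path is not stated here.

```shell
# verify_image IMAGE
# Returns non-zero when IMAGE cannot be resolved in the registry.
# "::error::" is the GitHub Actions workflow command for an error annotation,
# so a missing image surfaces as a job error rather than a warning.
verify_image() {
  local image="$1"
  if docker manifest inspect "$image" >/dev/null 2>&1; then
    echo "Image found: $image"
  else
    echo "::error::Image not found in registry: $image"
    return 1
  fi
}
```

In a job step this might be called as verify_image "$IMAGE" || exit 1, where IMAGE holds the reference for the commit being deployed.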
Change Detection
The pipeline auto-detects which apps need rebuilding:
- App changes: Files in apps/<app>/ trigger that app only
- Package changes: Files in packages/ trigger all apps
- Infra changes: Dockerfile, turbo.json, or bun.lock changes trigger all apps
- Convex changes: Files in apps/<app>/convex/ trigger Convex deploy for that app
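The rules above can be sketched as a filter over the list of changed paths. This is an illustration rather than the workflow's actual script: the function names are hypothetical, and it assumes Dockerfile, turbo.json, and bun.lock live at the repository root.

```shell
# App names as listed in the Production Apps table.
APPS="portfolio calnexus lexilink planex nexus-lms archus bgs-service elbe-akustik"

# Reads changed file paths on stdin (one per line), prints the apps to rebuild.
detect_changed_apps() {
  local path app changed="" all=false
  while IFS= read -r path; do
    case "$path" in
      packages/*|Dockerfile|turbo.json|bun.lock)
        all=true ;;                       # shared inputs rebuild every app
      apps/*/*)
        app=${path#apps/}; app=${app%%/*} # apps/<app>/... -> <app>
        case " $changed " in
          *" $app "*) ;;                  # already recorded
          *) changed="$changed $app" ;;
        esac ;;
    esac
  done
  if $all; then echo "$APPS"; else echo "${changed# }"; fi
}

# Reads changed paths on stdin, prints apps with changes under apps/<app>/convex/.
detect_convex_apps() {
  sed -n 's|^apps/\([^/]*\)/convex/.*|\1|p' | sort -u | tr '\n' ' ' | sed 's/ $//'
}
```

The input would typically come from git diff --name-only between the base and head commits; which refs to compare depends on the trigger and is left to the caller.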
Adding a New App to CI
- Add to ci.yml detect-changes: Add the app name to the for app in ... loop (line ~56)
- Add to build matrix: Add an entry in the build job's matrix.include with app name, URL, and enabled flag
- Add to deploy detect-changes: Add to the APPS list in deploy.yml (line ~55)
- Add to build-push matrix: Add an include entry with app, convex_url, app_url, app_name
- Add to deploy-convex matrix: If the app uses Convex, add an include entry with convex_url and admin_key_secret
- Add GitHub secrets: CONVEX_ADMIN_KEY_<APP> (if Convex)
- Optional E2E: Add to the e2e job matrix with experimental: true initially
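A quick sanity check after editing the workflows is to confirm the app name appears in both files. The sketch below is a coarse text search, not a YAML-aware validation, and the function name is hypothetical.

```shell
# check_app_registered APP [WORKFLOW_DIR]
# Reports workflow files that never mention APP as a whole word.
# Returns 0 when both ci.yml and deploy.yml reference the app, 1 otherwise.
check_app_registered() {
  local app="$1" dir="${2:-.github/workflows}" file missing=0
  for file in ci.yml deploy.yml; do
    if ! grep -qwF -- "$app" "$dir/$file" 2>/dev/null; then
      echo "missing: $app not referenced in $dir/$file"
      missing=1
    fi
  done
  return "$missing"
}
```

A passing check only shows that the name is present somewhere in each file; the individual matrix entries still need to be reviewed by hand.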
Adding Staging/Preview for an App
Staging and preview jobs use a matrix strategy, so each job needs only one new include entry per app. The full checklist also covers the overlay, DNS, and secrets:
- Staging build matrix (build-push-staging): Add entry with app, convex_url, app_url, app_name, convex_admin_key_secret
- Staging deploy matrix (deploy-staging): Add entry with app, convex_url, convex_admin_key_secret
- Preview build matrix (build-push-preview): Same as staging with preview URLs
- Preview deploy matrix (deploy-preview): Same as staging
- Preview cleanup matrix (cleanup-preview): Add entry with app and convex_statefulset
- K8s overlay: Create k8s/overlays/{staging,preview}/<app>/ with standalone manifests
- DNS: Point staging.<app>.hansenexus.dev and preview.<app>.hansenexus.dev to cluster IP
- Secrets: Add CONVEX_ADMIN_KEY_<APP>_STAGING and CONVEX_ADMIN_KEY_<APP>_PREVIEW to GitHub
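The per-environment names follow a regular pattern, which the helpers below spell out. They are illustrative only: the function names are hypothetical, and the rule that secret names upper-case the app name and replace hyphens with underscores is an assumption, not something stated above.

```shell
# secret_name APP ENV   (ENV: staging | preview)
# Assumption: "-" in the app name becomes "_" and everything is upper-cased.
secret_name() {
  local app env
  app=$(printf '%s' "$1" | tr 'a-z-' 'A-Z_')
  env=$(printf '%s' "$2" | tr 'a-z' 'A-Z')
  echo "CONVEX_ADMIN_KEY_${app}_${env}"
}

# env_host APP ENV: hostname to point at the cluster IP.
env_host() {
  echo "$2.$1.hansenexus.dev"
}

# overlay_dir APP ENV: directory holding the standalone manifests.
overlay_dir() {
  echo "k8s/overlays/$2/$1/"
}
```

For example, env_host lexilink preview yields preview.lexilink.hansenexus.dev, matching the DNS pattern in the checklist.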
TypeCheck Coverage
All apps are typechecked in CI. Apps with known strictNullChecks issues use a tsconfig.ci.json that limits the include scope to clean directories, enabling incremental expansion.
| App | Status | Notes |
|---|---|---|
| portfolio | Full coverage | — |
| calnexus | Full coverage | — |
| lexilink | Partial (tsconfig.ci.json) | Expanding incrementally |
| planex | Full coverage | — |
| nexus-lms | Full coverage | — |
| archus | Full coverage | — |
| bgs-service | Full coverage | — |
| elbe-akustik | Full coverage | — |
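One way the typecheck step could choose between the two configurations is to prefer tsconfig.ci.json whenever an app provides it. This is a sketch with a hypothetical function name; how the actual job selects the file is not shown here.

```shell
# typecheck_config APP_DIR
# Prints the tsconfig to use in CI: tsconfig.ci.json when the app has one
# (partial coverage), otherwise the regular tsconfig.json (full coverage).
typecheck_config() {
  local dir="$1"
  if [ -f "$dir/tsconfig.ci.json" ]; then
    echo "$dir/tsconfig.ci.json"
  else
    echo "$dir/tsconfig.json"
  fi
}
```

The result could then be passed to the compiler, for example bunx tsc --noEmit -p "$(typecheck_config apps/lexilink)". Deleting an app's tsconfig.ci.json would move it back to full coverage without touching the workflow.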
E2E Test Status
| App | Status | Mock Provider | Notes |
|---|---|---|---|
| calnexus | Stable | NEXT_PUBLIC_MOCK_CONVEX=true | Non-experimental, failures block CI |
| lexilink | Stable | NEXT_PUBLIC_MOCK_CONVEX=true | Comprehensive mock coverage for all tested Convex functions |
E2E tests use NEXT_PUBLIC_MOCK_CONVEX=true to avoid real Convex calls in CI. Mock responses are defined in apps/<app>/e2e/fixtures/mocks.ts.
Concurrency
- CI: cancel-in-progress: true, so new pushes cancel in-progress CI runs for the same branch
- Deploy: cancel-in-progress: false, so deploy jobs are never cancelled mid-flight, which prevents partial deployments
Secrets Reference
| Secret | Used By | Purpose |
|---|---|---|
| HARBOR_USERNAME | deploy.yml | Harbor registry login |
| HARBOR_PASSWORD | deploy.yml | Harbor registry login |
| KUBECONFIG | deploy.yml | Base64-encoded kubeconfig for kubectl |
| CONVEX_ADMIN_KEY_<APP> | deploy.yml | Per-app Convex admin keys (7 apps) |
| CONVEX_ADMIN_KEY_<APP>_STAGING | deploy.yml | Staging Convex admin keys |
| CONVEX_ADMIN_KEY_<APP>_PREVIEW | deploy.yml | Preview Convex admin keys |
| TURBO_TOKEN | ci.yml | Turborepo remote cache token |
| TURBO_TEAM | ci.yml | Turborepo team identifier |
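Because the KUBECONFIG secret is base64-encoded, it has to be decoded to a file before kubectl can use it. The sketch below shows one way to do that. The function name and the KUBECONFIG_B64 variable mentioned afterwards are hypothetical, chosen so the encoded value does not collide with the KUBECONFIG path variable that kubectl reads.

```shell
# write_kubeconfig B64 [PATH]
# Decodes a base64 kubeconfig into a file and prints its path.
# The umask keeps a newly created file readable by the current user only.
write_kubeconfig() {
  local b64="$1" path="${2:-$(mktemp)}"
  ( umask 077; printf '%s' "$b64" | base64 -d > "$path" ) || return 1
  echo "$path"
}
```

A step could then export KUBECONFIG="$(write_kubeconfig "$KUBECONFIG_B64")" before running any kubectl command.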
Runner Troubleshooting
Runner not picking up jobs
ssh hn-runner
systemctl list-units 'actions.runner.*' # Check runner services
systemctl restart 'actions.runner.*.service' # Restart if needed (quoted so the shell does not expand the glob)
Disk space issues
The runner at hn-runner accumulates Docker images and build cache:
ssh hn-runner
df -h / # Check disk usage
docker system df # Check Docker storage
docker system prune -a # Remove stopped containers, unused networks, and all unused images (asks for confirmation)
docker buildx prune -f # Clean buildx cache
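If cleanup becomes a recurring chore, the same commands can be gated on a usage threshold. This is a minimal sketch with hypothetical function names; the threshold is chosen by the caller, and -f is added so the prune does not wait for confirmation.

```shell
# disk_used_pct [MOUNT]
# Prints the used percentage of MOUNT (default /) as a bare integer.
disk_used_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# prune_if_full THRESHOLD
# Prunes Docker data only when root usage is at or above THRESHOLD percent.
prune_if_full() {
  local threshold="$1" used
  used=$(disk_used_pct /)
  if [ "$used" -ge "$threshold" ]; then
    docker system prune -af && docker buildx prune -f
  else
    echo "Disk at ${used}%, below ${threshold}%; skipping prune"
  fi
}
```

Running prune_if_full 80 from a scheduled job, for instance, would leave the build cache alone until the disk is actually filling up, which keeps builds fast in the common case.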
Build hangs or network timeouts
Docker builds have automatic retry (3 attempts). If builds consistently fail:
- Check runner network: curl -s https://registry.hansenexus.dev/api/v2.0/health
- Check Docker daemon: docker info
- Check buildx: docker buildx ls
Queued jobs not starting
Both runners (hn-runner-3, hn-runner-4) should be active:
ssh hn-runner
systemctl status actions.runner.Lenoux01-hn-monorepo.hn-runner-3.service
systemctl status actions.runner.Lenoux01-hn-monorepo.hn-runner-4.service
If runners show transient SocketException or Operation canceled errors in _diag/Runner_*.log, these are normal GitHub Actions long-polling timeouts — not actual failures.
Manual workflow dispatch
gh workflow run deploy.yml --field app=lexilink