CI/CD Runbook
Last reviewed: 2026-04-21
Maintained by: Engineering
This file is the operational guide for GitHub Actions and deploy troubleshooting.
Workflow Overview
The repo currently uses:
- .github/workflows/ci.yml for validation
- .github/workflows/deploy.yml for manual deployment promotion
Current delivery model:
- feature branches merge into main
- CI runs automatically on pushes to main and feature branches, and on pull requests to main
- deploys do not happen automatically
- Deploy is triggered manually with a target environment: staging or production
What a Failure in ci.yml Means
Install, Typecheck, Lint
If this job fails, the most common causes are:
- the lockfile and package.json are out of sync
- a TypeScript error exists in the API or web app
- an ESLint error exists
Local check:
pnpm install
pnpm run lint
pnpm run check:api
pnpm run check:manager-desk
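The commands above can be wrapped in one function that stops at the first failure, which makes it easier to see which cause applies. This is a minimal sketch, not a copy of the workflow's steps: the function name ci_static_checks is hypothetical, and the --frozen-lockfile flag is an assumption about how CI installs dependencies. With that flag, pnpm fails when the lockfile disagrees with package.json instead of rewriting it.

```shell
# Hypothetical helper: run the static checks in order and stop at the first failure.
ci_static_checks() {
  # Assumption: CI installs with a frozen lockfile, so a lockfile that is out of
  # sync with package.json fails here instead of being silently updated.
  pnpm install --frozen-lockfile &&
    pnpm run lint &&
    pnpm run check:api &&
    pnpm run check:manager-desk
}
```

Calling ci_static_checks from the repository root returns a non-zero status at the first failing step, so the last command printed in the terminal is the one to investigate.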
Tests
If this job fails, the most common causes are:
- a pure utility or configuration contract changed without updating tests
- a new test file is failing under the Node test runner
- a workspace test command was broken or removed
Local check:
pnpm test
Database Smoke Test
If this job fails, the problem is usually:
- a new migration does not work on an empty database
- the seed script is not aligned with the current schema
- migrations are not idempotent when db:migrate is run again
Local check:
docker compose up -d postgres
pnpm run db:migrate
pnpm run seed:demo
pnpm run db:migrate
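The second db:migrate run in the sequence above is what exercises idempotency. A minimal sketch of that idea, assuming nothing about how the CI job is actually written: the helper name check_idempotent is hypothetical, and so are the DATABASE_URL variable and the public schema used in the example call. The helper runs a command twice and confirms that a probe command reports the same state after each run.

```shell
# Hypothetical helper: run an "apply" command twice and confirm that a "probe"
# command reports the same state after each run.
check_idempotent() {
  apply_cmd=$1
  probe_cmd=$2
  sh -c "$apply_cmd" || return 1           # first run, e.g. against an empty database
  first=$(sh -c "$probe_cmd") || return 1
  sh -c "$apply_cmd" || return 1           # the second run must also succeed
  second=$(sh -c "$probe_cmd") || return 1
  [ "$first" = "$second" ]                 # and must leave the probed state unchanged
}

# Example call (not executed here). DATABASE_URL and the "public" schema are
# assumptions, not names taken from this runbook:
# check_idempotent "pnpm run db:migrate" \
#   'psql "$DATABASE_URL" -Atc "select table_name from information_schema.tables where table_schema = '\''public'\'' order by 1"'
```

The probe only needs to print something deterministic. A table listing is coarse, but it is enough to catch a migration that fails or creates objects again on the second run.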
Migration Drift Check
If this job fails, it means the committed schema snapshot no longer matches the schema produced by the current migrations.
Typical causes:
- a new migration was added but database/schema.snapshot.sql was not refreshed
- a migration was edited after the snapshot was generated
- the schema dump normalization logic needs to account for a new deterministic pg_dump output line
Local check:
docker compose up -d postgres
pnpm run db:migrate
pnpm run db:schema:check
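The drift check boils down to "dump, normalize, diff". The sketch below illustrates that shape only. The real db:schema:check script is not shown in this runbook, so the function names, the normalization rules, and the DATABASE_URL variable are all assumptions.

```shell
# Hypothetical normalizer: drop blank lines, SQL comments, and lines that start
# with a backslash (psql meta-commands) so only schema statements are compared.
normalize_schema() {
  sed -e '/^[[:space:]]*$/d' -e '/^--/d' -e '/^\\/d'
}

# Hypothetical check: compare a fresh schema dump with the committed snapshot.
# Assumes DATABASE_URL points at the database that db:migrate just updated.
schema_drift_check() {
  snapshot=${1:-database/schema.snapshot.sql}
  expected=$(mktemp) || return 2
  actual=$(mktemp) || return 2
  normalize_schema < "$snapshot" > "$expected"
  pg_dump --schema-only --no-owner --no-privileges "$DATABASE_URL" |
    normalize_schema > "$actual"
  diff -u "$expected" "$actual"
  status=$?
  rm -f "$expected" "$actual"
  return "$status"
}
```

Normalizing both sides means that a new line kind in the dump output only needs one new sed rule, which matches the third cause listed above.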
What a Failure in deploy.yml Means
Staging Deploy
If the manual staging deploy fails:
- the Vercel token or project config may be wrong
- the Render service or environment configuration may be wrong
- the target environment may be missing required secrets or vars
- the environment contract may fail validation before the deploy steps run
Production Deploy
If the manual production deploy fails:
- the same causes as staging apply
- production may also have stricter values or approval requirements
- the environment contract may fail validation before the deploy steps run
Manual Troubleshooting Order
When a deploy fails, go through this order:
- check whether the CI workflow was green
- check the Validate environment contract step inside the deploy job
- check whether the DB smoke test passed
- check GitHub Environment vars and secrets
- check Vercel project IDs and current API host configuration
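The first and fourth checks can be done from a terminal with the GitHub CLI. This is a sketch under assumptions: it requires an authenticated gh, the helper names are hypothetical, and the environment name passed to the second helper must match the GitHub Environment name exactly as configured in the repository settings.

```shell
# Hypothetical helper: print the conclusion of the latest CI run on a branch.
latest_ci_conclusion() {
  branch=${1:-main}
  gh run list --workflow ci.yml --branch "$branch" --limit 1 \
    --json conclusion --jq '.[0].conclusion'
}

# Hypothetical helper: list the variable and secret names configured for one
# GitHub Environment. gh lists secret names only, never their values.
list_environment_config() {
  environment=$1
  gh variable list --env "$environment" &&
    gh secret list --env "$environment"
}
```

For example, latest_ci_conclusion main shows whether the latest CI run on main succeeded, and list_environment_config with the staging environment's name shows which names are configured, so that a missing var or secret stands out.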
QA Checklist Before Merging Into main
At minimum:
- Install, Typecheck, Lint is green
- Tests is green
- Database Smoke Test is green
- Migration Drift Check is green
- if there are schema changes, migration and seed were tested
- if there are schema changes, database/schema.snapshot.sql was refreshed
- if there are deploy-affecting env changes, Staging and Production GitHub Environments were updated
Manual Promotion Flow
Use this order:
- finish work on a feature branch
- merge into main
- wait for CI to go green
- manually run Deploy with target staging
- test in staging
- manually run Deploy with target production
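The two manual Deploy runs can also be dispatched with the GitHub CLI instead of the Actions UI. This is a minimal sketch: the helper name run_deploy is hypothetical, and the input name environment is an assumption about how deploy.yml declares its workflow_dispatch input. Check the workflow file for the real input name before using it.

```shell
# Hypothetical helper: dispatch the Deploy workflow for one target environment.
# Assumption: deploy.yml declares a workflow_dispatch input named "environment".
run_deploy() {
  target=$1
  case "$target" in
    staging|production) ;;
    *)
      echo "unknown target: $target (expected staging or production)" >&2
      return 2
      ;;
  esac
  gh workflow run deploy.yml --ref main -f "environment=$target"
}
```

Rejecting unknown targets before calling gh keeps a typo from dispatching a run that would only fail later in the deploy job.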
Deploy Assumptions
Vercel
The deploy workflow assumes:
- valid VERCEL_TOKEN
- valid VERCEL_ORG_ID
- valid VERCEL_PROJECT_ID_MANAGER_DESK
- apps/manager-desk is connected to the Manager Desk Vercel project
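A presence check for the three values above can be sketched as follows. The helper name require_env is hypothetical and this is not the workflow's actual contract validation. It only confirms that each variable is set and non-empty, and cannot tell whether a token or ID is actually valid.

```shell
# Hypothetical helper: report every named variable that is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"   # indirect lookup of the variable called $name
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call with the names listed above (not executed here):
# require_env VERCEL_TOKEN VERCEL_ORG_ID VERCEL_PROJECT_ID_MANAGER_DESK
```

The loop checks every name before returning, so a single run reports all missing values at once.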
Render
The deploy workflow assumes:
- Render is the intended API hosting target in the docs portal
- hosted API configuration is present in the correct environment
- any legacy Railway-specific workflow wiring should not be treated as the current hosting standard
When You Must Update Documentation
If you change:
- CI jobs
- branch flow
- deployment targets
- required environment fields
then also update: