rfc-app/deploy/RUNBOOK.md
Ben Stull 36635049c7 Slice 8: v1 ships — integration coverage, runbook, spec corrections
- Five new integration test files raise the suite from 75 to 96 green:
  test_hygiene_vertical (7), test_branch_path_routing (4),
  test_metadata_pr_merge (3), test_cache_bootstrap (4), test_e2e_smoke
  (3). The smoke test walks propose → super-draft → edit branch →
  body-edit PR → graduate → active-RFC PR → merge → notification →
  hygiene-sweep deletion end-to-end.
- deploy/RUNBOOK.md replaces the prior DEPLOY.md stub as a real
  runbook: prerequisites, first-time bring-up, day-2 ops (logs, DB
  backup, secret rotation, the §12 hygiene cadence), rollback shape,
  troubleshooting table.
- backend/.env.example grows the SMTP block, HYGIENE_TICK_SECONDS,
  and WEBHOOK_EMAIL_BOUNCE_SECRET with inline commentary.
- README points to RUNBOOK.md; the "what the build lets you do"
  section adds Slices 7 and 8.
- docs/DEV.md gets a Slice 8 — shipped section; the "Next slice"
  footer becomes the v1-complete epitaph.
- SPEC corrections per the §19.3 working agreement: §10.7 names the
  shared §12 sweep; §12 names the bot as actuator and the per-user
  branch_chat_seen preservation contract; §19.1 marks v1 complete
  and records Slice 8; the five §19.2 candidates Slice 8 folded in
  are marked settled with pointers at the resolution.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 04:14:50 -07:00

Runbook

Single-host deployment of the RFC app at rfc.wiggleverse.org, sharing infrastructure with git.wiggleverse.org (same Gitea instance, same nginx, same Let's Encrypt). The shape matches §4.2: one process, one SQLite file, no separate worker.

Bring-up order: host prep → Gitea side (bot, OAuth, meta repo) → app side (code, venv, build, .env) → web server side (nginx, certbot) → systemd → smoke test.

Every step is idempotent or no-op-on-rerun, so re-running this runbook to recover from a partial install is safe.


0. Prerequisites

  • Ubuntu/Debian-style host with nginx and certbot already serving git.wiggleverse.org over HTTPS.
  • DNS: an A record for rfc.wiggleverse.org pointing at the same IP as git.wiggleverse.org.
  • Python 3.13 available system-wide on the host. Node 20+ available wherever the frontend is built (your laptop or the host). Node is needed only for npm run build; the build output is what runs in production, so Node isn't needed at runtime.
  • git, openssl, and rsync on the host.
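A quick preflight can confirm the tools above are on the PATH before starting. This is a minimal sketch: the missing_tools helper is hypothetical and not part of the repo, and the command names in the usage line are assumed to match how the tools are installed on your host.

```shell
# Hypothetical helper (not part of the repo): print each named command
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage on the host: empty output means every listed tool was found.
# missing_tools python3 git openssl rsync nginx certbot
```

The check only proves a command exists; it does not verify versions, so confirm the Python version separately with python3 --version.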

1. First-time bring-up

1.1 Host prep

Create the system user and the install directory.

sudo useradd --system --shell /usr/sbin/nologin --home-dir /opt/rfc-app rfc-app
sudo mkdir -p /opt/rfc-app
sudo chown rfc-app:rfc-app /opt/rfc-app

Clone the repo. HTTPS, since we don't push from the server.

sudo -u rfc-app git clone https://git.wiggleverse.org/ben.stull/rfc-app.git /opt/rfc-app

1.2 Gitea side

1.2.1 Create the bot service account. In the Gitea web UI, signed in as a Gitea admin:

  • Site Administration → User Accounts → Create User Account
  • Username: rfc-bot (or any name you prefer; it must match GITEA_BOT_USER in .env)
  • Email: anything sensible (e.g. rfc-bot@wiggleverse.org)
  • Password: random; you need it once, in the next step, to sign in as the bot and generate its token
  • Send email confirmation: off

Then sign in as the bot, open Settings → Applications → Generate New Token, name the token rfc-app, and grant these scopes:

  • write:repository
  • write:user
  • write:admin (needed because the bot creates per-RFC repos on graduation and deletes branches per §12 hygiene)

Copy the token. It goes into .env as GITEA_BOT_TOKEN.

1.2.2 Create the org and add the bot. The meta repo lives inside an org. In Gitea: Create Organization → wiggleverse. Then Members → Invite → rfc-bot → Owner.

1.2.3 Register the OAuth2 application. Site Administration → Integrations → OAuth2 Applications → Create Application:

  • Name: RFC App
  • Redirect URI: https://rfc.wiggleverse.org/auth/callback

Copy the client ID and client secret. They go into .env as OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET.

1.3 App side

1.3.1 Python venv + deps.

sudo -u rfc-app python3 -m venv /opt/rfc-app/backend/.venv
sudo -u rfc-app /opt/rfc-app/backend/.venv/bin/pip install \
    -r /opt/rfc-app/backend/requirements.txt

1.3.2 Build the frontend. Build locally and copy dist/ across:

# On your laptop:
cd frontend && npm install && npm run build
rsync -a dist/ ben.stull@<host>:/tmp/rfc-app-dist/
# On the host:
sudo -u rfc-app mkdir -p /opt/rfc-app/frontend/dist
sudo cp -r /tmp/rfc-app-dist/. /opt/rfc-app/frontend/dist/
sudo chown -R rfc-app:rfc-app /opt/rfc-app/frontend/dist

Or build on the host directly if Node is installed there:

cd /opt/rfc-app/frontend && sudo -u rfc-app npm install
sudo -u rfc-app npm run build

1.3.3 Write .env.

sudo -u rfc-app cp /opt/rfc-app/backend/.env.example /opt/rfc-app/backend/.env
sudoedit /opt/rfc-app/backend/.env    # set every value

Required values for production (see .env.example for the comments on each field):

GITEA_URL=https://git.wiggleverse.org
GITEA_BOT_USER=rfc-bot
GITEA_BOT_TOKEN=<from 1.2.1>
GITEA_ORG=wiggleverse
META_REPO=meta

OAUTH_CLIENT_ID=<from 1.2.3>
OAUTH_CLIENT_SECRET=<from 1.2.3>

APP_URL=https://rfc.wiggleverse.org
SECRET_KEY=<openssl rand -hex 32>
OWNER_GITEA_LOGIN=ben.stull
GITEA_WEBHOOK_SECRET=<openssl rand -hex 32>

DATABASE_PATH=/opt/rfc-app/backend/data/rfc-app.db
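Before starting the service, it can help to confirm that none of the required keys was left out. The sketch below assumes the file uses plain KEY=value lines as shown above; the missing_env_keys helper is hypothetical and not part of the repo. It detects only missing or empty values, so a placeholder left in place will still pass.

```shell
# Hypothetical helper (not part of the repo): print every key from the
# list that is absent from the env file or has an empty value.
missing_env_keys() {
    env_file=$1
    shift
    for key in "$@"; do
        # A key counts as set when a line reads KEY=<at least one character>.
        grep -Eq "^${key}=.+" "$env_file" || printf '%s\n' "$key"
    done
}

# Usage, from a shell that can read the file (empty output means all set):
# missing_env_keys /opt/rfc-app/backend/.env \
#     GITEA_URL GITEA_BOT_USER GITEA_BOT_TOKEN GITEA_ORG META_REPO \
#     OAUTH_CLIENT_ID OAUTH_CLIENT_SECRET APP_URL SECRET_KEY \
#     OWNER_GITEA_LOGIN GITEA_WEBHOOK_SECRET DATABASE_PATH
```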

For the §15.4 email loop, either leave SMTP_HOST unset (stdout fallback — fine for the very first deploy while a provider is being chosen) or fill in the SMTP block:

SMTP_HOST=smtp.postmarkapp.com
SMTP_PORT=587
SMTP_USER=<provider-supplied>
SMTP_PASSWORD=<provider-supplied>
SMTP_STARTTLS=1
EMAIL_FROM=notifications@wiggleverse.org
EMAIL_FROM_NAME=Wiggleverse

Configure SPF and DKIM records for wiggleverse.org with the chosen provider before sending real traffic. Every outbound email uses the single, non-spoofed envelope identity from §15.9; sending as the acting user's address would land everything in spam.

If a real bounce/complaint webhook lands later, set WEBHOOK_EMAIL_BOUNCE_SECRET to a long random string and configure the provider's webhook to send it as X-Webhook-Secret.

Lock the file down — it carries secrets:

sudo chmod 600 /opt/rfc-app/backend/.env
sudo chown rfc-app:rfc-app /opt/rfc-app/backend/.env
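To confirm the lock-down took effect, the mode and ownership can be read back. A minimal sketch, assuming GNU stat as shipped on Ubuntu/Debian; the describe_perms helper is hypothetical and not part of the repo.

```shell
# Hypothetical helper (not part of the repo): print a file's octal mode
# and owner:group. Relies on GNU stat's -c format option.
describe_perms() {
    stat -c '%a %U:%G' "$1"
}

# Usage on the host; after the chmod and chown above, the expected
# output is "600 rfc-app:rfc-app":
# describe_perms /opt/rfc-app/backend/.env
```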

1.3.4 Seed the meta repo. This creates wiggleverse/meta on Gitea, populates the hand-authored files, and registers the webhook against APP_URL/api/webhooks/gitea.

sudo -u rfc-app -H bash -c \
  'cd /opt/rfc-app/backend && .venv/bin/python ../scripts/seed_meta_repo.py'

Re-running is safe; every step is upsert-shaped.

1.4 Web server side

1.4.1 nginx vhost.

sudo cp /opt/rfc-app/deploy/nginx/rfc.wiggleverse.org.conf \
    /etc/nginx/sites-available/rfc.wiggleverse.org
sudo ln -sf /etc/nginx/sites-available/rfc.wiggleverse.org \
    /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

Make the nginx user able to read /opt/rfc-app/frontend/dist:

sudo usermod -a -G rfc-app www-data
sudo chmod -R g+rX /opt/rfc-app/frontend/dist
sudo systemctl reload nginx

1.4.2 Let's Encrypt cert.

sudo certbot --nginx -d rfc.wiggleverse.org

1.5 systemd

sudo cp /opt/rfc-app/deploy/systemd/rfc-app.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now rfc-app
sudo systemctl status rfc-app

Watch the logs:

sudo journalctl -u rfc-app -f

Expected startup line:

RFC app started — meta repo wiggleverse/meta

1.6 Smoke test

In a browser at https://rfc.wiggleverse.org:

  1. The landing page renders (§14.1 — title, pitch, three-item deck, sign-in affordance).
  2. Click Sign in with Gitea → OAuth round-trip → catalog lands.
  3. Click + Propose New RFC, fill in title/slug/pitch, submit. The pending-ideas disclosure shows the new PR.
  4. As the owner, click the PR row, then Merge proposal. The catalog refreshes with the super-draft.
  5. Open /admin and confirm the four-tab home base loads (Users / Graduation queue / Audit log / Permission events).
  6. Open /settings/notifications and confirm the five sub-sections render (per-category email, digest cadence, quiet hours, watches, mute list).

If anything misfires, the troubleshooting section below covers the common failure modes.


2. Day-2 operations

2.1 Logs

sudo journalctl -u rfc-app -f                 # follow
sudo journalctl -u rfc-app --since "1 hour ago"
sudo journalctl -u rfc-app -p err             # errors only

The app logs at INFO level by default. Notable log lines to watch:

  • RFC app started — meta repo ... — startup completed.
  • reconciler: starting sweep / reconciler: sweep complete — the five-minute §4.1 safety-net pass.
  • digest tick failed / hygiene tick failed — a scheduler tick crashed; the next tick will retry but the underlying error wants a look. Stack trace lands next to the warning.
  • email (stdout fallback): to=... appears when SMTP_HOST is unset; the email loop is logging envelopes instead of sending.
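Since a failed tick retries on its own, failures are easy to miss. One way to spot them is to count the warning lines over a window. A minimal sketch, assuming the log text matches the lines quoted above; the count_tick_failures helper is hypothetical and not part of the repo.

```shell
# Hypothetical helper (not part of the repo): read log lines on stdin and
# print how many of them report a failed digest or hygiene tick.
count_tick_failures() {
    grep -Ec '(digest|hygiene) tick failed'
}

# Usage on the host:
# sudo journalctl -u rfc-app --since "1 day ago" | count_tick_failures
```

A non-zero count is the cue to read the stack traces next to those warnings.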

2.2 Database backup

The SQLite file at DATABASE_PATH carries every app-canonical row (users, threads, messages, watches, notifications, the audit log). The §4 cache rebuilds from Gitea, so losing the cached_* tables is recoverable. The app-canonical tables are not, so the backup is load-bearing.

Daily snapshot (cron, as rfc-app; needs the sqlite3 command-line tool on the host, which section 0 does not list):

0 3 * * * sqlite3 /opt/rfc-app/backend/data/rfc-app.db ".backup /opt/rfc-app/backend/data/backup-$(date +\%F).db"

Retention is your call; 30 daily snapshots is the easy default.
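The retention policy can be enforced with a small prune step. This is a sketch rather than a shipped script: the prune_backups helper is hypothetical, and it assumes snapshots follow the backup-*.db naming used by the cron line above and that GNU find is available.

```shell
# Hypothetical helper (not part of the repo): delete snapshot files named
# backup-*.db directly inside a directory once they are too old.
prune_backups() {
    backup_dir=$1
    keep_days=$2
    # find rounds file age down to whole days, so -mtime +N matches files
    # last modified at least N+1 days ago.
    find "$backup_dir" -maxdepth 1 -type f -name 'backup-*.db' \
        -mtime +"$keep_days" -delete
}

# Usage, for the 30-snapshot default:
# prune_backups /opt/rfc-app/backend/data 30
```

The name pattern keeps the live rfc-app.db out of reach even though it shares the directory with the snapshots.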

Restore a snapshot:

sudo systemctl stop rfc-app
sudo -u rfc-app cp /opt/rfc-app/backend/data/backup-YYYY-MM-DD.db \
    /opt/rfc-app/backend/data/rfc-app.db
sudo systemctl start rfc-app

The reconciler will refill the cache from Gitea on first sweep.

2.3 Secret rotation

Rotating SECRET_KEY invalidates every active session cookie. To rotate:

NEW=$(openssl rand -hex 32)
echo "$NEW"                           # copy this value
sudoedit /opt/rfc-app/backend/.env    # paste it as the new SECRET_KEY
sudo systemctl restart rfc-app

Every signed-in user is bounced to the landing page and re-authenticates through OAuth. Existing email-unsubscribe URLs become invalid (per §15.4 they're signed against SECRET_KEY); a user can still unsubscribe through /settings/notifications.

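GITEA_BOT_TOKEN rotates without service disruption: generate a new token in Gitea as in 1.2.1, write the new value to .env, and restart. Old tokens stay valid in Gitea until revoked there, so revoke the old one once the app is running on the new token.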

GITEA_WEBHOOK_SECRET rotates in two steps: update the value in .env and restart, then update the secret in Gitea's webhook config to match. Webhooks are refused in the brief window between the two steps; the reconciler covers the gap.

2.4 The §12 hygiene timer cadence

The hygiene scheduler runs every HYGIENE_TICK_SECONDS (default 3600). Each tick checks cached_branches for two boundaries:

  • 30 days idle (no commits, no PR) — the branch flips to state='closed'. The branch stays in Gitea, but new chat is disabled per §8.4.
  • 90 days closed (or 90 days post-merge for a merged-PR branch) — the bot deletes the branch from Gitea. The cached_branches.state flips to deleted, and the audit log records the action with actor_user_id=NULL and on_behalf_of=<bot login> per §15.9.

Pinned branches (cached_branches.pinned=1) skip both passes. Per-user branch_chat_seen cursors survive branch deletion — chat history is app-canonical, not cached, and persists indefinitely.

If a branch needs to be kept alive past 30 days without commits, pin it from the admin surface (or directly: UPDATE cached_branches SET pinned = 1 WHERE rfc_slug = ? AND branch_name = ?).
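The two boundaries and the pin override can be read as a small decision rule. The sketch below is illustrative only and is not the app's implementation: the hygiene_action helper, the open state name, and the choice to treat each boundary as inclusive are all assumptions made for the example.

```shell
# Illustrative only; the real sweep lives in the app. Arguments are
# hypothetical inputs: state (open|closed), whole days spent idle or
# closed in that state, and the pinned flag (0|1).
hygiene_action() {
    state=$1
    days=$2
    pinned=$3
    if [ "$pinned" -eq 1 ]; then
        echo keep      # pinned branches skip both passes
    elif [ "$state" = open ] && [ "$days" -ge 30 ]; then
        echo close     # 30 days idle: branch stays in Gitea, chat is disabled
    elif [ "$state" = closed ] && [ "$days" -ge 90 ]; then
        echo delete    # 90 days closed: the bot deletes the branch
    else
        echo keep
    fi
}

# Usage:
# hygiene_action open 45 0      # prints: close
# hygiene_action closed 120 1   # prints: keep
```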

2.5 Updating after a push

sudo -u rfc-app git -C /opt/rfc-app pull
sudo -u rfc-app /opt/rfc-app/backend/.venv/bin/pip install \
    -r /opt/rfc-app/backend/requirements.txt
# Rebuild the frontend locally and rsync dist/ as in 1.3.2.
sudo systemctl restart rfc-app

The §5 schema migrations run on startup and are append-only, so there is no separate migration step: the restart applies them.


3. Rollback

If a deploy goes sideways, the rollback shape is:

sudo -u rfc-app git -C /opt/rfc-app log --oneline -10
sudo -u rfc-app git -C /opt/rfc-app checkout <prior-commit>
sudo -u rfc-app /opt/rfc-app/backend/.venv/bin/pip install \
    -r /opt/rfc-app/backend/requirements.txt
# Rebuild + rsync the frontend dist from the prior commit's state.
sudo systemctl restart rfc-app

The schema migrations are append-only, so rolling code back without rolling the schema back is the safe default. If a migration introduced a column the new code requires, the old code ignores the extra column — SQLite reads the rows fine.

If the database itself got into a bad state (a botched manual UPDATE, say), restore from the most recent backup per section 2.2 of this runbook.


4. Troubleshooting

  • systemctl status rfc-app shows RuntimeError: Required environment variable ... is not set. The .env is missing a value, or EnvironmentFile= in the systemd unit isn't finding it. Confirm /opt/rfc-app/backend/.env exists and is mode 0600 owned by rfc-app.
  • OAuth callback returns "Invalid state". The redirect URI in Gitea must match APP_URL/auth/callback exactly. Confirm it's https://rfc.wiggleverse.org/auth/callback.
  • The catalog stays empty after a merge. Check the webhook: journalctl -u rfc-app | grep webhook. Gitea's Settings → Webhooks → Recent Deliveries on the meta repo shows the delivery status; the reconciler will catch up within 5 minutes anyway.
  • 502 Bad Gateway on /api/* or /auth/*. uvicorn isn't running or isn't bound to 127.0.0.1:8000. Check systemctl status rfc-app.
  • 403 from nginx on static assets. The nginx user can't read /opt/rfc-app/frontend/dist. Apply the usermod and chmod from 1.4.1.
  • OAuth works, but the user can't propose. The users row was created with role contributor; only OWNER_GITEA_LOGIN's login gets owner on first sign-in. Confirm .env has the right value and you signed in with that account.
  • Email isn't going out and no error logs. Most likely SMTP_HOST is unset; the stdout fallback is in play and envelopes are in the journal as email (stdout fallback): .... Set the SMTP block per section 1.3.3 of this runbook to enable real sends.
  • The §12 hygiene sweep isn't deleting an obviously stale branch. Confirm cached_branches.pinned = 0 for the row, and that last_commit_at (or the joined merged_at) actually predates the 90-day cutoff. The actions audit log carries every hygiene gesture with action_kind IN ('close_idle_branch', 'delete_stale_branch', 'delete_post_merge_branch').