Slice 8: v1 ships — integration coverage, runbook, spec corrections

- Five new integration test files raise the suite from 75 to 96 green:
  test_hygiene_vertical (7), test_branch_path_routing (4),
  test_metadata_pr_merge (3), test_cache_bootstrap (4), test_e2e_smoke
  (3). The smoke test walks propose → super-draft → edit branch →
  body-edit PR → graduate → active-RFC PR → merge → notification →
  hygiene-sweep deletion end-to-end.
- deploy/RUNBOOK.md replaces the prior DEPLOY.md stub as a real
  runbook: prerequisites, first-time bring-up, day-2 ops (logs, DB
  backup, secret rotation, the §12 hygiene cadence), rollback shape,
  troubleshooting table.
- backend/.env.example grows the SMTP block, HYGIENE_TICK_SECONDS,
  and WEBHOOK_EMAIL_BOUNCE_SECRET with inline commentary.
- README points to RUNBOOK.md; the "what the build lets you do"
  section adds Slices 7 and 8.
- docs/DEV.md gets a Slice 8 — shipped section; the "Next slice"
  footer becomes the v1-complete epitaph.
- SPEC corrections per the §19.3 working agreement: §10.7 names the
  shared §12 sweep; §12 names the bot as actuator and the per-user
  branch_chat_seen preservation contract; §19.1 marks v1 complete
  and records Slice 8; the five §19.2 candidates Slice 8 folded in
  are marked settled with pointers at the resolution.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ben Stull
2026-05-25 04:14:50 -07:00
parent 1a0c4428af
commit 36635049c7
11 changed files with 1585 additions and 410 deletions
@@ -0,0 +1,408 @@
# Runbook
Single-host deployment of the RFC app at `rfc.wiggleverse.org`, sharing
infrastructure with `git.wiggleverse.org` (same Gitea instance, same nginx,
same Let's Encrypt). The shape matches §4.2: one process, one SQLite file,
no separate worker.
Bring-up order: host prep → Gitea side (bot, OAuth, meta repo) → app side
(code, venv, build, .env) → web server side (nginx, certbot) → systemd →
smoke test.
Every step is idempotent or no-op-on-rerun, so re-running this runbook to
recover from a partial install is safe.
---
## 0. Prerequisites
- Ubuntu/Debian-style host with nginx and certbot already serving
`git.wiggleverse.org` over HTTPS.
- DNS: an `A` record for `rfc.wiggleverse.org` pointing at the same IP as
`git.wiggleverse.org`.
- Python 3.13 available system-wide. Node 20+ available (for `npm run
build` once; the build output is what runs in production — Node isn't
needed at runtime).
- `git`, `openssl`, and `rsync` on the host.
---
## 1. First-time bring-up
### 1.1 Host prep
Create the system user and the install directory.
```sh
id -u rfc-app >/dev/null 2>&1 || \
sudo useradd --system --shell /usr/sbin/nologin --home-dir /opt/rfc-app rfc-app
sudo mkdir -p /opt/rfc-app
sudo chown rfc-app:rfc-app /opt/rfc-app
```
Clone the repo. HTTPS, since we don't push from the server.
```sh
[ -d /opt/rfc-app/.git ] || \
sudo -u rfc-app git clone https://git.wiggleverse.org/ben.stull/rfc-app.git /opt/rfc-app
```
### 1.2 Gitea side
**1.2.1 Create the bot service account.** In the Gitea web UI, signed in
as a Gitea admin:
- **Site Administration → User Accounts → Create User Account**
- Username: `rfc-bot` (any name works, as long as `GITEA_BOT_USER` in `.env` matches it)
- Email: anything sensible (e.g. `rfc-bot@wiggleverse.org`)
- Password: random, you won't use it interactively
- Send email confirmation: off
Then sign in as the bot, open **Settings → Applications → Generate New
Token**, name it `rfc-app`, grant scopes:
- `write:repository`
- `write:user`
- `write:admin` (needed because the bot creates per-RFC repos on
graduation and deletes branches per §12 hygiene)
Copy the token. It goes into `.env` as `GITEA_BOT_TOKEN`.
**1.2.2 Create the org and add the bot.** The meta repo lives inside an
org. In Gitea: **Create Organization → wiggleverse**. Then **Members →
Invite → rfc-bot → Owner**.
**1.2.3 Register the OAuth2 application.** **Site Administration →
Integrations → OAuth2 Applications → Create Application**:
- Name: `RFC App`
- Redirect URI: `https://rfc.wiggleverse.org/auth/callback`
Copy the client ID and client secret. They go into `.env`.
### 1.3 App side
**1.3.1 Python venv + deps.**
```sh
sudo -u rfc-app python3 -m venv /opt/rfc-app/backend/.venv
sudo -u rfc-app /opt/rfc-app/backend/.venv/bin/pip install \
-r /opt/rfc-app/backend/requirements.txt
```
**1.3.2 Build the frontend.** Build locally and copy `dist/` across:
```sh
# On your laptop:
cd frontend && npm install && npm run build
rsync -a dist/ ben.stull@<host>:/tmp/rfc-app-dist/
# On the host:
sudo -u rfc-app mkdir -p /opt/rfc-app/frontend/dist
sudo cp -r /tmp/rfc-app-dist/. /opt/rfc-app/frontend/dist/
sudo chown -R rfc-app:rfc-app /opt/rfc-app/frontend/dist
```
Or build on the host directly if Node is installed there:
```sh
cd /opt/rfc-app/frontend && sudo -u rfc-app npm install
sudo -u rfc-app npm run build
```
**1.3.3 Write `.env`.**
```sh
sudo -u rfc-app cp /opt/rfc-app/backend/.env.example /opt/rfc-app/backend/.env
sudoedit /opt/rfc-app/backend/.env # set every value
```
Required values for production (see `.env.example` for the comments on
each field):
```ini
GITEA_URL=https://git.wiggleverse.org
GITEA_BOT_USER=rfc-bot
GITEA_BOT_TOKEN=<from 1.2.1>
GITEA_ORG=wiggleverse
META_REPO=meta
OAUTH_CLIENT_ID=<from 1.2.3>
OAUTH_CLIENT_SECRET=<from 1.2.3>
APP_URL=https://rfc.wiggleverse.org
SECRET_KEY=<openssl rand -hex 32>
OWNER_GITEA_LOGIN=ben.stull
GITEA_WEBHOOK_SECRET=<openssl rand -hex 32>
DATABASE_PATH=/opt/rfc-app/backend/data/rfc-app.db
```
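A quick pre-flight check can catch a missing or still-templated value before systemd does. The sketch below is not part of the repo: `parse_env` and `missing_keys` are hypothetical helpers, the key list is copied from the block above, and the parser assumes plain `KEY=VALUE` lines with no quoting or `export` prefixes.

```python
# Keys this runbook lists as required; the app's real loader may check more.
REQUIRED_KEYS = [
    "GITEA_URL", "GITEA_BOT_USER", "GITEA_BOT_TOKEN", "GITEA_ORG", "META_REPO",
    "OAUTH_CLIENT_ID", "OAUTH_CLIENT_SECRET", "APP_URL", "SECRET_KEY",
    "OWNER_GITEA_LOGIN", "GITEA_WEBHOOK_SECRET", "DATABASE_PATH",
]


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str) -> list:
    """Return required keys that are absent, empty, or still a <placeholder>."""
    values = parse_env(text)
    missing = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(key)
    return missing
```

Feeding it the contents of `/opt/rfc-app/backend/.env` and getting an empty list back suggests the file is complete; it says nothing about whether the values are correct.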
For the §15.4 email loop, either leave `SMTP_HOST` unset (stdout
fallback — fine for the very first deploy while a provider is being
chosen) or fill in the SMTP block:
```ini
SMTP_HOST=smtp.postmarkapp.com
SMTP_PORT=587
SMTP_USER=<provider-supplied>
SMTP_PASSWORD=<provider-supplied>
SMTP_STARTTLS=1
EMAIL_FROM=notifications@wiggleverse.org
EMAIL_FROM_NAME=Wiggleverse
```
Configure SPF and DKIM records for `wiggleverse.org` with the chosen
provider before sending real traffic. The single non-spoofing envelope
identity per §15.9 is what every outbound email uses; spoofing the
actor's address would land everything in spam.
If a real bounce/complaint webhook lands later, set
`WEBHOOK_EMAIL_BOUNCE_SECRET` to a long random string and configure the
provider's webhook to inject it as `X-Webhook-Secret`.
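For orientation, a shared-secret header check of this kind usually comes down to a constant-time comparison. This is a minimal sketch under assumptions: `bounce_secret_ok` is a hypothetical name, and the app's actual handler is not shown in this runbook.

```python
import hmac
from typing import Optional


def bounce_secret_ok(header_value: Optional[str], expected: str) -> bool:
    """Return True only if X-Webhook-Secret matches the configured secret."""
    if not expected or header_value is None:
        # An unset secret or a missing header never authenticates.
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(
        header_value.encode("utf-8"), expected.encode("utf-8")
    )
```

Rejecting requests when the secret is unset keeps the endpoint closed by default until `WEBHOOK_EMAIL_BOUNCE_SECRET` is configured.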
Lock the file down — it carries secrets:
```sh
sudo chmod 600 /opt/rfc-app/backend/.env
sudo chown rfc-app:rfc-app /opt/rfc-app/backend/.env
```
**1.3.4 Seed the meta repo.** This creates `wiggleverse/meta` on Gitea,
populates the hand-authored files, and registers the webhook against
`APP_URL/api/webhooks/gitea`.
```sh
sudo -u rfc-app -H bash -c \
'cd /opt/rfc-app/backend && .venv/bin/python ../scripts/seed_meta_repo.py'
```
Re-running is safe; every step is upsert-shaped.
### 1.4 Web server side
**1.4.1 nginx vhost.**
```sh
sudo cp /opt/rfc-app/deploy/nginx/rfc.wiggleverse.org.conf \
/etc/nginx/sites-available/rfc.wiggleverse.org
sudo ln -sf /etc/nginx/sites-available/rfc.wiggleverse.org \
/etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
```
Make the nginx user able to read `/opt/rfc-app/frontend/dist`:
```sh
sudo usermod -a -G rfc-app www-data
sudo chmod -R g+rX /opt/rfc-app/frontend/dist
sudo systemctl reload nginx
```
**1.4.2 Let's Encrypt cert.**
```sh
sudo certbot --nginx -d rfc.wiggleverse.org
```
### 1.5 systemd
```sh
sudo cp /opt/rfc-app/deploy/systemd/rfc-app.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now rfc-app
sudo systemctl status rfc-app
```
Watch the logs:
```sh
sudo journalctl -u rfc-app -f
```
Expected startup line:
```
RFC app started — meta repo wiggleverse/meta
```
### 1.6 Smoke test
In a browser at `https://rfc.wiggleverse.org`:
1. The landing page renders (§14.1 — title, pitch, three-item deck,
sign-in affordance).
2. Click **Sign in with Gitea** → OAuth round-trip → catalog lands.
3. Click **+ Propose New RFC**, fill in title/slug/pitch, submit.
The pending-ideas disclosure shows the new PR.
4. As the owner, click the PR row, then **Merge proposal**. The
catalog refreshes with the super-draft.
5. Open `/admin` and confirm the four-tab home base loads
(Users / Graduation queue / Audit log / Permission events).
6. Open `/settings/notifications` and confirm the five sub-sections
render (per-category email, digest cadence, quiet hours, watches,
mute list).
If anything misfires, the troubleshooting section below covers the
common failure modes.
---
## 2. Day-2 operations
### 2.1 Logs
```sh
sudo journalctl -u rfc-app -f # follow
sudo journalctl -u rfc-app --since "1 hour ago"
sudo journalctl -u rfc-app -p err # errors only
```
The app logs at INFO level by default. Notable log lines to watch:
- `RFC app started — meta repo ...` — startup completed.
- `reconciler: starting sweep` / `reconciler: sweep complete` — the
five-minute §4.1 safety-net pass.
- `digest tick failed` / `hygiene tick failed` — a scheduler tick
crashed; the next tick will retry but the underlying error wants a
look. Stack trace lands next to the warning.
- `email (stdout fallback): to=...` — `SMTP_HOST` is unset and the
email loop is logging envelopes instead of sending.
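When skimming a long journal, it can help to tally these lines by kind. The sketch below matches on the substrings listed above; the category labels and the `classify` and `summarize` names are illustrative, not something the app provides.

```python
# Substrings taken from the notable log lines above; labels are illustrative.
PATTERNS = [
    ("RFC app started", "startup"),
    ("reconciler: starting sweep", "reconciler"),
    ("reconciler: sweep complete", "reconciler"),
    ("digest tick failed", "tick-failure"),
    ("hygiene tick failed", "tick-failure"),
    ("email (stdout fallback)", "email-fallback"),
]


def classify(line: str):
    """Return the label of the first matching pattern, or None."""
    for needle, label in PATTERNS:
        if needle in line:
            return label
    return None


def summarize(lines) -> dict:
    """Count how many lines fall under each label."""
    counts = {}
    for line in lines:
        label = classify(line)
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    return counts
```

One way to use it is to pipe `journalctl -u rfc-app -o cat --since today` into a small script that calls `summarize(sys.stdin)`. A nonzero `tick-failure` count is the cue to read the surrounding stack traces.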
### 2.2 Database backup
The SQLite file at `DATABASE_PATH` carries every app-canonical row
(users, threads, messages, watches, notifications, the audit log). The
§4 cache rebuilds from Gitea, so losing the cached_* tables is
recoverable. The app-canonical tables have no other source, so a backup
is load-bearing.
Daily snapshot (cron, as `rfc-app`):
```sh
0 3 * * * sqlite3 /opt/rfc-app/backend/data/rfc-app.db ".backup /opt/rfc-app/backend/data/backup-$(date +\%F).db"
```
Retention is your call; 30 daily snapshots is the easy default.
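If you settle on keeping the newest 30, the pruning decision can be made from the filenames alone. This is a sketch, assuming the `backup-YYYY-MM-DD.db` naming from the cron line above; `snapshots_to_prune` is a hypothetical helper and it only returns names, leaving the actual deletion to the caller.

```python
import re
from datetime import date

SNAPSHOT_RE = re.compile(r"^backup-(\d{4})-(\d{2})-(\d{2})\.db$")


def snapshots_to_prune(names, keep=30):
    """Return snapshot names to delete, keeping the `keep` newest by date."""
    dated = []
    for name in names:
        match = SNAPSHOT_RE.match(name)
        if not match:
            continue  # never touch files that are not daily snapshots
        try:
            day = date(*map(int, match.groups()))
        except ValueError:
            continue  # well-formed name, impossible date
        dated.append((day, name))
    dated.sort(reverse=True)  # newest first
    return [name for _, name in dated[keep:]]
```

Because the live `rfc-app.db` does not match the pattern, it can never end up in the returned list.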
Restore a snapshot:
```sh
sudo systemctl stop rfc-app
sudo -u rfc-app cp /opt/rfc-app/backend/data/backup-YYYY-MM-DD.db \
/opt/rfc-app/backend/data/rfc-app.db
sudo systemctl start rfc-app
```
The reconciler will refill the cache from Gitea on first sweep.
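Before copying a snapshot over the live file, it may be worth confirming that SQLite can read it. The sketch below uses SQLite's built-in `PRAGMA integrity_check`; `snapshot_is_healthy` is a hypothetical helper, and it assumes a path without `?` or `#` characters, since the path is embedded in a URI.

```python
import sqlite3


def snapshot_is_healthy(path: str) -> bool:
    """Open the snapshot read-only and run SQLite's integrity check."""
    try:
        conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    except sqlite3.Error:
        return False  # missing or unreadable file
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.Error:
        return False  # not a database, or too damaged to read
    finally:
        conn.close()
    # A healthy database reports a single row containing "ok".
    return rows == [("ok",)]
```

Opening with `mode=ro` means the check cannot modify the snapshot or create an empty database at a mistyped path.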
### 2.3 Secret rotation
Rotating `SECRET_KEY` invalidates every active session cookie. To rotate:
```sh
openssl rand -hex 32                 # prints the new key; copy it
sudoedit /opt/rfc-app/backend/.env   # set SECRET_KEY to the copied value
sudo systemctl restart rfc-app
```
Every signed-in user is bounced to the landing page and re-authenticates
through OAuth. Existing email-unsubscribe URLs become invalid (per
§15.4 they're signed against `SECRET_KEY`); a user can still unsubscribe
through `/settings/notifications`.
`GITEA_BOT_TOKEN` rotates without service disruption — write the new
value, restart. Old tokens stay valid in Gitea until revoked there.
`GITEA_WEBHOOK_SECRET` rotates in two steps: update the value in `.env`
and restart, then update the secret in Gitea's webhook config to match.
Webhooks are refused during the brief window between the two steps; the
reconciler covers the gap.
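The refusal follows from how Gitea signs deliveries: it sends an HMAC-SHA256 hex digest of the raw request body in the `X-Gitea-Signature` header, keyed with the webhook secret. A minimal verification sketch follows; `gitea_signature_ok` is a hypothetical name and the app's own handler may be shaped differently.

```python
import hashlib
import hmac


def gitea_signature_ok(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Check a delivery's X-Gitea-Signature against the configured secret."""
    expected = hmac.new(
        secret.encode("utf-8"), raw_body, hashlib.sha256
    ).hexdigest()
    received = signature_header.strip().lower()
    # Constant-time comparison; any secret mismatch yields a different digest.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

While `.env` and Gitea disagree on the secret, every delivery produces a digest mismatch like the one this function would reject.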
### 2.4 The §12 hygiene timer cadence
The hygiene scheduler runs every `HYGIENE_TICK_SECONDS` (default 3600).
Each tick checks `cached_branches` for two boundaries:
- **30 days idle** (no commits, no PR) — the branch flips to
`state='closed'`. The branch stays in Gitea, but new chat is
disabled per §8.4.
- **90 days closed** (or 90 days post-merge for a merged-PR branch) —
the bot deletes the branch from Gitea. The `cached_branches.state`
flips to `deleted`, and the audit log records the action with
`actor_user_id=NULL` and `on_behalf_of=<bot login>` per §15.9.
Pinned branches (`cached_branches.pinned=1`) skip both passes. Per-user
`branch_chat_seen` cursors survive branch deletion — chat history is
app-canonical, not cached, and persists indefinitely.
If a branch needs to be kept alive past 30 days without commits, pin
it from the admin surface (or directly: `UPDATE cached_branches SET
pinned = 1 WHERE rfc_slug = ? AND branch_name = ?`).
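The two boundaries and the pin exemption can be summarized as a small decision function. This is a sketch of the rules as described here, not the scheduler's code: the `has_pr` and `closed_or_merged_at` parameters and the `"open"` state value are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

IDLE_CLOSE_AFTER = timedelta(days=30)  # idle: no commits, no PR
DELETE_AFTER = timedelta(days=90)      # closed, or merged, this long ago


def hygiene_action(
    *,
    state: str,
    pinned: bool,
    has_pr: bool,
    last_commit_at: datetime,
    closed_or_merged_at: Optional[datetime],
    now: datetime,
) -> Optional[str]:
    """Return "close", "delete", or None for one cached_branches row."""
    if pinned or state == "deleted":
        return None  # pinned branches skip both passes
    if closed_or_merged_at is not None:
        # Second boundary: 90 days after closing or after the PR merged.
        if now - closed_or_merged_at >= DELETE_AFTER:
            return "delete"
        return None
    if not has_pr and now - last_commit_at >= IDLE_CLOSE_AFTER:
        # First boundary: 30 days with no commits and no PR.
        return "close"
    return None
```

Taken together, an unpinned branch with no activity goes at least 120 days from its last commit to deletion: 30 to close, then 90 more.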
### 2.5 Updating after a push
```sh
sudo -u rfc-app git -C /opt/rfc-app pull
sudo -u rfc-app /opt/rfc-app/backend/.venv/bin/pip install \
-r /opt/rfc-app/backend/requirements.txt
# Rebuild the frontend locally and rsync dist/ as in 1.3.2.
sudo systemctl restart rfc-app
```
The §5 schema migrations run on startup and are append-only. A restart
is the entire deploy.
---
## 3. Rollback
If a deploy goes sideways, the rollback shape is:
```sh
sudo -u rfc-app git -C /opt/rfc-app log --oneline -10
sudo -u rfc-app git -C /opt/rfc-app checkout <prior-commit>
sudo -u rfc-app /opt/rfc-app/backend/.venv/bin/pip install \
-r /opt/rfc-app/backend/requirements.txt
# Rebuild + rsync the frontend dist from the prior commit's state.
sudo systemctl restart rfc-app
```
The schema migrations are append-only, so rolling code back without
rolling the schema back is the safe default. If a migration introduced
a column the new code requires, the old code ignores the extra
column — SQLite reads the rows fine.
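The claim about extra columns is easy to check in isolation. The snippet below is an illustration only, using an in-memory database; the `login` and `digest_cadence` column names are hypothetical and are not taken from the app's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT)")
conn.execute("INSERT INTO users (login) VALUES ('ben.stull')")

# A later, append-only migration adds a column with a default.
conn.execute("ALTER TABLE users ADD COLUMN digest_cadence TEXT DEFAULT 'daily'")

# "Old" code names its columns explicitly, so the new column is invisible to it.
old_read = conn.execute("SELECT id, login FROM users").fetchall()

# "Old" code inserting without the new column still works thanks to the default.
conn.execute("INSERT INTO users (login) VALUES ('rfc-bot')")
new_rows = conn.execute(
    "SELECT login, digest_cadence FROM users ORDER BY id"
).fetchall()
```

The tolerance has limits: old code that unpacks `SELECT *` positionally, or a migration that adds a `NOT NULL` column without a default, would not roll back this cleanly.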
If the database itself got into a bad state (a botched manual UPDATE,
say), restore from the most recent backup as described in section 2.2 above.
---
## 4. Troubleshooting
- **`systemctl status rfc-app` shows `RuntimeError: Required environment
variable ... is not set`.** The `.env` is missing a value, or
`EnvironmentFile=` in the systemd unit isn't finding it. Confirm
`/opt/rfc-app/backend/.env` exists and is mode 0600 owned by
`rfc-app`.
- **OAuth callback returns "Invalid state".** The redirect URI in Gitea
must match `APP_URL/auth/callback` exactly. Confirm it's
`https://rfc.wiggleverse.org/auth/callback`.
- **The catalog stays empty after a merge.** Check the webhook:
`journalctl -u rfc-app | grep webhook`. Gitea's **Settings → Webhooks
→ Recent Deliveries** on the meta repo shows the delivery status; the
reconciler will catch up within 5 minutes anyway.
- **`502 Bad Gateway` on /api/\* or /auth/\*.** uvicorn isn't running
or isn't bound to `127.0.0.1:8000`. `systemctl status rfc-app`.
- **`403` from nginx on static assets.** The nginx user can't read
`/opt/rfc-app/frontend/dist`. Apply the `usermod` and `chmod` from 1.4.1.
- **OAuth works, but the user can't propose.** The `users` row was
created with role `contributor`; only `OWNER_GITEA_LOGIN`'s login
gets `owner` on first sign-in. Confirm `.env` has the right value
and you signed in with that account.
- **Email isn't going out and there are no error logs.** Most likely `SMTP_HOST`
is unset; the stdout fallback is in play and envelopes are in the
journal as `email (stdout fallback): ...`. Set the SMTP block from
1.3.3 to enable real sends.
- **The §12 hygiene sweep isn't deleting an obviously stale branch.**
Confirm `cached_branches.pinned = 0` for the row, and that
`last_commit_at` (or the joined `merged_at`) actually predates the
90-day cutoff. The `actions` audit log carries every hygiene gesture
with `action_kind IN ('close_idle_branch', 'delete_stale_branch',
'delete_post_merge_branch')`.
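To see whether the sweep has acted at all, a per-kind count over the audit log is a reasonable first query. The sketch below relies only on the `actions` table and `action_kind` column named above; `hygiene_counts` is a hypothetical helper, and the rest of the table's schema is not assumed.

```python
import sqlite3

HYGIENE_KINDS = (
    "close_idle_branch",
    "delete_stale_branch",
    "delete_post_merge_branch",
)


def hygiene_counts(conn: sqlite3.Connection) -> dict:
    """Count hygiene gestures per action_kind in the actions audit log."""
    placeholders = ", ".join("?" for _ in HYGIENE_KINDS)
    rows = conn.execute(
        "SELECT action_kind, COUNT(*) FROM actions "
        f"WHERE action_kind IN ({placeholders}) GROUP BY action_kind",
        HYGIENE_KINDS,
    ).fetchall()
    return dict(rows)
```

Against the live database, opening the connection read-only, for example `sqlite3.connect("file:/opt/rfc-app/backend/data/rfc-app.db?mode=ro", uri=True)`, avoids any chance of writing while the app is running. An empty result suggests the hygiene tick has never reached the action stage, which points back at the scheduler logs.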