# Runbook

Single-host deployment of the RFC app at `rfc.wiggleverse.org`, sharing
infrastructure with `git.wiggleverse.org` (same Gitea instance, same nginx,
same Let's Encrypt). The shape matches §4.2: one process, one SQLite file,
no separate worker.

Bring-up order: host prep → Gitea side (bot, OAuth, meta repo) → app side
(code, venv, build, .env) → web server side (nginx, certbot) → systemd →
smoke test.

Every step is idempotent or no-op-on-rerun, so re-running this runbook to
recover from a partial install is safe.

---

## 0. Prerequisites

- Ubuntu/Debian-style host with nginx and certbot already serving
  `git.wiggleverse.org` over HTTPS.
- DNS: an `A` record for `rfc.wiggleverse.org` pointing at the same IP as
  `git.wiggleverse.org`.
- Python 3.11+ available system-wide (the project has no `requires-python`
  pin; the current production VM runs 3.11 on Debian bookworm).
- Node 20+ available (for `npm run build` once; the build output is what
  runs in production, so Node isn't needed at runtime).
- `git`, `openssl`, and `rsync` on the host.

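
The prerequisites above can be checked before starting with a small preflight script. This is a minimal sketch, not something shipped in the repo: the function names `version_ge` and `preflight` are made up for illustration, and the version comparison assumes GNU `sort -V` (standard on Debian/Ubuntu). Node is only version-checked when present, since the frontend can be built on a laptop instead. Source the file and run `preflight`; a non-zero return means something is missing or too old.

```sh
#!/bin/sh
# Preflight sketch for the prerequisites list. Names are illustrative only.

# version_ge HAVE WANT: succeed when HAVE >= WANT in version order.
# Relies on `sort -V` from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# preflight: report each problem found; return non-zero if there is any.
preflight() {
    rc=0
    for tool in python3 git openssl rsync; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    if command -v python3 >/dev/null 2>&1; then
        py=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
        version_ge "$py" 3.11 || { echo "python3 $py is older than 3.11"; rc=1; }
    fi
    # Node is optional on the host (the build can happen elsewhere).
    if command -v node >/dev/null 2>&1; then
        nv=$(node --version | sed 's/^v//')
        version_ge "$nv" 20 || { echo "node $nv is older than 20"; rc=1; }
    fi
    return "$rc"
}
```
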

---

## 1. First-time bring-up

### 1.1 Host prep

Create the system user and the install directory.

```sh
sudo useradd --system --shell /usr/sbin/nologin --home-dir /opt/rfc-app rfc-app
sudo mkdir -p /opt/rfc-app
sudo chown rfc-app:rfc-app /opt/rfc-app
```

Clone the repo. HTTPS, since we don't push from the server.

```sh
sudo -u rfc-app git clone https://git.wiggleverse.org/ben.stull/rfc-app.git /opt/rfc-app
```

### 1.2 Gitea side

**1.2.1 Create the bot service account.** In the Gitea web UI, signed in
as a Gitea admin:

- **Site Administration → User Accounts → Create User Account**
- Username: `rfc-bot` (or whatever you want)
- Email: anything sensible (e.g. `rfc-bot@wiggleverse.org`)
- Password: random, you won't use it interactively
- Send email confirmation: off

Then sign in as the bot, open **Settings → Applications → Generate New
Token**, name it `rfc-app`, grant scopes:

- `write:repository`
- `write:user`
- `write:admin` (needed because the bot creates per-RFC repos on
  graduation and deletes branches per §12 hygiene)

Copy the token. It goes into `.env` as `GITEA_BOT_TOKEN`.

**1.2.2 Create the org and add the bot.** The meta repo lives inside an
org. In Gitea: **Create Organization → wiggleverse**. Then **Members →
Invite → rfc-bot → Owner**.

**1.2.3 Register the OAuth2 application.** **Site Administration →
Integrations → OAuth2 Applications → Create Application**:

- Name: `RFC App`
- Redirect URI: `https://rfc.wiggleverse.org/auth/callback`

Copy the client ID and client secret. They go into `.env`.

### 1.3 App side

**1.3.1 Python venv + deps.**

```sh
sudo -u rfc-app python3 -m venv /opt/rfc-app/backend/.venv
sudo -u rfc-app /opt/rfc-app/backend/.venv/bin/pip install \
    -r /opt/rfc-app/backend/requirements.txt
```

**1.3.2 Build the frontend.** Build locally and copy `dist/` across:

```sh
# On your laptop:
cd frontend && npm install && npm run build
rsync -a dist/ ben.stull@<host>:/tmp/rfc-app-dist/
# On the host:
sudo -u rfc-app mkdir -p /opt/rfc-app/frontend/dist
sudo cp -r /tmp/rfc-app-dist/. /opt/rfc-app/frontend/dist/
sudo chown -R rfc-app:rfc-app /opt/rfc-app/frontend/dist
```

Or build on the host directly if Node is installed there:

```sh
cd /opt/rfc-app/frontend && sudo -u rfc-app npm install
sudo -u rfc-app npm run build
```

**1.3.3 Write `.env`.**

```sh
sudo -u rfc-app cp /opt/rfc-app/backend/.env.example /opt/rfc-app/backend/.env
sudoedit /opt/rfc-app/backend/.env   # set every value
```

Required values for production (see `.env.example` for the comments on
each field):

```ini
GITEA_URL=https://git.wiggleverse.org
GITEA_BOT_USER=rfc-bot
GITEA_BOT_TOKEN=<from 1.2.1>
GITEA_ORG=wiggleverse
META_REPO=meta

OAUTH_CLIENT_ID=<from 1.2.3>
OAUTH_CLIENT_SECRET=<from 1.2.3>

APP_URL=https://rfc.wiggleverse.org
SECRET_KEY=<openssl rand -hex 32>
OWNER_GITEA_LOGIN=ben.stull
GITEA_WEBHOOK_SECRET=<openssl rand -hex 32>

DATABASE_PATH=/opt/rfc-app/backend/data/rfc-app.db
```

For the §15.4 email loop, either leave `SMTP_HOST` unset (stdout
fallback, fine for the very first deploy while a provider is being
chosen) or fill in the SMTP block:

```ini
SMTP_HOST=smtp.postmarkapp.com
SMTP_PORT=587
SMTP_USER=<provider-supplied>
SMTP_PASSWORD=<provider-supplied>
SMTP_STARTTLS=1
EMAIL_FROM=notifications@wiggleverse.org
EMAIL_FROM_NAME=Wiggleverse
```

Configure SPF and DKIM records for `wiggleverse.org` with the chosen
provider before sending real traffic. Every outbound email uses the
single non-spoofing envelope identity from §15.9; spoofing the actor's
address would land everything in spam.

If a real bounce/complaint webhook lands later, set
`WEBHOOK_EMAIL_BOUNCE_SECRET` to a long random string and configure the
provider's webhook to inject it as `X-Webhook-Secret`.

Lock the file down, since it carries secrets:

```sh
sudo chmod 600 /opt/rfc-app/backend/.env
sudo chown rfc-app:rfc-app /opt/rfc-app/backend/.env
```

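
Before moving on, it can help to confirm that every required key from 1.3.3 has a value and that the file mode is what the `chmod` above set. The sketch below is an illustration only: `missing_env_keys` and `env_file_locked_down` are hypothetical helper names, the key list is copied from the required-values block, and `stat -c` assumes GNU coreutils. It checks only that a value is present, so an unreplaced placeholder such as `<from 1.2.1>` would still pass. Run it as a user that can read the file; `missing_env_keys /opt/rfc-app/backend/.env` prints nothing when all keys are set.

```sh
# Keys listed as required for production in 1.3.3.
required_env_keys="GITEA_URL GITEA_BOT_USER GITEA_BOT_TOKEN GITEA_ORG META_REPO
OAUTH_CLIENT_ID OAUTH_CLIENT_SECRET APP_URL SECRET_KEY OWNER_GITEA_LOGIN
GITEA_WEBHOOK_SECRET DATABASE_PATH"

# missing_env_keys FILE: print every required key that is absent or empty.
missing_env_keys() {
    for key in $required_env_keys; do
        # Match KEY=<at least one character> at the start of a line.
        grep -Eq "^${key}=.+" "$1" || echo "$key"
    done
}

# env_file_locked_down FILE: succeed when the file mode is exactly 600.
env_file_locked_down() {
    [ "$(stat -c '%a' "$1")" = "600" ]
}
```
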

**1.3.4 Seed the meta repo.** This creates `wiggleverse/meta` on Gitea,
populates the hand-authored files, and registers the webhook against
`APP_URL/api/webhooks/gitea`.

```sh
sudo -u rfc-app -H bash -c \
    'cd /opt/rfc-app/backend && .venv/bin/python ../scripts/seed_meta_repo.py'
```

Re-running is safe; every step is upsert-shaped.

### 1.4 Web server side

**1.4.1 nginx vhost.**

```sh
sudo cp /opt/rfc-app/deploy/nginx/rfc.wiggleverse.org.conf \
    /etc/nginx/sites-available/rfc.wiggleverse.org
# -f replaces an existing link, so this step stays safe to re-run.
sudo ln -sf /etc/nginx/sites-available/rfc.wiggleverse.org \
    /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
```

Make the nginx user able to read `/opt/rfc-app/frontend/dist`:

```sh
sudo usermod -a -G rfc-app www-data
sudo chmod -R g+rX /opt/rfc-app/frontend/dist
sudo systemctl reload nginx
```

**1.4.2 Let's Encrypt cert.**

```sh
sudo certbot --nginx -d rfc.wiggleverse.org
```

### 1.5 systemd

```sh
sudo cp /opt/rfc-app/deploy/systemd/rfc-app.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now rfc-app
sudo systemctl status rfc-app
```

Watch the logs:

```sh
sudo journalctl -u rfc-app -f
```

Expected startup line:

```
RFC app started — meta repo wiggleverse/meta
```

### 1.6 Smoke test

In a browser at `https://rfc.wiggleverse.org`:

1. The landing page renders (§14.1: title, pitch, three-item deck,
   sign-in affordance).
2. Click **Sign in with Gitea** → OAuth round-trip → catalog lands.
3. Click **+ Propose New RFC**, fill in title/slug/pitch, submit.
   The pending-ideas disclosure shows the new PR.
4. As the owner, click the PR row, then **Merge proposal**. The
   catalog refreshes with the super-draft.
5. Open `/admin` and confirm the four-tab home base loads
   (Users / Graduation queue / Audit log / Permission events).
6. Open `/settings/notifications` and confirm the five sub-sections
   render (per-category email, digest cadence, quiet hours, watches,
   mute list).

If anything misfires, the troubleshooting section below covers the
common failure modes.

---

## 2. Day-2 operations

### 2.1 Logs

```sh
sudo journalctl -u rfc-app -f                     # follow
sudo journalctl -u rfc-app --since "1 hour ago"
sudo journalctl -u rfc-app -p err                 # errors only
```

The app logs at INFO level by default. Notable log lines to watch:

- `RFC app started — meta repo ...`: startup completed.
- `reconciler: starting sweep` / `reconciler: sweep complete`: the
  five-minute §4.1 safety-net pass.
- `digest tick failed` / `hygiene tick failed`: a scheduler tick
  crashed; the next tick will retry but the underlying error wants a
  look. Stack trace lands next to the warning.
- `email (stdout fallback): to=...`: `SMTP_HOST` is unset and the
  email loop is logging envelopes instead of sending.

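
For a quick health read, the notable lines above can be tallied from a window of journal output. The following is a rough sketch rather than part of the app: `summarize_journal` is a made-up helper name, and the match strings are simply copied from the list in this section, so they would need updating if the log wording changes.

```sh
# summarize_journal: read journal lines on stdin and count the notable
# messages. Counters that never match print as 0.
summarize_journal() {
    awk '
        /RFC app started/                         { started++ }
        /reconciler: sweep complete/              { sweeps++ }
        /digest tick failed|hygiene tick failed/  { failed++ }
        /email \(stdout fallback\)/               { fallback++ }
        END {
            printf "starts=%d sweeps=%d tick_failures=%d stdout_emails=%d\n",
                   started, sweeps, failed, fallback
        }
    '
}
```

One way to use it is `sudo journalctl -u rfc-app --since "1 hour ago" -o cat | summarize_journal`. A non-zero `tick_failures` is the cue to go read the stack trace in the journal.
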
### 2.2 Database backup

The SQLite file at `DATABASE_PATH` carries every app-canonical row
(users, threads, messages, watches, notifications, the audit log). The
§4 cache rebuilds from Gitea, so losing the `cached_*` tables is
recoverable, but the app-canonical tables aren't, so a backup is
load-bearing.

Daily snapshot (cron, as `rfc-app`):

```sh
0 3 * * * sqlite3 /opt/rfc-app/backend/data/rfc-app.db ".backup /opt/rfc-app/backend/data/backup-$(date +\%F).db"
```

Retention is your call; 30 daily snapshots is the easy default.

Restore a snapshot:

```sh
sudo systemctl stop rfc-app
sudo -u rfc-app cp /opt/rfc-app/backend/data/backup-YYYY-MM-DD.db \
    /opt/rfc-app/backend/data/rfc-app.db
sudo systemctl start rfc-app
```

The reconciler will refill the cache from Gitea on first sweep.

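
The 30-snapshot retention mentioned above is not enforced by the cron line itself, so old snapshots accumulate until something prunes them. A minimal sketch of one way to do that follows; `prune_backups` is a hypothetical helper, and it relies on the `-maxdepth` and `-delete` extensions available in GNU `find`. It matches only the `backup-*.db` naming used by the snapshot job, so the live `rfc-app.db` in the same directory is left alone.

```sh
# prune_backups DIR [DAYS]: delete backup-*.db files directly inside DIR
# whose modification time is more than DAYS days old (default 30).
# Deleted paths are printed so a cron mail shows what went away.
prune_backups() {
    find "$1" -maxdepth 1 -type f -name 'backup-*.db' \
        -mtime +"${2:-30}" -print -delete
}
```

Called as `prune_backups /opt/rfc-app/backend/data 30`, for example from a second cron entry that runs shortly after the snapshot.
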
### 2.3 Secret rotation

Rotating `SECRET_KEY` invalidates every active session cookie. To rotate:

```sh
openssl rand -hex 32                 # copy the output
sudoedit /opt/rfc-app/backend/.env   # paste it as the new SECRET_KEY value
sudo systemctl restart rfc-app
```

Every signed-in user is bounced to the landing page and re-authenticates
through OAuth. Existing email-unsubscribe URLs become invalid (per
§15.4 they're signed against `SECRET_KEY`); a user can still unsubscribe
through `/settings/notifications`.

`GITEA_BOT_TOKEN` rotates without service disruption: write the new
value, restart. Old tokens stay valid in Gitea until revoked there.

`GITEA_WEBHOOK_SECRET` rotates in two steps: update the value in `.env`
and restart, then update the secret in Gitea's webhook config to match.
Webhooks are refused during the brief window in between; the reconciler
covers it.

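
If editing `.env` by hand during a rotation feels error-prone, the swap can be scripted. The helper below is only a sketch under a few assumptions: `set_env_value` is an invented name, the file is assumed to hold plain `KEY=value` lines, and the value is assumed to contain no newline. It drops the old assignment and appends the new one at the end of the file, so line order changes, and it overwrites in place so the file keeps its existing owner and mode.

```sh
# set_env_value FILE KEY VALUE: remove any existing KEY=... line from FILE
# and append KEY=VALUE at the end.
set_env_value() {
    tmp=$(mktemp) || return 1
    # Keep every line except the old assignment (grep exits non-zero when
    # nothing is left, which is fine here).
    grep -v "^$2=" "$1" > "$tmp" || true
    printf '%s=%s\n' "$2" "$3" >> "$tmp"
    # Overwrite rather than move, so owner and permissions are unchanged.
    cat "$tmp" > "$1"
    rm -f "$tmp"
}
```

A rotation would then look like `set_env_value /opt/rfc-app/backend/.env SECRET_KEY "$(openssl rand -hex 32)"`, run as a user that can write the file, followed by `sudo systemctl restart rfc-app`.
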
### 2.4 The §12 hygiene timer cadence

The hygiene scheduler runs every `HYGIENE_TICK_SECONDS` (default 3600).
Each tick checks `cached_branches` for two boundaries:

- **30 days idle** (no commits, no PR): the branch flips to
  `state='closed'`. The branch stays in Gitea, but new chat is
  disabled per §8.4.
- **90 days closed** (or 90 days post-merge for a merged-PR branch):
  the bot deletes the branch from Gitea. The `cached_branches.state`
  flips to `deleted`, and the audit log records the action with
  `actor_user_id=NULL` and `on_behalf_of=<bot login>` per §15.9.

Pinned branches (`cached_branches.pinned=1`) skip both passes. Per-user
`branch_chat_seen` cursors survive branch deletion: chat history is
app-canonical, not cached, and persists indefinitely.

If a branch needs to be kept alive past 30 days without commits, pin
it from the admin surface (or directly: `UPDATE cached_branches SET
pinned = 1 WHERE rfc_slug = ? AND branch_name = ?`).

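
When checking by hand whether a branch has reached one of these boundaries, the date arithmetic is easy to get wrong. The sketch below shows the calculation only; it does not read the database. `days_between` and `past_cutoff` are illustrative names, GNU `date -d` is assumed, and the timestamp format stored in `cached_branches` is not specified here, so the inputs may need reshaping first.

```sh
# days_between START END: whole days from START to END, in UTC.
# Accepts anything GNU `date -d` parses, e.g. 2024-01-01 or an ISO timestamp.
days_between() {
    start=$(date -u -d "$1" +%s) || return 1
    end=$(date -u -d "$2" +%s) || return 1
    echo $(( (end - start) / 86400 ))
}

# past_cutoff TIMESTAMP DAYS [NOW]: succeed when TIMESTAMP is at least
# DAYS days before NOW (default: the current time).
past_cutoff() {
    [ "$(days_between "$1" "${3:-now}")" -ge "$2" ]
}
```

For instance, `past_cutoff 2024-01-01 30` succeeds once a branch whose last activity was on that date has been idle for the 30-day window, and the same call with `90` covers the deletion window.
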
### 2.5 Updating after a push

```sh
sudo -u rfc-app git -C /opt/rfc-app pull
sudo -u rfc-app /opt/rfc-app/backend/.venv/bin/pip install \
    -r /opt/rfc-app/backend/requirements.txt
# Rebuild the frontend locally and rsync dist/ as in 1.3.2.
sudo systemctl restart rfc-app
```

The §5 schema migrations run on startup and are append-only. A restart
is the entire deploy.

---

## 3. Rollback

If a deploy goes sideways, the rollback shape is:

```sh
sudo -u rfc-app git -C /opt/rfc-app log --oneline -10
sudo -u rfc-app git -C /opt/rfc-app checkout <prior-commit>
sudo -u rfc-app /opt/rfc-app/backend/.venv/bin/pip install \
    -r /opt/rfc-app/backend/requirements.txt
# Rebuild + rsync the frontend dist from the prior commit's state.
sudo systemctl restart rfc-app
```

The schema migrations are append-only, so rolling code back without
rolling the schema back is the safe default. If a migration introduced
a column the new code requires, the old code simply ignores the extra
column; SQLite reads the rows fine.

If the database itself got into a bad state (a botched manual UPDATE,
say), restore from the most recent backup as in 2.2.

---

## 4. Troubleshooting

- **`systemctl status rfc-app` shows `RuntimeError: Required environment
  variable ... is not set`.** The `.env` is missing a value, or
  `EnvironmentFile=` in the systemd unit isn't finding it. Confirm
  `/opt/rfc-app/backend/.env` exists and is mode 0600 owned by
  `rfc-app`.
- **OAuth callback returns "Invalid state".** The redirect URI in Gitea
  must match `APP_URL/auth/callback` exactly. Confirm it's
  `https://rfc.wiggleverse.org/auth/callback`.
- **The catalog stays empty after a merge.** Check the webhook:
  `journalctl -u rfc-app | grep webhook`. Gitea's **Settings → Webhooks
  → Recent Deliveries** on the meta repo shows the delivery status; the
  reconciler will catch up within 5 minutes anyway.
- **`502 Bad Gateway` on /api/\* or /auth/\*.** uvicorn isn't running
  or isn't bound to `127.0.0.1:8000`. Check `systemctl status rfc-app`.
- **`403` from nginx on static assets.** The nginx user can't read
  `/opt/rfc-app/frontend/dist`. Apply the chmod from 1.4.1.
- **OAuth works, but the user can't propose.** The `users` row was
  created with role `contributor`; only the account named in
  `OWNER_GITEA_LOGIN` gets `owner` on first sign-in. Confirm `.env` has
  the right value and you signed in with that account.
- **Email isn't going out and there are no error logs.** Most likely
  `SMTP_HOST` is unset; the stdout fallback is in play and envelopes are
  in the journal as `email (stdout fallback): ...`. Set the SMTP block
  per 1.3.3 to enable real sends.
- **The §12 hygiene sweep isn't deleting an obviously stale branch.**
  Confirm `cached_branches.pinned = 0` for the row, and that
  `last_commit_at` (or the joined `merged_at`) actually predates the
  90-day cutoff. The `actions` audit log carries every hygiene gesture
  with `action_kind IN ('close_idle_branch', 'delete_stale_branch',
  'delete_post_merge_branch')`.