Troubleshooting

Start with doctor, then work down the list.

`semilayer-runner doctor`

docker run --rm \
  -e SEMILAYER_RUNNER_ID=<runner-uuid> \
  -e SEMILAYER_RUNNER_TOKEN=rk_... \
  ghcr.io/semilayer/runner:latest doctor

Prints one of:

✓ config loaded — env vars resolved.
✓ reachable <url> → 200/404 — TLS handshake with runner.semilayer.com completed. (Either status is fine — we're probing connectivity, not a specific endpoint.)
✗ reachable ... — <error> — outbound firewall is blocking TCP 443.
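If the runner image is not available on the host, the same connectivity question can be approximated with curl. This is a minimal sketch, not what doctor itself does: `classify_probe` is a hypothetical helper name, and probing the root path of runner.semilayer.com is an assumption, since the exact URL doctor uses is not shown here.

```shell
# Hypothetical helper: interpret the HTTP status code that curl reports.
# curl writes 000 for %{http_code} when no HTTP response was received
# (DNS failure, blocked TCP 443, or a failed TLS handshake).
classify_probe() {
  case "$1" in
    000|"") echo "unreachable" ;;
    *)      echo "reachable" ;;   # any real HTTP status proves connectivity
  esac
}

# Usage (needs network access; the URL path is an assumption):
#   code=$(curl -sS -o /dev/null -m 10 -w '%{http_code}' https://runner.semilayer.com/)
#   classify_probe "$code"
```

As with doctor, any real HTTP status counts as success here; only the absence of a response points at the network path.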

Status pill stuck on "offline"

Runner container exited. Check docker logs <container>. Common exit reasons:

DATABASE_URL is required → this is the gateway's error shape, not the runner's, so you are probably looking at the wrong container. The runner itself requires only SEMILAYER_RUNNER_ID and SEMILAYER_RUNNER_TOKEN.
SSL/TLS handshake failed → the runner couldn't verify runner.semilayer.com's certificate. Usually a corporate TLS interception proxy; add the proxy's CA bundle as NODE_EXTRA_CA_CERTS=/path/to/ca.pem.
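For the TLS interception case, the CA bundle has to exist inside the container before NODE_EXTRA_CA_CERTS can point at it. The sketch below builds the extra docker run flags for that. `ca_bundle_flags` is a hypothetical helper, and the in-container path /certs/corp-proxy-ca.pem is an arbitrary choice, not something the runner requires.

```shell
# Hypothetical helper: print the extra `docker run` flags that mount a CA
# bundle read-only and point Node.js at it via NODE_EXTRA_CA_CERTS.
ca_bundle_flags() {
  host_ca_path="$1"
  container_ca_path="/certs/corp-proxy-ca.pem"   # arbitrary in-container path
  printf '%s\n' "-v ${host_ca_path}:${container_ca_path}:ro -e NODE_EXTRA_CA_CERTS=${container_ca_path}"
}

# Usage (relies on word splitting, so the host path must not contain spaces):
#   docker run --rm $(ca_bundle_flags /path/to/ca.pem) \
#     -e SEMILAYER_RUNNER_ID=<runner-uuid> \
#     -e SEMILAYER_RUNNER_TOKEN=rk_... \
#     ghcr.io/semilayer/runner:latest doctor
```

Running doctor with these flags first is a cheap way to confirm the handshake succeeds before restarting the long-lived runner.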

Runner container running, still offline. Symptom: doctor passes, runner logs say connected, awaiting jobs, but the Console shows offline for more than 30 seconds.

Heartbeat persistence lag: the Console polls every 10 s and the gateway persists the heartbeat every 20 s, so a just-connected runner can look offline for up to 30 s (10 s + 20 s) on the first transition. Refresh after a minute.
Stale runner_sessions row: happens if a prior gateway revision crashed without deleting the row. Revoke and recreate the runner; the row is recreated cleanly on the next connect.

Query returns `no_runners_online`

The source routes through a runner pool, and none are connected. Paths forward:

Bring a runner up.
Temporarily unassign the source (Console → Runners → detail) and SemiLayer will fall back to direct dispatch. Direct dispatch brings back the IP-allowlist requirement for as long as the source stays unassigned.
If the runner is running: check doctor from that host, then check firewall rules for outbound TCP 443 to runner.semilayer.com.

`connection refused` / `timeout` on the DB side

Runner connected to SemiLayer fine, but can't reach your database.

Verify the runner can resolve and reach the DB host. From inside the container (apk implies an Alpine-based image): apk add postgresql-client && psql "$URL" -c 'select 1', where $URL is the database connection string.
Check the DB's pg_hba.conf / equivalent — the runner's source IP must be allowed for the DB user.
For runner-local mode: verify the per-source env vars match the source names exactly. Source name primary-db → prefix SEMILAYER_SOURCE_PRIMARY_DB_. URL-style bridges need SEMILAYER_SOURCE_PRIMARY_DB_URL; structured-config bridges (ClickHouse, Snowflake, DynamoDB, BigQuery) need one var per bridge config field — _HOST, _PORT, _DATABASE, _ACCESS_KEY_ID, etc. See Airgap mode for the full key map.
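The prefix rule for runner-local mode can be checked mechanically. The sketch below derives the expected prefix from a source name. `source_env_prefix` is a hypothetical helper, and the rule it applies (uppercase the name, replace hyphens with underscores) is inferred from the single primary-db example above, so treat it as an assumption for other name shapes.

```shell
# Hypothetical helper: derive the env var prefix for a source name.
# Assumed rule: uppercase the name, replace '-' with '_',
# then wrap it as SEMILAYER_SOURCE_<NAME>_.
source_env_prefix() {
  name=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | sed 's/-/_/g')
  printf 'SEMILAYER_SOURCE_%s_\n' "$name"
}

# Usage: list the matching variable names (not values) set in the container:
#   docker exec <runner-container> env | cut -d= -f1 \
#     | grep "^$(source_env_prefix primary-db)"
```

Printing only the names keeps connection strings and access keys out of the terminal scrollback.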

The container keeps exiting and restarting

Exponential-backoff reconnect in the runtime will slow this down, but won't stop a bad config. Fix the underlying cause first:

4401 bad token — the rk_ token is wrong or has been revoked. Mint a new one in the Console.
4401 unknown runner — SEMILAYER_RUNNER_ID doesn't match any row in SemiLayer's DB. Set it to the runner's UUID (the value the gateway indexes by) rather than its display name. The display name is accepted as a convenience, but only if the org slug embedded in the token is kebab-case. When in doubt, use the UUID. Look it up with semilayer runners list (the ID column) or in the Console's runner detail view.
close 1006 with no preceding 4401 — same root cause as above when it shows up in a tight reconnect loop. Some load balancers drop the 4401 close frame on the way back to the runner, so the runner only sees a generic abnormal close. Verify the ID env var first.
close 1006 — the gateway closed the connection abnormally. Check status.semilayer.com for an incident; otherwise open a support ticket with the runner id and timestamp.
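Because a display name in SEMILAYER_RUNNER_ID is a common cause of the 4401 and 1006 loops above, a quick shape check on the value can save a round trip. This is a sketch that assumes runner UUIDs use the canonical 8-4-4-4-12 hexadecimal layout. `looks_like_uuid` is a hypothetical helper, and it only checks the format, not whether the runner exists.

```shell
# Hypothetical helper: succeed (exit 0) if the argument has the canonical
# 8-4-4-4-12 hexadecimal UUID shape, fail (exit 1) otherwise.
looks_like_uuid() {
  printf '%s\n' "$1" | grep -Eq \
    '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Usage:
#   looks_like_uuid "$SEMILAYER_RUNNER_ID" \
#     || echo "SEMILAYER_RUNNER_ID is not a UUID; is it a display name?"
```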

Logs I can share with support

docker logs <runner-container> — stdout + stderr from the last hour.
The runner's ID (never the token).
The approximate time the issue started in UTC.
Your org slug.
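Gathering the items above can be scripted. Since the token must never be shared, the sketch below masks anything that looks like an rk_ token before the log leaves your machine. `redact_tokens` is a hypothetical helper, and the token character set (letters, digits, underscore, hyphen) is an assumption; only the rk_ prefix is documented here.

```shell
# Hypothetical helper: mask anything that looks like a runner token.
# Assumption: tokens are "rk_" followed by letters, digits, '_' or '-'.
redact_tokens() {
  sed -E 's/rk_[A-Za-z0-9_-]+/rk_REDACTED/g'
}

# Usage: capture the last hour of stdout + stderr, redacted, and a UTC timestamp:
#   docker logs --since 1h <runner-container> 2>&1 | redact_tokens > runner-logs.txt
#   date -u +%Y-%m-%dT%H:%M:%SZ
```

It is still worth skimming the output before sending it, since the pattern is only a guess at the token format.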

Email root@semilayer.dev or open a Linear ticket — we pull our side's logs from runner_jobs and gateway audit, correlated by runner_id and timestamp.
