Troubleshooting

Start with doctor, then work down the list.

semilayer-runner doctor

docker run --rm \
  -e SEMILAYER_RUNNER_ID=<runner-id> \
  -e SEMILAYER_RUNNER_TOKEN=rk_... \
  ghcr.io/semilayer/runner:latest doctor

Prints one line per check:

  • ✓ config loaded — env vars resolved.
  • ✓ reachable <url> → 200/404 — TLS handshake with runner.semilayer.com completed. (Either status is fine — we're probing connectivity, not a specific endpoint.)
  • ✗ reachable ... — <error> — the connection failed. Most often an outbound firewall is blocking TCP 443; read <error> to rule out DNS or TLS certificate problems.
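
If doctor is not available on a host, the reachability check can be approximated by hand. This is a minimal sketch, assuming curl is installed. The host comes from the text above, but the exact URL doctor requests is not stated, so the path is an assumption, and the function names are illustrative. As with doctor, any HTTP status counts as reachable; curl reports 000 when it received no HTTP response at all.

```shell
# Map curl's %{http_code} to a verdict. "000" means no HTTP response arrived
# (DNS failure, blocked TCP 443, or a failed TLS handshake).
interpret_status() {
  case "$1" in
    000|"") echo "unreachable"; return 1 ;;
    *)      echo "reachable ($1)" ;;
  esac
}

# Probe the gateway host. The default URL path is an assumption, not doctor's real target.
probe_gateway() {
  url="${1:-https://runner.semilayer.com/}"
  code="$(curl -sS -o /dev/null --max-time 10 -w '%{http_code}' "$url")" || code="000"
  interpret_status "$code"
}

# Usage (run on the same host as the runner):
#   probe_gateway
```

curl's own error message goes to stderr, so a failed probe still shows whether the problem was name resolution, a timeout, or certificate verification.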

Status pill stuck on "offline"

Runner container exited. Check docker logs <container>. Common exit reasons:

  • DATABASE_URL is required → this is the gateway's error shape, not the runner's. You might be staring at the wrong container. The runner itself requires SEMILAYER_RUNNER_ID and SEMILAYER_RUNNER_TOKEN.
  • SSL/TLS handshake failed → the runner couldn't verify runner.semilayer.com's certificate. Usually a corporate TLS interception proxy; add the proxy's CA bundle as NODE_EXTRA_CA_CERTS=/path/to/ca.pem.
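
For the TLS interception case, the CA bundle has to exist inside the container, not only on the host. The sketch below shows one way to mount it and rerun doctor. The in-container path /etc/semilayer/extra-ca.pem and both function names are hypothetical; NODE_EXTRA_CA_CERTS expects a PEM file, so the helper checks for that first.

```shell
# Return success only if the file is readable and contains at least one PEM certificate.
is_pem_bundle() {
  [ -r "$1" ] && grep -q -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Mount the proxy's CA bundle read-only and point Node at it.
# $1 must be an absolute host path; the container path is an arbitrary choice.
run_doctor_with_ca() {
  ca_file="$1"
  is_pem_bundle "$ca_file" || { echo "not a readable PEM bundle: $ca_file" >&2; return 1; }
  docker run --rm \
    -v "$ca_file:/etc/semilayer/extra-ca.pem:ro" \
    -e NODE_EXTRA_CA_CERTS=/etc/semilayer/extra-ca.pem \
    -e SEMILAYER_RUNNER_ID="$SEMILAYER_RUNNER_ID" \
    -e SEMILAYER_RUNNER_TOKEN="$SEMILAYER_RUNNER_TOKEN" \
    ghcr.io/semilayer/runner:latest doctor
}

# Usage:
#   run_doctor_with_ca /path/to/ca.pem
```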

Runner container running, still offline. Symptom: doctor passes, runner logs say connected, awaiting jobs, but the Console shows offline for more than 30 seconds.

  • Heartbeat persistence lag: the Console polls every 10 s and the gateway persists the heartbeat every 20 s, so a just-connected runner can look offline for up to 30 s (10 s + 20 s) on the first transition. Refresh after a minute.
  • Stale runner_sessions row: happens if a prior gateway revision crashed without deleting the row. Revoke + recreate the runner; the row gets recreated cleanly on next connect.

Query returns no_runners_online

The source routes through a runner pool, and none are connected. Paths forward:

  • Bring a runner up.
  • Temporarily unassign the source (Console → Runners → detail) and SemiLayer will fall back to direct dispatch. While the source is unassigned, SemiLayer connects to your database directly, so you are back to maintaining an IP allowlist for it.
  • If the runner is running: check doctor from that host, then check firewall rules for outbound TCP 443 to runner.semilayer.com.

connection refused / timeout on the DB side

Runner connected to SemiLayer fine, but can't reach your database.

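  • Verify the runner can resolve + reach the DB host. From inside the container (assuming an Alpine-based image, where apk is available): apk add postgresql-client && psql "$URL" -c 'select 1', where $URL holds the database connection URL.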
  • Check the DB's pg_hba.conf / equivalent — the runner's source IP must be allowed for the DB user.
  • For runner-local mode: verify SEMILAYER_SOURCE_<NAME>_URL env vars match the source names exactly. Source name primary-db → env var SEMILAYER_SOURCE_PRIMARY_DB_URL.
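
A mismatch between a source name and its env var is easy to miss by eye. The sketch below derives the expected variable name and checks whether it is set. The mapping rule (uppercase the name, turn hyphens into underscores) is inferred from the single primary-db example above, so treat it as an assumption; both function names are illustrative.

```shell
# Derive the env var name for a source, e.g. primary-db -> SEMILAYER_SOURCE_PRIMARY_DB_URL.
# Assumed rule: uppercase the source name and replace "-" with "_".
source_env_var() {
  name_upper="$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | sed 's/-/_/g')"
  printf 'SEMILAYER_SOURCE_%s_URL\n' "$name_upper"
}

# Report whether the derived variable is present in the current environment.
check_source_env() {
  var="$(source_env_var "$1")"
  if printenv "$var" >/dev/null; then
    echo "$var is set"
  else
    echo "$var is missing"
    return 1
  fi
}

# Usage (inside the runner container, or wherever the env vars are defined):
#   check_source_env primary-db
```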

The container keeps exiting and restarting

Exponential-backoff reconnect in the runtime will slow this down, but won't stop a bad config. Fix the underlying cause first:

  • 4401 bad token — the rk_ token is wrong or has been revoked. Mint a new one in the Console.
  • 4401 unknown runner — the <runner-id> doesn't match any row in SemiLayer's DB. Usually a typo in the SEMILAYER_RUNNER_ID env var. Copy the ID from the Console's runner detail view.
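  • close 1006 — the connection to the gateway closed abnormally, without a clean close handshake, so the cause may be the gateway or the network path in between. Check status.semilayer.com for an incident; otherwise open a support ticket with the runner id and timestamp.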
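
When a container is restart-looping, it can help to scan its recent logs for the reasons listed above. This is a rough sketch: it assumes the runner's log lines contain the same phrases used on this page, which is not confirmed, and classify_exit_reason is an illustrative name.

```shell
# Map a single log line to the suggested fix from the list above.
# The matched phrases are assumptions about the runner's log format.
classify_exit_reason() {
  case "$1" in
    *"4401 bad token"*)           echo "mint a new token in the Console" ;;
    *"4401 unknown runner"*)      echo "fix SEMILAYER_RUNNER_ID (copy it from the Console)" ;;
    *"1006"*)                     echo "check status.semilayer.com, then contact support" ;;
    *"DATABASE_URL is required"*) echo "wrong container: this is a gateway error" ;;
    *)                            echo "unrecognized" ;;
  esac
}

# Usage: classify the last 50 log lines, hiding lines that match nothing.
#   docker logs --tail 50 <container> 2>&1 | while IFS= read -r line; do
#     classify_exit_reason "$line"
#   done | grep -v '^unrecognized$' | sort -u
```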

Logs I can share with support

  • docker logs --since 1h <runner-container> — stdout + stderr from the last hour.
  • The runner's ID (never the token).
  • The approximate time the issue started in UTC.
  • Your org slug.
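
The items above can be gathered into one file before contacting support. A minimal sketch, assuming Docker runs the runner: the output filename and function names are illustrative, and the token pattern (rk_ followed by letters, digits, underscores, or hyphens) is a guess at the token format. Whether the runner ever logs its token is not stated here, so the redaction is a precaution.

```shell
# Replace anything that looks like a runner token so it never reaches a ticket.
redact_tokens() {
  sed -E 's/rk_[A-Za-z0-9_-]+/rk_REDACTED/g'
}

# Write the runner ID, org slug, collection time (UTC), and the last hour of logs.
# Add the approximate time the issue started by hand; it cannot be derived here.
collect_support_bundle() {
  container="$1"; runner_id="$2"; org_slug="$3"
  {
    echo "runner_id: $runner_id"
    echo "org_slug: $org_slug"
    echo "collected_at_utc: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    docker logs --since 1h "$container" 2>&1 | redact_tokens
  } > support-bundle.txt
}

# Usage:
#   collect_support_bundle <runner-container> <runner-id> <org-slug>
```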

Email root@semilayer.dev or open a Linear ticket — we pull our side's logs from runner_jobs and gateway audit, correlated by runner_id and timestamp.