# Cursors & streaming
Four pagination shapes show up across SemiLayer; picking the right one is a
~5-minute decision that pays off forever. This page maps the four
read surfaces — query, drill-down, WebSocket streams, and exports —
onto the right shape, and explains the cursor model.
## The four shapes
| Shape | Endpoint(s) | What you get back | Right for |
|---|---|---|---|
| offset + limit | query | One page; you advance offset yourself | Admin tables with page links, ≤ ~5 pages deep |
| cursor | query, analyze.rows | One page + an opaque cursor for the next | Infinite scroll, deep pagination, drill-down |
| WebSocket batch | stream.query, stream.search | Per-batch frames terminated by done | Full-table walks, exports without writing to disk |
| Streaming export | analyze.rows.export, query.export | NDJSON or CSV body, chunked, with truncation trailer | "Download all matching rows as a file" |
The shorthand:
- Read a few pages → cursor
- Read every page → streaming export
- React to new rows over time → WS subscribe (different primitive — see Realtime)
## offset vs cursor
The obvious difference: offset pagination restarts the count-and-discard on each call, while an opaque cursor encodes the position. The non-obvious difference: offsets race with concurrent writes — rows shift under you, so some appear twice or get skipped — while cursors stay coherent.
Always set orderBy when you paginate. The cursor encodes the current
sort position; if you change the sort between calls, the cursor is
meaningless and the next page will look random. The platform appends the
lens's primary key as a server-side tiebreaker, so even a single-field
orderBy paginates stably.
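A minimal sketch of that loop. Only the page shape ({ rows, nextCursor }) and the orderBy rule come from this page; the client interface, lens, and field names are assumptions for illustration:

```ts
// Hedged cursor-walk sketch. QueryClient, the 'orders' lens, and
// 'createdAt' are illustrative assumptions, not the confirmed API.
type Page<T> = { rows: T[]; nextCursor?: string };

interface QueryClient {
  query(args: {
    lens: string;
    orderBy: { field: string; dir: 'asc' | 'desc' };
    cursor?: string;
    limit?: number;
  }): Promise<Page<Record<string, unknown>>>;
}

async function walkOrders(client: QueryClient): Promise<void> {
  let cursor: string | undefined;
  do {
    // Keep orderBy identical on every call: the cursor encodes the sort
    // position, and the server appends the lens PK as a tiebreaker.
    const page = await client.query({
      lens: 'orders',
      orderBy: { field: 'createdAt', dir: 'desc' },
      cursor,
      limit: 100,
    });
    for (const row of page.rows) console.log(row);
    cursor = page.nextCursor;
  } while (cursor);
}
```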
## Drill-down cursors
analyze.<name>.rows() paginates with the same cursor shape query
uses. The server generates the cursor itself — an offset under the hood —
and the rows respect the resolved orderBy (your override + the
PK tiebreaker). Drill on a static bucket is read-only, so write-races
aren't a concern at this surface; the offset scheme is exactly the
right grain.
If you've already drained the cursor and want a CSV instead of a JSON page loop, swap to exports — same predicate, one body, no page-management code.
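A sketch of both halves — the drain and the swap. The surface types below are assumptions shaped to match the parameters this page names (bucketKey, cursor, format):

```ts
// Hedged sketch of a drill-down drain and the export swap.
// DrillSurface and its exact return types are assumptions.
interface DrillSurface {
  rows(args: { bucketKey: string; cursor?: string; limit?: number }):
    Promise<{ rows: unknown[]; nextCursor?: string }>;
  exportRows(args: { bucketKey: string; format: 'csv' | 'ndjson' }):
    Promise<ReadableStream<Uint8Array>>;
}

async function drainBucket(drill: DrillSurface, bucketKey: string): Promise<unknown[]> {
  const all: unknown[] = [];
  let cursor: string | undefined;
  do {
    // Server-generated cursor; rows follow the resolved orderBy.
    const page = await drill.rows({ bucketKey, cursor, limit: 200 });
    all.push(...page.rows);
    cursor = page.nextCursor;
  } while (cursor);
  return all;
}

// Same predicate, one body, no page-management code:
const asCsv = (drill: DrillSurface, bucketKey: string) =>
  drill.exportRows({ bucketKey, format: 'csv' });
```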
## When to reach for WS streaming
stream.query opens a WebSocket and yields rows per batch:
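(A hedged sketch of the consuming side — the frame shape below is an assumption consistent with the "per-batch frames terminated by done" description, not the confirmed wire format.)

```ts
// Consume stream.query as an AsyncIterable of batch frames.
// Frame/field names other than 'done' are assumptions.
type Frame<T> = { kind: 'batch'; rows: T[] } | { kind: 'done' };

async function walkTable<T>(frames: AsyncIterable<Frame<T>>): Promise<number> {
  let seen = 0;
  for await (const frame of frames) {
    if (frame.kind === 'batch') {
      seen += frame.rows.length; // typed rows, no parsing
    }
    // The iterable completes after the terminating `done` frame.
  }
  return seen;
}
```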
The shape is similar to the streaming export, but the use case differs:
- WS streaming stays open as a long-lived socket; it suits programmatic fan-out where the consumer wants typed rows, no parsing.
- Streaming export speaks plain HTTP chunked encoding; it suits "download to disk", "pipe through jq / wc -l", or "share the body with a non-SemiLayer consumer."
Pick WS when the consumer is your TypeScript app. Pick the export when the consumer is a file, a shell pipeline, or a third party.
## When to reach for exports
Streaming exports are the right answer when:
- You want all matching rows, not just the next page.
- You want the body on disk or piped through a shell tool.
- You don't want to manage pagination state in your code.
The cap is per-call, tier-aware (Free 10k → Scale 10M → Enterprise unlimited),
and the response sets X-SemiLayer-Export-Truncated if you hit it.
Beam wraps the trailer as a final { kind: 'truncated' } chunk so you
don't need to read it from raw HTTP.
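A sketch of handling that final chunk. Apart from the { kind: 'truncated' } terminator named above, the chunk union and sink are assumptions:

```ts
// Hedged sketch: stream an export and watch for the truncation chunk.
// The 'rows'/'bytes' fields are assumptions; only kind: 'truncated'
// comes from this page.
type ExportChunk =
  | { kind: 'rows'; bytes: Uint8Array }
  | { kind: 'truncated' };

async function saveExport(
  chunks: AsyncIterable<ExportChunk>,
  sink: (bytes: Uint8Array) => void,
): Promise<boolean> {
  let truncated = false;
  for await (const chunk of chunks) {
    if (chunk.kind === 'truncated') {
      truncated = true; // tier cap hit; the body is a prefix
    } else {
      sink(chunk.bytes);
    }
  }
  return truncated; // caller can warn or narrow the predicate and retry
}
```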
## Cursor stability under writes
| Surface | Stable under writes? |
|---|---|
| query cursor | Stable. The cursor encodes the sort-key position, so concurrent inserts/updates only show up on the next nextCursor walk. Deletes mid-walk are safely skipped. |
| analyze.rows cursor | Drill is on a static bucket — the bucketKey snapshot of the predicate. Concurrent writes to the source DB don't shift the bucket's row set within the 24h bucketKey TTL. |
| offset | Not stable. Concurrent writes shift rows under your offset; expect duplicates and skips. Use cursor instead for any deep walk. |
| WS streaming | Bridge-dependent. Postgres holds a server-side cursor; Mongo holds a snapshot. The bridge cleans up if you abort. |
| Streaming exports | Same posture as the underlying cursor (bridge-snapshotted where the bridge supports it). |
For the rare "I need a write-stable cursor over query() directly" use
case, run the export endpoint instead — bridges that support snapshot
isolation apply it under the streaming export path.
## Aborting cleanly
Every streaming shape — WS streams, drill-down loops, exports — closes the underlying bridge cursor when the consumer aborts. No leaked transactions.
For exports, the React useExportRows({ ... }).cancel() and Beam's
AsyncIterable.return() both propagate the abort to the server.
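On the Beam side, a sketch of what that looks like — breaking out of a for await loop is what invokes the iterator's return(), which carries the abort to the server:

```ts
// Hedged sketch: take the first N chunks, then abort cleanly. Breaking
// out of for await calls AsyncIterable.return(), which — per the note
// above — propagates the abort and closes the bridge cursor.
async function firstChunks<T>(chunks: AsyncIterable<T>, n: number): Promise<T[]> {
  const out: T[] = [];
  for await (const chunk of chunks) {
    out.push(chunk);
    if (out.length >= n) break; // triggers return() → server-side abort
  }
  return out;
}
```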
## Choosing — the cheat sheet
| Need | Reach for |
|---|---|
| Admin UI with page picker | query({ offset, limit }) |
| Infinite scroll | query({ cursor, limit }) |
| Bucket-scoped drill | analyze.<name>.rows({ bucketKey, cursor }) |
| Search inside a bucket | analyze.<name>.rows({ bucketKey, search, cursor }) |
| Download all rows of a bucket | analyze.<name>.exportRows({ bucketKey, format }) |
| Download all rows of a where | BeamClient.queryExport({ lens, where, format }) (or semilayer query --export) |
| Walk a whole table programmatically | stream.query({ where }) |
| React to new rows over time | stream.subscribe({ filter }) — different primitive |
| Live-updating dashboard | useAnalyze({ liveUpdates: true }) |
When in doubt, start with cursor pagination. It's the right answer for ~80% of UIs, and you can always upgrade to the streaming export when the user reaches for "download."