
Cursors & streaming

Four pagination shapes show up across SemiLayer; picking the right one is a ~5-minute decision that pays off forever. This page maps the four read surfaces — query, drill-down, WebSocket streams, and exports — onto the right shape, and explains the cursor model.

The four shapes

Shape            | Endpoint(s)                        | What you get back                                     | Right for
offset + limit   | query                              | One page; you advance offset yourself                 | Admin tables with page links, ≤ ~5 pages deep
cursor           | query, analyze.rows                | One page + an opaque cursor for the next              | Infinite scroll, deep pagination, drill-down
WebSocket batch  | stream.query, stream.search        | Per-batch frames terminated by done                   | Full-table walks, exports without writing to disk
Streaming export | analyze.rows.export, query.export  | NDJSON or CSV body, chunked, with truncation trailer  | "Download all matching rows as a file"

The shorthand:

  • Read a few pages → cursor
  • Read every page → streaming export
  • React to new rows over time → WS subscribe (different primitive — see Realtime)

offset vs cursor

The obvious difference: offset pagination restarts the count-and-discard on each call, while an opaque cursor encodes the position. The non-obvious difference: offset pagination races with concurrent writes — rows shift under you, so some appear twice or get skipped — while cursors stay coherent.

// offset — fine for admin pickers, breaks at scale
const page = await beam.orders.query({
  where:   { status: 'shipped' },
  orderBy: { field: 'placed_at', dir: 'desc' },
  limit:   20,
  offset:  pageNumber * 20,
})

// cursor — the deep-scroll-safe shape
const first = await beam.orders.query({
  where:   { status: 'shipped' },
  orderBy: { field: 'placed_at', dir: 'desc' },
  limit:   20,
})
const next  = await beam.orders.query({
  where:   { status: 'shipped' },
  orderBy: { field: 'placed_at', dir: 'desc' },
  limit:   20,
  cursor:  first.meta.nextCursor,
})
// Terminate when meta.nextCursor is undefined.

Always set orderBy when you paginate. The cursor encodes the current sort position; if you change the sort between calls, the cursor is meaningless and the next page will look random. The platform appends the lens's primary key as a server-side tiebreaker, so even a single-field orderBy paginates stably.
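
A minimal full-walk sketch built on that contract, assuming the query()/meta.nextCursor shape shown above; the helper name and the page.rows field are illustrative, not part of the documented API:

async function* walkShippedOrders(pageSize = 100) {
  let cursor: string | undefined        // undefined on the first call = start from the top (assumption)
  do {
    const page = await beam.orders.query({
      where:   { status: 'shipped' },
      orderBy: { field: 'placed_at', dir: 'desc' },  // keep the sort identical on every call
      limit:   pageSize,
      cursor,
    })
    yield* page.rows                    // assumption: the page's rows live on page.rows
    cursor = page.meta.nextCursor
  } while (cursor)                      // undefined nextCursor means the walk is done
}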

Drill-down cursors

analyze.<name>.rows() paginates with the same cursor shape query uses. The server generates the cursor itself — an offset under the hood — and the rows respect the resolved orderBy (your override + the PK tiebreaker). Drill on a static bucket is read-only, so write-races aren't a concern at this surface; the offset scheme is exactly the right grain.

const first = await beam.products.analyze.byCategory.rows({
  bucketKey,
  limit: 25,
})
const next  = await beam.products.analyze.byCategory.rows({
  bucketKey,
  cursor: first.cursor,
  limit:  25,
})

If you've already drained the cursor and want a CSV instead of a JSON page loop, swap to exports — same predicate, one body, no page-management code.
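
For instance, a sketch using the exportRows call from the cheat sheet at the bottom of this page; treating its return value as an async iterable of chunks, and the 'csv' format literal, are assumptions here:

import { createWriteStream } from 'node:fs'

// Sketch only: exportRows signature from the cheat sheet below; the chunk
// shape is not a documented contract.
const out = createWriteStream('category.csv')
const chunks = beam.products.analyze.byCategory.exportRows({
  bucketKey,
  format: 'csv',
})
for await (const chunk of chunks) {
  out.write(chunk)
}
out.end()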

When to reach for WS streaming

stream.query opens a WebSocket and yields rows per batch:

for await (const row of beam.orders.stream.query({
  where:   { status: 'shipped' },
  orderBy: { field: 'placed_at', dir: 'asc' },
  limit:   50000,
})) {
  process(row)
}

The shape is similar to the streaming export, but the use case differs:

  • WS streaming stays open as a long-lived socket; it suits programmatic fan-out where the consumer wants each row as a typed object, no parsing.
  • Streaming export speaks plain HTTP chunked encoding; suits "download to disk" / "pipe through jq / wc -l" / "share the body with a non-SemiLayer consumer."

Pick WS when the consumer is your TypeScript app. Pick the export when the consumer is a file, a shell pipeline, or a third party.

When to reach for exports

Streaming exports are the right answer when:

  • You want all matching rows, not just the next page.
  • You want the body on disk or piped through a shell tool.
  • You don't want to manage pagination state in your code.

The cap is per-call, tier-aware (Free 10k → Scale 10M → Enterprise unlimited), and the response sets X-SemiLayer-Export-Truncated if you hit it. Beam wraps the trailer as a final { kind: 'truncated' } chunk so you don't need to read it from raw HTTP.
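
A sketch of consuming that chunk stream via queryExport from the cheat sheet below; calling it on the beam instance, and the data-chunk field names other than kind, are assumptions beyond the { kind: 'truncated' } trailer described above:

let truncated = false
for await (const chunk of beam.queryExport({
  lens:   'orders',
  where:  { status: 'shipped' },
  format: 'ndjson',
})) {
  if (chunk.kind === 'truncated') {     // final trailer chunk per the note above
    truncated = true
    continue
  }
  process.stdout.write(chunk.body)      // assumption: data chunks carry their bytes on .body
}
if (truncated) console.warn('export hit the per-call cap; narrow the where or move up a tier')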

Cursor stability under writes

Surface             | Stable under writes?
query cursor        | Stable. The cursor encodes the sort-key position, so concurrent inserts/updates only show up on the next nextCursor walk. Deletes mid-walk are safely skipped.
analyze.rows cursor | Drill is on a static bucket — the bucketKey snapshot of the predicate. Concurrent writes to the source DB don't shift the bucket's row set within the 24h bucketKey TTL.
offset              | Not stable. Concurrent writes shift rows under your offset; expect duplicates and skips. Use cursor instead for any deep walk.
WS streaming        | Bridge-dependent. Postgres holds a server-side cursor; Mongo holds a snapshot. The bridge cleans up if you abort.
Streaming exports   | Same posture as the underlying cursor (bridge-snapshotted where the bridge supports it).

For the rare "I need a write-stable cursor over query() directly" use case, use the export endpoint instead — bridges that support snapshot isolation apply it on the streaming export path.

Aborting cleanly

Every streaming shape — WS streams, drill-down loops, exports — closes the underlying bridge cursor when the consumer aborts. No leaked transactions.

const controller = new AbortController()

setTimeout(() => controller.abort(), 60_000)  // cancel after a minute

for await (const row of beam.orders.stream.query({
  where: { status: 'shipped' },
  signal: controller.signal,   // cooperative cancellation
})) {
  process(row)
}

For exports, React's useExportRows({ ... }).cancel() and Beam's AsyncIterable.return() both propagate the abort to the server.
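
As a concrete example of the AsyncIterable.return() path: breaking out of a for await loop calls the iterator's return() automatically, which per the note above propagates the abort to the server. The row count here is illustrative:

let seen = 0
for await (const row of beam.orders.stream.query({ where: { status: 'shipped' } })) {
  process(row)
  if (++seen >= 1_000) break   // break triggers AsyncIterable.return(), which closes the bridge cursor
}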

Choosing — the cheat sheet

Need                                 | Reach for
Admin UI with page picker            | query({ offset, limit })
Infinite scroll                      | query({ cursor, limit })
Bucket-scoped drill                  | analyze.<name>.rows({ bucketKey, cursor })
Search inside a bucket               | analyze.<name>.rows({ bucketKey, search, cursor })
Download all rows of a bucket        | analyze.<name>.exportRows({ bucketKey, format })
Download all rows of a where         | BeamClient.queryExport({ lens, where, format }) (or semilayer query --export)
Walk a whole table programmatically  | stream.query({ where })
React to new rows over time          | stream.subscribe({ filter }) — different primitive
Live-updating dashboard              | useAnalyze({ liveUpdates: true })

When in doubt, start with cursor pagination. It's the right answer for ~80% of UIs, and you can always upgrade to the streaming export when the user reaches for "download."