Exports

Two endpoints stream rows end-to-end with no in-memory buffering: drill-down exports drain the bucket cursor, and query exports drain a where predicate against any lens with grants.query. Both share one server-side helper, one client-side helper, and one tier-aware cap.

Endpoint                                   Use case
POST /v1/analyze/:lens/:name/rows/export   Full set of rows behind a chart bucket
POST /v1/query/:lens/export                Full set of rows for a where predicate (the lens-wide dump)

Both accept format: 'ndjson' | 'csv' (default 'ndjson'). NDJSON is one JSON object per line, terminated by \n. CSV is RFC-4180-compliant with \r\n line endings, quote-wrapping any field containing ,, ", \n, or \r.
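
As a concrete illustration (the field names are invented for the example), here is the same row serialized first as an NDJSON line, then as a CSV header plus data row:

{"id":412,"name":"Trail \"X\" runner","category":"footwear","price":89.5}

id,name,category,price
412,"Trail ""X"" runner",footwear,89.5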

Why streaming

The analyze surface lets you click a bucket and see the rows behind it. "See all the rows" is the next obvious ask, and naïve implementations fall apart at scale — buffering 10M rows in memory before sending the body holds the connection open, OOMs the worker, and times out the proxy. The streaming exports handle this correctly:

  1. Server pages through the cursor at 1000 rows per page, flushing each page to the wire as it arrives.
  2. Client decodes line-by-line: BeamClient.exportRows() returns an AsyncIterable<RowChunk> and never holds more than one page in memory.
  3. Tier caps stop the loop loudly, via the X-SemiLayer-Export-Truncated trailer, instead of OOMing.

The drill-down endpoint reuses the drill-down cursor internally — anything you can drill, you can export. The query endpoint walks the lens via the same offset+orderBy contract /v1/query already serves.
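
As a rough mental model of the server side of step 1, here is a sketch of the page-and-flush loop; RowCursor, StreamingResponse, fetchPage, and PAGE_SIZE are illustrative names invented for explanation, not the actual server internals.

// Illustrative page-and-flush loop; names and signatures are invented for explanation.
interface RowCursor {
  fetchPage(size: number): Promise<{ rows: Record<string, unknown>[] }>
}
interface StreamingResponse {
  write(chunk: string): void
  setTrailer(name: string, value: string): void
  end(): void
}

const PAGE_SIZE = 1000

async function streamNdjsonExport(cursor: RowCursor, res: StreamingResponse, maxRows: number) {
  let exported = 0
  while (true) {
    const page = await cursor.fetchPage(PAGE_SIZE)   // one page at a time, never the full set
    for (const row of page.rows) {
      if (exported >= maxRows) {
        // Cap reached before the cursor drained: end the body cleanly and set the trailers.
        res.setTrailer('X-SemiLayer-Export-Truncated', 'true')
        res.setTrailer('X-SemiLayer-Export-Actual-Rows', String(exported))
        res.setTrailer('X-SemiLayer-Export-Max-Rows', String(maxRows))
        return res.end()
      }
      res.write(JSON.stringify(row) + '\n')          // flush each well-formed line to the wire
      exported++
    }
    if (page.rows.length < PAGE_SIZE) break          // cursor drained
  }
  res.end()
}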

From Beam — the easy path

// Drill-down export
const stream = beam.products.analyze.byCategory.exportRows({
  bucketKey,
  search:     'lightweight',     // optional — narrow inside the bucket
  orderBy:    { field: 'price', dir: 'asc' },
  format:     'ndjson',
})

for await (const chunk of stream) {
  if (chunk.kind === 'row') {
    process(chunk.row)            // Record<string, unknown>
  } else if (chunk.kind === 'truncated') {
    // Hit the tier cap — chunk.actualRows + chunk.maxRows
    showTierBanner(chunk.maxRows)
  } else if (chunk.kind === 'done') {
    console.log(`Exported ${chunk.rowsExported} rows`)
  }
}

For "I just want a downloadable file" cases, the …ToBlob convenience collapses the iterable into a single Blob you can pipe to URL.createObjectURL:

const blob = await beam.products.analyze.byCategory.exportRowsToBlob(
  { bucketKey, format: 'csv' },
)
downloadFile('byCategory.footwear.csv', blob)
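
downloadFile isn't part of the SDK; one minimal way to implement it in the browser (standard DOM APIs, nothing SemiLayer-specific):

// Hypothetical helper: turn a Blob into a user-facing file download.
function downloadFile(filename: string, blob: Blob) {
  const url = URL.createObjectURL(blob)
  const a = document.createElement('a')
  a.href = url
  a.download = filename        // suggested filename for the save dialog
  a.click()
  URL.revokeObjectURL(url)     // release the object URL once the click has fired
}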

From React — useExportRows

import { useExportRows } from '@semilayer/react'

function ExportButton({ bucketKey }: { bucketKey: string }) {
  const exp = useExportRows(beam.products.analyze.byCategory, {
    bucketKey,
    format: 'ndjson',
  })

  return (
    <button onClick={exp.start} disabled={exp.status === 'running'}>
      {exp.status === 'idle'      && 'Download'}
      {exp.status === 'running'   && `Exporting… ${exp.progress.rowsExported.toLocaleString()}`}
      {exp.status === 'done'      && `Done — ${exp.progress.rowsExported.toLocaleString()} rows`}
      {exp.status === 'truncated' && `Truncated at ${exp.progress.rowsExported.toLocaleString()}`}
      {exp.status === 'error'     && 'Failed — try again'}
    </button>
  )
}

The hook owns the AsyncIterable lifecycle: start() opens the stream, cancel() aborts it cleanly (the bridge closes the cursor on its side, no leaked transactions), and progress aggregates as bytes drain. State flips to truncated automatically when the trailer fires.
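
A cancel affordance needs nothing beyond the same hook instance; a sketch building on the component above (same imports, beam assumed to be in scope as in that example):

// Sketch: same hook, plus a cancel button while the export is running.
function ExportWithCancel({ bucketKey }: { bucketKey: string }) {
  const exp = useExportRows(beam.products.analyze.byCategory, { bucketKey, format: 'ndjson' })

  return (
    <>
      <button onClick={exp.start} disabled={exp.status === 'running'}>Download</button>
      {exp.status === 'running' && (
        <button onClick={exp.cancel}>
          Cancel ({exp.progress.rowsExported.toLocaleString()} rows so far)
        </button>
      )}
    </>
  )
}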

From the CLI

# Stream a bucket's rows to a file
semilayer analyze rows recipes.byCuisine \
  --bucket "$BUCKET_KEY" \
  --export rows.ndjson \
  --format ndjson \
  --api-key sk_dev_...

# Pipe through `wc -l` to verify line count
semilayer analyze rows recipes.byCuisine \
  --bucket "$BUCKET_KEY" \
  --export - \
  --format ndjson \
  --api-key sk_... \
  | wc -l

# Same surface for plain query exports
semilayer query products \
  --where '{"category":"footwear"}' \
  --export footwear.csv \
  --format csv \
  --api-key sk_...

--export - writes to stdout. Progress (scanned 2,500,000 rows…) goes to stderr so long-running pipelines don't look hung.

From cURL (raw HTTP)

curl -N -X POST https://api.semilayer.com/v1/analyze/products/byCategory/rows/export \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"bucketKey":"...","format":"ndjson"}' \
  -o rows.ndjson

curl -N is important — it disables curl's output buffering so chunks land on disk as they stream. Without it you wait for the whole body before any bytes hit the file.
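
The same incremental behaviour is available from plain fetch() without the Beam client; a sketch (the endpoint and request body mirror the cURL call above, the line-splitting is generic NDJSON handling, and apiKey, bucketKey, and handleRow stand in for your own values):

// Sketch: stream the raw NDJSON body with fetch() and decode it line by line.
declare const apiKey: string                     // your API key
declare const bucketKey: string                  // from the drill-down
declare function handleRow(row: unknown): void   // your per-row consumer

const res = await fetch('https://api.semilayer.com/v1/analyze/products/byCategory/rows/export', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ bucketKey, format: 'ndjson' }),
})

const reader = res.body!.getReader()
const decoder = new TextDecoder()
let buffered = ''

while (true) {
  const { value, done } = await reader.read()
  if (done) break
  buffered += decoder.decode(value, { stream: true })
  const lines = buffered.split('\n')
  buffered = lines.pop()!                        // hold back the trailing partial line
  for (const line of lines) {
    if (line) handleRow(JSON.parse(line))        // one well-formed JSON object per line
  }
}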

Tier-aware caps + the truncation trailer

Each tier has a per-call row cap. Exports stop at the cap and the body ends gracefully — last NDJSON line is well-formed, CSV ends with \r\n.

Tier         Per-export rows
Free         10,000
Pro          100,000
Team         1,000,000
Scale        10,000,000
Enterprise   unlimited

When the cap halts the stream before the cursor drained, the response sets HTTP trailers:

X-SemiLayer-Export-Truncated: true
X-SemiLayer-Export-Actual-Rows: 100000
X-SemiLayer-Export-Max-Rows: 100000

Trailer support is uneven across HTTP clients — modern Chrome / Firefox fetch() and curl --raw see them; older proxies sometimes strip them. The Beam client + React hook surface the truncation as a final { kind: 'truncated', actualRows, maxRows } chunk so consumers don't need to read trailers themselves.

A separate X-SemiLayer-Export-Truncated-Hint: true header (regular, not trailer) fires up-front when the server can determine the total will exceed the cap — useful for confirmation prompts before the download starts.
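
Since the hint is a regular header, it can be read before any body bytes are consumed; a sketch with raw fetch() (exportUrl, requestBody, and apiKey are placeholders, and the confirm prompt just stands in for whatever UI you use):

// Sketch: check the up-front hint header, then decide whether to drain the body.
declare const exportUrl: string, apiKey: string
declare const requestBody: Record<string, unknown>

const res = await fetch(exportUrl, {
  method: 'POST',
  headers: { Authorization: `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
  body: JSON.stringify(requestBody),
})

if (res.headers.get('X-SemiLayer-Export-Truncated-Hint') === 'true' &&
    !window.confirm('This export will hit your tier cap and be truncated. Download anyway?')) {
  await res.body?.cancel()   // user declined: drop the stream without reading it
} else {
  // stream the body as usual (see the line-by-line decoding sketch above)
}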

NDJSON vs CSV

                          NDJSON                                     CSV
One row per               Newline-delimited JSON object              Comma-separated row, RFC-4180-quoted
Schema preserved          Yes — types come back as JSON values       No — every cell is a string
Nested objects            Inline ({"address":{"city":"Paris"}})      JSON-stringified into one cell
Streaming-decodes well?   Yes — split on \n                          Yes — but watch out for \n inside quoted cells
Best for                  Programmatic pipelines, jq, JS consumers   Spreadsheets, ad-hoc analyst handoff

The CSV path quote-wraps any field containing ,, ", \n, or \r, and escapes inner " as "" per RFC-4180. The first line is the column header. Nested objects (metadata: { city: 'Paris' }) get JSON.stringify'd into one cell since CSV is flat by design.
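
The quoting rule is small enough to restate in code; a sketch of the rule exactly as described above, not the library's actual encoder:

// Sketch: RFC-4180 field quoting as described in this section.
function csvField(value: unknown): string {
  // Nested objects/arrays are JSON-stringified into one flat cell.
  const s = typeof value === 'object' && value !== null ? JSON.stringify(value) : String(value)
  // Quote-wrap when the cell contains a comma, quote, or line break; double any inner quotes.
  return /[",\n\r]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s
}

csvField('Paris')                     // Paris
csvField('lightweight, breathable')   // "lightweight, breathable"
csvField({ city: 'Paris' })           // "{""city"":""Paris""}"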

Cancellation

Both BeamClient.exportRows() and useExportRows().cancel() propagate cancellation through to the server. The handler observes the closed connection on its next page-fetch tick and aborts the cursor walk; the bridge closes its cursor on its side. No leaked transactions, no half-streamed-and-still-running fan-out.
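
On the Beam side the example above has no explicit cancel handle; a hedged sketch, assuming the conventional async-iterable behaviour that breaking out of the for await loop (which calls the iterator's return()) is what triggers the cancellation described here:

// Sketch: abandon an export part-way through by breaking out of the loop.
// Assumption: breaking closes the AsyncIterable, which propagates the abort.
let kept = 0
for await (const chunk of beam.products.analyze.byCategory.exportRows({ bucketKey, format: 'ndjson' })) {
  if (chunk.kind !== 'row') continue
  keep(chunk.row)               // keep(): your consumer
  if (++kept >= 50_000) break   // stop early once we have enough rows
}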

Errors

Status   Code                                  When
400      bucket_key_required                   Drill export body missing bucketKey.
400      bucket_key_invalid                    Signature failed or token expired.
400      analyze_input_too_large               Body > 8KB. Tighten the export request body.
403      forbidden                             grants.analyze.<name> (drill) or grants.query (query export) denied.
404      lens_not_found / analysis_not_found   Resource doesn't exist on this env.
502      (bridge-classified)                   Bridge error; classified message + detail in the body.

Mid-stream errors can't change the status code (the response is already 200 once the first byte has flushed). They surface via the X-SemiLayer-Export-Error trailer with <code>: <message>.

Next: Cursors & streaming — the mental model for pagination across query, drill-down, and the streaming exports.