# Ingest — Troubleshooting
When webhook calls succeed but data isn't landing — or lands slowly, partially, or not at all — this page is the checklist.
## The happy-path flow
Before anything else, confirm what should be happening:
- Your CDC pipeline POSTs `/v1/ingest/:lens` → service returns `202` with a `jobId`.
- Service writes every change into the ingest buffer for the lens.
- Service enqueues an `ingest.records` job, debounced 2 seconds per lens.
- Worker claims the buffered changes, fans out upserts to the bridge, updates vectors, deletes tombstones.
- `semilayer status --lens <name>` shows `cursor` advancing and `pending=0`.
If any step breaks, it's in one of these places.
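The first step of that flow can be sketched as a payload builder on the CDC side. Everything here beyond `mode`, `changes`, `action`, and ids is an assumption about the schema (the doc only names those pieces), so treat this as a shape sketch, not the documented API:

```typescript
// Sketch: build a `records` ingest payload from CDC events.
// The CdcEvent shape and extra field names are assumptions.
type CdcEvent = { id: string; op: 'insert' | 'update' | 'delete' };
type Change = { id: string; action: 'upsert' | 'delete' };

export function buildRecordsPayload(events: CdcEvent[]) {
  const changes: Change[] = events.map((e) => ({
    id: e.id,
    action: e.op === 'delete' ? 'delete' : 'upsert',
  }));
  // The service rejects payloads with a missing mode, duplicate ids,
  // or more than 10k changes (400), so validate before sending.
  const ids = new Set(changes.map((c) => c.id));
  if (ids.size !== changes.length) throw new Error('duplicate ids');
  if (changes.length > 10_000) throw new Error('> 10k changes');
  return { mode: 'records' as const, changes };
}
```

A successful POST of this body returns `202` with a `jobId`; anything else is covered in step 1 below.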
## Step 1 — Did the webhook succeed?
| Status | Meaning | Fix |
|---|---|---|
| `401` | Key rejected | Check the key prefix (`ik_<envSlug>_`). Wrong env? Rotated? |
| `403` | Key is valid but wrong type (e.g. you passed a `pk_`) | Use an `ik_` or `sk_` key |
| `404` | Lens doesn't exist in the env | Did you push? Did you delete it? |
| `400` | Payload validation failed | Check `message`: missing `mode`, duplicate ids, > 10k changes |
| `429` | Rate limit hit | Honor `Retry-After`. Upgrade tier if chronic. |
| `5xx` | Service unavailable | Retry with backoff. Endpoint is idempotent. |
| `202` | ✓ Queued | Move to step 2 |
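Client-side, the table reduces to one decision: retry `429` honoring `Retry-After`, retry `5xx` with backoff (the endpoint is idempotent), and treat other errors as fix-the-request. A minimal sketch, with illustrative backoff constants (the service doesn't document client retry timing):

```typescript
// Sketch: decide whether (and how long) to wait before retrying a
// webhook call, following the status table above. Returns null when
// the call should not be retried. Backoff constants are illustrative.
export function retryDelayMs(
  status: number,
  attempt: number, // 0-based retry attempt
  retryAfterHeader?: string,
): number | null {
  if (status === 429) {
    // Honor Retry-After (seconds) when the service sends it.
    const secs = Number(retryAfterHeader ?? '1');
    return (Number.isFinite(secs) ? secs : 1) * 1000;
  }
  if (status >= 500) {
    // Idempotent endpoint: blind retries with exponential backoff are safe.
    return Math.min(30_000, 500 * 2 ** attempt); // 500ms, 1s, 2s, ... cap 30s
  }
  return null; // 4xx other than 429: fix key/payload/lens, don't retry
}
```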
## Step 2 — Is the worker draining the buffer?
Run `semilayer status --lens <name>` and read `pending` and `rate`:

- `pending > 0` and `rate > 0` — draining normally. If `pending` is very high and rising, you're producing faster than the worker can consume (see "The backlog is growing" below).
- `pending > 0` and `rate = 0` — worker stuck. See "The worker is stuck" below.
- `pending = 0` — worker caught up. Either there was nothing to do or the changes were processed. Move to step 3.
Admins can inspect the same data on the Console → Ingest Jobs page, which also shows per-lens throughput history.
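The decision above is mechanical enough to script, e.g. in a monitoring check. A sketch over the two numbers the status output reports (the parameter names are taken from the output fields, not a documented API):

```typescript
// Sketch: classify the ingest buffer from pending/rate, per step 2.
type Verdict = 'draining' | 'stuck' | 'caught-up';

export function classifyBuffer(pending: number, rate: number): Verdict {
  if (pending === 0) return 'caught-up'; // move to step 3
  if (rate === 0) return 'stuck';        // see "The worker is stuck"
  return 'draining';                     // normal; watch for a rising backlog
}
```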
## Step 3 — Are the vectors actually there?
Run a `query` against the lens for a few affected ids. For records that should have been upserted, the row comes back with its fields populated. For deleted records, the query returns nothing.
If the row is there but not appearing in search results:

- Check that `searchable: true` (or `{ weight: N }`) is set on the fields you expect to rank the row. A row with no searchable fields is stored but not embedded — see Search — Fields & weights.
- Check that the field that was supposed to contain the updated text actually updated in the source. Bridge caching can hide stale reads for the first few seconds; re-run after 10 seconds.
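For the first check, a lens field config might look like the fragment below. This is a sketch only: `searchable: true` and `{ weight: N }` are the forms named above, but the surrounding config shape (field names, nesting) is an assumption:

```typescript
// Sketch of lens field config. Only fields marked searchable are
// embedded and can rank the row in search; others are stored only.
export const fields = {
  title: { searchable: true },          // embedded, default weight
  body: { searchable: { weight: 2 } },  // embedded, ranked higher
  sku: {},                              // stored but never embedded
};
```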
## Common failure modes
### The worker is stuck
Symptom: `semilayer status` shows `pending > 0` and `rate = 0` for minutes. The job is probably in the dead-letter queue.
Each ingest mode has its own dead-letter queue:
| Mode | Dead-letter queue |
|---|---|
| `records` | `ingest.records.dead` |
| `full` | `ingest.full.dead` |
| `incremental` | `ingest.incremental.dead` |
| `sync` (periodic) | `ingest.sync.dead` |
Jobs land in dead-letter after the retry policy exhausts — by default, 4 attempts with exponential backoff. Inspect a dead job with `semilayer ingest inspect`.
The most common dead-letter reasons:

- `bridge_unreachable` — the source DB rejected connections. Fix credentials, network, firewall. Then re-enqueue the job: `semilayer ingest retry --job-id j_abc123`.
- `source_query_failed` — the bridge raised a SQL error. Column renamed? Table dropped? Schema drift between config and source. Reconcile, push config, then retry.
- `embedding_failed` — the embedder refused or rate-limited. Rare if your own embedder is healthy; check its quotas.
- `row_not_found` — an `upsert` change referenced an id the bridge can't find. Usually means the row was inserted and deleted between your CDC emitting the event and the worker processing it. Silently drop these; they're not real errors.
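The four reasons reduce to a mechanical triage rule: fix-then-retry for the first three, drop for `row_not_found`. A sketch (the reason strings are the ones listed above; the action labels are illustrative):

```typescript
// Sketch: map a dead-letter reason to a next action, per the list above.
type Triage =
  | 'fix-source-then-retry'   // credentials, network, firewall
  | 'fix-config-then-retry'   // schema drift: reconcile, push, retry
  | 'check-embedder-then-retry'
  | 'drop';                   // not a real error

export function triage(reason: string): Triage {
  switch (reason) {
    case 'bridge_unreachable':  return 'fix-source-then-retry';
    case 'source_query_failed': return 'fix-config-then-retry';
    case 'embedding_failed':    return 'check-embedder-then-retry';
    case 'row_not_found':       return 'drop';
    default: throw new Error(`unknown dead-letter reason: ${reason}`);
  }
}
```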
### The backlog is growing
`pending` climbs without bound. Causes, in order of likelihood:
- You're over your tier's `ingestWebhooksPerMinute` but keep retrying. The webhook returns `202` on every successful call, so your CDC pipeline keeps sending, but the worker can only process at the tier's rate. Check `semilayer status --lens X --verbose` for throttle signals. Upgrade or slow your emitter.
- Embedder is rate-limiting the worker. Every upsert embeds a row; bursts can hit the embedder's RPS cap. The worker retries transparently, but large sustained bursts build backlog. Scale up the embedder's quota or slow your CDC emitter.
- Bridge is slow. The worker fetches current row state per upsert. If the source DB is slow under load, each batch takes longer. Check source-side slow query log.
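For the first cause, "slow your emitter" can be as simple as a client-side throttle matched to your tier. A fixed-window sketch (the cap should come from your tier's `ingestWebhooksPerMinute`; the implementation itself is illustrative, not part of the product):

```typescript
// Sketch: fixed-window throttle for the CDC emitter.
// Returns a function that says whether a webhook call may go out now.
export function makeThrottle(capPerMinute: number) {
  let windowStart = 0;
  let sent = 0;
  return (nowMs: number): boolean => {
    if (nowMs - windowStart >= 60_000) {
      windowStart = nowMs; // new minute window
      sent = 0;
    }
    if (sent >= capPerMinute) return false; // hold: buffer changes locally
    sent += 1;
    return true;
  };
}
```

Calls that return `false` should be batched into the next allowed webhook rather than dropped, since the payload accepts up to 10k changes.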
### Deletes aren't applying
- The row still shows in `query` after you sent `{ action: 'delete' }`. Wait 10 seconds — the worker drains in batches. If it persists, run `semilayer ingest inspect` on the job. Common cause: the lens's config has a field with `primaryKey: true` set on a column that isn't actually the row id, so the vector store can't match the delete.
- The row shows in `query` but not in `search`. Expected — `query` reads the source directly via the bridge; `search` hits the vector index. After a delete, the source row is (usually) gone at the source, and the vector is removed. If `query` still returns it, your CDC didn't delete the source row; ingest can only remove the vector, not the source.
- Soft-delete pattern? If your source marks rows deleted with `deleted_at IS NOT NULL` instead of physically removing them, the upsert path re-indexes them. Add a `where` filter alongside the lens's `facets.search.fields` selection — or filter on the client.
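For the soft-delete case, the cheapest client-side fix is in the CDC emitter itself: translate an update that sets `deleted_at` into a `{ action: 'delete' }` change so the vector is removed instead of re-indexed. A sketch, assuming a row shape with a nullable `deleted_at` column:

```typescript
// Sketch: convert a soft-deleting CDC update into a vector delete.
// The Row shape is an assumption about your source schema.
type Row = { id: string; deleted_at: string | null };
type Change = { id: string; action: 'upsert' | 'delete' };

export function toChange(row: Row): Change {
  // A row with deleted_at set would otherwise be re-indexed by the
  // upsert path; emit a delete so the vector is removed instead.
  return {
    id: row.id,
    action: row.deleted_at !== null ? 'delete' : 'upsert',
  };
}
```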
### Stale data in queries
`search` returns old content after an update. Possible causes:
- Change never reached the webhook. Check your CDC pipeline's delivery log. The webhook returning `202` is the point of commit.
- Worker hasn't processed yet. 2–10 seconds of lag is normal. More than a minute for a single update suggests a dead-lettered job (above).
- The field that changed isn't `searchable`. The row is re-indexed on every upsert, but a non-searchable field change doesn't move the vector, so `search` still ranks by the old vector. Fix: mark the field `searchable: true` or query by a different vector-relevant field.
- Embedding cache hit. If the embedded text didn't actually change (e.g. a different column updated), the worker skips re-embedding and reuses the old vector. Expected behavior — not a bug.
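The last two causes are two sides of the same mechanism: only searchable fields feed the embed text, so a change anywhere else produces identical text and the cached vector is reused. A sketch of that check (field names and the join format are illustrative, not the worker's real internals):

```typescript
// Sketch: the worker re-embeds only when the searchable text changed.
export function embedText(
  row: Record<string, string>,
  searchable: string[],
): string {
  // Only searchable fields contribute to the embedded text.
  return searchable.map((f) => row[f] ?? '').join('\n');
}

export function needsReembed(
  oldRow: Record<string, string>,
  newRow: Record<string, string>,
  searchable: string[],
): boolean {
  // A non-searchable column change leaves the embed text, and
  // therefore the cached vector, unchanged.
  return embedText(oldRow, searchable) !== embedText(newRow, searchable);
}
```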
## `semilayer ingest` reference
The Console's Ingest Jobs page exposes the same surface visually.
## Rows deleted in the source still show up in search
Your source dropped a row (hard `DELETE`), but a search still returns it. This is expected for incremental sync — `syncInterval`'s `WHERE updated_at > cursor` filter cannot see rows that no longer exist. Three fixes, in increasing order of infrastructure:
1. One-off cleanup — click Sync now in the Console or run `semilayer sync --lens <name>`. Smart sync's tombstone pass removes rows missing from the source.
2. Schedule the cleanup — add `smartSyncInterval: '24h'` to the lens config. The same tick loop that drives `syncInterval` will run a nightly smart sync and handle deletes automatically, no webhook code required. See Keeping data fresh.
3. Propagate synchronously — wire the records webhook with `{ action: 'delete' }` entries from your CDC pipeline. Sub-second delete propagation, no polling lag. See CDC patterns.
Option 2 is the right default for teams without CDC infrastructure.
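Conceptually, the tombstone pass behind smart sync is a set difference: ids present in the vector index but absent from the source get their vectors deleted. The real implementation isn't documented here; this sketch just pins down the idea:

```typescript
// Sketch: smart sync's tombstone pass as a set difference.
export function tombstones(
  sourceIds: string[],  // ids currently in the source table
  indexedIds: string[], // ids currently in the vector index
): string[] {
  const live = new Set(sourceIds);
  // Anything indexed but no longer in the source should have its
  // vector deleted; incremental sync's cursor filter can't see these.
  return indexedIds.filter((id) => !live.has(id));
}
```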
## When to escalate
If `ingest retry` on a dead-letter job succeeds but it fails again the same way, and the failure isn't an obvious source issue (down DB, bad credentials), file a support ticket with:
- Lens name + org/env
- Job id(s) from `ingest inspect`
- Timeframe of the failure
- A sample `changes` payload that reproduces the error
Platform admins have direct access to the audit log and can see the exact exception trace.