SemiLayer Docs

What is SemiLayer?

SemiLayer is an intelligence layer you bolt onto a database you already have. You point it at your Postgres, Mongo, or whatever table you want to make smart; SemiLayer reads the rows, embeds them, indexes them, and hands back a typed client that gives your app three new capabilities: search, similarity, and smart feeds. Your data never moves. Your schema stays yours. We are additive — we sit beside your database, not in front of it.

Most teams who end up here have been quietly building some version of this themselves: a pgvector column, an embedding pipeline cron-running at 3am, a nearest-neighbor endpoint that works until it doesn't, a feed ranker that's one intern away from collapsing. It's three months of focused engineering to get to something mediocre, six to get it right, and there's always one person who knows how it actually works. SemiLayer is that stack — done, maintained, delivered as a typed client you can call in one line.

The three primitives it gives you:

  • Search — semantic queries over the rows you already have. "summer dresses under $80" returns rows from your catalog, not a hello-world demo.
  • Similarity — beam.products.similar(id) returns related items ranked by meaning, with optional filters.
  • Smart feeds — personalized, ranked feeds that blend recency, engagement, context, and similarity. Declarative config, no ranker to maintain.

You declare what you want each of those to do per table (a lens), you get a typed client back (a beam), and you ship.
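To make the lens-then-beam flow concrete, here is a minimal sketch. The exact lens schema, field names, and rule syntax below are assumptions for illustration — only the shape (fields, facets, access rules per table, then a one-line beam call) comes from the docs.

```typescript
// Hypothetical lens declaration — field names and rule syntax are
// illustrative, not SemiLayer's actual config schema.
type Facet = "search" | "similar" | "feed";

interface Lens {
  table: string;
  fields: string[];          // columns to embed and index
  facets: Facet[];           // which primitives this table gets
  access?: { rule: string }; // gate rows against the end-user JWT
}

const productsLens: Lens = {
  table: "products",
  fields: ["title", "description", "price"],
  facets: ["search", "similar"],
  access: { rule: "row.org_id == user.org_id" }, // illustrative rule syntax
};

// Once the lens is live, the generated beam client is one call:
//   const hits = await beam.products.search("summer dresses under $80");
```

The point of the split: the lens is declarative config you edit in the Console, and the beam is generated code your app imports — changing one regenerates the other.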


How it fits

Here's where SemiLayer sits in a typical stack:

┌─────────────┐         ┌──────────────┐         ┌────────────────┐
│  Your app   │────────▶│  Beam client │────────▶│  SemiLayer API │
│             │  query  │  (typed SDK) │         │    (or Runner) │
└─────────────┘         └──────────────┘         └────────┬───────┘
                                                          │
                              ┌───────────────────────────┼───────────────┐
                              ▼                           ▼               ▼
                     ┌────────────────┐          ┌────────────────┐  ┌──────────┐
                     │    Bridge      │          │  Vector index  │  │ Access   │
                     │ (pg, mongo,    │          │  ┌─┐ ┌─┐ ┌─┐   │  │  rules   │
                     │  sqlite, …)    │          │  └─┘ └─┘ └─┘ … │  │ + auth   │
                     │                │          │  partitioned   │  │          │
                     └────────┬───────┘          │  per tenant    │  │          │
                              │                  └────────────────┘  └──────────┘
                              ▼
                     ┌────────────────┐
                     │   Your DB      │  ← source of truth, untouched
                     └────────────────┘

SemiLayer always sits next to your database, never in front of it. Your app talks to a typed Beam client. Beam talks to the SemiLayer API (or your own Runner). The API answers queries by combining a vector ANN lookup with fresh reads from your DB through a Bridge — so you never get a stale field.

The data path

On first push, we connect to your DB via a Bridge, read the rows you've declared interesting, embed them, and land the vectors in a partitioned vector index. After that, incremental ingest keeps the index fresh — polling on a configurable interval (1m, 15m, 1h…), or pushed via webhook for event-driven flows. Pause is instant; cursor saved, nothing retries in the background.
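The cursor-based polling described above can be sketched in a few lines. This is a toy model, not SemiLayer's implementation — the row shape, `pollOnce` helper, and stand-in embedder are all invented for illustration; what it shows is why pause is instant and lossless: the cursor is the only state.

```typescript
// Toy sketch of incremental ingest: poll for rows updated since the last
// saved cursor, embed them, advance the cursor. Pausing just stops the
// loop; the cursor persists, so nothing retries in the background.
interface Row { id: number; updatedAt: number; text: string }

const db: Row[] = [
  { id: 1, updatedAt: 100, text: "linen maxi dress" },
  { id: 2, updatedAt: 250, text: "wool overcoat" },
];

let cursor = 0; // persisted between runs, so a pause loses nothing

function pollOnce(embed: (text: string) => number[]): number[][] {
  const fresh = db.filter((r) => r.updatedAt > cursor);
  const vectors = fresh.map((r) => embed(r.text));
  if (fresh.length) cursor = Math.max(...fresh.map((r) => r.updatedAt));
  return vectors;
}

const fakeEmbed = (t: string) => [t.length]; // stand-in for a real model
const first = pollOnce(fakeEmbed);  // picks up both rows, cursor advances
const second = pollOnce(fakeEmbed); // nothing new since the cursor: empty
```

A webhook-driven flow is the same loop with the poll replaced by an event trigger; the cursor logic doesn't change.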

The query path

Your app calls beam.products.search(...) — or beam.products.search.stream(...) for live results over WebSocket (feeds, live-tail, and long-running queries stream as results materialize). The client sends a typed request; SemiLayer resolves it: ANN lookup against the partitioned vector index, fresh row reads from your DB via the Bridge, access rules applied, typed results returned. No LLM on the hot path. Latency sits in the 100–200ms range at millions of rows.
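The query path above can be sketched end to end under stated assumptions: brute-force cosine similarity stands in for the partitioned ANN index, and a map lookup stands in for the fresh row read through the Bridge. None of the names below are SemiLayer's API — the sketch only shows why results are never stale: the index holds vectors and ids, while fields come from your DB at query time.

```typescript
// Minimal sketch of the query path: ANN lookup, then fresh reads.
type Vec = number[];

const cosine = (a: Vec, b: Vec): number => {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: Vec) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
};

// 1. The index holds vectors and row ids pointing back to your DB.
const index: { id: number; vec: Vec }[] = [
  { id: 1, vec: [0.9, 0.1] }, // "linen maxi dress"
  { id: 2, vec: [0.1, 0.9] }, // "wool overcoat"
];

// 2. Fields come from the source of truth at query time, never the index.
const dbRows = new Map([
  [1, { title: "linen maxi dress", price: 72 }],
  [2, { title: "wool overcoat", price: 240 }],
]);

function search(queryVec: Vec, k: number) {
  return index
    .map((e) => ({ id: e.id, score: cosine(queryVec, e.vec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((hit) => ({ ...dbRows.get(hit.id)!, score: hit.score }));
}

const top = search([1, 0], 1); // a query vector near "summer dress"
```

If the price changed in your DB a second ago, step 2 picks it up even though the vector was embedded an hour ago — that's the "never a stale field" guarantee.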

Concepts, in the order they show up

  • Bridge — the database adapter. Postgres, Mongo, SQLite, more coming. Open source. Swap databases without re-architecting.
  • Lens — your declaration of "make this table smart." Defines fields, mappings, transforms, which facets (search, similar, feed) you want, and the access rules.
  • Beam — the typed client generated from your lenses. beam.products.search(...), with autocomplete and types that match your schema.
  • Runner — optional container for airgap. Data plane inside your infra. Outbound-only WebSocket back to the hosted Console. No inbound ports, no VPN.
  • Console — the web UI where you configure lenses, manage API keys, watch ingest jobs, view metrics. Hosted by us. Same Console whether you're Runner-deployed or direct.

Auth and access, in plain terms

Two tokens flow through every request:

  1. A platform API key (sk_live_... server-side, pk_live_... browser-safe) identifies your tenant to us.
  2. An optional end-user JWT identifies the actual user making the request — passed via X-User-Token, validated against your own JWKS endpoint (Auth0, Clerk, Supabase, custom — anything OIDC-compatible).

Access rules declared per lens run against that end-user JWT. If a user shouldn't see a row, they don't — enforced at the query level, not in application code. Your RBAC layer becomes one line of config.
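A hedged sketch of what "enforced at the query level" means in practice. The claim names and predicate below are assumptions — the doc specifies only that a per-lens rule runs against the decoded end-user JWT before results are returned.

```typescript
// Hypothetical: a declared rule like "row.org_id == user.org_id" compiled
// into a predicate over (row, JWT claims), applied before results leave
// the query layer — never left to application code.
interface Claims { sub: string; org_id: string }
interface ProductRow { id: number; title: string; org_id: string }

const canSee = (row: ProductRow, user: Claims) => row.org_id === user.org_id;

const rows: ProductRow[] = [
  { id: 1, title: "linen maxi dress", org_id: "acme" },
  { id: 2, title: "wool overcoat", org_id: "globex" },
];

// Claims decoded from the X-User-Token JWT, validated against your JWKS:
const user: Claims = { sub: "u_42", org_id: "acme" };
const visible = rows.filter((r) => canSee(r, user));
```

A user with `org_id: "acme"` never receives the globex row — not because your handler filtered it, but because the query layer never returned it.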


Who it's for

Solo builders and small teams

You're three people in a Slack, shipping fast, and users are starting to complain about search. You don't have time to build an embedding pipeline, and you don't have a data team to "throw this at." You've got real users and real rows — a few thousand, a few hundred thousand — and you need them to be smart by next month. The gate is data, not headcount: if you've got users and records, you're a great fit. Ship in an afternoon, iterate for free.

Product engineers at growing companies

Your product works — it's just dumber than it should be. Search returns exact matches only, related items are hand-curated, the feed is chronological because ranking it "right" was never a roadmap priority. You've got a real DB, a real user base, and the political capital to add one vendor to the stack, not three. SemiLayer gives you search, similarity, and feeds without filing a platform-team RFC.

Platform and infra teams

You've been asked whether to build this in-house. You could — the pattern's well-known (pgvector, a queue, an ingest cron, a ranker, an RBAC layer, a typed client) — but it's a six-month detour, and it shows up on-call forever. Evaluating us is a build-vs-buy calculation: we handle the pipeline, drift detection, indexing, access rules, and client generation, and we hand back a typed SDK your app teams consume like any other internal library.

Enterprise and regulated industries

Your legal team read "vendor" and flinched. You need data that never leaves your infra, a control plane you can audit, and a deployment model your compliance reviewers can actually approve. Runner mode keeps the data plane inside your VPC with outbound-only WebSocket — no VPN, no inbound ports, no "send us your tables." Enterprise self-host is available on request.


What it isn't

This is where a skeptical engineer lands, so we'll be direct, name names, and point at a competitor when that's genuinely the right tool.

1. Not a vector database.

Not in the sense Pinecone, Weaviate, or Qdrant are — a system you stand up, push vectors to, and keep in sync with your real DB. There's a vector index under the hood (honestly — semantic search has to land somewhere), but we run it, and it reads from the database you already have. No second cluster to operate. No sync job to babysit. No separate vendor on the vector side.

2. Not a hosted search service.

Algolia and Typesense require you to replicate your records into their cloud on their schedule. Your data, their index. SemiLayer queries your DB directly through a bridge. Your data never leaves — we read, embed, and serve results over what's already there.

3. Not a full-text search engine.

Elasticsearch and OpenSearch do keyword (BM25) search at scale with a decade of tuning behind them. SemiLayer does semantic search — "summer dresses" matches "linen maxi for July" because the meanings match, not the tokens. Different tool, different problem. They're complementary, not competitive; most serious apps run both.

4. Not an LLM wrapper or RAG toolkit.

LangChain and LlamaIndex hand you primitives and wish you luck in production. SemiLayer is the production system — embedding pipeline, vector store, drift detection, access rules, typed client — with strong opinions where opinions save you time. No LLM call on the hot path. Search latency is measured in milliseconds, not seconds.

5. Not a replacement for your database.

You keep Postgres, Mongo, CockroachDB, whatever. SemiLayer indexes on top and queries through. If you rip us out tomorrow, every byte of your data is exactly where it started. We're additive, not invasive.


When NOT to use us

1. You don't have data yet.

SemiLayer indexes the records already in your database. If you're pre-launch with an empty schema, search and similarity have nothing to work on — build the product first, bolt us on once you've got real rows and real users. The gate is rows on disk, not team size: a solo dev with a few thousand rows and real traffic is a perfectly good fit.

2. Your full-text search already works great.

If you've spent years tuning Elasticsearch — synonyms, BM25, ranking logic — and your users find what they need, migrating doesn't buy you much. Semantic search is additive. If your actual pain isn't "users can't find things by meaning," don't shop for a new tool just because AI is fashionable.

3. You need sub-50ms retrieval at internet scale.

Specialized retrieval infra (Vespa, heavily-tuned Lucene, custom ANN engines) will beat us at ad-auction or exchange-grade workloads. SemiLayer lands around 100–200ms on millions of rows — fast enough for ~95% of product surfaces, not fast enough for bid requests or HFT-adjacent retrieval.

4. You're building pure RAG over a single doc set.

If you just need "index 10k markdown files and answer questions with GPT," LangChain + a vector DB + an OpenAI key gets you there in an afternoon. SemiLayer is built for multi-tenant production systems — lenses, access rules, drift detection, typed clients, per-user gating. For a single-dataset internal chatbot, we're overkill.

5. Your compliance model forbids any vendor-hosted control plane.

Runners keep your data in your infra, but the Console (where you configure lenses, manage keys, view jobs) is hosted by us. If your security review says "nothing SemiLayer-owned in our network path, period," enterprise self-host is available on request — contact us.


Where to go next:

Try it

  • See the live demo → interactive product, no signup.
  • Quickstart — from npm install to your first semantic query in under ten minutes.
  • MCP server — have your agent do it. Claude, Cursor, and Codex can configure lenses, discover relations, and inspect your schema on your behalf.

Go deeper

  • Concepts — lens, beam, bridge, facets, access rules in depth.
  • Reference — every Beam method, every config field, every error code.
  • GitHub — bridges and SDK are open source.

Talk to us

  • Pricing — free tier, usage-based Pro, enterprise self-host on request.
  • Contact — security reviews, enterprise deployment, anything else.