SemiLayer Docs

What is SemiLayer?

SemiLayer is an intelligence layer you bolt onto a database you already have. You point it at your Postgres, Mongo, or whatever table you want to make smart; SemiLayer reads the rows, embeds them, indexes them, and hands back a typed client that gives your app three new capabilities: search, similarity, and smart feeds. Your data never moves. Your schema stays yours. We are additive — we sit beside your database, not in front of it.

Most teams who end up here have been quietly building some version of this themselves: a pgvector column, an embedding pipeline cron-running at 3am, a nearest-neighbor endpoint that works until it doesn't, a feed ranker that's one intern away from collapsing. It's three months of focused engineering to get to something mediocre, six to get it right, and there's always one person who knows how it actually works. SemiLayer is that stack — done, maintained, delivered as a typed client you can call in one line.

The three primitives it gives you:

  • Search — semantic queries over the rows you already have. "summer dresses under $80" returns rows from your catalog, not a hello-world demo.
  • Similarity — beam.products.similar(id) returns related items ranked by meaning, with optional filters.
  • Smart feeds — personalized, ranked feeds that blend recency, engagement, context, and similarity. Declarative config, no ranker to maintain.

You declare what you want each of those to do per table (a lens), you get a typed client back (a beam), and you ship.
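To make the lens-then-beam flow concrete, here is a minimal sketch. The exact lens schema, field names, and rule syntax below are assumptions for illustration — only the shape (fields, facets, access rules per table, then a one-line beam call) comes from the docs.

```typescript
// Hypothetical lens declaration — field names and rule syntax are
// illustrative, not SemiLayer's actual config schema.
type Facet = "search" | "similar" | "feed";

interface Lens {
  table: string;
  fields: string[];          // columns to embed and index
  facets: Facet[];           // which primitives this table gets
  access?: { rule: string }; // gate rows against the end-user JWT
}

const productsLens: Lens = {
  table: "products",
  fields: ["title", "description", "price"],
  facets: ["search", "similar"],
  access: { rule: "row.org_id == user.org_id" }, // illustrative rule syntax
};

// Once the lens is live, the generated beam client is one call:
//   const hits = await beam.products.search("summer dresses under $80");
```

The point of the split: the lens is declarative config you edit in the Console, and the beam is generated code your app imports — changing one regenerates the other.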


How it fits

Here's where SemiLayer sits in a typical stack:

┌─────────────┐         ┌──────────────┐         ┌────────────────┐
│  Your app   │────────▶│  Beam client │────────▶│  SemiLayer API │
│             │  query  │  (typed SDK) │         │    (or Runner) │
└─────────────┘         └──────────────┘         └────────┬───────┘
                                                          │
                              ┌───────────────────────────┼───────────────┐
                              ▼                           ▼               ▼
                     ┌────────────────┐          ┌────────────────┐  ┌──────────┐
                     │    Bridge      │          │  Vector index  │  │ Access   │
                     │ (pg, mongo,    │          │  ┌─┐ ┌─┐ ┌─┐   │  │  rules   │
                     │  sqlite, …)    │          │  └─┘ └─┘ └─┘ … │  │ + auth   │
                     │                │          │  partitioned   │  │          │
                     └────────┬───────┘          │  per tenant    │  │          │
                              │                  └────────────────┘  └──────────┘
                              ▼
                     ┌────────────────┐
                     │   Your DB      │  ← source of truth, untouched
                     └────────────────┘

SemiLayer always sits next to your database, never in front of it. Your app talks to a typed Beam client. Beam talks to the SemiLayer API (or your own Runner). The API answers queries by combining a vector ANN lookup with fresh reads from your DB through a Bridge — so you never get a stale field.

The data path

On first push, we connect to your DB via a Bridge, read the rows you've declared interesting, embed them, and land the vectors in a partitioned vector index. After that, incremental ingest keeps the index fresh — polling on a configurable interval (1m, 15m, 1h…), or pushed via webhook for event-driven flows. Pause is instant; cursor saved, nothing retries in the background.
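The cursor-based polling described above can be sketched in a few lines. This is a toy model, not SemiLayer's implementation — the row shape, `pollOnce` helper, and stand-in embedder are all invented for illustration; what it shows is why pause is instant and lossless: the cursor is the only state.

```typescript
// Toy sketch of incremental ingest: poll for rows updated since the last
// saved cursor, embed them, advance the cursor. Pausing just stops the
// loop; the cursor persists, so nothing retries in the background.
interface Row { id: number; updatedAt: number; text: string }

const db: Row[] = [
  { id: 1, updatedAt: 100, text: "linen maxi dress" },
  { id: 2, updatedAt: 250, text: "wool overcoat" },
];

let cursor = 0; // persisted between runs, so a pause loses nothing

function pollOnce(embed: (text: string) => number[]): number[][] {
  const fresh = db.filter((r) => r.updatedAt > cursor);
  const vectors = fresh.map((r) => embed(r.text));
  if (fresh.length) cursor = Math.max(...fresh.map((r) => r.updatedAt));
  return vectors;
}

const fakeEmbed = (t: string) => [t.length]; // stand-in for a real model
const first = pollOnce(fakeEmbed);  // picks up both rows, cursor advances
const second = pollOnce(fakeEmbed); // nothing new since the cursor: empty
```

A webhook-driven flow is the same loop with the poll replaced by an event trigger; the cursor logic doesn't change.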

The query path

Your app calls beam.products.search(...) — or beam.products.search.stream(...) for live results over WebSocket (feeds, live-tail, and long-running queries stream as results materialize). The client sends a typed request; SemiLayer resolves it: ANN lookup against the partitioned vector index, fresh row reads from your DB via the Bridge, access rules applied, typed results returned. No LLM on the hot path. Latency sits in the 100–200ms range at millions of rows.
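The query path above can be sketched end to end under stated assumptions: brute-force cosine similarity stands in for the partitioned ANN index, and a map lookup stands in for the fresh row read through the Bridge. None of the names below are SemiLayer's API — the sketch only shows why results are never stale: the index holds vectors and ids, while fields come from your DB at query time.

```typescript
// Minimal sketch of the query path: ANN lookup, then fresh reads.
type Vec = number[];

const cosine = (a: Vec, b: Vec): number => {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: Vec) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
};

// 1. The index holds vectors and row ids pointing back to your DB.
const index: { id: number; vec: Vec }[] = [
  { id: 1, vec: [0.9, 0.1] }, // "linen maxi dress"
  { id: 2, vec: [0.1, 0.9] }, // "wool overcoat"
];

// 2. Fields come from the source of truth at query time, never the index.
const dbRows = new Map([
  [1, { title: "linen maxi dress", price: 72 }],
  [2, { title: "wool overcoat", price: 240 }],
]);

function search(queryVec: Vec, k: number) {
  return index
    .map((e) => ({ id: e.id, score: cosine(queryVec, e.vec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((hit) => ({ ...dbRows.get(hit.id)!, score: hit.score }));
}

const top = search([1, 0], 1); // a query vector near "summer dress"
```

If the price changed in your DB a second ago, step 2 picks it up even though the vector was embedded an hour ago — that's the "never a stale field" guarantee.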

Concepts, in the order they show up

  • Bridge — the database adapter. Postgres, Mongo, SQLite, more coming. Open source. Swap databases without re-architecting.
  • Lens — your declaration of "make this table smart." Defines fields, mappings, transforms, which facets (search, similar, feed) you want, and the access rules.
  • Beam — the typed client generated from your lenses. beam.products.search(...), with autocomplete and types that match your schema.
  • Runner — optional container for airgap. Data plane inside your infra. Outbound-only WebSocket back to the hosted Console. No inbound ports, no VPN.
  • Console — the web UI where you configure lenses, manage API keys, watch ingest jobs, view metrics. Hosted by us. Same Console whether you're Runner-deployed or direct.

Auth and access, in plain terms

Two tokens flow through every request:

  1. A platform API key (sk_live_... server-side, pk_live_... browser-safe) identifies your tenant to us.
  2. An optional end-user JWT identifies the actual user making the request — passed via X-User-Token, validated against your own JWKS endpoint (Auth0, Clerk, Supabase, custom — anything OIDC-compatible).

Access rules declared per lens run against that end-user JWT. If a user shouldn't see a row, they don't — enforced at the query level, not in application code. Your RBAC layer becomes one line of config.
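A hedged sketch of what "enforced at the query level" means in practice. The claim names and predicate below are assumptions — the doc specifies only that a per-lens rule runs against the decoded end-user JWT before results are returned.

```typescript
// Hypothetical: a declared rule like "row.org_id == user.org_id" compiled
// into a predicate over (row, JWT claims), applied before results leave
// the query layer — never left to application code.
interface Claims { sub: string; org_id: string }
interface ProductRow { id: number; title: string; org_id: string }

const canSee = (row: ProductRow, user: Claims) => row.org_id === user.org_id;

const rows: ProductRow[] = [
  { id: 1, title: "linen maxi dress", org_id: "acme" },
  { id: 2, title: "wool overcoat", org_id: "globex" },
];

// Claims decoded from the X-User-Token JWT, validated against your JWKS:
const user: Claims = { sub: "u_42", org_id: "acme" };
const visible = rows.filter((r) => canSee(r, user));
```

A user with `org_id: "acme"` never receives the globex row — not because your handler filtered it, but because the query layer never returned it.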


Who it's for

Solo builders and small teams

You're three people in a Slack, shipping fast, and users are starting to complain about search. You don't have time to build an embedding pipeline, and you don't have a data team to "throw this at." You've got real users and real rows — a few thousand, a few hundred thousand — and you need them to be smart by next month. The gate is data, not headcount: if you've got users and records, you're a great fit. Ship in an afternoon, iterate for free.

Product engineers at growing companies

Your product works — it's just dumber than it should be. Search returns exact matches only, related items are hand-curated, the feed is chronological because ranking it "right" was never a roadmap priority. You've got a real DB, a real user base, and the political capital to add one vendor to the stack, not three. SemiLayer gives you search, similarity, and feeds without filing a platform-team RFC.

Platform and infra teams

You've been asked whether to build this in-house. You could — the pattern's well-known (pgvector, a queue, an ingest cron, a ranker, an RBAC layer, a typed client) — but it's a six-month detour, and it shows up on-call forever. Evaluating us is a build-vs-buy calculation: we handle the pipeline, drift detection, indexing, access rules, and client generation, and we hand back a typed SDK your app teams consume like any other internal library.

Enterprise and regulated industries

Your legal team read "vendor" and flinched. You need data that never leaves your infra, a control plane you can audit, and a deployment model your compliance reviewers can actually approve. Runner mode keeps the data plane inside your VPC with outbound-only WebSocket — no VPN, no inbound ports, no "send us your tables." Enterprise self-host is available on request.


What it isn't

This is where a skeptical engineer lands, so we'll be direct, name names, and point at a competitor when that's genuinely the right tool.

1. Not a vector database.

Not in the sense Pinecone, Weaviate, or Qdrant are — a system you stand up, push vectors to, and keep in sync with your real DB. There's a vector index under the hood (honestly — semantic search has to land somewhere), but we run it, and it reads from the database you already have. No second cluster to operate. No sync job to babysit. No separate vendor on the vector side.

2. Not a hosted search service.

Algolia and Typesense require you to replicate your records into their cloud on their schedule. Your data, their index. SemiLayer queries your DB directly through a bridge. Your data never leaves — we read, embed, and serve results over what's already there.

3. Not a full-text search engine.

Elasticsearch and OpenSearch do keyword (BM25) search at scale with a decade of tuning behind them. SemiLayer does semantic search — "summer dresses" matches "linen maxi for July" because the meanings match, not the tokens. Different tool, different problem. They're complementary, not competitive; most serious apps run both.

4. Not an LLM wrapper or RAG toolkit.

LangChain and LlamaIndex hand you primitives and wish you luck in production. SemiLayer is the production system — embedding pipeline, vector store, drift detection, access rules, typed client — with strong opinions where opinions save you time. No LLM call on the hot path. Search latency is measured in milliseconds, not seconds.

5. Not a replacement for your database.

You keep Postgres, Mongo, CockroachDB, whatever. SemiLayer indexes on top and queries through. If you rip us out tomorrow, every byte of your data is exactly where it started. We're additive, not invasive.


When NOT to use us

1. You don't have data yet.

SemiLayer indexes the records already in your database. If you're pre-launch with an empty schema, search and similarity have nothing to work on — build the product first, bolt us on once you've got real rows and real users. The gate is rows on disk, not team size: a solo dev with a few thousand rows and real traffic is a perfectly good fit.

2. Your full-text search already works great.

If you've spent years tuning Elasticsearch — synonyms, BM25, ranking logic — and your users find what they need, migrating doesn't buy you much. Semantic search is additive. If your actual pain isn't "users can't find things by meaning," don't shop for a new tool just because AI is fashionable.

3. You need sub-50ms retrieval at internet scale.

Specialized retrieval infra (Vespa, heavily-tuned Lucene, custom ANN engines) will beat us at ad-auction or exchange-grade workloads. SemiLayer lands around 100–200ms on millions of rows — fast enough for ~95% of product surfaces, not fast enough for bid requests or HFT-adjacent retrieval.

4. You're building pure RAG over a single doc set.

If you just need "index 10k markdown files and answer questions with GPT," LangChain + a vector DB + an OpenAI key gets you there in an afternoon. SemiLayer is built for multi-tenant production systems — lenses, access rules, drift detection, typed clients, per-user gating. For a single-dataset internal chatbot, we're overkill.

5. Your compliance model forbids any vendor-hosted control plane.

Runners keep your data in your infra, but the Console (where you configure lenses, manage keys, view jobs) is hosted by us. If your security review says "nothing SemiLayer-owned in our network path, period," enterprise self-host is available on request — contact us.


Where to go next:

Try it

  • See the live demo → interactive product, no signup.
  • Quickstart — from npm install to your first semantic query in under ten minutes.
  • MCP server — have your agent do it. Claude, Cursor, and Codex can configure lenses, discover relations, and inspect your schema on your behalf.

Go deeper

  • Concepts — lens, beam, bridge, facets, access rules in depth.
  • Reference — every Beam method, every config field, every error code.
  • GitHub — bridges and SDK are open source.

Talk to us

  • Pricing — free tier, usage-based Pro, enterprise self-host on request.
  • Contact — security reviews, enterprise deployment, anything else.