Feeds — Signals
Feeds rank rows using two kinds of signals:
- Engagement — rows in another lens you own that reference the candidate (likes, views, clicks).
- Context — values your app passes in at call time (user preferences, current topic, seed record).
Both funnel into the scorers (engagement, similarity). Both have one
non-negotiable property: the data stays on your side. SemiLayer reads
your engagement lens through a bridge it already has access to, and accepts
context as opaque JSON. We never aggregate engagement across customers.
Engagement — via a sibling lens
The usual shape: your product has a recipes table and a recipe_likes
table. Declare both as lenses, declare a relation between them, and point the
engagement scorer at the sibling lens.
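A sketch of that wiring, under the assumption that feed config is declared as a plain object. The field names (`scorers`, `events`, the relation string) are hypothetical; consult the actual config schema for real names:

```typescript
// Hypothetical feed config — shape and field names are assumptions.
const feedConfig = {
  lens: "recipes", // candidate rows come from here
  scorers: {
    engagement: {
      lens: "recipe_likes",                             // sibling lens holding the signal rows
      relation: "recipe_likes.recipe_id -> recipes.id", // the declared relation
      events: ["like"],                                 // which rows count as engagement
    },
  },
};
```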
Why through a lens, not a raw table
Because a lens is already governed. It has:
- Access rules (enforced for pk_callers)
- A bridge (airgap-aware, runner-routable)
- A known schema the config validates against
Nothing escapes the system. There's no "and also give the feed ranker raw table access" path to audit.
Engagement config — explicit vs relation-derived
Context — what your app passes in
Context is an opaque Record<string, unknown> passed per-request. Three
canonical shapes, all valid for similarity.against:
1. Pre-computed vector
against: 'user_profile_vec'. Server treats any number[] of the right
dimension as a ready-to-compare vector. Zero embedding API call. Best for
personalization where you already run your own profile pipeline.
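Shape 1 in context form. The 1536-dimension size below is an assumption for illustration; any `number[]` matching your embedding dimension works:

```typescript
// Shape 1: a pre-computed profile vector, passed as-is.
// Assumption: the embedding space is 1536-dim — use your own dimension.
const context = {
  user_profile_vec: Array.from({ length: 1536 }, () => 0),
};
// The similarity scorer would then point at it:
// { against: 'user_profile_vec' }
```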
2. Embedded text
against: 'liked_titles'. Server joins the array with newlines, embeds it
on demand, and caches the result per-pod. Same pod + same text = one API
call regardless of how many requests.
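Shape 2 in context form, with the join the server performs made explicit. Titles are made up; the newline join is taken from the text above:

```typescript
// Shape 2: raw text the server embeds on demand.
const likedTitles = ["Weeknight pad thai", "One-pot mushroom risotto"];
const context = { liked_titles: likedTitles };

// The server joins with newlines before embedding, so the same titles
// in the same order produce the same cache key on the same pod.
const joined = likedTitles.join("\n");
```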
3. Row id → stored vector
against: { from: 'context.seedRecordId', mode: 'recordVector' }. Server
looks up the row's stored vector and uses it directly. Zero
embedding API call, ever. Powers "more like this." See
Related items.
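Shape 3 in context form, using the `against` object from the text; the record id is made up:

```typescript
// Shape 3: pass a row id; the server reuses that row's stored vector.
const context = { seedRecordId: "recipe_123" };

const similarity = {
  against: { from: "context.seedRecordId", mode: "recordVector" as const },
};
```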
The 600-likes problem
If a user has liked 600 things and you pass all 600 in liked_titles, the
embedding cost balloons. Three scaling patterns:
- Compress on your side — pre-compute an average vector from the user's last N likes and pass it in as a number[]. No server embedding cost, no token limits.
- Slice to recency — pass only the last 20 liked titles. Recent likes tend to be the better signal; older likes drift.
- Use recordVector — pick one seed record per request (e.g. the most recently liked) and use mode: 'recordVector'. Zero API cost.
Most production feeds use (1). Clients that can't compute vectors use (2). Only "show me things like this" uses (3).
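Pattern (1), compressing on your side, is just an element-wise mean over the liked-item vectors. A minimal sketch, using toy 3-dimensional vectors (real ones would match your embedding dimension):

```typescript
// Compress N liked-item vectors into one mean vector, client-side.
// The result is passed to the feed as a plain number[] in context.
function meanVector(vectors: number[][]): number[] {
  if (vectors.length === 0) throw new Error("need at least one vector");
  const dim = vectors[0].length;
  const sum = new Array<number>(dim).fill(0);
  for (const v of vectors) {
    if (v.length !== dim) throw new Error("dimension mismatch");
    for (let i = 0; i < dim; i++) sum[i] += v[i];
  }
  return sum.map((s) => s / vectors.length);
}

// Toy example: two orthogonal 3-dim "like" vectors.
const likedVecs = [
  [1, 0, 0],
  [0, 1, 0],
];
const profileVec = meanVector(likedVecs); // → [0.5, 0.5, 0]
```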
What SemiLayer never does
- Never aggregates engagement across customers. Your recipe_likes lens is yours; the feed ranker reads it through your bridge with your access rules.
- Never persists your context. Context is per-request; we cache the embedded form keyed by sha256(text) for up to 5 minutes (configurable), then drop it.
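The caching behavior described above — key by sha256(text), hold for a TTL, then drop — can be sketched as follows. The Map-based store and helper names are assumptions for illustration; only the sha256 key and 5-minute default come from the text:

```typescript
import { createHash } from "node:crypto";

// Sketch of a per-pod embedding cache keyed by sha256(text).
const TTL_MS = 5 * 60 * 1000; // 5-minute default, per the docs (configurable)
const cache = new Map<string, { vec: number[]; expires: number }>();

function cacheKey(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

function putCached(text: string, vec: number[], now: number): void {
  cache.set(cacheKey(text), { vec, expires: now + TTL_MS });
}

function getCached(text: string, now: number): number[] | undefined {
  const entry = cache.get(cacheKey(text));
  if (!entry || entry.expires <= now) return undefined; // expired or missing
  return entry.vec;
}
```

Same pod, same text, one embedding call within the TTL; nothing survives past it.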
Your data, your ranking signals. SemiLayer is the coordinator.