Data Mapping — Recipes
Eight patterns pulled from production configs. Copy the block that matches, adapt the field names, and move on.
Full name from first + last
The canonical merge case. Use concat with a space.
Display name with fallbacks
Prefer nickname, fall back to preferred name, fall back to legal first
name. coalesce picks the first non-null.
Cents → dollars, rounded
Integer cents is a common source shape. Round to 2 decimals for display.
custom here because there's no built-in "divide by N." Could also be
done with toNumber + a scale factor in your source view.
Normalize a tag list
Source has a comma-separated string. Output an array with trimmed, lowercased values.
After this: 'Thai, Vegan, ' → ['thai', 'vegan'].
URL canonicalization
Strip trailing slashes, upgrade http:// to https://, lowercase.
Each replace is a global regex. Chain them for multi-step cleanup.
Truncate a long description for display
Keep a separate full-body field (for embedding), truncate for UI.
Both point at the same source column. The index carries both: one for
search ranking, one for query() list views.
Derived slug
Compose a URL-safe slug from the title. Uses custom because the logic
is too specific for built-ins.
No from needed — custom reads from row directly. The output field
is purely derived.
Coerce + check with a companion field
Transforms don't throw. For data hygiene, pair a coerced field with a validity flag so bad data is queryable rather than invisible.
Now beam.products.query({ where: { priceValid: false } }) surfaces the
rows with bad source data — rather than leaving them lurking in the
index with NaN prices.
Null sentinels at the lens level
If your source uses 'N/A', '', or '-' to mean "no value," declare
them at the lens level via nullValues. Every field then treats those
values as null before mapping runs.
Saves you from writing the same custom check on every field.
Composition — more than two chained
A chain can be as long as it needs to be. This turns a messy multi-source price string into a canonical number:
Seven operations, one declarative block, zero ETL infrastructure.
When to reach for a view instead
Mapping is powerful but doesn't replace a source-side view. Reach for a view when:
- You need aggregation (sum, avg, count across rows).
- You need filters that remove rows (e.g.
WHERE deleted_at IS NULL). Mapping can't skip rows; usetable: 'active_users_view'or similar. - You need joins at ingest time (not via SemiLayer's
include). Materialize the join in a view; point the lens at it.
Mapping is for reshaping row-by-row. Anything that crosses rows or creates/destroys rows belongs upstream.