Data Mapping — Transforms
After from resolves a value, transform can reshape it. Transforms
are declarative, composable, and run once per row during ingest.
The shape
A single transform or an array. Arrays chain left-to-right — the output of one becomes the input of the next.
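To make the single-or-array shape concrete, here is a minimal sketch of how a chain could be evaluated. The Transform type, the impls table, and applyTransforms are hypothetical names for illustration; only the type key and the single-or-array form come from the text above.

```typescript
// Hypothetical shapes: only `type` and the single-or-array form are documented.
type Transform = { type: string; [param: string]: unknown };
type TransformSpec = Transform | Transform[];

// Stand-in implementations for two transforms, keyed by `type`.
const impls: Record<string, (value: unknown, t: Transform) => unknown> = {
  trim: (v) => (typeof v === "string" ? v.trim() : v),
  split: (v, t) => (typeof v === "string" ? v.split(String(t.separator)) : v),
};

// Normalize to an array, then feed each output into the next transform.
function applyTransforms(value: unknown, spec: TransformSpec): unknown {
  const chain = Array.isArray(spec) ? spec : [spec];
  return chain.reduce((acc: unknown, t: Transform) => {
    const impl = impls[t.type];
    return impl ? impl(acc, t) : acc;
  }, value);
}

// A single transform and a two-step chain over the same input.
const single = applyTransforms("  a,b  ", { type: "trim" });
const chained = applyTransforms("  a,b  ", [
  { type: "trim" },
  { type: "split", separator: "," },
]);
```

In this sketch, single is the trimmed string and chained is the trimmed string split into two elements, because split receives the output of trim rather than the raw input.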
The 14 built-in transforms
Type coercion
Use when the source stores values in the wrong type. Common cases: a
'true' string from an old schema where booleans weren't native, or
price_cents stored as an integer string after a JSON import.
Numeric
mode defaults to 'round'. 'ceil' and 'floor' do what they say.
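The three modes can be pictured as scale, apply, scale back. The sketch below is an illustration, not the library's implementation: roundTo is a hypothetical helper, and a decimals default of 0 is an assumption.

```typescript
type RoundMode = "round" | "ceil" | "floor";

// Hypothetical helper. `decimals` and `mode` mirror the round transform's
// parameter names; the default of 0 decimals is an assumption.
function roundTo(value: number, decimals: number = 0, mode: RoundMode = "round"): number {
  const factor = Math.pow(10, decimals);
  // Pick Math.round, Math.ceil, or Math.floor by mode name.
  return Math[mode](value * factor) / factor;
}

const rounded = roundTo(3.14159, 2);         // 3.14 (mode defaults to "round")
const ceiled = roundTo(3.14159, 2, "ceil");  // 3.15
const floored = roundTo(19.99, 0, "floor");  // 19
```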
String — whitespace and casing
String — length and replacement
truncate requires length: number > 0. replace is a global regex replace. pattern is a regex source string (no flags; g is implicit); escape backslashes if you need literal ones.
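A rough sketch of what these two transforms amount to, mainly to show the backslash escaping. The helper names and the replacement parameter are assumptions, and truncate is assumed to cut without appending an ellipsis.

```typescript
// Hypothetical helpers. `pattern` and `length` are documented parameter names;
// `replacement` is an assumed one.
function replaceAll(value: string, pattern: string, replacement: string): string {
  // The pattern string carries no flags, so the global flag is added here.
  return value.replace(new RegExp(pattern, "g"), replacement);
}

// Assumption: a plain cut at `length` characters, nothing appended.
function truncate(value: string, length: number): string {
  return value.length > length ? value.slice(0, length) : value;
}

// In a string literal, "\\$" is the regex source \$ (a literal dollar sign).
const stripped = replaceAll("$5 and $3", "\\$", "");      // "5 and 3"
const digitsMasked = replaceAll("id-42-77", "\\d", "#");  // "id-##-##"
const short = truncate("semilayer", 4);                   // "semi"
```

Because the pattern is a string rather than a regex literal, every backslash has to be doubled once for the string itself.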
Array ↔ string
Both require separator: string.
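As a quick illustration of the separator parameter, the sketch below pairs the two config shapes with the plain string operations they correspond to. The config objects follow the type and separator names used in this page; mapping them onto String.split and Array.join is an assumption about behavior.

```typescript
// Config shapes as they might appear in a mapping.
const splitConfig = { type: "split", separator: ", " };
const joinConfig = { type: "join", separator: " | " };

// Assumed behavior: the same as the native string and array methods.
const tags = "red, green, blue".split(splitConfig.separator);
const joined = tags.join(joinConfig.separator); // "red | green | blue"
```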
Default value
A default transform with value: 'Unknown' is functionally equivalent to
setting nullAs: 'Unknown' and undefinedAs: 'Unknown' on the field. Use
whichever reads better in context. The transform form tends to be
convenient inside a chain, e.g. [{type: 'trim'}, {type: 'default', value: '—'}].
Custom JavaScript
A JS function body that receives (value, field, row) and returns the
new value:
- value: the post-resolution, post-prior-transforms value being mapped.
- field: the output field name (string).
- row: the entire raw source row (after null-sentinel replacement, before mapping). Useful when a transform needs to peek at another column.
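To show how the three arguments fit together, here is a sketch that compiles a body string into a function and calls it. Using the Function constructor is an assumption about how the worker evaluates the body, and compileCustom, the column names, and the conversion rate are hypothetical.

```typescript
type Row = Record<string, unknown>;
type CustomFn = (value: unknown, field: string, row: Row) => unknown;

// Assumption: the body string becomes a function of (value, field, row).
function compileCustom(body: string): CustomFn {
  return new Function("value", "field", "row", body) as CustomFn;
}

// The body peeks at two other raw columns through `row`.
const toUsd = compileCustom(
  "return row.currency === 'EUR' ? Number(value) * Number(row.eur_to_usd) : value;"
);

const sourceRow: Row = { amount: 10, currency: "EUR", eur_to_usd: 1.5 };
const converted = toUsd(sourceRow.amount, "amount_usd", sourceRow); // 15
```

Note that the body reads row.currency and row.eur_to_usd as raw source values, which is the behavior described for row above.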
Chaining left-to-right
Arrays run in order, first element to last: left-to-right on one line, top-to-bottom when written one transform per line. Each transform's output is the next one's input.
Real-world chain. Messy input (money that sometimes has a $,
sometimes doesn't, sometimes a string, sometimes a number) gets
normalized to a rounded number.
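The money chain can be pictured as the following sequence of steps. This is a plain-function illustration of the data flow, not the mapping config itself: each arrow function stands in for one transform, and none of the names are the library's API.

```typescript
// Each step stands in for one transform in the chain.
const steps: Array<(v: unknown) => unknown> = [
  (v) => String(v),                          // coerce to string
  (v) => (v as string).trim(),               // trim whitespace
  (v) => (v as string).replace(/\$/g, ""),   // strip any "$"
  (v) => Number(v),                          // coerce to number
  (v) => Math.round(v as number),            // round to a whole number
];

// Run the steps in order, feeding each output into the next step.
const normalizeMoney = (input: unknown): unknown =>
  steps.reduce((acc: unknown, step) => step(acc), input);

const fromDollarString = normalizeMoney("$1299.60"); // 1300
const fromPaddedString = normalizeMoney(" 42.2 ");   // 42
const fromNumber = normalizeMoney(17.5);             // 18
```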
When transforms run
Per row, during ingest, after from resolution and nullAs/undefinedAs
substitution:
So:
- A custom transform that reads row.some_other_column sees the raw source value (before that other field's mapping ran).
- A transform acting on a merge: 'concat' result sees the joined string, not the parts.
- null and undefined after nullAs/undefinedAs handling depend on whether those fields are set. Most transforms are null-tolerant (they return '', [], null as appropriate).
Never-throws contract
Transforms never throw. If a transform encounters something it can't
handle (e.g. { type: 'round' } on a string that doesn't parse), the
original value is returned unchanged and an internal log entry is
emitted.
This is deliberate. Ingest processes millions of rows; one bad value should not kill the batch. The tradeoff is that silently-wrong data becomes your job to catch — see Mapping recipes for patterns like "coerce-then-check-with-a-second-field."
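The contract can be sketched as a wrapper around each transform call. The names safeApply, logEntries, and strictRound are hypothetical, and the real log format is not documented here; the point is the pass-through on failure and the caller-side check it makes necessary.

```typescript
type TransformFn = (value: unknown) => unknown;

const logEntries: string[] = []; // stand-in for the internal log

// On any failure, record a log entry and return the original value unchanged.
function safeApply(name: string, fn: TransformFn, value: unknown): unknown {
  try {
    return fn(value);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    logEntries.push(name + " failed: " + reason);
    return value;
  }
}

// A rounding step that throws on unparseable input.
const strictRound: TransformFn = (v) => {
  const n = Number(v);
  if (isNaN(n)) throw new Error("not a number");
  return Math.round(n);
};

const ok = safeApply("round", strictRound, "12.6");           // 13
const passedThrough = safeApply("round", strictRound, "n/a"); // "n/a"

// The downstream check that becomes your job: did the value end up a number?
const suspicious = typeof passedThrough !== "number"; // true
```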
Validation
semilayer push validates transform shapes:
- type must be one of the 14 built-in values.
- split / join / replace / truncate / custom require their required params.
- round.decimals must be a non-negative integer; round.mode must be 'round' | 'ceil' | 'floor'.
- custom.body must be a string. (It's not parsed for safety; it runs in the worker.)
Not validated: whether custom.body is well-formed JS, whether a
replace pattern is a valid regex, whether a transform chain ends up
producing the right type. These fail at ingest time with a quiet log
entry + the original value passed through.
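The split between what is and isn't validated can be sketched as a shape check. This is a hypothetical validator covering only the rules listed above; it skips the check against the 14 type names, and validateShape is not a real function of the CLI.

```typescript
type Transform = { type: string; [param: string]: unknown };

// Returns a list of problems; an empty list means the shape passes these checks.
function validateShape(t: Transform): string[] {
  const errors: string[] = [];
  const modes = ["round", "ceil", "floor"];
  switch (t.type) {
    case "split":
    case "join":
      if (typeof t.separator !== "string") errors.push(t.type + ".separator must be a string");
      break;
    case "truncate":
      if (typeof t.length !== "number" || t.length <= 0) errors.push("truncate.length must be a number > 0");
      break;
    case "replace":
      // Only the type is checked; an invalid regex source still passes.
      if (typeof t.pattern !== "string") errors.push("replace.pattern must be a string");
      break;
    case "round":
      if (t.decimals !== undefined && !(Number.isInteger(t.decimals) && (t.decimals as number) >= 0))
        errors.push("round.decimals must be a non-negative integer");
      if (t.mode !== undefined && modes.indexOf(t.mode as string) === -1)
        errors.push("round.mode must be 'round' | 'ceil' | 'floor'");
      break;
    case "custom":
      // The body is never parsed, so malformed JS passes too.
      if (typeof t.body !== "string") errors.push("custom.body must be a string");
      break;
  }
  return errors;
}

const badRound = validateShape({ type: "round", decimals: -1, mode: "up" });
const badRegexStillPasses = validateShape({ type: "replace", pattern: "(" });
const badJsStillPasses = validateShape({ type: "custom", body: "return (" });
```

Here badRound reports two problems, while the unbalanced regex and the malformed body both come back clean and would only surface at ingest time.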
Next: Recipes — common mapping patterns, production-tested.