Pulp Engine — Technical Specification
Engineering reference for the template model and rendering pipeline as implemented. Documents behaviour, not design intent.
File retains its
mvp-technical-spec.mdfilename for link stability; the content is the current technical reference and is no longer scoped to the original MVP.
1. Template Structure
A template is a JSON file conforming to TemplateDefinition:
interface TemplateDefinition {
key: string // stable slug, unique across all templates
version: string // semver (e.g. "1.0.0")
name: string
description?: string
inputSchema: Record<string, unknown> // JSON Schema Draft-07, validates ADAPTED data
fieldMappings?: FieldMappingEntry[] // raw payload → adapted data shape
renderConfig?: RenderConfig
document: DocumentNode // root AST node
}
RenderConfig
interface LocaleConfig {
currency?: string // BCP 47 locale tag for Intl.NumberFormat. Default: "en-US"
date?: string // BCP 47 locale tag for Intl.DateTimeFormat. Default: "en-AU"
}
interface RenderConfig {
paperSize?: 'A4' | 'A3' | 'Letter' | 'Legal' | 'Tabloid' // default: A4
orientation?: 'portrait' | 'landscape' // default: portrait
margin?: { top: string; right: string; bottom: string; left: string }
customCss?: string // CSS injected into <style>. Safe mode (default): @import and url() fetch
// vectors are stripped. Pass allowUnsafeCustomCss: true to
// HtmlRenderer.render() for verbatim injection — admin sources only.
fontImports?: string[] // Font @import URLs. Always validated: https: only, private/internal hosts blocked.
locale?: LocaleConfig // locale for currency/date helpers
pageHeader?: string // HTML allowlist-sanitized at render time; script/event-handler/external-resource vectors stripped; residual CSS and Chromium risk not zero. Passed to Puppeteer headerTemplate after sanitization. Inline styles only; font-size: 0 by default. Span classes: pageNumber, totalPages, date, title, url. Capped at 10,000 chars. Visible in PDF output only.
pageFooter?: string // Same sanitization model and constraints as pageHeader. Passed to Puppeteer footerTemplate after sanitization. Capped at 10,000 chars. Visible in PDF output only.
}
locale applies to all {{currency}} and {{date}} helper calls within the template, including table column formatters. The number helper uses toFixed() and is not locale-aware. Both fields are optional; defaults are preserved when locale is omitted.
Templates are stored in PostgreSQL as JSONB (template_versions.definition). Multiple versions are retained; the latest by created_at is used for rendering. The TemplateDefinition shape is enforced at runtime by Zod (TemplateDefinitionSchema in apps/api/src/validation/template-definition.schema.ts) at all persistence boundaries — POST create, PUT update, and version restore. Unknown structural fields are rejected (strict mode); key must match /^[a-zA-Z0-9][a-zA-Z0-9_-]*$/ and version must be major.minor.patch with optional prerelease/build metadata suffixes.
2. AST Node Types
All nodes extend BaseNode:
interface BaseNode {
id: string
type: string
style?: NodeStyle // camelCase CSS properties → inline style=""
paginationHints?: PaginationHints
meta?: Record<string, unknown>
}
Structural nodes
Not part of the AnyContentNode union — not valid inside children arrays directly.
| Node | Key fields |
|---|---|
document | children: SectionNode[] |
section | label?, children: AnyContentNode[] |
column | span? (flex proportion), children: AnyContentNode[] |
Content nodes
AnyContentNode union — valid anywhere a children array is accepted.
| Node | Key fields | HTML output |
|---|---|---|
container | children: AnyContentNode[] | <div> |
columns | columns: ColumnNode[], gap? | <div style="display:flex"> |
heading | level: 1–6, content: string (Handlebars) | <h1>–<h6> |
text | content: string (Handlebars), inline? | <p> or <span> |
image | src: string (Handlebars), alt?, fit?, width?, height? | <img> |
table | See Table Behaviour | <table> |
chart | See Chart Behaviour | <div class="chart-wrapper"> + inline SVG |
spacer | size: string, horizontal? | <div style="height:N"> or <span style="width:N"> |
divider | thickness?, color?, dashStyle? | <hr> |
pageBreak | (no extra fields) | <div style="break-after:page"> |
conditional | condition, children, elseChildren? | See Conditional and Repeater Behaviour |
repeater | dataPath, children, emptyChildren? | See Conditional and Repeater Behaviour |
richText | content: RichTextBlock[] | See RichText Node Behaviour |
signatureField | label, anchorText, tabType?, signerRole?, hidden? | <span> (anchorText rendered; hidden mode: color:transparent; font-size:2pt) |
templateRef | templateKey, dataPath?, version?, fieldMappings? | Inline sections from referenced template; see Template Composition |
toc | maxDepth? (1–6, default 3), ordered?, title?, showPageNumbers?, leaderStyle? ('dots'|'dashes'|'none') | <div class="toc"> + <ul>/<ol> with anchor links to headings; with showPageNumbers also emits a leader + reserved page-number slot that is filled post-render by pdf-lib (see §14) |
positioned | top, left, width, height, children: AnyContentNode[] | <div style="position:absolute;..."> — physical-unit CSS values (mm, pt, in, px) |
pivotTable | See Pivot Table Behaviour | <table> — cross-tab grid with row/column dimension grouping and measure aggregation |
barcode | value: string (Handlebars), format (qr / code128 / code39 / ean13 / upc / datamatrix), width?, height?, ecLevel? (QR), displayValue? (1-D formats) | <img> with inline SVG data URI, produced by @pulp-engine/barcode-renderer (bwip-js + qrcode) |
custom | customType: string, props?, children? | Plugin-registered node; rendered by the plugin’s HTML renderer (see plugin-api) |
NodeStyle
Any camelCase CSS property is valid (e.g. fontWeight, backgroundColor, marginBottom). Compiled to kebab-case inline style="" attributes. No validation is applied to style values.
3. Field Mapping Behaviour
Field mappings transform the raw caller payload into the adapted data shape that the renderer and schema validator operate on.
interface FieldMappingEntry {
targetPath: string // dot-path written into adapted data, e.g. "customer.name"
sourcePath?: string // dot-path read from raw payload; defaults to targetPath if omitted
defaultValue?: unknown
expression?: string // jexl expression evaluated against the raw payload; takes precedence over sourcePath when non-empty
}
Algorithm (DataAdapter.adapt):
- Deep-clone the raw payload (
structuredClone) as the working object — unmapped keys remain accessible - For each
FieldMappingEntry:- If
expressionis a non-empty string, evaluate it as a jexl expression against the raw payload; use the result as the value - Otherwise, read value from raw payload at
sourcePath(ortargetPathifsourcePathis absent) using lodash-style dot-path resolution - If the resolved value is
undefinedanddefaultValueis set, usedefaultValue - If the final value is not
undefined, write it totargetPathin the result
- If
- Fields with no mapping pass through unchanged (the clone preserves them)
Important: defaultValue is only applied when the source resolves to undefined. A source value of null, false, 0, or "" is used as-is.
Expression evaluation: jexl expressions in expression are evaluated safely — missing identifiers resolve to undefined (not a throw). Invalid syntax surfaces as a 500 error from the API.
Scope: Field mappings apply at two levels:
- Template-level (
TemplateDefinition.fieldMappings): transforms the raw caller payload before rendering begins - TemplateRef-level (
TemplateRefNode.fieldMappings): transforms the already dataPath-scoped data before it flows into the referenced sub-template’s children. The referenced template’s own top-level fieldMappings are NOT applied — only the referencing node’s fieldMappings.
4. Validation Behaviour
Schema validation runs after field mapping, against the adapted data shape. inputSchema describes the adapted shape, not the raw payload.
- Validator: Ajv v8 with
allErrors: trueandajv-formats(adds"date","email", etc.) - On failure: HTTP 422 with body
{ error: 'Validation Failed', issues: ValidationError[] } - Each
ValidationErrorhas{ path: string, message: string }
The POST /render pipeline order:
Raw payload → Field Mapping → Schema Validation → HTML Rendering → PDF Rendering
Validation errors are returned before any rendering occurs.
5. Formatting Helpers
Handlebars helpers are available inside any content string (text, heading, image src, table column headers/footers). All helpers are registered once and are idempotent.
{{currency value [currencyCode]}}
- Formats a number as a currency string using
Intl.NumberFormat - Locale:
en-US(affects digit grouping) currencyCode: ISO 4217 code, default'USD'- Minimum fraction digits: 2
- Example:
{{currency loan.amount "AUD"}}→A$50,000.00 - Returns original value as string if not a valid number
{{number value [decimals]}}
- Formats a number to a fixed number of decimal places
decimals: default2- Example:
{{number loan.rate 2}}→5.75 - Returns original value as string if not a valid number
{{date value [format]}}
- Formats a date string or Date object using
Intl.DateTimeFormat - Locale:
en-AU formatoptions:'short'|'medium'(default) |'long'|'full'short→ dd/mm/yyyy stylemedium→ e.g.14 Mar 2026long→ e.g.14 March 2026full→ e.g.Saturday, 14 March 2026
- Returns empty string if value is null or empty; returns raw string if invalid date
{{uppercase value}} / {{lowercase value}}
String case conversion. Value coerced to string before conversion.
{{sum array "field"}}
Sums a named numeric field across an array of objects. Designed for table footer expressions where a grand total needs to be computed inline.
array— resolves from the current data context (e.g.loan.items)"field"— a simple (non-nested) field name on each item (e.g."amount")- Numeric values are summed directly.
- Numeric strings (e.g.
"12.5") are coerced viaparseFloatand included. - Non-numeric strings,
null,undefined, and missing fields are ignored (treated as 0). - Non-array first argument or empty array returns
0.
Composable with other helpers via Handlebars subexpressions:
{{sum loan.items "amount"}}
{{currency (sum loan.items "amount") "AUD"}}
Table cell formatting
Table column definitions may specify format (helper name) and formatArgs (array of additional arguments). These are applied to cell values in the renderer, not via Handlebars template strings:
{ "key": "amount", "format": "currency", "formatArgs": ["AUD"] }
6. Conditional and Repeater Behaviour
Conditional (ConditionalNode)
Evaluates a structured condition against the current data context at render time.
interface StructuredCondition {
path: string // dot-path into adapted data
operator: 'eq' | 'ne' | 'gt' | 'gte' | 'lt' | 'lte' | 'exists' | 'notExists'
value?: unknown // not required for exists/notExists
}
Operator semantics:
| Operator | Behaviour |
|---|---|
eq | strict equality (===) |
ne | strict inequality (!==) |
gt / gte / lt / lte | numeric comparison only; returns false if either operand is not a number |
exists | resolved value is not undefined and not null |
notExists | resolved value is undefined or null |
Rendering:
- Condition true → renders
children - Condition false → renders
elseChildren(if present), otherwise returns empty string (no wrapper, no placeholder) - Output is wrapped in a
<div>with inline style only if the node hasstyleorpaginationHints; otherwise children are emitted unwrapped
conditionExpression (optional escape hatch): When conditionExpression is a non-empty, non-whitespace string, it is evaluated as a jexl expression against the current data context and takes precedence over condition. Missing identifiers evaluate to undefined (falsy). Invalid syntax throws ExpressionError which surfaces as a 500 from the API. When absent or blank, condition is evaluated as before.
Repeater (RepeaterNode)
Resolves an array from adapted data and renders children once per item.
interface RepeaterNode extends BaseNode {
dataPath: string // dot-path to the array
children: AnyContentNode[]
emptyChildren?: AnyContentNode[]
}
Behaviour:
dataPathresolves to an array → renderschildrenN times with per-item scoped datadataPathresolves toundefined(key absent) → returns empty string silentlydataPathresolves to a non-array, non-null value → returns empty string and emitsconsole.warn- Array is empty → renders
emptyChildrenif present, otherwise returns empty string
Scoped data per iteration (merges in this order, later entries win):
- Parent adapted data
- Item properties (if item is a plain object; scalar items do not spread)
- Synthetic variables:
@index(0-based integer),@first(boolean),@last(boolean)
7. Table Behaviour
Tables are first-class nodes rendered directly to <table> / <thead> / <tbody> / <tfoot>.
interface TableNode extends BaseNode {
dataPath: string
columns: TableColumnDefinition[]
repeatHeader?: boolean // default true — thead { display: table-header-group } for print
showFooter?: boolean // default false
caption?: string
emptyMessage?: string // default 'No items.'
tableStyle?: NodeStyle
headerRowStyle?: NodeStyle
bodyRowStyle?: NodeStyle
alternateRowStyle?: NodeStyle // applied to every odd-indexed row (1, 3, 5…)
footerRowStyle?: NodeStyle
}
interface TableColumnDefinition {
key: string // dot-path within each row object
header: string // Handlebars string
footer?: string // Handlebars string, compiled against full adapted data (not row)
width?: string
align?: 'left' | 'center' | 'right'
format?: string // helper name: 'currency', 'date', 'number'
formatArgs?: unknown[] // additional args passed to the format helper
style?: NodeStyle // applied to <td> cells
headerStyle?: NodeStyle // applied to <th> cells (merged with headerRowStyle)
sparkline?: { // when set, renders inline SVG instead of text
sparkType: 'line' | 'bar'
valuesPath: string // dot-path within each row to a number array
color?: string // hex (#abc), rgb(), or hsl() — default: palette colour #0
width?: number // SVG width in pixels (20–400, default 80)
height?: number // SVG height in pixels (10–100, default 20)
}
}
Body rows:
- One
<tr>per item in the resolved array alternateRowStyleapplied to rows at index 1, 3, 5… (0-based odd)- Cell value resolved via
col.keydot-path within the row object, then formatted ifcol.formatis set null/undefinedcell values render as empty string
Empty state:
- If the array is empty, a single
<tr><td colspan="N">row is emitted withemptyMessage - Default
emptyMessage:'No items.'
Header:
- Emitted as
<thead>whenrepeatHeader !== false thead { display: table-header-group }is included in the@media printCSS block, causing the header to repeat on every printed page
Footer:
- Emitted as
<tfoot>only whenshowFooter: true - Every footer cell emits
col.alignandcol.styleregardless of whethercol.footertext is set (no bare unstyled<td>) col.footeris a Handlebars string compiled against the full adapted data context, not the row — use this for aggregated values computed upstream
8. Chart Behaviour
Chart nodes render data arrays as inline SVG. No external charting library or JavaScript is required — the SVG is generated server-side and embedded directly in the HTML output.
interface ChartNode extends BaseNode {
type: 'chart'
chartType: 'bar' | 'line' | 'pie' | 'area' | 'donut' | 'horizontalBar' | 'stackedBar' | 'groupedBar' | 'scatter' | 'gauge' | 'funnel' | 'waterfall' | 'treemap' | 'heatmap' | 'combo'
// Data binding
dataPath: string // dot/bracket path to an array in adapted data, e.g. "sales" or "report.items"
categoryField: string // field name on each item to use as category label (X-axis or pie slice label)
valueField: string // field name on each item to use as the numeric value
// Display
title?: string // plain-text title rendered above the chart
showLegend?: boolean // show a legend; applied to pie/donut charts (stored on all types for forward compatibility)
emptyMessage?: string // shown when the data array is empty or all values are non-numeric. Default: "No chart data."
// Sizing
width?: string // CSS width of the chart wrapper (default: "100%")
height?: string // CSS height of the chart wrapper (default: "300px")
// Extended chart type fields
xField?: string // scatter: field name for X-axis numeric value
seriesFields?: string[] // stackedBar/groupedBar/combo: ordered bar-series field names
seriesLabels?: string[] // stackedBar/groupedBar/combo: human-readable bar-series labels
innerRadiusRatio?: number // donut: inner radius as fraction of outer (0–1, default 0.55)
gaugeMin?: number // gauge: minimum scale value (default 0)
gaugeMax?: number // gauge: maximum scale value (default 100)
// Data labels (bar / horizontalBar / stackedBar / groupedBar)
showDataLabels?: boolean
dataLabelPosition?: 'outside' | 'inside' | 'center' | 'auto' // default 'auto'
// Combo chart (bar + line overlay)
lineSeriesFields?: string[] // combo: line-series field names
lineSeriesLabels?: string[] // combo: line-series labels
secondaryAxis?: { // combo: when set, line series use a right-side y-axis
label?: string
min?: number
max?: number
}
}
Chart types
| Type | Description |
|---|---|
bar | Vertical bar chart. Bars adapt width to the number of categories. Category labels appear below each bar. |
line | Line chart with data-point circles. Category labels appear below each point. |
pie | Pie chart. When showLegend: true the pie is offset left and a legend appears on the right showing category, colour swatch, and percentage. Percentage labels are rendered inside slices wider than ~23°. |
area | Area chart. Filled polygon below the line with 20% opacity, plus solid stroke on top. |
donut | Annular (ring) variant of pie. innerRadiusRatio controls hole size (default 0.55). |
horizontalBar | Horizontal bar chart. Category labels appear to the left of each bar. |
stackedBar | Stacked vertical bar chart. Multiple seriesFields are stacked per category. Legend shown when >1 series. |
groupedBar | Grouped vertical bar chart. Multiple seriesFields are placed side-by-side per category. Legend shown when >1 series. |
scatter | Scatter plot. Uses xField for X-axis and valueField for Y-axis. Optional categoryField for point labels. |
gauge | Half-circle gauge. Reads a single value (first array element or single object). gaugeMin/gaugeMax set the scale. |
funnel | Funnel chart. Stages stacked top-to-bottom with width proportional to valueField; category labels alongside each stage. |
waterfall | Waterfall chart. Successive positive/negative deltas on valueField; running total tracked across categories with colour-coded bars. |
treemap | Treemap. Rectangular areas sized by valueField and labelled from categoryField; squarified layout inside the viewBox. |
heatmap | 2-D heatmap. Uses xField + categoryField as row/column dimensions and valueField as cell intensity on a sequential colour ramp. |
combo | Grouped bars (seriesFields) with one or more line overlays (lineSeriesFields). Optional secondaryAxis draws a right-side y-scale so bar/line series with mismatched magnitudes (e.g. volume + price) share one panel cleanly. Legend distinguishes bar (rect swatch) from line (stroke + dot) for monochrome print legibility. PDF target in v1 — docx/pptx renderers fall back to a bar-family rendering and drop the line overlay. |
All chart types use a fixed 480×300 internal SVG viewBox (gauge uses 480×260) and a 6-colour Tableau-inspired palette.
Data labels (bar family)
bar, horizontalBar, stackedBar, groupedBar, and combo honour showDataLabels. With the default dataLabelPosition: 'auto', single-value bars place labels outside the bar end; stacked segments centre labels inside. Stacked segments shorter than ~12 px fall back to a right-side callout so no value is silently dropped; small-segment behaviour can be overridden by setting dataLabelPosition explicitly.
Data resolution
dataPathis resolved against adapted data using the same dot/bracket resolver astableandrepeater(supportsitems[0].subsyntax).- For each item in the resolved array,
categoryFieldandvalueFieldare resolved. - Items where
valueFielddoes not resolve to a number (or a numeric string) are silently skipped. - If no usable data points remain,
emptyMessageis rendered instead of the SVG.
HTML output
<div class="chart-wrapper" style="width:100%;height:300px">
<p class="chart-title">Sales by Region</p> <!-- only when title is set -->
<svg viewBox="0 0 480 300" ...>...</svg>
</div>
NodeStyle and paginationHints on the chart node are merged into the wrapper style attribute alongside the width/height defaults.
9. RichText Node Behaviour
richText nodes store structured content as a JSON AST — no raw HTML is stored or emitted from arbitrary sources.
AST types
| Type | Description |
|---|---|
RichTextNode | Root content node. Has content: RichTextBlock[]. |
ParagraphBlock | { type: 'paragraph', children: RichTextInline[] } |
OrderedListBlock | { type: 'orderedList', items: ListItemBlock[] } |
UnorderedListBlock | { type: 'unorderedList', items: ListItemBlock[] } |
ListItemBlock | { type: 'listItem', children: RichTextInline[], sublist?: OrderedListBlock | UnorderedListBlock } |
TextLeaf | { type: 'text', text: string, bold?: true, italic?: true, underline?: true, color?: string } |
LinkInline | { type: 'link', href: string, children: TextLeaf[] } |
LineBreakInline | { type: 'lineBreak' } |
Nested lists
Exactly one level of nesting is supported via ListItemBlock.sublist. Deeper nesting is not preserved in the AST — the editor caps nesting at one level on input.
Text colour
TextLeaf.color accepts CSS colour strings in these formats only:
| Format | Examples |
|---|---|
| 3-digit hex | #rgb |
| 6-digit hex | #rrggbb |
| RGB | rgb(0, 128, 255) |
| RGBA | rgba(0, 128, 255, 0.5) |
| HSL | hsl(200, 100%, 50%) |
| HSLA | hsla(200, 100%, 50%, 0.8) |
Named colours (e.g. red, blue) are not supported. Invalid values are silently ignored by the renderer and flagged as an error by the editor validator.
HTML rendering
The renderer emits only whitelisted HTML tags: <div>, <p>, <ol>, <ul>, <li>, <strong>, <em>, <u>, <a>, <span>, <br />. No arbitrary HTML is passed through.
- Handlebars expressions in
TextLeaf.textuse{{double-stash}}— all resolved values are HTML-escaped - Link
hrefvalues are validated against a protocol allowlist (http,https,mailto,tel) before rendering; unsafe hrefs are rendered as plain text - Color values are validated with
isSafeRichTextColor()before emitting<span style="color:...">
Current limitations
- One level of list nesting only (
sublistonListItemBlock) - No raw HTML storage or arbitrary HTML passthrough
- No markdown import or export
- Editor link popover warns on unsafe protocols (
javascript:,ftp:) and normalises bare domains tohttps://; the server renderer and validator enforce the protocol allowlist as a hard gate — unsafe hrefs are rendered as plain text - Handlebars expressions are preserved as plain text in Visual editing mode; use JSON mode for precise expression authoring
10. Pagination Behaviour
Pagination hints are compiled to both modern (break-*) and legacy (page-break-*) CSS properties for maximum Puppeteer/print compatibility.
interface PaginationHints {
breakBefore?: 'auto' | 'always' | 'avoid' | 'page' | 'left' | 'right'
breakAfter?: 'auto' | 'always' | 'avoid' | 'page' | 'left' | 'right'
breakInside?: 'auto' | 'avoid'
keepWithNext?: boolean // compiles to break-after: avoid
}
Mapping:
| Hint | CSS emitted |
|---|---|
breakBefore: "always" | break-before: always; page-break-before: always |
breakAfter: "avoid" | break-after: avoid; page-break-after: avoid |
breakInside: "avoid" | break-inside: avoid; page-break-inside: avoid |
keepWithNext: true | break-after: avoid; page-break-after: avoid |
Pagination hints are applied as inline styles on the node’s wrapper element. These are CSS soft hints — Puppeteer honours them when there is sufficient content flow room; hard layout constraints take precedence.
The <style> block in every rendered document includes:
@media print {
thead { display: table-header-group; }
tfoot { display: table-footer-group; }
tbody { display: table-row-group; }
}
11. Pivot Table Behaviour
The pivotTable node transforms a flat source array into a cross-tab (pivot) grid, grouping by row and column dimensions and aggregating numeric measures.
Node definition
interface PivotTableNode extends BaseNode {
type: 'pivotTable'
dataPath: string // path to flat source array
rowDimensions: PivotDimension[] // 1–4 dimensions; multi-level supported
columnDimension: PivotDimension // single field → column headers
measures: PivotMeasure[] // 1+ numeric measures to aggregate
showRowGrandTotal?: boolean // default: true
showColumnGrandTotal?: boolean // default: true
showRowSubtotals?: boolean // default: false (no-op for single-dim pivots)
subtotalLabel?: string // default: "{value} Total" (use {value} placeholder)
emptyCell?: string // default: '' (displayed when intersection has no data)
grandTotalLabel?: string // default: 'Grand Total'
caption?: string
emptyMessage?: string
tableStyle?: NodeStyle
headerRowStyle?: NodeStyle
bodyRowStyle?: NodeStyle
alternateRowStyle?: NodeStyle
totalRowStyle?: NodeStyle
}
interface PivotDimension {
field: string // dot-path within each row
label?: string // display label (defaults to field name)
sort?: 'asc' | 'desc' | 'none' // default: 'asc'
}
interface PivotMeasure {
field: string // dot-path (numeric)
label: string // sub-column header
aggregation?: 'sum' | 'count' | 'avg' | 'min' | 'max' // default: 'sum'
format?: string // Handlebars helper name
formatArgs?: unknown[]
}
Data transform
The pivot transform is a pure function in @pulp-engine/data-adapter. Algorithm:
- Single pass over the source array: extract composite row key, column key, and measure values per item.
- Accumulate into
Map<rowKey, Map<colKey, Accumulator[]>>— each accumulator tracks running sum, count, min, max (all five aggregation types derivable from these four). - Sort row and column keys (ascending by default, numeric sort when all values parse as numbers).
- Build output — for each (row, col) pair, resolve the configured aggregation. Missing intersections yield
null.
HTML output
When measures.length > 1, a two-row <thead> is emitted:
- Row 1: dimension labels (with
rowspan="2") + column key headers (withcolspan=measures.length) - Row 2: measure labels repeated per column group
When measures.length === 1, a single header row is emitted (column keys only).
Grand total row renders in <tfoot>. Cell formatting reuses the same Handlebars helper pipeline as the table node (currency, number, date, etc.).
Multi-level row dimensions and subtotals (v0.71.0+)
The pivot transform supports 1–4 row dimensions (Zod-bounded). When showRowSubtotals: true AND rowDimensions.length > 1, the transform emits one subtotal row per closed group at every depth except the leaf.
The algorithm uses per-depth accumulator slots that ancestor leaves merge into. As the sorted leaf-row scan crosses a group boundary, every depth from firstDiffer to dimCount-2 gets flushed (deepest first) and reset for the next sibling group. Subtotal rows carry isSubtotal: true, depth: d (the closed dimension index), and dimensionValues truncated to the closed prefix. When showColumnGrandTotal is on, subtotal rows also include their own rowTotal cells so the trailing grand-total column stays rectangular.
Aggregation correctness: subtotals derive from the same Accumulator struct as leaves (sum, count, min, max). The avg aggregation is computed at every level from the merged sum/count, NOT as average-of-averages — so a subtotal “Avg / Deal” reflects the true mean of the underlying values, not the mean of the children’s means.
Subtotal label: the subtotalLabel field templates the label cell, with {value} substituting the deepest closed dimension value. Default is "{value} Total". Empty string renders just the value; a literal without {value} renders verbatim.
Cap rationale: the 4-dimension limit bounds header rowspan complexity in HTML/DOCX and mergeCells count in XLSX. Lifting it later requires re-validating those header paths, not just the schema.
Rendering targets
| Target | Output |
|---|---|
| HTML/PDF | <table> with <thead>, <tbody>, <tfoot>. Subtotal rows carry class pivot-subtotal pivot-subtotal-d{depth} for CSS hooks; default inline shading applied for PDF (no external CSS). |
| DOCX | docx.Table with header merging via columnSpan. Themed headerRowStyle / bodyRowStyle / alternateRowStyle / totalRowStyle honored end-to-end (translated to docx cell shading + run properties). Subtotals inherit the total row style as a base over a lighter SUBTOTAL_BG default. |
| PPTX | Native pptxgenjs slide.addTable call. Themed row styles honored (translated to pptxgenjs cell options). Content-aware column widths (mimics CSS table-layout: auto) so wide grand-total values like $4,571,700.00 aren’t crowded into per-quarter-sized slices. |
| CSV | Flattened grid matching HTML layout (RFC 4180). Subtotal label written in the depth-th column, blank cells for trailing dimensions. |
| XLSX | ExcelJS worksheet, numeric cells as numbers, header merging. Subtotal rows use mergeCells for the trailing label span, bold + SUBTOTAL_FILL shading. |
12. Features, Limitations, and Non-Goals
Features added beyond the original MVP scope
The following were originally considered deferred but have since shipped:
| Feature | Notes |
|---|---|
Drag-and-drop template editor (apps/editor) | Implemented and in internal pilot — see docs/editor-guide.md |
| Authentication / authorisation | Scoped API key model (API_KEY_ADMIN, API_KEY_RENDER, API_KEY_PREVIEW, API_KEY_EDITOR), editor session tokens (HMAC-signed), named-user mode (EDITOR_USERS_JSON). See docs/api-guide.md. |
| Template version pinning | Optional version field in POST /render body. See docs/api-guide.md. |
| PDF streaming | Both POST /render and POST /render/preview/pdf stream via Puppeteer createPDFStream() — no Node.js PDF buffer. Chrome render-time memory still applies; in RENDER_MODE=container / socket it applies inside the ephemeral worker, insulated from the API process. |
| Image asset management | Upload, list, and delete image files via POST /assets/upload, GET /assets, DELETE /assets/:id. Files stored in ASSETS_DIR and served at /assets/*. Image src strings can reference assets by path (e.g. /assets/my-logo.png). |
| Chart / graph nodes | 15 chart types (bar, line, pie, area, donut, horizontalBar, stackedBar, groupedBar, scatter, gauge, funnel, waterfall, treemap, heatmap, combo) rendered as inline SVG. Optional showDataLabels on the bar family; combo supports a dual-axis line overlay. Sparkline formatter on table columns. See Chart Behaviour. |
| Barcode / QR node | 6 formats (QR, Code 128, Code 39, EAN-13, UPC, Data Matrix) via @pulp-engine/barcode-renderer. Handlebars-capable value. Rendered as inline SVG in all output formats. |
| Template CRUD | POST /templates (create), PUT /templates/:key (update), DELETE /templates/:key (soft delete), POST /templates/:key/versions/:version/restore (restore historical version). |
dataPath bracket indexing | items[0].sub syntax supported in repeater, table, and chart dataPath fields. |
richText node with structured content | Paragraph, ordered/unordered list (one-level nested), bold, italic, underline, text colour, links (safe protocols), line breaks. Safe HTML renderer with whitelisted tags. Visual and JSON editing in apps/editor. See RichText Node Behaviour. |
Template composition (templateRef) | Inline rendering of referenced templates with dataPath scoping and fieldMappings for data transformation. Max nesting depth: 5. See Template Composition. |
Table of contents (toc) | Auto-generated anchor-linked TOC from document headings. See TOC Node Behaviour. |
Absolute positioning (positioned) | Container with physical-unit CSS positioning for form overlays. CSS unit whitelist enforced. See Positioned Node Behaviour. |
Cross-tab / pivot table (pivotTable) | Pre-render data transform: flat array → 2D grid grouped by row/column dimensions with aggregated measures (sum, count, avg, min, max). All render targets (HTML, DOCX, CSV, XLSX). See Pivot Table Behaviour. |
Known limitations
See the consolidated Current limitations table in mvp-scope.md — that is the canonical list.
Known constraints
- Puppeteer startup:
first PDF request per process takes ~2–3 s (cold browser)Resolved — the browser is now warmed at startup viawarmBrowser()(in-process mode) or the child-process dispatcher’swarmup()IPC (child-process mode). The ~2–3 s Chromium launch cost is paid during server boot, not on the first request. Subsequent requests continue to reuse the singleton browser instance. Container/socket modes are ephemeral and unaffected. nulltreated as missing byexists/notExists: intentional — document templates rarely distinguishnullfrom absent.- Numeric comparison operators (
gt,gte,lt,lte): returnfalse(not an error) if either operand is not a number. - Alternating row style: applies to rows at odd indices (1, 3, 5…), not to even indices. The first data row (index 0) always uses
bodyRowStyle. dataPathbracket indexing:items[0].subsyntax is supported forrepeater,table, andchartnodes. Only non-negative integer indices are valid; negative indexes, floats, string/quoted keys, bare bracket-first paths ([0].a), and unclosed brackets throw aDataPathErrorat render time. Missing properties or out-of-range indexes returnundefinedwithout throwing.
13. Template Composition
The templateRef node embeds a referenced template’s document sections inline at render time.
interface TemplateRefNode extends BaseNode {
type: 'templateRef'
templateKey: string // key of the template to resolve
dataPath?: string // dot-path to scope parent data
version?: string // version pin (latest if omitted)
fieldMappings?: FieldMappingEntry[] // transform scoped data before sub-template rendering
dataSource?: TemplateRefDataSource // external data source (fetched server-side)
dataMergeStrategy?: 'replace' | 'shallow' | 'deep' // how fetched data merges with parent
}
Resolution: All template references are pre-resolved before the synchronous render walk via resolveAllRefs(). Max nesting depth is 5. Cycle detection prevents infinite recursion.
Data flow: parent data → dataPath scoping → data source fetch + merge → fieldMappings (DataAdapter.adapt) → sub-template children. The referenced template’s own top-level fieldMappings are not applied — only the referencing node’s mappings. The referenced template’s renderConfig (margins, headers, fonts) is also not applied — only its document sections are rendered inline.
Field mappings on templateRef enable parameterized composition: a master template can pass transformed data to a reusable sub-template (e.g. a “letterhead” fragment receiving recipientName from the parent’s customer.name).
Data Source Binding
When dataSource is set, the sub-report fetches its own data server-side during a pre-render resolution pass (after resolveAllRefs, before the synchronous render walk). Sibling data sources at the same depth are fetched in parallel.
interface TemplateRefDataSource {
type: string // 'static' | 'url' | plugin-registered
data?: Record<string, unknown> // type='static': inline data payload
url?: string // type='url': endpoint (supports {{handlebars}} interpolation)
headers?: Record<string, string> // type='url': HTTP headers (values support interpolation)
timeoutMs?: number // fetch timeout (default 10s, range 1–30s)
method?: 'GET' | 'POST' // HTTP method (default GET)
body?: string // POST body (supports interpolation)
config?: Record<string, unknown> // arbitrary config for plugin types
}
Merge strategies (dataMergeStrategy, default 'replace'):
replace— fetched data replaces parent context entirely (sub-report is independent)shallow—{ ...parentScoped, ...fetched }(inherit shared context like company info)deep— recursive merge, fetched wins at leaf level; arrays replaced wholesale
Error handling: Production renders fail with DataSourceError (HTTP 422, code data_source_error). Preview renders degrade gracefully — the sub-report falls back to parent-scoped data.
Security: URL data sources are validated against the same SSRF guard as webhook delivery (validateWebhookUrl). The UrlValidator is injected via dependency injection to maintain package boundaries.
Plugin extensibility: Plugins register custom data source types via ctx.registerDataSourceProvider(type, provider). Built-in types static and url cannot be overridden.
Caching: A per-render cache deduplicates identical URL fetches (keyed by post-interpolation URL + headers). Static data sources are not cached (clone is cheap).
14. TOC Node Behaviour
The toc node auto-generates a table of contents from all headings in the document.
interface TocNode extends BaseNode {
type: 'toc'
maxDepth?: 1 | 2 | 3 | 4 | 5 | 6 // default 3
ordered?: boolean // <ol> when true, <ul> when false (default)
title?: string // optional heading above the list
showPageNumbers?: boolean // default false — when true, reserves a right-aligned slot and post-processes the PDF to stamp real page numbers
leaderStyle?: 'dots' | 'dashes' | 'none' // default 'dots'; only meaningful when showPageNumbers is true
}
Heading collection: Before rendering begins, HtmlRenderer.buildBody() performs a lightweight AST pre-walk that collects { id, level, content } for every heading node in the document — including headings inside containers, columns, conditionals, repeaters, positioned nodes, and templateRefs. Heading content is interpolated via interpolatePlain() so Handlebars expressions resolve to plain text (no HTML entities in TOC link text).
Anchor links: All <h1>–<h6> tags in rendered output include id="heading-{nodeId}". The TOC emits <a href="#heading-{nodeId}"> links. In PDF output, Chromium resolves these as in-document jumps.
Filtering: Only headings with level <= maxDepth appear in the TOC. Indentation is applied via margin-left proportional to heading level.
Page numbers (PDF output): When showPageNumbers is true, the HTML renderer emits a wrap-safe leader (radial-gradient for dots, linear-gradient for dashes) followed by a reserved-width <span class="toc-page-slot"> per entry. The HTML/preview path leaves the slot blank — page numbers are a print-only artifact.
On the PDF path, TemplateRenderer walks the AST via hasTocWithPageNumbers and, when any reachable TOC node requests page numbers, sets stampTocPageNumbers: { destPrefix: 'heading-' } on the dispatch options. After Puppeteer produces the PDF, pdf-renderer invokes @pulp-engine/pdf-transform#stampTocPageNumbers which:
- Reads the PDF’s
/Catalog/Dests(Chromium emits one entry per HTMLid) to build a{destName → destination page}map. - Walks each page’s
/Annots, selects/Linkannotations whose destination name starts withheading-, and uses pdf-lib’sdrawTextto stamp the resolved page number right-aligned inside each annotation’s/Rect.
CSS target-counter() is not used — Chromium’s Puppeteer page.pdf() does not resolve it against printed pages. The destPrefix filter prevents non-TOC <a href="#heading-…"> anchors from being stamped inadvertently.
Known limitation — wrapped titles: When a TOC entry title wraps to a second line, the leader sits between the last line and the page number rather than continuing per-line. Acceptable for v1; short titles avoid this.
15. Positioned Node Behaviour
The positioned node renders an absolutely-positioned container for form overlays and pre-printed stationery use cases.
interface PositionedNode extends BaseNode {
type: 'positioned'
top: string // e.g. "10mm", "72pt", "1in", "50px"
left: string
width: string
height: string
children: AnyContentNode[]
}
CSS unit validation: Position and dimension values are validated against a whitelist regex: /^\d+(\.\d+)?(mm|pt|in|px|cm|em|rem|%)$/. This is enforced at both the Zod schema (API boundary) and the renderer (defense in depth). Invalid values fall back to 0px. This blocks CSS injection vectors (expression(), url(), calc(), semicolon injection).
Rendering: Emits <div style="position:absolute;top:...;left:...;width:...;height:...">children</div>. Children are rendered recursively.
Parent requirement: The positioned node must be placed inside a container with style.position: 'relative' set by the template author. The renderer does not auto-wrap the parent — this would require parent-context awareness, violating the stateless-per-node rendering model.