Pulp Engine — Self-Hosted Deployment Guide

This guide covers runtime requirements, environment variables, Docker deployment (recommended), database setup, build and start, reverse proxy, logging, backup, migration, and production risks. Pulp Engine is operator-managed: you provision the infrastructure, configure credentials, handle upgrades, and run the service in an environment you control.


Deployment Topologies

Choose a topology before configuring environment variables.

| Topology | STORAGE_MODE | ASSET_BINARY_STORE | Template store | Asset binary store | API instances |
| --- | --- | --- | --- | --- | --- |
| Single-instance, no database | file | filesystem | Filesystem (TEMPLATES_DIR) | Filesystem (ASSETS_DIR) | 1 only |
| Single-instance, database | postgres or sqlserver | filesystem | Database | Filesystem (ASSETS_DIR) | 1 recommended unless ASSETS_DIR is shared |
| Multi-instance, database + NFS | postgres or sqlserver | filesystem | Database | Shared network volume (ASSETS_DIR) | 2+ |
| Multi-instance, database + S3 | postgres or sqlserver | s3 | Database | S3-compatible bucket | 2+ (no shared volume required) |

STORAGE_MODE controls where template and asset metadata is stored. ASSET_BINARY_STORE controls where binary files (uploaded images) are stored. They are independent — any combination of storage mode and binary store is valid.

  • ASSET_BINARY_STORE=filesystem (default): Binary files are written to ASSETS_DIR on the local filesystem. In single-instance deployments this is simple and requires no extra infrastructure. In multi-instance deployments all API instances must mount the same ASSETS_DIR via a shared network volume (e.g. NFS). Without a shared volume, instance A cannot serve assets uploaded by instance B, causing 404 errors in rendered PDFs. See § 9 Known Production Risks.

  • ASSET_BINARY_STORE=s3: Binary files are written to an S3-compatible bucket. Eliminates the shared-volume requirement for multi-instance deployments. See § Object Storage for configuration.

File mode in production is appropriate for a single containerised instance. It is not suitable for horizontal scaling. See § 9 Known Production Risks for the file-mode risk entry.
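
As a rough illustration of the topology table, two of the rows might translate into .env fragments like the ones below. The variable names come from this guide; the paths, bucket name, region, and connection string are placeholder assumptions, and the second fragment omits the S3 credentials and API keys a real deployment also needs.

```shell
# Sketch only: use ONE fragment per deployment. All values are placeholders.

# Single-instance, no database (file mode)
STORAGE_MODE=file
TEMPLATES_DIR=/var/pulp-engine/templates
ASSET_BINARY_STORE=filesystem
ASSETS_DIR=/var/pulp-engine/assets

# Multi-instance, database + S3 (no shared volume required)
STORAGE_MODE=postgres
DATABASE_URL='postgresql://user:pass@host:5432/pulp-engine?schema=public'
ASSET_BINARY_STORE=s3
S3_BUCKET=my-pulp-engine-assets
S3_REGION=us-east-1
```

The two settings vary independently, so the database + NFS row differs from the second fragment only in keeping ASSET_BINARY_STORE=filesystem and pointing ASSETS_DIR at the shared mount.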

Quick-reference compose matrix

| Topology | Storage | Asset Binary | Render | Compose File |
| --- | --- | --- | --- | --- |
| Single instance, no DB | file | filesystem | child-process | compose.yaml |
| Single instance, Postgres | postgres | filesystem | child-process | compose.postgres.yaml |
| Multi-instance, Postgres + S3 | postgres | s3 | child-process | (custom: see sections below) |
| Privilege-separated render | file or postgres | filesystem or s3 | socket | compose.container.yaml |

The compose files are evaluator-ready starting points. For hardened production configuration, continue with the sections below.


1. Runtime Requirements

| Requirement | Version | Notes |
| --- | --- | --- |
| Node.js | 22–24 | node --version to confirm |
| pnpm | 10.32.1 | pnpm --version to confirm |
| PostgreSQL | 14+ (16 recommended) | Required only when STORAGE_MODE=postgres (default); must be reachable from the API process |
| SQL Server | 2019+ / Azure SQL | Required only when STORAGE_MODE=sqlserver; must be reachable from the API process |
| Chromium / Puppeteer | Bundled | Puppeteer downloads Chromium on pnpm install; ensure network access or pre-cache |
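
The version rows above can be checked with a small preflight script. The following is a minimal sketch, not part of Pulp Engine: the function names are hypothetical, and treating the listed pnpm version as an exact pin is an assumption (the table does not say whether other 10.x releases are acceptable).

```shell
# Sketch: returns 0 when a Node.js version string (e.g. "v22.11.0") has a
# major version inside the 22 to 24 range listed in the requirements table.
node_major_ok() {
  major=${1#v}          # drop the leading "v" printed by `node --version`
  major=${major%%.*}    # keep only the major component
  case "$major" in
    ''|*[!0-9]*) return 1 ;;   # empty or not a number
  esac
  [ "$major" -ge 22 ] && [ "$major" -le 24 ]
}

# Sketch: returns 0 when the pnpm version string equals the listed 10.32.1.
# Assumption: the listed version is treated as an exact pin.
pnpm_version_ok() {
  [ "$1" = "10.32.1" ]
}

# Usage (requires node and pnpm on PATH):
#   node_major_ok "$(node --version)" || echo "unsupported Node.js version" >&2
#   pnpm_version_ok "$(pnpm --version)" || echo "unexpected pnpm version" >&2
```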

Linux note: Puppeteer requires shared libraries that are not always present on minimal server images. Install the following if Chromium fails to launch:

# Debian / Ubuntu
apt-get install -y libatk1.0-0 libatk-bridge2.0-0 libcups2 \
  libdrm2 libxkbcommon0 libxcomposite1 libxdamage1 libxfixes3 \
  libxrandr2 libgbm1 libasound2

Chrome sandbox: The Chrome sandbox is enabled by default. Only --disable-dev-shm-usage and --disable-gpu are always passed to Puppeteer. The --no-sandbox and --disable-setuid-sandbox flags are added only when PULP_ENGINE_DISABLE_SANDBOX=true is set in the environment. Container deployments (Docker, Kubernetes without elevated capabilities) must set this variable — Chrome will fail to launch in a container without it.
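
The flag selection described above can be sketched as a shell function. This only mirrors the documented behaviour; it is not the engine's launch code, the function name is hypothetical, and it assumes that only the literal value true disables the sandbox.

```shell
# Sketch of the documented Chrome flag selection (not the engine's own code).
chrome_launch_flags() {
  flags="--disable-dev-shm-usage --disable-gpu"    # always passed
  # Sandbox-disabling flags are added only on explicit opt-in.
  if [ "${PULP_ENGINE_DISABLE_SANDBOX:-}" = "true" ]; then
    flags="$flags --no-sandbox --disable-setuid-sandbox"
  fi
  printf '%s\n' "$flags"
}
```

Running `PULP_ENGINE_DISABLE_SANDBOX=true chrome_launch_flags` prints all four flags; with the variable unset only the first two appear.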


2. Environment Variables

Copy .env.example to .env and set each variable before starting the process.
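
When filling in credentials, a helper along these lines can produce random values. It is a sketch: this guide suggests openssl rand -hex 32 for METRICS_TOKEN, and reusing the same approach for API keys is an assumption. The function name and the /dev/urandom fallback are illustrative.

```shell
# Sketch: print a 64-character hex secret (32 random bytes).
new_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback when openssl is not installed: read 32 bytes from the kernel RNG.
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
  fi
}

# Usage, from the repository root:
#   cp .env.example .env
#   new_secret    # paste the output as the value of API_KEY_ADMIN in .env
```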

Running tests locally? See ../apps/api/README.md for the dev-loop setup (repo-root .env, throwaway Postgres, db:deploy, run the suite).

Table columns. Default is what the runtime uses when the variable is absent (or explicitly unset). Example value shows a non-default setting. Required calls out the modes in which omission causes startup to fail.

| Variable | Required | Default | Example value | Description |
| --- | --- | --- | --- | --- |
| STORAGE_MODE | No | postgres | postgres | Storage backend: postgres (default), sqlserver, or file. Plugin-provided backends are also accepted; the plugin system validates them at activation time. Selects which adapter is loaded at startup. |
| DATABASE_URL | Yes (postgres mode) | | postgresql://user:pass@host:5432/pulp-engine?schema=public | Prisma connection string. Required when STORAGE_MODE=postgres; not read otherwise. |
| SQL_SERVER_URL | Yes (sqlserver mode) | | mssql://user:pass@host:1433/pulp-engine?encrypt=false&trustServerCertificate=true | mssql connection URL. Required when STORAGE_MODE=sqlserver; not read otherwise. |
| BATCH_ASYNC_DURABILITY | No | required (hardened) / warn (non-hardened) | warn | Tri-state gate: required / warn / allow-nondurable. When STORAGE_MODE=sqlserver or STORAGE_MODE=file (v0.84.0+), controls whether the API refuses boot, logs a warning, or boots quietly given that async-batch state is in-memory only on those backends. Default mirrors REQUIRE_HTTPS: hardened production sets the default to required (refuses boot) and operators must explicitly downgrade. See § Operational Limitations vs Postgres. |
| TEMPLATES_DIR | Yes (file mode) | | ./templates | Path to a directory of TemplateDefinition JSON files. Required when STORAGE_MODE=file; not read in database modes. |
| HOST | No | 0.0.0.0 | 0.0.0.0 | Bind address. Defaults to 0.0.0.0. Use 127.0.0.1 if behind a reverse proxy on the same host. |
| PORT | No | 3000 | 3000 | Listener port. Defaults to 3000. |
| NODE_ENV | No | development (Docker image: production) | production | Accepted values: development, production, test. Application default is development; the Docker image sets NODE_ENV=production in its final stage. In production mode, JSON log output is enabled and hardening enforcement is active by default (see HARDEN_PRODUCTION). |
| API_KEY_ADMIN | Yes (production, recommended) | | a-long-random-string | Admin-scoped credential with full access: templates, assets, render, preview. At least one of API_KEY_ADMIN, API_KEY_RENDER, or the deprecated API_KEY must be set in production, unless using named-user-only mode (EDITOR_USERS_JSON without any shared API keys), which is a valid standalone configuration. |
| API_KEY_RENDER | No | unset | another-long-random-string | Render-scoped credential covering all production render routes: POST /render/pdf (canonical; POST /render is a deprecated alias), /render/html, /render/csv, /render/xlsx, /render/docx, /render/pptx, /render/batch, /render/batch/docx, /render/batch/pptx, and /render/pdf/*. Use for production integrations. |
| API_KEY_PREVIEW | No | unset | yet-another-random-string | Preview-scoped credential: POST /render/preview/* only. |
| API_KEY_EDITOR | No | unset | a-different-random-string | Editor-scoped credential: template management, asset management, POST /render/preview/*, and POST /render/validate. Operators enter this value in the editor’s interactive login form (v0.15.0: no VITE_API_KEY or frontend env var required). Cannot call any production render route or admin-only operations (delete template, restore version, promote label, admin routes). |
| API_KEY | Deprecated | | | Legacy single key (treated as admin scope). Accepted for migration; cannot be set alongside the new scoped keys. See § Migration below. |
| EDITOR_TOKEN_TTL_MINUTES | No | 480 | 480 | Lifetime of a minted editor session token in minutes. Default: 480 (8 hours). Accepted range: 5–1440. Shorter values reduce the exposure window if a token is compromised. Requires an API restart to take effect. |
| EDITOR_TOKEN_ISSUED_AFTER | No | unset | 2026-03-24T12:00:00Z | Reject any editor session token whose issued-at timestamp is before this UTC datetime. Accepts a UTC ISO-8601 string. Use to invalidate all outstanding editor sessions without rotating API_KEY_EDITOR. Requires an API restart to take effect. In multi-instance deployments server clocks must be reasonably synchronised (NTP). Pre-v0.19.0 tokens are always rejected when this is set. |
| API_KEY_EDITOR_PREVIOUS | No | unset | | Verify-only previous editor key for near-zero-downtime session rotation. Set to the old API_KEY_EDITOR value while setting the new value as the active key. Outstanding editor session tokens signed with the old key continue to verify; only the new key mints. Does not preserve direct X-Api-Key usage of the old key; callers using the old key directly must switch to the new key during the rollout. Cannot be used as X-Api-Key or to mint tokens. Must not equal any active key. Remove after EDITOR_TOKEN_TTL_MINUTES elapses and restart again. See runbook.md § Auth secret rotation. Requires an API restart. |
| API_KEY_ADMIN_PREVIOUS | No | unset | | Verify-only previous admin key. Same semantics as API_KEY_EDITOR_PREVIOUS but for editor session tokens signed with API_KEY_ADMIN. Does not preserve direct X-Api-Key usage of the old admin key. See runbook.md § Auth secret rotation. Requires an API restart. |
| EDITOR_USERS_JSON | No | unset (shared-key mode) | [{"id":"alice",...}] | Per-user named credential registry (v0.23.0+). When set, the editor login gate operates in named-user mode: each operator has a personal key, server-derived identity, and an optional role. See § Named-User Mode for the full format and migration runbook. When absent, shared-key mode is active (existing behavior). Requires an API restart when changed. |
| ASSET_BINARY_STORE | No | filesystem | filesystem | Where uploaded asset binaries (images) are stored. filesystem (default) writes to ASSETS_DIR; s3 stores in an S3-compatible bucket. Independent of STORAGE_MODE. See § Object Storage. |
| ASSET_ACCESS_MODE | No | public | public | Controls how asset binaries are delivered. public (default) serves assets without auth, via static file serving (filesystem) or public S3 URLs. private routes all asset delivery through an authenticated API proxy. See § Asset Access Mode. |
| ASSETS_DIR | No | ./assets | ./assets | Filesystem path where uploaded image assets are stored. Used only when ASSET_BINARY_STORE=filesystem. The directory is created automatically on startup if it does not exist. Use an absolute path in production (e.g. /var/pulp-engine/assets). |
| ASSETS_BASE_URL | No | /assets | /assets | URL prefix under which asset files are served (e.g. /assets/uuid-logo.png). Used only when ASSET_BINARY_STORE=filesystem. Must match the path configured in your reverse proxy if assets are proxied separately. |
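
The credential rules in the table above lend themselves to a preflight check before starting the process. The function below is a hedged sketch, not the server's startup validation: its name is hypothetical, it assumes "scoped keys" means API_KEY_ADMIN, API_KEY_RENDER, API_KEY_PREVIEW, and API_KEY_EDITOR, and it treats EDITOR_USERS_JSON as the only named-user registry.

```shell
# Sketch of the documented credential rules; the server's own startup
# validation is authoritative and may check more than this.
check_auth_env() {
  # The deprecated API_KEY cannot be combined with any scoped key.
  scoped="${API_KEY_ADMIN:-}${API_KEY_RENDER:-}${API_KEY_PREVIEW:-}${API_KEY_EDITOR:-}"
  if [ -n "${API_KEY:-}" ] && [ -n "$scoped" ]; then
    echo "API_KEY cannot be set alongside scoped API_KEY_* variables" >&2
    return 1
  fi
  # Production needs an admin/render/legacy key or a named-user registry.
  if [ -z "${API_KEY_ADMIN:-}${API_KEY_RENDER:-}${API_KEY:-}${EDITOR_USERS_JSON:-}" ]; then
    echo "set API_KEY_ADMIN, API_KEY_RENDER, or EDITOR_USERS_JSON" >&2
    return 1
  fi
  # EDITOR_TOKEN_TTL_MINUTES must be an integer in 5..1440 (default 480).
  ttl=${EDITOR_TOKEN_TTL_MINUTES:-480}
  case "$ttl" in
    ''|*[!0-9]*) echo "EDITOR_TOKEN_TTL_MINUTES must be an integer" >&2; return 1 ;;
  esac
  if [ "$ttl" -lt 5 ] || [ "$ttl" -gt 1440 ]; then
    echo "EDITOR_TOKEN_TTL_MINUTES must be between 5 and 1440" >&2
    return 1
  fi
  return 0
}
```

A wrapper script might load the .env file into the environment, call check_auth_env, and only then start the API, so misconfiguration surfaces before the server's own fail-fast checks.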

Asset upload validation: The API validates every upload against a server-side allowlist (PNG, JPEG, GIF, WebP) and cross-checks the declared Content-Type against the file’s magic bytes. A mismatch returns 415 Unsupported Media Type. SVG uploads are explicitly rejected — SVG files can contain JavaScript and external entity references. If SVG assets were stored before v0.27.0, the API logs a legacy_svg_detected warning at startup. Use GET /assets?legacySvg=true (admin credentials) to enumerate them and follow the remediation workflow in the runbook.
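
The magic-byte cross-check described above can be illustrated with standard shell tools. This is a sketch of the general technique using the well-known file signatures for the four allowed formats; it is not the API's implementation, and both function names are hypothetical.

```shell
# Sketch: map a file's leading bytes to one of the four allowed MIME types.
# Prints the detected type, or returns 1 when the signature is not recognised.
sniff_image_type() {
  # First 12 bytes as a lowercase hex string, e.g. "89504e470d0a1a0a...".
  sig=$(head -c 12 "$1" | od -An -tx1 | tr -d ' \n')
  case "$sig" in
    89504e470d0a1a0a*)           echo image/png ;;
    ffd8ff*)                     echo image/jpeg ;;
    474946383761*|474946383961*) echo image/gif ;;   # "GIF87a" or "GIF89a"
    52494646????????57454250)    echo image/webp ;;  # "RIFF" <size> "WEBP"
    *)                           return 1 ;;
  esac
}

# Returns 0 only when the declared Content-Type matches the sniffed type.
# In the flow described above, a mismatch maps to 415 Unsupported Media Type.
content_type_matches() {
  detected=$(sniff_image_type "$2") || return 1
  [ "$1" = "$detected" ]
}
```

For example, `content_type_matches image/png ./logo.png` succeeds only when logo.png really starts with the PNG signature. An SVG file fails the sniff outright, because it is plain XML text with no binary signature in the allowlist.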

| Variable | Required | Default | Example value | Description |
| --- | --- | --- | --- | --- |
| S3_BUCKET | Yes (s3 mode) | | my-pulp-engine-assets | S3 bucket name. Required when ASSET_BINARY_STORE=s3. |
| S3_REGION | Yes (s3 mode) | | us-east-1 | AWS region (or an arbitrary string for MinIO, which requires a non-empty value). Required when ASSET_BINARY_STORE=s3. |
| S3_ACCESS_KEY_ID | Yes (s3 mode) | | AKIA... | AWS access key ID. Required when ASSET_BINARY_STORE=s3. |
| S3_SECRET_ACCESS_KEY | Yes (s3 mode) | | | AWS secret access key. Required when ASSET_BINARY_STORE=s3. |
| S3_ENDPOINT | No | unset | https://minio.example.com | S3-compatible API endpoint for MinIO, Cloudflare R2, or other custom providers. Omit for standard AWS S3. When set, S3_PUBLIC_URL is required (the API endpoint and public delivery URL differ for custom providers). |
| S3_PATH_STYLE | No | false | true | Set to true to force path-style URLs (required for MinIO). Default: false. When true, S3_PUBLIC_URL is required. |
| S3_PUBLIC_URL | Yes (custom endpoint or path-style, public mode only) | auto-derived for AWS public mode | https://assets.example.com | Base URL for public asset delivery. Required when ASSET_ACCESS_MODE=public and either S3_ENDPOINT is set or S3_PATH_STYLE=true. Not required when ASSET_ACCESS_MODE=private. Optional for standard AWS S3 in public mode, where it is auto-derived as https://{bucket}.s3.{region}.amazonaws.com. Trailing slash is stripped automatically. |
| PULP_ENGINE_DISABLE_SANDBOX | No | unset (sandbox enabled) | true | Set to true in container environments (Docker, Kubernetes) where the Chrome sandbox is unavailable. Default: unset (sandbox enabled). See § 1 Runtime Requirements. |
| PREVIEW_ROUTES_ENABLED | No | false (absent → 404 in production) | true | Registers the editor preview routes (POST /render/preview/html, POST /render/preview/pdf) in production. Default: absent (routes return 404). Set to true only when the visual editor must reach the API directly in production for live visual previews. Ignored when NODE_ENV is not production. Pair with network-level restrictions when enabled. Not required for the publish flow: POST /render/validate (publish-readiness validation) is always registered, never returns rendered output, and is safe to keep enabled in production even when preview routes are disabled. |
| LOG_LEVEL | No | info | info | Pino log level. One of trace, debug, info, warn, error. Default: info. Use warn in high-throughput environments to reduce log volume. |
| CORS_ALLOWED_ORIGINS | No | allow all (with production warning) | https://editor.example.com | Comma-separated list of origins allowed to make cross-origin browser requests to this API. Origins must include the scheme and exact hostname (e.g. https://editor.example.com). Use * to allow all origins explicitly. When absent, all origins are allowed (origin: true) with a production startup warning. Same-origin deployments (editor and API on the same host) do not require this setting. |
| DOCS_ENABLED | No | true (Docker image: false) | false | Set to false to skip registering Swagger UI (/docs, /docs/json, /docs/yaml). All /docs* routes return 404. Application default: true (backward-compatible). Docker image default: false (the container image ships with docs disabled; set DOCS_ENABLED=true to re-enable). Disable in production to reduce the exposed API surface if the interactive docs are not needed by operators. |
| METRICS_TOKEN | No | unset (unauthenticated) | <random-hex> | When set, GET /metrics requires an Authorization: Bearer <token> header. When absent, the endpoint is unauthenticated (current behavior, preserved for backward compatibility). Generate with openssl rand -hex 32. Operators should restrict at the network layer regardless of whether a token is set. |
| TRUST_PROXY | No | false | true | Set to true when the API is behind a reverse proxy that sets X-Forwarded-Proto. Enables Fastify trustProxy so request.protocol accurately reflects the connection scheme. Required for REQUIRE_HTTPS to work correctly. Default: false. |
| REQUIRE_HTTPS | No | false | true | When true, POST /auth/editor-token rejects non-HTTPS requests with 400 Bad Request, preventing credentials from being sent over plain HTTP. Requires TRUST_PROXY=true and a TLS-terminating reverse proxy. Default: false. |
| HARDEN_PRODUCTION | No (auto-derived) | auto-derived from NODE_ENV | false | Controls fail-fast enforcement of security controls. Default: auto-derived from NODE_ENV (enforced when NODE_ENV=production, off otherwise). Set HARDEN_PRODUCTION=false to explicitly opt out for evaluation. Accepted values: true, false, 1, 0, or unset. See Hardened Production Mode. |
| BLOCK_REMOTE_RESOURCES | Yes (hardened mode) | false | true | When true, the render pipeline blocks all outbound http/https fetches during PDF generation except origins listed in ALLOWED_REMOTE_ORIGINS. Required in hardened production mode (v0.54.0+). |
| ALLOWED_REMOTE_ORIGINS | No | unset (no origins allowed) | https://fonts.googleapis.com,https://fonts.gstatic.com | Comma-separated URL origins permitted when BLOCK_REMOTE_RESOURCES=true. Each entry must be a valid URL origin (scheme + host + optional port). Has no effect when BLOCK_REMOTE_RESOURCES is not true. |
| ALLOW_SHARED_KEY_EDITOR | No | false | true | When true, explicitly accepts shared-key identity for editor login in hardened production, bypassing the EDITOR_USERS_JSON requirement. Use when configuring a user registry is not practical. Default: false. |
| ALLOW_NO_AUTH | No (dev/test only) | false | true | Local development / testing only. When authentication resolves to fully disabled (no API_KEY_* and no named-user registry), the server refuses to boot unless this is true. The check is NODE_ENV-independent, which closes the footgun where a bare node dist that forgets NODE_ENV=production would run unauthenticated. Never set in production; configure a credential (API_KEY_*) or a named-user registry (EDITOR_USERS_JSON/EDITOR_USERS_FILE/EDITOR_USERS_DB) instead. Accepted values: true, false, 1, 0, or unset. |
| RENDER_MODE | No | child-process | child-process | Controls where Puppeteer executes during PDF rendering. child-process (default): persistent child process with empty env, so no secrets are reachable. container: ephemeral Docker container per render; stronger isolation, but the API process holds the Docker socket. socket: render requests are routed through a separate controller process, so the API holds no Docker socket authority. in-process: Puppeteer runs inside the API process (legacy, debugging only). See § Render Isolation Mode. |
| RENDER_CONTAINER_IMAGE | Yes (container mode) | | ghcr.io/OWNER/pulp-engine-worker:vX.Y.Z | OCI image for the worker container. Required when RENDER_MODE=container. Build with docker build -f Dockerfile.worker -t <image> .. |
| PULP_ENGINE_PLUGINS | No | unset | pulp-engine-plugin-barcode | Comma-separated list of plugin package names or file paths. Plugins are loaded after core initialization and can extend renderers, storage backends, auth providers, routes, and events. Enabled plugins run as trusted server code inside the API process; see plugin-trust-model.md before enabling third-party plugins. See examples/plugin-barcode for a reference implementation. |
| RENDER_CONTROLLER_SOCKET | Yes (socket mode) | | /run/render/render.sock | Path to the Unix domain socket created by the render-controller process. Required when RENDER_MODE=socket. See § Render Isolation Mode and compose.container.yaml. |
| RATE_LIMIT_MAX | No | 100 | 100 | Global rate limit (requests per minute per IP). Default: 100. Applies to all routes except health, metrics, and those with per-route overrides. |
| RATE_LIMIT_RENDER_MAX | No | 20 | 20 | Rate limit for render and preview endpoints (requests per minute per IP). Default: 20. Applies to POST /render/pdf (and the deprecated /render alias), /render/html, /render/csv, /render/xlsx, /render/docx, /render/pptx, /render/validate, /render/pdf/*, and /render/preview/*. Batch routes use RATE_LIMIT_BATCH_MAX. |
| RATE_LIMIT_BATCH_MAX | No | 5 | 5 | Rate limit for POST /render/batch (requests per minute per IP). Default: 5. Separate from RATE_LIMIT_RENDER_MAX because a single batch request can trigger many renders. |
| BATCH_MAX_ITEMS | No | 50 | 50 | Maximum number of items per POST /render/batch request. Range: 1–200. Requests exceeding this limit receive 400 Bad Request. |
| BATCH_CONCURRENCY | No | 5 | 5 | Maximum concurrent renders within a single batch request. Range: 1–20. Should not exceed the Chrome page pool size (RENDER_MAX_CONCURRENT_PAGES, default 5; before v0.85.0 the compile-time constant MAX_CONCURRENT_PAGES = 5) to avoid starving concurrent non-batch renders. |
| RATE_LIMIT_STORE | No | memory | memory | Rate-limit counter backend. memory (default): in-process LRU, sufficient for single-instance deployments. redis: shared Redis counters, required for consistent per-IP enforcement across multiple API instances (in-process counters are per-replica, so the effective per-IP limit multiplies with the instance count). Requires REDIS_URL. A production startup warning is logged when memory is used with a database-backed (multi-instance-capable) storage mode. The bundled docker-compose.ha.yml sets this to redis. |
| REDIS_URL | Yes (redis mode) | | | Redis connection URL for cluster-aware rate limiting. Format: redis://[:password@]host:port[/db] or rediss:// for TLS. Required when RATE_LIMIT_STORE=redis. |
| RATE_LIMIT_FAIL_OPEN | No | false | false | Controls behavior when RATE_LIMIT_STORE=redis and Redis becomes unreachable at runtime. false (default): requests receive 500 errors, so the operator notices immediately. true: rate limiting degrades to unlimited with a warning. Does not affect startup (always fail-fast). |
| PREVIEW_BODY_LIMIT | No | 524288 | 524288 | Maximum request body size in bytes for preview endpoints (POST /render/preview/html, POST /render/preview/pdf). Default: 524288 (512 KiB). Increase if templates with large inline definitions exceed the limit. |
| APP_VERSION | No | package.json version | 0.68.0 | Overrides the version field reported by GET /health and GET /health/ready. Auto-set by the Docker build (ARG); defaults to package.json version when unset. Operators only need to set this for custom non-Docker builds. |
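
The S3_PUBLIC_URL rules in the table above are easier to follow as a decision procedure. The function below is a sketch under the stated rules only; it is not the engine's resolution code, and the function name is hypothetical.

```shell
# Sketch of the documented S3_PUBLIC_URL rules (not the engine's own code).
# Prints the base URL for public asset delivery, prints nothing in private
# mode, and returns 1 when the documented rules require an explicit URL.
resolve_s3_public_url() {
  # Private mode proxies assets through the API, so no public URL is needed.
  if [ "${ASSET_ACCESS_MODE:-public}" = "private" ]; then
    return 0
  fi
  # An explicit value wins; one trailing slash is stripped.
  if [ -n "${S3_PUBLIC_URL:-}" ]; then
    printf '%s\n' "${S3_PUBLIC_URL%/}"
    return 0
  fi
  # Custom endpoints and path-style addressing need an explicit public URL.
  if [ -n "${S3_ENDPOINT:-}" ] || [ "${S3_PATH_STYLE:-false}" = "true" ]; then
    echo "S3_PUBLIC_URL is required for custom endpoints or path-style URLs" >&2
    return 1
  fi
  # Standard AWS S3 in public mode: derive the URL from bucket and region.
  printf 'https://%s.s3.%s.amazonaws.com\n' "$S3_BUCKET" "$S3_REGION"
}
```

With S3_BUCKET=my-pulp-engine-assets and S3_REGION=us-east-1 and nothing else set, the function prints https://my-pulp-engine-assets.s3.us-east-1.amazonaws.com, matching the auto-derived form in the table.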

Opt-in capability flags

These environment variables enable features that are off by default. Each is independently gated.

VariableRequiredDefaultExample valueDescription
MULTI_TENANT_ENABLEDNofalsetrueTurns on per-tenant isolation of templates, assets, credentials, audit events, and schedules. Requires a database-backed storage mode (Postgres or SQL Server); rejected at startup for file mode. See § Multi-Tenant Mode.
API_KEYS_JSONYes (multi-tenant)[{"key":"...","scope":"admin","tenantId":"acme"}]Tenant-scoped credential registry used when MULTI_TENANT_ENABLED=true. Replaces the flat API_KEY_* variables in multi-tenant mode. A super-admin entry (tenantId: null) is required for /admin/tenants CRUD. Single-tenant deployments may still use this variable, but every entry must be tenantId: "default".
API_KEY_SUPER_ADMINNounset (rejected in single-tenant mode)<random-hex>Legacy single super-admin credential. Only valid when MULTI_TENANT_ENABLED=true; rejected at startup in single-tenant mode since v0.67.1. Prefer an API_KEYS_JSON entry with tenantId: null.
TENANT_STATUS_CACHE_TTL_MSNo1000010000Cache TTL for tenant archive/active state. Default 10 000 ms. Lower values reduce the window where an archived tenant can still be written to on non-handling pods.
SCHEDULE_ENABLEDNofalse (engine off — schedules never fire)trueStarts the cron/dispatcher execution engine. Schedule CRUD routes (/schedules/*) are available whenever the storage mode provides a schedule store (Postgres or SQL Server) — with SCHEDULE_ENABLED=false schedules can be created and edited but never fire (no 503; the create succeeds and the engine simply isn’t running). File mode has no schedule store, so its routes return 503 unavailable. Requires Postgres or SQL Server (SCHEDULE_ENABLED=true is rejected at startup for file mode).
OIDC_DISCOVERY_URLYes (OIDC)https://issuer.example.com/.well-known/openid-configurationOIDC discovery document URL. Presence of this variable enables OIDC — there is no separate OIDC_ENABLED flag. Requires OIDC_CLIENT_ID, OIDC_CLIENT_SECRET, and OIDC_COOKIE_SECRET. See oidc-guide.md.
OIDC_CLIENT_IDYes (OIDC)OAuth 2.0 client ID registered with the identity provider.
OIDC_CLIENT_SECRETYes (OIDC)OAuth 2.0 client secret.
OIDC_COOKIE_SECRETYes (OIDC)<openssl rand -hex 32>Secret (≥ 32 chars) used to sign OIDC session cookies.
OIDC_REDIRECT_URIYes (OIDC)https://api.example.com/auth/oidc/callbackOAuth redirect URI. Required whenever OIDC_DISCOVERY_URL is set (startup fails otherwise); must match the value registered with the provider. In hardened mode the URI must use https://.
OIDC_SCOPESNoopenid profile emailopenid profile emailSpace-separated OIDC scopes requested.
OIDC_CLAIM_SUB / OIDC_CLAIM_EMAIL / OIDC_CLAIM_DISPLAY_NAME / OIDC_CLAIM_GROUPSNosub / email / name / groupssub / email / name / groupsOverride the ID-token claim names read during provisioning.
OIDC_ADMIN_GROUPSNounset (no admin promotion)pulp-admin,pulp-ownersComma-separated groups whose members are promoted to admin role. When unset, no user is promoted to admin.
OIDC_EDITOR_GROUPSNo (must be set explicitly in hardened mode)* (any authenticated user)pulp-editorsGroups whose members receive editor scope (template create/update/delete + asset write). * (default) grants editor to every authenticated SSO user; an empty string grants editor only via OIDC_ADMIN_GROUPS; a comma-separated list restricts to those groups. In hardened mode the server refuses to start if OIDC is enabled and this is left unset — choose a value deliberately. Outside hardened mode, a startup warning is logged when the default * is in effect.
OIDC_AUTO_PROVISIONNotruetrueWhen true, unknown OIDC users are added to the registry on first login. Persistence depends on storage mode: postgres/sqlserver persist automatically to the DB-backed editor_users registry (EDITOR_USERS_DB=true); file mode requires EDITOR_USERS_FILE (or an EDITOR_USERS_JSON seed). Without persistence, auto-provisioned users are in-memory only and a startup warning is emitted.
OIDC_DEFAULT_TENANTNodefaultdefaultTenant assigned to auto-provisioned OIDC users in multi-tenant mode.
OIDC_PROVIDER_NAMENoSSOOktaDisplay name shown on the editor login button. Default SSO.
ANTHROPIC_API_KEYNounset (/templates/generate not registered)sk-ant-...Enables POST /templates/generate (AI template generation). When unset, the route is not registered. Also reflected in GET /capabilities.
ANTHROPIC_MODELNoclaude-opus-4-7claude-opus-4-7Claude model used for AI generation.
ANTHROPIC_MAX_TOKENSNo40964096Max output tokens per generation request.
EMBED_ALLOWED_ORIGINSYes (embed)https://app.acme.com https://staging.acme.comSpace-separated origins permitted to embed the editor. Sets frame-ancestors on /embed.html responses. Without this, the embeddable editor cannot load inside a customer iframe.
EMBED_CONNECT_ORIGINSNounsethttps://forms-proxy.acme.comSpace-separated extra origins the embedded editor may open fetch/WebSocket connections to (connect-src in the embed CSP).
EDITOR_USERS_FILENounset (in-memory mutations only)/etc/pulp-engine/users.jsonPath to a named-user registry file (file mode). Enables runtime POST /admin/users/PUT/DELETE/reload to mutate users across restarts. Without this, EDITOR_USERS_JSON is read once at startup and runtime mutations are in-memory only. In postgres/sqlserver mode the registry is DB-backed instead; EDITOR_USERS_JSON/FILE then only seed an empty editor_users table on first boot (the DB is authoritative thereafter), and POST /admin/users/reload reloads the cache from the DB.
EDITOR_USERS_DBNofalsetrue(v0.81.0, postgres/sqlserver only) Enables DB-backed named-user mode with no EDITOR_USERS_JSON/FILE seed (empty-table bootstrap — create the first user with an admin API key via POST /admin/users). Counts as named-user intent for identity mode and the hardening named-user-registry control. Rejected at startup under file/plugin storage.
EDITOR_USERS_CACHE_TTL_MSNo1000010000(v0.81.0) How long each instance may serve the in-memory editor-user cache before a full reload from the store. New users appear immediately (read-through on miss); role changes / tokenIssuedAfter revocations / deletes propagate to other replicas within this window. Range 1000–300000.
PULP_ENGINE_PLUGINSNounsetpulp-engine-plugin-barcodeComma-separated plugin package names or file paths. Plugins can extend renderers, storage backends, auth providers, routes, and events. Trust posture: plugins run as trusted server code inside the API process — see plugin-trust-model.md. See examples/plugin-barcode.
PPTX_ENABLEDNotruetrueEnables the /render/pptx and /render/batch/pptx routes. Default: true. Set to false to withhold PPTX from /capabilities and return 404 on the routes.
USAGE_QUERY_MAX_WINDOW_DAYSNo9292Maximum [from, to) window accepted by GET /usage. Default 92.
USAGE_EXPORT_MAX_ROWSNo500000500000Row-count cap for GET /usage.csv. Requests exceeding this return 413 usage_export_too_large. Default 500 000.
RENDER_USAGE_RETENTION_DAYSNounset (retain indefinitely)365How long to retain per-render usage rows. Unset = retain indefinitely.
RENDER_USAGE_PURGE_INTERVAL_HOURSNo2424How often the usage-retention purge runs. Default 24 hours.
PULP_LICENCE_KEYNounset (evaluation watermark)<5-part signed token from your licence email>Commercial licence key (Ed25519-signed token, verified offline). Unset or invalid → the server runs under the Evaluation Licence and rendered output carries a small watermark; rendering never refuses on a licence problem. Licence status is surfaced in /health/ready. See licence-key-format.md.
SCHEDULE_POLL_INTERVAL_SECONDSNo1515How often the schedule engine polls for due schedules. Range 5–300.
SCHEDULE_MAX_CONCURRENTNo22Max schedule executions rendered concurrently per instance. Range 1–10.
SCHEDULE_DELIVERY_MAX_RETRIESNo44Max delivery attempts per target (email/S3/webhook). Range 1–10.
SCHEDULE_CATCHUP_WINDOW_MINUTESNo1515Missed fires older than this are skipped instead of fired late (e.g. after downtime). Range 1–1440.
SCHEDULE_STALL_TIMEOUT_MSNo300000300000Claimed executions older than this are reclaimed (instance died mid-render). Also bounds the startup sweep that removes stale scheduled-render temp artifacts. Range 30 000–600 000.
SCHEDULE_EXECUTION_RETENTION_DAYSNo3030Completed/failed execution rows older than this are purged. Range 1–3650.
SCHEDULE_ENGINE_TICK_LIMITNo500500Max due schedules processed per engine tick (unbounded-query guard). Range 10–10 000.
SCHEDULE_SMTP_HOSTYes (email delivery)unset (email targets fail)smtp.sendgrid.netSMTP relay host for schedule email delivery targets. Without it, schedules with email targets fail delivery. nodemailer is lazy-loaded — no overhead when unset.
SCHEDULE_SMTP_PORTNo587587SMTP port.
SCHEDULE_SMTP_USER / SCHEDULE_SMTP_PASSNounset (unauthenticated relay)SMTP credentials.
SCHEDULE_SMTP_FROMNounsetreports@example.comFrom address on delivered emails.
SCHEDULE_SMTP_SECURENofalse (STARTTLS)truetrue = implicit TLS (typically port 465); false = STARTTLS upgrade on a plaintext connection (typically 587).
SCHEDULE_SMTP_REQUIRE_TLSNotruetrueWith SCHEDULE_SMTP_SECURE=false, require the relay to actually upgrade to TLS — a relay without STARTTLS hard-fails instead of silently transmitting credentials and the rendered attachment in cleartext. Security opt-out: an unrecognised value fails config validation.
SCHEDULE_S3_BUCKETYes (S3 delivery)unsetacme-reportsBucket for schedule S3 delivery targets. Deliberately separate from the asset-binary S3_* settings — delivery may target a different bucket/role.
SCHEDULE_S3_REGION / SCHEDULE_S3_ACCESS_KEY_ID / SCHEDULE_S3_SECRET_ACCESS_KEY / SCHEDULE_S3_ENDPOINTNounsetCredentials/region/endpoint for schedule S3 delivery (endpoint for MinIO/R2-style providers).
SCHEDULE_DLQ_ENABLEDNotruetrueWhen true, exhausted schedule deliveries write a reference row to schedule_delivery_dlq for /admin/schedule-dlq list/replay. false = log-only exhaustion reporting. Postgres only; the store is null in file/sqlserver modes regardless.
SCHEDULE_DLQ_RETENTION_DAYSNo3030DLQ rows older than this are purged by the schedule engine’s retention tick. Range 1–365.
BATCH_WEBHOOK_DLQ_ENABLEDNotruetrueWhen true, exhausted async-batch webhook deliveries write a reference row to batch_webhook_dlq for /admin/batch-dlq list/replay. false = log-only. Postgres only.
WEBHOOK_MAX_CONCURRENT_JOBSNo1010Max async-batch jobs processing simultaneously; submissions beyond the cap are shed with 503 + Retry-After. Range 1–50.
WEBHOOK_TIMEOUT_MSNo1000010000HTTP timeout for each outbound webhook POST. Range 1 000–60 000.
WEBHOOK_MAX_RETRIESNo33Max webhook delivery attempts (including the first). Range 1–10.
WEBHOOK_JOB_RETENTION_SECONDSNo36003600How long completed/failed async-batch jobs stay pollable (and, on Postgres, how long their durable rows + result blobs are retained). The sweep runs every 60 s, so effective retention can exceed this by up to ~60 s. Range 60–86 400.
WEBHOOK_JOB_TIMEOUT_MSNo300000300000Max wall-clock time an async-batch job may run before the sweep fails it and releases its concurrency slot. Range 30 000–600 000.
JOB_RESULT_BLOB_STORENofilesystems3Where completed async-batch result envelopes (gzipped JSON containing the rendered documents) are persisted on Postgres deployments. filesystem{JOB_RESULT_BLOB_DIR}/{tenantId}/{jobId}.json.gzmulti-instance/HA deployments require a shared volume, or use s3. s3s3://{JOB_RESULT_BLOB_BUCKET}/..., reusing the asset-binary S3_REGION/S3_ACCESS_KEY_ID/S3_SECRET_ACCESS_KEY/S3_ENDPOINT/S3_PATH_STYLE settings. The S3 credentials additionally need ListBucket for the v0.84.0 blob orphan-reconciliation sweep.
JOB_RESULT_BLOB_DIR | No | ./job-result-blobs | /data/job-result-blobs | Filesystem root for result blobs (JOB_RESULT_BLOB_STORE=filesystem).
JOB_RESULT_BLOB_BUCKET | Yes (s3 blob store) | unset | acme-pulp-job-results | Bucket for result blobs (JOB_RESULT_BLOB_STORE=s3). Deliberately distinct from S3_BUCKET: job-result blobs are private, never public-ACL.
STARTUP_ORPHAN_GRACE_MS | No | 60000 | 60000 | At startup, pending/processing async-batch rows older than this are failed with job_abandoned_at_startup; younger rows are left for the just-started worker. Range 10 000–3 600 000.
RATE_LIMITS_JSON | No | unset | [{"tenantId":"acme","routeClass":"render","max":120}] | Per-tenant / per-route-class / per-template rate-limit overrides (C.2). JSON array of { tenantId, routeClass, templateKey?, max, timeWindow? }; parsed and validated at startup. Mutually exclusive with RATE_LIMITS_JSON_FILE.
RATE_LIMITS_JSON_FILE | No | unset | /etc/pulp-engine/rate-limits.json | Path to a JSON file with the same shape; preferred for large configs or file-writing orchestration.
METRICS_TEMPLATE_LABEL_MODE | No | topn | allowlist | How templateKey is attached to render metrics: topn (LFU tracker admits up to METRICS_TEMPLATE_LABEL_MAX distinct keys, rest become __other__), allowlist (only listed keys emit real names; predictable, recommended for stable template sets), off (label always __other__). Bounds Prometheus label cardinality.
METRICS_TEMPLATE_LABEL_MAX | No | 50 | 50 | Cardinality cap for topn mode (max 10 000).
METRICS_TEMPLATE_LABEL_ALLOWLIST | No | unset | invoice,report | Comma-separated template keys for allowlist mode.
VARIANTS_ENABLED | No | false | true | Master switch for B.3 A/B variant label resolution. When false, labels with trafficSplit/controlLabel behave as plain pointers (no bucketing); when true, variant bucketing runs and renders carry X-PulpEngine-Bucket / X-PulpEngine-Resolved-Label headers.
RENDER_MAX_CONCURRENT_PAGES | No | 5 | 16 | Size of the Chromium page pool, i.e. simultaneous PDF renders per instance (v0.85.0; previously a compile-time constant of 5). Size against host CPU/RAM: budget ~150–300 MB per page under load; see the benchmark pack for measured per-slot throughput. In child-process mode the pool lives in the worker (the var is forwarded automatically). Range 1–50.
RENDER_MAX_QUEUE_DEPTH | No | 2× pool size | 10 | Render-queue waiter cap (v0.85.0): renders beyond RENDER_MAX_CONCURRENT_PAGES + RENDER_MAX_QUEUE_DEPTH in flight are shed with 503 render_saturated + Retry-After instead of queueing unboundedly, giving graceful degradation instead of a timeout cascade. 0 = shed as soon as no page slot is free; a very large value approximates pre-v0.85.0 unbounded queueing. Range 0–1000.
RENDER_WORKER_TIMEOUT_MS | No | 65000 | 65000 | Per-phase render deadline (queue wait, then render) in child-process mode; the stream silence watchdog in container/socket modes (overall per-render ceiling derives as this + 55 s). Default = Puppeteer’s 30 s setContent + 30 s page.pdf + 5 s headroom, so Puppeteer surfaces its own typed timeout first. Raise for very large documents on slow hardware. Range 5 000–600 000.
RENDER_PREVIEW_RESERVED_SLOTS | No | 1 | 1 | Render slots of the RENDER_MAX_CONCURRENT_PAGES pool reserved exclusively for editor previews so batch/production load cannot starve interactive preview. 0 disables the reservation. Must leave at least one batch slot (≤ pool − 1; enforced at startup).
AUDIT_RETENTION_DAYS | No | unset (no automatic purge) | 365 | Opt-in scheduled purge of audit events older than this many days. Multi-instance: set on one instance only, or drive DELETE /audit-events from external cron. Range 1–3650.
AUDIT_PURGE_INTERVAL_HOURS | No | 24 | 24 | Interval between audit purge runs. Range 1–168.
BATCH_WEBHOOK_DLQ_RETENTION_DAYS | No | 30 | 30 | Batch-webhook DLQ rows older than this are deleted by an hourly purge. Rows typically become unreplayable after WEBHOOK_JOB_RETENTION_SECONDS but stay visible for audit until this retention. Range 1–365.
API_KEYS_JSON_FILE | No | unset | /etc/pulp-engine/api-keys.json | Path to a file containing the API_KEYS_JSON array; preferred when credentials are mounted as files (Kubernetes secrets). Mutually exclusive with API_KEYS_JSON.
EDITOR_DIST_PATH | No | unset (Docker image: bundled editor) | ../editor/dist | Path to a built editor bundle to serve at /editor/ (plus the form/embed wrapper routes). The Docker image sets this to its bundled editor; bare-metal source deployments point it at apps/editor/dist.
ANTHROPIC_TIMEOUT_MS | No | 60000 | 60000 | Timeout for each Anthropic HTTP call. Worst-case request latency is maxAttempts × ANTHROPIC_TIMEOUT_MS (a startup warning fires when that exceeds 5 minutes, since reverse proxies tend to cut long requests).
AI_NODE_REWRITE_ENABLED | No | true | false | Dedicated switch for POST /nodes/rewrite (inline AI node rewrite). Only meaningful when ANTHROPIC_API_KEY is set; false disables node rewrite while leaving POST /templates/generate enabled.
RATE_LIMIT_AI_GENERATION_MAX | No | 5 | 5 | Per-actor per-minute limit for POST /templates/generate.
RATE_LIMIT_AI_EDIT_MAX | No | 20 | 20 | Per-identity per-minute limit for POST /nodes/rewrite (separate bucket from full-template generation).
RATE_LIMIT_AI_EDIT_OUTCOME_MAX | No | 60 | 60 | Per-identity per-minute limit for the rewrite apply/dismiss telemetry endpoint (own bucket so telemetry can’t burn rewrite quota).
PDF_UTIL_MAX_INPUT_SIZE_MB | No | 50 | 50 | Max size of a single input PDF for the /render/pdf/{merge,watermark,insert} transform endpoints. Range 1–500.
PDF_UTIL_MAX_TOTAL_SIZE_MB | No | 200 | 200 | Max combined input size per transform request. Range 1–1000.
PDF_UTIL_MAX_MERGE_SOURCES | No | 50 | 50 | Max PDFs in one merge request. Range 2–200.
PDF_UTIL_MAX_OUTPUT_PAGES | No | 10000 | 10000 | Max pages in a transform output PDF. Range 1–50 000.
RATE_LIMIT_PDF_UTIL_MAX | No | 20 | 20 | Per-minute rate limit for the PDF transform endpoints.
SANDBOX_ENABLED | No | false | true | Enables the anonymous public playground (/sandbox/*). Requires MULTI_TENANT_ENABLED=true and a hardened companion posture asserted at startup (non-in-process RENDER_MODE, BLOCK_REMOTE_RESOURCES=true, RATE_LIMIT_STORE=redis, TRUST_PROXY=true, concrete CORS_ALLOWED_ORIGINS, and SANDBOX_TOKEN_SECRET).
SANDBOX_TOKEN_SECRET | Yes (sandbox) | unset | <openssl rand -hex 32> | HMAC-SHA256 signing secret for sandbox session tokens (≥ 32 hex chars). Deliberately distinct from EDITOR_TOKEN_SECRET.
SANDBOX_TENANT_ID | No | sandbox | sandbox | Tenant slug sandbox sessions operate under (templates and templateRef resolution). Created/checked at boot; archiving it is the sandbox kill switch.
SANDBOX_SESSION_TTL_MINUTES | No | 15 | 15 | Sandbox session token lifetime. Range 1–240.
SANDBOX_SESSION_QUOTA | No | 20 | 20 | Renders allowed per session token (Redis-backed atomic counter). Range 1–500.
SANDBOX_BODY_LIMIT_BYTES | No | 262144 | 262144 | Body limit for sandbox render requests (smaller than PREVIEW_BODY_LIMIT, since no base64-embedded content is needed). Range 1 024–2 097 152.
SANDBOX_ALLOWED_ORIGINS | No | unset | https://pulpengine.dev | Extra CORS origins for /sandbox/*, unioned with CORS_ALLOWED_ORIGINS. Browser-side friction only; the real gate is the session token + IP rate limit.
RATE_LIMIT_SANDBOX_MAX | No | 10 | 10 | Per-IP per-minute limit across /sandbox/session and /sandbox/render/*.
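
The RATE_LIMITS_JSON entry in the table above describes a JSON array of { tenantId, routeClass, templateKey?, max, timeWindow? }. Because the value is parsed and validated at startup, it can help to check the shape before deploying. The following is a minimal pre-deployment sketch, not the server's validator: the function name is hypothetical, and the rules that max is a positive integer and that unknown keys are rejected are assumptions.

```python
import json

REQUIRED_KEYS = {"tenantId", "routeClass", "max"}
OPTIONAL_KEYS = {"templateKey", "timeWindow"}


def validate_rate_limits(raw):
    """Parse a RATE_LIMITS_JSON string and check each entry's keys."""
    entries = json.loads(raw)
    if not isinstance(entries, list):
        raise ValueError("RATE_LIMITS_JSON must be a JSON array")
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index} is not an object")
        missing = REQUIRED_KEYS - set(entry)
        unknown = set(entry) - REQUIRED_KEYS - OPTIONAL_KEYS
        if missing or unknown:
            raise ValueError(
                f"entry {index}: missing={sorted(missing)} unknown={sorted(unknown)}"
            )
        limit = entry["max"]
        # Assumption: max is a positive integer (bool is excluded explicitly).
        if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
            raise ValueError(f"entry {index}: max must be a positive integer")
    return entries
```

Running it against the example value from the table returns the parsed list unchanged; a malformed entry raises ValueError with the offending index.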

Validation: config.ts reads and validates env vars at startup. The process exits immediately with a descriptive error if the variable required by the selected mode is missing — DATABASE_URL for postgres mode, SQL_SERVER_URL for sqlserver mode, TEMPLATES_DIR for file mode. For S3 mode, missing required S3 vars cause an immediate exit(1); an inaccessible bucket is detected at startup via a HeadBucket probe (fail-fast). Opt-in capability flags that depend on other variables (e.g. OIDC_DISCOVERY_URL without OIDC_CLIENT_ID) also fail fast.
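
The mode-dependent requirement described above can be sketched as a small lookup. This is an illustration of the fail-fast rule only, not the logic in config.ts: the function name is hypothetical, and treating file as the fallback when STORAGE_MODE is unset is an assumption.

```python
from typing import Dict, Optional

# Mapping taken from the paragraph above.
REQUIRED_BY_STORAGE_MODE = {
    "postgres": "DATABASE_URL",
    "sqlserver": "SQL_SERVER_URL",
    "file": "TEMPLATES_DIR",
}


def missing_storage_var(env: Dict[str, str]) -> Optional[str]:
    """Return the name of the missing required variable, or None if satisfied."""
    mode = env.get("STORAGE_MODE", "file")  # assumption: file is the fallback
    required = REQUIRED_BY_STORAGE_MODE.get(mode)
    if required is None:
        raise ValueError(f"unknown STORAGE_MODE: {mode!r}")
    return None if env.get(required) else required
```

Passing a copy of the intended environment (for example dict(os.environ)) shows which variable would trigger the startup exit.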


Minimum Supported Production Configuration

When NODE_ENV=production, hardening is enforced by default: the server will not start unless all seven security controls are configured. Set HARDEN_PRODUCTION=false to explicitly opt out for evaluation. For supported production deployments, configure all seven controls listed in the block below.

When OIDC/SSO is enabled, two additional conditional controls are also enforced in hardened mode: OIDC_REDIRECT_URI must use https://, and OIDC_EDITOR_GROUPS must be set explicitly (leaving it at the default * would grant editor access to every authenticated SSO user, so hardened mode forces a deliberate choice).

# ── Minimum supported production configuration ──────────────────────────────────
CORS_ALLOWED_ORIGINS=https://editor.example.com   # specific origins; wildcard * rejected
DOCS_ENABLED=false                                 # disable Swagger UI (or set true to acknowledge exposure)
METRICS_TOKEN=<openssl rand -hex 32>               # bearer auth for GET /metrics
REQUIRE_HTTPS=true                                 # reject editor-token login over plain HTTP
TRUST_PROXY=true                                   # required when behind a TLS-terminating reverse proxy
BLOCK_REMOTE_RESOURCES=true                        # prevent render pipeline from fetching arbitrary external resources
EDITOR_USERS_JSON='[{"id":"admin","displayName":"Admin","key":"...","role":"admin"}]'
# or set ALLOW_SHARED_KEY_EDITOR=true to use shared-key identity
HARDEN_PRODUCTION=true                             # fail fast if any control above is missing

The server refuses to start until all seven controls pass (plus the conditional OIDC controls above, when OIDC is enabled). All violations are reported together in a single error, so you can fix everything in one pass.
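
To see all gaps before the first start, the same "report everything in one pass" idea can be sketched as a pre-flight check. The rules below are approximations inferred from the comments in the block above (for example, that origins are comma-separated and that the three boolean controls must equal true); the authoritative rules are in § Hardened Production Mode, and the function name is hypothetical.

```python
def hardening_gaps(env):
    """Collect every unmet control so all of them can be fixed in one pass."""
    gaps = []
    origins = [o.strip() for o in env.get("CORS_ALLOWED_ORIGINS", "").split(",")]
    origins = [o for o in origins if o]
    if not origins or "*" in origins:
        gaps.append("CORS_ALLOWED_ORIGINS must list specific origins (no wildcard)")
    if env.get("DOCS_ENABLED") not in ("true", "false"):
        gaps.append("DOCS_ENABLED must be set explicitly")
    if not env.get("METRICS_TOKEN"):
        gaps.append("METRICS_TOKEN must be set")
    for flag in ("REQUIRE_HTTPS", "TRUST_PROXY", "BLOCK_REMOTE_RESOURCES"):
        if env.get(flag) != "true":
            gaps.append(f"{flag} must be true")
    has_users = bool(env.get("EDITOR_USERS_JSON"))
    shared_key = env.get("ALLOW_SHARED_KEY_EDITOR") == "true"
    if not (has_users or shared_key):
        gaps.append("set EDITOR_USERS_JSON or ALLOW_SHARED_KEY_EDITOR=true")
    return gaps
```

An empty list means the seven controls appear to be in place; the conditional OIDC controls are not covered by this sketch.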

Upgrading incrementally: If you need to configure controls one at a time, omit HARDEN_PRODUCTION — the advisory warning will list remaining gaps at startup. Set HARDEN_PRODUCTION=true once all seven are in place.

See § Hardened Production Mode for enforcement rules and the full example.


Object Storage (S3 / MinIO / R2)

Setting ASSET_BINARY_STORE=s3 stores uploaded asset binaries in an S3-compatible bucket instead of the local filesystem. This eliminates the shared-volume requirement for multi-instance deployments.

Required IAM / bucket permissions

The credentials (S3_ACCESS_KEY_ID / S3_SECRET_ACCESS_KEY) must have:

  • Object-level access: s3:PutObject, s3:DeleteObject on objects in the bucket.
  • Bucket-level access: s3:ListBucket (or equivalent) on the bucket itself — required for the HeadBucket startup probe. On AWS S3 this is documented in the HeadBucket API reference. For MinIO and other compatible providers, equivalent bucket-level access is required.

Public mode (default): The bucket and objects must be publicly readable at S3_PUBLIC_URL. Pulp Engine (and Puppeteer when rendering PDFs that reference assets) fetches asset URLs without auth headers.

Private mode (ASSET_ACCESS_MODE=private): The bucket does not need to be publicly readable. Pulp Engine fetches objects server-side using the configured credentials and proxies them to authorized callers. S3_PUBLIC_URL is not required in this mode — GetObject access on the bucket is sufficient. See § Asset Access Mode.

Public URL rules

Configuration | S3_PUBLIC_URL
Standard AWS S3 (no S3_ENDPOINT, no path-style) | Optional; auto-derived as https://{bucket}.s3.{region}.amazonaws.com
Custom endpoint (S3_ENDPOINT set) | Required; the API endpoint and public delivery URL differ for custom providers
Path-style URLs (S3_PATH_STYLE=true) | Required

Trailing slashes in S3_PUBLIC_URL are stripped automatically. Asset URLs are constructed as ${S3_PUBLIC_URL}/${filename}.
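
The rules above can be summarised in a short sketch. It mirrors the table and the trailing-slash behaviour as documented; the function and parameter names are hypothetical and this is not the server's implementation.

```python
def resolve_public_base(bucket, region, endpoint=None, path_style=False, public_url=None):
    """Return the base URL that asset URLs are built from."""
    if public_url:
        return public_url.rstrip("/")  # trailing slashes are stripped
    if endpoint or path_style:
        # Custom endpoints and path-style URLs have no derivable public URL.
        raise ValueError("S3_PUBLIC_URL is required with S3_ENDPOINT or S3_PATH_STYLE=true")
    return f"https://{bucket}.s3.{region}.amazonaws.com"


def asset_url(base, filename):
    """Asset URLs take the form {S3_PUBLIC_URL}/{filename}."""
    return f"{base}/{filename}"
```

For example, resolve_public_base("my-pulp-engine-assets", "us-east-1") yields the auto-derived AWS URL, while a MinIO configuration must pass public_url explicitly.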

Standard AWS S3

ASSET_BINARY_STORE=s3
S3_BUCKET=my-pulp-engine-assets
S3_REGION=us-east-1
S3_ACCESS_KEY_ID=AKIA...
S3_SECRET_ACCESS_KEY=...
# S3_PUBLIC_URL is optional — auto-derived as https://my-pulp-engine-assets.s3.us-east-1.amazonaws.com

MinIO (self-hosted)

ASSET_BINARY_STORE=s3
S3_BUCKET=pulp-engine-assets
S3_REGION=us-east-1          # arbitrary value — MinIO requires a non-empty region string
S3_ACCESS_KEY_ID=minio-access-key
S3_SECRET_ACCESS_KEY=minio-secret-key
S3_ENDPOINT=https://minio.example.com
S3_PATH_STYLE=true
S3_PUBLIC_URL=https://minio.example.com/pulp-engine-assets

Cloudflare R2

ASSET_BINARY_STORE=s3
S3_BUCKET=pulp-engine-assets
S3_REGION=auto
S3_ACCESS_KEY_ID=...
S3_SECRET_ACCESS_KEY=...
S3_ENDPOINT=https://<accountid>.r2.cloudflarestorage.com
S3_PUBLIC_URL=https://assets.example.com   # your R2 custom domain or public URL

Switching an existing deployment from filesystem to S3

Switching ASSET_BINARY_STORE from filesystem to s3 does not automatically migrate existing binaries. Existing AssetRecord.url values in the database or file index still point to /assets/{filename}. Without explicit migration those URLs become broken.

Options for existing deployments:

  1. Greenfield cutover: If existing asset data is expendable, deploy fresh in S3 mode with an empty asset store. No migration needed.
  2. Manual cutover: Copy existing binaries from ASSETS_DIR to the S3 bucket, update stored URL values in the database to the new S3 URLs, then switch to ASSET_BINARY_STORE=s3. Requires a maintenance window.

No automated migration tool is provided in this release.


Asset Access Mode

ASSET_ACCESS_MODE controls how uploaded image assets are delivered to the PDF renderer and the visual editor. Default: public.

public (default)

Assets are served without authentication:

  • Filesystem: @fastify/static serves GET /assets/:filename without any X-Api-Key or session token required. Puppeteer fetches images via the loopback address (http://127.0.0.1:PORT/assets/...) during PDF rendering.
  • S3: save() returns a public S3 URL (e.g. https://bucket.s3.region.amazonaws.com/filename). The bucket must be publicly readable. Puppeteer and the browser fetch images directly from S3/CDN.

private

All asset delivery is routed through an authenticated API proxy. No public URL is required.

  • Filesystem: GET /assets/:filename requires X-Api-Key (admin or editor scope) or X-Editor-Token. @fastify/static is not registered.
  • S3: save() returns a relative proxy URL (/assets/filename). The bucket does not need a public-read ACL or policy — only GetObject access for the API credentials is required. S3_PUBLIC_URL is not required.
  • PDF rendering: Instead of Puppeteer fetching asset URLs at render time, the API inlines all referenced assets as base64 data URIs in the HTML before passing it to Puppeteer. No network calls from Puppeteer to the asset server are needed.
  • HTML preview: Same inlining — the returned HTML contains self-contained data URIs rather than /assets/ src paths.
  • Editor: The visual editor fetches /assets/:filename with X-Editor-Token and displays images via a revocable blob URL. No direct <img src> to /assets/ — auth is applied at fetch time.
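
The inlining step used for PDF rendering and HTML preview in private mode can be illustrated as follows. This is a simplified sketch of the technique, not the API's code: it assumes assets are referenced as src="/assets/{filename}" attributes and are read from a local directory, and the function name is hypothetical.

```python
import base64
import mimetypes
import re
from pathlib import Path

# Matches src="/assets/<filename>" where the filename has no further slashes.
ASSET_SRC = re.compile(r'src="/assets/([^"/]+)"')


def inline_assets(html: str, assets_dir: Path) -> str:
    """Replace /assets/ image references with base64 data URIs."""

    def to_data_uri(match):
        filename = match.group(1)
        data = (assets_dir / filename).read_bytes()
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        encoded = base64.b64encode(data).decode("ascii")
        return f'src="data:{mime};base64,{encoded}"'

    return ASSET_SRC.sub(to_data_uri, html)
```

Because the resulting HTML is self-contained, the renderer needs no network access to the asset server. The trade-off is a larger HTML payload, since base64 adds roughly one third to each binary's size.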

Private mode configuration

ASSET_ACCESS_MODE=private

# Filesystem backend (no additional vars needed):
ASSET_BINARY_STORE=filesystem
ASSETS_DIR=/var/pulp-engine/assets
API_KEY_ADMIN=...
API_KEY_EDITOR=...

# S3 backend — bucket does not need to be public:
ASSET_BINARY_STORE=s3
S3_BUCKET=pulp-engine-private-assets
S3_REGION=us-east-1
S3_ACCESS_KEY_ID=...
S3_SECRET_ACCESS_KEY=...
# S3_PUBLIC_URL is NOT required in private mode

IAM permissions for S3 private mode

In addition to s3:PutObject, s3:DeleteObject, and s3:ListBucket, the API credentials must have:

  • s3:GetObject on objects in the bucket — required to stream binaries through the proxy handler.

Migration: switching from public to private mode

Filesystem: No data migration needed. Stored asset URLs are already in /assets/filename form. Switch ASSET_ACCESS_MODE=private and restart.

S3: Stored asset URLs from public mode are absolute S3 URLs (e.g. https://bucket.s3.region.amazonaws.com/filename). These URLs continue to work only as long as the S3 bucket remains publicly accessible. If you simultaneously privatize the bucket and set ASSET_ACCESS_MODE=private, those stored absolute URLs become inaccessible.

Options:

  1. Keep bucket public during transition: Set ASSET_ACCESS_MODE=private, restart. Old assets continue to render via their stored public URLs. New uploads get proxy URLs (/assets/filename). Over time, as old assets are replaced with newly uploaded ones, all stored URLs become proxy URLs.
  2. Hard cutover: Simultaneously privatize the bucket and set ASSET_ACCESS_MODE=private. Re-upload all assets (which generates proxy URLs) or rewrite stored URL values in your database from absolute S3 URLs to /assets/filename. Requires a maintenance window.

No automated migration tool is provided.
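
For the hard-cutover option, the URL rewrite itself is mechanical. The sketch below shows one way to map a stored absolute S3 URL to the proxy form. It assumes the filename is the last path segment of the stored URL, which matches the {S3_PUBLIC_URL}/{filename} construction but may not hold if your keys carry prefixes; the function name is hypothetical, and applying the result to your database is left to you.

```python
from urllib.parse import urlparse


def to_proxy_url(stored_url: str) -> str:
    """Map an absolute S3 asset URL to the /assets/{filename} proxy form."""
    if stored_url.startswith("/assets/"):
        return stored_url  # already a proxy URL, leave untouched
    filename = urlparse(stored_url).path.rsplit("/", 1)[-1]
    if not filename:
        raise ValueError(f"no filename in {stored_url!r}")
    return f"/assets/{filename}"
```

Run it over a dry-run export of stored URL values first and review the old/new pairs before writing anything back.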


Three images are published to GitHub Container Registry on every tagged release:

ImagePurpose
ghcr.io/OWNER/pulp-engine:vX.Y.ZAPI server — all storage modes, all render modes; includes bundled editor SPA
ghcr.io/OWNER/pulp-engine-worker:vX.Y.ZEphemeral PDF render worker — used by RENDER_MODE=container and RENDER_MODE=socket
ghcr.io/OWNER/pulp-engine-controller:vX.Y.ZRender controller — sole Docker socket holder in RENDER_MODE=socket topology

Replace OWNER with your GitHub organisation or the org that published your release. All images are tagged with the version (e.g. v0.48.1) and a rolling latest.

Most deployments only need the API image. Pull the worker image when using RENDER_MODE=container or RENDER_MODE=socket; pull the controller image as well only for RENDER_MODE=socket, the privilege-separated topology (see § Render Isolation Mode).

Size: pulp-engine ~600–800 MB (Chromium binary); pulp-engine-worker similar (also needs Chromium); pulp-engine-controller ~200 MB (Docker CLI only, no browser)

Quick start — file mode (no database required)

docker pull ghcr.io/OWNER/pulp-engine:vX.Y.Z

docker run -d \
  --name pulp-engine \
  -p 3000:3000 \
  -e API_KEY_ADMIN=your-secret-key \
  -e STORAGE_MODE=file \
  -v /var/pulp-engine/templates:/data/templates \
  -v /var/pulp-engine/assets:/data/assets \
  ghcr.io/OWNER/pulp-engine:vX.Y.Z

Templates volume must be writable. File mode writes template definitions and version snapshots under TEMPLATES_DIR (/data/templates in the image) on every save and publish. Do not add :ro to that mount.

PostgreSQL mode — with migration

Migrations must run before the app container starts. Use the same image tag to ensure code and schema are always in sync:

# 1. Run migrations (one-off container — exits after apply)
docker run --rm \
  --entrypoint /app/node_modules/.bin/prisma \
  -e DATABASE_URL=postgres://user:pass@host:5432/pulp-engine \
  ghcr.io/OWNER/pulp-engine:vX.Y.Z \
  migrate deploy --schema /app/src/prisma/schema.prisma

# 2. Start the app container
docker run -d \
  --name pulp-engine \
  -p 3000:3000 \
  -e NODE_ENV=production \
  -e STORAGE_MODE=postgres \
  -e DATABASE_URL=postgres://user:pass@host:5432/pulp-engine \
  -e API_KEY_ADMIN=your-secret-key \
  -v /var/pulp-engine/assets:/data/assets \
  ghcr.io/OWNER/pulp-engine:vX.Y.Z

Already-applied migrations are skipped; this is safe to re-run.

Container environment variables

All variables from § 2 Environment Variables can be passed with -e. The following defaults are baked into the image:

Variable | Image default | Override when
NODE_ENV | production | Never; leave as-is
HOST | 0.0.0.0 | Using HOST=127.0.0.1 with a sidecar proxy
PORT | 3000 | Changing the listener port
STORAGE_MODE | file | Using postgres or sqlserver
ASSETS_DIR | /data/assets | Mounting assets at a different path
PULP_ENGINE_DISABLE_SANDBOX | true | Never; required for containers, do not unset

Volume mounts

Path in container | Purpose | Mount a host path?
/data/assets | Uploaded image assets | Yes; persist across container restarts
/data/templates | Template definitions (file mode only) | Yes; read-write, because the API writes template files on every save and publish

Deployment validation

Two scripts verify a running deployment. Use them in order after starting the container.

validate-deploy.sh — infrastructure and security posture. Checks liveness, readiness, metrics endpoint, security advisories (open /metrics, exposed Swagger UI), auth, and optionally a stored-template render. Requires only curl. See the script header for full argument and check descriptions.

# Infrastructure + auth (no seeded templates required)
./scripts/validate-deploy.sh http://localhost:3000 $API_KEY_ADMIN

# After seeding or creating a template — adds a render pipeline check
./scripts/validate-deploy.sh http://localhost:3000 $API_KEY_ADMIN loan-approval-letter $METRICS_TOKEN

# Docker image deployments — also verify the bundled editor SPA
EXPECT_EDITOR=true ./scripts/validate-deploy.sh http://localhost:3000 $API_KEY_ADMIN

smoke-test.sh — end-to-end lifecycle. Creates a disposable template, renders it to PDF, verifies the editor route, and cleans up. Requires curl and jq.

./scripts/smoke-test.sh http://localhost:3000 $API_KEY_ADMIN

Recommended first-deployment sequence:

  1. Start the container (or bare-metal process).
  2. Run validate-deploy.sh — confirms health, storage, metrics, and auth are working.
  3. Run smoke-test.sh — confirms the full create → render → delete lifecycle.
  4. If HARDEN_PRODUCTION=true: verify 0 warnings from validate-deploy.sh.

SQL Server parity note: CI exercises SQL Server schema migration and SQL Server-specific storage/API tests in the test-sqlserver job (ci.yml). Local deployment rehearsal for SQL Server requires a reachable SQL Server instance — follow the SQL Server mode deployment steps and run the same validation scripts above.

Rollback

# Stop current container
docker stop pulp-engine && docker rm pulp-engine

# Start the previous version
docker run -d --name pulp-engine [same -e and -v flags] \
  ghcr.io/OWNER/pulp-engine:v0.PREV.Y

# Validate the rollback
./scripts/validate-deploy.sh http://localhost:3000 $API_KEY_ADMIN

# If rolling back a postgres migration:
# The previous image's migrations were already applied — no migration rollback is
# needed unless you added backward-incompatible schema changes. In that case,
# restore from a database backup taken before the forward migration was applied.
#
# If rolling back a SQL Server migration:
# Same principle — no migration rollback command exists. Restore from a backup taken
# before the forward migration was applied if schema changes were backward-incompatible.

Chrome sandbox note

PULP_ENGINE_DISABLE_SANDBOX=true is set automatically in the image. This is required because Chrome’s own sandbox depends on kernel features (such as unprivileged user namespaces) that a default Docker container configuration typically does not allow, so Chrome cannot start with its sandbox enabled there. The container network boundary and Puppeteer’s request-interception layer provide equivalent containment for the threat model of this service. Do not unset this variable in containerised deployments.

SQL Server mode — with migration

Migrations must run before the app container starts. Use the same image tag to ensure code and schema are always in sync:

# 1. Run migrations (one-off container — exits after apply)
docker run --rm \
  --entrypoint node \
  -e 'SQL_SERVER_URL=mssql://user:pass@host:1433/pulp-engine?trustServerCertificate=true' \
  ghcr.io/OWNER/pulp-engine:v0.X.Y \
  /app/dist/scripts/migrate-sqlserver.js

# 2. Start the app container
docker run -d \
  --name pulp-engine \
  -p 3000:3000 \
  -e NODE_ENV=production \
  -e STORAGE_MODE=sqlserver \
  -e 'SQL_SERVER_URL=mssql://user:pass@host:1433/pulp-engine?trustServerCertificate=true' \
  -e API_KEY_ADMIN=your-secret-key \
  -v /var/pulp-engine/assets:/data/assets \
  ghcr.io/OWNER/pulp-engine:v0.X.Y

Already-applied migrations are skipped; this is safe to re-run. The runner connects to master, creates the target database if absent, then applies all pending .sql files from dist/storage/sqlserver/migrations/ in order. Applied migrations are tracked in dbo.__migrations.
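
The "skip what is applied, run the rest in order" behaviour is what makes re-running safe. A minimal sketch of that selection step follows; it assumes "in order" means lexicographic filename order and uses hypothetical filenames, and it is not the code in migrate-sqlserver.js.

```python
def pending_migrations(available, applied):
    """Return the .sql files that still need to run, in filename order."""
    already_applied = set(applied)
    return [
        name
        for name in sorted(available)
        if name.endswith(".sql") and name not in already_applied
    ]
```

With every file already recorded as applied, the result is an empty list, so a second run changes nothing.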

For repo-checkout deployments (non-Docker), continue using:

pnpm --filter @pulp-engine/api db:migrate:sqlserver

Operational Limitations vs Postgres

SQL Server is a first-class storage backend for templates, assets, audit events, schedules, and tenants. A small number of operator-relevant features remain Postgres-only at the time of writing — track these so you can assess whether SQL Server is the right backend for your workload:

Feature | Postgres | SQL Server | Notes
Async render batch durability (IBatchJobStore) | ✅ Durable in batch_jobs table + result blob store | ❌ In-memory only | Jobs in flight when a pod restarts are lost on SQL Server. The IBatchJobStore adapter is tracked as a follow-up; see ha-reference-architecture.md and backup-restore-runbook.md for the operational guarantee. Synchronous render (POST /render/...) and the schedule engine are unaffected because they don’t use batchJobStore.

If the async batch durability gap matters for your workload, choose Postgres. Synchronous renders, schedules, and the editor work identically on either backend.

BATCH_ASYNC_DURABILITY — operator gate

To prevent operators from running a non-durable backend with async batch state by accident, the API has a tri-state operator gate that branches at boot when STORAGE_MODE=sqlserver or STORAGE_MODE=file (the file-mode gate landed in v0.84.0 — previously file mode silently skipped the check):

Value | Behaviour at boot | Use when
required | Refuses to boot with a SQLSERVER_BATCH_NONDURABLE / FILE_BATCH_NONDURABLE error. | You require durable async batch: switch to Postgres or set the variable explicitly. Default in hardened production.
warn | Logs the *_BATCH_NONDURABLE WARN at startup and proceeds. | You’re aware async batch is in-memory and want a visible reminder. Default in non-hardened mode.
allow-nondurable | Boots quietly. | You’ve explicitly opted in to in-memory async batch and don’t want the warning.

The default mirrors the REQUIRE_HTTPS pattern: hardened production does not silently set the value — it changes the default and refuses boot when the resolved value is incompatible. To run hardened production on SQL Server or file mode today, choose explicitly:

# Refuse boot — pick a different storage backend.
HARDEN_PRODUCTION=true STORAGE_MODE=sqlserver           # → boot refusal
HARDEN_PRODUCTION=true STORAGE_MODE=file                # → boot refusal (v0.84.0+)

# Operator acknowledges the limitation and accepts the warning.
HARDEN_PRODUCTION=true STORAGE_MODE=sqlserver \
  BATCH_ASYNC_DURABILITY=warn                            # → loud WARN, proceed

# Operator explicitly opts in (e.g. async batch isn't used in this deployment).
HARDEN_PRODUCTION=true STORAGE_MODE=file \
  BATCH_ASYNC_DURABILITY=allow-nondurable                # → quiet boot

The gate fires regardless of whether async batch routes are actually used at runtime — the routes are always mounted and an unused route should not be a reason for silent data-durability degradation.
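
The resolution order described in this section (explicit value first, otherwise a default that depends on hardened mode) can be sketched as a pure function. The function name and the outcome strings are hypothetical labels for the three documented behaviours, not values emitted by the server.

```python
VALID_VALUES = ("required", "warn", "allow-nondurable")

OUTCOMES = {
    "required": "refuse-boot",
    "warn": "warn-and-proceed",
    "allow-nondurable": "quiet-boot",
}


def resolve_batch_durability(storage_mode, hardened, explicit=None):
    """Return the boot outcome for the BATCH_ASYNC_DURABILITY gate."""
    if storage_mode == "postgres":
        return "durable"  # the gate only branches for sqlserver and file
    # An explicit setting wins; otherwise the default depends on hardening.
    value = explicit if explicit is not None else ("required" if hardened else "warn")
    if value not in VALID_VALUES:
        raise ValueError(f"invalid BATCH_ASYNC_DURABILITY: {value!r}")
    return OUTCOMES[value]
```

The four shell examples above correspond to the calls ("sqlserver", True), ("file", True), ("sqlserver", True, "warn"), and ("file", True, "allow-nondurable").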


Visual Editor

The visual editor is bundled in the Docker image and served at /editor/. No separate hosting or deployment is required.

Access: http://[your-host]:3000/editor/ (or https://pulp-engine.example.com/editor/ behind a reverse proxy). Navigating to /editor (without trailing slash) redirects automatically.

Authentication: The editor shows a login screen on first load. Enter the value of API_KEY_EDITOR (or API_KEY_ADMIN) to obtain a session token. See docs/editor-guide.md for the full auth model.

Live preview

The editor’s preview button calls POST /render/preview/html. This route is not registered by default in production — it returns 404 unless you explicitly set:

PREVIEW_ROUTES_ENABLED=true

The compose evaluator files (compose.yaml, compose.postgres.yaml) set this automatically. For production, weigh the exposure: preview routes accept an inline TemplateDefinition and render it server-side. Pair with network-level restrictions if enabled. See the PREVIEW_ROUTES_ENABLED env var entry in § 2 Environment Variables for full details.

Without PREVIEW_ROUTES_ENABLED=true: template load, save, publish, version history, and asset management all work — only the in-editor live preview is unavailable.

CORS and same-origin deployments

The bundled editor makes same-origin requests to the API (editor at /editor/, API at /). The browser does not send cross-origin headers for same-origin requests — CORS_ALLOWED_ORIGINS is not required for the editor to reach the API in this topology.

Set CORS_ALLOWED_ORIGINS when:

  • other browser clients or SPAs on different origins need API access, or
  • HARDEN_PRODUCTION=true is set (the hardened startup check requires the var regardless — set it to your deployment URL, e.g. CORS_ALLOWED_ORIGINS=https://pulp-engine.example.com).

Local development vs bundled deployment

Context | Editor URL | API URL | Notes
pnpm dev (source) | http://localhost:5174 | http://localhost:3000 | Vite dev server; hot reload
Docker image | http://[host]:3000/editor/ | http://[host]:3000 | Static SPA served by Fastify; same origin

Multi-Tenant Mode (opt-in)

MULTI_TENANT_ENABLED=true turns on per-tenant isolation of templates, assets, credentials, audit events, and scheduled jobs.

Requirement | Value
STORAGE_MODE | Must be postgres or sqlserver. File mode is rejected at startup when multi-tenant is enabled.
API_KEYS_JSON | Replaces the single API_KEY_* variables. Each entry carries tenantId + scopes; a super-admin entry (tenantId: null) is required for /admin/tenants CRUD.
X-Editor-Token | Carries a signed tenantId claim; cross-tenant access is rejected.
Asset binary store | S3 prefixes and filesystem paths are tenant-scoped; no operator action required beyond enabling multi-tenant.
Plugin storage | Plugin data is tenant-scoped; plugins that cannot honour tenantId are rejected at registration.

See tenant-isolation-guarantees.md for the full isolation model and the small set of operational caveats (e.g. shared Chromium render pool, shared rate-limit buckets unless configured otherwise).

Shared compute boundary — choose your render mode deliberately. Isolation in multi-tenant mode is row/path-level (templates, assets, credentials, audit, schedules), not compute-level. In the default RENDER_MODE=child-process, all tenants share one Chromium render pool per API pod, so a bug in a Handlebars helper or the render pipeline could in principle leak state across tenants within a pod. This is acceptable for the product’s trusted-tenant model (credentialled internal authors). If your tenants are untrusted or semi-trusted, run RENDER_MODE=container or socket for per-render compute isolation — see § Render Isolation Mode and the “Shared compute” caveat in tenant-isolation-guarantees.md.


Render Isolation Mode

RENDER_MODE controls where Puppeteer runs during PDF generation. Choose based on your security requirements and whether Docker is available.

For the security-reviewer view of these modes — threat boundaries, residual risks, operator checklist per mode — see Render Isolation Threat Model. This section covers operational setup; the threat-model doc covers what each mode actually isolates and what survives an API compromise.

Mode | Docker required | API holds Docker socket | Isolation level
child-process (default) | No | n/a | Process isolation, allowlisted worker env (no API/storage secrets)
container | Yes | Yes | Container: network-none, read-only FS, cap-drop ALL
socket | Yes (on controller) | No | Container (same flags) + privilege separation
in-process | No | n/a | None (debugging only)

Sizing the render pool (v0.85.0)

PDF capacity per instance is RENDER_MAX_CONCURRENT_PAGES simultaneous Chromium pages (default 5, minus RENDER_PREVIEW_RESERVED_SLOTS for the batch lane). Renders beyond pool + RENDER_MAX_QUEUE_DEPTH in flight are shed with 503 render_saturated + Retry-After: 30 — clients should back off and retry rather than treating it as a render failure. The benchmark pack carries measured per-slot throughput and the saturation arithmetic (its reference rig sustained 16 concurrent renders host-CPU-bound); use it to pick a pool size, then budget ~150–300 MB RAM per page under load and watch the pulp_engine_render_dispatches_in_flight gauge — pegged at the cap with 503s in the logs means scale the pool or add replicas. Render capacity scales linearly with replica count behind a load balancer (see HA reference architecture).
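
The sizing arithmetic above is simple enough to script when comparing candidate pool sizes. The sketch below uses only the figures stated in this guide (queue depth defaulting to 2× the pool, ~150–300 MB per page, at least one batch slot); the function name and the returned keys are hypothetical.

```python
def render_capacity(pool=5, reserved_preview=1, queue_depth=None):
    """Estimate per-instance render capacity and a RAM budget range."""
    if queue_depth is None:
        queue_depth = 2 * pool  # documented default: 2x pool size
    if not 0 <= reserved_preview <= pool - 1:
        raise ValueError("reserved preview slots must leave at least one batch slot")
    return {
        "batch_slots": pool - reserved_preview,   # pages available to batch load
        "max_in_flight": pool + queue_depth,      # beyond this, 503 render_saturated
        "ram_mb_low": pool * 150,                 # ~150 MB per page under load
        "ram_mb_high": pool * 300,                # ~300 MB per page under load
    }
```

With the defaults this gives 4 batch slots, 15 renders in flight before shedding, and a budget of roughly 750 to 1500 MB for Chromium pages. Treat the RAM range as a starting point and confirm it against the benchmark pack and your own measurements.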

child-process (default)

Puppeteer runs in a persistent child process spawned with a minimal allowlisted worker environment — no API, database, or object-store credentials cross the boundary. See Render Isolation Threat Model § 3 for the exact allowlist. Recommended for environments without Docker.

No additional configuration needed.

container

Puppeteer runs in a fresh ephemeral Docker container per render. Provides strong isolation but the API process holds Docker socket authority. A compromised API process could invoke arbitrary Docker operations on the host.

Required environment:

RENDER_MODE=container
RENDER_CONTAINER_IMAGE=ghcr.io/OWNER/pulp-engine-worker:vX.Y.Z

Build the worker image:

docker build -f Dockerfile.worker -t ghcr.io/OWNER/pulp-engine-worker:vX.Y.Z .

Runtime flags applied per render (hardcoded in the dispatcher):

  • --network none — no network access from worker
  • --read-only — read-only root filesystem
  • --tmpfs /tmp:rw,noexec,nosuid,size=256m — ephemeral writable /tmp
  • --cap-drop ALL — no Linux capabilities
  • --security-opt no-new-privileges — no setuid escalation
  • --memory 512m --cpus 1 — resource limits (configurable)
  • --rm — container removed after each render
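
To reproduce a worker launch by hand while debugging, the flags listed above can be assembled into a `docker run` command line. This is a sketch only: `worker_run_args` is a hypothetical helper, the dispatcher's real command construction is not shown in this guide, and any arguments the worker itself expects are omitted.

```python
import shlex
from typing import List


def worker_run_args(image: str, memory: str = "512m", cpus: str = "1") -> List[str]:
    """Build a docker run argv carrying the per-render flags listed above."""
    return [
        "docker", "run",
        "--rm",                                         # remove container afterwards
        "--network", "none",                            # no network access
        "--read-only",                                  # read-only root filesystem
        "--tmpfs", "/tmp:rw,noexec,nosuid,size=256m",   # ephemeral writable /tmp
        "--cap-drop", "ALL",                            # no Linux capabilities
        "--security-opt", "no-new-privileges",          # no setuid escalation
        "--memory", memory,                             # resource limits
        "--cpus", cpus,
        image,
    ]


if __name__ == "__main__":
    # Image tag is the placeholder used in this guide, not a real release.
    print(shlex.join(worker_run_args("ghcr.io/OWNER/pulp-engine-worker:vX.Y.Z")))
```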

socket

The API delegates Docker invocations to a separate render-controller process over a Unix domain socket. The API process holds no Docker socket authority. Even if the API is compromised, it cannot invoke arbitrary Docker operations — it can only submit render requests through the narrow socket protocol.

The controller enforces the same hardcoded security flags regardless of what the API requests.

Required environment (API process):

RENDER_MODE=socket
RENDER_CONTROLLER_SOCKET=/run/render/render.sock

Required environment (controller process):

RENDER_CONTAINER_IMAGE=ghcr.io/OWNER/pulp-engine-worker:vX.Y.Z
RENDER_CONTROLLER_SOCKET=/run/render/render.sock
# Optional:
# RENDER_CONTAINER_MEMORY_LIMIT=512m
# RENDER_CONTAINER_CPU_LIMIT=1
# RENDER_CONTAINER_SECCOMP_PROFILE=/etc/pulp-engine/chromium-seccomp.json

Get the images. All three (pulp-engine, pulp-engine-worker, pulp-engine-controller) are published publicly to GHCR on every release, so the simplest path is to pull:

docker pull ghcr.io/troycoderboy/pulp-engine-worker:vX.Y.Z
docker pull ghcr.io/troycoderboy/pulp-engine-controller:vX.Y.Z

Or build them from source if you need a local/custom build:

docker build -f Dockerfile.worker -t ghcr.io/OWNER/pulp-engine-worker:vX.Y.Z .
docker build -f Dockerfile.controller -t ghcr.io/OWNER/pulp-engine-controller:vX.Y.Z .

Deploy with the provided compose file, which shows the correct privilege separation:

docker compose -f compose.container.yaml up -d

The compose file:

  • Mounts /var/run/docker.sock only on the render-controller service
  • Uses condition: service_healthy so the API waits until the controller socket is live
  • Routes all renders through the socket; the API container has no Docker mount

Optional seccomp profile: Set RENDER_CONTAINER_SECCOMP_PROFILE to the path of a seccomp JSON file inside the controller container. The profile is passed as --security-opt seccomp=<path> to each worker container. You must bind-mount the file into the controller container:

render-controller:
  volumes:
    - /etc/pulp-engine/chromium-seccomp.json:/etc/pulp-engine/chromium-seccomp.json:ro
  environment:
    RENDER_CONTAINER_SECCOMP_PROFILE: /etc/pulp-engine/chromium-seccomp.json

Security model and residual risks

For the threat-radius comparison across modes, what survives an API compromise, the durable identity boundary (UIDs, socket mode 0660), and the per-mode operator checklist, see Render Isolation Threat Model.


3. Database Setup

Database modes (STORAGE_MODE=postgres or sqlserver). Skip this section entirely if using file mode — no migration or seed step is required.

Postgres mode

# 1. Create the database (if not already created by your DBA / managed service)
createdb -U postgres pulp-engine

# 2. Apply all migrations
pnpm --filter @pulp-engine/api db:deploy

# 3. Verify the tables exist
psql -U postgres -d pulp-engine -c "\dt"
# Should include: templates, template_versions

pnpm db:deploy runs prisma migrate deploy, which applies every committed migration in order. This is safe to re-run — already-applied migrations are skipped.

SQL Server mode

# 1. Apply the schema (creates the database if absent; idempotent)
pnpm --filter @pulp-engine/api db:migrate:sqlserver

# 2. Verify the tables exist (optional)
sqlcmd -S <host> -d <database> -U <user> -P <pass> \
  -Q "SELECT name FROM sys.tables ORDER BY name" -C
# Expected: assets, template_versions, templates

The migration runner connects to master, creates the target database if absent, then applies all pending migration files from apps/api/src/storage/sqlserver/migrations/ in order (e.g. 001_init.sql, 002_add_created_by.sql). Applied migrations are tracked in dbo.__migrations — already-applied files are skipped. Each migration runs in a transaction; if one fails the transaction is rolled back and the script exits non-zero naming the failing file. Safe to re-run at any time.
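
The tracking pattern described above (ordered files, a table of applied names, one transaction per file) can be illustrated in miniature. The sketch below deliberately swaps in SQLite and an in-memory mapping of file name to SQL so that it runs anywhere; it is not the project's migration runner, and `apply_pending` is a hypothetical name.

```python
import sqlite3
from typing import List, Mapping


def apply_pending(conn: sqlite3.Connection, migrations: Mapping[str, str]) -> List[str]:
    """Apply unapplied migrations in file-name order; return the names applied now.

    Expects a connection opened with isolation_level=None so that the explicit
    BEGIN/COMMIT/ROLLBACK statements below control the transaction.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS __migrations (name TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT name FROM __migrations")}
    ran: List[str] = []
    for name in sorted(migrations):
        if name in applied:
            continue  # already recorded, so re-running is a no-op
        conn.execute("BEGIN")
        try:
            conn.execute(migrations[name])
            conn.execute("INSERT INTO __migrations (name) VALUES (?)", (name,))
            conn.execute("COMMIT")
        except sqlite3.Error as exc:
            conn.execute("ROLLBACK")  # leave no partial state behind
            raise RuntimeError(f"migration failed: {name}") from exc
        ran.append(name)
    return ran


if __name__ == "__main__":
    db = sqlite3.connect(":memory:", isolation_level=None)
    files = {
        "001_init.sql": "CREATE TABLE templates (key TEXT PRIMARY KEY)",
        "002_add_created_by.sql": "ALTER TABLE templates ADD COLUMN created_by TEXT",
    }
    print(apply_pending(db, files))  # first run applies both
    print(apply_pending(db, files))  # second run applies nothing
```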

Seed sample templates

pnpm db:seed

This loads loan-approval-letter and sample-invoice into the configured store (postgres, sqlserver, or file mode). Idempotent — safe to re-run.

Prisma client generation

The Prisma client must be generated from the schema before building:

pnpm db:generate

This is a one-time step per deployment machine. The generated client is written to node_modules — it is not committed to source control.


4. Build and Start

Production build

# From repo root — builds all packages in dependency order via Turborepo
pnpm build

Output is compiled TypeScript written to each package’s dist/ directory.

Start the API

node apps/api/dist/index.js

Or, via pnpm filter:

pnpm --filter @pulp-engine/api start

Confirm the process is running:

curl http://localhost:3000/health
# { "status": "ok", "timestamp": "..." }

Process management

Use a process manager to keep the API running across crashes and server restarts:

# PM2 example
pm2 start apps/api/dist/index.js --name pulp-engine-api
pm2 save
pm2 startup   # generates a startup script for your init system

Alternatively, create a systemd unit file pointing to node apps/api/dist/index.js.


5. Reverse Proxy and Network Placement

For internal use, deploy behind a reverse proxy on a private network. The API requires an X-Api-Key header in production — set at least API_KEY_ADMIN (and optionally API_KEY_RENDER / API_KEY_PREVIEW / API_KEY_EDITOR) in the environment. Even with auth, do not expose port 3000 directly to the internet.

[Internal callers] → [nginx / IIS / API Gateway] → [Pulp Engine API :3000]

nginx example

server {
    listen 80;
    server_name pulp-engine.internal;

    location / {
        proxy_pass         http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header   Host $host;
        proxy_set_header   X-Real-IP $remote_addr;

        # PDF responses can be large; allow adequate timeout and buffer
        proxy_read_timeout     60s;
        proxy_buffers          16 256k;
        proxy_buffer_size      256k;
    }
}

Bind Pulp Engine itself to HOST=127.0.0.1 so it only accepts connections from the reverse proxy, not directly from the network.

HTTPS (TLS-terminating proxy) example

For production deployments with HARDEN_PRODUCTION=true, TLS termination at the reverse proxy is the expected topology. The proxy handles certificates and forwards X-Forwarded-Proto so Pulp Engine can enforce HTTPS on sensitive routes.

# Redirect HTTP → HTTPS
server {
    listen 80;
    server_name pulp-engine.example.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name pulp-engine.example.com;

    ssl_certificate     /etc/nginx/ssl/pulp-engine.crt;
    ssl_certificate_key /etc/nginx/ssl/pulp-engine.key;

    location / {
        proxy_pass         http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header   Host $host;
        proxy_set_header   X-Real-IP $remote_addr;
        proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto $scheme;

        # PDF responses can be large; allow adequate timeout and buffer
        proxy_read_timeout     60s;
        proxy_buffers          16 256k;
        proxy_buffer_size      256k;
    }
}

Set the following in the Pulp Engine .env to match this topology:

TRUST_PROXY=true       # Fastify reads X-Forwarded-Proto from the proxy
REQUIRE_HTTPS=true     # rejects editor-token login over plain HTTP
HOST=127.0.0.1         # accept connections from the proxy only

For the bundled same-origin editor deployment (/editor/ on the same host), CORS_ALLOWED_ORIGINS is not required for the editor to reach the API — the browser treats these as same-origin requests. However, HARDEN_PRODUCTION=true requires the variable regardless; set it to your deployment URL (e.g. CORS_ALLOWED_ORIGINS=https://pulp-engine.example.com). HTTPS proxying and CORS configuration are independent concerns. See § Hardened Production Mode for the full hardened .env example.

Common mistake: Setting REQUIRE_HTTPS=true without TRUST_PROXY=true causes editor-token login requests to fail. Behind a TLS-terminating proxy the internal socket is plain HTTP; without TRUST_PROXY=true, Fastify cannot read X-Forwarded-Proto and sees every request as non-HTTPS.
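
The failure mode above can be modelled in a few lines. This is a simplified model of the behaviour, not Fastify's implementation; `effective_protocol` and `login_allowed` are hypothetical names, and multi-valued X-Forwarded-Proto headers are not handled.

```python
from typing import Mapping


def effective_protocol(socket_is_tls: bool, trust_proxy: bool,
                       headers: Mapping[str, str]) -> str:
    """Protocol the application believes the request used."""
    if trust_proxy:
        lowered = {name.lower(): value for name, value in headers.items()}
        forwarded = lowered.get("x-forwarded-proto", "").strip().lower()
        if forwarded:
            return forwarded  # the proxy's view wins only when it is trusted
    return "https" if socket_is_tls else "http"


def login_allowed(require_https: bool, protocol: str) -> bool:
    """Editor-token login is refused over plain HTTP when HTTPS is required."""
    return not require_https or protocol == "https"


if __name__ == "__main__":
    proxied = {"X-Forwarded-Proto": "https"}  # what a TLS-terminating proxy forwards
    for trust in (False, True):
        proto = effective_protocol(False, trust, proxied)
        print(f"TRUST_PROXY={trust}: protocol={proto}, login_allowed={login_allowed(True, proto)}")
```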


6. Logging and Operational Visibility

Log format

Fastify uses pino for structured logging.

| NODE_ENV | Log format | Default level |
| --- | --- | --- |
| production | JSON (one object per line) | info |
| development / anything else | Pretty-printed | info |

Set NODE_ENV=production in deployed environments so log aggregators (Datadog, Splunk, CloudWatch, etc.) can parse lines as JSON. Adjust LOG_LEVEL to control verbosity.

Key log fields

| Field | Description |
| --- | --- |
| level | info, warn, error |
| time | Unix milliseconds |
| reqId | Per-request ID assigned by Fastify |
| req.method, req.url | Route context |
| res.statusCode, responseTime | Response details |
| err.message, err.stack | Error context on failures |
| reason | Bounded failure reason — invalid_key, missing_key, insufficient_scope, invalid_token, template_not_found, render_error, version_conflict, duplicate_key, not_found |
| source | production or preview — render origin |
| outcome | success or failure — operation result |
| event | Audit event name — editor_token_minted, template_mutation, asset_mutation (v0.20.0+) |
| actor | Operator-supplied actor label from the session token, or null (v0.20.0+). Present on audit events only |
| credentialScope | admin or editor — which credential scope performed the write. Present on audit events only |

Audit log events (v0.20.0+): Template and asset write operations (POST /templates, PUT /templates/:key, DELETE /templates/:key, restore, asset upload, asset delete) emit structured log entries with event, operation, actor, credentialScope, and a resource identifier (templateKey or assetId). These fields are intentionally higher-cardinality than operational metrics and are designed for log aggregation queries — not Prometheus metric labels. actor: null means the write was performed via direct X-Api-Key auth or no actor label was supplied at login.
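
Where no log aggregator is available, the audit entries can be pulled out of the JSON log stream with a short script. The sketch below assumes one JSON object per line (the production log format) and relies only on the field names listed above; `audit_entries` and `writes_by_actor` are hypothetical helper names.

```python
import json
from collections import Counter
from typing import Dict, Iterable, Iterator

AUDIT_EVENTS = {"editor_token_minted", "template_mutation", "asset_mutation"}


def audit_entries(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield parsed log entries whose event field names an audit event."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip pretty-printed or truncated lines
        if isinstance(entry, dict) and entry.get("event") in AUDIT_EVENTS:
            yield entry


def writes_by_actor(lines: Iterable[str]) -> Counter:
    """Count audit entries per actor; a null actor is grouped under '(none)'."""
    return Counter(entry.get("actor") or "(none)" for entry in audit_entries(lines))
```

For example, `writes_by_actor(open("api.log"))` would summarise a log file, where `api.log` is a placeholder path.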

Cluster-upgrade note (v0.20.0): In multi-instance deployments, upgrade all API instances to v0.20.0 before operators begin supplying actor labels at login. Using the 3-part token format (omitting actor in the login form) during a mixed-version rollout window is safe.

Health endpoints

GET /health — liveness probe. Always returns 200 { "status": "ok", "version": "0.51.0", "timestamp": "..." } if the process is running. No dependency checks. Use as a load-balancer liveness check.

GET /health/ready — readiness probe. Verifies that storage, asset binary store, and renderer are each reachable within 2 seconds (checks run in parallel). In API-only mode (no render dispatcher, preview disabled), the renderer check always reports "ok".

  • 200 { "status": "ok", "version": "...", "checks": { "storage": "ok", "assetBinaryStore": "ok", "renderer": "ok" } } — all subsystems reachable
  • 503 { "status": "degraded", "version": "...", "checks": { "storage": "error", "assetBinaryStore": "ok", "renderer": "ok" } } — one or more subsystems unreachable

Use GET /health/ready as your Kubernetes readinessProbe to prevent traffic from reaching the pod before all subsystems are available, and to drain traffic during outages. A 503 means at least one check returned "error" or "timeout": storage — template/metadata store unreachable (check database connectivity or file system access); assetBinaryStore — binary asset store unreachable (check file system or S3 connectivity); renderer — Chromium browser or render dispatcher unresponsive (check browser process or container worker). Any single failing subsystem causes 503; validate-deploy.sh treats this as a hard failure.
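
For scripted deploy checks, the readiness payload can be reduced to a list of failing subsystems. A minimal sketch, assuming the response body has already been fetched and parsed as JSON; `failing_subsystems` is a hypothetical helper, not part of validate-deploy.sh.

```python
from typing import Dict, List


def failing_subsystems(payload: Dict) -> List[str]:
    """Names of readiness checks that reported anything other than "ok"."""
    checks = payload.get("checks", {})
    return sorted(name for name, state in checks.items() if state != "ok")


if __name__ == "__main__":
    degraded = {
        "status": "degraded",
        "checks": {"storage": "error", "assetBinaryStore": "ok", "renderer": "ok"},
    }
    print(failing_subsystems(degraded))
```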

Preview capability — GET /render/preview/status

Returns the cached startup capability state of the preview renderer. This endpoint is always registered (regardless of PREVIEW_ROUTES_ENABLED) and requires editor, preview, or admin credentials.

{ "available": true, "reason": null }
{ "available": false, "reason": "routes_disabled" }
{ "available": false, "reason": "browser_unavailable" }

Always returns HTTP 200. The available field is the signal; reason describes why when unavailable.

| reason | Meaning | Action |
| --- | --- | --- |
| (absent / null) | Preview is available. Chromium launched successfully at startup. | |
| routes_disabled | PREVIEW_ROUTES_ENABLED is not set; preview render routes are not registered. | Set PREVIEW_ROUTES_ENABLED=true to enable, or leave disabled (default and recommended production posture). |
| browser_unavailable | Chromium failed to launch at startup. Routes are registered but cannot serve requests. | Check system libraries (see § 1 Runtime Requirements). Check the API startup logs for the raw browser error. Set PULP_ENGINE_DISABLE_SANDBOX=true if running in a container. |

Important: this is a startup snapshot, not a perpetual guarantee. available: true means Chromium launched when the process started. Individual render failures after startup (bad template, out-of-memory, etc.) travel the normal 422/500 response path and are not reflected here.

Diagnostic use:

curl -H "X-Api-Key: $ADMIN_KEY" http://localhost:3000/render/preview/status

Audit events — GET /audit-events

All template mutations, asset mutations, and editor token mints are persisted to the same database as templates and assets (no additional infrastructure required). Query the audit trail with GET /audit-events (admin scope required). See the API guide for filter parameters.

Audit events can be purged via DELETE /audit-events?before=<ISO 8601> (admin scope). See the runbook for retention cron examples.

Metrics — GET /metrics

Returns Prometheus text format 0.0.4. Rate limiting is disabled for this endpoint. When METRICS_TOKEN is set, requests must carry Authorization: Bearer <token> — see § Environment Variables. When METRICS_TOKEN is not set, the endpoint is unauthenticated (backward-compatible default). In either case, restrict access at the network layer in production.

Scrape config example (Prometheus / Grafana Agent):

scrape_configs:
  - job_name: pulp-engine
    static_configs:
      - targets: ['localhost:3000']
    metrics_path: /metrics

Key application metrics:

| Metric | What to alert on |
| --- | --- |
| pulp_engine_render_requests_total{status="failure"} | Render failure rate spike |
| pulp_engine_auth_failures_total{reason="invalid_key"} | Potential credential scanning |
| pulp_engine_template_mutations_total{status="conflict"} | Elevated optimistic-concurrency conflicts |
| pulp_engine_http_request_duration_seconds{route="render_pdf"} | PDF render latency SLO |
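
For an ad-hoc check without a Prometheus server, a counter can be summed straight from the text exposition. The sketch below is a deliberately small parser, not a full implementation of the format: escape sequences in label values are not decoded, and `sum_samples` is a hypothetical helper name.

```python
import re

_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def sum_samples(exposition: str, metric: str, **wanted: str) -> float:
    """Sum every sample of `metric` whose labels include all of `wanted`."""
    total = 0.0
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if not match or match.group(1) != metric:
            continue
        labels = dict(_LABEL.findall(match.group(2) or ""))
        if all(labels.get(key) == value for key, value in wanted.items()):
            total += float(match.group(3))
    return total
```

Calling `sum_samples(text, "pulp_engine_render_requests_total", status="failure")` on two successive scrapes and taking the difference gives the failure count for that interval.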

Network access: /metrics is not secret but is intended for internal scrapers only. Restrict at the network layer:

# nginx — allow scraper IP, deny all others
location /metrics {
    allow 10.0.0.50;   # Prometheus / Grafana Agent IP
    deny  all;
    proxy_pass http://127.0.0.1:3000;
}

7. Backup and Recovery

Postgres mode

Templates are stored as JSONB in PostgreSQL (template_versions.definition).

What to back up

The entire database. Pulp Engine’s durable state spans 14 tables — alongside templates / template_versions / assets, the database holds the tenant registry, editor users, template labels and sample data, audit events, render usage, schedules + executions + their delivery DLQ, and async-batch jobs + their webhook DLQ. A table-filtered dump silently drops scheduled deliveries, audit history, user identities, and label promotions — restore from one and they are gone.

| Item | Contents | Criticality |
| --- | --- | --- |
| Full database (pg_dump, no --table filters) | All 14 durable tables — see schema.prisma | High |
| ASSETS_DIR filesystem directory (or the S3 bucket) | Uploaded image binaries | Medium — loss means broken image refs in PDFs |

Note: The database and the asset binaries must be backed up together — an assets record without its file will produce broken image references, and a file without a record will be orphaned.
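
After a restore rehearsal, the two failure cases in the note above can be checked by comparing what the database references against what is on disk. A minimal sketch: how asset records map to file names is an assumption here, so both inputs are taken as plain collections of names that you supply, and `asset_drift` is a hypothetical helper.

```python
from typing import Iterable, List, Tuple


def asset_drift(recorded: Iterable[str], on_disk: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (records without a file, files without a record)."""
    recorded_set, disk_set = set(recorded), set(on_disk)
    missing_files = sorted(recorded_set - disk_set)   # would render as broken images
    orphaned_files = sorted(disk_set - recorded_set)  # wasted space, no references
    return missing_files, orphaned_files
```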

# Daily full-database logical backup (custom format, restorable via pg_restore)
pg_dump --format=custom --no-owner --no-privileges \
  --file="pulp-engine-$(date -u +%Y%m%dT%H%M%SZ).dump" \
  "$DATABASE_URL"

# Sync asset files to a backup location
rsync -a /var/pulp-engine/assets/ /backups/pulp-engine-assets/

Restore: follow the backup & restore runbook § 3 — the canonical sequence is stop API → restore binaries → drop/recreate the database (quoted identifiers) → pg_restore → prisma migrate deploy → start API → verify. Restoring into a live database with the API running is not supported. The sequence is rehearsed in CI by scripts/restore-rehearsal.sh.

Source-of-truth for templates

The canonical template definitions are the JSON files in templates/ in the repository. If the database is lost, re-run pnpm db:seed to restore the seed templates. Custom templates added post-seed are only in the database — back them up via pg_dump.

Recommendation: Before adding custom templates through the API in production, export each template via GET /templates/:key and commit the JSON to source control.

SQL Server mode

Templates are stored in the template_versions.definition column (NVARCHAR(MAX) JSON).

What to back up

Same principle as postgres mode: the full database (the T-SQL BACKUP DATABASE below already is full-database), plus ASSETS_DIR.

# Full database backup (T-SQL, run on the SQL Server host or via sqlcmd)
BACKUP DATABASE [pulp-engine] TO DISK = '/backups/pulp-engine.bak' WITH FORMAT, COMPRESSION;

# Sync asset files
rsync -a /var/pulp-engine/assets/ /backups/pulp-engine-assets/

For Azure SQL, use the built-in automated backup (configurable retention) plus blob storage for asset files.

Source-of-truth for templates

Same as postgres mode: canonical definitions are in templates/ in the repository. Custom templates created post-seed are only in the database — export via GET /templates/:key and commit to source control.

File mode

Template state lives in TEMPLATES_DIR (one folder per template key). The JSON files in that directory are the store — edits made through the API write back to TEMPLATES_DIR automatically; no seed step is required.

Back up TEMPLATES_DIR like any other important filesystem data:

rsync -a /var/pulp-engine/templates/ /backups/pulp-engine-templates/

ASSETS_DIR must be backed up in both binary modes: it always holds the asset metadata index (.assets-index.json), and when ASSET_BINARY_STORE=filesystem (the default) it holds the binaries too. When ASSET_BINARY_STORE=s3, binaries live in your S3 bucket — back the bucket up per the runbook § 2.2, and still back up ASSETS_DIR for the index.

Note: File mode is designed for single-instance deployments. Concurrent writes from multiple API processes to the same TEMPLATES_DIR are not supported.


8. Credential Migration (API_KEY → scoped keys)

If you have an existing deployment using the single API_KEY, follow these steps to migrate to the scoped credential model without downtime.

Steps

  1. Generate new secrets for each scope you need:

    # Example: generate with openssl
    openssl rand -hex 32   # for API_KEY_ADMIN
    openssl rand -hex 32   # for API_KEY_RENDER (optional — only if needed)
  2. Update .env to use the new keys. Do not set API_KEY alongside the new keys — the server rejects ambiguous configs at startup:

    # Before:
    API_KEY=my-old-secret
    
    # After:
    API_KEY_ADMIN=my-new-admin-secret
    API_KEY_RENDER=my-new-render-secret   # optional
  3. Update callers to send the appropriate scoped key:

    • Integrations that only render documents → use API_KEY_RENDER
    • Visual editor → set API_KEY_EDITOR on the API server; operators enter this value in the editor’s login form (no frontend env var required)
    • Everything else → use API_KEY_ADMIN
  4. Restart the API process and verify startup completes without errors.

  5. Verify each integration still works by running a health check, then a representative request.

If you need a safe migration window

The legacy API_KEY is still accepted (treated as admin scope) with a deprecation warning logged at startup. You can leave it in place while updating callers, then remove it once all callers have been switched. You cannot run both the old key and the new keys simultaneously — that combination is rejected.
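
A pre-flight check in a deploy script can catch the rejected combination before the restart. The sketch below encodes only the two rules stated in this guide (legacy and scoped keys cannot be mixed; production needs at least API_KEY_ADMIN). `key_config_problems` and its messages are hypothetical and are not the server's actual startup output.

```python
from typing import List, Mapping

SCOPED_KEYS = ("API_KEY_ADMIN", "API_KEY_RENDER", "API_KEY_PREVIEW", "API_KEY_EDITOR")


def key_config_problems(env: Mapping[str, str]) -> List[str]:
    """Describe credential settings that this guide says will not work."""
    legacy = bool(env.get("API_KEY"))
    scoped = [name for name in SCOPED_KEYS if env.get(name)]
    problems: List[str] = []
    if legacy and scoped:
        problems.append("API_KEY is set alongside scoped keys: " + ", ".join(scoped))
    if not legacy and "API_KEY_ADMIN" not in scoped:
        problems.append("API_KEY_ADMIN is not set")
    return problems
```

Passing `os.environ` checks the current process environment; an empty list means neither rule is violated.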


Named-User Mode (EDITOR_USERS_JSON)

v0.23.0+ — applies only to the visual editor login gate. Programmatic X-Api-Key access is unaffected.

Named-user mode replaces the shared API_KEY_EDITOR login with per-user personal keys. Each team member has a unique credential, a verified identity, and an optional role. Actor attribution in audit logs becomes server-derived (not caller-asserted).

User registry format

EDITOR_USERS_JSON must be a JSON array. Set it as a single-line environment variable:

[
  { "id": "alice",  "displayName": "Alice Smith",  "key": "alice-unique-secret",  "role": "editor" },
  { "id": "bob",    "displayName": "Bob Jones",    "key": "bob-unique-secret",    "role": "admin"  },
  { "id": "carol",  "displayName": "Carol Wu",     "key": "carol-secret",         "role": "editor",
    "tokenIssuedAfter": "2026-03-25T12:00:00Z" }
]

| Field | Required | Description |
| --- | --- | --- |
| id | Yes | URL-safe unique identifier. Used as the actor value in tokens and audit records. |
| displayName | Yes | Human-readable name shown in the editor toolbar identity pill. |
| key | Yes | Personal login credential (like a personal API key). Must be unique; must not match API_KEY_ADMIN, API_KEY_EDITOR, or other API keys. |
| role | Yes | "editor" or "admin". Editor scope: template/asset management and preview. Admin scope: also restore and delete operations. |
| tokenIssuedAfter | No | ISO-8601 UTC timestamp. Rejects tokens with iat < tokenIssuedAfter for this user only — invalidates that user’s sessions without affecting others. |

Startup validation: The API exits immediately if EDITOR_USERS_JSON is set but malformed, contains duplicate id or key values, contains invalid role values, or if any user key collides with an active API key (API_KEY_ADMIN, API_KEY_EDITOR, API_KEY_RENDER, API_KEY_PREVIEW).
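
Because a bad registry stops the API at startup, it can be worth linting the value before deploying. The sketch below mirrors the checks listed above under stated assumptions: `registry_problems` and its messages are hypothetical, required fields are assumed to be non-empty strings, and the server's own validation may be stricter.

```python
import json
from typing import List, Mapping

API_KEY_VARS = ("API_KEY_ADMIN", "API_KEY_EDITOR", "API_KEY_RENDER", "API_KEY_PREVIEW")
REQUIRED = ("id", "displayName", "key", "role")
ROLES = {"editor", "admin"}


def registry_problems(raw: str, env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems found in an EDITOR_USERS_JSON value."""
    try:
        users = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(users, list):
        return ["must be a JSON array"]
    active_keys = {env[name] for name in API_KEY_VARS if env.get(name)}
    seen_ids, seen_keys, problems = set(), set(), []
    for index, user in enumerate(users):
        if not isinstance(user, dict):
            problems.append(f"entry {index}: not an object")
            continue
        missing = [f for f in REQUIRED if not isinstance(user.get(f), str) or not user[f]]
        if missing:
            problems.append(f"entry {index}: missing {', '.join(missing)}")
            continue
        if user["id"] in seen_ids:
            problems.append(f"entry {index}: duplicate id {user['id']}")
        if user["key"] in seen_keys:
            problems.append(f"entry {index}: duplicate key")
        if user["role"] not in ROLES:
            problems.append(f"entry {index}: invalid role {user['role']}")
        if user["key"] in active_keys:
            problems.append(f"entry {index}: key collides with an API key")
        seen_ids.add(user["id"])
        seen_keys.add(user["key"])
    return problems
```

Once the list is clean, `json.dumps(users, separators=(",", ":"))` produces the single-line form suitable for an environment variable.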

Mode behaviour

| | Named-user mode (EDITOR_USERS_JSON set) | Shared-key mode (default) |
| --- | --- | --- |
| Login credential | Personal user key | API_KEY_EDITOR (or API_KEY_ADMIN) |
| Shared key accepted for login | No — 401 “Use your personal user key to log in” | Yes |
| Actor identity | Server-derived from user registry (user.id) | Caller-supplied at login (optional) |
| Editor role | user.role (editor or admin) | editor for API_KEY_EDITOR; admin for API_KEY_ADMIN |
| displayName in editor toolbar | Yes — from user registry | No |
| Programmatic X-Api-Key | Unaffected | Unaffected |

Migrating from shared-key mode

  1. Generate a personal key for each team member:

    openssl rand -hex 32   # repeat for each user
  2. Build the user registry in EDITOR_USERS_JSON. Assign role: "admin" to users who need restore/delete access; everyone else gets role: "editor".

  3. Force all outstanding sessions to expire by setting EDITOR_TOKEN_ISSUED_AFTER to the current UTC time (prevents shared-key sessions from continuing after the cutover):

    EDITOR_TOKEN_ISSUED_AFTER=2026-03-24T12:00:00Z
  4. Restart the API. The startup log confirms identityMode: named-users and the count of registered users. Any configuration errors cause an immediate exit with a descriptive message.

  5. Distribute personal keys out-of-band (do not send via the same channel as login instructions). Editors re-login with their personal key.

  6. Run the database migration to add createdBy audit columns to template_versions and assets:

    Postgres:

    pnpm --filter @pulp-engine/api db:deploy
    # applies: add_created_by (additive nullable columns, safe for existing data)

    SQL Server:

    pnpm --filter @pulp-engine/api db:migrate:sqlserver
    # applies: 002_add_created_by (additive nullable columns, safe for existing data)

    Run before starting the API. The migration runner applies all pending migrations automatically.

    File mode: no migration needed.

  7. Remove API_KEY_EDITOR from the environment if only the editor uses it (all editor logins now go through the user registry). Keep it if any non-editor integrations depend on it for direct X-Api-Key access.

Per-user revocation

Two per-user mechanisms are available, alongside the existing global EDITOR_TOKEN_ISSUED_AFTER setting. Each requires an API restart to take effect.

| Mechanism | Effect | How |
| --- | --- | --- |
| Remove user from registry + restart | Blocks new logins and immediately invalidates active sessions for that user | Remove entry, restart |
| Set tokenIssuedAfter on user + restart | Invalidates sessions with iat < tokenIssuedAfter for that user only; newer sessions and other users are unaffected | Add "tokenIssuedAfter": "<iso-ts>", restart |
| Global EDITOR_TOKEN_ISSUED_AFTER | Invalidates all editor sessions (all users) | Existing mechanism, unchanged |

Example — revoking a single user:

{ "id": "alice", "displayName": "Alice Smith", "key": "alice-unique-secret", "role": "editor",
  "tokenIssuedAfter": "2026-03-24T14:00:00Z" }

Tokens issued before 2026-03-24T14:00:00Z for alice are rejected. Tokens issued after that timestamp (new logins) are accepted. Other users are unaffected.
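
The cutoff rule can be stated precisely as a comparison on the token's iat. The sketch below is an illustration rather than the server's code: `token_accepted` is a hypothetical name, and it assumes the global EDITOR_TOKEN_ISSUED_AFTER cutoff uses the same iat < cutoff comparison as the per-user field.

```python
from datetime import datetime
from typing import Optional


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO-8601 UTC timestamp such as 2026-03-24T14:00:00Z."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def token_accepted(iat: datetime, user_cutoff: Optional[str] = None,
                   global_cutoff: Optional[str] = None) -> bool:
    """Reject a token issued before either the per-user or the global cutoff."""
    for cutoff in (user_cutoff, global_cutoff):
        if cutoff is not None and iat < parse_utc(cutoff):
            return False
    return True


if __name__ == "__main__":
    cutoff = "2026-03-24T14:00:00Z"
    print(token_accepted(parse_utc("2026-03-24T13:59:59Z"), user_cutoff=cutoff))  # old session
    print(token_accepted(parse_utc("2026-03-24T14:05:00Z"), user_cutoff=cutoff))  # fresh login
```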

Rolling back to shared-key mode

Remove EDITOR_USERS_JSON from the environment and restart. The editor immediately returns to shared-key mode — no data changes are needed.


9. Known Production Risks

| Risk | Likelihood | Mitigation |
| --- | --- | --- |
| Puppeteer cold start (~2–3 s) | Low — singleton browser is reused after first request. Status: Resolved | The browser is now warmed at server startup: warmBrowser() (in-process) or the child-process dispatcher’s warmup() IPC pre-launches Chromium before the server accepts requests. The ~2–3 s cost moves to boot time; first PDF request is fast. /health/ready reflects warmup success. Container/socket modes are ephemeral and unaffected. |
| Render-time memory | Low for current template sizes | Both PDF routes stream via createPDFStream() — Node.js PDF buffer eliminated. Chrome render-time and HTML render memory still apply; monitor RSS on large documents. |
| Credential leaked or rotated | Low — shared secrets via env vars | Rotate the relevant scoped key via env var restart. For near-zero-downtime rotation of API_KEY_EDITOR or API_KEY_ADMIN (preserving existing editor sessions): set the old key as API_KEY_EDITOR_PREVIOUS / API_KEY_ADMIN_PREVIOUS and the new key as the active key, then restart — see runbook.md § Auth secret rotation. To invalidate only editor sessions without rotating API_KEY_EDITOR, set EDITOR_TOKEN_ISSUED_AFTER to the current UTC time and restart. |
| Editor session token compromised | Low — tokens are short-lived | Reduce EDITOR_TOKEN_TTL_MINUTES for sensitive environments. Set EDITOR_TOKEN_ISSUED_AFTER to invalidate all outstanding sessions without key rotation (requires restart). |
| Schema drift between dev and CI (postgres mode only) | Low — schema is stable | migrate deploy (CI/prod) and migrate dev (local) both apply the same committed SQL; keep prisma/migrations/ committed after any migrate dev run |
| Chromium missing system libraries | Medium on minimal Linux images | Test Puppeteer launch on the target OS before deploying; run node -e "require('puppeteer').launch()" to verify |
| Asset files not backed up | Low but high impact | Back up ASSETS_DIR alongside the assets table; loss of files causes broken image references in rendered PDFs |
| Multi-instance asset divergence (DB modes) | Medium — easy to overlook when scaling out | Running multiple API instances with ASSET_BINARY_STORE=filesystem (default) and no shared volume means each instance stores binaries locally. Assets uploaded by one instance return 404 when served by another. Mitigation: either mount a shared NFS/network volume as ASSETS_DIR on all instances, or switch to ASSET_BINARY_STORE=s3 — S3 mode eliminates the shared-volume requirement entirely. See § Object Storage. |
| Preview routes accessible in production | None — routes return 404 by default | PREVIEW_ROUTES_ENABLED absent is the safe default. If set to true, preview routes are active; pair with network-layer restrictions and confirm posture at deployment. |
| Multi-instance rate-limit multiplication | Medium — easy to overlook when scaling out | Rate limiting defaults to in-memory counters per API process. In multi-instance deployments, set RATE_LIMIT_STORE=redis with a shared REDIS_URL for consistent per-IP enforcement across the cluster. For public-facing deployments, edge rate limiting at the reverse proxy layer (nginx limit_req, HAProxy, Traefik, API gateway) remains the recommended primary throttle — Redis provides consistent internal enforcement, not a replacement for edge protection. |

Production security checklist (v0.25.0)

Seven security controls are enforced by default when NODE_ENV=production (plus two conditional OIDC controls — OIDC_REDIRECT_URI https and an explicit OIDC_EDITOR_GROUPS — when OIDC is enabled). Set HARDEN_PRODUCTION=false to disable enforcement for evaluation.

| Surface | Default | Risk | Recommended action |
| --- | --- | --- | --- |
| CORS | Allow all origins | Browser-originated cross-site requests accepted from any origin | Set CORS_ALLOWED_ORIGINS to a comma-separated list of trusted origins (e.g. https://editor.example.com) |
| Swagger UI (/docs*) | Public | Exposes full API surface and schema to anonymous callers | Set DOCS_ENABLED=false if the interactive docs are not needed in production |
| Prometheus metrics (/metrics) | Open, no auth | Exposes request rates, timing, and error counters to any caller | Set METRICS_TOKEN to require bearer token authentication, and/or restrict at the network layer |
| Editor login (POST /auth/editor-token) | Accepts HTTP | Raw API_KEY_EDITOR transmitted in plaintext over non-TLS connections | Set TRUST_PROXY=true and REQUIRE_HTTPS=true when the API is behind a TLS-terminating reverse proxy |
| Remote resource fetching | Allowed (public hosts) | Render pipeline can fetch arbitrary external resources during PDF generation | Set BLOCK_REMOTE_RESOURCES=true; optionally configure ALLOWED_REMOTE_ORIGINS for trusted font/image CDNs |
| Editor identity | Shared-key (anonymous) | No per-user audit trail for template changes | Set EDITOR_USERS_JSON for named-user identity (recommended), or ALLOW_SHARED_KEY_EDITOR=true to explicitly acknowledge shared-key mode |

Startup warnings: When HARDEN_PRODUCTION=false is explicitly set in NODE_ENV=production, the API emits a consolidated [PulpEngine] warning at startup listing every unconfigured control. These do not block startup — they inform operators that enforcement is disabled. Example:

[PulpEngine] HARDEN_PRODUCTION=false — production security enforcement is disabled.
The following controls are not configured:
   • CORS_ALLOWED_ORIGINS must be set to a comma-separated list of specific trusted origins ...
   • METRICS_TOKEN must be set to protect GET /metrics with bearer authentication ...
   Remove HARDEN_PRODUCTION=false to restore default production enforcement.

Hardened Production Mode

On by default in production. When NODE_ENV=production (the Docker default), hardening is enforced automatically — the server fails to start unless all seven security controls are configured. When OIDC/SSO is enabled, two conditional controls are also enforced: OIDC_REDIRECT_URI must be https:// and OIDC_EDITOR_GROUPS must be set explicitly. Set HARDEN_PRODUCTION=false to explicitly opt out for evaluation. Accepted values: true, false, 1, 0, or unset (auto-derived from NODE_ENV).

Enforced rules:

| Control | Rule |
| --- | --- |
| CORS | CORS_ALLOWED_ORIGINS must be set to specific origins; wildcard * is rejected (it is equivalent to leaving CORS open) |
| Docs | DOCS_ENABLED must be explicitly set in the environment (false to disable, true to acknowledge exposure) |
| Metrics | METRICS_TOKEN must be configured |
| HTTPS | REQUIRE_HTTPS=true must be set |
| Proxy | TRUST_PROXY=true must be set when REQUIRE_HTTPS=true |
| Remote resources | BLOCK_REMOTE_RESOURCES=true must be set to prevent the render pipeline from fetching resources from arbitrary public hosts during PDF generation. Optionally set ALLOWED_REMOTE_ORIGINS for trusted font/image CDNs. |
| Named users | When editor login is capable (API_KEY_EDITOR, API_KEY_ADMIN, or API_KEY set), either EDITOR_USERS_JSON must be configured for per-user identity (recommended), or ALLOW_SHARED_KEY_EDITOR=true must be set to explicitly accept shared-key identity. |
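The seven rules above can be approximated in a pre-flight script that an operator runs before starting the server. The following is a minimal sketch, not the server's own validation: the function name hardening_violations and its messages are hypothetical, it only treats the literal value true as enabled, and it does not cover the conditional OIDC controls.

```shell
# Hypothetical pre-flight check, not part of Pulp Engine. It reads the same
# environment variables as the rules above and prints one line per unmet rule.
# Messages are illustrative; they are not the server's actual output.
hardening_violations() (
  origins=$(printf '%s' "${CORS_ALLOWED_ORIGINS:-}" | tr -d ' ')
  if [ -z "$origins" ]; then
    echo "CORS_ALLOWED_ORIGINS is not set"
  else
    case ",$origins," in
      *,\*,*) echo "CORS_ALLOWED_ORIGINS must not contain the wildcard *" ;;
    esac
  fi
  [ -n "${DOCS_ENABLED:-}" ] || echo "DOCS_ENABLED is not set explicitly"
  [ -n "${METRICS_TOKEN:-}" ] || echo "METRICS_TOKEN is not set"
  if [ "${REQUIRE_HTTPS:-}" != "true" ]; then
    echo "REQUIRE_HTTPS is not true"
  elif [ "${TRUST_PROXY:-}" != "true" ]; then
    echo "TRUST_PROXY is not true (required with REQUIRE_HTTPS=true)"
  fi
  [ "${BLOCK_REMOTE_RESOURCES:-}" = "true" ] || echo "BLOCK_REMOTE_RESOURCES is not true"
  # The named-user rule applies only when an editor-capable key is configured.
  if [ -n "${API_KEY_EDITOR:-}${API_KEY_ADMIN:-}${API_KEY:-}" ] &&
     [ -z "${EDITOR_USERS_JSON:-}" ] &&
     [ "${ALLOW_SHARED_KEY_EDITOR:-}" != "true" ]; then
    echo "Set EDITOR_USERS_JSON or ALLOW_SHARED_KEY_EDITOR=true"
  fi
)

# Example usage: report every problem in one pass, then stop if any were found.
#   problems=$(hardening_violations); [ -z "$problems" ] || { echo "$problems"; exit 1; }
```

Like the server, the sketch collects every violation before reporting, so a single run shows everything that still needs configuring.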

All violations are collected first and reported together, so operators can fix everything in a single pass. Example failure output:

❌ HARDEN_PRODUCTION=true but required security controls are not configured:
   • CORS_ALLOWED_ORIGINS must be set to a comma-separated list of specific trusted origins ...
   • DOCS_ENABLED must be explicitly set. Use DOCS_ENABLED=false to disable the Swagger UI ...
Configure all required controls or unset HARDEN_PRODUCTION to disable enforcement.

Example hardened .env configuration:

# Security hardening — required when HARDEN_PRODUCTION=true
CORS_ALLOWED_ORIGINS=https://editor.example.com
DOCS_ENABLED=false
METRICS_TOKEN=<output of: openssl rand -hex 32>
REQUIRE_HTTPS=true
TRUST_PROXY=true
BLOCK_REMOTE_RESOURCES=true
# Named-user identity: configure EDITOR_USERS_JSON (recommended) or opt out:
# ALLOW_SHARED_KEY_EDITOR=true
HARDEN_PRODUCTION=true

Why wildcard CORS is rejected: CORS_ALLOWED_ORIGINS=* explicitly allows all browser origins — the same open posture as leaving it unset. Hardened mode requires specific, named origins so the trust boundary is enforced.

Why TRUST_PROXY is required with REQUIRE_HTTPS: Fastify derives request.protocol from the raw socket connection. Behind a TLS-terminating reverse proxy, the socket is plain HTTP; without TRUST_PROXY=true, Fastify cannot read X-Forwarded-Proto and will reject all editor-token logins regardless of the external connection. Setting TRUST_PROXY=true is also safe for direct-TLS deployments (no proxy) — it only affects behavior when X-Forwarded-Proto headers are present.

Deployment validation: When METRICS_TOKEN is configured, pass it as the 4th argument to validate-deploy.sh:

./scripts/validate-deploy.sh https://api.example.com "$ADMIN_KEY" "" "$METRICS_TOKEN"

HTTP security headers (@fastify/helmet)

The API registers @fastify/helmet to set baseline HTTP security headers on every response. CSP is route-aware because different surfaces have different requirements:

| Route prefix | Content-Security-Policy | Permissions-Policy |
| --- | --- | --- |
| /editor, /editor/* | default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; frame-src blob:; object-src blob:; img-src 'self' blob: data:; connect-src 'self' | clipboard-write=(self) |
| /docs* | Deferred to Swagger UI's static CSP; not set by the API | (none) |
| /render/html, /render/preview/html | Not set; the HTML renderer embeds a <meta> CSP (script-src 'none'; object-src 'none') inside the document | camera=(), microphone=(), geolocation=() |
| All other routes (API JSON, health, metrics) | default-src 'none' | camera=(), microphone=(), geolocation=() |

Additional headers set by Helmet:

| Header | Value | Notes |
| --- | --- | --- |
| Strict-Transport-Security | max-age=31536000; includeSubDomains | Only set when REQUIRE_HTTPS=true |
| X-Frame-Options | DENY | Prevents clickjacking for all routes |
| Referrer-Policy | strict-origin-when-cross-origin | |
| X-Content-Type-Options | nosniff | Helmet default |
| X-DNS-Prefetch-Control | off | Helmet default |

Cross-origin isolation policies (Cross-Origin-Resource-Policy, Cross-Origin-Embedder-Policy, Cross-Origin-Opener-Policy) are disabled. These headers are designed for SharedArrayBuffer isolation and would break cross-origin asset loading when the editor runs on a different port (development) or a separate domain.
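To spot-check the headers listed above against a running deployment, a small shell helper can compare a captured header dump with the expected values. This is a sketch under stated assumptions: has_header is a hypothetical name, https://api.example.com is a placeholder, and the comparison is a case-insensitive match of the whole header line.

```shell
# Hypothetical helper: succeed if the header dump in $1 contains "NAME: VALUE".
# The comparison is case-insensitive and must match the whole header line.
has_header() {
  printf '%s\n' "$1" | tr -d '\r' | grep -iqxF "$2: $3"
}

# Example usage against a live deployment (placeholder URL). curl's -D - dumps
# the response headers to stdout while -o /dev/null discards the body.
#   headers=$(curl -sS -o /dev/null -D - https://api.example.com/health)
#   has_header "$headers" X-Frame-Options DENY || echo "X-Frame-Options missing or different"
#   has_header "$headers" Content-Security-Policy "default-src 'none'" || echo "unexpected CSP"
```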

Deployment model — trusted authors

Pulp Engine is designed for trusted internal teams: developers, operators, and named template authors operating within a single organisation. Templates are rendered by a headless Chromium instance; Handlebars templates and helpers run server-side and should be authored by credentialled, trusted internal users, not anonymous public users.

Pulp Engine is not designed or marketed for hostile multi-tenant environments where arbitrary, untrusted users author templates. If your deployment model includes untrusted template authoring, apply additional sandboxing at the infrastructure level before exposing the editor to those users.

For production deployments, enabling HARDEN_PRODUCTION=true and configuring EDITOR_USERS_JSON (per-user named credentials) gives you enforced security controls and a full audit trail of which named user authored each template change.


10. Upgrading from file mode to database-backed storage

Upgrading the server version? Stored template definitions are format-compatible across upgrades (additive-only schema changes, enforced in CI by a frozen fixture corpus). See template-compatibility.md for the formatVersion semantics and how to pre-flight stored templates with pulp validate.

Use the db:migrate:file-to-db script to copy all template and asset metadata from a file-mode TEMPLATES_DIR into a postgres or sqlserver database. This is a one-time operator tool — it is not a sync engine and is not designed for ongoing replication.

Idempotency: skip, not merge

Records that already exist in the target database are skipped without modification. The script does not update or reconcile existing records with file-mode data. For a clean first migration, ensure the target database is empty. Running the script against a partially populated target inserts any missing records and skips any that already exist — by design.

Prerequisites

  1. Stop the API. The migration reads from TEMPLATES_DIR while the source is at rest. Running the API concurrently against the file-mode directory may produce inconsistent results.
  2. Apply the target schema first:
    • Postgres: pnpm --filter @pulp-engine/api db:deploy
    • SQL Server: pnpm --filter @pulp-engine/api db:migrate:sqlserver
  3. Use an empty target database for a clean migration. See idempotency note above.
  4. The API must have run at least once in file mode before migrating. The file-mode store auto-migrates older flat-layout template files ({key}.json at root) to the current folder layout ({key}/meta.json + {key}/versions/) on startup. If you skip this step, any flat-layout templates will not be visible to the migration script.
  5. Asset binaries stay on disk. The script migrates metadata only (not binary files). Ensure ASSETS_DIR in the target deployment points to the same directory as the file-mode ASSETS_DIR, or manually copy all binary files to the new ASSETS_DIR before cutover. See asset binary note below.

Environment variables

| Variable | Required | Value |
| --- | --- | --- |
| STORAGE_MODE | Yes | postgres or sqlserver (the target backend) |
| TEMPLATES_DIR | Yes | Path to the existing file-mode template directory (source) |
| ASSETS_DIR | Optional | Path to the existing file-mode assets directory. Asset binaries are not moved; only metadata from .assets-index.json is read. |
| DATABASE_URL | Yes (postgres) | Prisma connection string for the target database |
| SQL_SERVER_URL | Yes (sqlserver) | mssql connection URL for the target database |

Env-var precedence note: Shell environment variables override values from .env (dotenv-cli default, no --override). Pass STORAGE_MODE and the connection string explicitly in the shell when running the migration. The script prints the effective configuration at startup; use the dry run to verify that output before running the real migration. If .env still contains STORAGE_MODE=file and no shell override is provided, the script's configuration guard exits with code 1 and a clear error.

Steps

# 1. Stop the API

# 2. Dry run first — review what will be inserted and verify the startup lines
STORAGE_MODE=postgres \
  TEMPLATES_DIR=/var/pulp-engine/templates \
  ASSETS_DIR=/var/pulp-engine/assets \
  DATABASE_URL=postgresql://user:pass@host:5432/pulp-engine \
  pnpm --filter @pulp-engine/api db:migrate:file-to-db -- --dry-run
# Verify the startup log shows the correct storageMode, TEMPLATES_DIR, and ASSETS_DIR

# 3. Run the migration
STORAGE_MODE=postgres \
  TEMPLATES_DIR=/var/pulp-engine/templates \
  ASSETS_DIR=/var/pulp-engine/assets \
  DATABASE_URL=postgresql://user:pass@host:5432/pulp-engine \
  pnpm --filter @pulp-engine/api db:migrate:file-to-db

# 4. Review the summary — "Templates skipped" should be 0 on a clean first run
#    Watch for any warnings printed above the summary (exit code 2 = partial)

# 5. Update .env: set STORAGE_MODE=postgres (or sqlserver)
#    Remove TEMPLATES_DIR if it was only used for file mode

# 6. Start the API

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | All records inserted or idempotently skipped; no source-data warnings |
| 1 | Fatal error: bad configuration, DB connection failure, or malformed .assets-index.json |
| 2 | Partial success: some templates were skipped due to unreadable meta.json or version files; all other records were inserted |
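When the migration runs from a script or CI job, the exit code can gate the cutover. The sketch below maps the three documented codes to a short label. The name migration_outcome is hypothetical, and the usage comment assumes pnpm passes the script's exit code through unchanged.

```shell
# Hypothetical helper: map the migration script's exit code to a short label.
migration_outcome() {
  case "$1" in
    0) echo "complete" ;;  # all records inserted or idempotently skipped
    1) echo "fatal" ;;     # bad configuration, DB connection failure, or malformed .assets-index.json
    2) echo "partial" ;;   # some templates skipped; review the warnings before cutover
    *) echo "unknown" ;;   # not a documented exit code
  esac
}

# Example usage (environment variables set as in the Steps section):
#   pnpm --filter @pulp-engine/api db:migrate:file-to-db; code=$?
#   [ "$(migration_outcome "$code")" = "complete" ] || echo "do not cut over yet (exit $code)"
```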

Source-data error policy

| Problem | Behavior | Exit |
| --- | --- | --- |
| Unreadable / malformed meta.json | Skip that template | 2 |
| All version files unreadable | Skip that template | 2 |
| currentVersion not in readable version set | Skip that template (avoids a broken DB record) | 2 |
| Some version files unreadable, currentVersion readable | Migrate readable versions only | 2 |
| Malformed .assets-index.json | Fatal abort; fix the file before re-running | 1 |

Asset binaries are not moved

The script migrates metadata only. Asset binary files (images) are not copied or moved. They must be accessible at the ASSETS_DIR path configured for the target deployment.

  • Same directory: If the target deployment will use the same filesystem path as the source ASSETS_DIR, no action is needed — binaries are already in place.

  • Different directory or new server: Copy the entire contents of the source ASSETS_DIR to the new location before running the migration (and before starting the API), then verify the asset binary files are present. Missing binaries will cause broken image references in rendered PDFs after cutover.
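For the different-directory case, the copy and the follow-up verification can be scripted. This is a minimal sketch for a local or mounted destination: copy_assets is a hypothetical name, the paths in the usage comment are placeholders, and a copy to a remote server would need a different transfer tool.

```shell
# Hypothetical helper: copy asset binaries and confirm every source file arrived.
# Usage: copy_assets /old/assets /new/assets
copy_assets() (
  src=$1
  dst=$2
  [ -d "$src" ] || { echo "source directory not found: $src" >&2; exit 1; }
  mkdir -p "$dst" || exit 1
  dst=$(cd "$dst" && pwd) || exit 1   # absolute path, so the check below works after cd
  # "src/." copies the directory contents, including dotfiles; -p keeps modes and timestamps.
  cp -Rp "$src"/. "$dst"/ || exit 1
  # List any source file that is absent from the destination.
  missing=$(cd "$src" && find . -type f | while IFS= read -r f; do
    [ -f "$dst/$f" ] || printf '%s\n' "$f"
  done)
  if [ -n "$missing" ]; then
    printf 'missing after copy:\n%s\n' "$missing" >&2
    exit 1
  fi
)

# Example usage (placeholder paths):
#   copy_assets /var/pulp-engine/assets /mnt/new-assets && echo "assets copied and verified"
```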

Known limitations

  1. Template and version row IDs are not preserved. Templates and template versions receive new CUIDs (postgres) or UUIDs (SQL Server) in the target database. These IDs are internal and never appear in API responses or URLs.
  2. Asset IDs are preserved from the source .assets-index.json records.
  3. Pre-folder-layout templates (from very early file-mode versions) must be auto-migrated by the file-mode store first (run the API once before migrating).
  4. One direction only. There is no reverse migration path (DB → file mode).
  5. SQL Server updated_at precision: stored as DATETIME2(3) (millisecond precision). Sub-millisecond precision in the source timestamp is truncated.

Quick-reference checklist

  • Node 22–24 and pnpm 10.32.1 installed on the server
  • Chromium system dependencies installed (Linux only)
  • .env created from .env.example with STORAGE_MODE, HOST, PORT, NODE_ENV=production, API_KEY_ADMIN (and optionally API_KEY_RENDER / API_KEY_PREVIEW / API_KEY_EDITOR), ASSETS_DIR (absolute path), and ASSETS_BASE_URL
  • pnpm install completed

Postgres mode only (STORAGE_MODE=postgres or unset):

  • PostgreSQL accessible from the API process
  • DATABASE_URL set in .env
  • pnpm db:generate completed
  • pnpm db:deploy applied migrations to the production database
  • pnpm db:seed loaded sample templates
  • Daily full-database pg_dump scheduled (no --table filters — see § 7 Backup and Recovery and the backup & restore runbook)
  • If running multiple API instances with ASSET_BINARY_STORE=filesystem (default): ASSETS_DIR is on a shared volume accessible to all instances (not required when ASSET_BINARY_STORE=s3)

SQL Server mode only (STORAGE_MODE=sqlserver):

  • SQL Server (2019+) or Azure SQL accessible from the API process
  • SQL_SERVER_URL set in .env
  • pnpm db:generate completed (compiles Prisma types; no DB connection made)
  • pnpm --filter @pulp-engine/api db:migrate:sqlserver applied schema
  • pnpm db:seed loaded sample templates
  • Database backup scheduled
  • If running multiple API instances with ASSET_BINARY_STORE=filesystem (default): ASSETS_DIR is on a shared volume accessible to all instances (not required when ASSET_BINARY_STORE=s3)

File mode only (STORAGE_MODE=file):

  • TEMPLATES_DIR set in .env and directory contains at least one template JSON file
  • pnpm db:generate completed (compiles Prisma types; no DB connection made)
  • Both TEMPLATES_DIR and ASSETS_DIR included in your backup schedule (asset metadata + binaries live in ASSETS_DIR — see the runbook § 6)

All modes:

  • pnpm build succeeded
  • node apps/api/dist/index.js starts without errors
  • If deploying in a container without sandbox support: PULP_ENGINE_DISABLE_SANDBOX=true set in .env
  • GET /health returns 200 { "status": "ok", "version": "..." }
  • GET /health/ready returns 200 { "status": "ok", "checks": { "storage": "ok", "assetBinaryStore": "ok", "renderer": "ok" } }
  • Licence verified (licensed deployments): PULP_LICENCE_KEY set, the startup log does not say Running under Evaluation Licence — output will be watermarked, and the /health/ready licence block reports the licensed state — see licence-key-format.md
  • GET /metrics returns Prometheus text format (contains process_cpu_seconds_total)
  • API bound to 127.0.0.1 (or private IP), not exposed directly
  • Reverse proxy configured and restricting access to internal callers
  • /metrics endpoint restricted to scraper IPs at the network/proxy layer
  • Process manager (PM2 / systemd) configured for auto-restart
  • Preview route posture confirmed: PREVIEW_ROUTES_ENABLED absent (routes return 404 in production) or intentionally set to true with network restrictions in place
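The readiness item in the checklist above can be automated with a small smoke check. The sketch below validates a /health/ready response body against the documented shape. The name ready_body_ok is hypothetical, the URL is a placeholder, and the grep-based matching is a portability shortcut; a JSON-aware tool would be more robust.

```shell
# Hypothetical helper: succeed only if a /health/ready body reports "ok" for the
# overall status and for each documented check (storage, assetBinaryStore, renderer).
ready_body_ok() {
  for key in status storage assetBinaryStore renderer; do
    printf '%s\n' "$1" | grep -Eq "\"$key\"[[:space:]]*:[[:space:]]*\"ok\"" || return 1
  done
}

# Example usage (placeholder URL; -f makes curl fail on HTTP error statuses):
#   body=$(curl -fsS https://api.example.com/health/ready) && ready_body_ok "$body" \
#     || echo "readiness check failed"
```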