Release v0.67.0

Date: 2026-04-12

Theme

Phase C.0 Stage 2 — Multi-tenant mode.

This release lights up the tenant primitive that shipped behind the scenes in v0.66.0. An operator running Postgres can now set MULTI_TENANT_ENABLED=true, provision a super-admin credential, create real tenants via the new admin API, issue tenant-bound keys via API_KEYS_JSON, mint editor tokens carrying a signed tenant claim, archive a tenant, and have every data-access request correctly isolated at the store layer.

Zero-change upgrade for single-tenant deployments. The entire multi-tenant path is behind a single feature flag. Leave MULTI_TENANT_ENABLED unset (or false), restart, and the server behaves exactly like v0.66.0 — the 59-site tenant coercion rewrite preserves stage 1 semantics because resolveTenant() short-circuits to 'default' in single-tenant mode.

At a glance

| Area | What shipped | New env / API surface |
| --- | --- | --- |
| Feature flag | Master MULTI_TENANT_ENABLED gate | MULTI_TENANT_ENABLED (default false) |
| Credentials | Rich credential map with tenant binding | API_KEYS_JSON, API_KEYS_JSON_FILE, API_KEY_SUPER_ADMIN |
| Tenant CRUD | Super-admin-only admin routes | POST/GET/PATCH /admin/tenants, /:id/archive, /:id/unarchive, DELETE /:id?purge=true (→ 501) |
| Editor tokens | 5-part format with signed tenant claim | EditorTokenResponse.tenantId field |
| OIDC | Default tenant for OIDC sessions | OIDC_DEFAULT_TENANT (default 'default') |
| Archive lifecycle | Soft-archive with 409 on writes | TENANT_STATUS_CACHE_TTL_MS (default 10s), tenant_archive_rejections_total metric |
| Plugin system | Reject identity/storage plugins in MT mode | plugin-api 0.1.0 → 0.2.0 (render-hook context optional tenantId) |
| Binary store | Non-default tenants get {tenantId}/{uuid} prefix | /assets/* wildcard route |
| Error wire | code: 'tenant_unknown' / 'tenant_archived' | Optional ErrorResponseSchema.code field |
| Grep gate | Ban ?? 'default' in route/lib; allowlist shrunk | scripts/check-tenant-propagation.mjs |

What changed

Credential shape and super-admin scope

The credential map at auth.plugin.ts grew from Map<string, Scope> to Map<string, { scope, tenantId: string | null }>. Three env-var sources fill it:

  1. Legacy env vars (API_KEY_ADMIN, API_KEY_RENDER, API_KEY_PREVIEW, API_KEY_EDITOR) always bind to tenantId: 'default'. No change in behavior for single-tenant operators.
  2. API_KEYS_JSON / API_KEYS_JSON_FILE: a JSON array of [{ key, scope, tenantId }] where tenantId: string | null. Parsing produces redacted errors that never echo the key value. The two sources are mutually exclusive; set one or the other, not both.
  3. API_KEY_SUPER_ADMIN — single-string shortcut for { scope: 'admin', tenantId: null }. Super-admin credentials are the only ones allowed to operate on /admin/tenants/*. A tenant-bound admin credential hitting those routes gets 403 super_admin_only.
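The redacted-error rule for API_KEYS_JSON can be illustrated with a minimal sketch. The function name parseApiKeysJson, the Scope union (inferred from the legacy env-var names), the duplicate-key check, and the exact error wording are all assumptions for illustration, not the code in auth.plugin.ts. The point is that every error refers to the entry index and never to the key value.

```typescript
// Assumed scope set, inferred from the legacy API_KEY_* env-var names.
type Scope = "admin" | "render" | "preview" | "editor";

interface CredentialEntry {
  scope: Scope;
  tenantId: string | null; // null marks a super-admin credential
}

const SCOPES: readonly string[] = ["admin", "render", "preview", "editor"];

// Hypothetical parser: turns the API_KEYS_JSON payload into a credential map.
// Errors name the entry index only, so a key value can never leak into logs.
function parseApiKeysJson(raw: string): Map<string, CredentialEntry> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    // Drop the parser's own message: it may quote part of the input.
    throw new Error("API_KEYS_JSON is not valid JSON");
  }
  if (!Array.isArray(parsed)) {
    throw new Error("API_KEYS_JSON must be a JSON array");
  }
  const credentials = new Map<string, CredentialEntry>();
  parsed.forEach((item: unknown, index: number) => {
    const entry = (item ?? {}) as Record<string, unknown>;
    const key = entry["key"];
    const scope = entry["scope"];
    const tenantId = entry["tenantId"];
    if (typeof key !== "string" || key.length === 0) {
      throw new Error(`API_KEYS_JSON[${index}]: missing or empty "key"`);
    }
    if (typeof scope !== "string" || !SCOPES.includes(scope)) {
      throw new Error(`API_KEYS_JSON[${index}]: invalid "scope"`);
    }
    if (tenantId !== null && typeof tenantId !== "string") {
      throw new Error(`API_KEYS_JSON[${index}]: "tenantId" must be a string or null`);
    }
    if (credentials.has(key)) {
      throw new Error(`API_KEYS_JSON[${index}]: duplicate key`);
    }
    const boundTenant: string | null = typeof tenantId === "string" ? tenantId : null;
    credentials.set(key, { scope: scope as Scope, tenantId: boundTenant });
  });
  return credentials;
}
```

An entry with tenantId: null in this shape is equivalent to what API_KEY_SUPER_ADMIN provides as a shortcut.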

A super-admin credential operating on a data-access route (templates, assets, schedules, render, audit-events, schedule-dlq) must supply an X-PulpEngine-Tenant-Id header — otherwise resolveTenant() returns 400 tenant_required. The header is trimmed, lowercased, and validated against the slug regex before being passed to the store.

Stage 1 had 59 request.tenantId ?? 'default' sites across 12 files. Every one of them was a cross-tenant leak waiting to happen as soon as a super-admin credential (with tenantId: null) hit the codebase. Stage 2 introduces one helper at apps/api/src/lib/tenant-resolution.ts:

export async function resolveTenant(
  request: FastifyRequest,
  reply: FastifyReply,
): Promise<string | null>

The mechanical rewrite replaced every call site:

// Before
const tenantId = request.tenantId ?? 'default'
const result = await service.getByKey(tenantId, key)

// After
const tenantId = await resolveTenant(request, reply)
if (tenantId === null) return
const result = await service.getByKey(tenantId, key)

The helper handles the single-tenant default, the tenant-bound credential path, and the super-admin header path uniformly. For super-admin headers it runs assertKnown(headerValue) so a typo’d or deleted tenant fails with 403 tenant_unknown before any store call.
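To make the three paths concrete, here is a minimal decision sketch. It is a pure function rather than the Fastify-aware helper: the real resolveTenant() is async, writes the reply itself, and returns null on failure. The SLUG pattern is hypothetical (the actual slug regex is not shown in these notes), and the assertKnown step is only marked by a comment.

```typescript
type Resolution =
  | { ok: true; tenantId: string }
  | { ok: false; status: 400; code: "tenant_required" | "malformed-slug" };

// Hypothetical slug rule; the real regex is not given in these notes.
const SLUG = /^[a-z0-9][a-z0-9-]*$/;

function resolveTenantSketch(
  multiTenantEnabled: boolean,
  credentialTenantId: string | null, // null means a super-admin credential
  headerValue: string | undefined,   // X-PulpEngine-Tenant-Id
): Resolution {
  // Single-tenant mode short-circuits to 'default'.
  if (!multiTenantEnabled) return { ok: true, tenantId: "default" };

  // A tenant-bound credential already carries its tenant.
  // (How a stray header is treated on data routes is not specified here.)
  if (credentialTenantId !== null) return { ok: true, tenantId: credentialTenantId };

  // Super-admin: the header is mandatory on data-access routes.
  if (headerValue === undefined || headerValue.trim() === "") {
    return { ok: false, status: 400, code: "tenant_required" };
  }
  const slug = headerValue.trim().toLowerCase();
  if (!SLUG.test(slug)) return { ok: false, status: 400, code: "malformed-slug" };

  // The real helper would run assertKnown(slug) at this point,
  // failing with 403 tenant_unknown before any store call.
  return { ok: true, tenantId: slug };
}
```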

The grep gate now bans ?? 'default' in route/lib files (except the provider trio, tenant-resolution.ts, super-admin.ts, and a small allowlist of boundary files). Inline tenant-propagation-allow-default directives are the only remaining escape hatch, and each one requires a reviewer-visible comment.
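The core of such a gate can be sketched in a few lines. This is not the contents of scripts/check-tenant-propagation.mjs: the function name, the Violation shape, and the rule that the directive may sit on the same line or the line directly above are assumptions for illustration, and file-level allowlisting is left out.

```typescript
interface Violation {
  line: number; // 1-based line number
  text: string;
}

// Matches `?? 'default'` and `?? "default"` with any spacing after `??`.
const BANNED = /\?\?\s*['"]default['"]/;
const ALLOW_DIRECTIVE = "tenant-propagation-allow-default";

// Hypothetical scanner: reports banned coercions that carry no allow directive.
function findDefaultCoercions(source: string): Violation[] {
  const lines = source.split(/\r?\n/);
  const violations: Violation[] = [];
  lines.forEach((text, i) => {
    if (!BANNED.test(text)) return;
    const previous = lines[i - 1] ?? "";
    // Assumed placement rule: directive on this line or the one above.
    if (text.includes(ALLOW_DIRECTIVE) || previous.includes(ALLOW_DIRECTIVE)) return;
    violations.push({ line: i + 1, text: text.trim() });
  });
  return violations;
}
```

A CI wrapper would read each route/lib file, call the scanner, and exit non-zero when the returned list is non-empty.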

Editor token 5-part format (B1 blocker — verify-first)

The stage 1 verifyEditorToken had switch (parts.length) { case 2|3|4: ...; default: return null }, so a 5-part token was rejected unconditionally. Stage 2 adds case 5: BEFORE touching any mint-side code so partial rollouts don’t create an outage window.

New format: {iat}.{expiry}.{tenantId_b64url}.{actor_b64url}.{sig}

HMAC payload: editor:${iat}:${expiry}:${tenantId}:${actorRaw}

Backward-compat: 3-part and 4-part tokens still verify; they resolve to tenantId: 'default'. No forced re-auth on upgrade.

extractTokenActor(), the other load-bearing edit, is the pre-verification parse helper used by the named-user auth path. Stage 1 hardcoded a parts.length !== 4 rejection, so without the fix every named-user 5-part token would have been rejected at the first step of auth, before verifyEditorToken ran. Stage 2 supports both 4-part and 5-part tokens via parts.length - 2 indexing (the actor slot is always the second-to-last segment).
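The 5-part layout and the parts.length - 2 indexing can be sketched as follows. This is an illustration, not the shipped code: the SHA-256 digest, the base64url-encoded signature, the parameter order, and the verifyFivePart name are assumptions; only the segment order and the HMAC payload string come from the format above.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const b64url = (s: string): string => Buffer.from(s, "utf8").toString("base64url");
const unb64url = (s: string): string => Buffer.from(s, "base64url").toString("utf8");

// Assumption: HMAC-SHA256 with a base64url signature; the notes only say "HMAC".
function sign(secret: string, payload: string): string {
  return createHmac("sha256", secret).update(payload).digest("base64url");
}

// Mints {iat}.{expiry}.{tenantId_b64url}.{actor_b64url}.{sig}
function mintToken(secret: string, iat: number, expiry: number, tenantId: string, actorRaw: string): string {
  const sig = sign(secret, `editor:${iat}:${expiry}:${tenantId}:${actorRaw}`);
  return [String(iat), String(expiry), b64url(tenantId), b64url(actorRaw), sig].join(".");
}

// Pre-verification parse: the actor is the second-to-last segment
// in both the 4-part and the 5-part layout.
function extractTokenActor(token: string): string | null {
  const parts = token.split(".");
  if (parts.length !== 4 && parts.length !== 5) return null;
  const segment = parts[parts.length - 2];
  return segment ? unb64url(segment) : null;
}

// Verifies only the 5-part case; legacy layouts are out of scope for this sketch.
function verifyFivePart(secret: string, token: string, now: number): { tenantId: string; actor: string } | null {
  const parts = token.split(".");
  if (parts.length !== 5) return null;
  const [iat, expiry, tenantB64, actorB64, sig] = parts as [string, string, string, string, string];
  const tenantId = unb64url(tenantB64);
  const actor = unb64url(actorB64);
  const expected = Buffer.from(sign(secret, `editor:${iat}:${expiry}:${tenantId}:${actor}`));
  const given = Buffer.from(sig);
  // Constant-time comparison; lengths must match before timingSafeEqual.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  if (Number(expiry) <= now) return null;
  return { tenantId, actor };
}
```

Because the tenant claim is part of the signed payload, swapping the tenantId segment of a valid token invalidates the signature.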

The mintEditorToken signature grew a required tenantId: string parameter. All four mint call sites were updated: auth.ts named-users, auth.ts shared-key + super-admin, oidc.ts callback, oidc.ts exchange. Every mint path runs await app.tenantStatusCache.assertKnown(mintTenantId) immediately before the mint — the universal “last gate before mint” rule.

/auth/editor-token super-admin mint flow

/auth/* routes bypass the onRequest hook, so the auth-hook assertKnown check does NOT fire for mint routes. Stage 2 adds explicit assertKnown calls at every mint path.

The super-admin mint flow on /auth/editor-token:

| Branch | Behavior |
| --- | --- |
| Tenant-bound credential + header present | 400 (header not accepted when credential is tenant-bound) |
| Tenant-bound credential + no header | assertKnown(entry.tenantId) → mint with that tenant |
| Super-admin credential + header missing | 400 tenant_required |
| Super-admin credential + header slug-invalid | 400 malformed-slug |
| Super-admin credential + valid header + unknown tenant | 403 tenant_unknown (via global error handler) |
| Super-admin credential + valid header + active/archived tenant | mint with that tenant |

Archived tenants mint successfully. A freshly-minted session on an archived tenant succeeds on read routes (per the stage 1 soft-archive rule) and fails with 409 tenant_archived at the store-write boundary. This is deliberate — blocking mint on archive would contradict the audit/export access path.
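The mint branches can be restated as a small decision function. This is a sketch under assumptions: decideMint, the statusOf callback (standing in for the tenant status cache), the SLUG pattern, and the header_not_accepted code are hypothetical names, and header trimming and lowercasing on this route is assumed to mirror the data-access routes.

```typescript
type TenantStatus = "active" | "archived" | "unknown";

type MintDecision =
  | { mint: true; tenantId: string }
  | { mint: false; status: 400 | 403; code: string };

// Hypothetical slug rule; the real regex is not given in these notes.
const SLUG = /^[a-z0-9][a-z0-9-]*$/;

function decideMint(
  credentialTenantId: string | null, // null means a super-admin credential
  header: string | undefined,        // X-PulpEngine-Tenant-Id
  statusOf: (tenantId: string) => TenantStatus,
): MintDecision {
  let candidate: string;
  if (credentialTenantId !== null) {
    // Tenant-bound credentials may not override their tenant via the header.
    if (header !== undefined) return { mint: false, status: 400, code: "header_not_accepted" };
    candidate = credentialTenantId;
  } else {
    if (header === undefined) return { mint: false, status: 400, code: "tenant_required" };
    const slug = header.trim().toLowerCase();
    if (!SLUG.test(slug)) return { mint: false, status: 400, code: "malformed-slug" };
    candidate = slug;
  }
  // Last gate before mint: unknown tenants fail, archived tenants pass.
  if (statusOf(candidate) === "unknown") return { mint: false, status: 403, code: "tenant_unknown" };
  return { mint: true, tenantId: candidate };
}
```

Note that the archived case falls through to the mint: the write rejection happens later, at the store boundary.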

OIDC via completion codes + default tenant

The repo uses oidc_code= query-param completion codes, not redirect fragments. Stage 2 threads tenantId through the flow concretely:

  • CompletionTokenData at lib/oidc/completion-codes.ts grows a required tenantId: string field. Every completionCodeStore.create({...}) call site passes the resolved tenantId.
  • /oidc/complete response body carries tenantId through to the editor.
  • /oidc/exchange response body carries tenantId.
  • OIDC_DEFAULT_TENANT env var (default 'default') — auto-provisioned OIDC users pick this up as their tenantId. For stored users, user.tenantId wins. Per-group tenant binding deferred to C.0b.
  • Startup validation: if OIDC_ENABLED && MULTI_TENANT_ENABLED, the configured default tenant must exist and not be archived — throws at boot.
  • Mint-time recheck: the OIDC callback path runs assertKnown(mintTenantId) before the mint. Failure surfaces via the errorPage helper as an HTML page (not JSON) because the callback returns HTML.

TenantStatusCache: archive enforcement without Prisma middleware

The earlier-draft plan used a Prisma extension for archive enforcement. That approach hit two dead-ends:

  • Schedule engine + audit purge writes happen outside any HTTP request context, so a per-request cache can’t see them.
  • Prisma’s legacy $use middleware doesn’t fire reliably inside $transaction(...) callbacks (hazard B3).

Stage 2 drops the extension entirely. Every Postgres store write method calls await this.tenantStatusCache?.assertActive(tenantId, operation) as its first line, BEFORE any $transaction begins. The cache is a process-scoped Map<tenantId, {status, expiresAt}> with a 10-second default TTL (tunable via TENANT_STATUS_CACHE_TTL_MS). One DB query per cache miss. The shared instance is injected into both the Postgres stores and the schedule engine so non-HTTP paths hit the same cache.

Two distinct guards:

  • assertKnown(tenantId) — auth-boundary. Throws TenantUnknownError → 403 tenant_unknown on unknown tenants. Archived tenants PASS — reads on archived tenants are allowed per stage 1 soft-archive.
  • assertActive(tenantId, operation) — store-write-boundary. Throws TenantUnknownError → 403 on unknown (defensive), TenantArchivedError → 409 tenant_archived on archived. Increments tenant_archive_rejections_total{operation}.
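A minimal sketch of the two guards over a TTL cache follows. It makes several simplifying assumptions that differ from the shipped cache: the lookup is synchronous (the real one awaits a database query), unknown tenants are not cached, the rejection counter is a plain Map standing in for the Prometheus metric, and the class and field names are hypothetical.

```typescript
type TenantStatus = "active" | "archived";

class TenantUnknownError extends Error {
  readonly statusCode = 403;
  readonly code = "tenant_unknown";
}

class TenantArchivedError extends Error {
  readonly statusCode = 409;
  readonly code = "tenant_archived";
}

class TenantStatusCacheSketch {
  private readonly entries = new Map<string, { status: TenantStatus; expiresAt: number }>();
  // Stand-in for tenant_archive_rejections_total{operation}.
  readonly rejections = new Map<string, number>();
  private readonly lookup: (tenantId: string) => TenantStatus | null;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    lookup: (tenantId: string) => TenantStatus | null,
    ttlMs: number = 10_000,
    now: () => number = Date.now,
  ) {
    this.lookup = lookup;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  private statusOf(tenantId: string): TenantStatus | null {
    const hit = this.entries.get(tenantId);
    if (hit && hit.expiresAt > this.now()) return hit.status;
    const status = this.lookup(tenantId); // one lookup per cache miss
    if (status === null) {
      this.entries.delete(tenantId); // assumption: misses are not cached
      return null;
    }
    this.entries.set(tenantId, { status, expiresAt: this.now() + this.ttlMs });
    return status;
  }

  // Auth boundary: archived tenants pass, unknown tenants throw.
  assertKnown(tenantId: string): void {
    if (this.statusOf(tenantId) === null) {
      throw new TenantUnknownError(`Tenant "${tenantId}" is not known to this server.`);
    }
  }

  // Store-write boundary: unknown and archived tenants both throw.
  assertActive(tenantId: string, operation: string): void {
    const status = this.statusOf(tenantId);
    if (status === null) {
      throw new TenantUnknownError(`Tenant "${tenantId}" is not known to this server.`);
    }
    if (status === "archived") {
      this.rejections.set(operation, (this.rejections.get(operation) ?? 0) + 1);
      throw new TenantArchivedError(`Tenant "${tenantId}" is archived and rejects new writes. Reads still work.`);
    }
  }

  // Called after create / archive / unarchive on this process.
  invalidate(tenantId: string): void {
    this.entries.delete(tenantId);
  }
}
```

Injecting the clock makes the stale window easy to demonstrate: an unarchive is invisible to assertActive until either invalidate() runs or the TTL elapses.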

Terminal / observability writes intentionally skip the guard: audit.record, dlq.insert, dlq.markAbandoned, dlq.markOrphaned, execution.updateStatus. Archived tenants still get audit rows and DLQ terminal transitions — otherwise an archive mid-execution would leave orphan rows forever.

Schedule engine dispatch-time recheck: after findDueSchedules returns (which now JOINs tenants and excludes archived ones), the engine re-checks the cache per schedule before inserting an execution row. An archive happening between the query and the per-schedule dispatch still blocks in-flight work.

Cache invalidation: POST /admin/tenants, /:id/archive, /:id/unarchive all bust the relevant cache entry so subsequent requests observe the fresh state immediately on this pod. Other pods observe the change within TTL — documented as the stale window.

Wire contract for tenant_unknown / tenant_archived

Stage 2 adds an optional code field to ErrorResponseSchema. Existing clients that parse only error + message continue to work; SDKs and tests that need precise classification read code.

// tenant_unknown — 403 Forbidden
{ "error": "Forbidden", "code": "tenant_unknown", "message": "Tenant \"future\" is not known to this server." }

// tenant_archived — 409 Conflict
{ "error": "Conflict", "code": "tenant_archived", "message": "Tenant \"acme\" is archived and rejects new writes. Reads still work." }

The auth-hook catch path at auth.plugin.ts:450-453 is rewritten to derive the error label from err.statusCode (403 → Forbidden, 401 → Unauthorized) and pass through the optional code. The global error-handler in error-handler.plugin.ts gains parallel branches for AuthError (and its TenantUnknownError subclass) and TenantArchivedError so non-auth-hook throw sites (resolveTenant inside routes, /auth/editor-token mint) produce identical wire bodies. Defense-in-depth: both paths converge.
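The convergence of the two paths can be pictured as a single mapping from a thrown error to the wire body. The sketch below is an assumption about shape, not the code in auth.plugin.ts or error-handler.plugin.ts: the toErrorBody name, the LABELS table, and the generic "Error" fallback are hypothetical.

```typescript
interface ErrorBody {
  error: string;
  message: string;
  code?: string; // optional, so clients reading only error + message keep working
}

// Label derived from the HTTP status, as described for the auth-hook catch path.
const LABELS: Record<number, string> = {
  401: "Unauthorized",
  403: "Forbidden",
  409: "Conflict",
};

// Hypothetical shared mapper used by both the auth hook and the global handler.
function toErrorBody(err: { statusCode: number; message: string; code?: string }): ErrorBody {
  const body: ErrorBody = {
    error: LABELS[err.statusCode] ?? "Error", // fallback label is an assumption
    message: err.message,
  };
  if (err.code !== undefined) body.code = err.code; // pass through only when present
  return body;
}
```

Routing both throw sites through one mapper of this kind is what keeps the bodies identical regardless of where the error originated.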

Metrics discipline:

  • auth_failures_total{reason="tenant_unknown"}: a new reason bucket, which adds one value to the existing reason label.
  • tenant_archive_rejections_total{operation}: a new counter with an enumerated, low-cardinality operation label (18 values matching the tenant-scoped store write methods). tenantId is NOT a metric label because it is unbounded by Prometheus standards; it appears on the structured log line only.

Binary store tenant-prefixing

Non-default tenants produce tenant-prefixed filenames at upload time: postgres-asset.store.ts generates {tenantId}/{uuid}.{ext} instead of the flat {uuid}.{ext}. IAssetBinaryStore stays tenant-agnostic — it sees the opaque filename and never learns the tenant directly.

Two knock-on fixes:

  • FsAssetBinaryStore.save() adds fs.mkdirSync(path.dirname(target), { recursive: true }) before the write. Without this, the first upload for any new tenant would throw ENOENT because the {tenantId}/ subdirectory doesn’t exist yet.
  • Private-mode asset proxy route changes from /assets/:filename to /assets/* so tenant-prefixed filenames match as a two-segment wildcard. Handler reads request.params['*'] and relaxes the path-traversal guard to allow forward slashes (but still reject .. segments and leading dots). OpenAPI param name changes from filename to * — downstream SDK regeneration is batched after stage 2.
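The relaxed traversal guard might look like the sketch below. The function name and the extra rejections (empty segments, backslashes, NUL bytes) are assumptions; the notes only state that forward slashes are now allowed while .. segments and leading dots are still refused.

```typescript
// Hypothetical guard for the value read from request.params['*'].
// Forward slashes are allowed; any segment starting with a dot is refused,
// which covers both ".." traversal and hidden files.
function isSafeAssetPath(wildcard: string): boolean {
  if (wildcard.length === 0) return false;
  // Assumption: backslashes and NUL bytes are never legitimate in asset names.
  if (wildcard.includes("\\") || wildcard.includes("\0")) return false;
  return wildcard
    .split("/")
    .every((segment) => segment.length > 0 && !segment.startsWith("."));
}
```

Both the flat {uuid}.{ext} form and the two-segment {tenantId}/{uuid}.{ext} form pass; rejecting empty segments also rules out leading slashes and doubled slashes.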

The public-mode static mount stays on assetsDir root and serves tenant-prefixed filenames directly via /assets/acme/uuid.png. Filenames stay globally unique via the UUID component so cross-tenant collisions are impossible.

asset-inline.ts regex update

The stage 1 regex /src=(["'])\/assets\/([^"'?#/\\]+)\1/g explicitly excluded forward slashes — a tenant-prefixed reference like <img src="/assets/acme/uuid.png"> never matched and never got inlined. Private-mode HTML and PDF renders for non-default tenants would have broken silently. Stage 2 removes the / exclusion; the capture now accepts multi-segment filenames. Traversal protection moves to the tenant-scoped binary store wrapper’s metadata lookup.
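The effect of dropping the / exclusion is easy to see side by side. STAGE1 below is the regex quoted above; STAGE2 is an assumed form obtained by removing only the forward slash from the character class, since the shipped pattern is not reproduced in these notes. The assetRefs helper is hypothetical.

```typescript
// Stage 1 pattern as quoted in these notes: "/" is excluded from the capture.
const STAGE1 = /src=(["'])\/assets\/([^"'?#/\\]+)\1/g;
// Assumed stage 2 form: identical except that "/" is no longer excluded.
const STAGE2 = /src=(["'])\/assets\/([^"'?#\\]+)\1/g;

// Returns the captured asset filenames found in an HTML fragment.
function assetRefs(html: string, pattern: RegExp): string[] {
  return Array.from(html.matchAll(pattern), (match) => match[2] ?? "");
}

const html = `<img src="/assets/uuid-1.png"><img src='/assets/acme/uuid-2.png'>`;

const stage1Refs = assetRefs(html, STAGE1); // only the flat filename matches
const stage2Refs = assetRefs(html, STAGE2); // the tenant-prefixed one matches too
```

With the stage 1 pattern the second image is skipped entirely rather than partially matched, which is why the breakage would have been silent.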

Plugin system rejection policy

Under MULTI_TENANT_ENABLED=true, plugin identity providers and plugin storage backends are rejected at registration time:

[Plugin <name>] identity provider 'X' cannot be registered in multi-tenant mode.
The plugin-api 0.2.x contract does not carry a tenant binding...

Plugin render hooks and custom renderers load with a single-shot warning logged at activation time — they’re told to read ctx.tenantId if they need per-tenant behavior. The plugin bridge shims at plugin-system.plugin.ts:161-171 keep their hardcoded 'default' because under multi-tenant mode the plugin never activates (the rejection above fires first); under single-tenant mode they work as before.

@pulp-engine/plugin-api bumps 0.1.0 → 0.2.0. PreRenderHookContext and PostRenderHookContext gain an optional readonly tenantId?: string field. Optional (not required) preserves source compatibility with 0.1.x plugins that construct contexts as object literals in their test suites; the RenderHookRunner fills 'default' when the field is absent.
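A small sketch shows why the optional field keeps 0.1.x-style literals compiling while still giving hooks a tenant to read. The interface below is a stand-in, not the published type: templateKey, withTenant, and watermarkFor are hypothetical names, and only the optional readonly tenantId and the 'default' fallback come from the description above.

```typescript
// Stand-in for the 0.2.0 hook context; only tenantId is taken from the notes.
interface RenderHookContextSketch {
  readonly templateKey: string; // hypothetical field for illustration
  readonly tenantId?: string;   // optional, so literals without it still type-check
}

// Mirrors the described runner behavior: fill 'default' when the field is absent.
function withTenant(
  ctx: RenderHookContextSketch,
): RenderHookContextSketch & { readonly tenantId: string } {
  return { ...ctx, tenantId: ctx.tenantId ?? "default" };
}

// A hook that wants per-tenant behavior reads the tenant from its context.
function watermarkFor(ctx: RenderHookContextSketch): string {
  const { tenantId } = withTenant(ctx);
  return tenantId === "default" ? "" : `tenant:${tenantId}`;
}

// A 0.1.x-style test literal with no tenantId remains valid.
const legacyContext: RenderHookContextSketch = { templateKey: "invoice" };
const legacyWatermark = watermarkFor(legacyContext);
```

Had the field been required, the legacyContext literal would fail to compile, which is the source-compatibility break the optional form avoids.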

Editor side: 7 setStoredToken callers + embed protocol + silent callback

setStoredToken(token, expiresAt, actor, displayName, scope, tenantId) gained a 6th argument. The editor’s session-storage layer adds a pulp-engine.editorTenantId key and a getStoredTenant() helper mirroring getStoredScope(). All seven call sites were updated:

| # | File | Trigger |
| --- | --- | --- |
| 1 | apps/editor/src/components/auth/LoginGate.tsx:129 | Shared-key login form submit |
| 2 | apps/editor/src/embed-main.tsx:44 | Host posts pre-minted msg.token in embed init |
| 3 | apps/editor/src/embed-main.tsx:48 | Host posts msg.oidcToken → /oidc/exchange |
| 4 | apps/editor/src/embed-main.tsx:57 | Host posts msg.apiKey → /auth/editor-token |
| 5 | apps/editor/src/embed-main.tsx:121 | Ongoing pulp-engine:token-refresh host message |
| 6 | apps/editor/src/hooks/use-token-refresh.ts:80 | Silent iframe refresh response |
| 7 | apps/editor/src/lib/auth.ts:222 | Internal exchangeOidcCode() caller |

Two additional files were updated that don’t call setStoredToken but are part of the embed / refresh plumbing:

  • apps/editor/src/embed/post-message-protocol.ts: the InboundMessage init shape and the TokenRefreshMessage shape both grow an optional tenantId?: string. These are the TypeScript types host integrations compile against.
  • apps/editor/public/oidc-silent-callback.html — static HTML outside tsc and lint. Its postMessage payload at lines 28-36 now forwards tenantId: data.tenantId from the /oidc/complete response body. Without this update, silent refresh would reset the editor’s stored tenantId to 'default' on every refresh cycle, silently downgrading multi-tenant sessions.

Migration notes

Single-tenant upgrade (the common case)

Do nothing. Upgrade to v0.67.0, redeploy, and everything behaves exactly like v0.66.0. The 59-site tenant coercion rewrite preserves stage 1 semantics because resolveTenant() short-circuits to 'default' in single-tenant mode. Existing 3-part and 4-part editor tokens continue to verify — they resolve to tenantId: 'default'.

Rolling out multi-tenant mode

  1. Upgrade to v0.67.0 WITHOUT MULTI_TENANT_ENABLED. Deploy and verify single-tenant still works.
  2. Set MULTI_TENANT_ENABLED=true + API_KEY_SUPER_ADMIN=<key> + restart. Legacy API_KEY_* credentials continue to work — they all bind to 'default'.
  3. Create your first non-default tenant via POST /admin/tenants {"id":"acme","name":"Acme Corp"} using the super-admin key.
  4. Add API_KEYS_JSON with a tenant-bound entry for that tenant. Restart.
  5. Verify isolation — a request with the acme-bound key should only see acme’s data; a request with API_KEY_ADMIN should only see default’s data; the super-admin key must pass X-PulpEngine-Tenant-Id on every data-access request.

Rollback

Flip MULTI_TENANT_ENABLED=false and restart. Behavior reverts to stage 1 immediately. No data migration, no schema rollback — stage 2 adds no new Prisma migrations. Non-default tenant rows in the tenants table stay dormant; tenant-prefixed filenames in assets/<tenantId>/... stay on disk.

Application-level rollback to v0.66.0 is also safe in pure single-tenant mode (no non-default tenants ever created). If non-default tenants have been used, the pre-v0.67.0 code will read tenant-prefixed filenames as flat filenames — the asset fetches will miss. Recommended: stay on v0.67.0 and toggle the flag.

Verification

  • cd apps/api && pnpm exec tsc --noEmit — zero errors
  • cd apps/editor && pnpm exec tsc --noEmit — zero errors
  • cd packages/plugin-api && pnpm build — zero errors, outputs 0.2.0 types
  • node scripts/check-tenant-propagation.mjs — clean under strict multi-tenant-mode rule
  • node scripts/check-template-resolution.mjs — clean
  • pnpm --filter @pulp-engine/api test — 975 passing, pre-existing Windows Fastify-startup 10-15s timeouts documented in project_flaky_tests_v1.md (all verified to pass in isolation)
  • node scripts/check-version.mjs — 9 lockstep files aligned at 0.67.0

Follow-ups for C.0b / C.1 / C.2

| Follow-up | Notes |
| --- | --- |
| C.0b SQL Server multi-tenant | Requires ITenantStore implementation against raw mssql + archive guards in SqlServerTemplateStore/AssetStore/AuditStore. Postgres ships first as the reference. |
| C.0b plugin-api tenant-aware contracts | Semver-major: PluginIdentityProvider.tenantId, PluginTemplateStore.list({tenantId}), etc. Requires a codemod release for plugin authors. |
| C.0b per-group OIDC tenant binding | OIDC_TENANT_GROUP_MAPPING_JSON or equivalent. |
| C.0b pre-activation plugin rejection via manifest capabilities field | More informative than registration-time rejection. |
| C.0b startup audit loop iterating all tenants | Currently hardcoded to 'default' for the legacy-SVG scan. |
| C.0b Postgres RLS as second-layer defense | Explicit store-layer guards are the primary gate. |
| C.0b UserTenantMembership table | Cross-tenant users (consultant case). Stage 2 keeps globally-unique user IDs. |
| C.0b ApiCredential CRUD table | Env-var only for stage 2. |
| C.1 per-tenant schedule engine sharding | Stage 2 ships one global engine with archive filter + dispatch recheck. |
| C.1 tenant usage analytics | RenderUsage table, GET /usage route. |
| C.2 per-tenant rate limits | @fastify/rate-limit keyGenerator using `(tenantId |
| SDK regeneration | Python/.NET/Go/Java: the /assets/* wildcard and tenantId on editor-token responses both require codegen. |
| Tenant hard-delete / purge | Stage 2 returns 501; cascade policy requires design work. |
| Execution-state granular archive policy | Currently updateStatus skips the archive guard entirely so in-flight executions can reach terminal state. A future release may add per-terminal-state guards using the TenantArchiveOperation label already present in the metrics enum. |

Hazard register (closed in stage 2)

Every hazard from the planning stress-test is closed concretely:

  • B1: verifyEditorToken case 5: added FIRST; extractTokenActor supports both 4-part and 5-part
  • B2: 59 ?? 'default' sites rewritten to resolveTenant + grep gate ban
  • B3: Prisma extension dropped entirely; explicit assertActive guards run BEFORE $transaction
  • B4: /assets/* wildcard + globally-unique UUID filenames preserve public-mode compatibility
  • B5: FsAssetBinaryStore.save() adds mkdirSync for tenant subdirectories
  • H1: no Prisma extension, no recursion problem
  • H2: per-request cache design dropped in favor of process-scoped TTL cache; stale window documented
  • H3: dispatch-time recheck in schedule engine catches mid-tick archives
  • H4: tenantId is OPTIONAL on hook contexts (semver-minor)
  • H5: startup validation warns, not throws, for API_KEYS_JSON referencing unknown tenants
  • H6: redacted parse errors; API_KEYS_JSON_FILE alternative
  • H7: tenantStore.* exempt from grep-gate store-call check
  • R1: OIDC via completion codes (no fragment parser); CompletionTokenData.tenantId required
  • R2: asset-inline.ts regex updated + regression test
  • R3: tenant CRUD wiring traced through storage-factory → storage.plugin → server.ts explicitly
  • R4: explicit store-layer guards (no Prisma middleware)
  • R5: SQL Server + multi-tenant rejected at startup alongside file mode
  • R6: render.ts ?? 'default' carveout removed; hook contexts use resolved variable
  • R7: /auth/editor-token super-admin branch documented with explicit flow
  • R8: extractTokenActor() supports both 4-part and 5-part
  • R9: TenantStatusCache.assertKnown() covers auth hook + resolveTenant + all 4 mint paths
  • R10: 7 setStoredToken callers enumerated and updated; embed protocol types updated; silent-callback HTML updated
  • R11: apps/editor/src/embed/post-message-protocol.ts + public/oidc-silent-callback.html listed explicitly
  • R12: tenant_unknown wire contract specified (status, body shape, code field); tenant_archived parallel
  • R13: auth-hook catch path rewrite stated explicitly with sample code
  • R14: universal “last gate before mint” rule covers all 4 mint paths
  • R15: tenant_archive_rejections_total{operation} counter — NOT under auth_failures_total
  • R16: archived tenants mint successfully (reads allowed); only unknown tenants 403 at mint
  • R17: global error-handler branch covers non-auth-hook throw sites
  • R18: resolveTenant is async, every call site uses await