Pulp Engine Document Rendering

Release v0.78.1 — pdf-transform base64 validation + FileAssetStore corruption recovery

Date: 2026-05-07
Tag: v0.78.1

Summary

Bug-fix patch on the v0.78.0 line. Two production-code resilience fixes plus one post-tag documentation cleanup. No contract changes, no new features, no schema regen.

  1. pdf-transform malformed-base64 validation. The pre-v0.78.1 routes wrapped each Buffer.from(..., 'base64') decode in a try/catch hoping to surface a documented 400. Those catches were dead — Node’s decoder silently truncates malformed input rather than throwing — so garbage like "@@@not-base64@@@" fell through to invalid_pdf 422 downstream. v0.78.1 makes the OpenAPI 400 path actually reachable. Listed under “Known residual” in docs/release-v0.78.0.md and explicitly logged as the first v0.78.1 candidate.
  2. FileAssetStore NUL-truncation recovery. Atomic-write interruption on Windows (process kill, OS sleep, antivirus) can leave .assets-index.json zero-padded with literal NUL bytes. Pre-v0.78.1 the next read threw SyntaxError and blocked the store until the file was manually deleted. v0.78.1 narrows recovery to that exact shape — NUL-only or NUL-padded-then-whitespace — and rethrows on any other parse failure so torn writes with partial asset metadata do not get silently overwritten.
  3. docs/release-v0.78.0.md CI-verified subsection replaced with concrete workflow run links and an honest iteration history (all four release-prep SHAs, including the wrong-diagnosis fix-forward that flake-passed on its own CI run). Applies to main only — the v0.78.0 tag remains pinned at 2c3d447 with the original placeholder text per tag-immutability.

What landed

pdf-transform base64 validation

  • New isValidBase64 helper at apps/api/src/routes/render/pdf-transform.ts. Validation is alphabet-only after stripping whitespace — ^[A-Za-z0-9+/]*={0,2}$ against the NUL-stripped, whitespace-stripped input. Deliberately does NOT enforce strict length-mod-4: Node’s decoder accepts unpadded input and most stdlib base64 encoders (Node Buffer.toString('base64'), Python base64.b64encode, JS btoa, etc.) produce padded output by default anyway, so the strict-length constraint would tighten the public input contract for no useful end. URL-safe base64url (-_ alphabet) is not accepted — it never was pre-v0.78.1 either.
  • Dead try/catch sites replaced with five explicit if (!isValidBase64(...)) guards, one per decoded field: pdf-merge sources[N].data, pdf-watermark pdf body, pdf-watermark watermark.data (image), pdf-insert target, pdf-insert insert. Per-site error messages now name the offending field instead of the previous generic ‘Invalid base64 encoding’.
  • 5 new tests in apps/api/src/__tests__/render-pdf-transform.test.ts — one per validation site, each asserting 400 / error: 'Bad Request' / code: 'bad_request' / message contains the field name.
  • 1 regression-lock test asserting unpadded base64 ('SGVsbG8' for 'Hello') reaches downstream and fails as 422 invalid_pdf — NOT 400. Locks the relaxed-input contract so a future tightening cannot slip through silently.
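
The validation described above can be sketched roughly as follows. This is a minimal illustration, not the shipped code: only the isValidBase64 name and the regex come from this note, while decodeBase64Field, legacyDecode, the DecodeResult shape, and the exact message wording are hypothetical names introduced for the example.

```typescript
import { Buffer } from "node:buffer";

// Alphabet-only pattern quoted in the release note: standard base64
// characters followed by at most two '=' padding characters.
const BASE64_ALPHABET = /^[A-Za-z0-9+/]*={0,2}$/;

// Hypothetical re-creation of isValidBase64; the shipped helper may differ.
export function isValidBase64(input: string): boolean {
  // Strip NUL bytes and whitespace before matching, as the note describes.
  const stripped = input.replace(/[\s\x00]+/g, "");
  // No length-mod-4 check, so unpadded input such as "SGVsbG8" passes.
  return BASE64_ALPHABET.test(stripped);
}

// Assumed result shape for illustration only.
type DecodeResult =
  | { ok: true; data: Buffer }
  | { ok: false; statusCode: 400; code: "bad_request"; message: string };

// Hypothetical guard: validate first, decode second, name the field on failure.
export function decodeBase64Field(field: string, value: string): DecodeResult {
  if (!isValidBase64(value)) {
    return {
      ok: false,
      statusCode: 400,
      code: "bad_request",
      message: `Invalid base64 encoding in ${field}`,
    };
  }
  return { ok: true, data: Buffer.from(value, "base64") };
}

// The pre-v0.78.1 shape: the catch is dead because the decoder does not
// throw on malformed string input.
export function legacyDecode(value: string): Buffer | null {
  try {
    return Buffer.from(value, "base64");
  } catch {
    return null; // never reached for string input
  }
}
```

Under these assumptions, decodeBase64Field("sources[0].data", "@@@not-base64@@@") yields the 400 result, while legacyDecode on the same input returns a Buffer instead of null, which is why the old routes fell through to the downstream 422.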

FileAssetStore NUL-truncation recovery

  • New isNulTruncated helper at apps/api/src/storage/file/file-asset.store.ts. Returns true only when the raw content, after stripping NUL bytes (\x00), is empty or pure whitespace. A 0-byte file returns false so a genuinely empty file still surfaces a SyntaxError — atomic-write never produces 0-byte files; a 0-byte index is unusual enough to keep loud.
  • readIndex wraps JSON.parse in try/catch and only short-circuits to { version: 1, assets: [] } when both err instanceof SyntaxError and isNulTruncated(raw) hold. A console.warn names the file path and the recovered byte count for operator visibility. Any other parse failure (torn JSON, charset mismatch, intermediate-state writes) rethrows.
  • 5 new tests in apps/api/src/__tests__/file-asset.store.test.ts:
    1. NUL-only file recovers to empty (warn fires).
    2. NUL-padded-then-whitespace recovers to empty.
    3. Torn JSON fragment ('{"version":1,"assets":[{"id":"abc","filenam') still throws SyntaxError — locks the no-silent-data-loss contract.
    4. Garbage non-NUL text ('this is not json at all') still throws.
    5. Post-recovery upload writes a clean index without re-warning — the store returns to normal operation without a manual restart.
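
A minimal sketch of the recovery rule, assuming the index has already been read into a string. Only isNulTruncated, readIndex's fallback value { version: 1, assets: [] }, and the SyntaxError condition come from this note; parseIndex, the AssetIndex interface, the warning text, and the requirement that at least one NUL byte be present are assumptions made for the example.

```typescript
// Assumed index shape, inferred from the fallback value in the note.
interface AssetIndex {
  version: number;
  assets: unknown[];
}

// Hypothetical re-creation of isNulTruncated; the shipped helper may differ.
export function isNulTruncated(raw: string): boolean {
  // A 0-byte file is not treated as NUL-truncated, so it stays loud.
  if (raw.length === 0) return false;
  // Assumption: require at least one NUL so plain whitespace does not match.
  if (!raw.includes("\x00")) return false;
  // After removing NULs, only whitespace (or nothing) may remain.
  return raw.replace(/\x00/g, "").trim() === "";
}

// Hypothetical stand-in for the parsing step inside readIndex.
export function parseIndex(raw: string, filePath: string): AssetIndex {
  try {
    return JSON.parse(raw) as AssetIndex;
  } catch (err) {
    // Recover only for the exact NUL-truncated shape.
    if (err instanceof SyntaxError && isNulTruncated(raw)) {
      console.warn(
        `asset index ${filePath} was NUL-truncated (${raw.length} chars); starting from an empty index`,
      );
      return { version: 1, assets: [] };
    }
    // Torn JSON, garbage text, and empty files rethrow unchanged.
    throw err;
  }
}
```

With this shape, a torn fragment such as '{"version":1,"assets":[{"id":"abc","filenam' still throws, so partial asset metadata is never replaced by an empty index.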

Documentation cleanup

  • In docs/release-v0.78.0.md, the "> Pending" callout under the "### CI-verified" subsection is replaced with two per-job conclusion tables (ci.yml 9 jobs, Release workflow 7 jobs) and an iteration-history table covering 977fccf → 258a3bf → c05efce → 2c3d447. The wrong-diagnosis c05efce row explicitly notes the fix-forward did not address the actual cause and was not trusted as the tag target despite passing CI on its own run (Playwright retries plus capability-resolution timing made the underlying e2e race intermittent).

Why no SDK / OpenAPI churn

The error-contract work in v0.78.0 already lit up the trusted-publisher path on the SDK workflows; their failures on the v0.78.0 push were registry-side configuration, not code regressions. v0.78.1 does not change the OpenAPI spec, the SDK error shapes, or the registry-publisher claims. The npm and PyPI publish workflows will still need their one-time registry-side trusted-publisher entries configured before the v0.78.1 SDK publishes will succeed; once the operator has done that for v0.78.0, the same entries cover v0.78.1.

Verified before tagging

Locally verified

Run against the exact release-prep SHA on main in CI opt-in mode (PULPENGINE_ALLOW_UNTAGGED_RELEASE_HEAD=1).

  • node scripts/check-version.mjs — clean. All 7 enforced manifests at 0.78.1; CHANGELOG + release-link + release note present.
  • pnpm extract-openapi -- --check — clean (no schema changes; spec unchanged from v0.78.0).
  • pnpm --filter @pulp-engine/api typecheck — clean.
  • pnpm --filter @pulp-engine/api test --run render-pdf-transform.test.ts → 23/23 (was 17/17; +6 cases for the new base64 validation paths).
  • pnpm --filter @pulp-engine/api test --run file-asset.store.test.ts → 24/24 (was 19/19; +5 cases for the NUL-recovery contract).
  • pnpm --filter @pulp-engine/api test:file → 1242 passed / 97 skipped / 0 failed (was 1237 on the v0.78.0 baseline; +5 from the new file-store cases).
  • pnpm verify matrix: ✅ Version consistency, ✅ Encoding guard, ✅ Lint, ✅ Build, ✅ Unit tests (file mode), ⚠️ Unit tests (postgres) — pre-existing Windows-host vitest parallel-worker contention flake, recurring across runs (40–55 fails out of ~1500 tests, run-to-run variance, runs cleanly when each suite is invoked individually). Documented as the same flake observed on the v0.78.0 line; the gating signal is CI’s clean Postgres service container, not the local Windows-host run. SQL Server tests skipped (CI-covered); E2E tests skipped (CI-covered).

CI-verified

Pending — release-prep SHA not yet pushed. This subsection is a placeholder; the actual workflow names and run links from the green release-prep commit on main belong here per docs/release-checklist.md, filled in after CI signs off and before v0.78.1 is tagged.

Not verified

  • Registry publication (npm, PyPI), GHCR images, GitHub Release assets, public mirror sync, Windows installer smoke, and signed-licence end-to-end smoke remain tag-time/post-tag checks per docs/release-checklist.md.
  • npm and PyPI Trusted Publisher entries — registry-side, operator action; same gap that affected v0.78.0’s post-tag SDK publishes.

Known residual

  • Python SDK packaging-name mismatch (docuforge/ source dir vs pulp-engine PyPI package name) — deferred. A clean rename touches every test file’s imports plus the docstring examples plus the pyproject.toml packaging declaration; needs its own design pass with a compatibility shim, not a patch-line fix.
  • OpenAPI spec inlines the closed ErrorCodeSchema union at every error response slot — same residual as v0.78.0. Compaction via Type.Ref() is a separate refactor, not a contract change, and stays in the future-spec-cleanliness backlog.
  • Pre-v0.78.0 pulp-engine Python SDK test infra had from pulp-engine import ... at the top of every test file (literal hyphen, invalid Python). v0.78.0 fixed the two files needed to validate the error-contract changes (tests/conftest.py, tests/test_errors.py → from docuforge import ...); the rest stay broken pending the packaging rename above.
  • @anthropic-ai/sdk, basic-ftp, ip-address advisory pins from v0.78.0 carry forward. No new advisories surfaced during v0.78.1 prep.