docs(plans): add image pipeline gpu wasm webgl design
144
docs/plans/2026-04-04-image-pipeline-gpu-wasm-webgl-design.md
Normal file
@@ -0,0 +1,144 @@
# Image Pipeline GPU/WASM/WebGL Design

## Goal

Migrate the current CPU pixel-loop image pipeline to an incremental hybrid backend (WebGL first, optional WASM kernels, guarded WebGPU experiments) without breaking the existing worker fallback and cancellation semantics.

## Current Branch Context

- This branch already reduced duplicate preview work (shared in-flight preview requests, preview coalescing, opt-in histogram behavior).
- `lib/image-pipeline/worker-client.ts` is already the stable orchestration seam for worker-first execution with main-thread fallback.
- `lib/image-pipeline/preview-renderer.ts` and `lib/image-pipeline/bridge.ts` currently depend on CPU `ImageData` mutation, so backend switching must preserve existing return contracts (`ImageData` for preview, encoded `Blob` for full render).

## Codebase Mapping Findings

### `lib/image-pipeline/render-core.ts`

- Contains all effect kernels as CPU loops over `Uint8ClampedArray` (`curves`, `color-adjust`, `light-adjust`, `detail-adjust`).
- `applyPipelineStep` is a strong integration seam: per-step dispatch already exists and can route by backend capability.
- Abort checks are periodic and lightweight; migration must preserve `AbortError` behavior.
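The periodic abort check described above could look roughly like the sketch below. This is not the code in `render-core.ts`: the kernel (`invertWithAbort`), the `AbortLike` shape, and the check interval are hypothetical stand-ins chosen to keep the example self-contained. It only illustrates the contract that any replacement backend would need to keep, namely that cancellation surfaces as an error whose `name` is `AbortError`.

```typescript
// Hypothetical names throughout; the real kernels live in render-core.ts.
// Minimal structural stand-in for AbortSignal so the sketch has no DOM dependency.
interface AbortLike {
  readonly aborted: boolean;
}

// Assumed value: how many pixels are processed between abort checks.
const ABORT_CHECK_INTERVAL = 4096;

function makeAbortError(): Error {
  const err = new Error("The operation was aborted");
  err.name = "AbortError";
  return err;
}

// Example CPU kernel: inverts RGB in place and leaves alpha untouched.
function invertWithAbort(data: Uint8ClampedArray, signal?: AbortLike): void {
  const pixelCount = data.length >> 2;
  for (let p = 0; p < pixelCount; p++) {
    // Cheap periodic check: one boolean read every ABORT_CHECK_INTERVAL pixels.
    if (signal !== undefined && p % ABORT_CHECK_INTERVAL === 0 && signal.aborted) {
      throw makeAbortError();
    }
    const i = p * 4;
    data[i] = 255 - data[i];
    data[i + 1] = 255 - data[i + 1];
    data[i + 2] = 255 - data[i + 2];
  }
}
```

A GPU or WASM backend cannot check a signal mid-kernel in the same way, so one option is to check before and after each dispatched step and throw the same error shape.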

### `lib/image-pipeline/preview-renderer.ts`

- Flow today: decode source bitmap -> draw 2D canvas -> `getImageData` -> apply step loops -> optional histogram.
- This file owns preview sizing and output shape; it should remain the public API while delegating kernel execution to a backend adapter.
- The yield loop (`requestAnimationFrame`/`setTimeout`) is CPU-friendly but should become backend-aware (GPU path should avoid unnecessary per-step yielding).
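One way to make the yield loop backend-aware is to decide the yield points from the backend kind before running the steps. The sketch below is illustrative only: `BackendKind`, `planYieldPoints`, and `runSteps` are hypothetical names, and the rule that only the WebGL path skips yielding is an assumption (WASM kernels are treated as CPU-bound here). The host yield is injected so that the browser code can pass a `requestAnimationFrame`/`setTimeout` wrapper.

```typescript
// Hypothetical backend kinds for this sketch.
type BackendKind = "webgl" | "wasm" | "cpu";

// Assumption: only CPU-bound backends benefit from yielding between steps.
function shouldYieldBetweenSteps(kind: BackendKind): boolean {
  return kind !== "webgl";
}

// Returns the step indices after which the runner should yield.
// There is never a yield after the last step.
function planYieldPoints(kind: BackendKind, stepCount: number): number[] {
  const points: number[] = [];
  if (!shouldYieldBetweenSteps(kind)) return points;
  for (let i = 0; i < stepCount - 1; i++) points.push(i);
  return points;
}

// Runs the steps in order, yielding to the host only where planned.
async function runSteps(
  kind: BackendKind,
  steps: ReadonlyArray<() => void>,
  yieldToHost: () => Promise<void>
): Promise<void> {
  const yieldAfter = planYieldPoints(kind, steps.length);
  for (let i = 0; i < steps.length; i++) {
    steps[i]();
    if (yieldAfter.indexOf(i) !== -1) await yieldToHost();
  }
}
```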

### `lib/image-pipeline/worker-client.ts`

- Central execution coordinator with worker transport, request dedupe, cancel fan-out, and main-thread fallback.
- Best place to inject backend telemetry and backend selection hints (for rollout and safety).
- Existing fallback logic (worker unavailable -> main thread) maps directly onto the backend fallback ladder and kill-switch controls.
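For readers unfamiliar with the request dedupe mentioned above, the general technique is a map of in-flight promises keyed by request identity. The sketch below is a generic illustration, not the implementation in `worker-client.ts`; `InFlightRequests` and the string key are hypothetical, and cancel fan-out is deliberately left out.

```typescript
// Generic in-flight dedupe: concurrent callers with the same key share one promise.
// Hypothetical helper, not the worker-client implementation.
class InFlightRequests<T> {
  private readonly pending = new Map<string, Promise<T>>();

  run(key: string, start: () => Promise<T>): Promise<T> {
    const existing = this.pending.get(key);
    if (existing !== undefined) return existing;

    // Remove the entry once the request settles, whether it succeeds or fails,
    // so that a later call with the same key starts fresh work.
    const promise: Promise<T> = start().then(
      (value) => {
        this.pending.delete(key);
        return value;
      },
      (error) => {
        this.pending.delete(key);
        throw error;
      }
    );
    this.pending.set(key, promise);
    return promise;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

A backend router placed behind this layer would see at most one execution per key, which keeps backend telemetry from double-counting shared previews.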

## Architecture Choice

Adopt a **Backend Adapter + Capability Router** architecture:

1. Keep existing preview/full public APIs and message formats stable.
2. Introduce an internal backend interface for step execution and pixel transport.
3. Select the backend at runtime by capability, feature flag, and step support matrix.
4. Preserve the CPU backend as the correctness reference and terminal fallback.
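A minimal sketch of what the backend interface and capability router from the list above might look like. Every name here (`PipelineBackend`, `PixelFrame`, `selectBackend`, the shape of `PipelineStep`) is an assumption made for illustration; only the four step type names come from this document.

```typescript
// Step types named in this design; everything else in this sketch is hypothetical.
type StepType = "curves" | "color-adjust" | "light-adjust" | "detail-adjust";
type BackendId = "webgl" | "wasm" | "cpu";

interface PipelineStep {
  type: StepType;
  params: Record<string, number>;
}

// Structural stand-in for ImageData so the sketch has no DOM dependency.
interface PixelFrame {
  data: Uint8ClampedArray;
  width: number;
  height: number;
}

interface PipelineBackend {
  readonly id: BackendId;
  isAvailable(): boolean;                 // runtime capability probe
  supportsStep(type: StepType): boolean;  // one row of the step support matrix
  runStep(frame: PixelFrame, step: PipelineStep): void;
}

// Picks the first candidate that is flag-enabled, available, and able to run
// every requested step. Candidates are passed in preference order, CPU last.
function selectBackend(
  candidates: ReadonlyArray<PipelineBackend>,
  enabled: Readonly<Record<BackendId, boolean>>,
  steps: ReadonlyArray<PipelineStep>
): PipelineBackend {
  for (const backend of candidates) {
    if (!enabled[backend.id]) continue;
    if (!backend.isAvailable()) continue;
    if (steps.every((s) => backend.supportsStep(s.type))) return backend;
  }
  throw new Error("no usable backend; the CPU backend should always be a candidate");
}
```

Selecting one backend for the whole step list keeps pixel transport simple. Per-step routing is possible but would add readback cost between GPU and CPU steps.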

### Integration Seam

- New seam: `executePipelineSteps()` in a backend-router module called by both `preview-renderer.ts` and `bridge.ts`.
- `render-core.ts` becomes the CPU backend implementation instead of the global default path.
- Worker protocol receives optional diagnostics metadata (selected backend, fallback reason), not required for correctness.

### Fallback Ladder (authoritative order)

1. **WebGL2 worker backend** (preferred for preview/full when supported and flag-enabled)
2. **WASM SIMD kernel backend** (for unsupported GPU steps or devices with poor GPU availability)
3. **CPU worker backend** (`render-core.ts` current path)
4. **CPU main-thread fallback** (existing `worker-client.ts` behavior)

Each downgrade records a structured reason (`unsupported_api`, `shader_compile_failed`, `capability_mismatch`, `flag_disabled`, `runtime_error`) so rollout decisions are data-driven.
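The ladder and its structured downgrade reasons can be sketched as a loop over rungs that collects a reason each time it steps down. The reason strings are the ones listed above; the rung identifiers, `Attempt`, `LadderRung`, and `runLadder` are hypothetical, and the sketch is synchronous for brevity where real backends would be asynchronous.

```typescript
type Rung = "webgl-worker" | "wasm-worker" | "cpu-worker" | "cpu-main-thread";

type DowngradeReason =
  | "unsupported_api"
  | "shader_compile_failed"
  | "capability_mismatch"
  | "flag_disabled"
  | "runtime_error";

// A rung either produces a value or declines with a structured reason.
type Attempt<T> =
  | { status: "ok"; value: T }
  | { status: "failed"; reason: DowngradeReason };

interface LadderRung<T> {
  rung: Rung;
  attempt(): Attempt<T>;
}

interface Downgrade {
  from: Rung;
  reason: DowngradeReason;
}

// Walks the rungs in order. Unexpected exceptions are recorded as runtime_error.
function runLadder<T>(
  rungs: ReadonlyArray<LadderRung<T>>
): { rung: Rung; value: T; downgrades: Downgrade[] } {
  const downgrades: Downgrade[] = [];
  for (const r of rungs) {
    let result: Attempt<T>;
    try {
      result = r.attempt();
    } catch {
      result = { status: "failed", reason: "runtime_error" };
    }
    if (result.status === "ok") {
      return { rung: r.rung, value: result.value, downgrades };
    }
    downgrades.push({ from: r.rung, reason: result.reason });
  }
  const summary = downgrades.map((d) => d.from + ":" + d.reason).join(", ");
  throw new Error("all rungs failed: " + summary);
}
```

The returned `downgrades` array is the kind of payload that could travel in the optional diagnostics metadata on the worker protocol.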

## Context7 Guidance Buckets

### Safe now

- **WebGL2 shader pipeline** for color/curve/light/detail kernels with runtime feature detection.
- **OffscreenCanvas in worker** where available, with same-origin + secure-context guardrails.
- **WASM SIMD (single-threaded)** behind feature detection (`WebAssembly.validate` on a small SIMD probe module, or a runtime probe).
- **Worker-first execution with transferable buffers** (already used in worker client).
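The feature detection mentioned in the list above might be gathered into a single probe like the following sketch. `CapabilityReport` and `detectCapabilities` are hypothetical names. The global scope is injected (defaulting to `globalThis`) so the probe can run in a window, a worker, or a test. The SIMD probe module bytes are not reproduced here and must be supplied by the caller; that parameter is an assumption of this sketch.

```typescript
interface CapabilityReport {
  secureContext: boolean;
  offscreenCanvas: boolean;
  webgl2: boolean;
  wasm: boolean;
  wasmSimd: boolean;
}

// `scope` is typed loosely on purpose: it is whatever global object the code runs in.
// `simdProbe` is assumed to be the bytes of a tiny module that uses a SIMD instruction.
function detectCapabilities(scope: any = globalThis, simdProbe?: Uint8Array): CapabilityReport {
  const offscreenCanvas = typeof scope.OffscreenCanvas === "function";

  let webgl2 = false;
  if (offscreenCanvas) {
    try {
      // getContext returns null when the context type is unsupported.
      const canvas = new scope.OffscreenCanvas(1, 1);
      webgl2 = canvas.getContext("webgl2") !== null;
    } catch {
      webgl2 = false;
    }
  }

  const wasm =
    typeof scope.WebAssembly === "object" &&
    scope.WebAssembly !== null &&
    typeof scope.WebAssembly.validate === "function";

  let wasmSimd = false;
  if (wasm && simdProbe !== undefined) {
    try {
      wasmSimd = scope.WebAssembly.validate(simdProbe) === true;
    } catch {
      wasmSimd = false;
    }
  }

  return {
    secureContext: scope.isSecureContext === true,
    offscreenCanvas,
    webgl2,
    wasm,
    wasmSimd,
  };
}
```

Probing once per worker and caching the report avoids creating a throwaway canvas on every render.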

### Experimental

- **WebGPU compute path** (`navigator.gpu` / `WorkerNavigator.gpu`): gate behind an explicit experiment flag and per-browser allowlist.
- **Step fusion and mixed GPU+WASM scheduling**: only after baseline correctness/perf metrics are stable.
- **Advanced WebGL packing optimizations** (multi-pass fusion, half-float intermediates) after parity suite exists.

### Not now

- **WASM threads + SharedArrayBuffer requirement**: requires cross-origin isolation (COOP/COEP) and can impact app embedding/3rd-party integrations.
- **WebGPU-first default backend**: browser/feature variability still too high for this repository's reliability target.
- **Full rewrite of decode/encode stack around custom codecs**: exceeds scope; current `drawImage` + canvas/blob path remains.

## Phased Rollout

### Phase 0: Foundation + Instrumentation (no behavior change)

- Add backend router with CPU-only implementation.
- Add telemetry points for preview/full latency, fallback reason, and error class.
- Add golden-image parity harness for all 4 current step types.

### Phase 1: WebGL PoC (preview only)

- Implement WebGL2 backend for `curves` and `color-adjust` first.
- Enable by dev flag on preview pipeline only.
- Validate output parity against CPU baseline under tolerance thresholds.
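The parity validation step could be built on a maximum absolute channel delta between the CPU baseline and the candidate backend output. The sketch below is one plausible form; the function names and the default tolerance of 2 (out of 255) are assumptions, since this plan does not fix a threshold.

```typescript
// Largest per-channel difference between two RGBA buffers of equal size.
function maxAbsChannelDelta(a: Uint8ClampedArray, b: Uint8ClampedArray): number {
  if (a.length !== b.length) {
    throw new RangeError("buffers must have identical length");
  }
  let max = 0;
  for (let i = 0; i < a.length; i++) {
    const delta = Math.abs(a[i] - b[i]);
    if (delta > max) max = delta;
  }
  return max;
}

// Assumed default tolerance; GPU rounding often differs from CPU by a level or two.
function withinTolerance(
  cpuBaseline: Uint8ClampedArray,
  candidate: Uint8ClampedArray,
  tolerance: number = 2
): boolean {
  return maxAbsChannelDelta(cpuBaseline, candidate) <= tolerance;
}
```

For example, comparing `[0, 10, 255, 255]` with `[1, 8, 255, 255]` gives a delta of 2, which passes at tolerance 2 and fails at tolerance 1.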

### Phase 2: WebGL expansion + guarded production rollout

- Add `light-adjust` and `detail-adjust` kernels.
- Enable progressive rollout (1% -> 10% -> 25% -> 50% -> 100%) with kill switch.
- Keep full render on CPU backend until preview parity and crash rate are stable.
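A progressive rollout with a kill switch is commonly implemented by hashing a stable identifier into a fixed bucket and comparing it with the current rollout percentage. The sketch below uses 32-bit FNV-1a as the hash. The choice of hash, the `RolloutConfig` shape, and the notion of a stable per-client ID are assumptions; the plan does not say how cohorts are assigned.

```typescript
// 32-bit FNV-1a over UTF-16 code units; returns an unsigned 32-bit integer.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0;
}

// Maps an ID to a stable bucket in [0, 100) with 0.01 granularity.
function rolloutBucket(stableId: string): number {
  return (fnv1a32(stableId) % 10000) / 100;
}

// Hypothetical config shape: percent is 0..100, killSwitch overrides everything.
interface RolloutConfig {
  percent: number;
  killSwitch: boolean;
}

function isWebglEnabledFor(stableId: string, config: RolloutConfig): boolean {
  if (config.killSwitch) return false;
  return rolloutBucket(stableId) < config.percent;
}
```

Because the bucket is fixed per ID, raising the percentage from 1 to 10 to 25 only adds clients; nobody who already had the WebGL path loses it unless the kill switch is set.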

### Phase 3: WASM SIMD fallback kernels

- Add WASM SIMD implementation for hot CPU kernels as the middle rung between WebGL and CPU.
- Use it when WebGL is unavailable or disabled; execution stays inside the worker.

### Phase 4: Full-render backend adoption

- Route full render through the same backend router.
- Keep output encode path unchanged.
- Roll out in smaller increments than preview because full render is on the export-critical path.

### Phase 5: WebGPU experiment track

- Optional experiment branch with strict allowlist and observability.
- No default enablement in this plan.

## Risk Controls

- Feature flags:
  - `imagePipeline.backend.webgl.enabled`
  - `imagePipeline.backend.wasm.enabled`
  - `imagePipeline.backend.webgpu.experiment`
  - `imagePipeline.backend.forceCpu`
- Auto-disable circuit breaker when fallback/error thresholds exceed SLO windows.
- Golden-image regression suite enforced in CI before any rollout increase.
- Keep worker protocol backward compatible until migration fully lands.
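The auto-disable circuit breaker listed above could be as simple as a sliding window of recent outcomes per backend. In the sketch below, `BackendCircuitBreaker`, the count-based window, and all thresholds are assumptions; the plan speaks of SLO windows without defining them, and a time-based window would be an equally valid reading.

```typescript
// Trips (and stays tripped) once the failure rate in the recent window
// exceeds the allowed maximum. Hypothetical helper with assumed semantics.
class BackendCircuitBreaker {
  private readonly outcomes: boolean[] = [];
  private tripped = false;

  constructor(
    private readonly windowSize: number,     // number of recent outcomes kept
    private readonly maxFailureRate: number, // e.g. 0.2 for 20%
    private readonly minSamples: number      // do not judge on too little data
  ) {}

  record(success: boolean): void {
    this.outcomes.push(success);
    if (this.outcomes.length > this.windowSize) this.outcomes.shift();
    if (this.outcomes.length < this.minSamples) return;

    const failures = this.outcomes.filter((ok) => !ok).length;
    if (failures / this.outcomes.length > this.maxFailureRate) {
      this.tripped = true;
    }
  }

  // True means the backend should be skipped, as if its flag were off.
  isOpen(): boolean {
    return this.tripped;
  }

  // Manual re-enable, for example after a fixed build ships.
  reset(): void {
    this.outcomes.length = 0;
    this.tripped = false;
  }
}
```

A tripped breaker would be reported with the same structured reason as a disabled flag, so the fallback ladder needs no extra branch.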

## Metrics and Success Criteria

- Preview latency (`p50`, `p95`) for representative step stacks and image sizes.
- Full render latency (`p50`, `p95`) for default export sizes.
- Main-thread long task count during rapid slider edits.
- Backend selection distribution (% webgl/wasm/cpu/main-thread).
- Fallback and runtime error rate per backend.
- Pixel parity drift (max absolute channel delta and histogram similarity).
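The plan does not say which histogram similarity measure is meant in the parity drift metric. One common choice is histogram intersection, sketched below over the R, G, and B channels of two RGBA buffers. The function names and the decision to ignore alpha are assumptions of this sketch.

```typescript
// 256-bin histogram of one channel (0 = R, 1 = G, 2 = B) of an RGBA buffer.
function channelHistogram(data: Uint8ClampedArray, channel: number): Uint32Array {
  const bins = new Uint32Array(256);
  for (let i = channel; i < data.length; i += 4) {
    bins[data[i]]++;
  }
  return bins;
}

// Histogram intersection averaged over R, G, B.
// Returns 1 for identical distributions and 0 for fully disjoint ones.
function histogramIntersection(a: Uint8ClampedArray, b: Uint8ClampedArray): number {
  if (a.length !== b.length || a.length === 0 || a.length % 4 !== 0) {
    throw new RangeError("buffers must be non-empty RGBA data of identical length");
  }
  let shared = 0;
  for (let channel = 0; channel < 3; channel++) {
    const ha = channelHistogram(a, channel);
    const hb = channelHistogram(b, channel);
    for (let bin = 0; bin < 256; bin++) {
      shared += Math.min(ha[bin], hb[bin]);
    }
  }
  const samples = (a.length / 4) * 3; // pixels times three color channels
  return shared / samples;
}
```

Histogram similarity is insensitive to where pixels moved, so it complements the per-pixel delta rather than replacing it.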

Success target for initial migration:

- >=30% preview `p95` improvement on supported devices.
- No increase in user-visible render failures.
- <=1% forced downgrade due to runtime backend errors after rollout stabilization.
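To make the first target concrete, the sketch below computes a nearest-rank percentile over latency samples and checks the relative `p95` improvement against the 30% threshold. The nearest-rank definition and the function names are assumptions; a telemetry backend may compute percentiles differently.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// samples are less than or equal to it. `p` is an integer in 1..100.
function percentile(samples: ReadonlyArray<number>, p: number): number {
  if (samples.length === 0) throw new RangeError("no samples");
  if (p < 1 || p > 100) throw new RangeError("p must be in 1..100");
  const sorted = samples.slice().sort((x, y) => x - y);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Relative improvement of the candidate over the baseline (0.3 means 30% faster).
function relativeImprovement(baselineMs: number, candidateMs: number): number {
  return (baselineMs - candidateMs) / baselineMs;
}

// Checks the ">=30% preview p95 improvement" target from this plan.
function meetsPreviewTarget(
  baselineSamplesMs: ReadonlyArray<number>,
  candidateSamplesMs: ReadonlyArray<number>
): boolean {
  const baselineP95 = percentile(baselineSamplesMs, 95);
  const candidateP95 = percentile(candidateSamplesMs, 95);
  return relativeImprovement(baselineP95, candidateP95) >= 0.3;
}
```

The comparison is only meaningful when both sample sets come from the same device class and step stack, which is why the metric is scoped to supported devices.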

## Out of Scope

- New visual adjustment features.
- Server-side/offline rendering architecture changes.
- COOP/COEP rollout for SAB-threaded WASM.