Files
webdev-pipeline/backlog/tasks/task-43 - Stabilisiere-Website-Enrichment-ohne-Playwright-Abbruch.md

5.4 KiB

id, title, status, assignee, created_date, updated_date, labels, dependencies, priority, ordinal
id title status assignee created_date updated_date labels dependencies priority ordinal
TASK-43 Stabilisiere Website-Enrichment ohne Playwright-Abbruch In Progress
2026-06-07 19:40 2026-06-07 20:57
high 45000

Description

Investigate and fix the Convex websiteEnrichmentAction crash where Playwright/Chromium closes during lead enrichment after a new lead is created. The action should not fail the lead pipeline when browser-based enrichment crashes.

Acceptance Criteria

  • #1 The root cause and affected call path are documented in task notes
  • #2 Lead enrichment degrades gracefully when browser/page/context is closed
  • #3 Regression tests cover the browser-closed failure path or removal of Playwright dependency
  • #4 Relevant verification commands pass

Implementation Plan

  1. Reproduce and trace the browser-closed failure path in websiteEnrichmentAction
  2. Compare with existing graceful-failure paths and Convex action constraints
  3. Add a RED regression test for page/context/browser closed during page capture
  4. Delegate a minimal fix that degrades enrichment instead of crashing
  5. Run focused and full verification; leave task In Progress until Matthias confirms Done

Implementation Notes

Root-cause investigation: The reported Convex log is from internal action websiteEnrichmentAction:processLeadEnrichment, not auditGenerationAction. The action still launches Playwright/Chromium for legacy lead website enrichment. The log shows navigation reached the target page multiple times, then Playwright threw Target page, context or browser has been closed. Current code has an outer catch, but the outer finally closes desktopContext/mobileContext/browser without protection; if a resource is already closed, cleanup can throw after the catch and surface as Convex Uncaught Error. Helper-level page.close() calls are also unprotected and can obscure the original browser failure. Hypothesis: cleanup must be best-effort and browser/page instability should finish the run as failed/degraded, queue PageSpeed if possible, and patch lead reason instead of crashing the action runtime.

TASK-43 Worker update: Website-Enrichment-only fix. RED test added in tests/website-enrichment-action.test.ts for best-effort Playwright cleanup; initial focused run failed on missing isPlaywrightTargetClosedError/closePlaywrightResourceSafely contract. Minimal fix in convex/websiteEnrichmentAction.ts adds isPlaywrightTargetClosedError and closePlaywrightResourceSafely; page.close(), desktopContext.close(), mobileContext.close(), and browser.close() now run through the safe helper. Target/page/context/browser closed cleanup errors are swallowed so the existing action catch/failure path can persist failed runs, queue PageSpeed when possible, and patch lead reason. Unexpected cleanup close failures are swallowed with console.warn. No AuditGeneration, ScreenshotOne, or Jina slices touched by this TASK-43 change. Verification: pnpm test -- tests/website-enrichment-action.test.ts passed after RED/GREEN (386 pass, 0 fail); pnpm exec tsc --noEmit passed; pnpm lint passed with 2 existing generated-file warnings in convex/betterAuth/_generated; pnpm test passed (364 pass, 0 fail); git diff --check passed.

Live follow-up 2026-06-07 22:34 CEST: Audit generation now succeeds, but website_enrichment still fails before useful extraction when TASK8_BROWSER_ASSET_URL / Chromium source is not configured. New objective for this task slice: remove the Chromium/Playwright hard requirement by adding a no-browser enrichment path, or otherwise prevent the website_enrichment run from failing solely because no browser asset is configured.

Follow-up fix: The live Convex run j9737mz0tkgdbg6mzjxjd1w7018878b1 failed because processLeadEnrichment still treated missing TASK8_BROWSER_ASSET_URL / Chromium source as a fatal Playwright bootstrap error. Added a browserless fetch fallback in convex/websiteEnrichmentAction.ts: when no Chromium source is configured, the action records a warning, fetches homepage/relevant static subpages directly with bounded response reads, extracts metadata/links/contact candidates via the existing website-crawler helpers, persists websiteCrawlPages/websiteCrawlLinks/websiteEmailCandidates/websiteTechnicalChecks with screenshots=[], patches the lead, queues PageSpeed, and finishes website_enrichment as succeeded if direct crawl succeeds. Existing Playwright path remains available when Chromium is configured. Regression source tests now cover the no-Chromium branch and browserless persistence. Verification: pnpm test -- tests/website-enrichment-action.test.ts passed; pnpm exec tsc -p convex/tsconfig.json --pretty false passed; pnpm exec tsc -p tsconfig.json --pretty false passed; pnpm test passed (368/368); pnpm lint passed with 2 existing generated BetterAuth warnings; git diff --check passed.

Final verification after robustness cleanup: pnpm test -- tests/website-enrichment-action.test.ts passed (392/392 in focused harness); pnpm exec tsc -p convex/tsconfig.json --pretty false passed; pnpm exec tsc -p tsconfig.json --pretty false passed; git diff --check passed; pnpm test passed (368/368); pnpm lint passed with the same two generated BetterAuth unused-disable warnings and 0 errors.