pitchfast/backlog/tasks/task-9 - Integrate-PageSpeed-Insights-into-internal-audits.md

id: TASK-9
title: Integrate PageSpeed Insights into internal audits
status: Done
assignee: (unassigned)
created_date: 2026-06-03 19:13
updated_date: 2026-06-04 20:12
labels: mvp, audit, pagespeed
dependencies: TASK-8
references: PRD.md
priority: medium
ordinal: 9000

Description

Add Google PageSpeed Insights as an objective internal audit signal. The system should run mobile and desktop checks when possible, store raw results internally, and expose only plain-language implications to the later LLM/text pipeline rather than customer-facing scores.

Acceptance Criteria

  • #1 PageSpeed API runs for mobile and desktop strategies for qualified website leads
  • #2 Raw PageSpeed/Lighthouse response data is stored internally in Convex
  • #3 Key metrics are normalized for downstream analysis without exposing scores on customer audit pages
  • #4 Failures, quota errors, and unavailable pages are recorded without failing the entire audit pipeline
  • #5 Generated audit inputs translate technical signals into customer-impact language for later text generation

Implementation Plan

  1. Worker A: write RED/GREEN pure PageSpeed client and normalization tests plus implementation in lib/pagespeed-insights.ts.
  2. Worker B: write RED/GREEN Convex schema and persistence contract tests plus pageSpeed mutation module.
  3. Worker C: write RED/GREEN PageSpeed action queue/process source tests plus Node action implementation.
  4. Worker D: write RED/GREEN audit input/public-safety tests plus internal plain-language audit input helper.
  5. Orchestrator: run integration verification, resolve conflicts via agents, update acceptance criteria, and leave TASK-9 open for user confirmation.

Implementation Notes

2026-06-04: Implementation started on branch codex-task-9-pagespeed-insights. Wave 1 dispatched with gpt-5.3-codex-spark: Worker A owns lib/pagespeed-insights.ts + tests/pagespeed-insights.test.ts; Worker B owns Convex PageSpeed schema/persistence contracts in convex/schema.ts, convex/domain.ts, convex/pageSpeed.ts, and related tests. Orchestrator remains coordination/review only.

2026-06-04T19:40Z: Implemented and validated lib/pagespeed-insights.ts with request URL builder, normalizer, fetch helper, and error classifier. Added tests/pagespeed-insights.test.ts with RED→GREEN coverage (URL contract, normalization, error classification, injected fetch, offline-only assertions).
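The request URL builder mentioned above could look roughly like the following. This is a minimal sketch, not the contents of lib/pagespeed-insights.ts: the function name, the parameter order, and the choice to request only the performance category are assumptions. The endpoint and the `url`, `strategy`, `category`, and `key` query parameters are those of the public PageSpeed Insights v5 API.

```typescript
// Hypothetical sketch of a pure URL builder; names and defaults are assumptions.
type PageSpeedStrategy = "mobile" | "desktop";

const PAGESPEED_ENDPOINT =
  "https://www.googleapis.com/pagespeedonline/v5/runPagespeed";

function buildPageSpeedUrl(
  targetUrl: string,
  strategy: PageSpeedStrategy,
  apiKey?: string,
): string {
  const url = new URL(PAGESPEED_ENDPOINT);
  // URLSearchParams handles percent-encoding of the audited URL.
  url.searchParams.set("url", targetUrl);
  url.searchParams.set("strategy", strategy);
  // Assumption: only the performance category is requested.
  url.searchParams.set("category", "performance");
  // The key is optional so that offline tests can omit it.
  if (apiKey) url.searchParams.set("key", apiKey);
  return url.toString();
}
```

Because the builder is pure, the URL contract can be asserted offline without touching the network, which matches the offline-only test approach noted above.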

Wave 1 complete: Worker A delivered pure PageSpeed URL/client/normalizer tests and implementation; Worker B delivered Convex pageSpeedResults schema and internal persistence queue/start/persist/finish module. Worker A concern noted: targeted pnpm test args are incompatible with project script, but isolated compiled test and tsconfig.test passed.

2026-06-04T19:50Z: Worker D taking subtask for TDD implementation of lib/pagespeed-audit-input.ts and tests/pagespeed-audit-input.test.ts (score-free German customer-implication generator).
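To illustrate what a score-free implication generator might do, here is a minimal sketch. The metric field names and the sentence wording are hypothetical, and the real helper is described as producing German text, so the English strings below are stand-ins. The thresholds are the published Core Web Vitals "poor" boundaries (LCP above 4 s, CLS above 0.25).

```typescript
// Hypothetical shapes and wording; not the project's actual helper.
interface CoreMetrics {
  largestContentfulPaintMs: number | null;
  cumulativeLayoutShift: number | null;
}

function toCustomerImplications(metrics: CoreMetrics): string[] {
  const implications: string[] = [];
  const lcp = metrics.largestContentfulPaintMs;
  const cls = metrics.cumulativeLayoutShift;

  // Core Web Vitals "poor" boundary for LCP is 4000 ms.
  if (lcp !== null && lcp > 4000) {
    implications.push(
      "The main content takes noticeably long to appear, so visitors may leave before they see the offer.",
    );
  }
  // Core Web Vitals "poor" boundary for CLS is 0.25.
  if (cls !== null && cls > 0.25) {
    implications.push(
      "Page elements shift while loading, which can cause mis-taps and a less trustworthy impression.",
    );
  }
  // Deliberately no scores, numbers, or metric names in the output.
  return implications;
}
```

Keeping numbers and metric names out of the returned strings is what makes the output safe to pass to later text generation without leaking scores.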

Implementation complete pending user confirmation. Built PageSpeed API client/normalizer, Convex pageSpeedResults raw-storage persistence, internal audit run queue/action for mobile+desktop, post-website-enrichment scheduling, per-strategy error recording, raw payload size guard, score-free audit input translator, and public-output sanitization. Review findings addressed: malformed JSON 200 responses now fail as api_error; PageSpeed action has outer failure guard; oversized raw payloads fail per strategy; audit inputs strip URLs/markup/JSON/raw score artifacts. Final verification passed: pnpm test (155/155); pnpm exec tsc -p tsconfig.json --pretty false; pnpm lint (0 errors, existing generated BetterAuth warnings only); pnpm exec convex codegen --dry-run --typecheck enable (rerun outside sandbox after DNS ENOTFOUND). TASK-9 remains In Progress until user confirms manual acceptance.
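The raw payload size guard can be pictured with a sketch like the one below. The byte budget, the result shape, and the `payload_too_large` code are all assumptions for illustration; the notes above only state that oversized raw payloads fail per strategy.

```typescript
// Hypothetical guard; the budget and error code are assumed values.
const MAX_RAW_PAYLOAD_BYTES = 900_000;

type RawPayloadCheck =
  | { ok: true; bytes: number }
  | { ok: false; bytes: number; errorCode: "payload_too_large" };

function checkRawPayloadSize(
  raw: unknown,
  maxBytes: number = MAX_RAW_PAYLOAD_BYTES,
): RawPayloadCheck {
  // Measure the UTF-8 size of the serialized payload, not its character count.
  const bytes = new TextEncoder().encode(JSON.stringify(raw)).byteLength;
  return bytes <= maxBytes
    ? { ok: true, bytes }
    : { ok: false, bytes, errorCode: "payload_too_large" };
}
```

Returning a result value instead of throwing lets the caller record the failure for one strategy and continue with the other.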

2026-06-04: Follow-up from manual test: PageSpeed failed with the generic api_error summary "PageSpeed-API lieferte einen Fehler." ("The PageSpeed API returned an error."). Root cause at the diagnostics layer: the HTTP 4xx/5xx classifier discarded Google's error.message, error_message, and runtimeError details. Added RED/GREEN regression coverage; Google API error messages are now preserved in PageSpeedError summaries. Verification passed: pnpm test (156/156); pnpm exec tsc -p tsconfig.json --pretty false; pnpm lint (0 errors, existing generated BetterAuth warnings only).
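A sketch of how a classifier might keep the upstream message rather than falling back to the generic summary is shown below. The function name and lookup order are assumptions, as is the location of `runtimeError` under `lighthouseResult`; only the three field names and the generic summary string come from the note above.

```typescript
// Hypothetical helper; field locations and precedence are assumptions.
const GENERIC_SUMMARY = "PageSpeed-API lieferte einen Fehler.";

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

function nonEmptyString(value: unknown): string | undefined {
  return typeof value === "string" && value.trim() !== ""
    ? value.trim()
    : undefined;
}

function summarizePageSpeedError(body: unknown): string {
  // Non-JSON or non-object bodies carry no usable detail.
  if (!isRecord(body)) return GENERIC_SUMMARY;

  const error = isRecord(body.error) ? body.error : undefined;
  const lighthouse = isRecord(body.lighthouseResult)
    ? body.lighthouseResult
    : undefined;
  const runtimeError =
    lighthouse && isRecord(lighthouse.runtimeError)
      ? lighthouse.runtimeError
      : undefined;

  // Prefer the most specific message available, then fall back.
  return (
    nonEmptyString(error?.message) ??
    nonEmptyString(body.error_message) ??
    nonEmptyString(runtimeError?.message) ??
    GENERIC_SUMMARY
  );
}
```

Treating the body as `unknown` and narrowing step by step keeps the helper safe against malformed responses, which matters because an error body is not guaranteed to follow any one shape.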

2026-06-04: Manual test follow-up: after API key renewal, PageSpeed failed with timeout. Root cause: convex/pageSpeedAction.ts used hardcoded 10_000ms timeout, too short for PageSpeed Insights. Added PAGESPEED_TIMEOUT_MS env support with 60_000ms default and 10_000-120_000 clamp; fetchPageSpeedResult now receives resolved timeout. Updated README/.env.example. Verification passed: pnpm test (159/159); pnpm exec tsc -p tsconfig.json --pretty false; pnpm lint (0 errors, existing generated BetterAuth warnings only).
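The timeout resolution described above (60_000 ms default, clamped to 10_000 to 120_000 ms) can be sketched as follows. The function name and the handling of non-numeric input are assumptions; only the env variable name, the default, and the clamp bounds come from the note.

```typescript
// Hypothetical resolver; only the default and clamp bounds come from the task notes.
const DEFAULT_TIMEOUT_MS = 60_000;
const MIN_TIMEOUT_MS = 10_000;
const MAX_TIMEOUT_MS = 120_000;

function resolvePageSpeedTimeoutMs(raw: string | undefined): number {
  // Missing or blank values use the default.
  if (raw === undefined || raw.trim() === "") return DEFAULT_TIMEOUT_MS;

  const parsed = Number(raw);
  // Assumption: non-numeric values also fall back to the default.
  if (!Number.isFinite(parsed)) return DEFAULT_TIMEOUT_MS;

  // Clamp into the supported range.
  return Math.min(MAX_TIMEOUT_MS, Math.max(MIN_TIMEOUT_MS, Math.trunc(parsed)));
}
```

In the action this would presumably be called as `resolvePageSpeedTimeoutMs(process.env.PAGESPEED_TIMEOUT_MS)` and the result passed on to fetchPageSpeedResult.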

2026-06-04: Systematic debugging follow-up for recurring PageSpeed timeout/unknown reports. Root cause after increasing timeout was not another HTTP timeout: PageSpeed responses reached the action, but Convex rejected the success payload because normalized included extra normalizer-only fields (strategy, sourceUrl, finalUrl, analysisTimestamp) not allowed by the persistPageSpeedResult validator. Spark Worker A added toPersistedPageSpeedNormalizedResult in convex/pageSpeedAction.ts and success persistence now stores only scores, metrics, opportunities, and implications, with finalUrl kept top-level. Spark Reviewer B confirmed the mapping matches convex/pageSpeed.ts and convex/schema.ts. Verification passed: pnpm test (160/160), pnpm exec tsc -p tsconfig.json --pretty false, pnpm lint (0 errors, two pre-existing generated BetterAuth warnings), pnpm exec convex dev --once --typecheck enable. Real dev retry for lead jx7cnezm2xg7b2xr2gfmqyeg5h881m2d on run j972t5ra323rgax4a7ycsbrtzd881m8n confirmed desktop persisted successfully with raw storage and normalized keys [implications, metrics, opportunities, scores]; mobile failed separately with a genuine Google/Lighthouse api_error: "Lighthouse returned error: Something went wrong." A subsequent clean rerun could not start because the lead had been manually deleted during Convex cleanup. TASK-9 remains In Progress pending user manual acceptance.
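The validator mismatch can be illustrated with a sketch of the mapping step. The function name toPersistedPageSpeedNormalizedResult and the eight field names come from the note above; the field types and the function body are assumptions, not the code in convex/pageSpeedAction.ts.

```typescript
// Field types are assumed; only the field names come from the task notes.
interface NormalizedPageSpeedResult {
  strategy: "mobile" | "desktop";
  sourceUrl: string;
  finalUrl: string;
  analysisTimestamp: string;
  scores: Record<string, number | null>;
  metrics: Record<string, number | null>;
  opportunities: string[];
  implications: string[];
}

type PersistedNormalizedResult = Pick<
  NormalizedPageSpeedResult,
  "scores" | "metrics" | "opportunities" | "implications"
>;

function toPersistedPageSpeedNormalizedResult(
  normalized: NormalizedPageSpeedResult,
): PersistedNormalizedResult {
  // Build the object explicitly instead of spreading, so normalizer-only
  // fields can never leak into a payload checked by a strict validator.
  return {
    scores: normalized.scores,
    metrics: normalized.metrics,
    opportunities: normalized.opportunities,
    implications: normalized.implications,
  };
}
```

An explicit allow-list like this fails safe: if the normalizer later gains another field, the persisted shape stays unchanged until the validator and the mapping are updated together.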

2026-06-04: Manual retest update from user: mobile PageSpeed produced one transient api_error ("Lighthouse returned error: Something went wrong.") and succeeded three times. This confirms the previous timeout/unknown/validator failure is no longer recurring; remaining failure mode is an intermittent Google/Lighthouse strategy-level error that is recorded without breaking the pipeline. TASK-9 remains In Progress until explicit user confirmation to close.

2026-06-04: Follow-up opened from user manual testing: PageSpeed should still be triggered when website enrichment fails but the lead has a website URL. Initial trace shows current queueLeadPageSpeedAudit call is only in the successful enrichment path after persistence; fatal failure paths finish/patch the website enrichment run without queueing PageSpeed. Keeping TASK-9 In Progress.

Started a minimal PageSpeed queueing fix for the processLeadEnrichment failure paths, targeting the invalid-URL guard and the outer catch so that PageSpeed is queued before return while the existing success-path queue/warn semantics stay unchanged.

Implemented the fix for the missing failure-path queueing in processLeadEnrichment: added a queue-with-warning fallback in the !rootUrl branch and in the fatal outer catch when started exists; success-path queue behavior is unchanged. Verified with: pnpm exec tsc -p tsconfig.test.json && pnpm exec node --test .test-output/tests/website-enrichment-action.test.js (20 pass).

2026-06-04: Follow-up fixed after manual finding that PageSpeed was not triggered when website enrichment failed. Added RED regression tests for both failure paths in tests/website-enrichment-action.test.ts: invalid URL failure and fatal catch path must queue internal.pageSpeed.queueLeadPageSpeedAudit with leadId started.lead._id and parentRunId runId before returning. Spark GREEN worker updated convex/websiteEnrichmentAction.ts so invalid-url and fatal failure paths queue PageSpeed with warning-safe handling; success path remains queued before success finish. Refactor pass restored guard-style structure and fixed test helper source parameter usage. Verification passed: pnpm test (162/162), pnpm exec tsc -p tsconfig.json --pretty false, pnpm lint (0 errors, two pre-existing generated BetterAuth warnings), pnpm exec convex dev --once --typecheck enable. TASK-9 remains In Progress pending manual acceptance.

Final Summary

Integrated Google PageSpeed Insights into the internal audit pipeline. Added mobile and desktop PageSpeed queue/action processing, raw Convex storage, normalized metrics, score-free customer-impact audit inputs, resilient per-strategy failure recording, API diagnostics, configurable timeout, and follow-up fixes from manual testing: persisted normalized payload shape now matches Convex validators and PageSpeed is triggered even when website enrichment fails for a lead with a website URL. Verified with pnpm test, TypeScript, lint, Convex dev deploy, and user manual retests.