Integrate PageSpeed Insights audits

This commit is contained in:
2026-06-04 22:12:59 +02:00
parent 99d61ac736
commit f0a948aec9
19 changed files with 3755 additions and 12 deletions

View File

@@ -1,9 +1,10 @@
---
id: TASK-9
title: Integrate PageSpeed Insights into internal audits
status: To Do
status: Done
assignee: []
created_date: '2026-06-03 19:13'
updated_date: '2026-06-04 20:12'
labels:
- mvp
- audit
@@ -24,19 +25,55 @@ Add Google PageSpeed Insights as an objective internal audit signal. The system
## Acceptance Criteria
<!-- AC:BEGIN -->
- [ ] #1 PageSpeed API runs for mobile and desktop strategies for qualified website leads
- [ ] #2 Raw PageSpeed/Lighthouse response data is stored internally in Convex
- [ ] #3 Key metrics are normalized for downstream analysis without exposing scores on customer audit pages
- [ ] #4 Failures, quota errors, and unavailable pages are recorded without failing the entire audit pipeline
- [ ] #5 Generated audit inputs translate technical signals into customer-impact language for later text generation
- [x] #1 PageSpeed API runs for mobile and desktop strategies for qualified website leads
- [x] #2 Raw PageSpeed/Lighthouse response data is stored internally in Convex
- [x] #3 Key metrics are normalized for downstream analysis without exposing scores on customer audit pages
- [x] #4 Failures, quota errors, and unavailable pages are recorded without failing the entire audit pipeline
- [x] #5 Generated audit inputs translate technical signals into customer-impact language for later text generation
<!-- AC:END -->
## Implementation Plan
<!-- SECTION:PLAN:BEGIN -->
1. Add PageSpeed API client using environment/Convex secrets.
2. Run mobile and desktop analysis for the lead domain or final URL.
3. Normalize key findings such as load speed, mobile/desktop gap, SEO, accessibility, and best-practice hints.
4. Store raw and normalized results in Convex.
5. Add error handling and dashboard-visible status for quota, timeout, and API failures.
1. Worker A: write RED/GREEN pure PageSpeed client and normalization tests plus implementation in lib/pagespeed-insights.ts.
2. Worker B: write RED/GREEN Convex schema and persistence contract tests plus pageSpeed mutation module.
3. Worker C: write RED/GREEN PageSpeed action queue/process source tests plus Node action implementation.
4. Worker D: write RED/GREEN audit input/public-safety tests plus internal plain-language audit input helper.
5. Orchestrator: run integration verification, resolve conflicts via agents, update acceptance criteria, and leave TASK-9 open for user confirmation.
<!-- SECTION:PLAN:END -->
## Implementation Notes
<!-- SECTION:NOTES:BEGIN -->
2026-06-04: Implementation started on branch codex-task-9-pagespeed-insights. Wave 1 dispatched with gpt-5.3-codex-spark: Worker A owns lib/pagespeed-insights.ts + tests/pagespeed-insights.test.ts; Worker B owns Convex PageSpeed schema/persistence contracts in convex/schema.ts, convex/domain.ts, convex/pageSpeed.ts, and related tests. Orchestrator remains coordination/review only.
2026-06-04T19:40Z: Implemented and validated lib/pagespeed-insights.ts with request URL builder, normalizer, fetch helper, and error classifier. Added tests/pagespeed-insights.test.ts with RED→GREEN coverage (URL contract, normalization, error classification, injected fetch, offline-only assertions).
Wave 1 complete: Worker A delivered pure PageSpeed URL/client/normalizer tests and implementation; Worker B delivered Convex pageSpeedResults schema and internal persistence queue/start/persist/finish module. Worker A concern noted: targeted pnpm test args are incompatible with project script, but isolated compiled test and tsconfig.test passed.
2026-06-04T19:50Z: Worker D taking subtask for TDD implementation of lib/pagespeed-audit-input.ts and tests/pagespeed-audit-input.test.ts (score-free German customer-implication generator).
Implementation complete pending user confirmation. Built PageSpeed API client/normalizer, Convex pageSpeedResults raw-storage persistence, internal audit run queue/action for mobile+desktop, post-website-enrichment scheduling, per-strategy error recording, raw payload size guard, score-free audit input translator, and public-output sanitization. Review findings addressed: malformed JSON 200 responses now fail as api_error; PageSpeed action has outer failure guard; oversized raw payloads fail per strategy; audit inputs strip URLs/markup/JSON/raw score artifacts. Final verification passed: pnpm test (155/155); pnpm exec tsc -p tsconfig.json --pretty false; pnpm lint (0 errors, existing generated BetterAuth warnings only); pnpm exec convex codegen --dry-run --typecheck enable (rerun outside sandbox after DNS ENOTFOUND). TASK-9 remains In Progress until user confirms manual acceptance.
2026-06-04: Follow-up from manual test: PageSpeed failed with generic api_error summary "PageSpeed-API lieferte einen Fehler." Root cause at diagnostics layer: HTTP 4xx/5xx classifier discarded Google error.message/error_message/runtimeError details. Added RED/GREEN regression coverage and now preserves Google API error messages in PageSpeedError summaries. Verification passed: pnpm test (156/156); pnpm exec tsc -p tsconfig.json --pretty false; pnpm lint (0 errors, existing generated BetterAuth warnings only).
2026-06-04: Manual test follow-up: after API key renewal, PageSpeed failed with timeout. Root cause: convex/pageSpeedAction.ts used hardcoded 10_000ms timeout, too short for PageSpeed Insights. Added PAGESPEED_TIMEOUT_MS env support with 60_000ms default and 10_000-120_000 clamp; fetchPageSpeedResult now receives resolved timeout. Updated README/.env.example. Verification passed: pnpm test (159/159); pnpm exec tsc -p tsconfig.json --pretty false; pnpm lint (0 errors, existing generated BetterAuth warnings only).
2026-06-04: Systematic debugging follow-up for recurring PageSpeed timeout/unknown reports. Root cause after increasing timeout was not another HTTP timeout: PageSpeed responses reached the action, but Convex rejected the success payload because `normalized` included extra normalizer-only fields (`strategy`, `sourceUrl`, `finalUrl`, `analysisTimestamp`) not allowed by the persistPageSpeedResult validator. Spark Worker A added `toPersistedPageSpeedNormalizedResult` in convex/pageSpeedAction.ts and success persistence now stores only `scores`, `metrics`, `opportunities`, and `implications`, with `finalUrl` kept top-level. Spark Reviewer B confirmed the mapping matches convex/pageSpeed.ts and convex/schema.ts. Verification passed: pnpm test (160/160), pnpm exec tsc -p tsconfig.json --pretty false, pnpm lint (0 errors, two pre-existing generated BetterAuth warnings), pnpm exec convex dev --once --typecheck enable. Real dev retry for lead jx7cnezm2xg7b2xr2gfmqyeg5h881m2d on run j972t5ra323rgax4a7ycsbrtzd881m8n confirmed desktop persisted successfully with raw storage and normalized keys [implications, metrics, opportunities, scores]; mobile failed separately with a genuine Google/Lighthouse api_error: "Lighthouse returned error: Something went wrong." A subsequent clean rerun could not start because the lead had been manually deleted during Convex cleanup. TASK-9 remains In Progress pending user manual acceptance.
2026-06-04: Manual retest update from user: mobile PageSpeed produced one transient api_error ("Lighthouse returned error: Something went wrong.") and succeeded three times. This confirms the previous timeout/unknown/validator failure is no longer recurring; remaining failure mode is an intermittent Google/Lighthouse strategy-level error that is recorded without breaking the pipeline. TASK-9 remains In Progress until explicit user confirmation to close.
2026-06-04: Follow-up opened from user manual testing: PageSpeed should still be triggered when website enrichment fails but the lead has a website URL. Initial trace shows current queueLeadPageSpeedAudit call is only in the successful enrichment path after persistence; fatal failure paths finish/patch the website enrichment run without queueing PageSpeed. Keeping TASK-9 In Progress.
Started minimal PAGE-SPEED queueing fix for processLeadEnrichment failure paths; targeting invalid-URL guard + outer catch to queue PageSpeed before return and keep existing success queue/warn semantics.
Implemented PASS for processLeadEnrichment missing failure-path queueing: added queue+warning fallback in !rootUrl branch and fatal outer catch when started exists; kept success path queue behavior. Verified with: pnpm exec tsc -p tsconfig.test.json && pnpm exec node --test .test-output/tests/website-enrichment-action.test.js (20 pass).
2026-06-04: Follow-up fixed after manual finding that PageSpeed was not triggered when website enrichment failed. Added RED regression tests for both failure paths in tests/website-enrichment-action.test.ts: invalid URL failure and fatal catch path must queue internal.pageSpeed.queueLeadPageSpeedAudit with leadId started.lead._id and parentRunId runId before returning. Spark GREEN worker updated convex/websiteEnrichmentAction.ts so invalid-url and fatal failure paths queue PageSpeed with warning-safe handling; success path remains queued before success finish. Refactor pass restored guard-style structure and fixed test helper source parameter usage. Verification passed: pnpm test (162/162), pnpm exec tsc -p tsconfig.json --pretty false, pnpm lint (0 errors, two pre-existing generated BetterAuth warnings), pnpm exec convex dev --once --typecheck enable. TASK-9 remains In Progress pending manual acceptance.
<!-- SECTION:NOTES:END -->
## Final Summary
<!-- SECTION:FINAL_SUMMARY:BEGIN -->
Integrated Google PageSpeed Insights into the internal audit pipeline. Added mobile and desktop PageSpeed queue/action processing, raw Convex storage, normalized metrics, score-free customer-impact audit inputs, resilient per-strategy failure recording, API diagnostics, configurable timeout, and follow-up fixes from manual testing: persisted normalized payload shape now matches Convex validators and PageSpeed is triggered even when website enrichment fails for a lead with a website URL. Verified with pnpm test, TypeScript, lint, Convex dev deploy, and user manual retests.
<!-- SECTION:FINAL_SUMMARY:END -->