Compare commits

4 Commits

42 changed files with 7857 additions and 553 deletions

View File

@@ -2,6 +2,14 @@
APP_ENV=development
NEXT_PUBLIC_APP_URL=http://localhost:3000
# TASK-8 Playwright
TASK8_CRAWL_TIMEOUT_MS=60000
TASK8_CRAWL_MAX_PAGES=20
TASK8_BROWSER_ASSET_URL=
# Legacy aliases (optional fallback, prefer TASK8_BROWSER_ASSET_URL):
# TASK8_CHROMIUM_EXECUTABLE_URL=
# TASK8_CHROMIUM_EXECUTABLE=
# Convex
NEXT_PUBLIC_CONVEX_URL=
CONVEX_DEPLOYMENT=
@@ -12,6 +20,7 @@ BETTER_AUTH_SECRET=
GOOGLE_GEOCODING_API_KEY=
GOOGLE_PLACES_API_KEY=
PAGESPEED_API_KEY=
PAGESPEED_TIMEOUT_MS=60000
# OpenRouter
OPENROUTER_API_KEY=

View File

@@ -23,11 +23,12 @@ Copy `.env.example` to `.env.local` for local development. Keep real secrets out
- **App / Coolify:** `APP_ENV`, `NEXT_PUBLIC_APP_URL`
- **Convex:** `NEXT_PUBLIC_CONVEX_URL`, `NEXT_PUBLIC_CONVEX_SITE_URL`, `CONVEX_DEPLOYMENT`
- **Google:** `GOOGLE_GEOCODING_API_KEY`, `GOOGLE_PLACES_API_KEY`, `PAGESPEED_API_KEY`
- **Google / Task-9 PageSpeed:** `GOOGLE_GEOCODING_API_KEY`, `GOOGLE_PLACES_API_KEY`, `PAGESPEED_API_KEY`, `PAGESPEED_TIMEOUT_MS`
- **OpenRouter:** `OPENROUTER_API_KEY`
- **SMTP / Stalwart:** `SMTP_HOST`, `SMTP_PORT`, `SMTP_USER`, `SMTP_PASSWORD`, `SMTP_FROM`
- **Rybbit:** `RYBBIT_API_URL`, `RYBBIT_API_KEY`, `NEXT_PUBLIC_RYBBIT_SITE_ID`
- **Auth:** `BETTER_AUTH_SECRET`
- **TASK-8 enrichment:** `TASK8_BROWSER_ASSET_URL`
Only variables prefixed with `NEXT_PUBLIC_` are intended for browser exposure. All API keys, SMTP credentials, and server-only URLs must stay server-side.
@@ -48,3 +49,25 @@ Only variables prefixed with `NEXT_PUBLIC_` are intended for browser exposure. A
## Deployment Notes
Coolify should run `pnpm install`, `pnpm build`, and `pnpm start`. The current font setup uses `next/font/google`, so production builds need outbound access to Google Fonts unless fonts are later self-hosted.
TASK-8 enrichment uses `playwright-core` with `@sparticuz/chromium-min` in Convex. Local `npx playwright install` is a browser-testing helper only and does not affect the Convex runtime bundle.
TASK-8 requires a browser binary source URL configured on Convex. The preferred
variable is:
- `TASK8_BROWSER_ASSET_URL` (for example your self-hosted or CDN Chromium bundle URL if you do not rely on package defaults).
For backward compatibility, the action also supports:
- `TASK8_CHROMIUM_EXECUTABLE_URL`
- `TASK8_CHROMIUM_EXECUTABLE`
If none of these variables is set and no default is available in your environment, the enqueue action throws a clear configuration error at deployment/startup, so enrichment never silently falls back to a missing binary.
For TASK-8 deployment updates, run Convex restart/deploy after code changes:
- Local: `pnpm exec convex dev`
- Remote: `pnpm exec convex deploy`
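The variable lookup order described above can be sketched as follows. This is a minimal illustration, not the action's actual code: the helper name `resolveBrowserAssetUrl` and the error wording are assumptions; only the three variable names come from this README.

```typescript
// Hypothetical helper: the real action's function names are not shown in this diff.
type Env = Record<string, string | undefined>;

// Checked in order: preferred variable first, then the legacy aliases.
const BROWSER_ASSET_ENV_KEYS = [
  "TASK8_BROWSER_ASSET_URL",
  "TASK8_CHROMIUM_EXECUTABLE_URL",
  "TASK8_CHROMIUM_EXECUTABLE",
] as const;

function resolveBrowserAssetUrl(env: Env): string {
  for (const key of BROWSER_ASSET_ENV_KEYS) {
    const value = env[key]?.trim();
    if (value) return value; // first non-empty value wins
  }
  // No silent fallback: fail loudly so the misconfiguration is visible.
  throw new Error(
    `Missing browser asset source: set ${BROWSER_ASSET_ENV_KEYS[0]} on the Convex deployment.`,
  );
}
```

Passing the environment in as a plain object (rather than reading `process.env` inside the helper) keeps the lookup easy to test offline.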

View File

@@ -0,0 +1,51 @@
---
id: TASK-20
title: Convert campaigns and leads to compact cards
status: Done
assignee: []
created_date: '2026-06-04 15:01'
updated_date: '2026-06-04 15:10'
labels: []
dependencies: []
priority: high
ordinal: 22000
---
## Description
<!-- SECTION:DESCRIPTION:BEGIN -->
Update the dashboard campaign and lead review UI so campaigns render as individual cards and leads render as compact expandable cards while preserving existing Convex behavior.
<!-- SECTION:DESCRIPTION:END -->
## Acceptance Criteria
<!-- AC:BEGIN -->
- [x] #1 Campaigns page renders each campaign as its own responsive card instead of a desktop table.
- [x] #2 Leads page renders compact cards showing company/name, contact data, and priority while hiding review fields behind Mehr anzeigen.
- [x] #3 Expanded lead cards preserve all existing review fields and save/block actions.
- [x] #4 UI remains responsive without horizontal table overflow on desktop and mobile.
- [x] #5 Lint and test verification are run and results are documented.
<!-- AC:END -->
## Implementation Plan
<!-- SECTION:PLAN:BEGIN -->
1. Add/adjust tests or static checks that fail for table-based Campaigns/Leads layouts before production edits.
2. Convert CampaignsBoard from desktop table plus mobile cards to one responsive card list.
3. Convert LeadsReviewTable from table rows to compact expandable cards.
4. Run lint, tests, and browser/responsive verification.
5. Record verification notes in Backlog; wait for user confirmation before Done.
<!-- SECTION:PLAN:END -->
## Implementation Notes
<!-- SECTION:NOTES:BEGIN -->
Implemented via subagent-driven TDD. Campaigns and Leads converted from table layouts to compact cards. Added static layout regression tests for campaign cards and lead expandable cards. Verification: pnpm lint exits 0 with 2 pre-existing generated Better Auth warnings; pnpm test passes 107/107; pnpm build passes after rerun with network access for Google Fonts. Browser automation could launch only outside the sandbox, and authenticated dashboard routes redirected to /login in the fresh Playwright context, so final visual validation should be done in the existing logged-in browser session.
<!-- SECTION:NOTES:END -->
## Final Summary
<!-- SECTION:FINAL_SUMMARY:BEGIN -->
Campaigns now render as responsive cards on all breakpoints. Leads now render as compact expandable cards showing company/contact/priority by default and revealing review fields/actions through Mehr anzeigen. Added regression tests for both card layouts. Verified with pnpm lint, pnpm test, and pnpm build; browser automation only reached the login page because of the fresh unauthenticated context, while the user confirmed the authenticated UI manually.
<!-- SECTION:FINAL_SUMMARY:END -->

View File

@@ -0,0 +1,45 @@
---
id: TASK-21
title: Replace oversized Convex browser runtime dependency
status: In Progress
assignee: []
created_date: '2026-06-04 15:30'
updated_date: '2026-06-04 16:41'
labels: []
dependencies: []
priority: high
ordinal: 23000
---
## Description
<!-- SECTION:DESCRIPTION:BEGIN -->
Reduce Convex function module size by replacing @sparticuz/chromium with a minimal serverless Chromium strategy for websiteEnrichmentAction while keeping screenshot/crawl functionality.
<!-- SECTION:DESCRIPTION:END -->
## Acceptance Criteria
<!-- AC:BEGIN -->
- [x] #1 Action no longer imports @sparticuz/chromium
- [x] #2 Convex external package list reflects the replacement
- [x] #3 Deployment guidance includes required env var and failure mode for missing browser URL
<!-- AC:END -->
## Implementation Plan
<!-- SECTION:PLAN:BEGIN -->
1. Verify existing oversized browser dependency path in Convex action and env strategy
2. Replace @sparticuz/chromium with chromium-min + runtime executable source env var
3. Validate by TS/typecheck
<!-- SECTION:PLAN:END -->
## Implementation Notes
<!-- SECTION:NOTES:BEGIN -->
Done: swapped the dependency to @sparticuz/chromium-min and switched convex/websiteEnrichmentAction.ts to a runtime executableSource read from ENV. Updated convex.json ExternalPackages to chromium-min. Added a configured failure path for a missing Chromium variable.
Final verification passed after switching to @sparticuz/chromium-min with TASK8_BROWSER_ASSET_URL as primary runtime browser asset source. Convex codegen dry-run/typecheck now uploads functions successfully; previous ModulesTooLarge error is resolved.
Follow-up for repeated /tmp/chromium cannot execute binary file: Context7 confirmed chromium-min remote pack usage; local package code reuses existing /tmp/chromium. Added marker-based /tmp cache invalidation keyed by TASK8_BROWSER_ASSET_URL so architecture/source changes remove stale /tmp/chromium and /tmp/chromium-pack before executablePath(). Verification passed: pnpm exec tsc -p tsconfig.json; pnpm test (108/108); pnpm lint (existing generated BetterAuth warnings only); pnpm exec convex codegen --dry-run --typecheck enable.
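A minimal sketch of the marker-based invalidation described above, assuming a marker file that stores the last-used asset URL. The function name, the marker file name, and the `tmpDir` parameter are hypothetical; only the `chromium` and `chromium-pack` entries come from the note.

```typescript
import { existsSync, mkdirSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical helper: removes cached browser files when the asset URL changed.
// Returns true when the cache was invalidated, false when it was kept.
function invalidateStaleBrowserCache(tmpDir: string, assetUrl: string): boolean {
  const markerPath = join(tmpDir, "chromium-source.marker"); // assumed file name
  const previous = existsSync(markerPath) ? readFileSync(markerPath, "utf8") : null;
  if (previous === assetUrl) return false; // cache was produced from the same source

  // Unknown or different source: drop the cached binary and the extracted pack.
  for (const name of ["chromium", "chromium-pack"]) {
    rmSync(join(tmpDir, name), { recursive: true, force: true });
  }
  mkdirSync(tmpDir, { recursive: true });
  writeFileSync(markerPath, assetUrl, "utf8");
  return true;
}
```

In this sketch a missing marker is treated like a changed URL, so a cache of unknown origin is also discarded. The call would have to happen before the executable path is resolved, as the note states.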
Follow-up for libnspr4.so runtime error: Context7 and local @sparticuz/chromium-min docs show remote pack includes al2023.tar.br, but package only auto-inflates it when AL2023 detection fires. Convex needs those shared libraries but is not detected as AL2023. Added explicit AL2023 shared-library preparation after executablePath(): inflate CHROMIUM_PACK_PATH/al2023.tar.br and setupLambdaEnvironment(/tmp/al2023/lib) before Playwright launch. Verification passed: pnpm exec tsc -p tsconfig.json; pnpm test (109/109); pnpm lint (existing generated BetterAuth warnings only); pnpm exec convex codegen --dry-run --typecheck enable.
<!-- SECTION:NOTES:END -->

View File

@@ -0,0 +1,40 @@
---
id: TASK-22
title: Add source assertions for Convex AL2023 Chromium lib setup
status: In Progress
assignee: []
created_date: '2026-06-04 16:37'
updated_date: '2026-06-04 16:41'
labels: []
dependencies: []
priority: high
ordinal: 24000
---
## Description
<!-- SECTION:DESCRIPTION:BEGIN -->
Add tests that fail until websiteEnrichmentAction explicitly handles AL2023 shared libs for chromium-min packaging in Convex.
<!-- SECTION:DESCRIPTION:END -->
## Acceptance Criteria
<!-- AC:BEGIN -->
- [x] #1 Test asserts chromium-min dynamic import exposes inflate/setupLambdaEnvironment or explicit LD_LIBRARY_PATH handling for /tmp/al2023/lib.
- [x] #2 Assertion checks that runtime setup runs before Playwright launch and after executablePath resolution.
<!-- AC:END -->
## Implementation Plan
<!-- SECTION:PLAN:BEGIN -->
1. Add source assertions for AL2023 runtime setup and launch ordering
2. Run focused website-enrichment action test
3. Confirm failing output and report
<!-- SECTION:PLAN:END -->
## Implementation Notes
<!-- SECTION:NOTES:BEGIN -->
Added source-only assertion in tests/website-enrichment-action.test.ts for AL2023 lib setup. Targeted run `pnpm tsc -p tsconfig.test.json && node --test .test-output/tests/website-enrichment-action.test.js` currently fails as expected on current action source (missing setup/LD_LIBRARY_PATH/al2023 archive handling).
GREEN follow-up completed: runtime action now exposes chromium-min inflate/setupLambdaEnvironment, prepares /tmp/al2023/lib after executablePath resolution and before Playwright launch, and focused/full verification passes.
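A source-only ordering assertion of the kind described above could look roughly like this. It is a sketch under assumptions: `appearsInOrder`, `prepareAl2023Libs`, and the sample source text are illustrative and are not taken from tests/website-enrichment-action.test.ts.

```typescript
// Returns true when every marker occurs in the source, in the given order.
function appearsInOrder(source: string, markers: string[]): boolean {
  let cursor = 0;
  for (const marker of markers) {
    const index = source.indexOf(marker, cursor);
    if (index === -1) return false; // marker missing, or only present too early
    cursor = index + marker.length;
  }
  return true;
}

// Illustrative source text only; not the real action code.
const sampleActionSource = `
  const executablePath = await chromium.executablePath(assetUrl);
  await prepareAl2023Libs();
  const browser = await playwright.launch({ executablePath });
`;

const setupRunsBetween = appearsInOrder(sampleActionSource, [
  "executablePath(",
  "prepareAl2023Libs(",
  ".launch(",
]);
```

A real test would read the action file from disk and pass its text as `source`. String-order checks are coarse, so they catch a missing or misplaced setup call but do not prove runtime behavior.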
<!-- SECTION:NOTES:END -->

View File

@@ -0,0 +1,35 @@
---
id: TASK-23
title: Improve website email extraction
status: In Progress
assignee: []
created_date: '2026-06-04 17:28'
updated_date: '2026-06-04 17:34'
labels: []
dependencies: []
priority: high
ordinal: 25000
---
## Description
<!-- SECTION:DESCRIPTION:BEGIN -->
Fix TASK-8 website enrichment so Playwright crawls contact/imprint/footer email patterns that are visible on crawled pages but currently missed by the extractor.
<!-- SECTION:DESCRIPTION:END -->
## Acceptance Criteria
<!-- AC:BEGIN -->
- [x] #1 Extract mailto href emails even with query parameters and labels
- [x] #2 Extract common obfuscated German website email patterns such as [at], (at), at, and spaced @/dot forms
- [x] #3 Treat emails found on Kontakt/Impressum pages or footer contact context as business contact candidates without guessing addresses
- [x] #4 Keep TASK-7 rules intact: no generated emails, named emails require explicit business context
- [x] #5 Verify with focused RED/GREEN tests and full suite
<!-- AC:END -->
## Implementation Notes
<!-- SECTION:NOTES:BEGIN -->
Updated website-crawler extractor to support mailto query stripping/decoding, HTML entity decoding for email separators, obfuscated [at]/(at)/dot/punkt and spaced @/dot forms, and expanded business-context detection for footer/impressum/contact regions. Limited to lib/website-crawler.ts only.
Implemented via subagents/TDD: added RED tests for mailto query params, obfuscated email forms, footer/impressum usability, no-guessing false-positive guard, and mailto dedupe. Extractor now decodes common HTML entities, strips/decodes mailto query strings, parses [at]/(at)/punkt/dot/spaced forms with guardrails, expands footer/impressum/contact business context, and leaves TASK-7 selection unchanged. Verification passed: pnpm exec tsc -p tsconfig.json; pnpm test (114/114); pnpm lint (existing generated BetterAuth warnings only); pnpm exec convex codegen --dry-run --typecheck enable.
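The mailto and obfuscation handling described above can be illustrated with a small sketch. It assumes the input is an already isolated candidate string, not free page text; the function names and the exact patterns are assumptions and do not reproduce lib/website-crawler.ts.

```typescript
// Deliberately strict shape check: exactly one "@", a dotted domain, no spaces.
const EMAIL_PATTERN = /^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$/;

// Strips the mailto: scheme and any ?subject=... query, then percent-decodes.
function emailFromMailto(href: string): string | null {
  if (!/^mailto:/i.test(href)) return null;
  const rest = href.slice("mailto:".length);
  const queryStart = rest.indexOf("?");
  const raw = queryStart === -1 ? rest : rest.slice(0, queryStart);
  let decoded: string;
  try {
    decoded = decodeURIComponent(raw);
  } catch {
    return null; // malformed percent-encoding: do not guess
  }
  const email = decoded.trim().toLowerCase();
  return EMAIL_PATTERN.test(email) ? email : null;
}

// Rewrites common obfuscations, then accepts the result only if it is a valid address.
function deobfuscateEmail(candidate: string): string | null {
  const email = candidate
    .trim()
    .toLowerCase()
    .replace(/\s*(?:\[at\]|\(at\))\s*/g, "@")
    .replace(/\s*(?:\[(?:dot|punkt)\]|\((?:dot|punkt)\))\s*/g, ".")
    .replace(/\s*@\s*/g, "@")
    .replace(/\s+at\s+/g, "@")
    .replace(/\s+(?:dot|punkt)\s+/g, ".")
    .replace(/\s*\.\s*/g, ".");
  return EMAIL_PATTERN.test(email) ? email : null;
}
```

The final pattern check acts as the guardrail: ordinary sentences that contain " at " are rewritten but then rejected, so nothing is generated that was not published on the page.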
<!-- SECTION:NOTES:END -->

View File

@@ -0,0 +1,50 @@
---
id: TASK-24
title: Improve crawler handling for Bock Rechtsanwaelte edge cases
status: In Progress
assignee: []
created_date: '2026-06-04 18:04'
updated_date: '2026-06-04 18:09'
labels: []
dependencies: []
priority: high
ordinal: 26000
---
## Description
<!-- SECTION:DESCRIPTION:BEGIN -->
Investigate the remaining TASK-8 case where bock-rechtsanwaelte.de/impressum contains a visible email but website enrichment misses it, and address the same-domain timeout separately if reproducible.
<!-- SECTION:DESCRIPTION:END -->
## Acceptance Criteria
<!-- AC:BEGIN -->
- [x] #1 Reproduce the missing email against the public impressum page or captured HTML
- [x] #2 Add RED tests for the missed email/link pattern
- [x] #3 Keep no-guessing email rules intact
- [ ] #4 Add focused timeout mitigation only if root cause is identified
- [x] #5 Verify focused tests and full suite
<!-- AC:END -->
## Implementation Plan
<!-- SECTION:PLAN:BEGIN -->
1. Inspect existing website crawler tests
2. Add failing regression tests for Bock Impressum
3. Keep no-context named-email rejection test unchanged
4. Run focused crawler test and confirm RED
<!-- SECTION:PLAN:END -->
## Implementation Notes
<!-- SECTION:NOTES:BEGIN -->
Working on adding focused RED tests for Bock Rechtsanwaelte email extraction failure; limiting changes to tests/website-crawler.test.ts
Added 2 RED coverage tests in tests/website-crawler.test.ts. Focused run of .test-output/tests/website-crawler.test.js fails on 2 assertions as expected: the Bock Impressum candidate is reported with business context false, and the email-labeled mailto yields a contactPerson equal to the email string.
Running minimal fix for Bock Impressum email context/labeling in lib/website-crawler.ts. Next: implement anchor-indexing fix and email-label guard, then run focused tests.
Minimal scoped fix applied in lib/website-crawler.ts: mailto business-context now evaluates against raw input using anchor indices, and email-like labels matching normalized email do not become contactPerson. Verified via focused command: pnpm exec tsc -p tsconfig.test.json && node --test .test-output/tests/website-crawler.test.js (19/19 passing).
Reproduced Bock Impressum against captured public HTML. Extractor found 5 candidates but all were business=false because mailto anchor offsets from original HTML were checked against normalized HTML; TASK-7 therefore returned null. Added RED tests for Bock-like Impressum mailto context and email-label contactPerson behavior. Fixed mailto path to evaluate business context against original input offsets and suppress contactPerson when anchor label is the email itself. Verified captured real HTML now returns usable chemnitz@bock-rechtsanwaelte.de. Full verification passed: pnpm exec tsc -p tsconfig.json; pnpm test (116/116); pnpm lint (existing generated BetterAuth warnings only); pnpm exec convex codegen --dry-run --typecheck enable. Timeout mitigation not changed yet because timeout root cause is not identified.
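The offset bug described above can be reproduced in miniature. The sketch below is illustrative only: the context keywords, the window radius, and both function names are assumptions, not the extractor's real implementation.

```typescript
// Assumed context keywords; the real extractor's list is not shown in this diff.
const BUSINESS_CONTEXT = /impressum|kontakt|e-mail|telefon/i;

// Looks for business-context words in a window around the anchor position.
// The index is only meaningful for the exact string it was computed on.
function hasBusinessContext(source: string, anchorIndex: number, radius = 80): boolean {
  const start = Math.max(0, anchorIndex - radius);
  return BUSINESS_CONTEXT.test(source.slice(start, anchorIndex + radius));
}

// An anchor label that merely repeats the address is not a person's name.
function contactPersonFromLabel(label: string, email: string): string | null {
  const trimmed = label.trim();
  if (trimmed === "" || trimmed.toLowerCase() === email.toLowerCase()) return null;
  return trimmed;
}
```

If the anchor index is taken from the raw HTML but the window is sliced from whitespace-normalized HTML, the window lands on unrelated text or past the end of the string, and every candidate comes back without business context. Passing the same string that produced the index avoids the mismatch.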
<!-- SECTION:NOTES:END -->

View File

@@ -1,10 +1,10 @@
---
id: TASK-8
title: Implement Playwright website crawling and screenshot capture
status: To Do
status: In Progress
assignee: []
created_date: '2026-06-03 19:13'
updated_date: '2026-06-04 14:08'
updated_date: '2026-06-04 18:09'
labels:
- mvp
- audit
@@ -25,32 +25,51 @@ Build the website inspection and contact-enrichment layer using Playwright. For
## Acceptance Criteria
<!-- AC:BEGIN -->
- [ ] #1 Playwright captures desktop and mobile screenshots for the homepage and stores them in Convex File Storage
- [ ] #2 Crawler visits a bounded set of relevant subpages: Kontakt, Impressum, Leistungen/Angebot, Über uns/Team when discoverable
- [ ] #3 Crawler extracts visible text, page title, meta description, headings, links, phone numbers, email candidates, email source URLs, contact-person context, and CTA/contact-form signals
- [ ] #4 Extracted email candidates are classified through the TASK-7 rules: generic business emails are preferred; named emails are accepted only when explicitly published as business contact addresses; no guessed addresses are generated
- [ ] #5 Leads discovered by Google Places with a website are automatically scheduled for contact enrichment before they remain in Kontakt fehlt; found usable email updates the lead contact fields and status while preserving phone and source data
- [ ] #6 Simple technical checks include HTTPS/final URL, missing title/meta description, visible contact path, and obvious broken internal links within the crawl limit
- [ ] #7 Crawler failures produce useful dashboard-visible errors without blocking unrelated leads
- [x] #1 Playwright captures desktop and mobile screenshots for the homepage and stores them in Convex File Storage
- [x] #2 Crawler visits a bounded set of relevant subpages: Kontakt, Impressum, Leistungen/Angebot, Über uns/Team when discoverable
- [x] #3 Crawler extracts visible text, page title, meta description, headings, links, phone numbers, email candidates, email source URLs, contact-person context, and CTA/contact-form signals
- [x] #4 Extracted email candidates are classified through the TASK-7 rules: generic business emails are preferred; named emails are accepted only when explicitly published as business contact addresses; no guessed addresses are generated
- [x] #5 Leads discovered by Google Places with a website are automatically scheduled for contact enrichment before they remain in Kontakt fehlt; found usable email updates the lead contact fields and status while preserving phone and source data
- [x] #6 Simple technical checks include HTTPS/final URL, missing title/meta description, visible contact path, and obvious broken internal links within the crawl limit
- [x] #7 Crawler failures produce useful dashboard-visible errors without blocking unrelated leads
<!-- AC:END -->
## Implementation Plan
<!-- SECTION:PLAN:BEGIN -->
1. Add Playwright runtime setup compatible with local development and Coolify container deployment.
2. Define crawl limits, viewports, timeout behavior, and allowed same-domain URL rules.
3. Capture homepage desktop/mobile screenshots and upload them to Convex storage.
4. Discover and inspect relevant subpages with bounded depth.
5. Extract visible text, metadata, links, phone numbers, email candidates, contact-person context, CTA/contact-form signals, and source URLs.
6. Normalize and score email candidates, then call the existing TASK-7 lead review/contact qualification path so usable emails update lead contact fields and unqualified named emails do not.
7. Add contact-enrichment run state and dashboard-visible run events/errors for leads that still need manual contact research.
8. Persist extracted raw evidence, technical checks, screenshots, and crawler errors in Convex.
1. Worker A: add pure crawler/extraction helpers with RED/GREEN tests.
2. Worker B: add Convex schema/run/storage persistence with RED/GREEN tests.
3. Worker C: wire lead-discovery scheduling/contact update flow with RED/GREEN tests.
4. Worker D: add dashboard-visible enrichment state/error UI with RED/GREEN tests where practical.
5. Orchestrator: run spec review, code-quality review, full verification, and update acceptance criteria without marking Done.
<!-- SECTION:PLAN:END -->
## Implementation Notes
<!-- SECTION:NOTES:BEGIN -->
Expanded TASK-8 to cover website-based contact enrichment because Google Places does not provide business email fields. This keeps email handling evidence-based and reuses TASK-7 qualification rules instead of guessing addresses.
Orchestration started on branch codex-task-8-playwright-enrichment. Parallel wave 1 dispatched with gpt-5.3-codex-spark: Worker A owns lib/website-crawler.ts + tests/website-crawler.test.ts; Worker B owns convex/schema.ts + schema tests; Worker C owns Playwright package/runtime docs. All workers instructed to use TDD or config verification and avoid unrelated changes.
Completed wave 1 foundations: Playwright runtime/docs approved; crawler helper spec+quality approved; Convex enrichment schema/run-type parity spec+quality approved. Wave 2 dispatched with gpt-5.3-codex-spark: Worker D owns convex/websiteEnrichment.ts action/persistence; Worker E owns lead-discovery scheduling integration. Orchestrator remains code-review/integration only.
2026-06-04: Worker D started implementing convex/websiteEnrichment.ts with unit/source tests for queue/process/persist enrichment flow and Playwright evidence capture.
2026-06-04: Added TASK-8 source tests for website-enrichment action queue/process/persistence contract and confirmed all assertions pass with existing implementation.
Worker G retry: moved website enrichment scheduling out of persistDiscoveredLeads into processCampaignRun (returns queue items), scoped startCampaignRun active checks to by_type_and_status campaign running, and added source assertions for this sequencing.
Implementation complete pending user confirmation. Built Playwright Chromium website enrichment with bounded crawl, desktop/mobile screenshot storage, raw evidence tables, TASK-7 email qualification reuse, post-discovery scheduling, technical checks, and dashboard-visible run events/errors. Final verification passed: pnpm exec tsc -p tsconfig.json; pnpm test (105/105); pnpm lint (0 errors, existing generated BetterAuth warnings only); pnpm exec convex codegen --dry-run --typecheck enable.
2026-06-04: Updated source tests/README/.env for TASK-8 browser-runtime strategy migration to @sparticuz/chromium-min and TASK8_BROWSER_ASSET_URL deployment expectations.
Resolved Convex Playwright runtime follow-up: local npx playwright install only populates the developer machine cache, not Convex runtime. Full playwright was replaced with playwright-core + @sparticuz/chromium-min and a required TASK8_BROWSER_ASSET_URL source so Convex no longer relies on /home/sbx_user ms-playwright cache. Verification passed: pnpm exec tsc -p tsconfig.json; pnpm test; pnpm lint (existing generated BetterAuth warnings only); pnpm exec convex codegen --dry-run --typecheck enable.
TASK-21 runtime cache fix applied to TASK-8 crawler action: stale @sparticuz/chromium-min /tmp cache is invalidated when browser asset source changes, addressing repeated /tmp/chromium cannot execute binary file after x64/arm64 URL changes.
TASK-8 crawler action now explicitly prepares @sparticuz/chromium-min AL2023 shared libraries for Convex to address /tmp/chromium libnspr4.so missing errors before screenshot/crawl launch.
TASK-23 extractor improvement applied: website enrichment now extracts published emails from mailto links with query params, common German obfuscations, HTML entities/spaced separators, and footer/impressum/contact contexts while preserving TASK-7 no-guessing rules.
TASK-24 Bock Rechtsanwaelte follow-up: mailto candidates on real Impressum HTML were found but incorrectly marked non-business due to an index mismatch in context detection. Fixed mailto business-context detection and email-label contactPerson suppression; captured Bock HTML now yields usable chemnitz@bock-rechtsanwaelte.de.
<!-- SECTION:NOTES:END -->

View File

@@ -1,9 +1,10 @@
---
id: TASK-9
title: Integrate PageSpeed Insights into internal audits
status: To Do
status: Done
assignee: []
created_date: '2026-06-03 19:13'
updated_date: '2026-06-04 20:12'
labels:
- mvp
- audit
@@ -24,19 +25,55 @@ Add Google PageSpeed Insights as an objective internal audit signal. The system
## Acceptance Criteria
<!-- AC:BEGIN -->
- [ ] #1 PageSpeed API runs for mobile and desktop strategies for qualified website leads
- [ ] #2 Raw PageSpeed/Lighthouse response data is stored internally in Convex
- [ ] #3 Key metrics are normalized for downstream analysis without exposing scores on customer audit pages
- [ ] #4 Failures, quota errors, and unavailable pages are recorded without failing the entire audit pipeline
- [ ] #5 Generated audit inputs translate technical signals into customer-impact language for later text generation
- [x] #1 PageSpeed API runs for mobile and desktop strategies for qualified website leads
- [x] #2 Raw PageSpeed/Lighthouse response data is stored internally in Convex
- [x] #3 Key metrics are normalized for downstream analysis without exposing scores on customer audit pages
- [x] #4 Failures, quota errors, and unavailable pages are recorded without failing the entire audit pipeline
- [x] #5 Generated audit inputs translate technical signals into customer-impact language for later text generation
<!-- AC:END -->
## Implementation Plan
<!-- SECTION:PLAN:BEGIN -->
1. Add PageSpeed API client using environment/Convex secrets.
2. Run mobile and desktop analysis for the lead domain or final URL.
3. Normalize key findings such as load speed, mobile/desktop gap, SEO, accessibility, and best-practice hints.
4. Store raw and normalized results in Convex.
5. Add error handling and dashboard-visible status for quota, timeout, and API failures.
1. Worker A: write RED/GREEN pure PageSpeed client and normalization tests plus implementation in lib/pagespeed-insights.ts.
2. Worker B: write RED/GREEN Convex schema and persistence contract tests plus pageSpeed mutation module.
3. Worker C: write RED/GREEN PageSpeed action queue/process source tests plus Node action implementation.
4. Worker D: write RED/GREEN audit input/public-safety tests plus internal plain-language audit input helper.
5. Orchestrator: run integration verification, resolve conflicts via agents, update acceptance criteria, and leave TASK-9 open for user confirmation.
<!-- SECTION:PLAN:END -->
## Implementation Notes
<!-- SECTION:NOTES:BEGIN -->
2026-06-04: Implementation started on branch codex-task-9-pagespeed-insights. Wave 1 dispatched with gpt-5.3-codex-spark: Worker A owns lib/pagespeed-insights.ts + tests/pagespeed-insights.test.ts; Worker B owns Convex PageSpeed schema/persistence contracts in convex/schema.ts, convex/domain.ts, convex/pageSpeed.ts, and related tests. Orchestrator remains coordination/review only.
2026-06-04T19:40Z: Implemented and validated lib/pagespeed-insights.ts with request URL builder, normalizer, fetch helper, and error classifier. Added tests/pagespeed-insights.test.ts with RED→GREEN coverage (URL contract, normalization, error classification, injected fetch, offline-only assertions).
Wave 1 complete: Worker A delivered pure PageSpeed URL/client/normalizer tests and implementation; Worker B delivered Convex pageSpeedResults schema and internal persistence queue/start/persist/finish module. Worker A concern noted: targeted pnpm test args are incompatible with project script, but isolated compiled test and tsconfig.test passed.
2026-06-04T19:50Z: Worker D taking subtask for TDD implementation of lib/pagespeed-audit-input.ts and tests/pagespeed-audit-input.test.ts (score-free German customer-implication generator).
Implementation complete pending user confirmation. Built PageSpeed API client/normalizer, Convex pageSpeedResults raw-storage persistence, internal audit run queue/action for mobile+desktop, post-website-enrichment scheduling, per-strategy error recording, raw payload size guard, score-free audit input translator, and public-output sanitization. Review findings addressed: malformed JSON 200 responses now fail as api_error; PageSpeed action has outer failure guard; oversized raw payloads fail per strategy; audit inputs strip URLs/markup/JSON/raw score artifacts. Final verification passed: pnpm test (155/155); pnpm exec tsc -p tsconfig.json --pretty false; pnpm lint (0 errors, existing generated BetterAuth warnings only); pnpm exec convex codegen --dry-run --typecheck enable (rerun outside sandbox after DNS ENOTFOUND). TASK-9 remains In Progress until user confirms manual acceptance.
2026-06-04: Follow-up from manual test: PageSpeed failed with generic api_error summary "PageSpeed-API lieferte einen Fehler." Root cause at diagnostics layer: HTTP 4xx/5xx classifier discarded Google error.message/error_message/runtimeError details. Added RED/GREEN regression coverage and now preserves Google API error messages in PageSpeedError summaries. Verification passed: pnpm test (156/156); pnpm exec tsc -p tsconfig.json --pretty false; pnpm lint (0 errors, existing generated BetterAuth warnings only).
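A sketch of how an error summary can keep the upstream message instead of discarding it. It assumes the standard Google API error envelope (`{ error: { message } }`) plus a top-level `error_message` field; the function name and summary wording are hypothetical, and `runtimeError` handling is left out because its location in the payload is not shown here.

```typescript
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Builds a failure summary that preserves the API's own message when present.
function describePageSpeedFailure(status: number, body: unknown): string {
  const fallback = `PageSpeed API returned HTTP ${status}.`;
  if (!isRecord(body)) return fallback;

  const errorField = body["error"];
  const nested = isRecord(errorField) ? errorField["message"] : undefined;
  const detail = typeof nested === "string" ? nested : body["error_message"];

  return typeof detail === "string" && detail.trim() !== ""
    ? `${fallback} ${detail.trim()}`
    : fallback;
}
```

Treating the body as `unknown` and narrowing step by step means a malformed or empty error payload still produces the generic summary rather than throwing.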
2026-06-04: Manual test follow-up: after API key renewal, PageSpeed failed with timeout. Root cause: convex/pageSpeedAction.ts used hardcoded 10_000ms timeout, too short for PageSpeed Insights. Added PAGESPEED_TIMEOUT_MS env support with 60_000ms default and 10_000-120_000 clamp; fetchPageSpeedResult now receives resolved timeout. Updated README/.env.example. Verification passed: pnpm test (159/159); pnpm exec tsc -p tsconfig.json --pretty false; pnpm lint (0 errors, existing generated BetterAuth warnings only).
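The default-plus-clamp behavior described above can be sketched like this. The constant values come from the note; the function name and the rounding are assumptions.

```typescript
const DEFAULT_TIMEOUT_MS = 60_000;
const MIN_TIMEOUT_MS = 10_000;
const MAX_TIMEOUT_MS = 120_000;

// Resolves PAGESPEED_TIMEOUT_MS: default when unset or invalid, otherwise clamped.
function resolvePageSpeedTimeoutMs(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_TIMEOUT_MS;
  const parsed = Number(raw);
  if (!Number.isFinite(parsed)) return DEFAULT_TIMEOUT_MS;
  return Math.min(MAX_TIMEOUT_MS, Math.max(MIN_TIMEOUT_MS, Math.round(parsed)));
}
```

For example, `resolvePageSpeedTimeoutMs("5000")` yields the lower bound of 10000, and an unset variable yields the 60000 default.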
2026-06-04: Systematic debugging follow-up for recurring PageSpeed timeout/unknown reports. Root cause after increasing timeout was not another HTTP timeout: PageSpeed responses reached the action, but Convex rejected the success payload because `normalized` included extra normalizer-only fields (`strategy`, `sourceUrl`, `finalUrl`, `analysisTimestamp`) not allowed by the persistPageSpeedResult validator. Spark Worker A added `toPersistedPageSpeedNormalizedResult` in convex/pageSpeedAction.ts and success persistence now stores only `scores`, `metrics`, `opportunities`, and `implications`, with `finalUrl` kept top-level. Spark Reviewer B confirmed the mapping matches convex/pageSpeed.ts and convex/schema.ts. Verification passed: pnpm test (160/160), pnpm exec tsc -p tsconfig.json --pretty false, pnpm lint (0 errors, two pre-existing generated BetterAuth warnings), pnpm exec convex dev --once --typecheck enable. Real dev retry for lead jx7cnezm2xg7b2xr2gfmqyeg5h881m2d on run j972t5ra323rgax4a7ycsbrtzd881m8n confirmed desktop persisted successfully with raw storage and normalized keys [implications, metrics, opportunities, scores]; mobile failed separately with a genuine Google/Lighthouse api_error: "Lighthouse returned error: Something went wrong." A subsequent clean rerun could not start because the lead had been manually deleted during Convex cleanup. TASK-9 remains In Progress pending user manual acceptance.
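The mapping described above amounts to picking only the validator-approved keys before persisting. The sketch below reuses the name `toPersistedPageSpeedNormalizedResult` from the note, but the field types are assumptions; the real shapes live in convex/pageSpeed.ts and convex/schema.ts and are not shown in this diff.

```typescript
// Assumed field types, for illustration only.
interface NormalizedPageSpeedResult {
  strategy: "mobile" | "desktop";
  sourceUrl: string;
  finalUrl: string;
  analysisTimestamp: string;
  scores: Record<string, number | null>;
  metrics: Record<string, number | null>;
  opportunities: string[];
  implications: string[];
}

type PersistedNormalizedResult = Pick<
  NormalizedPageSpeedResult,
  "scores" | "metrics" | "opportunities" | "implications"
>;

// Drops normalizer-only fields so a strict validator accepts the payload.
function toPersistedPageSpeedNormalizedResult(
  result: NormalizedPageSpeedResult,
): PersistedNormalizedResult {
  const { scores, metrics, opportunities, implications } = result;
  return { scores, metrics, opportunities, implications };
}
```

Listing the kept keys explicitly, rather than deleting the unwanted ones, means a field added to the normalizer later cannot leak into the persisted payload by accident.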
2026-06-04: Manual retest update from user: mobile PageSpeed produced one transient api_error ("Lighthouse returned error: Something went wrong.") and succeeded three times. This confirms the previous timeout/unknown/validator failure is no longer recurring; remaining failure mode is an intermittent Google/Lighthouse strategy-level error that is recorded without breaking the pipeline. TASK-9 remains In Progress until explicit user confirmation to close.
2026-06-04: Follow-up opened from user manual testing: PageSpeed should still be triggered when website enrichment fails but the lead has a website URL. Initial trace shows current queueLeadPageSpeedAudit call is only in the successful enrichment path after persistence; fatal failure paths finish/patch the website enrichment run without queueing PageSpeed. Keeping TASK-9 In Progress.
Started minimal PageSpeed queueing fix for processLeadEnrichment failure paths; targeting invalid-URL guard + outer catch to queue PageSpeed before return and keep existing success queue/warn semantics.
Implemented the GREEN fix for the missing failure-path queueing in processLeadEnrichment: added queue+warning fallback in the !rootUrl branch and in the fatal outer catch when started exists; kept success path queue behavior. Verified with: pnpm exec tsc -p tsconfig.test.json && pnpm exec node --test .test-output/tests/website-enrichment-action.test.js (20 pass).
2026-06-04: Follow-up fixed after manual finding that PageSpeed was not triggered when website enrichment failed. Added RED regression tests for both failure paths in tests/website-enrichment-action.test.ts: invalid URL failure and fatal catch path must queue internal.pageSpeed.queueLeadPageSpeedAudit with leadId started.lead._id and parentRunId runId before returning. Spark GREEN worker updated convex/websiteEnrichmentAction.ts so invalid-url and fatal failure paths queue PageSpeed with warning-safe handling; success path remains queued before success finish. Refactor pass restored guard-style structure and fixed test helper source parameter usage. Verification passed: pnpm test (162/162), pnpm exec tsc -p tsconfig.json --pretty false, pnpm lint (0 errors, two pre-existing generated BetterAuth warnings), pnpm exec convex dev --once --typecheck enable. TASK-9 remains In Progress pending manual acceptance.
<!-- SECTION:NOTES:END -->
## Final Summary
<!-- SECTION:FINAL_SUMMARY:BEGIN -->
Integrated Google PageSpeed Insights into the internal audit pipeline. Added mobile and desktop PageSpeed queue/action processing, raw Convex storage, normalized metrics, score-free customer-impact audit inputs, resilient per-strategy failure recording, API diagnostics, configurable timeout, and follow-up fixes from manual testing: persisted normalized payload shape now matches Convex validators and PageSpeed is triggered even when website enrichment fails for a lead with a website URL. Verified with pnpm test, TypeScript, lint, Convex dev deploy, and user manual retests.
<!-- SECTION:FINAL_SUMMARY:END -->
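
The failure-path queueing described in the notes above can be sketched roughly as follows. This is a minimal illustration, not the code in convex/websiteEnrichmentAction.ts: the `Deps` callbacks, `queuePageSpeedSafely`, and `processLeadEnrichmentSketch` are hypothetical names, and the sketch is synchronous for brevity, whereas the real Convex action would presumably await `ctx.runMutation(internal.pageSpeed.queueLeadPageSpeedAudit, ...)`.

```typescript
// Hypothetical, simplified shapes; the real Convex types are not part of this sketch.
type QueueArgs = { leadId: string; parentRunId: string };

type Deps = {
  // Stands in for queueing internal.pageSpeed.queueLeadPageSpeedAudit.
  queuePageSpeed: (args: QueueArgs) => void;
  // Stands in for recording a warning on the enrichment run.
  warn: (message: string) => void;
  // Stands in for finishing/patching the website enrichment run.
  finishRun: (status: "succeeded" | "failed", summary?: string) => void;
};

// "Warning-safe" queueing: a queueing failure is downgraded to a warning
// so it can never change the outcome of the enrichment run itself.
function queuePageSpeedSafely(deps: Deps, args: QueueArgs): void {
  try {
    deps.queuePageSpeed(args);
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    deps.warn(`PageSpeed could not be queued: ${reason}`);
  }
}

function processLeadEnrichmentSketch(
  deps: Deps,
  lead: { id: string; rootUrl: string | null },
  runId: string,
  enrich: (rootUrl: string) => void,
): "succeeded" | "failed" {
  const args: QueueArgs = { leadId: lead.id, parentRunId: runId };

  // Failure path 1: invalid-URL guard. Queue PageSpeed before returning.
  if (!lead.rootUrl) {
    queuePageSpeedSafely(deps, args);
    deps.finishRun("failed", "invalid URL");
    return "failed";
  }

  try {
    enrich(lead.rootUrl);
  } catch (error) {
    // Failure path 2: fatal outer catch. Queue PageSpeed before returning.
    const reason = error instanceof Error ? error.message : String(error);
    queuePageSpeedSafely(deps, args);
    deps.finishRun("failed", reason);
    return "failed";
  }

  // Success path: PageSpeed is queued before the run is finished as succeeded.
  queuePageSpeedSafely(deps, args);
  deps.finishRun("succeeded");
  return "succeeded";
}
```

Routing all three paths through one guarded helper keeps the guard-style structure readable and means a failing queue call only ever produces a warning.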


@@ -267,187 +267,81 @@ export function CampaignsBoard() {
</CardHeader>
</Card>
) : (
<>
<div className="hidden overflow-x-auto rounded-lg border bg-card md:block">
<table className="w-full min-w-[820px] border-separate border-spacing-0">
<thead>
<tr className="text-left text-sm text-muted-foreground">
<th className="sticky left-0 bg-card p-3 font-normal">Kampagne</th>
<th className="p-3 font-normal">PLZ / Radius</th>
<th className="p-3 font-normal">Cadence</th>
<th className="p-3 font-normal">Limits</th>
<th className="p-3 font-normal">Status</th>
<th className="p-3 font-normal">Lauf</th>
<th className="p-3 font-normal">Aktionen</th>
</tr>
</thead>
<tbody>
{campaignsSorted.map((campaign) => (
<tr
className="border-t"
key={campaign._id}
<div className="grid gap-3">
{campaignsSorted.map((campaign) => (
<Card key={campaign._id}>
<CardHeader>
<div className="flex flex-wrap items-start justify-between gap-2">
<div className="min-w-0">
<CardTitle className="truncate">{campaign.name}</CardTitle>
<CardDescription className="truncate">
{formatNiche(campaign)}
</CardDescription>
</div>
<Badge
variant={campaign.status === "active" ? "default" : "secondary"}
>
<td className="max-w-[220px] p-3 align-top">
<div className="space-y-1">
<p className="truncate font-medium">{campaign.name}</p>
<p className="text-sm text-muted-foreground">
{formatNiche(campaign)}
</p>
</div>
</td>
{campaign.status === "active" ? "Aktiv" : "Pausiert"}
</Badge>
</div>
</CardHeader>
<td className="max-w-[180px] p-3 align-top">
<div className="space-y-1 text-sm text-muted-foreground">
<p className="inline-flex items-center gap-1">
<MapPin className="size-3" />
<span>{campaign.postalCode}</span>
</p>
<p>{campaign.radiusKm} km Umkreis</p>
</div>
</td>
<td className="p-3 align-top">
<span className="rounded-md bg-muted px-2 py-1 text-sm">
{recurrenceLabel[campaign.recurrence]}
</span>
</td>
<td className="p-3 align-top">
<p className="text-sm">
Leads: {campaign.maxNewLeadsPerRun} · Audits:{" "}
{campaign.maxAuditsPerRun}
</p>
</td>
<td className="p-3 align-top">
<Badge
variant={campaign.status === "active" ? "default" : "secondary"}
>
{campaign.status === "active" ? "Aktiv" : "Pausiert"}
</Badge>
</td>
<td className="p-3 align-top">
<div className="space-y-1 text-sm text-muted-foreground">
<p>Letzter Lauf: {formatDateTime(campaign.lastRunAt)}</p>
<p>Nächster Lauf: {formatDateTime(campaign.nextRunAt)}</p>
<p>Run-Status: {statusLabel[campaign.currentRunStatus] ?? campaign.currentRunStatus}</p>
</div>
</td>
<td className="p-3 align-top">
<div className="flex flex-wrap gap-2">
<Button
className="w-full sm:w-auto"
variant="outline"
onClick={() => openEditDialog(campaign)}
disabled={actionBusyId === campaign._id}
>
<Pencil className="size-4" />
Bearbeiten
</Button>
<Button
className="w-full sm:w-auto"
variant="outline"
onClick={() => toggleCampaign(campaign)}
disabled={actionBusyId === campaign._id}
>
<RefreshCcw className="size-4" />
{campaign.status === "active" ? "Pausieren" : "Fortfahren"}
</Button>
<Button
className="w-full sm:w-auto"
onClick={() => runCampaign(campaign)}
disabled={actionBusyId === campaign._id}
>
<Play className="size-4" />
Jetzt ausführen
</Button>
</div>
</td>
</tr>
))}
</tbody>
</table>
</div>
<div className="grid gap-3 md:hidden">
{campaignsSorted.map((campaign) => (
<Card key={campaign._id}>
<CardHeader>
<div className="flex flex-wrap items-start justify-between gap-2">
<div className="min-w-0">
<CardTitle className="truncate">{campaign.name}</CardTitle>
<CardDescription className="truncate">
{formatNiche(campaign)}
</CardDescription>
</div>
<Badge
variant={campaign.status === "active" ? "default" : "secondary"}
>
{campaign.status === "active" ? "Aktiv" : "Pausiert"}
</Badge>
<CardContent className="grid gap-2 text-sm">
<div className="flex flex-wrap items-center justify-between gap-3">
<div className="inline-flex items-center gap-1 text-muted-foreground">
<MapPin className="size-3" />
<span>{campaign.postalCode}</span>
</div>
</CardHeader>
<span>{campaign.radiusKm} km</span>
</div>
<Separator className="bg-border" />
<div>
<p>Cadence: {recurrenceLabel[campaign.recurrence]}</p>
<p>
Limits: L {campaign.maxNewLeadsPerRun}, A{" "}
{campaign.maxAuditsPerRun}
</p>
</div>
<div>
<p className="text-muted-foreground">Letzter Lauf: {formatDateTime(campaign.lastRunAt)}</p>
<p className="text-muted-foreground">Nächster Lauf: {formatDateTime(campaign.nextRunAt)}</p>
<p className="text-muted-foreground">
Run-Status: {statusLabel[campaign.currentRunStatus] ?? campaign.currentRunStatus}
</p>
</div>
<CardContent className="grid gap-2 text-sm">
<div className="flex flex-wrap items-center justify-between gap-3">
<div className="inline-flex items-center gap-1 text-muted-foreground">
<MapPin className="size-3" />
<span>{campaign.postalCode}</span>
</div>
<span>{campaign.radiusKm} km</span>
</div>
<Separator className="bg-border" />
<div>
<p>Cadence: {recurrenceLabel[campaign.recurrence]}</p>
<p>
Limits: L {campaign.maxNewLeadsPerRun}, A{" "}
{campaign.maxAuditsPerRun}
</p>
</div>
<div>
<p className="text-muted-foreground">Letzter Lauf: {formatDateTime(campaign.lastRunAt)}</p>
<p className="text-muted-foreground">Nächster Lauf: {formatDateTime(campaign.nextRunAt)}</p>
<p className="text-muted-foreground">
Run-Status: {statusLabel[campaign.currentRunStatus] ?? campaign.currentRunStatus}
</p>
</div>
<div className="grid gap-2">
<Button
variant="outline"
onClick={() => openEditDialog(campaign)}
disabled={actionBusyId === campaign._id}
className="w-full justify-start"
>
<Pencil className="size-4" />
Bearbeiten
</Button>
<Button
variant="outline"
onClick={() => toggleCampaign(campaign)}
disabled={actionBusyId === campaign._id}
className="w-full justify-start"
>
<RefreshCcw className="size-4" />
{campaign.status === "active" ? "Pausieren" : "Fortfahren"}
</Button>
<Button
onClick={() => runCampaign(campaign)}
disabled={actionBusyId === campaign._id}
className="w-full justify-start"
>
<Play className="size-4" />
Jetzt ausführen
</Button>
</div>
</CardContent>
</Card>
))}
</div>
</>
<div className="grid gap-2">
<Button
variant="outline"
onClick={() => openEditDialog(campaign)}
disabled={actionBusyId === campaign._id}
className="w-full justify-start"
>
<Pencil className="size-4" />
Bearbeiten
</Button>
<Button
variant="outline"
onClick={() => toggleCampaign(campaign)}
disabled={actionBusyId === campaign._id}
className="w-full justify-start"
>
<RefreshCcw className="size-4" />
{campaign.status === "active" ? "Pausieren" : "Fortfahren"}
</Button>
<Button
onClick={() => runCampaign(campaign)}
disabled={actionBusyId === campaign._id}
className="w-full justify-start"
>
<Play className="size-4" />
Jetzt ausführen
</Button>
</div>
</CardContent>
</Card>
))}
</div>
)}
</section>
);


@@ -22,7 +22,7 @@ import {
type LeadBlacklistStatus,
} from "@/lib/dashboard-model";
import { Button } from "@/components/ui/button";
import { Card } from "@/components/ui/card";
import { Card, CardHeader } from "@/components/ui/card";
import { Input } from "@/components/ui/input";
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from "@/components/ui/select";
import { Badge } from "@/components/ui/badge";
@@ -148,59 +148,23 @@ export function LeadsReviewTable() {
<h1 className="text-2xl font-semibold tracking-normal">Leads prüfen</h1>
</div>
<div className="mx-auto w-full max-w-7xl">
<Card className="overflow-hidden">
<div className="overflow-x-auto">
<div className="min-w-[1150px]">
<table className="w-full border-separate border-spacing-0 text-sm">
<thead>
<tr className="text-left text-xs text-muted-foreground">
<th className="p-3 font-normal">Firma / Ort</th>
<th className="p-3 font-normal">Kontakt + Quelle</th>
<th className="p-3 font-normal">Priorität</th>
<th className="p-3 font-normal">Kontaktstatus</th>
<th className="p-3 font-normal">Qualität</th>
<th className="p-3 font-normal">Review-Felder</th>
<th className="p-3 font-normal">Aktionen</th>
</tr>
</thead>
{leads === undefined ? (
<tbody>
<tr>
<td className="p-3" colSpan={7}>
<p className="rounded-md bg-muted p-4 text-sm">
Leads werden geladen
</p>
</td>
</tr>
</tbody>
) : sortedLeads.length === 0 ? (
<tbody>
<tr>
<td className="p-3" colSpan={7}>
<p className="rounded-md border p-4 text-sm text-muted-foreground">
Keine Leads vorhanden. Bitte zuerst eine Kampagne starten
oder importieren.
</p>
</td>
</tr>
</tbody>
) : (
<tbody>
{sortedLeads.map((lead) => (
<LeadReviewRow
key={lead._id}
lead={lead}
onActionMessage={setActionMessage}
/>
))}
</tbody>
)}
</table>
</div>
</div>
</Card>
<div className="mx-auto grid w-full max-w-7xl gap-3">
{leads === undefined ? (
<p className="rounded-md bg-muted p-4 text-sm">Leads werden geladen</p>
) : sortedLeads.length === 0 ? (
<p className="rounded-md border p-4 text-sm text-muted-foreground">
Keine Leads vorhanden. Bitte zuerst eine Kampagne starten oder
importieren.
</p>
) : (
sortedLeads.map((lead) => (
<LeadReviewRow
key={lead._id}
lead={lead}
onActionMessage={setActionMessage}
/>
))
)}
</div>
{actionMessage ? (
@@ -219,6 +183,7 @@ function LeadReviewRow({
lead: LeadRow;
onActionMessage: (value: string) => void;
}) {
const [isExpanded, setIsExpanded] = useState(false);
const [draft, setDraft] = useState<LeadReviewDraft>(() => ({
priority: lead.priority,
contactStatus: lead.contactStatus,
@@ -313,264 +278,290 @@ function LeadReviewRow({
setDraft((current) => ({ ...current, [field]: value }));
};
const detailsId = `lead-review-details-${lead._id}`;
return (
<tr className="border-t">
<td className="max-w-[260px] p-3 align-top">
<p className="font-medium">{lead.companyName}</p>
<p className="mt-1 inline-flex items-center gap-1 truncate text-xs text-muted-foreground">
<Building2 className="size-3 shrink-0" />
<span className="truncate">{lead.niche ?? "Nische offen"}</span>
</p>
<p className="mt-2 inline-flex items-center gap-1 text-xs text-muted-foreground">
<MapPin className="size-3 shrink-0" />
<span>{location}</span>
</p>
{lead.address ? (
<p className="mt-1 max-w-full truncate text-xs text-muted-foreground">
{lead.address}
</p>
) : null}
</td>
<Card>
<CardHeader className="pb-3">
<div className="grid min-w-0 gap-2">
<div className="flex min-w-0 flex-wrap items-start justify-between gap-2">
<div className="min-w-0 flex-1">
<p className="max-w-full truncate font-medium">
{lead.companyName}
</p>
<p className="mt-1 inline-flex items-center gap-1 text-xs text-muted-foreground">
<Building2 className="size-3 shrink-0" />
<span className="inline-flex min-w-0 max-w-full break-words">
{lead.niche ?? "Nische offen"}
</span>
</p>
<p className="mt-2 inline-flex items-center gap-1 text-xs text-muted-foreground">
<MapPin className="size-3 shrink-0" />
<span className="inline-flex min-w-0 max-w-full truncate">
{location}
</span>
</p>
</div>
<td className="max-w-[260px] p-3 align-top">
<p className="inline-flex w-full items-start gap-1 text-sm">
<Mail className="mt-0.5 size-3 shrink-0" />
<span className="min-w-0 break-all">
{lead.email || "Keine E-Mail"}
</span>
</p>
{lead.phone ? (
<p className="mt-2 inline-flex w-full items-start gap-1 text-xs text-muted-foreground">
<Phone className="size-3 shrink-0" />
<span className="break-all">{lead.phone}</span>
</p>
) : null}
<p className="mt-2 text-xs text-muted-foreground">
Quelle: {contactSourceLabel(lead)}
</p>
{lead.websiteDomain ? (
<p className="mt-1 text-xs text-muted-foreground">
Domain: {lead.websiteDomain}
</p>
) : null}
</td>
<p
className={`inline-flex shrink-0 rounded-md border px-2 py-1 text-xs font-medium ${priorityBadgeClass(
draft.priority,
)}`}
>
{getLeadPriorityLabel(draft.priority)}
</p>
</div>
<td className="p-3 align-top">
<p
className={`inline-flex rounded-md border px-2 py-1 text-xs font-medium ${priorityBadgeClass(
draft.priority,
)}`}
<div className="grid min-w-0 gap-1 text-xs text-muted-foreground">
<p className="inline-flex min-w-0 items-center gap-1">
<Mail className="size-3 shrink-0" />
<span className="max-w-full min-w-0 break-all">
{lead.email || "Keine E-Mail"}
</span>
</p>
{lead.phone ? (
<p className="inline-flex min-w-0 items-center gap-1">
<Phone className="size-3 shrink-0" />
<span className="max-w-full min-w-0 break-all">{lead.phone}</span>
</p>
) : null}
<p className="truncate max-w-full">
Quelle: {contactSourceLabel(lead)}
</p>
{lead.websiteDomain ? (
<p className="truncate max-w-full">Domain: {lead.websiteDomain}</p>
) : null}
</div>
</div>
</CardHeader>
<div className="border-t p-4 pt-3">
<Button
type="button"
variant="outline"
onClick={() => setIsExpanded((previous) => !previous)}
aria-expanded={isExpanded}
aria-controls={detailsId}
size="sm"
>
{getLeadPriorityLabel(draft.priority)}
</p>
<div className="mt-2 max-w-[160px]">
<Select
value={draft.priority}
onValueChange={(nextPriority) =>
updateDraft("priority", nextPriority as LeadPriority)
}
>
<SelectTrigger>
<SelectValue placeholder="Priorität" />
</SelectTrigger>
<SelectContent>
{leadPriorityOptions.map((value) => (
<SelectItem value={value} key={value}>
{getLeadPriorityLabel(value)}
</SelectItem>
))}
</SelectContent>
</Select>
{isExpanded ? "Weniger anzeigen" : "Mehr anzeigen"}
</Button>
</div>
<div
id={detailsId}
className="grid gap-3 border-t p-4"
hidden={!isExpanded}
>
<div className="grid gap-3 xl:grid-cols-2">
<section className="grid gap-2">
<div>
<p className="text-xs text-muted-foreground">Priorität</p>
<div className="mt-2">
<Select
value={draft.priority}
onValueChange={(nextPriority) =>
updateDraft("priority", nextPriority as LeadPriority)
}
>
<SelectTrigger>
<SelectValue placeholder="Priorität" />
</SelectTrigger>
<SelectContent>
{leadPriorityOptions.map((value) => (
<SelectItem value={value} key={value}>
{getLeadPriorityLabel(value)}
</SelectItem>
))}
</SelectContent>
</Select>
</div>
</div>
<div>
<p className="text-xs text-muted-foreground">Kontaktstatus</p>
<div className="mt-2">
<Select
value={draft.contactStatus}
onValueChange={(nextStatus) =>
updateDraft("contactStatus", nextStatus as LeadContactStatus)
}
>
<SelectTrigger>
<SelectValue placeholder="Kontaktstatus" />
</SelectTrigger>
<SelectContent>
{leadContactStatusOptions.map((status) => (
<SelectItem value={status} key={status}>
{getLeadContactStatusLabel(status)}
</SelectItem>
))}
</SelectContent>
</Select>
</div>
</div>
</section>
<section className="grid gap-2">
<div>
<p className="text-xs text-muted-foreground">Prioritätsgrund</p>
<Input
value={draft.priorityReason}
onChange={(event) => {
updateDraft("priorityReason", event.target.value);
}}
/>
</div>
<div>
<p className="mt-2 text-xs text-muted-foreground">
Kontaktstatus-Notiz
</p>
<Input
value={draft.contactStatusReason}
onChange={(event) => {
updateDraft("contactStatusReason", event.target.value);
}}
/>
</div>
<div>
<p className="mt-2 text-xs text-muted-foreground">Notiz</p>
<Input
value={draft.notes}
onChange={(event) => {
updateDraft("notes", event.target.value);
}}
/>
</div>
<div className="mt-2 space-y-1 text-xs text-muted-foreground">
{reasonParts.length === 0 ? (
<p>Keine Zusatzhinweise</p>
) : (
reasonParts.map((reason) => <p key={reason}> {reason}</p>)
)}
</div>
</section>
<section className="grid gap-2">
<div>
<p className="text-xs text-muted-foreground">Review-E-Mail</p>
<Input
value={draft.reviewEmail}
onChange={(event) => {
updateDraft("reviewEmail", event.target.value);
}}
/>
</div>
<div>
<p className="mt-2 text-xs text-muted-foreground">Review-Quelle</p>
<Input
value={draft.reviewEmailSource}
onChange={(event) => {
updateDraft("reviewEmailSource", event.target.value);
}}
/>
</div>
<div>
<p className="mt-2 text-xs text-muted-foreground">Ansprechperson</p>
<Input
value={draft.reviewContactPerson}
onChange={(event) => {
updateDraft("reviewContactPerson", event.target.value);
}}
/>
</div>
<label className="mt-2 inline-flex items-center gap-2 text-xs text-muted-foreground">
<Switch
checked={draft.reviewIsBusinessContactAddress}
onCheckedChange={(checked) => {
updateDraft("reviewIsBusinessContactAddress", checked);
}}
/>
Genannte E-Mail als Business-Kontakt
</label>
</section>
<section className="grid gap-2">
<div>
<p className="text-xs text-muted-foreground">Duplikatstatus</p>
<div className="mt-2">
<Select
value={draft.duplicateStatus}
onValueChange={(nextStatus) =>
updateDraft("duplicateStatus", nextStatus as LeadDuplicateStatus)
}
>
<SelectTrigger>
<SelectValue placeholder="Duplikatstatus" />
</SelectTrigger>
<SelectContent>
{leadDuplicateStatusOptions.map((status) => (
<SelectItem value={status} key={status}>
{getLeadDuplicateStatusLabel(status)}
</SelectItem>
))}
</SelectContent>
</Select>
</div>
</div>
<div>
<label className="text-xs text-muted-foreground">Sperrstatus</label>
<div className="mt-2">
<Select
value={draft.blacklistStatus}
onValueChange={(nextStatus) =>
updateDraft("blacklistStatus", nextStatus as LeadBlacklistStatus)
}
>
<SelectTrigger>
<SelectValue placeholder="Sperrstatus" />
</SelectTrigger>
<SelectContent>
{leadBlacklistStatusOptions.map((status) => (
<SelectItem value={status} key={status}>
{getLeadBlacklistStatusLabel(status)}
</SelectItem>
))}
</SelectContent>
</Select>
</div>
</div>
<div className="mt-2 grid gap-2 sm:grid-cols-2">
<Badge
variant={duplicateBadgeVariant(draft.duplicateStatus)}
title={lead.duplicateReason ?? undefined}
>
{getLeadDuplicateStatusLabel(draft.duplicateStatus)}
</Badge>
<Badge
variant={lead.blacklistStatus === "blocked" ? "destructive" : "secondary"}
>
{getLeadBlacklistStatusLabel(lead.blacklistStatus)}
</Badge>
</div>
<div className="mt-2 grid gap-2 sm:grid-cols-2">
<Button onClick={saveRow} disabled={isSaving || isBlocking} size="sm">
<span>Speichern</span>
</Button>
<Button
variant="destructive"
onClick={blockLead}
disabled={isSaving || isBlocking}
size="sm"
>
<ShieldAlert className="size-4" />
Sperren
</Button>
</div>
{rowMessage ? (
<p className="text-xs text-muted-foreground">{rowMessage}</p>
) : null}
</section>
</div>
</td>
<td className="p-3 align-top">
<Badge variant="outline">
{getLeadContactStatusLabel(draft.contactStatus)}
</Badge>
<div className="mt-2 max-w-[180px]">
<Select
value={draft.contactStatus}
onValueChange={(nextStatus) =>
updateDraft("contactStatus", nextStatus as LeadContactStatus)
}
>
<SelectTrigger>
<SelectValue placeholder="Kontaktstatus" />
</SelectTrigger>
<SelectContent>
{leadContactStatusOptions.map((status) => (
<SelectItem value={status} key={status}>
{getLeadContactStatusLabel(status)}
</SelectItem>
))}
</SelectContent>
</Select>
</div>
</td>
<td className="max-w-[220px] p-3 align-top">
<div className="grid gap-2">
<p className="text-xs text-muted-foreground">Prioritätsgrund</p>
<Input
value={draft.priorityReason}
onChange={(event) => {
updateDraft("priorityReason", event.target.value);
}}
/>
</div>
<div className="mt-2 grid gap-2">
<p className="text-xs text-muted-foreground">Kontaktstatus-Notiz</p>
<Input
value={draft.contactStatusReason}
onChange={(event) => {
updateDraft("contactStatusReason", event.target.value);
}}
/>
</div>
<div className="mt-2 grid gap-2">
<p className="text-xs text-muted-foreground">Notiz</p>
<Input
value={draft.notes}
onChange={(event) => {
updateDraft("notes", event.target.value);
}}
/>
</div>
<div className="mt-3 flex flex-wrap gap-2">
<Badge
variant={duplicateBadgeVariant(draft.duplicateStatus)}
title={lead.duplicateReason ?? undefined}
>
{getLeadDuplicateStatusLabel(draft.duplicateStatus)}
</Badge>
<Badge
variant={lead.blacklistStatus === "blocked" ? "destructive" : "secondary"}
>
{getLeadBlacklistStatusLabel(lead.blacklistStatus)}
</Badge>
</div>
<div className="mt-2 space-y-1 text-xs text-muted-foreground">
{reasonParts.length === 0 ? (
<p>Keine Zusatzhinweise</p>
) : (
reasonParts.map((reason) => <p key={reason}> {reason}</p>)
)}
</div>
</td>
<td className="min-w-[260px] p-3 align-top">
<div className="grid gap-2">
<p className="text-xs text-muted-foreground">Review-E-Mail</p>
<Input
value={draft.reviewEmail}
onChange={(event) => {
updateDraft("reviewEmail", event.target.value);
}}
/>
</div>
<div className="mt-2 grid gap-2">
<p className="text-xs text-muted-foreground">Review-Quelle</p>
<Input
value={draft.reviewEmailSource}
onChange={(event) => {
updateDraft("reviewEmailSource", event.target.value);
}}
/>
</div>
<div className="mt-2 grid gap-2">
<p className="text-xs text-muted-foreground">Ansprechperson</p>
<Input
value={draft.reviewContactPerson}
onChange={(event) => {
updateDraft("reviewContactPerson", event.target.value);
}}
/>
</div>
<label className="mt-3 inline-flex items-center gap-2 text-xs text-muted-foreground">
<Switch
checked={draft.reviewIsBusinessContactAddress}
onCheckedChange={(checked) => {
updateDraft("reviewIsBusinessContactAddress", checked);
}}
/>
Genannte E-Mail als Business-Kontakt
</label>
<div className="mt-3 grid gap-2">
<p className="text-xs text-muted-foreground">Duplikatstatus</p>
<Select
value={draft.duplicateStatus}
onValueChange={(nextStatus) =>
updateDraft("duplicateStatus", nextStatus as LeadDuplicateStatus)
}
>
<SelectTrigger>
<SelectValue placeholder="Duplikatstatus" />
</SelectTrigger>
<SelectContent>
{leadDuplicateStatusOptions.map((status) => (
<SelectItem value={status} key={status}>
{getLeadDuplicateStatusLabel(status)}
</SelectItem>
))}
</SelectContent>
</Select>
</div>
<div className="mt-2">
<label className="text-xs text-muted-foreground">Sperrstatus</label>
<Select
value={draft.blacklistStatus}
onValueChange={(nextStatus) =>
updateDraft("blacklistStatus", nextStatus as LeadBlacklistStatus)
}
>
<SelectTrigger>
<SelectValue placeholder="Sperrstatus" />
</SelectTrigger>
<SelectContent>
{leadBlacklistStatusOptions.map((status) => (
<SelectItem value={status} key={status}>
{getLeadBlacklistStatusLabel(status)}
</SelectItem>
))}
</SelectContent>
</Select>
</div>
</td>
<td className="max-w-[170px] p-3 align-top">
<div className="grid gap-2">
<Button
onClick={saveRow}
disabled={isSaving || isBlocking}
size="sm"
>
<span>Speichern</span>
</Button>
<Button
variant="destructive"
onClick={blockLead}
disabled={isSaving || isBlocking}
size="sm"
>
<ShieldAlert className="size-4" />
Sperren
</Button>
</div>
{rowMessage ? (
<p className="mt-2 text-xs text-muted-foreground">{rowMessage}</p>
) : null}
</td>
</tr>
</div>
</Card>
);
}

convex.json (new file, 6 lines)

@@ -0,0 +1,6 @@
{
"$schema": "https://raw.githubusercontent.com/get-convex/convex-backend/main/npm-packages/convex/schemas/convex.schema.json",
"node": {
"externalPackages": ["playwright-core", "@sparticuz/chromium-min"]
}
}


@@ -8,6 +8,7 @@
* @module
*/
import type * as auditInputs from "../auditInputs.js";
import type * as audits from "../audits.js";
import type * as blacklist from "../blacklist.js";
import type * as campaigns from "../campaigns.js";
@@ -16,9 +17,13 @@ import type * as http from "../http.js";
import type * as leadDiscovery from "../leadDiscovery.js";
import type * as leads from "../leads.js";
import type * as outreach from "../outreach.js";
import type * as pageSpeed from "../pageSpeed.js";
import type * as pageSpeedAction from "../pageSpeedAction.js";
import type * as runs from "../runs.js";
import type * as settings from "../settings.js";
import type * as storage from "../storage.js";
import type * as websiteEnrichment from "../websiteEnrichment.js";
import type * as websiteEnrichmentAction from "../websiteEnrichmentAction.js";
import type {
ApiFromModules,
@@ -27,6 +32,7 @@ import type {
} from "convex/server";
declare const fullApi: ApiFromModules<{
auditInputs: typeof auditInputs;
audits: typeof audits;
blacklist: typeof blacklist;
campaigns: typeof campaigns;
@@ -35,9 +41,13 @@ declare const fullApi: ApiFromModules<{
leadDiscovery: typeof leadDiscovery;
leads: typeof leads;
outreach: typeof outreach;
pageSpeed: typeof pageSpeed;
pageSpeedAction: typeof pageSpeedAction;
runs: typeof runs;
settings: typeof settings;
storage: typeof storage;
websiteEnrichment: typeof websiteEnrichment;
websiteEnrichmentAction: typeof websiteEnrichmentAction;
}>;
/**

convex/auditInputs.ts (new file, 60 lines)

@@ -0,0 +1,60 @@
import { v } from "convex/values";
import type { Doc, Id } from "./_generated/dataModel";
import { internalQuery } from "./_generated/server";
import { buildPageSpeedAuditInputs, type PageSpeedMinimalAuditResult } from "../lib/pagespeed-audit-input";
function normalizePageSpeedResultRow(
row: Doc<"pageSpeedResults">,
): PageSpeedMinimalAuditResult {
return {
strategy: row.strategy,
status: row.status,
sourceUrl: row.sourceUrl,
...(row.finalUrl ? { finalUrl: row.finalUrl } : {}),
...(row.normalized ? { normalized: row.normalized } : {}),
...(row.errorType ? { errorType: row.errorType } : {}),
...(row.errorSummary ? { errorSummary: row.errorSummary } : {}),
};
}
export const getPageSpeedAuditInputs = internalQuery({
args: {
leadId: v.optional(v.id("leads")),
auditId: v.optional(v.id("audits")),
},
handler: async (
ctx,
args,
): Promise<{
technicalSignals: string[];
customerImplications: string[];
internalNotes: string[];
}> => {
let results: Doc<"pageSpeedResults">[];
if (args.auditId) {
results = await ctx.db
.query("pageSpeedResults")
.withIndex("by_auditId", (q) => q.eq("auditId", args.auditId as Id<"audits">))
.order("desc")
.take(50);
return buildPageSpeedAuditInputs(results.map(normalizePageSpeedResultRow));
}
if (args.leadId) {
results = await ctx.db
.query("pageSpeedResults")
.withIndex("by_leadId", (q) => q.eq("leadId", args.leadId as Id<"leads">))
.order("desc")
.take(50);
return buildPageSpeedAuditInputs(results.map(normalizePageSpeedResultRow));
}
return {
technicalSignals: [],
customerImplications: [],
internalNotes: [],
};
},
});


@@ -84,6 +84,7 @@ export const RUN_TYPES = [
"audit",
"outreach",
"lifecycle",
"website_enrichment",
] as const;
export const RUN_STATUSES = [
"pending",
@@ -94,6 +95,16 @@ export const RUN_STATUSES = [
] as const;
export const RUN_EVENT_LEVELS = ["info", "warning", "error"] as const;
export const SCREENSHOT_VIEWPORTS = ["desktop", "mobile"] as const;
export const PAGE_SPEED_STRATEGIES = ["mobile", "desktop"] as const;
export const PAGE_SPEED_RESULT_STATUSES = ["succeeded", "failed"] as const;
export const PAGE_SPEED_ERROR_TYPES = [
"quota",
"timeout",
"unavailable",
"invalid_url",
"api_error",
"unknown",
] as const;
export type CampaignStatus = (typeof CAMPAIGN_STATUSES)[number];
export type LeadPriority = (typeof LEAD_PRIORITIES)[number];
@@ -113,6 +124,9 @@ export type RunType = (typeof RUN_TYPES)[number];
export type RunStatus = (typeof RUN_STATUSES)[number];
export type RunEventLevel = (typeof RUN_EVENT_LEVELS)[number];
export type ScreenshotViewport = (typeof SCREENSHOT_VIEWPORTS)[number];
export type PageSpeedStrategy = (typeof PAGE_SPEED_STRATEGIES)[number];
export type PageSpeedResultStatus = (typeof PAGE_SPEED_RESULT_STATUSES)[number];
export type PageSpeedErrorType = (typeof PAGE_SPEED_ERROR_TYPES)[number];
export type SettingsRow = {
key: string;


@@ -17,6 +17,7 @@ import {
buildLeadDiscoveryLeadRecord,
buildLeadDiscoveryCounters,
getLeadDiscoveryPriority,
shouldScheduleWebsiteEnrichment,
} from "../lib/lead-discovery-run";
import { calculateNextRunAt } from "../lib/campaign-scheduling";
@@ -214,6 +215,11 @@ export const processCampaignRun = internalAction({
skippedDuplicates: number;
skippedBlacklisted: number;
errors: number;
websiteEnrichmentQueue: Array<{
leadId: Id<"leads">;
companyName: string;
website: string;
}>;
} = await ctx.runMutation(internal.leadDiscovery.persistDiscoveredLeads, {
runId: args.runId,
campaignId: campaign._id,
@@ -223,6 +229,31 @@ export const processCampaignRun = internalAction({
candidates,
});
for (const enrichment of result.websiteEnrichmentQueue) {
await ctx.runMutation(internal.websiteEnrichment.queueLeadEnrichment, {
leadId: enrichment.leadId,
parentRunId: args.runId,
});
await ctx.runMutation(internal.leadDiscovery.appendRunEvent, {
runId: args.runId,
level: "info",
message: "Website-Kontaktanreicherung geplant.",
details: [
{
label: "Unternehmen",
value: enrichment.companyName,
source: "google_places",
},
{
label: "Website",
value: enrichment.website,
source: "google_places",
},
],
});
}
await ctx.runMutation(internal.leadDiscovery.finishCampaignRun, {
runId: args.runId,
status: "succeeded",
@@ -275,7 +306,9 @@ export const startCampaignRun = internalMutation({
const activeRunning = await ctx.db
.query("agentRuns")
.withIndex("by_status", (q) => q.eq("status", "running"))
.withIndex("by_type_and_status", (q) =>
q.eq("type", "campaign").eq("status", "running"),
)
.take(1);
if (activeRunning.length > 0) {
@@ -390,6 +423,11 @@ export const persistDiscoveredLeads = internalMutation({
let skippedDuplicates = 0;
let skippedBlacklisted = 0;
let errors = 0;
const websiteEnrichmentQueue: Array<{
leadId: Id<"leads">;
companyName: string;
website: string;
}> = [];
for (const candidate of args.candidates) {
if (leadsCreated >= args.maxNewLeads) {
@@ -556,8 +594,15 @@ export const persistDiscoveredLeads = internalMutation({
lead.duplicateOfLeadId = probableDuplicateLead._id;
}
await ctx.db.insert("leads", lead);
const leadId = await ctx.db.insert("leads", lead);
leadsCreated += 1;
if (shouldScheduleWebsiteEnrichment(lead)) {
websiteEnrichmentQueue.push({
leadId,
companyName: lead.companyName,
website: lead.websiteDomain ?? lead.websiteUrl ?? "unbekannt",
});
}
await ctx.db.insert("agentRunEvents", {
runId: args.runId,
level: "info",
@@ -589,6 +634,7 @@ export const persistDiscoveredLeads = internalMutation({
skippedDuplicates,
skippedBlacklisted,
errors,
websiteEnrichmentQueue,
};
},
});

convex/pageSpeed.ts (new file, 314 lines)

@@ -0,0 +1,314 @@
import { internal } from "./_generated/api";
import type { Doc, Id } from "./_generated/dataModel";
import { internalMutation } from "./_generated/server";
import { v } from "convex/values";
const PAGE_SPEED_COUNTER_TEMPLATE = {
leadsFound: 1,
leadsCreated: 0,
auditsCreated: 1,
outreachPrepared: 0,
errors: 0,
};
type PageSpeedLead = Pick<
Doc<"leads">,
"_id" | "contactStatus"
> & {
websiteUrl: string;
};
const runStatus = v.union(
v.literal("pending"),
v.literal("running"),
v.literal("succeeded"),
v.literal("failed"),
v.literal("canceled"),
);
const pageSpeedStrategy = v.union(v.literal("mobile"), v.literal("desktop"));
const pageSpeedResultStatus = v.union(
v.literal("succeeded"),
v.literal("failed"),
);
const pageSpeedErrorType = v.union(
v.literal("quota"),
v.literal("timeout"),
v.literal("unavailable"),
v.literal("invalid_url"),
v.literal("api_error"),
v.literal("unknown"),
);
export const queueLeadPageSpeedAudit = internalMutation({
args: {
leadId: v.id("leads"),
parentRunId: v.optional(v.id("agentRuns")),
},
returns: v.union(v.id("agentRuns"), v.null()),
handler: async (ctx, args): Promise<Id<"agentRuns"> | null> => {
const now = Date.now();
const lead = await ctx.db.get(args.leadId);
if (!lead || lead.priority === "blocked" || lead.priority === "defer") {
return null;
}
if (!lead.websiteUrl) {
return null;
}
const existingPending = await ctx.db
.query("agentRuns")
.withIndex("by_type_and_status_and_leadId", (q) =>
q.eq("type", "audit").eq("status", "pending").eq("leadId", args.leadId),
)
.take(1);
const existingRunning = await ctx.db
.query("agentRuns")
.withIndex("by_type_and_status_and_leadId", (q) =>
q.eq("type", "audit").eq("status", "running").eq("leadId", args.leadId),
)
.take(1);
if (existingPending.length > 0) {
return existingPending[0]._id;
}
if (existingRunning.length > 0) {
return existingRunning[0]._id;
}
const runId = await ctx.db.insert("agentRuns", {
type: "audit",
leadId: args.leadId,
status: "pending",
currentStep: "pagespeed_insights",
counters: PAGE_SPEED_COUNTER_TEMPLATE,
createdAt: now,
updatedAt: now,
});
await ctx.db.insert("agentRunEvents", {
runId,
level: "info",
message: "PageSpeed-Analyse wurde in die Warteschlange gesetzt.",
details: [
{ label: "Lead", value: args.leadId },
...(args.parentRunId ? [{ label: "Parent-Run", value: args.parentRunId }] : []),
],
createdAt: now,
});
await ctx.scheduler.runAfter(
0,
internal.pageSpeedAction.processPageSpeedAudit,
{
runId,
},
);
return runId;
},
});
export const startPageSpeedAuditRun = internalMutation({
args: {
runId: v.id("agentRuns"),
},
returns: v.union(
v.object({
lead: v.object({
_id: v.id("leads"),
websiteUrl: v.string(),
contactStatus: v.union(
v.literal("new"),
v.literal("missing_contact"),
v.literal("audit_ready"),
v.literal("outreach_ready"),
v.literal("contacted"),
v.literal("replied"),
v.literal("do_not_contact"),
),
}),
auditId: v.optional(v.id("audits")),
}),
v.null(),
),
handler: async (ctx, args): Promise<
{ lead: PageSpeedLead; auditId?: Id<"audits"> } | null
> => {
const now = Date.now();
const run = await ctx.db.get(args.runId);
if (!run) {
return null;
}
if (run.type !== "audit") {
return null;
}
if (run.status !== "pending") {
return null;
}
if (!run.leadId) {
await ctx.db.patch(args.runId, {
status: "failed",
currentStep: "pagespeed_insights",
errorSummary: "Run hat keine Lead-ID.",
updatedAt: now,
finishedAt: now,
});
await ctx.db.insert("agentRunEvents", {
runId: args.runId,
level: "error",
message:
"PageSpeed-Analyse konnte nicht gestartet werden: Kein Lead verknüpft.",
details: [{ label: "Lead-ID", value: "unbekannt" }],
createdAt: now,
});
return null;
}
const lead = await ctx.db.get(run.leadId);
if (!lead) {
await ctx.db.patch(args.runId, {
status: "failed",
currentStep: "pagespeed_insights",
errorSummary: "Lead wurde nicht gefunden.",
updatedAt: now,
finishedAt: now,
});
await ctx.db.insert("agentRunEvents", {
runId: args.runId,
level: "error",
message:
"PageSpeed-Analyse konnte nicht gestartet werden: Lead wurde nicht gefunden.",
details: [{ label: "Lead-ID", value: run.leadId }],
createdAt: now,
});
return null;
}
if (!lead.websiteUrl) {
await ctx.db.patch(args.runId, {
status: "failed",
currentStep: "pagespeed_insights",
errorSummary: "Lead hat keine Website-URL.",
updatedAt: now,
finishedAt: now,
});
await ctx.db.insert("agentRunEvents", {
runId: args.runId,
level: "error",
message:
"PageSpeed-Analyse konnte nicht gestartet werden: Kein Lead mit Website-URL.",
details: [{ label: "Lead-ID", value: lead._id }],
createdAt: now,
});
return null;
}
await ctx.db.patch(args.runId, {
status: "running",
currentStep: "pagespeed_insights",
startedAt: now,
updatedAt: now,
errorSummary: undefined,
});
await ctx.db.insert("agentRunEvents", {
runId: args.runId,
level: "info",
message: "PageSpeed-Analyse gestartet.",
details: [{ label: "Lead-ID", value: lead._id }],
createdAt: now,
});
return {
lead: {
_id: lead._id,
websiteUrl: lead.websiteUrl,
contactStatus: lead.contactStatus,
},
...(run.auditId ? { auditId: run.auditId } : {}),
};
},
});
export const persistPageSpeedResult = internalMutation({
args: {
leadId: v.id("leads"),
auditId: v.optional(v.id("audits")),
runId: v.id("agentRuns"),
strategy: pageSpeedStrategy,
status: pageSpeedResultStatus,
sourceUrl: v.string(),
finalUrl: v.optional(v.string()),
rawStorageId: v.optional(v.id("_storage")),
errorType: v.optional(pageSpeedErrorType),
errorSummary: v.optional(v.string()),
fetchedAt: v.number(),
normalized: v.optional(
v.object({
scores: v.optional(
v.object({
performance: v.optional(v.number()),
accessibility: v.optional(v.number()),
bestPractices: v.optional(v.number()),
seo: v.optional(v.number()),
}),
),
metrics: v.optional(
v.object({
firstContentfulPaintMs: v.optional(v.number()),
largestContentfulPaintMs: v.optional(v.number()),
cumulativeLayoutShift: v.optional(v.number()),
totalBlockingTimeMs: v.optional(v.number()),
speedIndexMs: v.optional(v.number()),
}),
),
opportunities: v.optional(v.array(v.string())),
implications: v.optional(v.array(v.string())),
}),
),
},
returns: v.id("pageSpeedResults"),
handler: async (ctx, args): Promise<Id<"pageSpeedResults">> => {
return await ctx.db.insert("pageSpeedResults", {
...args,
createdAt: Date.now(),
});
},
});
export const finishPageSpeedAuditRun = internalMutation({
args: {
runId: v.id("agentRuns"),
status: runStatus,
errorSummary: v.optional(v.string()),
errors: v.optional(v.number()),
},
handler: async (ctx, args) => {
const now = Date.now();
await ctx.db.patch(args.runId, {
status: args.status,
updatedAt: now,
finishedAt: now,
currentStep: "pagespeed_insights",
errorSummary: args.errorSummary,
counters: {
...PAGE_SPEED_COUNTER_TEMPLATE,
errors: args.errors ?? 0,
},
});
},
});
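
Review note: the queue and start mutations above rely on two guards. Queuing returns an existing pending or running run instead of creating a duplicate, and starting only succeeds while the run is still `pending`. The following is a minimal in-memory sketch of that pattern, assuming a plain array in place of the `agentRuns` table; `RunSketch`, `queueRunSketch`, and `startRunSketch` are hypothetical names for illustration and not part of this change.

```typescript
// Illustrative sketch only: an array stands in for the agentRuns table.
type RunStatusSketch = "pending" | "running" | "succeeded" | "failed" | "canceled";

interface RunSketch {
  id: string;
  leadId: string;
  type: string;
  status: RunStatusSketch;
}

// Return the active run for this lead and type if one exists, otherwise create one.
function queueRunSketch(runs: RunSketch[], leadId: string, type: string): RunSketch {
  const active = runs.find(
    (run) =>
      run.leadId === leadId &&
      run.type === type &&
      (run.status === "pending" || run.status === "running"),
  );
  if (active) {
    return active;
  }
  const created: RunSketch = {
    id: `run-${runs.length + 1}`,
    leadId,
    type,
    status: "pending",
  };
  runs.push(created);
  return created;
}

// Only a pending run may transition to running; a second start attempt is a no-op.
function startRunSketch(run: RunSketch): boolean {
  if (run.status !== "pending") {
    return false;
  }
  run.status = "running";
  return true;
}
```

With these guards, a repeated queue call or a re-delivered scheduled job does not produce a second audit for the same lead. In the real mutations the lookup goes through the `by_type_and_status_and_leadId` index rather than a linear scan.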

289
convex/pageSpeedAction.ts Normal file

@@ -0,0 +1,289 @@
"use node";
import { api, internal } from "./_generated/api";
import { internalAction } from "./_generated/server";
import type { Id } from "./_generated/dataModel";
import { v } from "convex/values";
import {
classifyPageSpeedError,
fetchPageSpeedResult,
normalizePageSpeedResult,
type PageSpeedErrorType,
} from "../lib/pagespeed-insights";
const STRATEGIES = ["mobile", "desktop"] as const;
export const MAX_RAW_PAGESPEED_BYTES = 1_000_000;
const RAW_PAGESPEED_BYTES_SUMMARY =
"PageSpeed-Rohdaten sind größer als das interne Speicherlimit.";
const DEFAULT_PAGESPEED_TIMEOUT_MS = 60_000;
const MIN_PAGESPEED_TIMEOUT_MS = 10_000;
const MAX_PAGESPEED_TIMEOUT_MS = 120_000;
function toPersistedPageSpeedNormalizedResult(
normalized: ReturnType<typeof normalizePageSpeedResult>,
) {
return {
...(normalized.scores ? { scores: normalized.scores } : {}),
metrics: normalized.metrics,
opportunities: normalized.opportunities,
implications: normalized.implications,
};
}
function parsePageSpeedTimeoutMs(raw: string | undefined): number {
if (!raw) {
return DEFAULT_PAGESPEED_TIMEOUT_MS;
}
const parsed = Number.parseInt(raw, 10);
if (!Number.isFinite(parsed)) {
return DEFAULT_PAGESPEED_TIMEOUT_MS;
}
return Math.min(
Math.max(parsed, MIN_PAGESPEED_TIMEOUT_MS),
MAX_PAGESPEED_TIMEOUT_MS,
);
}
function resolvePageSpeedTimeoutMs() {
return parsePageSpeedTimeoutMs(process.env.PAGESPEED_TIMEOUT_MS);
}
function isPageSpeedErrorType(value: unknown): value is PageSpeedErrorType {
return (
value === "quota" ||
value === "timeout" ||
value === "unavailable" ||
value === "invalid_url" ||
value === "api_error" ||
value === "unknown"
);
}
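// Redacts every occurrence of `secret` (for example the API key) from `value`
// so that error summaries can be persisted and shown without leaking it.
function sanitizeValue(value: string, secret?: string | null) {
if (!secret || !value) {
return value;
}
// Escape regex metacharacters so the secret is matched literally.
const escapedSecret = secret.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
return value.replace(new RegExp(escapedSecret, "g"), "[REDACTED]");
}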
function classifyPageSpeedFailure(input: unknown, apiKey?: string | null) {
const directType =
typeof input === "object" &&
input !== null &&
"errorType" in input &&
(input as { errorType?: unknown }).errorType;
const normalizedType = isPageSpeedErrorType(directType) ? directType : null;
if (normalizedType) {
const message =
input instanceof Error && input.message
? input.message
: typeof input === "string"
? input
: "PageSpeed-Analyse fehlgeschlagen.";
return {
errorType: normalizedType,
errorSummary: sanitizeValue(message, apiKey),
};
}
const classified = classifyPageSpeedError({
error: input,
});
const errorSummary = sanitizeValue(classified.message, apiKey);
return {
errorType: classified.errorType,
errorSummary,
};
}
export const processPageSpeedAudit = internalAction({
args: {
runId: v.id("agentRuns"),
},
handler: async (ctx, args) => {
const apiKeyRaw = process.env.PAGESPEED_API_KEY?.trim();
const apiKey = apiKeyRaw ? apiKeyRaw : undefined;
let started:
| {
lead: {
_id: Id<"leads">;
websiteUrl: string;
};
auditId?: Id<"audits">;
}
| null = null;
try {
started = await ctx.runMutation(internal.pageSpeed.startPageSpeedAuditRun, {
runId: args.runId,
});
} catch (error) {
const { errorSummary } = classifyPageSpeedFailure(error, apiKeyRaw);
await ctx.runMutation(internal.pageSpeed.finishPageSpeedAuditRun, {
runId: args.runId,
status: "failed",
errors: 1,
errorSummary,
});
await ctx.runMutation(api.runs.appendEvent, {
runId: args.runId,
level: "error",
message: "PageSpeed-Analyse fehlgeschlagen.",
details: [{ label: "Fehler", value: errorSummary }],
});
return null;
}
if (!started) {
return null;
}
const sourceUrl = started.lead.websiteUrl;
const timeoutMs = resolvePageSpeedTimeoutMs();
let failedStrategies = 0;
let succeededStrategies = 0;
try {
for (const strategy of STRATEGIES) {
const fetchedAt = Date.now();
try {
const raw = await fetchPageSpeedResult({
url: sourceUrl,
strategy,
apiKey,
timeoutMs,
});
const rawJson = JSON.stringify(raw) ?? "null";
const rawJsonBytes = new TextEncoder().encode(rawJson).byteLength;
if (rawJsonBytes > MAX_RAW_PAGESPEED_BYTES) {
failedStrategies += 1;
await ctx.runMutation(internal.pageSpeed.persistPageSpeedResult, {
leadId: started.lead._id,
...(started.auditId ? { auditId: started.auditId } : {}),
runId: args.runId,
strategy,
status: "failed",
sourceUrl,
errorType: "api_error",
errorSummary: RAW_PAGESPEED_BYTES_SUMMARY,
fetchedAt,
});
await ctx.runMutation(api.runs.appendEvent, {
runId: args.runId,
level: "warning",
message: `PageSpeed-Analyse für ${strategy} fehlgeschlagen.`,
details: [
{ label: "Strategie", value: strategy },
{
label: "Fehler",
value: RAW_PAGESPEED_BYTES_SUMMARY,
},
],
});
continue;
}
const rawStorageId = await ctx.storage.store(
new Blob([rawJson], { type: "application/json" }),
);
const normalized = normalizePageSpeedResult({
strategy,
sourceUrl,
raw,
});
await ctx.runMutation(internal.pageSpeed.persistPageSpeedResult, {
leadId: started.lead._id,
...(started.auditId ? { auditId: started.auditId } : {}),
runId: args.runId,
strategy,
status: "succeeded",
sourceUrl,
finalUrl: normalized.finalUrl,
rawStorageId,
fetchedAt,
normalized: toPersistedPageSpeedNormalizedResult(normalized),
});
await ctx.runMutation(api.runs.appendEvent, {
runId: args.runId,
level: "info",
message: `PageSpeed-Analyse für ${strategy} abgeschlossen.`,
details: [{ label: "Strategie", value: strategy }],
});
succeededStrategies += 1;
} catch (error) {
const { errorType, errorSummary } = classifyPageSpeedFailure(
error,
apiKeyRaw,
);
failedStrategies += 1;
await ctx.runMutation(internal.pageSpeed.persistPageSpeedResult, {
leadId: started.lead._id,
...(started.auditId ? { auditId: started.auditId } : {}),
runId: args.runId,
strategy,
status: "failed",
sourceUrl,
errorType,
errorSummary,
fetchedAt,
});
await ctx.runMutation(api.runs.appendEvent, {
runId: args.runId,
level: "warning",
message: `PageSpeed-Analyse für ${strategy} fehlgeschlagen.`,
details: [
{ label: "Strategie", value: strategy },
{ label: "Fehler", value: errorSummary },
],
});
}
}
const status = succeededStrategies > 0 ? "succeeded" : "failed";
const errors = failedStrategies;
await ctx.runMutation(internal.pageSpeed.finishPageSpeedAuditRun, {
runId: args.runId,
status,
errors,
errorSummary:
status === "failed" && errors > 0
? "Eine oder mehrere PageSpeed-Strategien konnten nicht ausgeführt werden."
: undefined,
});
return args.runId;
} catch (error) {
const { errorSummary } = classifyPageSpeedFailure(error, apiKeyRaw);
await ctx.runMutation(internal.pageSpeed.finishPageSpeedAuditRun, {
runId: args.runId,
status: "failed",
errors: Math.max(1, failedStrategies),
errorSummary,
});
await ctx.runMutation(api.runs.appendEvent, {
runId: args.runId,
level: "error",
message: "PageSpeed-Analyse fehlgeschlagen.",
details: [{ label: "Fehler", value: errorSummary }],
});
return null;
}
},
});
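
Review note: two small helpers in this file carry most of the defensive behavior, the timeout clamp and the secret redaction. The sketch below restates both as standalone functions so their edge cases are easy to check. The constant values mirror the ones in this diff; the names `clampTimeoutMsSketch` and `redactSecretSketch` are hypothetical and only used here for illustration.

```typescript
// Illustrative sketch only: values mirror the diff, function names are hypothetical.
const SKETCH_DEFAULT_TIMEOUT_MS = 60_000;
const SKETCH_MIN_TIMEOUT_MS = 10_000;
const SKETCH_MAX_TIMEOUT_MS = 120_000;

// Parse an env-style string and clamp it into [min, max]; fall back to the
// default when the value is missing or not a number.
function clampTimeoutMsSketch(raw: string | undefined): number {
  if (!raw) {
    return SKETCH_DEFAULT_TIMEOUT_MS;
  }
  const parsed = Number.parseInt(raw, 10);
  if (!Number.isFinite(parsed)) {
    return SKETCH_DEFAULT_TIMEOUT_MS;
  }
  return Math.min(Math.max(parsed, SKETCH_MIN_TIMEOUT_MS), SKETCH_MAX_TIMEOUT_MS);
}

// Replace every literal occurrence of the secret; regex metacharacters in the
// secret are escaped so they are not interpreted as a pattern.
function redactSecretSketch(value: string, secret?: string | null): string {
  if (!secret || !value) {
    return value;
  }
  const escaped = secret.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return value.replace(new RegExp(escaped, "g"), "[REDACTED]");
}
```

Note that `Number.parseInt` accepts a numeric prefix, so a value such as `"500ms"` parses as 500 and is then raised to the minimum rather than falling back to the default.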

View File

@@ -1,26 +1,17 @@
import { v } from "convex/values";
import { normalizeListLimit } from "./domain";
import {
RUN_EVENT_LEVELS,
RUN_STATUSES,
RUN_TYPES,
normalizeListLimit,
} from "./domain";
import { mutation, query } from "./_generated/server";
const runType = v.union(
v.literal("campaign"),
v.literal("lead_discovery"),
v.literal("audit"),
v.literal("outreach"),
v.literal("lifecycle"),
);
const runStatus = v.union(
v.literal("pending"),
v.literal("running"),
v.literal("succeeded"),
v.literal("failed"),
v.literal("canceled"),
);
const runType = v.union(...RUN_TYPES.map((type) => v.literal(type)));
const runStatus = v.union(...RUN_STATUSES.map((status) => v.literal(status)));
const eventLevel = v.union(
v.literal("info"),
v.literal("warning"),
v.literal("error"),
...RUN_EVENT_LEVELS.map((level) => v.literal(level)),
);
export const create = mutation({
@@ -116,6 +107,16 @@ export const list = query({
.take(limit);
}
if (args.type) {
const type = args.type;
return await ctx.db
.query("agentRuns")
.withIndex("by_type", (q) => q.eq("type", type))
.order("desc")
.take(limit);
}
if (args.status) {
const status = args.status;

View File

@@ -1,6 +1,11 @@
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";
import { tables as authTables } from "./betterAuth/schema";
import {
RUN_EVENT_LEVELS,
RUN_STATUSES,
RUN_TYPES,
} from "./domain";
const campaignStatus = v.union(v.literal("active"), v.literal("paused"));
const leadPriority = v.union(
@@ -75,26 +80,37 @@ const blacklistType = v.union(
v.literal("company"),
v.literal("google_place_id"),
);
const runType = v.union(
v.literal("campaign"),
v.literal("lead_discovery"),
v.literal("audit"),
v.literal("outreach"),
v.literal("lifecycle"),
);
const runStatus = v.union(
v.literal("pending"),
v.literal("running"),
v.literal("succeeded"),
v.literal("failed"),
v.literal("canceled"),
const websiteEnrichmentPageKind = v.union(
v.literal("homepage"),
v.literal("contact"),
v.literal("impressum"),
v.literal("services"),
v.literal("about"),
v.literal("team"),
v.literal("other"),
);
const runType = v.union(...RUN_TYPES.map((type) => v.literal(type)));
const runStatus = v.union(...RUN_STATUSES.map((status) => v.literal(status)));
const runEventLevel = v.union(
v.literal("info"),
v.literal("warning"),
v.literal("error"),
...RUN_EVENT_LEVELS.map((level) => v.literal(level)),
);
const screenshotViewport = v.union(v.literal("desktop"), v.literal("mobile"));
const pageSpeedStrategy = v.union(
v.literal("mobile"),
v.literal("desktop"),
);
const pageSpeedResultStatus = v.union(
v.literal("succeeded"),
v.literal("failed"),
);
const pageSpeedErrorType = v.union(
v.literal("quota"),
v.literal("timeout"),
v.literal("unavailable"),
v.literal("invalid_url"),
v.literal("api_error"),
v.literal("unknown"),
);
const settingsValue = v.union(v.string(), v.number(), v.boolean(), v.null());
const auditMetricSummary = v.object({
performanceScore: v.optional(v.number()),
@@ -255,6 +271,127 @@ export default defineSchema({
.index("by_auditId_and_viewport", ["auditId", "viewport"])
.index("by_storageId", ["storageId"]),
pageSpeedResults: defineTable({
leadId: v.id("leads"),
auditId: v.optional(v.id("audits")),
runId: v.optional(v.id("agentRuns")),
strategy: pageSpeedStrategy,
status: pageSpeedResultStatus,
sourceUrl: v.string(),
finalUrl: v.optional(v.string()),
rawStorageId: v.optional(v.id("_storage")),
errorType: v.optional(pageSpeedErrorType),
errorSummary: v.optional(v.string()),
fetchedAt: v.number(),
createdAt: v.number(),
normalized: v.optional(
v.object({
scores: v.optional(
v.object({
performance: v.optional(v.number()),
accessibility: v.optional(v.number()),
bestPractices: v.optional(v.number()),
seo: v.optional(v.number()),
}),
),
metrics: v.optional(
v.object({
firstContentfulPaintMs: v.optional(v.number()),
largestContentfulPaintMs: v.optional(v.number()),
cumulativeLayoutShift: v.optional(v.number()),
totalBlockingTimeMs: v.optional(v.number()),
speedIndexMs: v.optional(v.number()),
}),
),
opportunities: v.optional(v.array(v.string())),
implications: v.optional(v.array(v.string())),
}),
),
})
.index("by_leadId", ["leadId"])
.index("by_runId", ["runId"])
.index("by_auditId", ["auditId"])
.index("by_leadId_and_strategy", ["leadId", "strategy"]),
websiteCrawlPages: defineTable({
leadId: v.id("leads"),
runId: v.optional(v.id("agentRuns")),
sourceUrl: v.string(),
finalUrl: v.string(),
pageKind: websiteEnrichmentPageKind,
title: v.optional(v.string()),
metaDescription: v.optional(v.string()),
headings: v.array(v.string()),
visibleTextExcerpt: v.optional(v.string()),
hasContactFormSignal: v.boolean(),
hasContactCtaSignal: v.boolean(),
createdAt: v.number(),
})
.index("by_leadId", ["leadId"])
.index("by_runId", ["runId"])
.index("by_leadId_and_createdAt", ["leadId", "createdAt"]),
websiteCrawlLinks: defineTable({
leadId: v.id("leads"),
runId: v.optional(v.id("agentRuns")),
pageUrl: v.string(),
href: v.string(),
text: v.optional(v.string()),
isInternal: v.boolean(),
isBroken: v.optional(v.boolean()),
createdAt: v.number(),
})
.index("by_leadId", ["leadId"])
.index("by_runId", ["runId"]),
websiteEmailCandidates: defineTable({
leadId: v.id("leads"),
runId: v.optional(v.id("agentRuns")),
email: v.string(),
normalizedEmail: v.string(),
emailSource: v.string(),
sourceUrl: v.string(),
contactPerson: v.optional(v.string()),
isBusinessContactAddress: v.boolean(),
isGeneric: v.boolean(),
accepted: v.boolean(),
createdAt: v.number(),
})
.index("by_leadId", ["leadId"])
.index("by_normalizedEmail", ["normalizedEmail"])
.index("by_runId", ["runId"]),
websiteCrawlScreenshots: defineTable({
leadId: v.id("leads"),
runId: v.optional(v.id("agentRuns")),
storageId: v.id("_storage"),
viewport: screenshotViewport,
sourceUrl: v.string(),
capturedAt: v.number(),
width: v.number(),
height: v.number(),
mimeType: v.string(),
createdAt: v.number(),
})
.index("by_leadId", ["leadId"])
.index("by_runId", ["runId"])
.index("by_storageId", ["storageId"]),
websiteTechnicalChecks: defineTable({
leadId: v.id("leads"),
runId: v.optional(v.id("agentRuns")),
sourceUrl: v.string(),
finalUrl: v.optional(v.string()),
usesHttps: v.boolean(),
missingTitle: v.boolean(),
missingMetaDescription: v.boolean(),
hasVisibleContactPath: v.boolean(),
brokenInternalLinkCount: v.number(),
createdAt: v.number(),
})
.index("by_leadId", ["leadId"])
.index("by_runId", ["runId"]),
outreachRecords: defineTable({
leadId: v.id("leads"),
auditId: v.optional(v.id("audits")),
@@ -309,7 +446,9 @@ export default defineSchema({
updatedAt: v.number(),
})
.index("by_status", ["status"])
.index("by_type", ["type"])
.index("by_type_and_status", ["type", "status"])
.index("by_type_and_status_and_leadId", ["type", "status", "leadId"])
.index("by_campaignId_and_updatedAt", ["campaignId", "updatedAt"])
.index("by_campaignId_and_status", ["campaignId", "status"])
.index("by_auditId", ["auditId"]),

408
convex/websiteEnrichment.ts Normal file

@@ -0,0 +1,408 @@
import { v } from "convex/values";
import { internal } from "./_generated/api";
import type { Doc } from "./_generated/dataModel";
import { internalMutation } from "./_generated/server";
import { normalizeEmailAddress } from "../lib/lead-discovery-google";
const RUN_COUNTER_TEMPLATE = {
leadsFound: 0,
leadsCreated: 0,
auditsCreated: 0,
outreachPrepared: 0,
errors: 0,
};
type WebsiteLead = Pick<Doc<"leads">, "_id" | "websiteUrl" | "contactStatus">;
type LeadContactStatus = Doc<"leads">["contactStatus"];
export const queueLeadEnrichment = internalMutation({
args: {
leadId: v.id("leads"),
parentRunId: v.optional(v.id("agentRuns")),
},
returns: v.union(v.id("agentRuns"), v.null()),
handler: async (ctx, args) => {
const now = Date.now();
const lead = await ctx.db.get(args.leadId);
if (!lead || !lead.websiteUrl) {
return null;
}
const activePending = await ctx.db
.query("agentRuns")
.withIndex("by_type_and_status_and_leadId", (q) =>
q
.eq("type", "website_enrichment")
.eq("status", "pending")
.eq("leadId", args.leadId),
)
.take(1);
const activeRunning = await ctx.db
.query("agentRuns")
.withIndex("by_type_and_status_and_leadId", (q) =>
q
.eq("type", "website_enrichment")
.eq("status", "running")
.eq("leadId", args.leadId),
)
.take(1);
if (activePending.length > 0) {
return activePending[0]._id;
}
if (activeRunning.length > 0) {
return activeRunning[0]._id;
}
const runId = await ctx.db.insert("agentRuns", {
type: "website_enrichment",
leadId: args.leadId,
status: "pending",
counters: RUN_COUNTER_TEMPLATE,
currentStep: "website_enrichment",
createdAt: now,
updatedAt: now,
});
await ctx.db.insert("agentRunEvents", {
runId,
level: "info",
message: "Website-Enrichment wurde in die Warteschlange gesetzt.",
details: [
{ label: "Lead", value: args.leadId },
...(args.parentRunId
? [{ label: "Parent-Run", value: args.parentRunId }]
: []),
],
createdAt: now,
});
await ctx.scheduler.runAfter(
0,
internal.websiteEnrichmentAction.processLeadEnrichment,
{
runId,
},
);
return runId;
},
});
export const startLeadEnrichmentRun = internalMutation({
args: { runId: v.id("agentRuns") },
handler: async (ctx, args): Promise<
{ lead: WebsiteLead } | null
> => {
const now = Date.now();
const run = await ctx.db.get(args.runId);
if (!run || run.type !== "website_enrichment" || run.status !== "pending") {
return null;
}
if (!run.leadId) {
await ctx.db.patch(args.runId, {
status: "failed",
currentStep: "website_enrichment",
errorSummary: "Der Lauf hat keine Lead-ID.",
updatedAt: now,
finishedAt: now,
});
await ctx.db.insert("agentRunEvents", {
runId: args.runId,
level: "error",
message:
"Website-Enrichment konnte nicht gestartet werden: Keine Lead-ID.",
details: [{ label: "Lead-ID", value: "unbekannt" }],
createdAt: now,
});
return null;
}
const lead = await ctx.db.get(run.leadId);
if (!lead) {
await ctx.db.patch(args.runId, {
status: "failed",
currentStep: "website_enrichment",
errorSummary: "Lead fehlt oder besitzt keine Website.",
updatedAt: now,
finishedAt: now,
});
await ctx.db.insert("agentRunEvents", {
runId: args.runId,
level: "error",
message:
"Website-Enrichment konnte nicht gestartet werden: Kein Lead mit Website-URL.",
details: [{ label: "Lead-ID", value: run.leadId }],
createdAt: now,
});
return null;
}
if (!lead.websiteUrl) {
await ctx.db.patch(args.runId, {
status: "failed",
currentStep: "website_enrichment",
errorSummary: "Lead fehlt oder besitzt keine Website.",
updatedAt: now,
finishedAt: now,
});
await ctx.db.insert("agentRunEvents", {
runId: args.runId,
level: "error",
message:
"Website-Enrichment konnte nicht gestartet werden: Kein Lead mit Website-URL.",
details: [{ label: "Lead-ID", value: lead._id }],
createdAt: now,
});
await ctx.db.patch(lead._id, {
contactStatusReason:
"Website-URL fehlt für das Website-Enrichment.",
updatedAt: now,
});
return null;
}
await ctx.db.patch(args.runId, {
status: "running",
currentStep: "website_enrichment",
startedAt: now,
updatedAt: now,
});
await ctx.db.insert("agentRunEvents", {
runId: args.runId,
level: "info",
message: "Website-Enrichment gestartet.",
details: [{ label: "Lead", value: lead._id }],
createdAt: now,
});
return {
lead: {
_id: lead._id,
websiteUrl: lead.websiteUrl,
contactStatus: lead.contactStatus,
},
};
},
});
export const persistLeadEnrichmentResult = internalMutation({
args: {
runId: v.id("agentRuns"),
leadId: v.id("leads"),
pages: v.array(
v.object({
sourceUrl: v.string(),
finalUrl: v.string(),
pageKind: v.union(
v.literal("homepage"),
v.literal("contact"),
v.literal("impressum"),
v.literal("services"),
v.literal("about"),
v.literal("team"),
v.literal("other"),
),
title: v.optional(v.string()),
metaDescription: v.optional(v.string()),
headings: v.array(v.string()),
visibleTextExcerpt: v.optional(v.string()),
hasContactFormSignal: v.boolean(),
hasContactCtaSignal: v.boolean(),
}),
),
links: v.array(
v.object({
pageUrl: v.string(),
href: v.string(),
text: v.optional(v.string()),
isInternal: v.boolean(),
isBroken: v.optional(v.boolean()),
}),
),
emailCandidates: v.array(
v.object({
email: v.string(),
normalizedEmail: v.string(),
emailSource: v.string(),
sourceUrl: v.string(),
contactPerson: v.optional(v.string()),
isBusinessContactAddress: v.boolean(),
isGeneric: v.boolean(),
accepted: v.boolean(),
}),
),
screenshots: v.array(
v.object({
storageId: v.id("_storage"),
viewport: v.union(v.literal("desktop"), v.literal("mobile")),
sourceUrl: v.string(),
capturedAt: v.number(),
width: v.number(),
height: v.number(),
mimeType: v.string(),
}),
),
technicalChecks: v.array(
v.object({
sourceUrl: v.string(),
finalUrl: v.optional(v.string()),
usesHttps: v.boolean(),
missingTitle: v.boolean(),
missingMetaDescription: v.boolean(),
hasVisibleContactPath: v.boolean(),
brokenInternalLinkCount: v.number(),
}),
),
},
handler: async (ctx, args) => {
const createdAt = Date.now();
for (const page of args.pages) {
await ctx.db.insert("websiteCrawlPages", {
...page,
leadId: args.leadId,
runId: args.runId,
createdAt,
});
}
for (const link of args.links) {
await ctx.db.insert("websiteCrawlLinks", {
...link,
leadId: args.leadId,
runId: args.runId,
createdAt,
});
}
for (const candidate of args.emailCandidates) {
await ctx.db.insert("websiteEmailCandidates", {
...candidate,
leadId: args.leadId,
runId: args.runId,
createdAt,
});
}
for (const screenshot of args.screenshots) {
await ctx.db.insert("websiteCrawlScreenshots", {
...screenshot,
leadId: args.leadId,
runId: args.runId,
createdAt,
});
}
for (const checks of args.technicalChecks) {
await ctx.db.insert("websiteTechnicalChecks", {
...checks,
leadId: args.leadId,
runId: args.runId,
createdAt,
});
}
},
});
export const finishLeadEnrichmentRun = internalMutation({
args: {
runId: v.id("agentRuns"),
status: v.union(
v.literal("succeeded"),
v.literal("failed"),
v.literal("canceled"),
),
currentStep: v.optional(v.string()),
errorSummary: v.optional(v.string()),
errors: v.optional(v.number()),
},
handler: async (ctx, args) => {
const now = Date.now();
await ctx.db.patch(args.runId, {
status: args.status,
updatedAt: now,
finishedAt: now,
currentStep: args.currentStep ?? "website_enrichment",
errorSummary: args.errorSummary,
counters: {
leadsFound: 1,
leadsCreated: 0,
auditsCreated: 0,
outreachPrepared: 0,
errors: args.errors ?? 0,
},
});
},
});
export const patchLeadFromWebsiteEnrichment = internalMutation({
args: {
leadId: v.id("leads"),
email: v.optional(v.string()),
emailSource: v.optional(v.string()),
contactPerson: v.optional(v.string()),
currentContactStatus: v.union(
v.literal("new"),
v.literal("missing_contact"),
v.literal("audit_ready"),
v.literal("outreach_ready"),
v.literal("contacted"),
v.literal("replied"),
v.literal("do_not_contact"),
),
contactStatusReason: v.optional(v.string()),
},
handler: async (ctx, args) => {
const lead = await ctx.db.get(args.leadId);
if (!lead) {
return null;
}
type LeadPatch = {
email?: string;
normalizedEmail?: string;
emailSource?: string;
contactPerson?: string;
contactStatus?: LeadContactStatus;
contactStatusReason?: string;
updatedAt: number;
};
const patch: LeadPatch = {
updatedAt: Date.now(),
};
if (args.email && args.emailSource) {
const normalized = normalizeEmailAddress(args.email);
if (normalized) {
patch.email = normalized;
patch.normalizedEmail = normalized;
patch.emailSource = args.emailSource;
}
}
if (args.contactPerson) {
patch.contactPerson = args.contactPerson;
}
if (args.contactStatusReason !== undefined) {
patch.contactStatusReason = args.contactStatusReason;
} else if (patch.email && args.currentContactStatus === "missing_contact") {
// Only promote the lead when an email was actually accepted into the patch.
patch.contactStatus = "new";
}
if (Object.keys(patch).length > 1) {
await ctx.db.patch(args.leadId, patch);
}
return args.leadId;
},
});
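
Review note: the branching in `patchLeadFromWebsiteEnrichment` is easier to reason about as a pure function. The sketch below is one possible formulation, assuming the status promotion should depend on an email that was actually accepted. `normalizeEmailSketch` is a hypothetical stand-in, since the real `normalizeEmailAddress` lives in `lib/lead-discovery-google` and its exact rules are not shown in this diff; `buildLeadPatchSketch` and the interfaces are likewise illustrative names.

```typescript
// Illustrative sketch only: names and the email normalizer are assumptions.
type ContactStatusSketch =
  | "new"
  | "missing_contact"
  | "audit_ready"
  | "outreach_ready"
  | "contacted"
  | "replied"
  | "do_not_contact";

interface EnrichmentInputSketch {
  email?: string;
  emailSource?: string;
  contactPerson?: string;
  currentContactStatus: ContactStatusSketch;
  contactStatusReason?: string;
}

interface LeadPatchSketch {
  email?: string;
  normalizedEmail?: string;
  emailSource?: string;
  contactPerson?: string;
  contactStatus?: ContactStatusSketch;
  contactStatusReason?: string;
}

// Hypothetical normalizer: trims, lowercases, and applies a loose shape check.
function normalizeEmailSketch(value: string): string | null {
  const trimmed = value.trim().toLowerCase();
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(trimmed) ? trimmed : null;
}

function buildLeadPatchSketch(input: EnrichmentInputSketch): LeadPatchSketch {
  const patch: LeadPatchSketch = {};
  // An email is only accepted together with its source.
  if (input.email && input.emailSource) {
    const normalized = normalizeEmailSketch(input.email);
    if (normalized) {
      patch.email = normalized;
      patch.normalizedEmail = normalized;
      patch.emailSource = input.emailSource;
    }
  }
  if (input.contactPerson) {
    patch.contactPerson = input.contactPerson;
  }
  // An explicit reason wins; otherwise an accepted email lifts "missing_contact".
  if (input.contactStatusReason !== undefined) {
    patch.contactStatusReason = input.contactStatusReason;
  } else if (patch.email && input.currentContactStatus === "missing_contact") {
    patch.contactStatus = "new";
  }
  return patch;
}
```

Keeping this logic free of database access makes the precedence between `contactStatusReason` and the status promotion straightforward to unit test before the mutation applies the patch.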

View File

@@ -0,0 +1,788 @@
"use node";
import type { Browser, BrowserContext } from "playwright-core";
import { createHash } from "node:crypto";
import { access, readFile, rm, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import path from "node:path";
import { v } from "convex/values";
import {
buildTechnicalChecks,
discoverRelevantSubpageUrls,
extractContactSignalsFromHtmlLikeText,
isSameRegistrableHostishDomain,
normalizeCrawlUrl,
} from "../lib/website-crawler";
import {
getUsableContactEmailFromEntries,
normalizeEmailAddress,
} from "../lib/lead-discovery-google";
import { api, internal } from "./_generated/api";
import type { Doc, Id } from "./_generated/dataModel";
import { internalAction, type ActionCtx } from "./_generated/server";
const DEFAULT_CRAWL_TIMEOUT_MS = 60_000;
const DEFAULT_CRAWL_MAX_PAGES = 5;
const MAX_PERSISTED_LINKS = 120;
const MAX_PERSISTED_EMAIL_CANDIDATES = 40;
const SCREENSHOT_MIME_TYPE = "image/png";
const CHROMIUM_SOURCE_MARKER_FILE = path.join(tmpdir(), "chromium-source.sha256");
const CHROMIUM_EXECUTABLE_PATH = path.join(tmpdir(), "chromium");
const CHROMIUM_PACK_PATH = path.join(tmpdir(), "chromium-pack");
const GENERIC_EMAIL_LOCALS = new Set([
"info",
"kontakt",
"contact",
"sales",
"team",
"support",
"service",
"hello",
"marketing",
"admin",
"office",
"impressum",
"post",
]);
const CHROMIUM_EXECUTABLE_SOURCE_ENV_VARS = [
"TASK8_BROWSER_ASSET_URL",
"TASK8_CHROMIUM_EXECUTABLE_URL",
"TASK8_CHROMIUM_EXECUTABLE",
];
type EnrichmentPageKind =
| "homepage"
| "contact"
| "impressum"
| "services"
| "about"
| "team"
| "other";
type CrawlPageLink = {
href: string;
text: string;
isInternal: boolean;
};
type PersistedCrawlLink = CrawlPageLink & {
pageUrl: string;
};
type PageResult = {
sourceUrl: string;
finalUrl: string;
pageKind: EnrichmentPageKind;
title: string;
metaDescription: string;
headings: string[];
visibleText: string;
links: CrawlPageLink[];
emailCandidates: Array<{
email: string;
emailSource: string;
contactPerson: string | null;
isBusinessContactAddress: boolean;
isGeneric: boolean;
sourceUrl: string;
accepted: boolean;
normalizedEmail: string;
}>;
hasContactFormSignal: boolean;
hasContactCtaSignal: boolean;
};
type StoredScreenshot = {
storageId: Id<"_storage">;
viewport: "desktop" | "mobile";
sourceUrl: string;
capturedAt: number;
width: number;
height: number;
mimeType: string;
};
type WebsiteLead = Pick<
Doc<"leads">,
"_id" | "websiteUrl" | "contactStatus"
>;
type StartedLead = {
lead: WebsiteLead;
};
type ServerlessChromiumModule = {
args: string[];
executablePath: (input?: string) => Promise<string>;
inflate: (filePath: string) => Promise<string>;
setupLambdaEnvironment: (baseLibPath: string) => void;
};
function messageFromError(error: unknown) {
return error instanceof Error ? error.message : String(error);
}
function readPositiveIntEnv(key: string, fallback: number) {
const raw = process.env[key]?.trim();
if (!raw) {
return fallback;
}
const parsed = Number.parseInt(raw, 10);
return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}
function crawlTimeoutMs() {
return readPositiveIntEnv("TASK8_CRAWL_TIMEOUT_MS", DEFAULT_CRAWL_TIMEOUT_MS);
}
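function crawlMaxPages() {
// DEFAULT_CRAWL_MAX_PAGES doubles as the hard upper bound: TASK8_CRAWL_MAX_PAGES
// can lower the page budget, but values above the default are clamped to it.
return Math.max(
1,
Math.min(
DEFAULT_CRAWL_MAX_PAGES,
readPositiveIntEnv("TASK8_CRAWL_MAX_PAGES", DEFAULT_CRAWL_MAX_PAGES),
),
);
}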
function makePageKind(url: string, rootUrl: string): EnrichmentPageKind {
const normalizedRoot = normalizeCrawlUrl(rootUrl);
if (!normalizedRoot) {
return "other";
}
// Compare lowercased paths without trailing slashes so "/de/" and "/de" match.
const toComparablePath = (value: string) =>
new URL(value).pathname.toLowerCase().replace(/\/+$/, "") || "/";
let homepagePath: string;
let pagePath: string;
try {
homepagePath = toComparablePath(normalizedRoot);
pagePath = toComparablePath(url);
} catch {
return "other";
}
if (pagePath === homepagePath) {
return "homepage";
}
if (/(?:^|\/)(kontakt|contact)(?:[-/]|$)/.test(pagePath)) {
return "contact";
}
if (/(?:^|\/)(impressum|imprint)(?:[-/]|$)/.test(pagePath)) {
return "impressum";
}
if (/(?:^|\/)(leistungen|angebot|services?)(?:[-/]|$)/.test(pagePath)) {
return "services";
}
// Team pages get their own kind instead of being folded into "about".
if (/(?:^|\/)team(?:[-/]|$)/.test(pagePath)) {
return "team";
}
if (/(?:^|\/)(ueber|über|about)(?:[-/]|$)/.test(pagePath)) {
return "about";
}
return "other";
}
function trimExcerpt(value: string) {
return value.replace(/\s+/g, " ").trim().slice(0, 1200);
}
function isGenericBusinessEmail(email: string) {
const local = email.split("@")[0]?.toLowerCase() ?? "";
const base = local.split("+")[0] ?? "";
return GENERIC_EMAIL_LOCALS.has(base);
}
async function loadPlaywrightModules() {
const [playwrightCore, chromiumPackage] = await Promise.all([
import("playwright-core"),
import("@sparticuz/chromium-min"),
]);
return {
playwrightCore,
serverlessChromium: {
args: chromiumPackage.default.args,
executablePath: chromiumPackage.default.executablePath,
inflate: chromiumPackage.inflate,
setupLambdaEnvironment: chromiumPackage.setupLambdaEnvironment,
} as ServerlessChromiumModule,
};
}
function getChromiumExecutableSource() {
for (const key of CHROMIUM_EXECUTABLE_SOURCE_ENV_VARS) {
const value = process.env[key]?.trim();
if (value) {
return value;
}
}
return null;
}
function getChromiumSourceMarker(source: string) {
return createHash("sha256").update(source).digest("hex");
}
async function clearChromiumCacheForSourceMismatch(executableSource: string) {
const nextMarker = getChromiumSourceMarker(executableSource);
const marker = await readFile(CHROMIUM_SOURCE_MARKER_FILE, "utf8").catch(() => null);
if ((marker ?? "").trim() === nextMarker) {
return;
}
await Promise.all([
rm(CHROMIUM_EXECUTABLE_PATH, { force: true, recursive: true }),
rm(CHROMIUM_PACK_PATH, { force: true, recursive: true }),
]);
}
async function resolveChromiumExecutablePath(
chromium: ServerlessChromiumModule,
) {
const executableSource = getChromiumExecutableSource();
if (!executableSource) {
throw new Error(
`No Chromium source is configured. Set TASK8_BROWSER_ASSET_URL (or the legacy TASK8_CHROMIUM_EXECUTABLE_URL / TASK8_CHROMIUM_EXECUTABLE).`,
);
}
await clearChromiumCacheForSourceMismatch(executableSource);
const executablePath = await chromium.executablePath(executableSource);
await writeFile(
CHROMIUM_SOURCE_MARKER_FILE,
getChromiumSourceMarker(executableSource),
);
return executablePath;
}
async function captureHomepageScreenshot(
ctx: ActionCtx,
context: BrowserContext,
homepageUrl: string,
viewport: "desktop" | "mobile",
timeoutMs: number,
) {
const page = await context.newPage();
try {
await page.goto(homepageUrl, {
waitUntil: "domcontentloaded",
timeout: timeoutMs,
});
const sourceUrl = page.url();
const screenshot = await page.screenshot({
fullPage: true,
type: "png",
});
const storageId = await ctx.storage.store(
new Blob([new Uint8Array(screenshot)], { type: SCREENSHOT_MIME_TYPE }),
);
// Records the configured viewport, not the pixel size of the full-page image.
const viewportSize = page.viewportSize() ?? { width: 0, height: 0 };
return {
storageId,
viewport,
sourceUrl,
capturedAt: Date.now(),
width: viewportSize.width,
height: viewportSize.height,
mimeType: SCREENSHOT_MIME_TYPE,
} satisfies StoredScreenshot;
} finally {
await page.close();
}
}
async function crawlPage(
context: BrowserContext,
targetUrl: string,
rootUrl: string,
timeoutMs: number,
) {
const page = await context.newPage();
try {
const response = await page.goto(targetUrl, {
waitUntil: "domcontentloaded",
timeout: timeoutMs,
});
if (!response) {
return null;
}
const finalUrl = page.url();
const title = await page.title().catch(() => "");
const metaDescription = await page
.evaluate(() => {
const meta = document.querySelector(
"meta[name='description']",
) as HTMLMetaElement | null;
return meta?.content ?? "";
})
.catch(() => "");
const content = await page.content();
const signals = extractContactSignalsFromHtmlLikeText(content);
const headings = await page
.evaluate(() =>
Array.from(document.querySelectorAll("h1, h2, h3"))
.map((element) => element.textContent?.trim() ?? "")
.filter((heading) => heading.length > 0),
)
.catch(() => []);
const visibleText = await page.evaluate(() => {
return document.body?.innerText ?? "";
});
const rawLinks = await page
.evaluate(() =>
Array.from(document.querySelectorAll("a[href]")).map((anchor) => ({
href: anchor.getAttribute("href") ?? "",
text: anchor.textContent?.trim() ?? "",
})),
)
.catch(() => []);
const normalizedLinks = rawLinks
.map((link) => {
const normalizedHref = normalizeCrawlUrl(link.href, finalUrl);
if (!normalizedHref) {
return null;
}
return {
href: normalizedHref,
text: link.text,
isInternal: isSameRegistrableHostishDomain(normalizedHref, rootUrl),
};
})
.filter(
(entry): entry is { href: string; text: string; isInternal: boolean } =>
entry !== null,
);
const emailCandidates = signals.emailCandidates
.map((entry) => {
const normalizedEmail = normalizeEmailAddress(entry.email);
if (!normalizedEmail) {
return null;
}
return {
email: normalizedEmail,
emailSource: finalUrl,
contactPerson: entry.contactPerson ?? null,
isBusinessContactAddress: entry.isBusinessContactAddress,
isGeneric: isGenericBusinessEmail(normalizedEmail),
sourceUrl: finalUrl,
accepted: false,
normalizedEmail,
};
})
.filter((entry): entry is NonNullable<typeof entry> => entry !== null);
return {
sourceUrl: finalUrl,
finalUrl,
pageKind: makePageKind(targetUrl, rootUrl),
title,
metaDescription,
headings,
visibleText,
links: normalizedLinks,
emailCandidates,
hasContactFormSignal: signals.hasContactFormSignal,
hasContactCtaSignal: signals.hasContactCtaSignal,
} satisfies PageResult;
} finally {
await page.close();
}
}
function deduplicateLeadEmailCandidates(
candidates: PageResult["emailCandidates"],
) {
const unique = new Map<string, PageResult["emailCandidates"][number]>();
for (const candidate of candidates) {
if (!unique.has(candidate.normalizedEmail)) {
unique.set(candidate.normalizedEmail, candidate);
}
}
return [...unique.values()];
}
function deduplicateCrawlLinks(links: PersistedCrawlLink[]) {
const unique = new Map<string, PersistedCrawlLink>();
for (const link of links) {
if (!unique.has(link.href)) {
unique.set(link.href, link);
}
}
return [...unique.values()];
}
export const processLeadEnrichment = internalAction({
args: { runId: v.id("agentRuns") },
handler: async (ctx, args) => {
let started: StartedLead | null = null;
const runId = args.runId;
let browser: Browser | null = null;
let desktopContext: BrowserContext | null = null;
let mobileContext: BrowserContext | null = null;
try {
started = await ctx.runMutation(internal.websiteEnrichment.startLeadEnrichmentRun, {
runId,
});
if (!started) {
return null;
}
const rootUrl = normalizeCrawlUrl(started.lead.websiteUrl);
if (!rootUrl) {
try {
await ctx.runMutation(internal.pageSpeed.queueLeadPageSpeedAudit, {
leadId: started.lead._id,
parentRunId: runId,
});
} catch (pageSpeedQueueError) {
await ctx.runMutation(api.runs.appendEvent, {
runId,
level: "warning",
message: "PageSpeed-Analyse konnte nicht in die Warteschlange gesetzt werden.",
details: [
{ label: "Lead", value: started.lead._id },
{
label: "Fehler",
value: messageFromError(pageSpeedQueueError),
source: "pagespeed_queue",
},
],
});
}
await ctx.runMutation(internal.websiteEnrichment.finishLeadEnrichmentRun, {
runId,
status: "failed",
currentStep: "website_enrichment",
errorSummary: "Ungültige Website-URL.",
errors: 1,
});
await ctx.runMutation(api.runs.appendEvent, {
runId,
level: "error",
message: "Website-Enrichment fehlgeschlagen: Ungültige Website-URL.",
details: [{ label: "Lead", value: started.lead._id }],
});
await ctx.runMutation(internal.websiteEnrichment.patchLeadFromWebsiteEnrichment, {
leadId: started.lead._id,
currentContactStatus: started.lead.contactStatus,
contactStatusReason:
"Website-Enrichment fehlgeschlagen: Ungültige Website-URL.",
});
return null;
}
const timeoutMs = crawlTimeoutMs();
const maxPages = crawlMaxPages();
const { playwrightCore, serverlessChromium } =
await loadPlaywrightModules();
const executablePath = await resolveChromiumExecutablePath(
serverlessChromium,
);
const prepareChromiumSharedLibraries = async (
chromiumRuntime: ServerlessChromiumModule,
) => {
const runtimeArchivePath = path.join(
CHROMIUM_PACK_PATH,
"al2023.tar.br",
);
await access(runtimeArchivePath).catch(() => {
throw new Error(
`AL2023 shared library archive not found at ${runtimeArchivePath}; cannot prepare Chromium shared libraries.`,
);
});
await chromiumRuntime.inflate(runtimeArchivePath);
chromiumRuntime.setupLambdaEnvironment(path.join(tmpdir(), "al2023", "lib"));
};
await prepareChromiumSharedLibraries(serverlessChromium);
browser = await playwrightCore.chromium.launch({
headless: true,
executablePath,
args: serverlessChromium.args,
});
const { devices } = playwrightCore;
desktopContext = await browser.newContext({
...devices["Desktop Chrome"],
});
mobileContext = await browser.newContext({
...devices["iPhone 11"],
});
const homepage = await crawlPage(desktopContext, rootUrl, rootUrl, timeoutMs);
if (!homepage) {
throw new Error("Homepage konnte nicht geladen werden.");
}
const requestedPages = discoverRelevantSubpageUrls(
homepage.links.map((link) => link.href),
rootUrl,
);
const crawlTargets = requestedPages.slice(0, maxPages);
const crawledPages: PageResult[] = [homepage];
for (const pageUrl of crawlTargets.slice(1)) {
const crawled = await crawlPage(desktopContext, pageUrl, rootUrl, timeoutMs);
if (crawled) {
crawledPages.push(crawled);
}
}
const allLinks: PersistedCrawlLink[] = crawledPages.flatMap((page) =>
page.links.map((link) => ({
...link,
pageUrl: page.finalUrl,
})),
);
const internalLinks = allLinks.filter((link) => link.isInternal);
const uniqueInternalLinks = [...new Set(internalLinks.map((link) => link.href))];
const checkMap = new Map<
string,
{ status: number | null; isBroken: boolean }
>();
for (const href of uniqueInternalLinks.slice(0, 30)) {
try {
const response = await desktopContext.request.get(href, {
timeout: Math.max(1_000, timeoutMs - 1_000),
});
const status = response.status();
checkMap.set(href, {
status,
isBroken: status < 200 || status >= 400,
});
} catch {
checkMap.set(href, {
status: null,
isBroken: true,
});
}
}
const desktopScreenshot = await captureHomepageScreenshot(
ctx,
desktopContext,
homepage.finalUrl,
"desktop",
timeoutMs,
);
const mobileScreenshot = await captureHomepageScreenshot(
ctx,
mobileContext,
homepage.finalUrl,
"mobile",
timeoutMs,
);
const technicalInput = buildTechnicalChecks({
rootUrl,
finalUrl: homepage.finalUrl,
title: homepage.title,
metaDescription: homepage.metaDescription,
visibleText: homepage.visibleText,
checkedUrls: crawledPages.map((page) => page.finalUrl),
links: allLinks.map((link) => {
const check = checkMap.get(link.href);
return {
href: link.href,
status: check?.status ?? undefined,
statusCode: check?.status ?? undefined,
isBroken: check?.isBroken,
};
}),
});
const validCandidates = deduplicateLeadEmailCandidates(
crawledPages.flatMap((page) => page.emailCandidates),
);
const persistedLinks = deduplicateCrawlLinks(allLinks).slice(
0,
MAX_PERSISTED_LINKS,
);
const persistedCandidates = validCandidates.slice(
0,
MAX_PERSISTED_EMAIL_CANDIDATES,
);
const usable = getUsableContactEmailFromEntries(
validCandidates.map((candidate) => ({
email: candidate.email,
emailSource: candidate.emailSource,
contactPerson: candidate.contactPerson,
isBusinessContactAddress: candidate.isBusinessContactAddress,
})),
);
await ctx.runMutation(internal.websiteEnrichment.persistLeadEnrichmentResult, {
runId,
leadId: started.lead._id,
pages: crawledPages.map((page) => ({
sourceUrl: page.sourceUrl,
finalUrl: page.finalUrl,
pageKind: page.pageKind,
title: page.title,
metaDescription: page.metaDescription,
headings: page.headings,
visibleTextExcerpt: trimExcerpt(page.visibleText),
hasContactFormSignal: page.hasContactFormSignal,
hasContactCtaSignal: page.hasContactCtaSignal,
})),
links: persistedLinks.map((link) => ({
pageUrl: link.pageUrl,
href: link.href,
text: link.text,
isInternal: link.isInternal,
isBroken: checkMap.get(link.href)?.isBroken,
})),
emailCandidates: persistedCandidates.map((candidate) => ({
email: candidate.email,
normalizedEmail: candidate.normalizedEmail,
emailSource: candidate.emailSource,
sourceUrl: candidate.sourceUrl,
contactPerson: candidate.contactPerson ?? undefined,
isBusinessContactAddress: candidate.isBusinessContactAddress,
isGeneric: candidate.isGeneric,
accepted:
usable !== null && candidate.normalizedEmail === usable.email,
})),
screenshots: [
...(desktopScreenshot ? [desktopScreenshot] : []),
...(mobileScreenshot ? [mobileScreenshot] : []),
],
technicalChecks: [
{
sourceUrl: homepage.sourceUrl,
finalUrl: homepage.finalUrl,
usesHttps: technicalInput.https,
missingTitle: technicalInput.missingTitle,
missingMetaDescription: technicalInput.missingMetaDescription,
hasVisibleContactPath: technicalInput.hasVisibleContactPath,
brokenInternalLinkCount: technicalInput.brokenInternalLinks.length,
},
],
});
if (usable) {
await ctx.runMutation(internal.websiteEnrichment.patchLeadFromWebsiteEnrichment, {
leadId: started.lead._id,
email: usable.email,
emailSource: usable.emailSource ?? undefined,
contactPerson: usable.contactPerson ?? undefined,
currentContactStatus: started.lead.contactStatus,
});
} else {
await ctx.runMutation(internal.websiteEnrichment.patchLeadFromWebsiteEnrichment, {
leadId: started.lead._id,
currentContactStatus: started.lead.contactStatus,
contactStatusReason:
"Kein verwertbarer Kontakt auf der Website gefunden.",
});
}
try {
await ctx.runMutation(internal.pageSpeed.queueLeadPageSpeedAudit, {
leadId: started.lead._id,
parentRunId: runId,
});
} catch (pageSpeedQueueError) {
await ctx.runMutation(api.runs.appendEvent, {
runId,
level: "warning",
message: "PageSpeed-Analyse konnte nicht in die Warteschlange gesetzt werden.",
details: [
{ label: "Lead", value: started.lead._id },
{
label: "Fehler",
value: messageFromError(pageSpeedQueueError),
source: "pagespeed_queue",
},
],
});
}
await ctx.runMutation(internal.websiteEnrichment.finishLeadEnrichmentRun, {
runId,
status: "succeeded",
currentStep: "website_enrichment",
errors: 0,
});
await ctx.runMutation(api.runs.appendEvent, {
runId,
level: "info",
message: usable
? "Website-Enrichment erfolgreich mit nutzbarer E-Mail abgeschlossen."
: "Website-Enrichment abgeschlossen, aber ohne nutzbare E-Mail.",
});
return runId;
} catch (error) {
const errorSummary = messageFromError(error);
await ctx.runMutation(internal.websiteEnrichment.finishLeadEnrichmentRun, {
runId,
status: "failed",
currentStep: "website_enrichment",
errorSummary,
errors: 1,
});
await ctx.runMutation(api.runs.appendEvent, {
runId,
level: "error",
message: "Website-Enrichment fehlgeschlagen.",
details: [
{ label: "Fehler", value: errorSummary, source: "website_enrichment" },
],
});
if (started) {
try {
await ctx.runMutation(internal.pageSpeed.queueLeadPageSpeedAudit, {
leadId: started.lead._id,
parentRunId: runId,
});
} catch (pageSpeedQueueError) {
await ctx.runMutation(api.runs.appendEvent, {
runId,
level: "warning",
message: "PageSpeed-Analyse konnte nicht in die Warteschlange gesetzt werden.",
details: [
{ label: "Lead", value: started.lead._id },
{
label: "Fehler",
value: messageFromError(pageSpeedQueueError),
source: "pagespeed_queue",
},
],
});
}
await ctx.runMutation(internal.websiteEnrichment.patchLeadFromWebsiteEnrichment, {
leadId: started.lead._id,
currentContactStatus: started.lead.contactStatus,
contactStatusReason: `Website-Enrichment fehlgeschlagen: ${errorSummary}`,
});
}
return null;
} finally {
// Close each resource independently so one failed close cannot leak the others.
for (const resource of [desktopContext, mobileContext, browser]) {
if (resource) {
await resource.close().catch(() => undefined);
}
}
}
},
});
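
The path-based classification used by `makePageKind` can be exercised in isolation. The following is a minimal sketch, not the module's actual code: `PageKindSketch`, `PAGE_KIND_RULES`, `normalizePath`, and `classifyPageKind` are hypothetical names, and the normalization step (percent-decoding, lowercasing, dropping one trailing slash) is an assumption about the intended behavior.

```typescript
// Hypothetical stand-in for the EnrichmentPageKind type used in the diff.
type PageKindSketch =
  | "homepage"
  | "contact"
  | "impressum"
  | "services"
  | "about"
  | "other";

// Ordered rules: the first matching pattern wins.
const PAGE_KIND_RULES: ReadonlyArray<readonly [RegExp, PageKindSketch]> = [
  [/(?:^|\/)(kontakt|contact)(?:[-/]|$)/, "contact"],
  [/(?:^|\/)(impressum|imprint)(?:[-/]|$)/, "impressum"],
  [/(?:^|\/)(leistungen|angebot|services?)(?:[-/]|$)/, "services"],
  [/(?:^|\/)(ueber|über|about|team)(?:[-/]|$)/, "about"],
];

// Returns a decoded, lowercased path without a trailing slash, or null for invalid URLs.
function normalizePath(rawUrl: string): string | null {
  let pathname: string;
  try {
    pathname = new URL(rawUrl).pathname;
  } catch {
    return null;
  }
  try {
    // URL.pathname percent-encodes non-ASCII characters such as "ü".
    pathname = decodeURIComponent(pathname);
  } catch {
    // Malformed escape sequence: keep the encoded form.
  }
  return pathname.toLowerCase().replace(/\/$/, "") || "/";
}

function classifyPageKind(url: string, rootUrl: string): PageKindSketch {
  const rootPath = normalizePath(rootUrl);
  const pagePath = normalizePath(url);
  if (rootPath === null || pagePath === null) {
    return "other";
  }
  if (pagePath === rootPath) {
    return "homepage";
  }
  for (const [pattern, kind] of PAGE_KIND_RULES) {
    if (pattern.test(pagePath)) {
      return kind;
    }
  }
  return "other";
}
```

Because the keyword must be followed by `-`, `/`, or the end of the path, a URL such as `/blog/contactless` is not treated as a contact page.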


@@ -21,6 +21,21 @@ type LeadDiscoveryContactInput = {
usableEmail?: string | null;
};
export type LeadDiscoveryContactStatus =
| "new"
| "missing_contact"
| "audit_ready"
| "outreach_ready"
| "contacted"
| "replied"
| "do_not_contact";
type WebsiteEnrichmentScheduleInput = {
websiteUrl?: string | null;
websiteDomain?: string | null;
contactStatus: LeadDiscoveryContactStatus;
};
export type LeadDiscoveryPriority = "high" | "medium" | "low" | "defer" | "blocked";
type LeadDiscoveryPriorityInput = {
@@ -39,7 +54,7 @@ type LeadDiscoveryLeadRecordInput<TCampaignId extends string, TRunId extends str
now: number;
};
function optionalString(value: string | null | undefined) {
return value && value.trim().length > 0 ? value : undefined;
}
@@ -91,6 +106,16 @@ export function getLeadDiscoveryContactStatus(
return "missing_contact";
}
export function shouldScheduleWebsiteEnrichment(
input: WebsiteEnrichmentScheduleInput,
) {
const hasWebsiteData =
optionalString(input.websiteUrl) !== undefined ||
optionalString(input.websiteDomain) !== undefined;
return input.contactStatus === "missing_contact" && hasWebsiteData;
}
export function buildLeadDiscoveryLeadRecord<
TCampaignId extends string,
TRunId extends string,
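
The scheduling rule added in this hunk can be checked with a small truth table. This is a sketch under assumptions: `ContactStatusSketch`, `ScheduleInputSketch`, `hasText`, and `shouldScheduleSketch` are hypothetical, trimmed-down copies written for illustration, not the exported functions themselves.

```typescript
// Hypothetical copy of the contact status union from the diff.
type ContactStatusSketch =
  | "new"
  | "missing_contact"
  | "audit_ready"
  | "outreach_ready"
  | "contacted"
  | "replied"
  | "do_not_contact";

type ScheduleInputSketch = {
  websiteUrl?: string | null;
  websiteDomain?: string | null;
  contactStatus: ContactStatusSketch;
};

// True when the value holds at least one non-whitespace character.
function hasText(value: string | null | undefined): boolean {
  return typeof value === "string" && value.trim().length > 0;
}

// Enrichment is scheduled only when contact data is missing
// and there is some website reference to crawl.
function shouldScheduleSketch(input: ScheduleInputSketch): boolean {
  const hasWebsiteData =
    hasText(input.websiteUrl) || hasText(input.websiteDomain);
  return input.contactStatus === "missing_contact" && hasWebsiteData;
}

const scheduleDecisions = [
  // Missing contact plus a URL: schedule.
  shouldScheduleSketch({
    contactStatus: "missing_contact",
    websiteUrl: "https://example.com",
  }),
  // Whitespace-only domain counts as no website data: skip.
  shouldScheduleSketch({ contactStatus: "missing_contact", websiteDomain: "   " }),
  // A lead that already has contact data is not re-enriched: skip.
  shouldScheduleSketch({
    contactStatus: "outreach_ready",
    websiteUrl: "https://example.com",
  }),
];
```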


@@ -0,0 +1,544 @@
export type PageSpeedStrategy = "mobile" | "desktop";
export type PageSpeedAuditResultStatus = "succeeded" | "failed";
export type PageSpeedAuditErrorType =
| "quota"
| "timeout"
| "unavailable"
| "invalid_url"
| "api_error"
| "unknown";
export type PageSpeedAuditScores = {
performance?: number;
accessibility?: number;
bestPractices?: number;
seo?: number;
};
export type PageSpeedAuditMetrics = {
firstContentfulPaintMs?: number;
largestContentfulPaintMs?: number;
cumulativeLayoutShift?: number;
totalBlockingTimeMs?: number;
speedIndexMs?: number;
};
export type PageSpeedAuditNormalized = {
metrics?: PageSpeedAuditMetrics;
scores?: PageSpeedAuditScores;
opportunities?: string[];
implications?: string[];
};
export type PageSpeedMinimalAuditResult = {
strategy: PageSpeedStrategy;
status: PageSpeedAuditResultStatus;
sourceUrl: string;
finalUrl?: string;
normalized?: PageSpeedAuditNormalized;
errorType?: PageSpeedAuditErrorType;
errorSummary?: string;
};
export type PageSpeedAuditInputs = {
technicalSignals: string[];
customerImplications: string[];
internalNotes: string[];
};
type FailureContext = Readonly<{
status: string;
sourceUrl: string;
strategy: PageSpeedStrategy;
errorType?: PageSpeedAuditErrorType;
errorSummary?: string;
}>;
const CUSTOMER_IMPLICATION_LIMIT = 8;
const TECHNICAL_SIGNAL_LIMIT = 8;
const INTERNAL_NOTE_LIMIT = 6;
const SCORE_WORD_PATTERN =
/\bscore\b/i;
const SCORE_NUMBER_PATTERN =
/\b0?\.\d+\b|\b1(?:\.0+)?\b|\b[2-9]\d*\b/;
const RAW_STORAGE_PATTERN =
/\braw\s*storage\s*id\b/i;
const PAGE_SPEED_PATTERN =
/\bpagespeed\b/i;
const LIGHTHOUSE_PATTERN =
/\blighthouse\b/i;
const URL_PATTERN =
/\b(?:https?:\/\/|www\.)[^\s<>"']+/i;
const MARKUP_PATTERN =
/<[^>]+>/;
const JSON_BRACKET_PATTERN =
/\{[^}]*\}|\[[^\]]*\]/;
const SUSPICIOUS_MACHINE_TOKEN_PATTERN =
/\b[a-z\d_-]{24,}\b/i;
const PUBLIC_MACHINE_KEYWORDS_PATTERN =
/\b(?:raw\s*storage\s*id|rawstorageid|lighthouseresult|lighthouse|pagespeed|score)\b/i;
function toTrimmedText(value: unknown): string {
if (typeof value !== "string") {
return "";
}
return value.replace(/\s+/g, " ").trim();
}
function containsUntrustedPublicText(value: string): boolean {
if (URL_PATTERN.test(value)) {
return true;
}
if (MARKUP_PATTERN.test(value)) {
return true;
}
if (JSON_BRACKET_PATTERN.test(value)) {
return true;
}
if (PUBLIC_MACHINE_KEYWORDS_PATTERN.test(value)) {
return true;
}
if (SUSPICIOUS_MACHINE_TOKEN_PATTERN.test(value)) {
return true;
}
return false;
}
function isLikelyPlainGermanSentence(value: string): boolean {
if (!/[a-zäöüÄÖÜß]/i.test(value)) {
return false;
}
if (value.length > 500) {
return false;
}
return true;
}
function stripPublicText(value: string): string {
let text = toTrimmedText(value);
if (!text) {
return "";
}
if (containsUntrustedPublicText(text)) {
return "";
}
if (RAW_STORAGE_PATTERN.test(text) || PAGE_SPEED_PATTERN.test(text) || LIGHTHOUSE_PATTERN.test(text)) {
return "";
}
text = text.replace(/\b0?\.\d+\b/g, "");
text = text.replace(/\d+/g, "");
text = text.trim().replace(/\s{2,}/g, " ");
text = text.replace(/^[:\s]+/, "");
text = text.trim();
if (!isLikelyPlainGermanSentence(text)) {
return "";
}
if (!text) {
return "";
}
if (SUSPICIOUS_MACHINE_TOKEN_PATTERN.test(text)) {
return "";
}
if (/[<{[\]}]/.test(text)) {
return "";
}
if (PAGE_SPEED_PATTERN.test(text) || LIGHTHOUSE_PATTERN.test(text)) {
return "";
}
text = text.replace(/\s{2,}/g, " ").trim();
return text;
}
function stripInternalText(value: string): string {
let text = toTrimmedText(value);
if (!text) {
return "";
}
if (RAW_STORAGE_PATTERN.test(text)) {
return "";
}
if (URL_PATTERN.test(text)) {
return "";
}
if (MARKUP_PATTERN.test(text)) {
return "";
}
if (JSON_BRACKET_PATTERN.test(text)) {
return "";
}
text = text.replace(/^\s*score\s*[:\-]?\s*\d+(?:\.\d+)?\s*/i, "");
text = text.replace(/\{\s*[^}]*\}\s*/g, "");
text = text.replace(/\[[^\]]*\]\s*/g, "");
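// The shared pattern has no global flag, so build a global copy to strip every "score" token.
text = text.replace(new RegExp(SCORE_WORD_PATTERN.source, "gi"), "");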
text = text.replace(SCORE_NUMBER_PATTERN, "");
text = text.replace(/\b\d+(?:\.\d+)?\b/g, "");
text = text.replace(/^[:\s]+/, "");
text = text.trim().replace(/\s{2,}/g, " ");
return text;
}
function addUniqueCapped(
bucket: string[],
text: string,
max: number,
sanitize: (value: string) => string = stripPublicText,
): void {
const candidate = sanitize(text);
if (!candidate) {
return;
}
const normalized = candidate.toLowerCase().replace(/\s+/g, " ");
const duplicate = bucket.some(
(existing) =>
existing.toLowerCase().replace(/\s+/g, " ") === normalized,
);
if (!duplicate && bucket.length < max) {
bucket.push(candidate);
}
}
function hasMetricGap(
mobileValue: number | undefined,
desktopValue: number | undefined,
significantFactor = 1.25,
): boolean {
if (mobileValue === undefined || desktopValue === undefined) {
return false;
}
if (mobileValue <= desktopValue) {
return false;
}
if (desktopValue <= 0) {
return true;
}
return mobileValue >= desktopValue * significantFactor;
}
function addMobileWorseMessage(
mobile: PageSpeedAuditMetrics | undefined,
desktop: PageSpeedAuditMetrics | undefined,
technicalSignals: string[],
customerImplications: string[],
) {
if (!mobile || !desktop) {
return;
}
const fcpGap = hasMetricGap(
mobile.firstContentfulPaintMs,
desktop.firstContentfulPaintMs,
);
const lcpGap = hasMetricGap(
mobile.largestContentfulPaintMs,
desktop.largestContentfulPaintMs,
);
const tbtGap = hasMetricGap(
mobile.totalBlockingTimeMs,
desktop.totalBlockingTimeMs,
1.35,
);
const speedGap = hasMetricGap(
mobile.speedIndexMs,
desktop.speedIndexMs,
1.25,
);
const clsGap = hasMetricGap(
mobile.cumulativeLayoutShift,
desktop.cumulativeLayoutShift,
1.2,
);
if (!(fcpGap || lcpGap || tbtGap || speedGap || clsGap)) {
return;
}
const gapSentence =
"Die mobile Version ist deutlich langsamer als die Desktop-Variante.";
const mobileFirstSentence =
"Auf Mobilgeraeten verlieren Kunden dadurch frueher den ersten Eindruck.";
addUniqueCapped(technicalSignals, gapSentence, TECHNICAL_SIGNAL_LIMIT);
addUniqueCapped(technicalSignals, mobileFirstSentence, TECHNICAL_SIGNAL_LIMIT);
addUniqueCapped(
customerImplications,
gapSentence,
CUSTOMER_IMPLICATION_LIMIT,
);
addUniqueCapped(
customerImplications,
"Kunden auf dem Telefon warten laenger und brechen den Erstkontakt schneller ab.",
CUSTOMER_IMPLICATION_LIMIT,
);
}
function addScoreBasedSignals(
scores: PageSpeedAuditScores | undefined,
technicalSignals: string[],
customerImplications: string[],
) {
if (!scores) {
return;
}
if ((scores.accessibility ?? 1) < 0.9) {
addUniqueCapped(
technicalSignals,
"Barrierefreiheit und Bedienbarkeit sollten fuer alle Nutzerinnen und Nutzer verbessert werden.",
TECHNICAL_SIGNAL_LIMIT,
);
addUniqueCapped(
customerImplications,
"Einfacher Zugang und bessere Bedienbarkeit helfen mehr Interessenten zu erreichen.",
CUSTOMER_IMPLICATION_LIMIT,
);
}
if ((scores.seo ?? 1) < 0.9) {
addUniqueCapped(
technicalSignals,
"Technische Signale deuten auf reduzierte lokale Auffindbarkeit hin.",
TECHNICAL_SIGNAL_LIMIT,
);
addUniqueCapped(
customerImplications,
"Lokale Sichtbarkeit kann dadurch bei Neukundenanfragen sinken.",
CUSTOMER_IMPLICATION_LIMIT,
);
}
if ((scores.performance ?? 1) < 0.9) {
addUniqueCapped(
customerImplications,
"Wahrnehmbare Wartezeiten auf der Seite koennen das Vertrauen in den Auftritt mindern.",
CUSTOMER_IMPLICATION_LIMIT,
);
}
}
function addMetricSignals(
metrics: PageSpeedAuditMetrics | undefined,
technicalSignals: string[],
customerImplications: string[],
) {
if (!metrics) {
return;
}
if ((metrics.firstContentfulPaintMs ?? 0) > 2500) {
addUniqueCapped(
technicalSignals,
"Erster sichtbarer Inhalt erscheint deutlich verzoegert.",
TECHNICAL_SIGNAL_LIMIT,
);
addUniqueCapped(
customerImplications,
"Der erste sichtbare Inhalt erscheint spuerbar zu langsam.",
CUSTOMER_IMPLICATION_LIMIT,
);
}
if ((metrics.largestContentfulPaintMs ?? 0) > 4200) {
addUniqueCapped(
technicalSignals,
"Das wichtigste Inhaltselement wird stark verzoegert sichtbar.",
TECHNICAL_SIGNAL_LIMIT,
);
addUniqueCapped(
customerImplications,
"Wichtige Inhalte erscheinen zu spaet, was den ersten Eindruck schwaecht.",
CUSTOMER_IMPLICATION_LIMIT,
);
}
if ((metrics.totalBlockingTimeMs ?? 0) > 300) {
addUniqueCapped(
technicalSignals,
"Interaktion und Reaktionszeit sind stark beeintraechtigt.",
TECHNICAL_SIGNAL_LIMIT,
);
addUniqueCapped(
customerImplications,
"Bedienaktionen wirken traege und fuehren schneller zu Abbruechen.",
CUSTOMER_IMPLICATION_LIMIT,
);
}
if ((metrics.speedIndexMs ?? 0) > 3500) {
addUniqueCapped(
technicalSignals,
"Die visuelle Komplettierung der Seite verzoegert sich deutlich.",
TECHNICAL_SIGNAL_LIMIT,
);
addUniqueCapped(
customerImplications,
"Die Seite wirkt insgesamt schleppend aufgebaut und reduziert die Nutzungsbereitschaft.",
CUSTOMER_IMPLICATION_LIMIT,
);
}
if ((metrics.cumulativeLayoutShift ?? 0) > 0.1) {
addUniqueCapped(
technicalSignals,
"Instabile Layout-Werte weisen auf Spruenge in der Seitendarstellung hin.",
TECHNICAL_SIGNAL_LIMIT,
);
addUniqueCapped(
customerImplications,
"Elemente, die sich beim Laden verschieben, wirken unruhig und schwaechen das Vertrauen.",
CUSTOMER_IMPLICATION_LIMIT,
);
}
}
function addFailureNote(input: FailureContext, internalNotes: string[]) {
if (input.errorType === "quota") {
addUniqueCapped(
internalNotes,
"Die Abfrage wurde wegen Quota-Limit abgebrochen.",
INTERNAL_NOTE_LIMIT,
stripInternalText,
);
return;
}
if (input.errorType === "unavailable") {
addUniqueCapped(
internalNotes,
"Die Zielseite war nicht erreichbar.",
INTERNAL_NOTE_LIMIT,
stripInternalText,
);
return;
}
if (input.errorType === "timeout") {
addUniqueCapped(
internalNotes,
"Der Aufruf wurde wegen Timeout beendet.",
INTERNAL_NOTE_LIMIT,
stripInternalText,
);
return;
}
const base =
input.errorType === "invalid_url"
? "Die Zieladresse wurde als ungueltig bewertet"
: "Der Lauf wurde mit technischem Fehler abgeschlossen";
const summary = stripInternalText(input.errorSummary || "");
const full = summary ? `${base}: ${summary}` : base;
addUniqueCapped(internalNotes, full, INTERNAL_NOTE_LIMIT, stripInternalText);
}
export function assertNoPublicPageSpeedScores(value: unknown): boolean {
const lines = Array.isArray(value) ? value : [value];
for (const line of lines) {
if (typeof line !== "string" || !line.trim()) {
continue;
}
if (containsUntrustedPublicText(line)) {
return false;
}
const asString = String(line);
if (/\bscore\b/i.test(asString) || /\b\d+\b/.test(asString) || /\b\d+\.\d+\b/.test(asString)) {
return false;
}
}
return true;
}
export function buildPageSpeedAuditInputs(
results: readonly PageSpeedMinimalAuditResult[],
): PageSpeedAuditInputs {
const technicalSignals: string[] = [];
const customerImplications: string[] = [];
const internalNotes: string[] = [];
const list = Array.isArray(results) ? results : [];
let mobileResult: PageSpeedMinimalAuditResult | undefined;
let desktopResult: PageSpeedMinimalAuditResult | undefined;
for (const result of list) {
if (result.status === "succeeded") {
const normalized = result.normalized ?? {};
if (result.strategy === "mobile") {
mobileResult = result;
} else {
desktopResult = result;
}
for (const implication of normalized.implications ?? []) {
addUniqueCapped(customerImplications, implication, CUSTOMER_IMPLICATION_LIMIT);
}
for (const opportunity of normalized.opportunities ?? []) {
addUniqueCapped(
technicalSignals,
`Moegliche Optimierung: ${opportunity}`,
TECHNICAL_SIGNAL_LIMIT,
);
}
addMetricSignals(normalized.metrics, technicalSignals, customerImplications);
addScoreBasedSignals(
normalized.scores,
technicalSignals,
customerImplications,
);
continue;
}
addFailureNote(result, internalNotes);
}
addMobileWorseMessage(
mobileResult?.normalized?.metrics,
desktopResult?.normalized?.metrics,
technicalSignals,
customerImplications,
);
return {
technicalSignals,
customerImplications: customerImplications.slice(
0,
CUSTOMER_IMPLICATION_LIMIT,
),
internalNotes: internalNotes.slice(0, INTERNAL_NOTE_LIMIT),
};
}
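
The mobile-versus-desktop comparison in `hasMetricGap` is a relative threshold check. A minimal sketch with worked numbers follows; `isSignificantlyWorse` is a hypothetical name, and the sample metric values are made up for illustration.

```typescript
// Returns true when the mobile value is worse than desktop by at least `factor`.
// Missing values never count as a gap; a non-positive desktop baseline
// counts as a gap as soon as mobile is larger.
function isSignificantlyWorse(
  mobile: number | undefined,
  desktop: number | undefined,
  factor = 1.25,
): boolean {
  if (mobile === undefined || desktop === undefined) {
    return false;
  }
  if (mobile <= desktop) {
    return false;
  }
  if (desktop <= 0) {
    return true;
  }
  return mobile >= desktop * factor;
}

// Desktop LCP of 3000 ms with factor 1.25 gives a threshold of 3750 ms.
const gapAtFourSeconds = isSignificantlyWorse(4000, 3000);
const gapAtThreeAndAHalf = isSignificantlyWorse(3500, 3000);
// Layout shift: desktop is 0, so any positive mobile value is a gap.
const gapWithZeroBaseline = isSignificantlyWorse(0.05, 0, 1.2);
const gapWithMissingValue = isSignificantlyWorse(undefined, 3000);
```

With these inputs, 4000 ms exceeds the 3750 ms threshold and counts as a gap, while 3500 ms does not.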

lib/pagespeed-insights.ts Normal file

@@ -0,0 +1,544 @@
export type PageSpeedStrategy = "mobile" | "desktop";
export type PageSpeedErrorType =
| "quota"
| "timeout"
| "unavailable"
| "invalid_url"
| "api_error"
| "unknown";
export type PageSpeedScores = {
performance?: number;
accessibility?: number;
bestPractices?: number;
seo?: number;
};
export type PageSpeedMetrics = {
firstContentfulPaintMs?: number;
largestContentfulPaintMs?: number;
cumulativeLayoutShift?: number;
totalBlockingTimeMs?: number;
speedIndexMs?: number;
};
export type PageSpeedNormalizedResult = {
strategy: PageSpeedStrategy;
sourceUrl: string;
finalUrl?: string;
analysisTimestamp?: string;
scores?: PageSpeedScores;
metrics: PageSpeedMetrics;
opportunities: string[];
implications: string[];
};
type ClassifiedError = {
errorType: PageSpeedErrorType;
message: string;
};
type FetchLike = (
input: string,
init: { signal?: AbortSignal } | undefined,
) => Promise<{
ok: boolean;
status: number;
json: () => Promise<unknown>;
}>;
const PAGESPEED_ENDPOINT =
"https://pagespeedonline.googleapis.com/pagespeedonline/v5/runPagespeed";
const DEFAULT_TIMEOUT_MS = 10_000;
function asRecord(value: unknown): Record<string, unknown> | null {
if (!value || typeof value !== "object") {
return null;
}
return value as Record<string, unknown>;
}
function asString(value: unknown): string | null {
if (typeof value !== "string") {
return null;
}
const trimmed = value.trim();
return trimmed.length > 0 ? trimmed : null;
}
function asNumber(value: unknown): number | null {
if (typeof value === "number" && Number.isFinite(value)) {
return value;
}
return null;
}
function safeToLower(value: unknown): string {
if (typeof value === "string") {
return value.toLowerCase();
}
if (value instanceof Error) {
return `${value.name} ${value.message}`.toLowerCase();
}
if (value == null) {
return "";
}
if (typeof value === "object") {
try {
return JSON.stringify(value).toLowerCase();
} catch {
return "";
}
}
return "";
}
function hasPattern(value: unknown, patterns: string[]): boolean {
const lower = safeToLower(value);
return patterns.some((pattern) => lower.includes(pattern));
}
function firstNonEmptyString(...values: unknown[]): string | null {
for (const value of values) {
const stringValue = asString(value);
if (stringValue) {
return stringValue;
}
}
return null;
}
function extractPageSpeedErrorMessage(body: unknown): string | null {
const bodyRecord = asRecord(body);
const error = asRecord(bodyRecord?.error);
const lighthouseResult = asRecord(bodyRecord?.lighthouseResult);
const runtimeError = asRecord(lighthouseResult?.runtimeError);
return firstNonEmptyString(
error?.message,
bodyRecord?.error_message,
runtimeError?.message,
);
}
function buildImpactStatements(
scores: PageSpeedScores,
metrics: PageSpeedMetrics,
) {
const implications: string[] = [];
if ((scores.performance ?? 1) < 0.9) {
implications.push(
"Die allgemeine Seitengeschwindigkeit wirkt noch deutlich verbesserungswürdig.",
);
}
if ((metrics.firstContentfulPaintMs ?? 0) > 2_000) {
implications.push(
"Besucher sehen den ersten sichtbaren Inhalt auf der Seite zu langsam.",
);
}
if ((metrics.largestContentfulPaintMs ?? 0) > 3_000) {
implications.push(
"Der wichtigste Inhalt wird erst verspätet vollständig sichtbar.",
);
}
if ((metrics.cumulativeLayoutShift ?? 0) > 0.1) {
implications.push(
"Inhalte springen beim Laden nach, was den wahrgenommenen Seitenkomfort mindert.",
);
}
if ((metrics.totalBlockingTimeMs ?? 0) > 300) {
implications.push(
"Lange Blockierungszeiten können das Bediengefühl auf der Seite spürbar verlangsamen.",
);
}
if ((metrics.speedIndexMs ?? 0) > 3_500) {
implications.push(
"Der visuelle Seitenaufbau ist verzögert und die Wahrnehmung der Seitenqualität leidet.",
);
}
if ((scores.bestPractices ?? 1) < 0.85) {
implications.push(
"Es gibt mehrere technische Best-Practice-Punkte, die aktuell noch nachgebessert werden sollten.",
);
}
if ((scores.accessibility ?? 1) < 0.9) {
implications.push(
"Die Barrierefreiheit sollte verbessert werden, damit alle Nutzerinnen und Nutzer die Seite besser erreichen.",
);
}
return implications;
}
function normalizePageSpeedAnalysisTimestamp(raw: unknown): string | undefined {
const value = asString(
asRecord(raw)?.analysisUTCTimestamp ?? asRecord(raw)?.analysisTimestamp,
);
return value ?? undefined;
}
function normalizePageSpeedScores(lighthouseResult: Record<string, unknown>) {
const categories = asRecord(lighthouseResult.categories) ?? {};
const scores: PageSpeedScores = {};
const performance = asNumber(
asRecord(categories.performance)?.score,
);
if (performance !== null) {
scores.performance = performance;
}
const accessibility = asNumber(
asRecord(categories.accessibility)?.score,
);
if (accessibility !== null) {
scores.accessibility = accessibility;
}
const bestPractices = asNumber(
asRecord(categories["best-practices"])?.score,
);
if (bestPractices !== null) {
scores.bestPractices = bestPractices;
}
const seo = asNumber(asRecord(categories.seo)?.score);
if (seo !== null) {
scores.seo = seo;
}
return scores;
}
function normalizePageSpeedMetrics(audits: Record<string, unknown>) {
const metrics: PageSpeedMetrics = {};
const fcp = asNumber(asRecord(audits["first-contentful-paint"])?.numericValue);
if (fcp !== null) {
metrics.firstContentfulPaintMs = fcp;
}
const lcp = asNumber(asRecord(audits["largest-contentful-paint"])?.numericValue);
if (lcp !== null) {
metrics.largestContentfulPaintMs = lcp;
}
const cls = asNumber(asRecord(audits["cumulative-layout-shift"])?.numericValue);
if (cls !== null) {
metrics.cumulativeLayoutShift = cls;
}
const tbt = asNumber(asRecord(audits["total-blocking-time"])?.numericValue);
if (tbt !== null) {
metrics.totalBlockingTimeMs = tbt;
}
const speedIndex = asNumber(asRecord(audits["speed-index"])?.numericValue);
if (speedIndex !== null) {
metrics.speedIndexMs = speedIndex;
}
return metrics;
}
// Formats a savings value for display. The unit is passed explicitly because
// millisecond and byte savings need different scales and suffixes.
function formatSavingsHint(value: number, unit: "ms" | "bytes") {
  const rounded = Math.abs(Math.round(value));
  if (unit === "ms") {
    return `${rounded} ms`;
  }
  // Byte savings: 1 KB = 1024 bytes, 1 MB = 1024 * 1024 bytes.
  if (rounded >= 1024 * 1024) {
    return `${(rounded / (1024 * 1024)).toFixed(1).replace(".", ",")} MB`;
  }
  if (rounded >= 1024) {
    return `${Math.round(rounded / 1024)} KB`;
  }
  return `${rounded} B`;
}
function normalizeOpportunities(audits: Record<string, unknown>) {
  const opportunities: string[] = [];
  for (const [id, rawAudit] of Object.entries(audits)) {
    const audit = asRecord(rawAudit);
    if (!audit) {
      continue;
    }
    // Only Lighthouse audits whose details are of type "opportunity" are listed.
    const details = asRecord(audit.details);
    const type = asString(details?.type);
    if (type !== "opportunity") {
      continue;
    }
    const title = asString(audit.title) ?? id;
    const savingsMs = asNumber(details?.overallSavingsMs);
    const savingsBytes = asNumber(details?.overallSavingsBytes);
    if (savingsMs !== null) {
      opportunities.push(`${title}: ca. ${formatSavingsHint(savingsMs, "ms")} Einsparung möglich.`);
      continue;
    }
    if (savingsBytes !== null) {
      opportunities.push(`${title}: potenziell ${formatSavingsHint(savingsBytes, "bytes")} weniger Last.`);
      continue;
    }
    // Fallback when no savings estimate is available: flag low-scoring audits.
    const score = asNumber(audit.score);
    if (score !== null && score < 0.9) {
      opportunities.push(`${title}: hier ist weiteres Optimierungspotenzial vorhanden.`);
    }
  }
  return opportunities;
}
export function buildPageSpeedRequestUrl(input: {
url: string;
strategy: PageSpeedStrategy;
apiKey?: string | null;
locale?: string;
}): string {
const requestUrl = new URL(PAGESPEED_ENDPOINT);
requestUrl.searchParams.append("url", input.url);
requestUrl.searchParams.set("strategy", input.strategy);
requestUrl.searchParams.append("category", "performance");
requestUrl.searchParams.append("category", "accessibility");
requestUrl.searchParams.append("category", "best-practices");
requestUrl.searchParams.append("category", "seo");
requestUrl.searchParams.set("locale", input.locale ?? "de-DE");
if (asString(input.apiKey)) {
requestUrl.searchParams.set("key", input.apiKey as string);
}
return requestUrl.toString();
}
export function normalizePageSpeedResult(input: {
strategy: PageSpeedStrategy;
sourceUrl: string;
raw: unknown;
}): PageSpeedNormalizedResult {
const lighthouseResult = asRecord(asRecord(input.raw)?.lighthouseResult) ?? {};
const audits = asRecord(lighthouseResult.audits) ?? {};
const scores = normalizePageSpeedScores(lighthouseResult);
const metrics = normalizePageSpeedMetrics(audits);
const opportunities = normalizeOpportunities(audits);
const implications = buildImpactStatements(scores, metrics);
for (let i = implications.length - 1; i >= 0; i -= 1) {
if (implications[i] === "") {
implications.splice(i, 1);
}
}
const finalUrl = asString(lighthouseResult.finalUrl) ?? undefined;
const analysisTimestamp = normalizePageSpeedAnalysisTimestamp(input.raw);
const result: PageSpeedNormalizedResult = {
strategy: input.strategy,
sourceUrl: input.sourceUrl,
metrics,
opportunities,
implications,
};
if (finalUrl) {
result.finalUrl = finalUrl;
}
if (analysisTimestamp) {
result.analysisTimestamp = analysisTimestamp;
}
const hasAnyScore = Object.values(scores).some((value) => value !== undefined);
if (hasAnyScore) {
result.scores = scores;
}
return result;
}
export function classifyPageSpeedError(input: {
error?: unknown;
status?: number;
body?: unknown;
}): ClassifiedError {
const status = Number.isFinite(input?.status)
? Math.trunc(input.status as number)
: undefined;
const statusBodyText = [input?.error, input?.body, input?.status]
.map(safeToLower)
.join(" ");
const abortLike = input?.error instanceof DOMException
? input.error.name === "AbortError"
: false;
const errorMessage = safeToLower(
input?.error instanceof Error ? input.error.message : input?.error,
);
if (input?.error instanceof SyntaxError) {
return {
errorType: "api_error",
message: `PageSpeed-Antwort war kein gültiges JSON: ${errorMessage || "Unbekannt"}`,
};
}
if (abortLike || errorMessage.includes("abort") || errorMessage.includes("timeout")) {
return {
errorType: "timeout",
message: "PageSpeed-Anfrage wurde wegen Timeout abgebrochen.",
};
}
if (
hasPattern(statusBodyText, [
"429",
"quota",
"userratelimit",
"rate limit",
"user rate limit",
]) ||
status === 429) {
return {
errorType: "quota",
message: "PageSpeed-Anfrage wurde wegen API-Quota abgelehnt.",
};
}
if (
status === 404 ||
hasPattern(statusBodyText, [
"failed document",
"failed to fetch document",
"could not fetch document",
"unreachable document",
"could not fetch",
"not found",
])
) {
return {
errorType: "unavailable",
message: "Die analysierte Seite ist aktuell nicht erreichbar.",
};
}
if (
hasPattern(statusBodyText, [
"invalid url",
"invalid_url",
"bad url",
"malformed url",
"url parsing",
"unsupported url",
"missing required parameter: url",
])
) {
return {
errorType: "invalid_url",
message: "Die angegebene URL ist nicht valide für PageSpeed.",
};
}
if (status !== undefined && status >= 400 && status < 600) {
const apiMessage = extractPageSpeedErrorMessage(input?.body);
return {
errorType: "api_error",
message: apiMessage
? `PageSpeed-API lieferte einen Fehler: ${apiMessage}`
: "PageSpeed-API lieferte einen Fehler.",
};
}
return {
errorType: "unknown",
message: input?.error instanceof Error && input.error.message
? `Unbekannter Fehler beim PageSpeed-Zugriff: ${input.error.message}`
: "Unbekannter Fehler beim PageSpeed-Zugriff.",
};
}
function createPageSpeedError(classification: ClassifiedError): Error {
const error = new Error(classification.message);
return Object.assign(error, {
errorType: classification.errorType,
name: `PageSpeedError:${classification.errorType}`,
});
}
function isPageSpeedError(error: unknown): error is Error & { errorType: PageSpeedErrorType } {
return (
typeof error === "object" &&
error !== null &&
"errorType" in error &&
typeof (error as { errorType?: unknown }).errorType === "string"
);
}
async function parseResponseBody(
response: { json: () => Promise<unknown> },
swallowParseErrors: boolean,
): Promise<unknown> {
try {
return await response.json();
} catch (error) {
if (swallowParseErrors) {
return null;
}
throw error;
}
}
export async function fetchPageSpeedResult(input: {
url: string;
strategy: PageSpeedStrategy;
apiKey?: string | null;
timeoutMs?: number;
fetchImpl?: FetchLike;
}): Promise<unknown> {
const timeoutMs = input.timeoutMs ?? DEFAULT_TIMEOUT_MS;
const fetchImpl: FetchLike =
input.fetchImpl ??
((fetch as typeof globalThis.fetch) as unknown as FetchLike);
const requestUrl = buildPageSpeedRequestUrl({
url: input.url,
strategy: input.strategy,
apiKey: input.apiKey,
});
const controller = new AbortController();
const timer = setTimeout(() => {
controller.abort();
}, timeoutMs);
try {
const response = await fetchImpl(requestUrl, {
signal: controller.signal,
});
const body = await parseResponseBody(response, !response.ok);
if (!response.ok) {
const classification = classifyPageSpeedError({
status: response.status,
body,
});
throw createPageSpeedError(classification);
}
return body;
} catch (error) {
if (isPageSpeedError(error)) {
throw error;
}
const classification = classifyPageSpeedError({ error });
if (classification.errorType === "unknown") {
throw error;
}
throw createPageSpeedError(classification);
} finally {
clearTimeout(timer);
}
}
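
A note on the timeout branch of `classifyPageSpeedError`: it treats an error as an abort only when it is a `DOMException` named `AbortError`. An injected `fetchImpl` may reject with a plain `Error` that carries that name, and `AbortSignal.timeout()` rejects with a `TimeoutError`. The sketch below shows one way such cases could be recognised by name or message. `isTimeoutLikeError` is a hypothetical helper for illustration and is not part of this change.

```typescript
// Hypothetical helper (not part of the diff): decides whether an unknown
// thrown value looks like an abort or timeout, without requiring DOMException.
function isTimeoutLikeError(error: unknown): boolean {
  if (typeof error !== "object" || error === null) {
    return false;
  }
  const record = error as { name?: unknown; message?: unknown };
  const name = typeof record.name === "string" ? record.name : "";
  const message =
    typeof record.message === "string" ? record.message.toLowerCase() : "";
  return (
    name === "AbortError" ||
    name === "TimeoutError" ||
    message.includes("abort") ||
    message.includes("timeout")
  );
}

// A plain Error renamed to "AbortError", as a custom fetch wrapper might throw.
const abortError = Object.assign(new Error("The operation was aborted"), {
  name: "AbortError",
});
console.log(isTimeoutLikeError(abortError)); // true
console.log(isTimeoutLikeError(new Error("Request timeout"))); // true
console.log(isTimeoutLikeError(new Error("quota exceeded"))); // false
```

Matching on `name` rather than `instanceof DOMException` keeps the check independent of the runtime that produced the error, at the cost of accepting any object that happens to use one of these names.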

605
lib/website-crawler.ts Normal file

@@ -0,0 +1,605 @@
import { normalizeEmailAddress } from "./lead-discovery-google";
const HTTP_SCHEMES = new Set(["http:", "https:"]);
const RELEVANT_PATH_PATTERNS = [
/(?:^|\/)(kontakt|contact)(?:[-/]|$)/i,
/(?:^|\/)(impressum|imprint)(?:[-/]|$)/i,
/(?:^|\/)(leistungen|angebot|services?)(?:[-/]|$)/i,
// URL.pathname percent-encodes "ü" as %C3%BC, so the encoded form is matched as well.
/(?:^|\/)(ueber|über|%c3%bcber|team|about)(?:[-/]|$)/i,
];
const CONTACT_CONTEXT_KEYWORDS = [
"ansprechpartner",
"kontakt",
"e-mail",
"email",
"team",
"impressum",
"geschäftsführung",
"imprint",
"footer",
"anfrage",
];
const GENERIC_BUSINESS_LOCALS = new Set([
"info",
"kontakt",
"contact",
"office",
"hello",
"sales",
"support",
"service",
"team",
"post",
]);
export type WebsiteCrawlEmailCandidate = {
email: string;
emailSource: string | null;
contactPerson: string | null;
isBusinessContactAddress: boolean;
};
export type WebsiteCrawlContactSignals = {
visibleText: string;
phoneNumbers: string[];
emailCandidates: WebsiteCrawlEmailCandidate[];
hasContactFormSignal: boolean;
hasContactCtaSignal: boolean;
};
export type TechnicalChecksInput = {
rootUrl?: string | null;
finalUrl?: string | null;
title?: string | null;
metaDescription?: string | null;
visibleText?: string | null;
checkedUrls?: string[];
links?: Array<
| string
| {
href?: string;
status?: number;
statusCode?: number;
isBroken?: boolean;
}
>;
};
export type WebsiteTechnicalChecks = {
https: boolean;
finalUrl: string;
missingTitle: boolean;
missingMetaDescription: boolean;
hasVisibleContactPath: boolean;
brokenInternalLinks: string[];
};
function stripWww(host: string) {
return host.replace(/^www\./i, "");
}
function toLowerHost(value: string) {
try {
return new URL(value).hostname.toLowerCase();
} catch {
return "";
}
}
export function normalizeCrawlUrl(input?: string | null, base?: string) {
if (!input) {
return null;
}
const trimmed = input.trim();
if (!trimmed) {
return null;
}
if (!base && (trimmed.startsWith("//") || !trimmed.includes("://"))) {
return null;
}
let parsed: URL;
try {
parsed = new URL(trimmed, base);
} catch {
return null;
}
if (!HTTP_SCHEMES.has(parsed.protocol)) {
return null;
}
const normalizedHost = stripWww(parsed.hostname.toLowerCase());
const search = parsed.search;
const path = parsed.pathname || "/";
return `${parsed.protocol}//${normalizedHost}${parsed.port ? `:${parsed.port}` : ""}${path}${search}`;
}
export function isSameRegistrableHostishDomain(
candidateUrl: string,
rootUrl: string,
) {
const root = normalizeCrawlUrl(rootUrl) ?? undefined;
const candidate = normalizeCrawlUrl(candidateUrl, root);
if (!candidate || !root) {
return false;
}
const candidateHost = stripWww(toLowerHost(candidate));
const rootHost = stripWww(toLowerHost(root));
return candidateHost === rootHost && candidateHost.length > 0;
}
function normalizeForQueue(value: string | null) {
if (!value) {
return null;
}
let url: URL;
try {
url = new URL(value);
} catch {
return null;
}
const host = `${stripWww(url.hostname.toLowerCase())}${url.port ? `:${url.port}` : ""}`;
return `${url.protocol}//${host}${url.pathname.replace(/\/$/, "") || "/"}`;
}
export function discoverRelevantSubpageUrls(links: string[], rootUrl: string) {
const root = normalizeCrawlUrl(rootUrl);
if (!root) {
return [];
}
const parsedRoot = new URL(root);
const homepage = `${parsedRoot.protocol}//${stripWww(
parsedRoot.hostname.toLowerCase(),
)}${parsedRoot.port ? `:${parsedRoot.port}` : ""}/`;
const seen = new Set<string>([homepage]);
const buckets: string[][] = [[], [], [], []];
for (const link of links) {
const normalized = normalizeCrawlUrl(link, rootUrl);
if (!normalized || !isSameRegistrableHostishDomain(normalized, rootUrl)) {
continue;
}
const canonical = normalizeForQueue(normalized);
if (!canonical || seen.has(canonical)) {
continue;
}
let path: string;
try {
path = new URL(normalized).pathname.toLowerCase();
} catch {
continue;
}
for (const [priority, pattern] of RELEVANT_PATH_PATTERNS.entries()) {
if (pattern.test(path)) {
if (buckets[priority].length > 0) {
break;
}
buckets[priority].push(canonical);
seen.add(canonical);
break;
}
}
}
const relevant = [...buckets.flat()];
return [homepage, ...relevant].slice(0, 5);
}
function stripHtml(input: string) {
return input
.replace(/<script[\s\S]*?<\/script>/gi, " ")
.replace(/<style[\s\S]*?<\/style>/gi, " ")
.replace(/<[^>]*>/g, " ")
.replace(/\s+/g, " ")
.trim();
}
function stripLeadingToText(input: string) {
return input.replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim();
}
function decodeCommonEmailEntities(input: string) {
return input
.replace(/&nbsp;|&#xa0;|&#160;/gi, " ")
.replace(/&commat;|&#64;|&#x40;/gi, "@")
.replace(/&period;|&#46;|&#x2e;/gi, ".");
}
function normalizeEmailExtractionInput(input: string) {
return decodeCommonEmailEntities(input)
.replace(/<script[\s\S]*?<\/script>/gi, " ")
.replace(/<style[\s\S]*?<\/style>/gi, " ")
.replace(/\s+/g, " ")
.trim();
}
function normalizeMailtoAddress(value: string) {
const strippedQuery = value.split("?")[0] ?? "";
const withoutMailto = strippedQuery.replace(/^mailto:/i, "");
try {
return decodeURIComponent(withoutMailto).trim();
} catch {
return withoutMailto.trim();
}
}
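function denormalizeObfuscatedEmail(value: string) {
  // Resolve bracketed separators first. If the bare-word rule ran first, it
  // would turn "[dot]" into "[.]", which the bracket rule can no longer match.
  const bracketsResolved = value
    .replace(/\[\s*at\s*\]|\(\s*at\s*\)|\{\s*at\s*\}/gi, "@")
    .replace(
      /\[\s*(?:dot|punkt)\s*\]|\(\s*(?:dot|punkt)\s*\)|\{\s*(?:dot|punkt)\s*\}/gi,
      ".",
    )
    .replace(/\bpunkt\b|\bdot\b/gi, ".");
  // The candidate regex also accepts a bare " at " separator; convert the first
  // one only when no "@" is present yet.
  const withAt = bracketsResolved.includes("@")
    ? bracketsResolved
    : bracketsResolved.replace(/\s+at\s+/i, "@");
  return withAt
    .replace(/\s*@\s*/g, "@")
    .replace(/\s*\.\s*/g, ".")
    .replace(/\s+/g, "");
}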
function addEmailCandidate(
entries: WebsiteCrawlEmailCandidate[],
seen: Set<string>,
email: string,
source: string,
index: number,
length: number,
explicitPersons: Map<string, string>,
) {
const normalized = normalizeEmailAddress(email);
if (!normalized || seen.has(normalized)) {
return;
}
const businessContext = hasBusinessContactContext(source, index, length);
const explicitPerson =
explicitPersons.get(normalized) ?? getContactPersonForEmail(source, email, index);
entries.push({
email: normalized,
emailSource: null,
contactPerson: explicitPerson,
isBusinessContactAddress: businessContext,
});
seen.add(normalized);
}
function collectObfuscatedEmailCandidates(
source: string,
explicitPersons: Map<string, string>,
) {
const normalizedSource = normalizeEmailExtractionInput(source);
const localPart = "[a-z0-9._%+-]{1,64}";
const domainLabel = "[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?";
const tld = "[a-z]{2,}";
const strictAtSeparator =
"(?:@|\\[\\s*at\\s*\\]|\\(\\s*at\\s*\\)|\\{\\s*at\\s*\\})";
const looseAtSeparator = "\\bat\\b";
const atSeparator = `(?:${strictAtSeparator}|${looseAtSeparator})`;
const strictDotSeparator =
"(?:\\.|\\[\\s*(?:dot|punkt)\\s*\\]|\\(\\s*(?:dot|punkt)\\s*\\)|\\{\\s*(?:dot|punkt)\\s*\\})";
const looseDotSeparator = "\\b(?:dot|punkt)\\b";
const dotSeparator = `(?:${strictDotSeparator}|${looseDotSeparator})`;
const obfuscatedEmailRegex = new RegExp(
`\\b(?<local>${localPart})\\s*(?<at>${atSeparator})\\s*(?<domain>${domainLabel}(?:\\s*${dotSeparator}\\s*${domainLabel})*\\s*${dotSeparator}\\s*${tld})\\b`,
"gi",
);
const candidates: WebsiteCrawlEmailCandidate[] = [];
const seen = new Set<string>();
for (const match of normalizedSource.matchAll(obfuscatedEmailRegex)) {
const rawCandidate = match[0];
if (!rawCandidate) {
continue;
}
const localPartMatch = match.groups?.local ?? "";
const atSeparatorMatch = match.groups?.at ?? "";
const domainPartMatch = match.groups?.domain ?? "";
const isBareAt =
/\bat\b/i.test(atSeparatorMatch) && !/@|\[|\(|\{/.test(atSeparatorMatch);
const hasBareDot = /\b(?:dot|punkt)\b/i.test(domainPartMatch);
const deobfuscationIndex = match.index ?? -1;
if (deobfuscationIndex < 0) {
continue;
}
if ((isBareAt || hasBareDot) && !GENERIC_BUSINESS_LOCALS.has(localPartMatch.toLowerCase()) &&
!hasBusinessContactContext(
normalizedSource,
deobfuscationIndex,
rawCandidate.length,
)) {
continue;
}
const normalized = denormalizeObfuscatedEmail(rawCandidate);
const normalizedEmail = normalizeEmailAddress(normalized);
if (!normalizedEmail || seen.has(normalizedEmail)) {
continue;
}
const explicitPerson =
explicitPersons.get(normalizedEmail) ??
getContactPersonForEmail(normalizedSource, rawCandidate, deobfuscationIndex);
const businessContext = hasBusinessContactContext(
normalizedSource,
deobfuscationIndex,
rawCandidate.length,
);
candidates.push({
email: normalizedEmail,
emailSource: null,
contactPerson: explicitPerson,
isBusinessContactAddress: businessContext,
});
seen.add(normalizedEmail);
}
return candidates;
}
function getContactPersonForEmail(
text: string,
email: string,
index: number,
) {
const windowStart = Math.max(0, index - 120);
const windowEnd = Math.min(text.length, index + email.length + 120);
const context = text.slice(windowStart, windowEnd);
const beforeEmailContext = context.slice(0, index - windowStart);
const anchorMatches = Array.from(
beforeEmailContext.matchAll(/<a\b[^>]*>(.*?)<\/a>/gi),
);
const nearestAnchor = anchorMatches.at(-1);
if (nearestAnchor?.[1]) {
const anchorText = stripLeadingToText(nearestAnchor[1]).trim();
if (anchorText && !/@/.test(anchorText) && anchorText.length < 120) {
return anchorText;
}
}
const nearMatch = context.match(
/(?:(?:^|[>\s])([A-ZÄÖÜ][a-zäöüßÄÖÜ]+\s+[A-ZÄÖÜ][a-zäöüßÄÖÜ-]+(?:\s+[A-ZÄÖÜ][a-zäöüßÄÖÜ-]+)?))$/u,
);
if (nearMatch?.[1]) {
return stripLeadingToText(nearMatch[1]!).trim();
}
const directMatch = text.slice(0, index).match(
/([A-ZÄÖÜ][a-zäöüßÄÖÜ-]+\s+[A-ZÄÖÜ][a-zäöüßÄÖÜ-]+)\s*(?:,|\s+\()?\s*$/u,
);
return directMatch?.[1]?.trim() ?? null;
}
function hasBusinessContactContext(text: string, index: number, length: number) {
const context = text
.slice(Math.max(0, index - 140), Math.min(text.length, index + length + 140))
.toLowerCase();
return CONTACT_CONTEXT_KEYWORDS.some((keyword) => context.includes(keyword));
}
function makePhoneNumberSet(input: string) {
const phoneRegex = /(?:\+?\d[\d\s./()-]{7,}\d)/g;
const matches = input.matchAll(phoneRegex);
const values = new Set<string>();
for (const match of matches) {
const raw = match[0] ?? "";
const normalized = raw.replace(/[^\d+]/g, "");
if (normalized.length >= 7) {
values.add(raw.trim());
values.add(normalized);
}
}
return Array.from(values).filter((value) => value.length >= 7);
}
function makeEmailCandidates(input: string) {
const emailRegex = /[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}(?:\b)?/gi;
const mailtoAnchors = input.matchAll(
/href=["']mailto:([^"'>\s]+)["'][^>]*>(.*?)<\/a>/gi,
);
const normalizedInput = normalizeEmailExtractionInput(input);
const explicitPersons = new Map<string, string>();
const entries: WebsiteCrawlEmailCandidate[] = [];
const seen = new Set<string>();
for (const anchorMatch of mailtoAnchors) {
const rawHref = normalizeMailtoAddress(anchorMatch[1] ?? "");
const email = normalizeEmailAddress(rawHref);
if (!email) {
continue;
}
const label = stripLeadingToText(
decodeCommonEmailEntities(anchorMatch[2] ?? ""),
).trim();
const normalizedLabelEmail = normalizeEmailAddress(label);
if (label && label.length <= 64 && !label.includes("@")) {
explicitPersons.set(email, label);
}
if (seen.has(email)) {
continue;
}
const anchorIndex = anchorMatch.index ?? -1;
if (anchorIndex < 0) {
continue;
}
const contactPerson =
normalizedLabelEmail && normalizedLabelEmail === email ? null : label || null;
entries.push({
email,
emailSource: null,
contactPerson,
isBusinessContactAddress: hasBusinessContactContext(
input,
anchorIndex,
email.length,
),
});
seen.add(email);
}
for (const match of normalizedInput.matchAll(emailRegex)) {
const rawEmail = match[0] ?? "";
const idx = match.index ?? -1;
if (rawEmail.length === 0 || idx < 0) {
continue;
}
addEmailCandidate(
entries,
seen,
rawEmail,
normalizedInput,
idx,
rawEmail.length,
explicitPersons,
);
}
for (const candidate of collectObfuscatedEmailCandidates(input, explicitPersons)) {
if (seen.has(candidate.email)) {
continue;
}
entries.push(candidate);
seen.add(candidate.email);
}
return entries;
}
export function extractContactSignalsFromHtmlLikeText(input: string) {
const visibleText = stripHtml(input);
const phoneNumbers = makePhoneNumberSet(visibleText);
const emailCandidates = makeEmailCandidates(input);
const lowerInput = input.toLowerCase();
const hasContactFormSignal =
/kontaktformular|anfrageformular|contact form|<form\b/i.test(lowerInput);
const hasContactCtaSignal =
/kontaktformular|anfrageformular|anfrage\s*senden|anfrage\s*stellen|schreiben\s+sie\s+uns|kontaktieren\s+(?:sie|du)|kontakt\s+(?:und|mit|zu)|<form\b/i.test(
lowerInput,
);
return {
visibleText,
phoneNumbers,
emailCandidates,
hasContactFormSignal,
hasContactCtaSignal,
};
}
function isRelevantContactPathText(value: string) {
const normalized = value.toLowerCase();
return (
/(?:^|\/)(kontakt|contact)(?:[-/]|$)/.test(normalized) ||
/(?:^|\/)(impressum|imprint)(?:[-/]|$)/.test(normalized) ||
/(?:^|\/)(leistungen|angebot|services?)(?:[-/]|$)/.test(normalized) ||
/(?:^|\/)(ueber|über|%c3%bcber|about|team)(?:[-/]|$)/.test(normalized) ||
/\bkontakt\b/.test(normalized) ||
/\bkontaktformular\b/.test(normalized)
);
}
export function buildTechnicalChecks(input: TechnicalChecksInput): WebsiteTechnicalChecks {
const finalUrl = normalizeCrawlUrl(input.finalUrl ?? "", input.rootUrl ?? undefined) ??
normalizeCrawlUrl(input.rootUrl ?? "", undefined) ??
"";
const normalizedRoot = normalizeCrawlUrl(input.rootUrl ?? finalUrl ?? "", undefined) ??
finalUrl;
const title = input.title?.trim() ?? "";
const metaDescription = input.metaDescription?.trim() ?? "";
const visibleText = input.visibleText ?? "";
const relevantVisibleText = visibleText.toLowerCase();
const hasVisibleContactPath =
isRelevantContactPathText(relevantVisibleText) ||
isRelevantContactPathText(finalUrl) ||
isRelevantContactPathText(new URL(finalUrl || "https://localhost").pathname);
const checkedUrlSet = new Set<string>();
const checkedUrls = input.checkedUrls ?? [];
for (const checkedUrl of checkedUrls) {
const normalizedCheckedUrl = normalizeCrawlUrl(checkedUrl, normalizedRoot ?? undefined);
if (!normalizedCheckedUrl || !isSameRegistrableHostishDomain(normalizedCheckedUrl, normalizedRoot)) {
continue;
}
const canonicalCheckedUrl = normalizeForQueue(normalizedCheckedUrl);
if (canonicalCheckedUrl) {
checkedUrlSet.add(canonicalCheckedUrl);
}
}
const hasCheckedUrls = checkedUrlSet.size > 0;
const brokenInternalLinksSet = new Set<string>();
for (const entry of input.links ?? []) {
const href = typeof entry === "string" ? entry : (entry.href ?? "");
const normalizedLink = normalizeCrawlUrl(href, normalizedRoot ?? undefined);
if (!normalizedLink || !isSameRegistrableHostishDomain(normalizedLink, normalizedRoot)) {
continue;
}
const canonical = normalizeForQueue(normalizedLink);
if (!canonical) {
continue;
}
if (hasCheckedUrls && !checkedUrlSet.has(canonical)) {
continue;
}
let isBroken = false;
if (typeof entry !== "string") {
if (entry.isBroken === true) {
isBroken = true;
}
const status = entry.status ?? entry.statusCode;
if (typeof status === "number" && (status >= 400 || status <= 0)) {
isBroken = true;
}
}
if (isBroken) {
brokenInternalLinksSet.add(canonical);
}
}
return {
https: finalUrl.startsWith("https://"),
finalUrl,
missingTitle: title.length === 0,
missingMetaDescription: metaDescription.length === 0,
hasVisibleContactPath,
brokenInternalLinks: Array.from(brokenInternalLinksSet),
};
}
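
A note on the obfuscated-address handling in `lib/website-crawler.ts`: when separators such as `[at]`, `(dot)` or a bare `punkt` are rewritten, the order of the replacements matters. The sketch below illustrates the idea in isolation, resolving bracketed forms before bare words. `deobfuscateEmailSketch` is a hypothetical standalone function, not the module's `denormalizeObfuscatedEmail`, and it assumes its input is already a single matched candidate rather than free text.

```typescript
// Hypothetical standalone sketch (not part of the diff). Assumes the input is
// one already-matched candidate such as "info [at] example [dot] de".
function deobfuscateEmailSketch(candidate: string): string {
  return candidate
    // 1. Bracketed "at": [at], (at), {at}. Mismatched pairs like "[at)" also pass.
    .replace(/[\[({]\s*at\s*[\])}]/gi, "@")
    // 2. Bracketed "dot"/"punkt" BEFORE the bare-word rule, so the brackets
    //    are consumed together with the word.
    .replace(/[\[({]\s*(?:dot|punkt)\s*[\])}]/gi, ".")
    // 3. Bare "dot"/"punkt" as whole words.
    .replace(/\b(?:dot|punkt)\b/gi, ".")
    // 4. Remove whitespace around the separators and any that remains.
    .replace(/\s*@\s*/g, "@")
    .replace(/\s*\.\s*/g, ".")
    .replace(/\s+/g, "");
}

console.log(deobfuscateEmailSketch("info [at] example [dot] de")); // info@example.de
console.log(deobfuscateEmailSketch("info (at) example punkt de")); // info@example.de
console.log(deobfuscateEmailSketch("kontakt{at}firma{punkt}de")); // kontakt@firma.de
```

Because step 3 rewrites every whole-word `dot` or `punkt`, the sketch is only safe on a pre-matched candidate; applied to arbitrary page text it would also rewrite ordinary words.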


@@ -13,12 +13,14 @@
"dependencies": {
"@convex-dev/better-auth": "^0.12.2",
"@hookform/resolvers": "^5.4.0",
"@sparticuz/chromium-min": "^149.0.0",
"better-auth": "^1.6.14",
"class-variance-authority": "^0.7.1",
"clsx": "^2.1.1",
"convex": "^1.40.0",
"lucide-react": "^1.17.0",
"next": "16.2.7",
"playwright-core": "^1.60.0",
"radix-ui": "^1.4.3",
"react": "19.2.4",
"react-dom": "19.2.4",

197
pnpm-lock.yaml generated

@@ -14,6 +14,9 @@ importers:
'@hookform/resolvers':
specifier: ^5.4.0
version: 5.4.0(react-hook-form@7.77.0(react@19.2.4))
'@sparticuz/chromium-min':
specifier: ^149.0.0
version: 149.0.0
better-auth:
specifier: ^1.6.14
version: 1.6.14(next@16.2.7(@babel/core@7.29.7)(react-dom@19.2.4(react@19.2.4))(react@19.2.4))(react-dom@19.2.4(react@19.2.4))(react@19.2.4)
@@ -32,6 +35,9 @@ importers:
next:
specifier: 16.2.7
version: 16.2.7(@babel/core@7.29.7)(react-dom@19.2.4(react@19.2.4))(react@19.2.4)
playwright-core:
specifier: ^1.60.0
version: 1.60.0
radix-ui:
specifier: ^1.4.3
version: 1.4.3(@types/react-dom@19.2.3(@types/react@19.2.16))(@types/react@19.2.16)(react-dom@19.2.4(react@19.2.4))(react@19.2.4)
@@ -1596,6 +1602,10 @@ packages:
resolution: {integrity: sha512-tlqY9xq5ukxTUZBmoOp+m61cqwQD5pHJtFY3Mn8CA8ps6yghLH/Hw8UPdqg4OLmFW3IFlcXnQNmo/dh8HzXYIQ==}
engines: {node: '>=18'}
'@sparticuz/chromium-min@149.0.0':
resolution: {integrity: sha512-/+QWJ6jDQnm/U7BITWVVcoe1CbuyW13pjonFpfBY67ZxePbaY/j4Ho+//n82AoGwugdkVVOYGY00KzMJzfYQdg==}
engines: {node: ^22.17.0 || >=24.0.0}
'@standard-schema/spec@1.1.0':
resolution: {integrity: sha512-l2aFy5jALhniG5HgqrD6jXLi/rUWrKvqN/qJx6yoJsgKhblVd+iqqU4RCXavm/jPityDo5TCvKMnpjKnOriy0w==}
@@ -2025,6 +2035,14 @@ packages:
resolution: {integrity: sha512-qIj0G9wZbMGNLjLmg1PT6v2mE9AH2zlnADJD/2tC6E00hgmhUOfEB6greHPAfLRSufHqROIUTkw6E+M3lH0PTQ==}
engines: {node: '>= 0.4'}
b4a@1.8.1:
resolution: {integrity: sha512-aiqre1Nr0B/6DgE2N5vwTc+2/oQZ4Wh1t4NznYY4E00y8LCt6NqdRv81so00oo27D8MVKTpUa/MwUUtBLXCoDw==}
peerDependencies:
react-native-b4a: '*'
peerDependenciesMeta:
react-native-b4a:
optional: true
balanced-match@1.0.2:
resolution: {integrity: sha512-3oSeUO0TMV67hN1AmbXsK4yaqU7tjiHlbxRDZOpH0KW9+CeX4bRAaX0Anxt0tx2MrpRpWwQaPwIlISEJhYU5Pw==}
@@ -2032,6 +2050,47 @@ packages:
resolution: {integrity: sha512-BLrgEcRTwX2o6gGxGOCNyMvGSp35YofuYzw9h1IMTRmKqttAZZVU67bdb9Pr2vUHA8+j3i2tJfjO6C6+4myGTA==}
engines: {node: 18 || 20 || >=22}
bare-events@2.9.1:
resolution: {integrity: sha512-Z0oHEHAFDZkffN8Qc39zNZjQlMDkPJRyyyZieU1VH7u8c5S+qHZ2S8ixdKIAxEjfHO7FJxXmJWgteOghVanIsg==}
peerDependencies:
bare-abort-controller: '*'
peerDependenciesMeta:
bare-abort-controller:
optional: true
bare-fs@4.7.2:
resolution: {integrity: sha512-aTvMFUWkBmjzKtEQMDGGDNF8bkfpD5N1b/FCwt7A3wrU4t1o/e/85Wzkluh6JlODCjqVESYCkQCdTXqZ9G7VFg==}
engines: {bare: '>=1.16.0'}
peerDependencies:
bare-buffer: '*'
peerDependenciesMeta:
bare-buffer:
optional: true
bare-os@3.9.1:
resolution: {integrity: sha512-6M5XjcnsygQNPMCMPXSK379xrJFiZ/AEMNBmFEmQW8d/789VQATvriyi5r0HYTL9TkQ26rn3kgdTG3aisbrXkQ==}
engines: {bare: '>=1.14.0'}
bare-path@3.0.1:
resolution: {integrity: sha512-ghj2DSK/2e99a1anTVPCV4m4YIYtrbXhfM7V3D7XZLOTsybnYyaJloymGqssQc8l/or0UoDyRtNQkmkEF/ysgQ==}
bare-stream@2.13.1:
resolution: {integrity: sha512-Vp0cnjYyrEC4whYTymQ+YZi6pBpfiICZO3cfRG8sy67ZNWe951urv1x4eW1BKNngw3U+3fPYb5JQvHbCtxH7Ow==}
peerDependencies:
bare-abort-controller: '*'
bare-buffer: '*'
bare-events: '*'
peerDependenciesMeta:
bare-abort-controller:
optional: true
bare-buffer:
optional: true
bare-events:
optional: true
bare-url@2.4.3:
resolution: {integrity: sha512-Kccpc7ACfXaxfeInfqKcZtW4pT5YBn1mesc4sCsun6sRwtbJ4h+sNOaksUpYEJUKfN65YWC6Bw2OJEFiKxq8nQ==}
baseline-browser-mapping@2.10.33:
resolution: {integrity: sha512-bA6+tcSLpz2tIEdDXZPpPTIuxBcC4+w6SieaYyfigIa4h8GlFxbA17v22Vx3JUtuZQj9SgOsnbK+aTBzyDyEuw==}
engines: {node: '>=6.0.0'}
@@ -2430,6 +2489,9 @@ packages:
resolution: {integrity: sha512-Q0n9HRi4m6JuGIV1eFlmvJB7ZEVxu93IrMyiMsGC0lrMJMWzRgx6WGquyfQgZVb31vhGgXnfmPNNXmxnOkRBrg==}
engines: {node: '>= 0.8'}
end-of-stream@1.4.5:
resolution: {integrity: sha512-ooEGc6HP26xXq/N+GCGOT0JKCLDGrq2bQUZrQ7gyrJiZANJ/8YDTxTpQBXGMn+WbIQXNVpyWymm7KYVICQnyOg==}
enhanced-resolve@5.22.1:
resolution: {integrity: sha512-6QEuw3zoX1SJQc7b87aBXke/no+mG2bTBgw29gWMQonLmpEkWoCAVkl+M49e48AZlWzxiDzDZzYdp6kobcyLww==}
engines: {node: '>=10.13.0'}
@@ -2622,6 +2684,9 @@ packages:
resolution: {integrity: sha512-aIL5Fx7mawVa300al2BnEE4iNvo1qETxLrPI/o05L7z6go7fCw1J6EQmbK4FmJ2AS7kgVF/KEZWufBfdClMcPg==}
engines: {node: '>= 0.6'}
events-universal@1.0.1:
resolution: {integrity: sha512-LUd5euvbMLpwOF8m6ivPCbhQeSiYVNb8Vs0fQ8QjXo0JTkEHpz8pxdQf0gStltaPpw0Cca8b39KxvK9cfKRiAw==}
eventsource-parser@3.1.0:
resolution: {integrity: sha512-kJezFj9YFAMLeORyi7aCLxLbD5/qWMQnoMVlVPyHIll7lgRJCc3JVln9Vgl9nwQi0YkMnhdGTMNn7CkRRAptMg==}
engines: {node: '>=18.0.0'}
@@ -2651,6 +2716,9 @@ packages:
fast-deep-equal@3.1.3:
resolution: {integrity: sha512-f3qQ9oQy9j2AhBe/H9VC91wLmKBCCU/gDOnKNAYG5hswO7BLKj09Hc5HYNz9cGI++xlpDCIgDaitVs03ATR84Q==}
fast-fifo@1.3.2:
resolution: {integrity: sha512-/d9sfos4yxzpwkDkuN7k2SqFKtYNmCTzgfEpz82x34IM9/zc8KGxQoXg1liNC/izpRM/MBdt44Nmx41ZWqk+FQ==}
fast-glob@3.3.1:
resolution: {integrity: sha512-kNFPyjhh5cKjrUltxs+wFx+ZkbRaxxmZ+X0ZU31SOsxCEtP9VPgtq2teZw1DebupL5GmDaNQ6yKMMVcM41iqDg==}
engines: {node: '>=8.6.0'}
@@ -3550,6 +3618,11 @@ packages:
resolution: {integrity: sha512-wQ0b/W4Fr01qtpHlqSqspcj3EhBvimsdh0KlHhH8HRZnMsEa0ea2fTULOXOS9ccQr3om+GcGRk4e+isrZWV8qQ==}
engines: {node: '>=16.20.0'}
playwright-core@1.60.0:
resolution: {integrity: sha512-9bW6zvX/m0lEbgTKJ6YppOKx8H3VOPBMOCFh2irXFOT4BbHgrx5hPjwJYLT40Lu+4qtD36qKc/Hn56StUW57IA==}
engines: {node: '>=18'}
hasBin: true
possible-typed-array-names@1.1.0:
resolution: {integrity: sha512-/+5VFTchJDoVj3bhoqi6UeymcD00DAwb1nJwamzPvHEszJ4FpF6SNNbUbOS8yI56qHzdV8eK0qEfOSiodkTdxg==}
engines: {node: '>= 0.4'}
@@ -3594,6 +3667,9 @@ packages:
resolution: {integrity: sha512-llQsMLSUDUPT44jdrU/O37qlnifitDP+ZwrmmZcoSKyLKvtZxpyV0n2/bD/N4tBAAZ/gJEdZU7KMraoK1+XYAg==}
engines: {node: '>= 0.10'}
pump@3.0.4:
resolution: {integrity: sha512-VS7sjc6KR7e1ukRFhQSY5LM2uBWAUPiOPa/A3mkKmiMwSmRFUITt0xuj+/lesgnCv+dPIEYlkzrcyXgquIHMcA==}
punycode@2.3.1:
resolution: {integrity: sha512-vYt7UD1U9Wg6138shLtLOvdAu+8DsC/ilFtEVHcH+wydcSpNE20AfSOduf6MkRFahL5FY7X1oU7nKVZFtfq8Fg==}
engines: {node: '>=6'}
@@ -3853,6 +3929,9 @@ packages:
resolution: {integrity: sha512-eLoXW/DHyl62zxY4SCaIgnRhuMr6ri4juEYARS8E6sCEqzKpOiE521Ucofdx+KnDZl5xmvGYaaKCk5FEOxJCoQ==}
engines: {node: '>= 0.4'}
streamx@2.26.0:
resolution: {integrity: sha512-VvNG1K72Po/xwJzxZFnZ++Tbrv4lwSptsbkFuzXCJAYZvCK5nnxsvXU6ajqkv7chyiI1Y0YXq2Jh8Iy8Y7NF/A==}
strict-event-emitter@0.5.1:
resolution: {integrity: sha512-vMgjE/GGEPEFnhFub6pa4FmJBRBVOLpIII2hvCZ8Kzb7K0hlHo7mQv6xYrBvCL2LtAIBwFUK8wvuJgTVSQ5MFQ==}
@@ -3950,6 +4029,18 @@ packages:
resolution: {integrity: sha512-uxc/zpqFg6x7C8vOE7lh6Lbda8eEL9zmVm/PLeTPBRhh1xCgdWaQ+J1CUieGpIfm2HdtsUpRv+HshiasBMcc6A==}
engines: {node: '>=6'}
tar-fs@3.1.2:
resolution: {integrity: sha512-QGxxTxxyleAdyM3kpFs14ymbYmNFrfY+pHj7Z8FgtbZ7w2//VAgLMac7sT6nRpIHjppXO2AwwEOg0bPFVRcmXw==}
tar-stream@3.2.0:
resolution: {integrity: sha512-ojzvCvVaNp6aOTFmG7jaRD0meowIAuPc3cMMhSgKiVWws1GyHbGd/xvnyuRKcKlMpt3qvxx6r0hreCNITP9hIg==}
teex@1.0.1:
resolution: {integrity: sha512-eYE6iEI62Ni1H8oIa7KlDU6uQBtqr4Eajni3wX7rpfXD8ysFx8z0+dri+KWEPWpBsxXfxu58x/0jvTVT1ekOSg==}
text-decoder@1.2.7:
resolution: {integrity: sha512-vlLytXkeP4xvEq2otHeJfSQIRyWxo/oZGEbXrtEEF9Hnmrdly59sUbzZ/QgyWuLYHctCHxFF4tRQZNQ9k60ExQ==}
tiny-invariant@1.3.3:
resolution: {integrity: sha512-+FbBPE1o9QAYvviau/qC5SE3caw21q3xkvWKBtja5vgqOWIHHJ3ioaq1VPfn/Szqctz2bU/oYeKd9/z5BL+PVg==}
@@ -5671,6 +5762,14 @@ snapshots:
'@sindresorhus/merge-streams@4.0.0': {}
'@sparticuz/chromium-min@149.0.0':
dependencies:
tar-fs: 3.1.2
transitivePeerDependencies:
- bare-abort-controller
- bare-buffer
- react-native-b4a
'@standard-schema/spec@1.1.0': {}
'@standard-schema/utils@0.3.0': {}
@@ -6078,10 +6177,44 @@ snapshots:
axobject-query@4.1.0: {}
b4a@1.8.1: {}
balanced-match@1.0.2: {}
balanced-match@4.0.4: {}
bare-events@2.9.1: {}
bare-fs@4.7.2:
dependencies:
bare-events: 2.9.1
bare-path: 3.0.1
bare-stream: 2.13.1(bare-events@2.9.1)
bare-url: 2.4.3
fast-fifo: 1.3.2
transitivePeerDependencies:
- bare-abort-controller
- react-native-b4a
bare-os@3.9.1: {}
bare-path@3.0.1:
dependencies:
bare-os: 3.9.1
bare-stream@2.13.1(bare-events@2.9.1):
dependencies:
streamx: 2.26.0
teex: 1.0.1
optionalDependencies:
bare-events: 2.9.1
transitivePeerDependencies:
- react-native-b4a
bare-url@2.4.3:
dependencies:
bare-path: 3.0.1
baseline-browser-mapping@2.10.33: {}
better-auth@1.6.14(next@16.2.7(@babel/core@7.29.7)(react-dom@19.2.4(react@19.2.4))(react@19.2.4))(react-dom@19.2.4(react@19.2.4))(react@19.2.4):
@@ -6384,6 +6517,10 @@ snapshots:
encodeurl@2.0.0: {}
end-of-stream@1.4.5:
dependencies:
once: 1.4.0
enhanced-resolve@5.22.1:
dependencies:
graceful-fs: 4.2.11
@@ -6745,6 +6882,12 @@ snapshots:
etag@1.8.1: {}
events-universal@1.0.1:
dependencies:
bare-events: 2.9.1
transitivePeerDependencies:
- bare-abort-controller
eventsource-parser@3.1.0: {}
eventsource@3.0.7:
@@ -6818,6 +6961,8 @@ snapshots:
fast-deep-equal@3.1.3: {}
fast-fifo@1.3.2: {}
fast-glob@3.3.1:
dependencies:
'@nodelib/fs.stat': 2.0.5
@@ -7654,6 +7799,8 @@ snapshots:
pkce-challenge@5.0.1: {}
playwright-core@1.60.0: {}
possible-typed-array-names@1.1.0: {}
postcss-selector-parser@7.1.1:
@@ -7699,6 +7846,11 @@ snapshots:
forwarded: 0.2.0
ipaddr.js: 1.9.1
pump@3.0.4:
dependencies:
end-of-stream: 1.4.5
once: 1.4.0
punycode@2.3.1: {}
qs@6.15.2:
@@ -8101,6 +8253,15 @@ snapshots:
es-errors: 1.3.0
internal-slot: 1.1.0
streamx@2.26.0:
dependencies:
events-universal: 1.0.1
fast-fifo: 1.3.2
text-decoder: 1.2.7
transitivePeerDependencies:
- bare-abort-controller
- react-native-b4a
strict-event-emitter@0.5.1: {}
string-width@4.2.3:
@@ -8208,6 +8369,42 @@ snapshots:
tapable@2.3.3: {}
tar-fs@3.1.2:
dependencies:
pump: 3.0.4
tar-stream: 3.2.0
optionalDependencies:
bare-fs: 4.7.2
bare-path: 3.0.1
transitivePeerDependencies:
- bare-abort-controller
- bare-buffer
- react-native-b4a
tar-stream@3.2.0:
dependencies:
b4a: 1.8.1
bare-fs: 4.7.2
fast-fifo: 1.3.2
streamx: 2.26.0
transitivePeerDependencies:
- bare-abort-controller
- bare-buffer
- react-native-b4a
teex@1.0.1:
dependencies:
streamx: 2.26.0
transitivePeerDependencies:
- bare-abort-controller
- react-native-b4a
text-decoder@1.2.7:
dependencies:
b4a: 1.8.1
transitivePeerDependencies:
- react-native-b4a
tiny-invariant@1.3.3: {}
tinyglobby@0.2.17:

@@ -0,0 +1,30 @@
import assert from "node:assert/strict";
import { readFile } from "node:fs/promises";
import { join } from "node:path";
import test from "node:test";
const campaignsBoardPath = join(
process.cwd(),
"components",
"campaigns",
"campaigns-board.tsx",
);
test("campaign board renders campaigns as responsive cards", async () => {
const source = await readFile(campaignsBoardPath, "utf8");
assert.doesNotMatch(source, /<table\b/i);
assert.doesNotMatch(source, /<thead\b/i);
assert.doesNotMatch(source, /<tbody\b/i);
assert.doesNotMatch(source, /<tr\b/i);
assert.doesNotMatch(source, /<td\b/i);
assert.doesNotMatch(source, /<th\b/i);
assert.doesNotMatch(source, /md:hidden/i);
assert.doesNotMatch(source, /md:block/i);
assert.match(source, /className="grid gap-3"/);
assert.match(source, /openEditDialog\(campaign\)/);
assert.match(source, /toggleCampaign\(campaign\)/);
assert.match(source, /runCampaign\(campaign\)/);
});

@@ -8,6 +8,7 @@ import {
isStalePendingAgentRun,
getLeadDiscoveryContactStatus,
getLeadDiscoveryPriority,
shouldScheduleWebsiteEnrichment,
} from "../lib/lead-discovery-run";
test("agent run guard ignores stale pending runs but blocks active runs", () => {
@@ -180,6 +181,69 @@ test("lead discovery lead record stores valid email and sets contactStatus to ne
assert.equal(record.contactPerson, undefined);
});
test("scheduling helper triggers website enrichment for missing contact leads with website data", () => {
assert.equal(
shouldScheduleWebsiteEnrichment({
websiteUrl: "https://www.example.de",
websiteDomain: "example.de",
contactStatus: "missing_contact",
}),
true,
);
});
test("scheduling helper does not trigger without website data", () => {
assert.equal(
shouldScheduleWebsiteEnrichment({
websiteUrl: null,
websiteDomain: "",
contactStatus: "missing_contact",
}),
false,
);
});
test("scheduling helper does not trigger when contact status is already usable", () => {
assert.equal(
shouldScheduleWebsiteEnrichment({
websiteUrl: "https://www.example.de",
websiteDomain: "example.de",
contactStatus: "new",
}),
false,
);
});
test("scheduling helper does not trigger for audit-ready leads", () => {
assert.equal(
shouldScheduleWebsiteEnrichment({
websiteUrl: "https://www.example.de",
websiteDomain: "example.de",
contactStatus: "audit_ready",
}),
false,
);
});
test("scheduling helper preserves existing contact-status behavior beyond TASK-7", () => {
assert.equal(
shouldScheduleWebsiteEnrichment({
websiteUrl: "https://www.example.de",
websiteDomain: "example.de",
contactStatus: "outreach_ready",
}),
false,
);
assert.equal(
shouldScheduleWebsiteEnrichment({
websiteUrl: "https://www.example.de",
websiteDomain: "example.de",
contactStatus: "do_not_contact",
}),
false,
);
});
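The scheduling tests above pin down the observable behavior of `shouldScheduleWebsiteEnrichment` without showing its body. A minimal sketch that would satisfy those cases follows. The type `EnrichmentCandidate` and the function name `shouldScheduleWebsiteEnrichmentSketch` are hypothetical, and treating either a URL or a domain as sufficient website data is an assumption; the real helper in `lib/lead-discovery-run` may differ.

```typescript
// Hypothetical input shape, inferred only from the test calls above.
type EnrichmentCandidate = {
  websiteUrl?: string | null;
  websiteDomain?: string | null;
  contactStatus: string;
};

// Sketch: schedule enrichment only for leads that still lack a contact
// and carry at least one non-empty website field (assumption).
function shouldScheduleWebsiteEnrichmentSketch(
  lead: EnrichmentCandidate,
): boolean {
  const hasWebsiteData =
    Boolean(lead.websiteUrl?.trim()) || Boolean(lead.websiteDomain?.trim());
  return hasWebsiteData && lead.contactStatus === "missing_contact";
}
```

Under this sketch every status other than `missing_contact` (for example `new`, `audit_ready`, `outreach_ready`, `do_not_contact`) is rejected, which matches the negative cases asserted above.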
test("lead discovery lead record stores normalized matching fields", () => {
const record = buildLeadDiscoveryLeadRecord({
campaignId: "campaign-1",

@@ -0,0 +1,84 @@
import assert from "node:assert/strict";
import { readFileSync } from "node:fs";
import path from "node:path";
import test from "node:test";
const leadDiscoveryPath = path.join(process.cwd(), "convex", "leadDiscovery.ts");
const leadDiscoverySource = readFileSync(leadDiscoveryPath, "utf8");
function hasPattern(source: string, pattern: RegExp) {
return pattern.test(source);
}
function extractExportSource(name: string) {
const marker = `export const ${name} = `;
const declarationIndex = leadDiscoverySource.indexOf(marker);
assert.notEqual(declarationIndex, -1, `Expected declaration for ${name}`);
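// Naive brace matching: braces inside strings or comments are counted too.
const openBraceIndex = leadDiscoverySource.indexOf("{", declarationIndex);
assert.notEqual(openBraceIndex, -1, `Expected opening brace for ${name}`);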
let depth = 0;
let end = -1;
for (let index = openBraceIndex; index < leadDiscoverySource.length; index++) {
const char = leadDiscoverySource[index];
if (char === "{") {
depth += 1;
} else if (char === "}") {
depth -= 1;
if (depth === 0) {
end = index;
break;
}
}
}
assert.notEqual(end, -1, `Expected balanced braces for ${name}`);
return leadDiscoverySource.slice(openBraceIndex, end + 1);
}
test("startCampaignRun checks active campaign runs via by_type_and_status", () => {
const source = extractExportSource("startCampaignRun");
assert.equal(
hasPattern(
source,
/withIndex\(\s*"by_type_and_status"\s*,\s*\(q\)\s*=>[\s\S]*?q\.eq\("type",\s*"campaign"\)\.eq\("status",\s*"running"\),?[\s\S]*?\)/,
),
true,
"Campaign starts should only consider running campaign-type runs as blockers",
);
});
test("persistDiscoveredLeads does not schedule website enrichment jobs directly", () => {
const source = extractExportSource("persistDiscoveredLeads");
assert.equal(
source.includes("ctx.scheduler.runAfter"),
false,
"Lead persistence must not call runAfter",
);
});
test("processCampaignRun schedules website enrichment after lead persistence", () => {
const source = extractExportSource("processCampaignRun");
const persistIndex = source.indexOf(
"internal.leadDiscovery.persistDiscoveredLeads",
);
const queueCall = source.indexOf("internal.websiteEnrichment.queueLeadEnrichment");
const eventMessageIndex = source.indexOf("Website-Kontaktanreicherung geplant.");
assert.notEqual(persistIndex, -1, "processCampaignRun should persist discovered leads");
assert.notEqual(queueCall, -1, "processCampaignRun should schedule website enrichment");
assert.notEqual(eventMessageIndex, -1, "processCampaignRun should append enrichment schedule events");
assert.ok(
persistIndex < queueCall,
"processCampaignRun should schedule enrichment after persistence succeeds",
);
assert.ok(
queueCall < eventMessageIndex,
"processCampaignRun should append enrichment event after scheduling",
);
});

@@ -0,0 +1,112 @@
import assert from "node:assert/strict";
import { readFile } from "node:fs/promises";
import { join } from "node:path";
import test from "node:test";
const leadsReviewPath = join(
process.cwd(),
"components",
"leads",
"leads-review-table.tsx",
);
test("LeadsReviewTable uses compact card summaries with expandable review details", async () => {
const source = await readFile(leadsReviewPath, "utf8");
assert.doesNotMatch(source, /<table\b/i);
assert.doesNotMatch(source, /<thead\b/i);
assert.doesNotMatch(source, /<tbody\b/i);
assert.doesNotMatch(source, /<tr\b/i);
assert.doesNotMatch(source, /<td\b/i);
assert.doesNotMatch(source, /<th\b/i);
assert.doesNotMatch(source, /min-w-\[/i);
assert.match(source, /Mehr anzeigen/);
assert.match(source, /Weniger anzeigen/);
assert.match(source, /aria-expanded=\{[^}]+\}/);
assert.match(source, /aria-controls=\{[^}]+\}/);
assert.match(source, /id=\{[^}]+\}/);
assert.match(
source,
/aria-expanded=\{[^}]+\}[\s\S]{0,160}aria-controls=\{[^}]+\}[\s\S]{0,160}(Mehr anzeigen|Weniger anzeigen)/i,
);
assert.match(
source,
/hidden=\{!?isExpanded\}/,
);
const companyNameMatch = source.match(
/<p className="([^"]+)">\s*\{lead\.companyName\}\s*<\/p>/,
);
assert.ok(
companyNameMatch !== null &&
/(?:^|\s)(truncate|max-w-full|min-w-0|break-words)(?:\s|$)/.test(
companyNameMatch[1],
),
"Company name should use overflow-safe text classes in compact card.",
);
const nicheMatch = source.match(
/lead\.niche\s+\?\?\s+"Nische offen"\}\s*<\/span>/,
);
assert.ok(
nicheMatch !== null,
"Niche fallback rendering should still be present in compact card.",
);
const nicheContainerMatch = source.match(
/<span className="([^"]+)">\s*\{lead\.niche\s+\?\?\s+"Nische offen"\}\s*<\/span>/,
);
assert.ok(
nicheContainerMatch !== null &&
/(?:^|\s)(truncate|max-w-full|break-all|break-words)(?:\s|$)/.test(
nicheContainerMatch[1],
),
"Niche should use overflow-safe text classes in compact card.",
);
const locationMatch = source.match(/\{location\}/);
assert.ok(
locationMatch !== null,
"Location rendering should still be present in compact card.",
);
const locationContainerMatch = source.match(
/<span className="([^"]+)">\s*\{location\}\s*<\/span>/,
);
assert.ok(
locationContainerMatch !== null &&
/(?:^|\s)(truncate|max-w-full|break-words)(?:\s|$)/.test(
locationContainerMatch[1],
),
"Location should use overflow-safe text classes in compact card.",
);
const emailSpanMatch = source.match(
/<span className="([^"]+)">\s*\{lead\.email \|\| "Keine E-Mail"\}\s*<\/span>/,
);
assert.ok(
emailSpanMatch !== null &&
/(?:^|\s)(break-all|max-w-full|min-w-0)(?:\s|$)/.test(
emailSpanMatch[1],
),
"Lead email should use overflow-safe text classes in compact card.",
);
const phoneSpanMatch = source.match(
/<span className="([^"]+)">\s*\{lead\.phone\}\s*<\/span>/,
);
assert.ok(
phoneSpanMatch !== null &&
/(?:^|\s)(break-all|max-w-full|min-w-0)(?:\s|$)/.test(phoneSpanMatch[1]),
"Lead phone should use overflow-safe text classes in compact card.",
);
assert.match(source, /Kontaktstatus/);
assert.match(source, /Review-E-Mail/);
assert.match(source, /Review-Quelle/);
assert.match(source, /Ansprechperson/);
assert.match(source, /Genannte E-Mail als Business-Kontakt/);
assert.match(source, /Duplikatstatus/);
assert.match(source, /Sperrstatus/);
assert.match(source, /Sperren/);
assert.match(source, /Speichern/);
});

@@ -0,0 +1,365 @@
import assert from "node:assert/strict";
import { existsSync, readFileSync } from "node:fs";
import path from "node:path";
import test from "node:test";
import ts from "typescript";
const actionPath = path.join(process.cwd(), "convex", "pageSpeedAction.ts");
const actionSource = existsSync(actionPath) ? readFileSync(actionPath, "utf8") : "";
const actionSourceFile = ts.createSourceFile(
"pageSpeedAction.ts",
actionSource,
ts.ScriptTarget.ES2022,
true,
ts.ScriptKind.TS,
);
function getExportedConstNames(file: ts.SourceFile) {
const names = new Set<string>();
const visit = (node: ts.Node) => {
if (ts.isVariableStatement(node)) {
const isExported = node.modifiers?.some(
(mod) => mod.kind === ts.SyntaxKind.ExportKeyword,
);
if (!isExported) {
ts.forEachChild(node, visit);
return;
}
const isConst = node.declarationList.flags & ts.NodeFlags.Const;
if (!isConst) {
ts.forEachChild(node, visit);
return;
}
for (const declaration of node.declarationList.declarations) {
if (ts.isIdentifier(declaration.name)) {
names.add(declaration.name.text);
}
}
}
ts.forEachChild(node, visit);
};
ts.forEachChild(file, visit);
return names;
}
function hasPattern(source: string, pattern: RegExp) {
return pattern.test(source);
}
test("pageSpeedAction module exists and runs in Node runtime", () => {
assert.equal(existsSync(actionPath), true, "pageSpeedAction.ts should exist");
assert.equal(
hasPattern(actionSource, /^"use node";/m),
true,
"pageSpeedAction.ts should use Node runtime",
);
});
test("pageSpeedAction exports processPageSpeedAudit as internalAction with runId validator", () => {
const exports = getExportedConstNames(actionSourceFile);
assert.equal(
exports.has("processPageSpeedAudit"),
true,
"processPageSpeedAudit should be exported",
);
assert.equal(
hasPattern(
actionSource,
/processPageSpeedAudit\s*=\s*internalAction\(\s*{\s*args:\s*{[\s\S]*runId:\s*v\.id\(\s*["']agentRuns["']\s*\)/,
),
true,
"processPageSpeedAudit should be an internalAction with runId validator",
);
});
test("pageSpeedAction starts and finishes run mutations", () => {
assert.equal(
hasPattern(
actionSource,
/internal\.pageSpeed\.startPageSpeedAuditRun/,
),
true,
"Action should call internal.pageSpeed.startPageSpeedAuditRun",
);
assert.equal(
hasPattern(
actionSource,
/internal\.pageSpeed\.finishPageSpeedAuditRun/,
),
true,
"Action should call internal.pageSpeed.finishPageSpeedAuditRun",
);
});
test("pageSpeedAction has action-level guard to fail whole run on unexpected errors", () => {
assert.equal(
hasPattern(
actionSource,
/try\s*{[\s\S]*?await ctx\.runMutation\(internal\.pageSpeed\.startPageSpeedAuditRun,\s*{[\s\S]*?}\);\s*[\s\S]*?for\s*\(\s*(?:const|let)\s+strategy\s+of\s+STRATEGIES[\s\S]*?\}\s*catch \(error\)\s*{[\s\S]*classifyPageSpeedFailure\(error,\s*apiKeyRaw\)[\s\S]*?internal\.pageSpeed\.finishPageSpeedAuditRun[\s\S]*status:\s*["']failed["']/,
),
true,
"Action should wrap run lifecycle in an outer try/catch that finalizes the run as failed.",
);
});
test("pageSpeedAction enforces raw payload size guard before storage", () => {
assert.equal(
hasPattern(actionSource, /MAX_RAW_PAGESPEED_BYTES/),
true,
"Action should declare MAX_RAW_PAGESPEED_BYTES constant.",
);
assert.equal(
hasPattern(
actionSource,
/new TextEncoder\(\)\.encode\(rawJson\)\.byteLength/,
),
true,
"Action should calculate raw JSON byte length before attempting to store.",
);
assert.equal(
hasPattern(
actionSource,
/rawJsonBytes\s*>\s*MAX_RAW_PAGESPEED_BYTES[\s\S]*?errorType:\s*["']api_error["'][\s\S]*?errorSummary:\s*RAW_PAGESPEED_BYTES_SUMMARY/,
),
true,
"Oversized raw payloads should be rejected as api_error with the required German summary.",
);
assert.equal(
hasPattern(
actionSource,
/new Blob\(\[rawJson\][\s\S]*type:\s*["']application\/json["']/,
),
true,
"Normal raw payloads should still be stored as application/json blobs.",
);
assert.equal(
hasPattern(
actionSource,
/if\s*\(\s*rawJsonBytes\s*>\s*MAX_RAW_PAGESPEED_BYTES[\s\S]*?}\s*[\s\S]*?continue;[\s\S]*?await ctx\.storage\.store\(/,
),
true,
"Raw payload storage must be skipped for oversized payloads.",
);
});
test("pageSpeedAction runs both strategies and catches per-strategy errors", () => {
assert.equal(
hasPattern(
actionSource,
/["']mobile["'][\s\S]*["']desktop["']/,
),
true,
"Action should include both page speed strategies: mobile and desktop",
);
assert.equal(
hasPattern(
actionSource,
/for\s*\(\s*(?:const|let)\s+strategy\s+of[\s\S]*?\)\s*{[\s\S]*?try[\s\S]*?catch\s*\([^)]*\)[\s\S]*?}/,
),
true,
"Action should catch errors inside per-strategy loop",
);
});
test("pageSpeedAction stores and persists results and writes events", () => {
assert.equal(
hasPattern(
actionSource,
/ctx\.storage\.store\([\s\S]*new Blob\(\[\s*rawJson\s*[\s\S]*type:\s*["']application\/json["']/,
),
true,
"Raw PageSpeed payload should be stored via ctx.storage.store with application/json blob",
);
assert.equal(
hasPattern(
actionSource,
/internal\.pageSpeed\.persistPageSpeedResult[\s\S]*status:\s*["']succeeded["']/,
),
true,
"Action should persist succeeded PageSpeed results",
);
assert.equal(
hasPattern(
actionSource,
/internal\.pageSpeed\.persistPageSpeedResult[\s\S]*status:\s*["']failed["']/,
),
true,
"Action should persist failed PageSpeed results",
);
assert.equal(
/api\.runs\.appendEvent,\s*{\s*[\s\S]*runId:\s*args\.runId,\s*[\s\S]*level:\s*["']info["']/.test(
actionSource,
),
true,
"Action should append info events for successful strategy results",
);
assert.equal(
/level:\s*["']warning["']/.test(actionSource) ||
/level:\s*["']error["']/.test(actionSource),
true,
"Action should append warning/error events for failed strategy results",
);
});
test("pageSpeedAction strips non-persisted normalized fields before Convex mutation", () => {
assert.equal(
hasPattern(actionSource, /toPersistedPageSpeedNormalizedResult/),
true,
"Action should map normalized PageSpeed output into the Convex validator shape.",
);
assert.equal(
hasPattern(
actionSource,
/normalized:\s*toPersistedPageSpeedNormalizedResult\(normalized\)/,
),
true,
"Action should persist only the normalized subset accepted by convex/pageSpeed.ts.",
);
assert.equal(
hasPattern(
actionSource,
/normalized,\s*[\r\n]/,
),
false,
"Action should not pass the full normalized object with strategy/sourceUrl/finalUrl/analysisTimestamp.",
);
});
test("pageSpeedAction does not expose API key in event messages/details", () => {
assert.equal(
hasPattern(
actionSource,
/api\.runs\.appendEvent[\s\S]{0,500}PAGESPEED_API_KEY/,
),
false,
"Action events should not include raw PAGESPEED_API_KEY",
);
});
test("pageSpeedAction imports PageSpeed helpers from lib/pagespeed-insights", () => {
const hasLibImport =
actionSource.includes("fetchPageSpeedResult") &&
actionSource.includes("normalizePageSpeedResult") &&
actionSource.includes("classifyPageSpeedError") &&
actionSource.includes('from "../lib/pagespeed-insights"');
assert.equal(hasLibImport, true, "Action should import required PageSpeed helper functions");
});
test("pageSpeedAction exposes configurable PageSpeed timeout via env var", () => {
assert.equal(
hasPattern(
actionSource,
/PAGESPEED_TIMEOUT_MS/
),
true,
"PageSpeed timeout should be configurable with PAGESPEED_TIMEOUT_MS.",
);
assert.equal(
hasPattern(actionSource, /DEFAULT_PAGESPEED_TIMEOUT_MS\s*=\s*60_000/),
true,
"PageSpeed timeout default should be 60_000ms.",
);
assert.equal(
hasPattern(actionSource, /MIN_PAGESPEED_TIMEOUT_MS\s*=\s*10_000/),
true,
"PageSpeed timeout min clamp should be 10_000ms.",
);
assert.equal(
hasPattern(actionSource, /MAX_PAGESPEED_TIMEOUT_MS\s*=\s*120_000/),
true,
"PageSpeed timeout max clamp should be 120_000ms.",
);
});
test("pageSpeedAction parses and clamps timeout values before use", () => {
assert.equal(
hasPattern(
actionSource,
/function parsePageSpeedTimeoutMs\(\s*raw:\s*string \| undefined\)/,
),
true,
"Action should parse PAGESPEED_TIMEOUT_MS via a dedicated helper.",
);
assert.equal(
hasPattern(actionSource, /Number\.parseInt\(raw,\s*10\)/),
true,
"Action should parse env timeout values as decimal integers.",
);
assert.equal(
hasPattern(actionSource, /Number\.isFinite\(/),
true,
"Invalid timeout values should be handled via Number.isFinite validation.",
);
assert.equal(
hasPattern(
actionSource,
/Math\.max\(\s*parsed,\s*MIN_PAGESPEED_TIMEOUT_MS\s*\)/,
),
true,
"Timeout below min should be clamped.",
);
assert.equal(
hasPattern(
actionSource,
/Math\.min\(\s*[\s\S]*MAX_PAGESPEED_TIMEOUT_MS\s*,\s*\)/,
),
true,
"Timeout above max should be clamped.",
);
});
test("pageSpeedAction passes resolved timeout to PageSpeed fetch calls", () => {
assert.equal(
hasPattern(
actionSource,
/const timeoutMs = resolvePageSpeedTimeoutMs\(\)/,
),
true,
"Action should resolve timeout once from helper and pass it to fetch calls.",
);
assert.equal(
hasPattern(
actionSource,
/fetchPageSpeedResult\([\s\S]{0,250}timeoutMs,/
),
true,
"Action should pass resolved timeout to fetchPageSpeedResult.",
);
assert.equal(
hasPattern(
actionSource,
/const timeoutMs\s*=\s*10_000/,
),
false,
"Timeout should not be hardcoded to 10_000ms in processPageSpeedAudit.",
);
});
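The timeout tests above describe a parse-then-clamp helper through source patterns only. A minimal sketch of a function matching those patterns is shown here; the constant names and values come from the assertions, while the empty-string handling and the exact control flow are assumptions and not necessarily what `convex/pageSpeedAction.ts` does.

```typescript
const DEFAULT_PAGESPEED_TIMEOUT_MS = 60_000;
const MIN_PAGESPEED_TIMEOUT_MS = 10_000;
const MAX_PAGESPEED_TIMEOUT_MS = 120_000;

// Sketch: parse the raw env value as a decimal integer, fall back to the
// default when it is missing or not a number, then clamp into [min, max].
function parsePageSpeedTimeoutMs(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_PAGESPEED_TIMEOUT_MS; // assumption: blank means "unset"
  }
  const parsed = Number.parseInt(raw, 10);
  if (!Number.isFinite(parsed)) {
    return DEFAULT_PAGESPEED_TIMEOUT_MS;
  }
  return Math.min(
    Math.max(parsed, MIN_PAGESPEED_TIMEOUT_MS),
    MAX_PAGESPEED_TIMEOUT_MS,
  );
}
```

A caller would typically pass `process.env.PAGESPEED_TIMEOUT_MS` once per run and hand the resolved value to each fetch call, which is what the last test in this file checks for.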

@@ -0,0 +1,191 @@
import assert from "node:assert/strict";
import { existsSync, readFileSync } from "node:fs";
import path from "node:path";
import test from "node:test";
import ts from "typescript";
const auditInputsPath = path.join(process.cwd(), "convex", "auditInputs.ts");
const auditInputsSource = existsSync(auditInputsPath)
? readFileSync(auditInputsPath, "utf8")
: "";
const sourceFile = ts.createSourceFile(
"auditInputs.ts",
auditInputsSource,
ts.ScriptTarget.ES2022,
true,
ts.ScriptKind.TS,
);
function getExportedConstNames(file: ts.SourceFile) {
const names = new Set<string>();
const visit = (node: ts.Node) => {
if (ts.isVariableStatement(node)) {
const isExported = node.modifiers?.some(
(mod) => mod.kind === ts.SyntaxKind.ExportKeyword,
);
if (!isExported) {
ts.forEachChild(node, visit);
return;
}
const isConst = node.declarationList.flags & ts.NodeFlags.Const;
if (!isConst) {
ts.forEachChild(node, visit);
return;
}
for (const declaration of node.declarationList.declarations) {
if (ts.isIdentifier(declaration.name)) {
names.add(declaration.name.text);
}
}
}
ts.forEachChild(node, visit);
};
ts.forEachChild(file, visit);
return names;
}
function hasPattern(source: string, pattern: RegExp) {
return pattern.test(source);
}
function extractExportSource(name: string) {
const marker = `export const ${name} = `;
const declarationIndex = auditInputsSource.indexOf(marker);
assert.notEqual(declarationIndex, -1, `Expected declaration for ${name}`);
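// Naive brace matching: braces inside strings or comments are counted too.
const openBraceIndex = auditInputsSource.indexOf("{", declarationIndex);
assert.notEqual(openBraceIndex, -1, `Expected opening brace for ${name}`);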
let depth = 0;
let end = -1;
for (let index = openBraceIndex; index < auditInputsSource.length; index += 1) {
const char = auditInputsSource[index];
if (char === "{") {
depth += 1;
} else if (char === "}") {
depth -= 1;
if (depth === 0) {
end = index;
break;
}
}
}
assert.notEqual(end, -1, `Expected balanced braces for ${name}`);
return auditInputsSource.slice(openBraceIndex, end + 1);
}
test("auditInputs module exists and exports pageSpeed input translator query", () => {
assert.equal(
existsSync(auditInputsPath),
true,
"convex/auditInputs.ts should be present",
);
const exports = getExportedConstNames(sourceFile);
assert.equal(
exports.has("getPageSpeedAuditInputs"),
true,
"auditInputs module should export getPageSpeedAuditInputs",
);
});
test("auditInputs module calls buildPageSpeedAuditInputs for lead/audit PageSpeed results", () => {
const querySource = extractExportSource("getPageSpeedAuditInputs");
assert.equal(
hasPattern(
auditInputsSource,
/buildPageSpeedAuditInputs[\s\S]*?from\s*["']\.\.\/lib\/pagespeed-audit-input["']/,
),
true,
"auditInputs should import buildPageSpeedAuditInputs",
);
assert.equal(
hasPattern(querySource, /buildPageSpeedAuditInputs\(results\.map\(/),
true,
"auditInputs should call buildPageSpeedAuditInputs",
);
});
test("auditInputs query fetches stored pageSpeedResults by lead or audit", () => {
const querySource = extractExportSource("getPageSpeedAuditInputs");
assert.equal(
hasPattern(
auditInputsSource,
/getPageSpeedAuditInputs\s*=\s*internalQuery\s*\(/,
),
true,
"getPageSpeedAuditInputs should be registered as an internal query",
);
assert.equal(
hasPattern(
querySource,
/handler\s*:\s*async\s*\(/,
),
true,
"getPageSpeedAuditInputs source block should include an async handler",
);
assert.equal(
hasPattern(
auditInputsSource,
/ctx\.db[\s\S]*?\.query\([\s\S]*?["']pageSpeedResults["']\s*\)/,
),
true,
"auditInputs should read from pageSpeedResults table",
);
assert.equal(
hasPattern(
auditInputsSource,
/withIndex\(["']by_auditId["'][\s\S]*?eq\([\s\S]*?auditId[\s\S]*?\)/,
),
true,
"auditInputs should support audit-scoped PageSpeed results",
);
assert.equal(
hasPattern(
auditInputsSource,
/withIndex\(["']by_leadId["'][\s\S]*?eq\([\s\S]*?leadId[\s\S]*?\)/,
),
true,
"auditInputs should support lead-scoped PageSpeed results",
);
});
test("auditInputs returns only plain-language prompt fields", () => {
const querySource = extractExportSource("getPageSpeedAuditInputs");
assert.equal(
hasPattern(
querySource,
/technicalSignals\s*:\s*string\[\][\s\S]*customerImplications\s*:\s*string\[\][\s\S]*internalNotes\s*:\s*string\[\]/,
),
true,
"Return type should expose only technicalSignals, customerImplications, and internalNotes",
);
const returnConstruction = querySource.match(
/buildPageSpeedAuditInputs\([\s\S]*?\);/,
);
assert.notEqual(
returnConstruction,
null,
"auditInputs should return buildPageSpeedAuditInputs output",
);
assert.equal(
/rawStorageId/.test(returnConstruction?.[0] ?? ""),
false,
"Returned fields must not include rawStorageId",
);
assert.equal(
/\bscores\b/.test(returnConstruction?.[0] ?? ""),
false,
"Returned fields must not include scores",
);
});

@@ -0,0 +1,301 @@
import assert from "node:assert/strict";
import test from "node:test";
import {
assertNoPublicPageSpeedScores,
buildPageSpeedAuditInputs,
type PageSpeedMinimalAuditResult,
} from "../lib/pagespeed-audit-input";
const MOBILE_AND_DESKTOP_FIXTURES: PageSpeedMinimalAuditResult[] = [
{
strategy: "mobile",
status: "succeeded",
sourceUrl: "https://example.com",
normalized: {
metrics: {
firstContentfulPaintMs: 3200,
largestContentfulPaintMs: 5000,
cumulativeLayoutShift: 0.2,
},
implications: [
"Score 0.42: Der erste sichtbare Inhalt erscheint zu langsam.",
"Die Seite zeigt das Hauptbild zu langsam.",
"Die Inhalte verschieben sich beim Laden.",
],
opportunities: [
"Nicht verwendetes CSS kann entfernt werden.",
"Bilder ohne passende Komprimierung koennen verzichtet werden.",
],
},
},
{
strategy: "desktop",
status: "succeeded",
sourceUrl: "https://example.com",
normalized: {
metrics: {
firstContentfulPaintMs: 1200,
largestContentfulPaintMs: 2200,
cumulativeLayoutShift: 0.04,
},
implications: [
"Die Seite zeigt das Hauptbild zu langsam.",
"Inhalte werden beim Laden sauber angezeigt.",
],
opportunities: ["Serverantworten sind stabil.", "Inhalte werden gestaffelt geladen."],
},
},
];
test("buildPageSpeedAuditInputs converts normalized implications into German customer impact statements", () => {
const actual = buildPageSpeedAuditInputs(MOBILE_AND_DESKTOP_FIXTURES);
assert.equal(actual.customerImplications.length > 0, true);
assert.equal(
actual.customerImplications.includes(
"Die Seite zeigt das Hauptbild zu langsam.",
),
true,
);
assert.equal(
actual.customerImplications.includes("Die Inhalte verschieben sich beim Laden."),
true,
);
assert.equal(
actual.customerImplications.some((line) => /mobile/i.test(line)),
true,
"Customer implications should include a mobile-centric statement.",
);
assert.equal(
assertNoPublicPageSpeedScores(actual.customerImplications),
true,
"Customer implications must not contain score-like values.",
);
assert.equal(
assertNoPublicPageSpeedScores(actual.technicalSignals),
true,
"Technical signals must not contain score-like values.",
);
});
test("buildPageSpeedAuditInputs detects meaningful mobile performance gaps versus desktop", () => {
const actual = buildPageSpeedAuditInputs(MOBILE_AND_DESKTOP_FIXTURES);
assert.equal(
actual.customerImplications.some((line) =>
/mobile/i.test(line) &&
/deutlich|sp(?:ü|ue?)rbar|signifikant|langsamer/.test(line) &&
/desktop/i.test(line),
),
true,
);
});
test("buildPageSpeedAuditInputs keeps quota/api/unavailable failures in internal notes only", () => {
const actual = buildPageSpeedAuditInputs([
{
strategy: "mobile",
status: "failed",
sourceUrl: "https://bad.example",
errorType: "quota",
errorSummary: "API quota has been exceeded for this host.",
},
{
strategy: "desktop",
status: "failed",
sourceUrl: "https://bad2.example",
errorType: "unavailable",
errorSummary: "Page not reachable at the moment.",
},
{
strategy: "mobile",
status: "failed",
sourceUrl: "https://bad3.example",
errorType: "api_error",
errorSummary: "Lighthouse processing failed due to API timeout.",
},
{
strategy: "desktop",
status: "succeeded",
sourceUrl: "https://example.com",
normalized: {
implications: ["Die wichtigste Information wird zu langsam sichtbar."],
},
},
]);
assert.equal(
actual.customerImplications.some((line) => /quota|unavailable|timeout|api/i.test(line)),
false,
);
assert.equal(
actual.technicalSignals.some((line) => /quota|unavailable|timeout|api/i.test(line)),
false,
);
assert.equal(actual.internalNotes.length >= 3, true);
assert.equal(actual.internalNotes.some((line) => /quota/i.test(line)), true);
assert.equal(actual.internalNotes.some((line) => /not reachable|unreachable|erreich|timeout/i.test(line)), true);
assert.equal(actual.internalNotes.some((line) => /api/i.test(line)), true);
});
test("buildPageSpeedAuditInputs strips score-like and raw strings from public outputs", () => {
const actual = buildPageSpeedAuditInputs([
{
strategy: "mobile",
status: "succeeded",
sourceUrl: "https://example.com",
normalized: {
implications: [
"Score 0.42: FCP is high.",
"rawStorageId: file_123",
"Lighthouse category performance is present.",
"Die Seite laedt in 3.2 Sekunden.",
],
opportunities: [
"Ein { \"score\": 0.91 } kann optimiert werden.",
"redundante CSS Dateien.",
],
},
},
]);
assert.equal(assertNoPublicPageSpeedScores(actual.customerImplications), true);
assert.equal(assertNoPublicPageSpeedScores(actual.technicalSignals), true);
assert.equal(
actual.customerImplications.every((line) => !/\d/.test(line)),
true,
);
});
test("buildPageSpeedAuditInputs strips URLs, markup, JSON-like payloads, and machine-like words from public outputs", () => {
const actual = buildPageSpeedAuditInputs([
{
strategy: "mobile",
status: "succeeded",
sourceUrl: "https://example.com",
normalized: {
implications: [
"Weitere Infos findest du in https://example.com/details",
"Das <strong>Element</strong> lädt stabil.",
"{ \"pagespeed\": 0.84, \"lighthouseResult\": {} }",
"[\"rawStorageId\":\"id-0123456789abcdef0123456789\"]",
"rawStorageId: run_2026_0001",
"lighthouseResult suggests a bad candidate.",
"Die Seite laedt insgesamt spuertbar langsam.",
],
opportunities: [
"Moeglichkeit: <img src=\"x\" />",
"Pagespeed Score should not appear.",
"[{\"audit\":\"speed\"}]",
"Reduziere ungenutzte JavaScript-Dateien.",
"A longMachineToken_0123456789abcdef0123456789 to test filtering.",
],
},
},
]);
assert.equal(assertNoPublicPageSpeedScores(actual.customerImplications), true);
assert.equal(assertNoPublicPageSpeedScores(actual.technicalSignals), true);
assert.equal(actual.customerImplications.includes("Die Seite laedt insgesamt spuertbar langsam."), true);
assert.equal(
actual.technicalSignals.some((line) => /unused|reduziere|javascript/i.test(line)),
true,
);
assert.equal(
actual.customerImplications.every((line) => !/\bhttps?:\/\/|rawstorageid|lighthouseresult|pagespeed|score|<|>|\{|\}|\[|\]/i.test(line)),
true,
);
assert.equal(
actual.technicalSignals.every((line) => !/\bhttps?:\/\/|rawstorageid|lighthouseresult|pagespeed|score|<|>|\{|\}|\[|\]/i.test(line)),
true,
);
});
test("buildPageSpeedAuditInputs keeps failure categories in internal notes while removing URLs and JSON fragments", () => {
const actual = buildPageSpeedAuditInputs([
{
strategy: "mobile",
status: "failed",
sourceUrl: "https://example.com/audit?x=1",
errorType: "api_error",
errorSummary:
"PageSpeed API failed: { \"lighthouseResult\": {\"code\":\"timeout\"}, \"rawStorageId\": \"abc123\" }",
},
{
strategy: "desktop",
status: "succeeded",
sourceUrl: "https://example.com",
normalized: {
implications: [
"Die Seite laedt spuerbar schneller auf Desktop.",
],
},
},
]);
assert.equal(actual.internalNotes.length >= 1, true);
assert.equal(
actual.internalNotes.every(
(line) =>
!/https?:\/\//i.test(line) &&
!/\{|\}|\[|\]/i.test(line) &&
!/rawstorageid|lighthouseresult/i.test(line),
),
true,
);
assert.equal(
actual.internalNotes.some((line) => /api|technisch/i.test(line)),
true,
);
});
test("buildPageSpeedAuditInputs deduplicates and caps output lists", () => {
const manyImplications = Array.from({ length: 12 }, (_, index) => [
"Die Seite ist zu langsam.",
"Die Seite ist zu langsam.",
`Implication ${index}`,
"Wichtige Inhalte sind nicht sofort sichtbar.",
"Wichtige Inhalte sind nicht sofort sichtbar.",
]).flat();
const manyOpportunities = Array.from({ length: 12 }, (_, index) => [
"Komprimieren Sie Bilder.",
`Opportunity ${index}`,
"Komprimieren Sie Bilder.",
"Inhalte werden nachgeladen.",
]).flat();
const actual = buildPageSpeedAuditInputs([
{
strategy: "mobile",
status: "succeeded",
sourceUrl: "https://example.com",
normalized: {
implications: manyImplications,
opportunities: manyOpportunities,
},
},
...Array.from({ length: 10 }, (_, index) => ({
strategy: "desktop" as const,
status: "failed" as const,
sourceUrl: `https://example.com/${index}`,
errorType: "api_error" as const,
errorSummary: `Run ${String.fromCharCode(97 + (index % 26))} had internal problem.`,
})),
]);
assert.equal(actual.customerImplications.length <= 8, true);
assert.equal(actual.technicalSignals.length <= 8, true);
assert.equal(actual.customerImplications.length > 0, true);
assert.equal(actual.technicalSignals.length > 0, true);
assert.equal(actual.internalNotes.length, 6);
assert.equal(
new Set(actual.customerImplications).size,
actual.customerImplications.length,
);
assert.equal(
new Set(actual.technicalSignals).size,
actual.technicalSignals.length,
);
});
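
The tests above expect public output lists to be free of URLs, markup, and JSON-like fragments, deduplicated, and capped at 8 entries. A minimal sketch of such a filter follows. The helper name `sanitizePublicLines`, its patterns, and the choice to drop (rather than rewrite) JSON-like lines are assumptions for illustration; this is not the actual `buildPageSpeedAuditInputs` implementation.

```typescript
// Hypothetical helper for illustration only; names and patterns are assumptions.
const URL_PATTERN = /https?:\/\/\S+/gi;
const TAG_PATTERN = /<[^>]*>/g;
const JSON_LIKE_PATTERN = /[{}\[\]]/;

function sanitizePublicLines(lines: string[], maxItems = 8): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const line of lines) {
    // Lines that look like JSON or array payloads are dropped entirely.
    if (JSON_LIKE_PATTERN.test(line)) continue;
    // URLs and markup are removed, then whitespace is collapsed.
    const cleaned = line
      .replace(URL_PATTERN, "")
      .replace(TAG_PATTERN, "")
      .replace(/\s+/g, " ")
      .trim();
    // Empty and already-seen lines are skipped (deduplication).
    if (cleaned.length === 0 || seen.has(cleaned)) continue;
    seen.add(cleaned);
    result.push(cleaned);
    // Stop once the cap is reached.
    if (result.length >= maxItems) break;
  }
  return result;
}
```

Dropping JSON-like lines wholesale is the conservative choice here: a partially cleaned payload could still leak internal identifiers.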


@@ -0,0 +1,343 @@
import assert from "node:assert/strict";
import test from "node:test";
import {
buildPageSpeedRequestUrl,
classifyPageSpeedError,
fetchPageSpeedResult,
normalizePageSpeedResult,
type PageSpeedErrorType,
} from "../lib/pagespeed-insights";
const MOBILE_RAW_FIXTURE = {
analysisUTCTimestamp: "2026-06-01T08:00:00.000Z",
lighthouseResult: {
finalUrl: "https://example.com/mobil",
categories: {
performance: { score: 0.81 },
accessibility: { score: 0.96 },
"best-practices": { score: 0.77 },
seo: { score: 0.94 },
},
audits: {
"first-contentful-paint": {
title: "Erste Inhalte",
numericValue: 2850,
score: 0.76,
},
"largest-contentful-paint": {
title: "Größtes Inhaltselement",
numericValue: 4300,
score: 0.6,
},
"cumulative-layout-shift": {
title: "Layout-Verschiebung",
numericValue: 0.14,
},
"total-blocking-time": {
title: "Blockierende Skripte",
numericValue: 420,
},
"speed-index": {
title: "Speed Index",
numericValue: 4800,
},
"unused-css-rules": {
title: "Nicht verwendetes CSS",
score: 0.4,
details: {
type: "opportunity",
overallSavingsMs: 380,
},
},
"modern-image-formats": {
title: "Moderne Bildformate",
score: 0.55,
details: {
type: "opportunity",
overallSavingsBytes: 52000,
},
},
},
},
};
const DESKTOP_RAW_FIXTURE = {
analysisUTCTimestamp: "2026-06-01T09:00:00.000Z",
lighthouseResult: {
finalUrl: "https://example.com/desktop",
categories: {
performance: { score: 0.93 },
accessibility: { score: 0.99 },
"best-practices": { score: 0.85 },
seo: { score: 0.97 },
},
audits: {
"first-contentful-paint": {
title: "Erste Inhalte",
numericValue: 1800,
score: 0.91,
},
"largest-contentful-paint": {
title: "Größtes Inhaltselement",
numericValue: 3600,
score: 0.73,
},
"cumulative-layout-shift": {
title: "Layout-Verschiebung",
numericValue: 0.08,
},
"total-blocking-time": {
title: "Blockierende Skripte",
numericValue: 310,
},
"speed-index": {
title: "Speed Index",
numericValue: 3800,
},
"offscreen-images": {
title: "Außenseiten-Bilder",
score: 0.9,
details: {
type: "opportunity",
overallSavingsMs: 210,
},
},
},
},
};
test("buildPageSpeedRequestUrl includes required query params and repeated categories", () => {
const url = buildPageSpeedRequestUrl({
url: "https://example.com/landing?x=1&y=2",
strategy: "mobile",
locale: "de-DE",
apiKey: "super-secret",
});
const parsed = new URL(url);
assert.equal(
parsed.origin + parsed.pathname,
"https://pagespeedonline.googleapis.com/pagespeedonline/v5/runPagespeed",
);
assert.equal(parsed.searchParams.get("url"), "https://example.com/landing?x=1&y=2");
assert.equal(parsed.searchParams.get("strategy"), "mobile");
assert.equal(parsed.searchParams.get("locale"), "de-DE");
assert.equal(parsed.searchParams.get("key"), "super-secret");
assert.deepEqual(parsed.searchParams.getAll("category"), [
"performance",
"accessibility",
"best-practices",
"seo",
]);
assert.equal(
parsed.search.includes("url=https%3A%2F%2Fexample.com%2Flanding%3Fx%3D1%26y%3D2"),
true,
"URL input should be encoded",
);
});
test("buildPageSpeedRequestUrl omits empty API keys", () => {
const url = buildPageSpeedRequestUrl({
url: "https://example.com",
strategy: "desktop",
apiKey: "",
locale: "de-DE",
});
const parsed = new URL(url);
assert.equal(parsed.searchParams.has("key"), false);
});
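
The two tests above pin down the request URL shape: an encoded `url` parameter, repeated `category` parameters, and no `key` parameter when the API key is empty. A minimal sketch under those constraints follows. The function name `sketchBuildRequestUrl` is hypothetical, and the `de-DE` default locale is inferred from the fetch test further down, not confirmed against `lib/pagespeed-insights`.

```typescript
// Illustrative sketch; sketchBuildRequestUrl is a hypothetical name.
type SketchStrategy = "mobile" | "desktop";

interface SketchRequestUrlOptions {
  url: string;
  strategy: SketchStrategy;
  locale?: string;
  apiKey?: string;
}

const SKETCH_ENDPOINT =
  "https://pagespeedonline.googleapis.com/pagespeedonline/v5/runPagespeed";
const SKETCH_CATEGORIES = ["performance", "accessibility", "best-practices", "seo"];

function sketchBuildRequestUrl(options: SketchRequestUrlOptions): string {
  const endpoint = new URL(SKETCH_ENDPOINT);
  // URLSearchParams percent-encodes the target URL, including "?" and "&".
  endpoint.searchParams.set("url", options.url);
  endpoint.searchParams.set("strategy", options.strategy);
  // Assumed default locale.
  endpoint.searchParams.set("locale", options.locale ?? "de-DE");
  // append() (not set()) produces one "category" parameter per entry.
  for (const category of SKETCH_CATEGORIES) {
    endpoint.searchParams.append("category", category);
  }
  // An empty or missing key is omitted instead of being sent as "key=".
  if (options.apiKey) {
    endpoint.searchParams.set("key", options.apiKey);
  }
  return endpoint.toString();
}
```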
test("normalizePageSpeedResult maps mobile scores, metrics, and implications", () => {
const normalized = normalizePageSpeedResult({
strategy: "mobile",
sourceUrl: "https://example.com",
raw: MOBILE_RAW_FIXTURE,
});
assert.equal(normalized.strategy, "mobile");
assert.equal(normalized.sourceUrl, "https://example.com");
assert.equal(normalized.finalUrl, "https://example.com/mobil");
assert.equal(normalized.analysisTimestamp, "2026-06-01T08:00:00.000Z");
assert.equal(normalized.scores?.performance, 0.81);
assert.equal(normalized.scores?.accessibility, 0.96);
assert.equal(normalized.scores?.bestPractices, 0.77);
assert.equal(normalized.scores?.seo, 0.94);
assert.equal(normalized.metrics.firstContentfulPaintMs, 2850);
assert.equal(normalized.metrics.largestContentfulPaintMs, 4300);
assert.equal(normalized.metrics.cumulativeLayoutShift, 0.14);
assert.equal(normalized.metrics.totalBlockingTimeMs, 420);
assert.equal(normalized.metrics.speedIndexMs, 4800);
assert.equal(normalized.opportunities.length >= 1, true);
assert.equal(normalized.implications.length >= 2, true);
assert.equal(normalized.implications.some((text) => text.includes("Besucher")), true);
for (const implication of normalized.implications) {
assert.equal(
/score\s*\d+/i.test(implication),
false,
`Implication should not contain raw score text: ${implication}`,
);
assert.equal(implication.length > 0, true);
}
});
test("normalizePageSpeedResult maps desktop scores and metrics", () => {
const normalized = normalizePageSpeedResult({
strategy: "desktop",
sourceUrl: "https://example.com/landing",
raw: DESKTOP_RAW_FIXTURE,
});
assert.equal(normalized.strategy, "desktop");
assert.equal(normalized.sourceUrl, "https://example.com/landing");
assert.equal(normalized.finalUrl, "https://example.com/desktop");
assert.equal(normalized.analysisTimestamp, "2026-06-01T09:00:00.000Z");
assert.equal(normalized.scores?.performance, 0.93);
assert.equal(normalized.scores?.bestPractices, 0.85);
assert.equal(normalized.metrics.firstContentfulPaintMs, 1800);
assert.equal(normalized.metrics.speedIndexMs, 3800);
assert.equal(normalized.metrics.totalBlockingTimeMs, 310);
assert.equal(normalized.opportunities.length >= 1, true);
assert.equal(normalized.implications.length >= 2, true);
});
test("classifyPageSpeedError maps status and body signals", () => {
const quotaByStatus = classifyPageSpeedError({ status: 429 });
assert.equal(quotaByStatus.errorType, "quota");
const quotaByBody = classifyPageSpeedError({
status: 403,
body: { error: { errors: [{ reason: "userRateLimitExceeded" }] } },
});
assert.equal(quotaByBody.errorType, "quota");
const timeoutError = classifyPageSpeedError({
error: new DOMException("timed out", "AbortError"),
});
assert.equal(timeoutError.errorType, "timeout");
const unavailableByStatus = classifyPageSpeedError({ status: 404 });
assert.equal(unavailableByStatus.errorType, "unavailable");
const unavailableByBody = classifyPageSpeedError({
status: 500,
body: { error: { message: "Failed to fetch document from given URL" } },
});
assert.equal(unavailableByBody.errorType, "unavailable");
const invalidUrl = classifyPageSpeedError({
status: 400,
body: { error: { message: "Invalid URL: unsupported format" } },
});
assert.equal(invalidUrl.errorType, "invalid_url");
const apiError = classifyPageSpeedError({
status: 500,
body: { error: { message: "backend down" } },
});
assert.equal(apiError.errorType, "api_error");
assert.match(apiError.message, /backend down/);
});
test("classifyPageSpeedError returns unknown for non-classified cases", () => {
const classified = classifyPageSpeedError({
error: new Error("something odd"),
});
const errorType: PageSpeedErrorType = classified.errorType;
assert.equal(errorType, "unknown");
assert.match(classified.message, /something odd/);
});
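
The classification cases exercised above can be read as an ordered rule list. The sketch below is one possible arrangement that satisfies those cases; the function name `sketchClassifyError`, the rule order, and the matched substrings are assumptions and may differ from the real `classifyPageSpeedError`.

```typescript
// Illustrative sketch; rule order and matched substrings are assumptions.
type SketchErrorType =
  | "quota"
  | "timeout"
  | "unavailable"
  | "invalid_url"
  | "api_error"
  | "unknown";

interface SketchClassifyInput {
  status?: number;
  body?: unknown;
  error?: unknown;
}

function readBodyMessage(body: unknown): string {
  if (typeof body !== "object" || body === null) return "";
  const error = (body as { error?: { message?: unknown } }).error;
  return typeof error?.message === "string" ? error.message : "";
}

function sketchClassifyError(
  input: SketchClassifyInput,
): { errorType: SketchErrorType; message: string } {
  const bodyText = input.body === undefined ? "" : JSON.stringify(input.body);
  const errorName = input.error instanceof Error ? input.error.name : "";
  const errorText = input.error instanceof Error ? input.error.message : "";
  const message =
    errorText || readBodyMessage(input.body) || "Unclassified PageSpeed error";

  // Aborted requests are treated as timeouts.
  if (errorName === "AbortError") return { errorType: "timeout", message };
  // Quota is signalled by status 429 or by a rate-limit reason in the body.
  if (input.status === 429 || /rateLimitExceeded/i.test(bodyText)) {
    return { errorType: "quota", message };
  }
  if (/invalid url/i.test(bodyText)) return { errorType: "invalid_url", message };
  if (input.status === 404 || /failed to fetch document/i.test(bodyText)) {
    return { errorType: "unavailable", message };
  }
  // Any other HTTP error status is a generic API error.
  if (input.status !== undefined && input.status >= 400) {
    return { errorType: "api_error", message };
  }
  return { errorType: "unknown", message };
}
```

Body signals are checked before the generic status fallback so that a 500 response carrying "Failed to fetch document" is still reported as `unavailable`.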
test("fetchPageSpeedResult calls the injected fetch with the built request URL", async () => {
const calls: string[] = [];
const fetchImpl = async (url: string) => {
calls.push(url);
return {
ok: true,
status: 200,
async json() {
return { ok: true };
},
} as Response;
};
const actual = await fetchPageSpeedResult({
url: "https://example.com/test?tracking=true",
strategy: "desktop",
apiKey: "secret-key",
fetchImpl,
});
assert.deepEqual(actual, { ok: true });
assert.equal(calls.length, 1);
const parsed = new URL(calls[0]);
assert.equal(parsed.searchParams.get("strategy"), "desktop");
assert.equal(parsed.searchParams.get("locale"), "de-DE");
assert.deepEqual(
parsed.searchParams.getAll("category"),
["performance", "accessibility", "best-practices", "seo"],
);
assert.equal(parsed.searchParams.get("key"), "secret-key");
});
test(
"fetchPageSpeedResult throws classified api_error when an ok response has invalid JSON",
async () => {
const fetchImpl = async () =>
({
ok: true,
status: 200,
async json() {
throw new SyntaxError("Unexpected token <");
},
}) as unknown as Response;
let caughtError: unknown;
try {
await fetchPageSpeedResult({
url: "https://example.com/broken-json",
strategy: "mobile",
fetchImpl,
});
assert.fail("Expected fetchPageSpeedResult to throw");
} catch (error) {
caughtError = error;
}
assert.match(String((caughtError as Error).message), /Unexpected token </i);
assert.equal((caughtError as Error & { errorType?: string }).errorType, "api_error");
},
);
test("fetchPageSpeedResult preserves Google API error messages", async () => {
const fetchImpl = async () =>
({
ok: false,
status: 403,
async json() {
return {
error: {
code: 403,
message: "API key not valid. Please pass a valid API key.",
status: "PERMISSION_DENIED",
},
};
},
}) as unknown as Response;
let caughtError: unknown;
try {
await fetchPageSpeedResult({
url: "https://example.com/key-error",
strategy: "desktop",
fetchImpl,
});
assert.fail("Expected fetchPageSpeedResult to throw");
} catch (error) {
caughtError = error;
}
assert.equal((caughtError as Error & { errorType?: string }).errorType, "api_error");
assert.match(String((caughtError as Error).message), /API key not valid/);
});


@@ -0,0 +1,242 @@
import assert from "node:assert/strict";
import { existsSync, readFileSync } from "node:fs";
import path from "node:path";
import test from "node:test";
import ts from "typescript";
const pageSpeedPath = path.join(process.cwd(), "convex", "pageSpeed.ts");
const pageSpeedSource = existsSync(pageSpeedPath)
? readFileSync(pageSpeedPath, "utf8")
: "";
const sourceFile = ts.createSourceFile(
"pageSpeed.ts",
pageSpeedSource,
ts.ScriptTarget.ES2022,
true,
ts.ScriptKind.TS,
);
function getExportedConstNames(file: ts.SourceFile) {
const names = new Set<string>();
const visit = (node: ts.Node) => {
if (ts.isVariableStatement(node)) {
const isExported = node.modifiers?.some(
(mod) => mod.kind === ts.SyntaxKind.ExportKeyword,
);
if (!isExported) {
ts.forEachChild(node, visit);
return;
}
const isConst = node.declarationList.flags & ts.NodeFlags.Const;
if (!isConst) {
ts.forEachChild(node, visit);
return;
}
for (const declaration of node.declarationList.declarations) {
if (ts.isIdentifier(declaration.name)) {
names.add(declaration.name.text);
}
}
}
ts.forEachChild(node, visit);
};
ts.forEachChild(file, visit);
return names;
}
function hasPattern(source: string, pattern: RegExp) {
return pattern.test(source);
}
function extractExportSource(name: string) {
const marker = `export const ${name} = `;
const declarationIndex = pageSpeedSource.indexOf(marker);
assert.notEqual(declarationIndex, -1, `Expected declaration for ${name}`);
const openBraceIndex = pageSpeedSource.indexOf("{", declarationIndex);
assert.notEqual(openBraceIndex, -1, `Expected opening brace for ${name}`);
let depth = 0;
let end = -1;
for (let index = openBraceIndex; index < pageSpeedSource.length; index += 1) {
const char = pageSpeedSource[index];
if (char === "{") {
depth += 1;
} else if (char === "}") {
depth -= 1;
if (depth === 0) {
end = index;
break;
}
}
}
assert.notEqual(end, -1, `Expected balanced braces for ${name}`);
return pageSpeedSource.slice(openBraceIndex, end + 1);
}
test("pageSpeed module exports mutation contracts", () => {
assert.equal(existsSync(pageSpeedPath), true, "pageSpeed.ts should be present");
const exports = getExportedConstNames(sourceFile);
const required = [
"queueLeadPageSpeedAudit",
"startPageSpeedAuditRun",
"persistPageSpeedResult",
"finishPageSpeedAuditRun",
];
for (const exportName of required) {
assert.equal(exports.has(exportName), true, `Expected export: ${exportName}`);
}
});
test("pageSpeed module uses internalMutation for queue/start/persist/finish", () => {
for (const name of [
"queueLeadPageSpeedAudit",
"startPageSpeedAuditRun",
"persistPageSpeedResult",
"finishPageSpeedAuditRun",
]) {
assert.equal(
hasPattern(pageSpeedSource, new RegExp(`export const ${name} = internalMutation\\s*\\(`)),
true,
`${name} should be registered as internalMutation.`,
);
}
});
test("queueLeadPageSpeedAudit dedupes per lead and schedules pagespeed action", () => {
const queueSource = extractExportSource("queueLeadPageSpeedAudit");
assert.equal(
hasPattern(
queueSource,
/withIndex\(\s*"by_type_and_status_and_leadId"[\s\S]*?eq\("type",\s*"audit"\)[\s\S]*?eq\("status",\s*"pending"\)[\s\S]*?eq\("leadId",\s*args\.leadId\)/,
),
true,
"Queue should dedupe pending audit runs by type+status+leadId.",
);
assert.equal(
hasPattern(
queueSource,
/withIndex\(\s*"by_type_and_status_and_leadId"[\s\S]*?eq\("type",\s*"audit"\)[\s\S]*?eq\("status",\s*"running"\)[\s\S]*?eq\("leadId",\s*args\.leadId\)/,
),
true,
"Queue should dedupe running audit runs by type+status+leadId.",
);
assert.equal(
hasPattern(
queueSource,
/currentStep:\s*["']pagespeed_insights["']/,
),
true,
"Queued page speed runs should use currentStep pagespeed_insights.",
);
assert.equal(
hasPattern(
queueSource,
/ctx\.scheduler\.runAfter\(\s*0,\s*internal\.pageSpeedAction\.processPageSpeedAudit,\s*\{[\s\S]*?runId/,
),
true,
"queueLeadPageSpeedAudit must schedule internal.pageSpeedAction.processPageSpeedAudit with runAfter(0, ...).",
);
assert.equal(
hasPattern(
queueSource,
/PageSpeed-Analyse wurde in die Warteschlange gesetzt\./,
),
true,
"queueLeadPageSpeedAudit should emit queue-start event message.",
);
});
test("startPageSpeedAuditRun marks run as running and handles clear failures", () => {
const startSource = extractExportSource("startPageSpeedAuditRun");
assert.equal(
hasPattern(
startSource,
/run\.type\s*!==?\s*["']audit["']/,
),
true,
"start function should require audit run type.",
);
assert.equal(
hasPattern(startSource, /run\.status\s*!==?\s*["']pending["']/),
true,
"start function should require pending status.",
);
assert.equal(
hasPattern(
startSource,
/ctx\.db\.patch\(\s*args\.runId,\s*\{[\s\S]*status:\s*["']running["']/,
),
true,
"start function should set status running.",
);
assert.equal(
hasPattern(startSource, /currentStep:\s*["']pagespeed_insights["']/),
true,
"start function should set currentStep pagespeed_insights.",
);
assert.equal(
hasPattern(
startSource,
/!run\.leadId[\s\S]*status:\s*["']failed["']/,
),
true,
"start should fail and record missing leadId.",
);
assert.equal(
hasPattern(
startSource,
/!lead\.websiteUrl[\s\S]*status:\s*["']failed["']/,
),
true,
"start should fail and record missing website URL.",
);
assert.equal(
hasPattern(
startSource,
/message:\s*["'][^"']*konnte nicht gestartet werden[^"']*["']/i,
),
true,
"start should add clear failure events.",
);
});
test("persistPageSpeedResult writes pageSpeedResults table", () => {
const persistSource = pageSpeedSource
? extractExportSource("persistPageSpeedResult")
: "export const persistPageSpeedResult = {}";
assert.equal(
hasPattern(persistSource, /ctx\.db\.insert\(\s*["']pageSpeedResults["']/),
true,
"persistPageSpeedResult should insert into pageSpeedResults.",
);
});
test("finishPageSpeedAuditRun writes completion status and finishedAt", () => {
const finishSource = extractExportSource("finishPageSpeedAuditRun");
assert.equal(
hasPattern(finishSource, /ctx\.db\.patch\(\s*args\.runId,[\s\S]*?finishedAt:\s*now/),
true,
"finish function should set finishedAt.",
);
assert.equal(
hasPattern(
finishSource,
/counters:\s*\{\s*[\s\S]*?errors:\s*args\.errors\s*\?\?/,
),
true,
"finish function should update counters.",
);
assert.equal(
hasPattern(finishSource, /currentStep:\s*["']pagespeed_insights["']/),
true,
"finish function should set currentStep pagespeed_insights.",
);
});
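
The static checks in this file depend on slicing one export's body out of the source by counting braces. A standalone sketch of that technique follows; `extractBalancedBlock` is a hypothetical name, and it returns `null` where the test helper asserts. Like the helper above, it does not account for braces inside string literals or comments, so it is only reliable on source where those do not unbalance the count.

```typescript
// Illustrative sketch of brace-balanced extraction; not a real parser.
function extractBalancedBlock(source: string, marker: string): string | null {
  const markerIndex = source.indexOf(marker);
  if (markerIndex === -1) return null;
  const openIndex = source.indexOf("{", markerIndex);
  if (openIndex === -1) return null;

  let depth = 0;
  for (let index = openIndex; index < source.length; index += 1) {
    const char = source[index];
    if (char === "{") {
      depth += 1;
    } else if (char === "}") {
      depth -= 1;
      // Depth returns to zero at the brace that closes the first one.
      if (depth === 0) return source.slice(openIndex, index + 1);
    }
  }
  // Unbalanced input: no matching closing brace was found.
  return null;
}
```

For source that may contain braces in strings or template literals, walking the AST from `ts.createSourceFile` (already used in this file for export names) would be the more robust option.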


@@ -0,0 +1,226 @@
import assert from "node:assert/strict";
import { readFileSync } from "node:fs";
import { join } from "node:path";
import test from "node:test";
const schemaSource = readFileSync(
join(process.cwd(), "convex", "schema.ts"),
"utf8",
);
type ExactSetEquality<A, B> = [
Exclude<A, B>,
] extends [never]
? [Exclude<B, A>] extends [never]
? true
: false
: false;
type AssertPageSpeedStrategy = "mobile" | "desktop";
type AssertPageSpeedResultStatus = "succeeded" | "failed";
type AssertPageSpeedErrorType =
| "quota"
| "timeout"
| "unavailable"
| "invalid_url"
| "api_error"
| "unknown";
type PageSpeedStrategyParity = ExactSetEquality<
AssertPageSpeedStrategy,
("mobile" | "desktop")
>;
type PageSpeedResultStatusParity = ExactSetEquality<
AssertPageSpeedResultStatus,
"succeeded" | "failed"
>;
type PageSpeedErrorTypeParity = ExactSetEquality<
AssertPageSpeedErrorType,
"quota" | "timeout" | "unavailable" | "invalid_url" | "api_error" | "unknown"
>;
const _assertPageSpeedStrategyParity: PageSpeedStrategyParity = true;
const _assertPageSpeedResultStatusParity: PageSpeedResultStatusParity = true;
const _assertPageSpeedErrorTypeParity: PageSpeedErrorTypeParity = true;
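
Note that the parity types above compare each alias with an identical inline union, so they hold by construction. A sketch of how `ExactSetEquality` behaves when the compared union is derived from a separate source of truth is shown below. The `SKETCH_STRATEGIES` array and the derived types are hypothetical and only illustrate the mechanism.

```typescript
// Illustrative sketch; SKETCH_STRATEGIES and the derived types are hypothetical.
type ExactSetEquality<A, B> = [Exclude<A, B>] extends [never]
  ? [Exclude<B, A>] extends [never]
    ? true
    : false
  : false;

// A union derived from a runtime list instead of being written out twice.
const SKETCH_STRATEGIES = ["mobile", "desktop"] as const;
type DerivedStrategy = (typeof SKETCH_STRATEGIES)[number];

// Same members in a different order: resolves to true.
type SameMembers = ExactSetEquality<DerivedStrategy, "desktop" | "mobile">;
// One member missing on the right-hand side: resolves to false.
type MissingMember = ExactSetEquality<DerivedStrategy, "mobile">;

// Each assignment only compiles if the type resolved as described above.
const sameMembers: SameMembers = true;
const missingMember: MissingMember = false;
```

If a member were added to the list without updating the expected union, the `true` assignment would stop compiling, which is the drift the parity check is meant to catch.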
function extractTableSection(tableName: string) {
const marker = `${tableName}: defineTable({`;
const markerIndex = schemaSource.indexOf(marker);
assert.notEqual(
markerIndex,
-1,
`Expected schema table definition for ${tableName}.`,
);
const objectStart = schemaSource.indexOf("{", markerIndex);
let depth = 0;
let objectEnd = -1;
for (let i = objectStart; i < schemaSource.length; i += 1) {
if (schemaSource[i] === "{") {
depth += 1;
} else if (schemaSource[i] === "}") {
depth -= 1;
if (depth === 0) {
objectEnd = i;
break;
}
}
}
assert.notEqual(objectEnd, -1, `Could not parse schema object for ${tableName}.`);
const remainder = schemaSource.slice(objectEnd + 1);
const nextTableMatch = remainder.match(/^\s*[a-zA-Z_][\w]*:\s*defineTable\(/m);
const sectionEnd =
nextTableMatch === null ? schemaSource.length : objectEnd + 1 + nextTableMatch.index!;
const section = schemaSource.slice(markerIndex, sectionEnd);
const objectBlock = schemaSource.slice(markerIndex, objectEnd + 1);
return { section, objectBlock };
}
function assertHas(pattern: RegExp, source: string, message: string) {
assert.equal(pattern.test(source), true, message);
}
test("PageSpeed validator unions are declared", () => {
assert.equal(_assertPageSpeedStrategyParity, true);
assert.equal(_assertPageSpeedResultStatusParity, true);
assert.equal(_assertPageSpeedErrorTypeParity, true);
assertHas(
/const\s+pageSpeedStrategy\s*=\s*v\.union\(\s*[\s\S]*v\.literal\(\s*["']mobile["']\s*\)\s*,\s*[\s\S]*v\.literal\(\s*["']desktop["']\s*\)[\s\S]*\)/,
schemaSource,
"Schema should define pageSpeedStrategy union with mobile and desktop.",
);
assertHas(
/const\s+pageSpeedResultStatus\s*=\s*v\.union\(\s*[\s\S]*v\.literal\(\s*["']succeeded["']\s*\)\s*,\s*[\s\S]*v\.literal\(\s*["']failed["']\s*\)[\s\S]*\)/,
schemaSource,
"Schema should define pageSpeedResultStatus union with succeeded and failed.",
);
assertHas(
/const\s+pageSpeedErrorType\s*=\s*v\.union\(\s*[\s\S]*v\.literal\(\s*["']quota["']\s*\)\s*,\s*[\s\S]*v\.literal\(\s*["']timeout["']\s*\)\s*,\s*[\s\S]*v\.literal\(\s*["']unavailable["']\s*\)\s*,\s*[\s\S]*v\.literal\(\s*["']invalid_url["']\s*\)\s*,\s*[\s\S]*v\.literal\(\s*["']api_error["']\s*\)\s*,\s*[\s\S]*v\.literal\(\s*["']unknown["']\s*\)[\s\S]*\)/,
schemaSource,
"Schema should define pageSpeedErrorType union with all declared values.",
);
});
test("pageSpeedResults table has contract fields and indexes", () => {
const { section, objectBlock } = extractTableSection("pageSpeedResults");
assertHas(
/leadId:\s*v\.id\(["']leads["']\)/,
objectBlock,
"pageSpeedResults.leadId should be required lead id.",
);
assertHas(
/auditId:\s*v\.optional\(\s*v\.id\(["']audits["']\)\s*\)/,
objectBlock,
"pageSpeedResults.auditId should be optional audit id.",
);
assertHas(
/runId:\s*v\.optional\(\s*v\.id\(["']agentRuns["']\)\s*\)/,
objectBlock,
"pageSpeedResults.runId should be optional run id.",
);
assertHas(
/strategy:\s*pageSpeedStrategy/,
objectBlock,
"pageSpeedResults.strategy should use pageSpeedStrategy validator.",
);
assertHas(
/status:\s*pageSpeedResultStatus/,
objectBlock,
"pageSpeedResults.status should use pageSpeedResultStatus validator.",
);
assertHas(
/sourceUrl:\s*v\.string\(\)/,
objectBlock,
"pageSpeedResults.sourceUrl should be required.",
);
assertHas(
/finalUrl:\s*v\.optional\(\s*v\.string\(\)\s*\)/,
objectBlock,
"pageSpeedResults.finalUrl should be optional string.",
);
assertHas(
/rawStorageId:\s*v\.optional\(\s*v\.id\(["']_storage["']\)\s*\)/,
objectBlock,
"pageSpeedResults.rawStorageId should be optional storage id.",
);
assertHas(
/errorType:\s*v\.optional\(\s*pageSpeedErrorType\s*\)/,
objectBlock,
"pageSpeedResults.errorType should be optional error type.",
);
assertHas(
/errorSummary:\s*v\.optional\(\s*v\.string\(\)\s*\)/,
objectBlock,
"pageSpeedResults.errorSummary should be optional.",
);
assertHas(
/fetchedAt:\s*v\.number\(\)/,
objectBlock,
"pageSpeedResults.fetchedAt should be required.",
);
assertHas(
/createdAt:\s*v\.number\(\)/,
objectBlock,
"pageSpeedResults.createdAt should be required.",
);
assertHas(
/scores:\s*v\.optional\(\s*v\.object\([\s\S]*?performance:\s*v\.optional\(v\.number\(\)\)[\s\S]*?accessibility:\s*v\.optional\(v\.number\(\)\)[\s\S]*?bestPractices:\s*v\.optional\(v\.number\(\)\)[\s\S]*?seo:\s*v\.optional\(v\.number\(\)\)[\s\S]*?\)\s*\)/,
objectBlock,
"pageSpeedResults.normalized.scores should include expected keys.",
);
assertHas(
/metrics:\s*v\.optional\(\s*v\.object\([\s\S]*?firstContentfulPaintMs:\s*v\.optional\(v\.number\(\)\)[\s\S]*?largestContentfulPaintMs:\s*v\.optional\(v\.number\(\)\)[\s\S]*?cumulativeLayoutShift:\s*v\.optional\(v\.number\(\)\)[\s\S]*?totalBlockingTimeMs:\s*v\.optional\(v\.number\(\)\)[\s\S]*?speedIndexMs:\s*v\.optional\(v\.number\(\)\)[\s\S]*?\)\s*\)/,
objectBlock,
"pageSpeedResults.normalized.metrics should include expected keys.",
);
assertHas(
/opportunities:\s*v\.optional\(\s*v\.array\(v\.string\(\)\)\s*\)/,
objectBlock,
"pageSpeedResults.normalized.opportunities should be optional string array.",
);
assertHas(
/implications:\s*v\.optional\(\s*v\.array\(v\.string\(\)\)\s*\)/,
objectBlock,
"pageSpeedResults.normalized.implications should be optional string array.",
);
assertHas(
/index\("by_leadId",\s*\["leadId"\]\)/,
section,
"pageSpeedResults should have by_leadId index.",
);
assertHas(
/index\("by_runId",\s*\["runId"\]\)/,
section,
"pageSpeedResults should have by_runId index.",
);
assertHas(
/index\("by_auditId",\s*\["auditId"\]\)/,
section,
"pageSpeedResults should have by_auditId index.",
);
assertHas(
/index\("by_leadId_and_strategy",\s*\["leadId",\s*"strategy"\]\)/,
section,
"pageSpeedResults should have by_leadId_and_strategy index.",
);
});
test("audits should not include public raw PageSpeed/Lighthouse JSON fields", () => {
const { objectBlock } = extractTableSection("audits");
const hasPublicRawJson = /raw.*pagespeed|pagespeed.*raw|raw.*lighthouse|lighthouse.*raw/i.test(
objectBlock,
);
assert.equal(
hasPublicRawJson,
false,
"audits should not expose raw PageSpeed/Lighthouse JSON fields.",
);
});

tests/runs-domain.test.ts

@@ -0,0 +1,27 @@
import assert from "node:assert/strict";
import { readFileSync } from "node:fs";
import { join } from "node:path";
import test from "node:test";
const runsSource = readFileSync(
join(process.cwd(), "convex", "runs.ts"),
"utf8",
);
const schemaSource = readFileSync(
join(process.cwd(), "convex", "schema.ts"),
"utf8",
);
test("run listing supports type-only filtering", () => {
assert.match(
runsSource,
/if\s*\(\s*args\.type\s*\)\s*\{[\s\S]*?\.withIndex\(\s*"by_type"\s*,\s*\(q\)\s*=>\s*q\.eq\("type",\s*type\)\)/,
);
});
test("agentRuns schema defines by_type index", () => {
assert.match(
schemaSource,
/\.index\("by_type",\s*\["type"\]\)/,
);
});


@@ -0,0 +1,291 @@
import assert from "node:assert/strict";
import test from "node:test";
import {
buildTechnicalChecks,
isSameRegistrableHostishDomain,
normalizeCrawlUrl,
discoverRelevantSubpageUrls,
extractContactSignalsFromHtmlLikeText,
} from "../lib/website-crawler";
import { getUsableContactEmailFromEntries } from "../lib/lead-discovery-google";
test("normalizeCrawlUrl normalizes host and strips fragments while supporting relative links with base", () => {
assert.equal(
normalizeCrawlUrl("https://WWW.Example.Com/path?x=1#kontakt", undefined),
"https://example.com/path?x=1",
);
assert.equal(normalizeCrawlUrl("/kontakt?lang=de#top", "https://www.example.de/start"), "https://example.de/kontakt?lang=de");
assert.equal(normalizeCrawlUrl("mailto:owner@example.de", "https://example.de"), null);
});
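
The normalization rules asserted above (resolve against a base, drop the `www.` prefix and the fragment, keep the query, reject non-HTTP schemes) can be sketched with the standard `URL` class. The function name `sketchNormalizeCrawlUrl` is hypothetical and the real `normalizeCrawlUrl` may apply further rules.

```typescript
// Illustrative sketch built on the WHATWG URL class; not the project's implementation.
function sketchNormalizeCrawlUrl(input: string, base?: string): string | null {
  let parsed: URL;
  try {
    // Relative links are resolved against the base; absolute links ignore it.
    parsed = new URL(input, base);
  } catch {
    return null;
  }
  // mailto:, tel:, javascript: and similar schemes are not crawlable pages.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return null;
  }
  // The URL parser already lowercases the host; only the www. prefix is removed here.
  parsed.hostname = parsed.hostname.replace(/^www\./, "");
  // Fragments never reach the server, so they are dropped.
  parsed.hash = "";
  return parsed.toString();
}
```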
test("isSameRegistrableHostishDomain treats www domain variants as same domain", () => {
assert.equal(
isSameRegistrableHostishDomain("https://www.example.de/kontakt", "http://example.de"),
true,
);
assert.equal(
isSameRegistrableHostishDomain("//example.de/contact", "https://www.example.de"),
true,
);
assert.equal(
isSameRegistrableHostishDomain("https://blog.example.de/kontakt", "https://example.de"),
false,
);
});
test("discoverRelevantSubpageUrls keeps homepage first, prioritizes relevant categories, and is bounded", () => {
const links = [
"https://other.example.com/kontakt",
"mailto:kontakt@example.de",
"https://example.de/leistungen?source=seo",
"/kontakt",
"/angebot",
"/impressum?x=1",
"/ueber-uns",
"/services?foo=bar",
"/irrelevant",
];
const discovered = discoverRelevantSubpageUrls(links, "https://www.example.de");
assert.deepEqual(discovered, [
"https://example.de/",
"https://example.de/kontakt",
"https://example.de/impressum",
"https://example.de/leistungen",
"https://example.de/ueber-uns",
]);
});
test("discoverRelevantSubpageUrls deduplicates query variants before bounded selection", () => {
const links = [
"https://example.de/kontakt?a=1",
"/kontakt?a=2",
"/kontakt?source=google",
"https://example.de/ueber-uns?team=1",
];
const discovered = discoverRelevantSubpageUrls(links, "https://www.example.de");
assert.deepEqual(discovered, [
"https://example.de/",
"https://example.de/kontakt",
"https://example.de/ueber-uns",
]);
});
test("discoverRelevantSubpageUrls ignores cross-domain and non-navigational link schemes", () => {
const links = [
"mailto:kontakt@example.de",
"tel:+49 30 1234 567",
"javascript:void(0)",
"https://example.de/contact",
"https://blog.example.de/impressum",
"//other.de/team",
"http://example.de/leistungen",
];
const discovered = discoverRelevantSubpageUrls(links, "https://www.example.de/path");
assert.deepEqual(discovered, [
"https://example.de/",
"https://example.de/contact",
"http://example.de/leistungen",
]);
});
test("generic contact emails beat named emails when selected through TASK-7 rule helper", () => {
const signals = extractContactSignalsFromHtmlLikeText(
"<h1>Kontakt</h1><p>Schreiben Sie an <a href=\"mailto:owner@example.de\">Max Mustermann</a> oder info@example.de.</p>",
);
const usable = getUsableContactEmailFromEntries(signals.emailCandidates);
assert.equal(usable?.email, "info@example.de");
});
test("named email without explicit business-contact context is not accepted by TASK-7 helper", () => {
const signals = extractContactSignalsFromHtmlLikeText(
"<p>Wir beantworten offene Fragen per max.mustermann@example.de und stehen Ihnen werktags zur Verfügung.</p>",
);
const usable = getUsableContactEmailFromEntries(signals.emailCandidates);
assert.equal(usable, null);
assert.equal(signals.emailCandidates[0]?.isBusinessContactAddress, false);
});
test("extractContactSignalsFromHtmlLikeText marks Bock Impressum mailto candidates as business contact", () => {
const signals = extractContactSignalsFromHtmlLikeText(
"<p>Impressum</p>" +
"<script>" +
"x".repeat(320) +
"</script>" +
"<p>E-Mail: <a href=\"mailto:chemnitz@bock-rechtsanwaelte.de\">chemnitz@bock-rechtsanwaelte.de</a> oder <a href=\"mailto:aue@bock-rechtsanwaelte.de\">aue@bock-rechtsanwaelte.de</a></p>" +
"<p>Weitere E-Mail-Adressen: dresden@bock-rechtsanwaelte.de, mittweida@bock-rechtsanwaelte.de, meerane@bock-rechtsanwaelte.de</p>",
);
const usable = getUsableContactEmailFromEntries(signals.emailCandidates);
// Any of the listed Impressum addresses is acceptable; the test only requires that one is selected.
assert.equal(usable !== null, true);
for (const candidate of signals.emailCandidates) {
assert.equal(candidate.isBusinessContactAddress, true);
}
});
test("email-labeled mailto links should not populate contactPerson", () => {
const signals = extractContactSignalsFromHtmlLikeText(
"<p>Impressum - E-Mail: <a href=\"mailto:chemnitz@bock-rechtsanwaelte.de\">chemnitz@bock-rechtsanwaelte.de</a></p>",
);
const candidate = signals.emailCandidates.find(
(entry) => entry.email === "chemnitz@bock-rechtsanwaelte.de",
);
assert.equal(candidate?.contactPerson, null);
});
test("extractContactSignalsFromHtmlLikeText parses mailto links with query parameters in contact context", () => {
const signals = extractContactSignalsFromHtmlLikeText(
'<footer><p><a href="mailto:info@example.de?subject=Anfrage">Jetzt schreiben</a></p></footer>',
);
const candidate = signals.emailCandidates[0];
assert.equal(signals.emailCandidates.length, 1);
assert.equal(candidate?.email, "info@example.de");
assert.equal(candidate?.isBusinessContactAddress, true);
});
test("extractContactSignalsFromHtmlLikeText parses common obfuscations in visible text", () => {
const signals = extractContactSignalsFromHtmlLikeText(
"<p>Sie erreichen uns unter info [at] example.de, kontakt (at) example punkt de oder office&nbsp;@&nbsp;example.de.</p>",
);
const emails = signals.emailCandidates.map((entry) => entry.email).sort();
assert.deepEqual(emails, [
"info@example.de",
"kontakt@example.de",
"office@example.de",
]);
});
test("does not infer obfuscated emails from normal prose with bare at/dot", () => {
const signals = extractContactSignalsFromHtmlLikeText(
"<p>We are at example dot de for a workshop in the city center.</p>",
);
assert.equal(signals.emailCandidates.length, 0);
});
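
The two obfuscation tests above draw a line between bracketed markers such as `[at]` or `(at)` and ordinary prose containing "at" or "dot". One way to respect that line is sketched below. The function name `sketchExtractEmails` and all patterns are assumptions for illustration, not the logic of `extractContactSignalsFromHtmlLikeText`.

```typescript
// Illustrative sketch; patterns are assumptions and cover only the cases shown in the tests.
function sketchExtractEmails(htmlLikeText: string): string[] {
  const normalized = htmlLikeText
    // Non-breaking space entities become ordinary spaces.
    .replace(/&nbsp;/gi, " ")
    // "at" only counts as "@" when it is bracketed, so plain prose is untouched.
    .replace(/\s*[\[(]\s*at\s*[\])]\s*/gi, "@")
    // "name @ domain" collapses to "name@domain".
    .replace(/\s+@\s+/g, "@")
    // A bare "punkt" or "dot" is rewritten only directly after a recovered "@domain".
    .replace(/([\w.+-]+@[\w-]+)\s+(?:punkt|dot)\s+([a-z]{2,})\b/gi, "$1.$2");

  const matches = normalized.match(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g) ?? [];
  // Lowercase, deduplicate, and sort for a stable result.
  return Array.from(new Set(matches.map((email) => email.toLowerCase()))).sort();
}
```

Tying the bare "punkt"/"dot" rewrite to a preceding "@" is what keeps a sentence like "We are at example dot de" from producing a candidate.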
test("deduplicates repeated mailto entries", () => {
const signals = extractContactSignalsFromHtmlLikeText(
"<p><a href=\"mailto:info@example.de\">info@example.de</a> and again <a href=\"mailto:info@example.de\">also</a></p>",
);
assert.equal(signals.emailCandidates.length, 1);
});
test("TASK-7 keeps generic contact emails in footer/impressum usable and rejects named emails without context", () => {
const footerSignals = extractContactSignalsFromHtmlLikeText(
"<footer>Impressum: info@example.de für allgemeine Anfragen.</footer>",
);
assert.equal(
getUsableContactEmailFromEntries(footerSignals.emailCandidates)?.email,
"info@example.de",
);
const impressumSignals = extractContactSignalsFromHtmlLikeText(
"<p>Impressum der Firma office@example.de ist die Hauptadresse.</p>",
);
assert.equal(
getUsableContactEmailFromEntries(impressumSignals.emailCandidates)?.email,
"office@example.de",
);
const namedSignals = extractContactSignalsFromHtmlLikeText(
"<p>Bitte wenden Sie sich an max.mustermann@example.de bei Fragen.</p>",
);
assert.equal(
getUsableContactEmailFromEntries(namedSignals.emailCandidates),
null,
);
});
test("extractContactSignalsFromHtmlLikeText captures contact-person from adjacent raw HTML context", () => {
const signals = extractContactSignalsFromHtmlLikeText(
"<p>Ansprechpartner: <a href=\"/team/max-mustermann\">Max Mustermann</a> max.mustermann@example.de</p>",
);
const candidate = signals.emailCandidates[0];
assert.equal(candidate?.email, "max.mustermann@example.de");
assert.equal(candidate?.contactPerson, "Max Mustermann");
assert.equal(candidate?.isBusinessContactAddress, true);
});
test("technical checks detect protocol, missing metadata, contact path, and broken internal links", () => {
const checks = buildTechnicalChecks({
rootUrl: "https://www.example.de",
finalUrl: "http://example.de/firma",
title: " ",
metaDescription: "",
visibleText: "Wir freuen uns, wenn Sie uns kontaktieren. Hier geht es zum Kontaktformular.",
links: [
"/kontakt",
{ href: "/impressum", statusCode: 200 },
{ href: "https://example.de/broken", statusCode: 404 },
{ href: "https://partner.example.de/team", statusCode: 500 },
],
});
assert.equal(checks.https, false);
assert.equal(checks.finalUrl, "http://example.de/firma");
assert.equal(checks.missingTitle, true);
assert.equal(checks.missingMetaDescription, true);
assert.equal(checks.hasVisibleContactPath, true);
assert.deepEqual(checks.brokenInternalLinks, ["https://example.de/broken"]);
});
test("technical checks only report broken links that are in the crawl-bounded checked URL set", () => {
const checks = buildTechnicalChecks({
rootUrl: "https://www.example.de",
finalUrl: "https://example.de",
links: [
{ href: "/kontakt", statusCode: 200 },
{ href: "/broken-a", statusCode: 404 },
{ href: "/broken-b", statusCode: 500 },
{ href: "/outside", statusCode: 404 },
],
checkedUrls: ["https://example.de/kontakt", "https://example.de/broken-a"],
});
assert.deepEqual(checks.brokenInternalLinks, ["https://example.de/broken-a"]);
});
test("contact signals require contact-context and do not fire on generic words alone", () => {
const generic = extractContactSignalsFromHtmlLikeText(
"<p>Bitte warten Sie einen Moment, wir senden Ihnen gleich Infos.</p><span>Jetzt ist alles bereit.</span>",
);
assert.equal(generic.hasContactFormSignal, false);
assert.equal(generic.hasContactCtaSignal, false);
});
test("contact signals fire for explicit contact forms and Anfrage senden", () => {
const formSignal = extractContactSignalsFromHtmlLikeText(
"<h1>Kontaktformular</h1><form><input name=\"name\"><button>Absenden</button></form>",
);
const requestSignal = extractContactSignalsFromHtmlLikeText(
"<p>Schreiben Sie uns eine Anfrage senden.</p>",
);
assert.equal(formSignal.hasContactFormSignal, true);
assert.equal(formSignal.hasContactCtaSignal, true);
assert.equal(requestSignal.hasContactFormSignal, false);
assert.equal(requestSignal.hasContactCtaSignal, true);
});
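
The obfuscation tests above pin down a behaviour that is easy to get wrong: bracketed markers such as `[at]` or `(at)` count as an `@`, while a bare "at" or "dot" in ordinary prose must not. A minimal sketch of one way to get that behaviour follows. It assumes a hypothetical helper `extractObfuscatedEmails` that works on visible text only; it is not the project's `extractContactSignalsFromHtmlLikeText`, whose implementation is not shown in this diff.

```typescript
// Hypothetical helper for illustration only; the real extractor may work differently.
function extractObfuscatedEmails(visibleText: string): string[] {
  const normalized = visibleText
    // Treat non-breaking spaces (HTML entity or U+00A0) as plain spaces.
    .replace(/&nbsp;|\u00a0/gi, " ")
    // Only bracketed markers such as "[at]" or "(at)" are rewritten to "@".
    .replace(/\s*[\[(]\s*at\s*[\])]\s*/gi, "@")
    // Collapse whitespace around a literal "@".
    .replace(/\s*@\s*/g, "@")
    // A spelled-out dot ("punkt" or "dot") is rewritten only directly after an "@" domain,
    // so plain prose like "at example dot de" stays untouched.
    .replace(
      /(@[a-z0-9-]+(?:\.[a-z0-9-]+)*)\s+(?:punkt|dot)\s+([a-z]{2,})\b/gi,
      "$1.$2",
    );

  // Pick up whatever now looks like a regular address.
  const matches =
    normalized.match(/[a-z0-9._%+-]+@(?:[a-z0-9-]+\.)+[a-z]{2,}/gi) ?? [];

  // Lowercase, deduplicate, and sort for stable comparisons in tests.
  return [...new Set(matches.map((email) => email.toLowerCase()))].sort();
}
```

Requiring brackets around "at" and an existing `@` before a spelled-out dot is the design choice that keeps the false-positive test ("We are at example dot de ...") at zero candidates.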

@@ -0,0 +1,679 @@
import assert from "node:assert/strict";
import { existsSync, readFileSync } from "node:fs";
import path from "node:path";
import test from "node:test";
import ts from "typescript";
const convexConfigPath = path.join(process.cwd(), "convex.json");
const convexConfigSource = readFileSync(convexConfigPath, "utf8");
const websiteEnrichmentPath = path.join(
process.cwd(),
"convex/websiteEnrichment.ts",
);
const actionPath = path.join(process.cwd(), "convex/websiteEnrichmentAction.ts");
const websiteEnrichmentSource = readFileSync(websiteEnrichmentPath, "utf8");
const actionSource = readFileSync(actionPath, "utf8");
const websiteEnrichmentSourceFile = ts.createSourceFile(
"websiteEnrichment.ts",
websiteEnrichmentSource,
ts.ScriptTarget.ES2022,
true,
ts.ScriptKind.TS,
);
const actionSourceFile = ts.createSourceFile(
"websiteEnrichmentAction.ts",
actionSource,
ts.ScriptTarget.ES2022,
true,
ts.ScriptKind.TS,
);
// Collects the names of all `export const` declarations in a parsed source file.
function getExportedConstNames(file: ts.SourceFile) {
const names = new Set<string>();
const visit = (node: ts.Node) => {
if (ts.isVariableStatement(node)) {
const isExported =
node.modifiers?.some((mod) => mod.kind === ts.SyntaxKind.ExportKeyword) ?? false;
const isConst = (node.declarationList.flags & ts.NodeFlags.Const) !== 0;
if (isExported && isConst) {
for (const declaration of node.declarationList.declarations) {
if (ts.isIdentifier(declaration.name)) {
names.add(declaration.name.text);
}
}
}
}
ts.forEachChild(node, visit);
};
ts.forEachChild(file, visit);
return names;
}
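// Thin wrapper around RegExp.test; pass patterns without the `g` or `y` flag, since test() is stateful for those.
function hasPattern(source: string, pattern: RegExp) {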
return pattern.test(source);
}
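// Returns the first brace-delimited block after `export const <name> = `.
// Braces are counted naively, so braces inside strings, template literals, or comments can skew the match.
function extractExportSource(source: string, name: string) {
const marker = `export const ${name} = `;
const declarationIndex = source.indexOf(marker);
assert.notEqual(declarationIndex, -1, `Expected declaration for ${name}`);
const openBraceIndex = source.indexOf("{", declarationIndex);
assert.notEqual(openBraceIndex, -1, `Expected opening brace for ${name}`);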
let depth = 0;
let end = -1;
for (let index = openBraceIndex; index < source.length; index += 1) {
const char = source[index];
if (char === "{") {
depth += 1;
} else if (char === "}") {
depth -= 1;
if (depth === 0) {
end = index;
break;
}
}
}
assert.notEqual(end, -1, `Expected balanced braces for ${name}`);
return source.slice(openBraceIndex, end + 1);
}
test("website enrichment mutation module exists and does not declare the Node runtime", () => {
assert.equal(
existsSync(websiteEnrichmentPath),
true,
"websiteEnrichment.ts should be present",
);
assert.equal(
hasPattern(websiteEnrichmentSource, /^"use node";/m),
false,
"websiteEnrichment.ts should not declare a Node runtime",
);
});
test("website enrichment action module exists and uses Node runtime", () => {
assert.equal(
existsSync(actionPath),
true,
"websiteEnrichmentAction.ts should be present",
);
assert.equal(
hasPattern(actionSource, /^"use node";/m),
true,
"websiteEnrichmentAction.ts should declare Node runtime",
);
});
test("module exports are split across mutations and action", () => {
const mutationExports = getExportedConstNames(websiteEnrichmentSourceFile);
const actionExports = getExportedConstNames(actionSourceFile);
const requiredMutationExports = [
"queueLeadEnrichment",
"startLeadEnrichmentRun",
"persistLeadEnrichmentResult",
"finishLeadEnrichmentRun",
"patchLeadFromWebsiteEnrichment",
];
const requiredActionExports = ["processLeadEnrichment"];
for (const exportName of requiredMutationExports) {
assert.equal(
mutationExports.has(exportName),
true,
`Expected mutation export in websiteEnrichment.ts: ${exportName}`,
);
}
for (const exportName of requiredActionExports) {
assert.equal(
actionExports.has(exportName),
true,
`Expected action export in websiteEnrichmentAction.ts: ${exportName}`,
);
}
});
test("queueLeadEnrichment schedules internal.websiteEnrichmentAction.processLeadEnrichment", () => {
assert.equal(
hasPattern(
websiteEnrichmentSource,
/queueLeadEnrichment\s*=\s*internalMutation\([\s\S]*?ctx\.scheduler\.runAfter\(\s*0,\s*internal\.websiteEnrichmentAction\.processLeadEnrichment/,
),
true,
"Queue mutation should schedule action with runAfter(0, internal.websiteEnrichmentAction.processLeadEnrichment)",
);
});
test("queueLeadEnrichment uses lead-aware run index and does not use fixed-size .take(50) windows", () => {
const queueBodyMatch = websiteEnrichmentSource.match(
/export const queueLeadEnrichment[\s\S]*?(?=\nexport const startLeadEnrichmentRun)/,
);
assert.equal(
queueBodyMatch !== null,
true,
"queueLeadEnrichment block should be parseable for source assertions",
);
const queueBody = queueBodyMatch?.[0] ?? "";
assert.equal(
hasPattern(
queueBody,
/withIndex\("by_type_and_status_and_leadId"[\s\S]*?eq\("type",\s*"website_enrichment"\)[\s\S]*?eq\("status",\s*"pending"\)[\s\S]*?eq\("leadId",\s*args\.leadId\)/,
),
true,
"Queue dedupe for pending runs should use direct type+status+leadId index.",
);
assert.equal(
hasPattern(
queueBody,
/withIndex\("by_type_and_status_and_leadId"[\s\S]*?eq\("type",\s*"website_enrichment"\)[\s\S]*?eq\("status",\s*"running"\)[\s\S]*?eq\("leadId",\s*args\.leadId\)/,
),
true,
"Queue dedupe for running runs should use direct type+status+leadId index.",
);
assert.equal(hasPattern(queueBody, /take\(50\)/), false, "No fixed-size .take(50) window in dedupe queries.");
});
test("website enrichment action uses Chromium desktop/mobile devices and runtime Playwright import", () => {
assert.equal(
hasPattern(
actionSource,
/import\s+type\s+\{[^\n]*BrowserContext[^\n]*\}\s+from\s+["']playwright-core["']/,
),
true,
"Action should import BrowserContext type for typed helper signatures",
);
assert.equal(
hasPattern(actionSource, /loadPlaywrightModules\(\)/),
true,
"Action should load Playwright at runtime from inside action",
);
assert.equal(
hasPattern(actionSource, /import\("playwright-core"\)/),
true,
"Action should use a dynamic import for playwright-core that Convex can detect as an external package",
);
assert.equal(
hasPattern(actionSource, /import\("@sparticuz\/chromium-min"\)/),
true,
"Action should use a dynamic import for @sparticuz/chromium-min as the lightweight browser package",
);
assert.equal(
hasPattern(actionSource, /TASK8_BROWSER_ASSET_URL/),
true,
"Action should reference TASK8_BROWSER_ASSET_URL when loading browser assets",
);
assert.equal(
hasPattern(
actionSource,
/TASK8_BROWSER_ASSET_URL[\s\S]{0,240}(throw|Error|required|missing|not configured|configured|konfiguriert|setze)/i,
),
true,
"Action should surface a clear error when the browser asset URL is not configured",
);
assert.equal(
hasPattern(actionSource, /import\("@sparticuz\/chromium"\)/),
false,
"Action should not import the oversized @sparticuz/chromium package",
);
const externalPackages = JSON.parse(convexConfigSource).node?.externalPackages;
assert.equal(Array.isArray(externalPackages), true, "convex.json should define node.externalPackages");
assert.equal(
externalPackages?.includes("playwright-core"),
true,
"convex.json must include playwright-core in externalPackages",
);
assert.equal(
externalPackages?.includes("@sparticuz/chromium-min"),
true,
"convex.json should include @sparticuz/chromium-min for browser runtime",
);
assert.equal(
externalPackages?.includes("@sparticuz/chromium"),
false,
"convex.json should not include the oversized @sparticuz/chromium package",
);
assert.equal(
hasPattern(actionSource, /serverlessChromium/),
true,
"Runtime bootstrap should still use a serverless Chromium wrapper object for launch config",
);
assert.equal(
hasPattern(actionSource, /devices\["Desktop Chrome"\]/),
true,
"Desktop context should use Playwright Desktop Chrome device profile",
);
assert.equal(
hasPattern(actionSource, /devices\["iPhone 11"\]/),
true,
"Mobile context should use Playwright iPhone 11 device profile",
);
});
test("website enrichment action invalidates stale @sparticuz/chromium-min cache when source changes", () => {
assert.equal(
hasPattern(actionSource, /CHROMIUM_SOURCE_MARKER_FILE/),
true,
"Action should declare a temporary marker file path for Chromium executable source cache tracking.",
);
assert.equal(
hasPattern(
actionSource,
/tmpdir\(\)/,
),
true,
"Action should derive temporary cache paths from os.tmpdir().",
);
assert.equal(
hasPattern(actionSource, /getChromiumSourceMarker\(/),
true,
"Action should hash executable sources into a stable marker.",
);
assert.equal(
hasPattern(actionSource, /clearChromiumCacheForSourceMismatch\(/),
true,
"Action should centralize cache invalidation in a dedicated helper.",
);
assert.equal(
hasPattern(
actionSource,
/rm\(CHROMIUM_EXECUTABLE_PATH,\s*\{ force: true, recursive: true \}\),/,
),
true,
"Action should remove /tmp/chromium when executable source changes.",
);
assert.equal(
hasPattern(
actionSource,
/rm\(CHROMIUM_PACK_PATH,\s*\{ force: true, recursive: true \}\),/,
),
true,
"Action should remove /tmp/chromium-pack when executable source changes.",
);
assert.equal(
hasPattern(
actionSource,
/clearChromiumCacheForSourceMismatch\(executableSource\)[\s\S]*?chromium\.executablePath\(executableSource\)/,
),
true,
"Action should clear stale cache before resolving Chromium executable path.",
);
assert.equal(
hasPattern(
actionSource,
/writeFile\([\s\S]*?CHROMIUM_SOURCE_MARKER_FILE,[\s\S]*?getChromiumSourceMarker\(executableSource\)/,
),
true,
"Action should persist the source marker after executable path resolution.",
);
});
test("website enrichment action prepares Chromium AL2023 shared libraries for Convex runtime", () => {
const hasChromiumHelpers =
(hasPattern(actionSource, /inflate/) &&
hasPattern(actionSource, /setupLambdaEnvironment/)) ||
hasPattern(actionSource, /LD_LIBRARY_PATH/);
assert.equal(
hasChromiumHelpers,
true,
"Action should explicitly prepare chromium-min runtime environment for AL2023 shared libraries to avoid `/tmp/chromium: error while loading shared libraries: libnspr4.so` (inflate/setupLambdaEnvironment or LD_LIBRARY_PATH).",
);
const hasAl2023LibPath =
hasPattern(
actionSource,
/path\.join\(\s*tmpdir\(\),\s*["']al2023["'],\s*["']lib["']\s*\)/,
) ||
(hasPattern(actionSource, /LD_LIBRARY_PATH/) &&
hasPattern(actionSource, /al2023\/lib/));
const referencesRuntimeArchive = hasPattern(actionSource, /al2023\.tar\.br/);
const referencesPackPath = hasPattern(
actionSource,
/CHROMIUM_PACK_PATH/,
);
assert.equal(
referencesRuntimeArchive && referencesPackPath && hasAl2023LibPath,
true,
"Action should reference al2023.tar.br, track CHROMIUM_PACK_PATH, and ensure /tmp/al2023/lib is prepared for Convex launch.",
);
const executableIndex = actionSource.indexOf(
"const executablePath = await resolveChromiumExecutablePath(",
);
const launchIndex = actionSource.indexOf("chromium.launch({");
const setupIndex = Math.max(
actionSource.indexOf("setupLambdaEnvironment("),
actionSource.indexOf("LD_LIBRARY_PATH"),
actionSource.indexOf("path.join(tmpdir(), \"al2023\", \"lib\")"),
);
assert.equal(
executableIndex >= 0 &&
setupIndex > executableIndex &&
setupIndex < launchIndex,
true,
"Executable resolution and AL2023 shared-library setup should happen before chromium launch in the action runtime path.",
);
});
test("processLeadEnrichment wraps Playwright bootstrap in protected try/catch", () => {
assert.equal(
hasPattern(
actionSource,
/try\s*\{[\s\S]*?const \{ playwrightCore, serverlessChromium \}\s*=\s*await loadPlaywrightModules\(\);[\s\S]*?const executablePath = await resolveChromiumExecutablePath\(\s*serverlessChromium,\s*\);[\s\S]*?browser = await playwrightCore\.chromium\.launch\([\s\S]*?executablePath,[\s\S]*?desktopContext = await browser\.newContext\([\s\S]*?mobileContext = await browser\.newContext\(/,
),
true,
"Playwright runtime bootstrap should use resolveChromiumExecutablePath() inside the action's try/catch-protected block",
);
assert.equal(
hasPattern(
actionSource,
/catch\s*\(error\)\s*\{[\s\S]*?finishLeadEnrichmentRun[\s\S]*?runs\.appendEvent[\s\S]*?patchLeadFromWebsiteEnrichment/,
),
true,
"Bootstrap failures should be handled by finish + error event + lead patch in catch",
);
});
test("persistence caps candidates and links before writing", () => {
assert.equal(
hasPattern(actionSource, /MAX_PERSISTED_LINKS\s*=\s*120/),
true,
"Action should define MAX_PERSISTED_LINKS with value 120.",
);
assert.equal(
hasPattern(actionSource, /MAX_PERSISTED_EMAIL_CANDIDATES\s*=\s*40/),
true,
"Action should define MAX_PERSISTED_EMAIL_CANDIDATES with value 40.",
);
assert.equal(
hasPattern(
actionSource,
/deduplicateCrawlLinks\(allLinks\)[\s\S]*?slice\([\s\S]*?MAX_PERSISTED_LINKS/,
),
true,
"Action should dedupe and cap link persistence at MAX_PERSISTED_LINKS.",
);
assert.equal(
hasPattern(
actionSource,
/validCandidates\.slice\([\s\S]*?MAX_PERSISTED_EMAIL_CANDIDATES/,
),
true,
"Action should cap candidate persistence at MAX_PERSISTED_EMAIL_CANDIDATES.",
);
});
test("website enrichment process stores homepage screenshots in Convex storage as PNG", () => {
assert.equal(
hasPattern(actionSource, /ctx\.storage\.store\(/),
true,
"Action should store screenshot blobs via ctx.storage.store",
);
assert.equal(
hasPattern(
actionSource,
/new\s+Blob\(\[[\s\S]*?SCREENSHOT_MIME_TYPE/,
),
true,
"Action should wrap screenshots in Blob with image/png MIME type",
);
});
test("startLeadEnrichmentRun marks missing website lead with contact status reason", () => {
assert.equal(
hasPattern(
websiteEnrichmentSource,
/if \(!lead\.websiteUrl\)\s*\{[\s\S]*?status:\s*"failed"[\s\S]*?contactStatusReason:\s*"Website-URL fehlt für das Website-Enrichment\."/,
),
true,
"Missing websiteUrl should set a specific contactStatusReason on the lead",
);
});
test("website enrichment persistence inserts all required evidence table rows", () => {
const expectedTables = [
"websiteCrawlPages",
"websiteCrawlLinks",
"websiteEmailCandidates",
"websiteCrawlScreenshots",
"websiteTechnicalChecks",
] as const;
for (const tableName of expectedTables) {
assert.equal(
hasPattern(
websiteEnrichmentSource,
new RegExp(`ctx\\.db\\.insert\\(["']${tableName}["']`, "s"),
),
true,
`persistLeadEnrichmentResult should insert into ${tableName}`,
);
}
});
test("website enrichment flow uses TASK-7 email selection helper for lead patching", () => {
assert.equal(
hasPattern(
actionSource,
/getUsableContactEmailFromEntries\([\s\S]*?\)/,
),
true,
"Action should call getUsableContactEmailFromEntries",
);
assert.equal(
hasPattern(
actionSource,
/runMutation\(\s*internal\.websiteEnrichment\.patchLeadFromWebsiteEnrichment[\s\S]*?\{[\s\S]*?email:\s*usable\.email/,
),
true,
"Action should patch lead from usable email result",
);
assert.equal(
hasPattern(
actionSource,
/currentContactStatus\s*:\s*started\.lead\.contactStatus/,
),
true,
"Action should pass lead contact status to patchLeadFromWebsiteEnrichment",
);
assert.equal(
hasPattern(websiteEnrichmentSource, /args\.currentContactStatus\s*===\s*"missing_contact"/),
true,
"Lead patch mutation should only set new status for missing_contact",
);
});
test("failure handling marks run as failed and writes lead-facing reason", () => {
assert.equal(
hasPattern(
actionSource,
/runMutation\(\s*internal\.websiteEnrichment\.finishLeadEnrichmentRun[\s\S]*?status:\s*"failed"/,
),
true,
"Action should persist failed run state on fatal crawl errors",
);
assert.equal(
hasPattern(
actionSource,
/runMutation\(\s*api\.runs\.appendEvent[\s\S]*?level:\s*"error"[\s\S]*?message:\s*"Website-Enrichment fehlgeschlagen/,
),
true,
"Action should append a visible error event on failure",
);
assert.equal(
hasPattern(
actionSource,
/contactStatusReason:\s*`Website-Enrichment fehlgeschlagen:\s*\$\{errorSummary\}`/,
),
true,
"Action should patch the lead with an actionable failure reason",
);
assert.equal(
hasPattern(
actionSource,
/contactStatusReason:\s*"Website-Enrichment fehlgeschlagen: Ungültige Website-URL\."/,
),
true,
"Invalid-url failure should also update lead contact status reason",
);
});
test("website enrichment enforces TASK-8 crawler limits and runtime timeboxes", () => {
assert.equal(
hasPattern(actionSource, /TASK8_CRAWL_TIMEOUT_MS/),
true,
"TASK8_CRAWL_TIMEOUT_MS environment override should be used",
);
assert.equal(
hasPattern(actionSource, /DEFAULT_CRAWL_TIMEOUT_MS\s*=\s*60_000/),
true,
"Default crawl timeout should be 60s",
);
assert.equal(
hasPattern(actionSource, /DEFAULT_CRAWL_MAX_PAGES\s*=\s*5/),
true,
"Default max crawl page count should be 5",
);
});
test("processLeadEnrichment schedules PageSpeed audit jobs after successful enrichment", () => {
const processBody = extractExportSource(actionSource, "processLeadEnrichment");
const persistIndex = processBody.indexOf(
"internal.websiteEnrichment.persistLeadEnrichmentResult",
);
const queueIndex = processBody.indexOf(
"internal.pageSpeed.queueLeadPageSpeedAudit",
persistIndex,
);
const finishIndex = processBody.indexOf(
"internal.websiteEnrichment.finishLeadEnrichmentRun",
persistIndex,
);
assert.notEqual(persistIndex, -1, "processLeadEnrichment should persist website enrichment result");
assert.notEqual(queueIndex, -1, "processLeadEnrichment should queue PageSpeed audits");
assert.notEqual(finishIndex, -1, "processLeadEnrichment should finish enrichment run");
assert.equal(
hasPattern(
processBody,
/runMutation\(\s*internal\.pageSpeed\.queueLeadPageSpeedAudit[\s\S]*leadId:\s*started\.lead\._id[\s\S]*parentRunId:\s*runId[\s\S]*\)/,
),
true,
"Queue call should pass lead ID and parent run ID",
);
assert.equal(queueIndex > persistIndex, true, "PageSpeed queueing should happen after persistence");
assert.equal(queueIndex < finishIndex, true, "PageSpeed queueing should happen before success finish");
});
test("processLeadEnrichment records warning on PageSpeed queue failure and continues", () => {
const processBody = extractExportSource(actionSource, "processLeadEnrichment");
assert.equal(
hasPattern(
processBody,
/try\s*\{[\s\S]*internal\.pageSpeed\.queueLeadPageSpeedAudit[\s\S]*\}\s*catch\s*\([^)]*\)\s*\{[\s\S]*api\.runs\.appendEvent[\s\S]*level:\s*"warning"/,
),
true,
"Queueing PageSpeed should be wrapped in warning-safe try/catch",
);
assert.equal(
hasPattern(
processBody,
/PageSpeed-Analyse konnte nicht in die Warteschlange gesetzt werden\./,
),
true,
"Warning event should describe queue failure",
);
});
test("processLeadEnrichment regression: queue PageSpeed on invalid URL failure when started lead exists", () => {
const processBody = extractExportSource(actionSource, "processLeadEnrichment");
const invalidUrlStart = processBody.indexOf("if (!rootUrl)");
assert.notEqual(invalidUrlStart, -1, "Invalid URL guard should exist");
const invalidUrlReturnNull = processBody.indexOf("return null;", invalidUrlStart);
assert.notEqual(
invalidUrlReturnNull,
-1,
"Invalid URL branch should return null",
);
const queueCallInInvalidUrl = processBody.indexOf(
"internal.pageSpeed.queueLeadPageSpeedAudit",
invalidUrlStart,
);
assert.equal(
queueCallInInvalidUrl > invalidUrlStart && queueCallInInvalidUrl < invalidUrlReturnNull,
true,
"Invalid URL failure path should queue PageSpeed before returning.",
);
const invalidUrlBranch = processBody.slice(invalidUrlStart, invalidUrlReturnNull);
assert.equal(
hasPattern(
invalidUrlBranch,
/leadId:\s*started\.lead\._id[\s\S]*?parentRunId:\s*runId/,
),
true,
"Invalid URL queue payload should use started.lead._id and parentRunId runId.",
);
});
test("processLeadEnrichment regression: queue PageSpeed in fatal catch path with started lead", () => {
const processBody = extractExportSource(actionSource, "processLeadEnrichment");
const outerCatchStart = processBody.lastIndexOf("catch (error)");
assert.notEqual(outerCatchStart, -1, "Outer catch block should exist");
const startedGuard = processBody.indexOf("if (started)", outerCatchStart);
assert.notEqual(startedGuard, -1, "Outer catch should guard lead patch by started check.");
const catchReturnNull = processBody.indexOf("return null;", outerCatchStart);
assert.notEqual(
catchReturnNull,
-1,
"Outer catch should return null on unrecoverable errors.",
);
const queueCallInCatch = processBody.indexOf(
"internal.pageSpeed.queueLeadPageSpeedAudit",
outerCatchStart,
);
assert.equal(
queueCallInCatch > outerCatchStart &&
queueCallInCatch > startedGuard &&
queueCallInCatch < catchReturnNull,
true,
"Fatal catch path should queue PageSpeed before returning, while started lead exists.",
);
const catchBlock = processBody.slice(outerCatchStart, catchReturnNull);
assert.equal(
hasPattern(
catchBlock,
/leadId:\s*started\.lead\._id[\s\S]*?parentRunId:\s*runId/,
),
true,
"Catch-path PageSpeed queue payload should use started.lead._id and parentRunId runId.",
);
});
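
Several tests above assert call order in the action source by comparing `indexOf` results (persist, then queue PageSpeed, then finish). The same idea can be sketched as one small helper. The name `findMarkersInOrder` and the sample source string are hypothetical and not part of this change set; the sketch only illustrates the technique of searching each marker after the previous match.

```typescript
// Hypothetical helper: returns the offset of each marker, searching every marker
// only after the end of the previous match. Throws if a marker is missing or
// appears only before its predecessor.
function findMarkersInOrder(
  source: string,
  markers: readonly string[],
): number[] {
  const positions: number[] = [];
  let searchFrom = 0;
  for (const marker of markers) {
    const index = source.indexOf(marker, searchFrom);
    if (index === -1) {
      throw new Error(
        `Expected "${marker}" at or after offset ${searchFrom}`,
      );
    }
    positions.push(index);
    searchFrom = index + marker.length;
  }
  return positions;
}

// Usage with a made-up source snippet:
const sampleSource = "start(); persist(); queue(); finish();";
const sampleOrder = findMarkersInOrder(sampleSource, ["persist", "queue", "finish"]);
console.log(sampleOrder.length); // one offset per marker
```

Like the tests it mirrors, this is a plain text search: it says nothing about control flow, so a marker inside a comment or an unreachable branch still counts.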

@@ -0,0 +1,163 @@
import assert from "node:assert/strict";
import { readFileSync } from "node:fs";
import { join } from "node:path";
import test from "node:test";
import type { Doc } from "../convex/_generated/dataModel";
import { RUN_EVENT_LEVELS, RUN_STATUSES, RUN_TYPES } from "../convex/domain";
type ExactSetEquality<A, B> = [
Exclude<A, B>,
] extends [never]
? [Exclude<B, A>] extends [never]
? true
: false
: false;
type IsRequired<T> = undefined extends T ? false : true;
type IsOptional<T> = undefined extends T ? true : false;
type AgentRunType = Doc<"agentRuns">["type"];
type AgentRunStatus = Doc<"agentRuns">["status"];
type AgentRunEventLevel = Doc<"agentRunEvents">["level"];
type AssertWebsiteEnrichmentRunType = Extract<AgentRunType, "website_enrichment">;
type RunTypeFromDomain = (typeof RUN_TYPES)[number];
type RunStatusFromDomain = (typeof RUN_STATUSES)[number];
type RunEventLevelFromDomain = (typeof RUN_EVENT_LEVELS)[number];
type AssertLeadCrawlPageKind = Extract<
Doc<"websiteCrawlPages">["pageKind"],
"homepage"
>;
type AssertCrawlViewportDesktop = Extract<
Doc<"websiteCrawlScreenshots">["viewport"],
"desktop"
>;
type AssertCrawlViewportMobile = Extract<
Doc<"websiteCrawlScreenshots">["viewport"],
"mobile"
>;
type AssertNormalizedEmailType = Doc<"websiteEmailCandidates">["normalizedEmail"];
type AssertAcceptedEmailFlag = Doc<"websiteEmailCandidates">["accepted"];
type AssertTechnicalUsesHttps = Doc<"websiteTechnicalChecks">["usesHttps"];
type AssertTechnicalHasVisibleContactPath = Doc<"websiteTechnicalChecks">["hasVisibleContactPath"];
type AssertRunTypeInDomain = "website_enrichment" extends (typeof RUN_TYPES)[number]
? true
: false;
type AssertRunTypeEnumParity = ExactSetEquality<AgentRunType, RunTypeFromDomain>;
type AssertRunStatusEnumParity = ExactSetEquality<
AgentRunStatus,
RunStatusFromDomain
>;
type AssertRunEventLevelEnumParity = ExactSetEquality<
AgentRunEventLevel,
RunEventLevelFromDomain
>;
const schemaSource = readFileSync(
join(process.cwd(), "convex", "schema.ts"),
"utf8",
);
const _assertRunTypeSchemaHasWebsiteEnrichment: AssertWebsiteEnrichmentRunType =
"website_enrichment";
const _assertRunTypeInDomainHasWebsiteEnrichment: AssertRunTypeInDomain = true;
const _assertRunTypeEnumParity: AssertRunTypeEnumParity = true;
const _assertRunStatusEnumParity: AssertRunStatusEnumParity = true;
const _assertRunEventLevelEnumParity: AssertRunEventLevelEnumParity = true;
const _assertPageKindSchemaIncludesHomepage: AssertLeadCrawlPageKind =
"homepage";
const _assertScreenshotViewportTypeDesktop: AssertCrawlViewportDesktop = "desktop";
const _assertScreenshotViewportTypeMobile: AssertCrawlViewportMobile = "mobile";
const _assertRunIdOptionalOnPages: IsOptional<Doc<"websiteCrawlPages">["runId"]> =
true;
const _assertRunIdOptionalOnLinks: IsOptional<Doc<"websiteCrawlLinks">["runId"]> =
true;
const _assertRunIdOptionalOnEmailCandidates: IsOptional<
Doc<"websiteEmailCandidates">["runId"]
> = true;
const _assertRunIdOptionalOnScreenshots: IsOptional<
Doc<"websiteCrawlScreenshots">["runId"]
> = true;
const _assertRunIdOptionalOnTechnicalChecks: IsOptional<
Doc<"websiteTechnicalChecks">["runId"]
> = true;
const _assertPagesHasCreatedAt: IsRequired<Doc<"websiteCrawlPages">["createdAt"]> =
true;
const _assertLinksHasCreatedAt: IsRequired<Doc<"websiteCrawlLinks">["createdAt"]> =
true;
const _assertEmailCandidatesHasCreatedAt: IsRequired<
Doc<"websiteEmailCandidates">["createdAt"]
> = true;
const _assertScreenshotsHasCreatedAt: IsRequired<
Doc<"websiteCrawlScreenshots">["createdAt"]
> = true;
const _assertTechnicalChecksHasCreatedAt: IsRequired<
Doc<"websiteTechnicalChecks">["createdAt"]
> = true;
const _assertWebsiteEmailCandidatesNormalizedEmail: AssertNormalizedEmailType = "user@example.com";
const _assertEmailAcceptedTrue: AssertAcceptedEmailFlag = true;
const _assertEmailAcceptedFalse: AssertAcceptedEmailFlag = false;
const _assertScreenshotStorageIdRequired: IsRequired<
Doc<"websiteCrawlScreenshots">["storageId"]
> = true;
const _assertTechnicalUsesHttpsTrue: AssertTechnicalUsesHttps = true;
const _assertTechnicalUsesHttpsFalse: AssertTechnicalUsesHttps = false;
const _assertTechnicalMissingTitleFalse: Doc<"websiteTechnicalChecks">["missingTitle"] =
false;
const _assertTechnicalMissingMetaDescriptionTrue: Doc<"websiteTechnicalChecks">["missingMetaDescription"] =
true;
const _assertTechnicalHasVisibleContactPathTrue: AssertTechnicalHasVisibleContactPath =
true;
const _assertTechnicalHasVisibleContactPathFalse: AssertTechnicalHasVisibleContactPath =
false;
// Convex index structure cannot be asserted safely from Doc types, so this test validates
// the field contracts and value domains that can be verified at compile time or at runtime.
test("website enrichment schema contracts are present", () => {
assert.equal(_assertRunTypeSchemaHasWebsiteEnrichment, "website_enrichment");
assert.equal(_assertRunTypeInDomainHasWebsiteEnrichment, true);
assert.equal(_assertRunTypeEnumParity, true);
assert.equal(_assertRunStatusEnumParity, true);
assert.equal(_assertRunEventLevelEnumParity, true);
assert.equal(_assertPageKindSchemaIncludesHomepage, "homepage");
assert.equal(_assertScreenshotViewportTypeDesktop, "desktop");
assert.equal(_assertScreenshotViewportTypeMobile, "mobile");
assert.equal(_assertRunIdOptionalOnPages, true);
assert.equal(_assertRunIdOptionalOnLinks, true);
assert.equal(_assertRunIdOptionalOnEmailCandidates, true);
assert.equal(_assertRunIdOptionalOnScreenshots, true);
assert.equal(_assertRunIdOptionalOnTechnicalChecks, true);
assert.equal(_assertPagesHasCreatedAt, true);
assert.equal(_assertLinksHasCreatedAt, true);
assert.equal(_assertEmailCandidatesHasCreatedAt, true);
assert.equal(_assertScreenshotsHasCreatedAt, true);
assert.equal(_assertTechnicalChecksHasCreatedAt, true);
assert.equal(_assertScreenshotStorageIdRequired, true);
assert.equal(_assertWebsiteEmailCandidatesNormalizedEmail, "user@example.com");
assert.equal(_assertEmailAcceptedTrue, true);
assert.equal(_assertEmailAcceptedFalse, false);
assert.equal(_assertTechnicalUsesHttpsTrue, true);
assert.equal(_assertTechnicalUsesHttpsFalse, false);
assert.equal(_assertTechnicalMissingTitleFalse, false);
assert.equal(_assertTechnicalMissingMetaDescriptionTrue, true);
assert.equal(_assertTechnicalHasVisibleContactPathTrue, true);
assert.equal(_assertTechnicalHasVisibleContactPathFalse, false);
});
test("agentRuns schema defines lead-aware active-run index", () => {
assert.equal(
schemaSource.includes('["type", "status", "leadId"]'),
true,
"Schema should include by_type_and_status_and_leadId index fields in order.",
);
assert.equal(
schemaSource.includes('by_type_and_status_and_leadId'),
true,
"Schema should define the by_type_and_status_and_leadId index.",
);
});
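
The `ExactSetEquality` type used in this file compares two string-literal unions in both directions, which is what catches a schema enum and a domain constant drifting apart. A minimal sketch of how it resolves, using made-up status values (`EXAMPLE_STATUSES` and the unions below are assumptions for illustration, not the project's real `RUN_STATUSES`):

```typescript
// Same definition as in the test file above.
type ExactSetEquality<A, B> = [Exclude<A, B>] extends [never]
  ? [Exclude<B, A>] extends [never]
    ? true
    : false
  : false;

// Hypothetical values, chosen only to demonstrate the type.
const EXAMPLE_STATUSES = ["pending", "running", "failed"] as const;
type StatusFromDomain = (typeof EXAMPLE_STATUSES)[number];

type StatusFromSchema = "pending" | "running" | "failed";
type StatusWithExtra = StatusFromSchema | "cancelled";

// Resolves to `true`: both unions have the same members, so only `true` is assignable.
const sameMembers: ExactSetEquality<StatusFromSchema, StatusFromDomain> = true;

// Resolves to `false`: "cancelled" survives Exclude<A, B>, so only `false` is assignable.
const extraMember: ExactSetEquality<StatusWithExtra, StatusFromDomain> = false;
```

The tuple wrapping (`[Exclude<A, B>] extends [never]`) keeps the conditional from distributing over union members, so the result is a single `true` or `false` rather than a union of both.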