chore: add MVP planning backlog
@@ -0,0 +1,42 @@
---
id: TASK-8
title: Implement Playwright website crawling and screenshot capture
status: To Do
assignee: []
created_date: '2026-06-03 19:13'
labels:
- mvp
- audit
- playwright
dependencies:
- TASK-7
references:
- PRD.md
priority: high
ordinal: 8000
---

## Description

<!-- SECTION:DESCRIPTION:BEGIN -->
Build the website inspection layer using Playwright. For qualified leads, the system should load the company website, inspect the homepage and a small set of relevant subpages, capture desktop/mobile screenshots, extract visible text and contact signals, and store all raw evidence in Convex.
<!-- SECTION:DESCRIPTION:END -->

## Acceptance Criteria
<!-- AC:BEGIN -->
- [ ] #1 Playwright captures desktop and mobile screenshots for the homepage and stores them in Convex File Storage
- [ ] #2 Crawler visits a bounded set of relevant subpages: Kontakt, Impressum, Leistungen/Angebot, Über uns/Team when discoverable
- [ ] #3 Crawler extracts visible text, page title, meta description, headings, links, phone numbers, email candidates, and CTA/contact-form signals
- [ ] #4 Simple technical checks include HTTPS/final URL, missing title/meta description, visible contact path, and obvious broken internal links within the crawl limit
- [ ] #5 Crawler failures produce useful dashboard-visible errors without blocking unrelated leads
<!-- AC:END -->
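
Criteria #2 and #3 could be approached roughly as sketched below. This is only an illustration under stated assumptions: the names `selectSubpages`, `extractContactCandidates`, `SubpageKind`, and `LinkCandidate` are hypothetical, the keyword patterns are guesses at what "relevant" means for German sites, and the phone pattern is deliberately loose, so its matches are candidates to be reviewed rather than validated numbers.

```typescript
// Hypothetical sketch: classify discovered links into the subpage kinds from
// criterion #2 and pull contact candidates (criterion #3) out of visible text.
type SubpageKind = "contact" | "imprint" | "services" | "about";

interface LinkCandidate {
  href: string; // raw href attribute, possibly relative
  text: string; // visible link text
}

// Assumed keyword patterns; tune against real sites.
const SUBPAGE_PATTERNS: Record<SubpageKind, RegExp> = {
  contact: /kontakt|contact/iu,
  imprint: /impressum|imprint/iu,
  services: /leistungen|angebot|services/iu,
  about: /über[\s_-]*uns|ueber[\s_-]*uns|team|about/iu,
};

function safeDecode(path: string): string {
  try {
    return decodeURIComponent(path);
  } catch {
    return path; // malformed percent-encoding: fall back to the raw path
  }
}

// Returns at most one same-host URL per subpage kind, so the set stays bounded.
function selectSubpages(baseUrl: string, links: LinkCandidate[]): Map<SubpageKind, string> {
  const base = new URL(baseUrl);
  const picked = new Map<SubpageKind, string>();
  for (const link of links) {
    let url: URL;
    try {
      url = new URL(link.href, base);
    } catch {
      continue; // unparseable href
    }
    if (url.protocol !== "http:" && url.protocol !== "https:") continue; // skips mailto:, tel:
    if (url.hostname !== base.hostname) continue; // strict same-host rule (simplification)
    url.hash = "";
    const haystack = `${safeDecode(url.pathname)} ${link.text}`;
    for (const kind of Object.keys(SUBPAGE_PATTERNS) as SubpageKind[]) {
      if (!picked.has(kind) && SUBPAGE_PATTERNS[kind].test(haystack)) {
        picked.set(kind, url.toString());
        break;
      }
    }
  }
  return picked;
}

const EMAIL_RE = /[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}/gi;
// Loose on purpose: will also match some dates or IDs, hence "candidates".
const PHONE_RE = /(?:\+|00)?\d[\d\s()\/.-]{6,}\d/g;

function extractContactCandidates(text: string): { emails: string[]; phones: string[] } {
  const emails = Array.from(new Set((text.match(EMAIL_RE) ?? []).map((e) => e.toLowerCase())));
  const phones = Array.from(
    new Set((text.match(PHONE_RE) ?? []).map((p) => p.replace(/\s+/g, " ").trim())),
  );
  return { emails, phones };
}
```

In a real crawler the `links` array and the `text` argument would presumably come from the loaded Playwright page; keeping these helpers free of browser state makes them easy to unit-test on their own.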

## Implementation Plan

<!-- SECTION:PLAN:BEGIN -->
1. Add Playwright runtime setup compatible with local development and Coolify container deployment.
2. Define crawl limits, viewports, timeout behavior, and allowed same-domain URL rules.
3. Capture homepage desktop/mobile screenshots and upload to Convex storage.
4. Discover and inspect relevant subpages with bounded depth.
5. Persist extracted text, metadata, contact candidates, technical checks, screenshots, and errors.
<!-- SECTION:PLAN:END -->
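
Steps 2 and 4 of the plan might look something like the following sketch. Everything here is an assumption made for illustration: `CrawlConfig`, `DEFAULT_CRAWL_CONFIG`, `resolveSameSite`, and `planCrawl` are hypothetical names, the numeric limits and viewport sizes are placeholders rather than agreed values, and treating `www.` as the same site is one possible reading of "same-domain". Link discovery is passed in as a synchronous callback to keep the example self-contained; an actual crawler would be asynchronous and drive a Playwright page.

```typescript
// Hypothetical sketch of crawl limits plus a breadth-first, bounded crawl plan.
interface Viewport {
  width: number;
  height: number;
}

interface CrawlConfig {
  maxPages: number; // hard cap on pages visited per lead
  maxDepth: number; // link hops from the homepage
  navigationTimeoutMs: number; // per-page navigation timeout
  viewports: { desktop: Viewport; mobile: Viewport };
}

// Placeholder values, not taken from the task.
const DEFAULT_CRAWL_CONFIG: CrawlConfig = {
  maxPages: 5,
  maxDepth: 1,
  navigationTimeoutMs: 15_000,
  viewports: {
    desktop: { width: 1280, height: 800 },
    mobile: { width: 390, height: 844 },
  },
};

const normalizeHost = (host: string): string => host.toLowerCase().replace(/^www\./, "");

// Resolves href against the current page and returns a fragment-free URL,
// or null when the link is unparseable, non-HTTP(S), or off-site.
function resolveSameSite(href: string, pageUrl: string, startHost: string): string | null {
  let url: URL;
  try {
    url = new URL(href, pageUrl);
  } catch {
    return null;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return null;
  if (normalizeHost(url.hostname) !== normalizeHost(startHost)) return null;
  url.hash = "";
  return url.toString();
}

// Breadth-first traversal that stops at maxPages and never expands pages
// that are already at maxDepth.
function planCrawl(
  startUrl: string,
  linksOf: (url: string) => string[],
  limits: Pick<CrawlConfig, "maxPages" | "maxDepth">,
): Array<{ url: string; depth: number }> {
  const start = new URL(startUrl);
  start.hash = "";
  const visited: Array<{ url: string; depth: number }> = [];
  const seen = new Set<string>([start.toString()]);
  const queue: Array<{ url: string; depth: number }> = [{ url: start.toString(), depth: 0 }];

  while (queue.length > 0 && visited.length < limits.maxPages) {
    const current = queue.shift()!;
    visited.push(current);
    if (current.depth >= limits.maxDepth) continue;
    for (const href of linksOf(current.url)) {
      const next = resolveSameSite(href, current.url, start.hostname);
      if (next === null || seen.has(next)) continue;
      seen.add(next);
      queue.push({ url: next, depth: current.depth + 1 });
    }
  }
  return visited;
}
```

Separating the traversal policy from page loading means the limits can be tested against a fake site map before any browser is involved.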