webdev-pipeline/backlog/tasks/task-8 - Implement-Playwright-website-crawling-and-screenshot-capture.md at 20615e12a14d29a4b3bb9839bf324d242112c610 - webdev-pipeline

Matthias 762571cb43 chore: add MVP planning backlog

2026-06-03 21:18:36 +02:00

| id | title | status | assignee | created_date | labels | dependencies | references | priority | ordinal |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TASK-8 | Implement Playwright website crawling and screenshot capture | To Do | | 2026-06-03 19:13 | mvp, audit, playwright | TASK-7 | PRD.md | high | 8000 |

Description

Build the website inspection layer using Playwright. For qualified leads, the system should load the company website, inspect the homepage and a small set of relevant subpages, capture desktop/mobile screenshots, extract visible text and contact signals, and store all raw evidence in Convex.
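As a rough illustration of the desktop/mobile capture described above, the sketch below plans one screenshot job per viewport for a lead's homepage. The viewport sizes, the `ScreenshotJob` shape, and the file-naming scheme are assumptions made for this example; the task does not specify them. A crawler would presumably apply each job with Playwright's `page.setViewportSize(...)` and `page.screenshot({ fullPage: true })` before uploading the bytes to Convex.

```typescript
// Hypothetical viewport presets; the task does not prescribe exact sizes.
interface Viewport {
  width: number;
  height: number;
}

type Device = "desktop" | "mobile";

const VIEWPORTS: Record<Device, Viewport> = {
  desktop: { width: 1440, height: 900 },
  mobile: { width: 390, height: 844 },
};

// Hypothetical job description handed to the Playwright worker.
interface ScreenshotJob {
  leadId: string;
  url: string;
  device: Device;
  viewport: Viewport;
  fileName: string; // assumed naming scheme, used only for illustration
}

// Build one job per device for the homepage of a qualified lead.
function planHomepageScreenshots(leadId: string, homepageUrl: string): ScreenshotJob[] {
  const host = new URL(homepageUrl).hostname;
  return (Object.keys(VIEWPORTS) as Device[]).map((device) => ({
    leadId,
    url: homepageUrl,
    device,
    viewport: VIEWPORTS[device],
    fileName: `${leadId}-${host}-${device}.png`,
  }));
}
```

Keeping the plan separate from the browser calls makes the viewport and naming decisions easy to unit-test without launching a browser.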

Acceptance Criteria

#1 Playwright captures desktop and mobile screenshots for the homepage and stores them in Convex File Storage
#2 Crawler visits a bounded set of relevant subpages: Kontakt, Impressum, Leistungen/Angebot, Über uns/Team when discoverable
#3 Crawler extracts visible text, page title, meta description, headings, links, phone numbers, email candidates, and CTA/contact-form signals
#4 Simple technical checks include HTTPS/final URL, missing title/meta description, visible contact path, and obvious broken internal links within the crawl limit
#5 Crawler failures produce useful dashboard-visible errors without blocking unrelated leads
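
Criteria #2 and #3 could be approached with two small pure helpers, sketched below: one classifies a link as a relevant subpage by keyword, the other pulls email and phone candidates out of visible text. The keyword lists, the regular expressions, and all function names are assumptions for illustration, not part of the task. Both are heuristics that will produce false positives (for example, a date can look like a phone number), which is why the results are treated as candidates.

```typescript
type SubpageKind = "contact" | "imprint" | "services" | "about";

// Assumed keyword lists; extend as real sites require.
const SUBPAGE_KEYWORDS: Record<SubpageKind, string[]> = {
  contact: ["kontakt", "contact"],
  imprint: ["impressum", "imprint"],
  services: ["leistungen", "angebot", "services"],
  about: ["über uns", "ueber-uns", "team", "about"],
};

// Classify a link by matching keywords against its href and link text.
function classifyLink(href: string, text: string): SubpageKind | null {
  const haystack = `${href} ${text}`.toLowerCase();
  for (const kind of Object.keys(SUBPAGE_KEYWORDS) as SubpageKind[]) {
    if (SUBPAGE_KEYWORDS[kind].some((kw) => haystack.includes(kw))) {
      return kind;
    }
  }
  return null;
}

// Deliberately loose patterns: they collect candidates, not verified contacts.
const EMAIL_RE = /[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}/gi;
const PHONE_RE = /(?:\+|00)?\d[\d\s()\/.-]{6,}\d/g;

interface ContactCandidates {
  emails: string[];
  phones: string[];
}

// Extract de-duplicated email and phone candidates from visible page text.
function extractContactCandidates(visibleText: string): ContactCandidates {
  const emails = (visibleText.match(EMAIL_RE) || []).map((e) => e.toLowerCase());
  const phones = (visibleText.match(PHONE_RE) || []).map((p) =>
    p.replace(/\s+/g, " ").trim()
  );
  return {
    emails: Array.from(new Set(emails)),
    phones: Array.from(new Set(phones)),
  };
}
```

For example, `classifyLink("/kontakt", "Kontakt")` yields `"contact"`, and a link with no matching keyword yields `null`, so it can be skipped.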

Implementation Plan

Add Playwright runtime setup compatible with local development and Coolify container deployment.
Define crawl limits, viewports, timeout behavior, and allowed same-domain URL rules.
Capture homepage desktop/mobile screenshots and upload to Convex storage.
Discover and inspect relevant subpages with bounded depth.
Persist extracted text, metadata, contact candidates, technical checks, screenshots, and errors.
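
The second and fourth steps of the plan (crawl limits and same-domain URL rules, bounded discovery) might look roughly like the sketch below. The limit values, the `CrawlLimits` shape, and the decision to treat `www.` and the bare host as the same domain are assumptions for illustration. Depth is bounded implicitly here because only links found on the homepage are considered.

```typescript
// Hypothetical limits; the task leaves the actual numbers open.
interface CrawlLimits {
  maxPages: number; // homepage included
  navigationTimeoutMs: number; // would be passed to page.goto(url, { timeout })
}

const DEFAULT_LIMITS: CrawlLimits = { maxPages: 5, navigationTimeoutMs: 15000 };

// Assumption: "www.example.de" and "example.de" count as the same site.
function normalizeHost(hostname: string): string {
  return hostname.toLowerCase().replace(/^www\./, "");
}

// Resolve href against the homepage; return null for anything off-domain
// or non-HTTP (mailto:, tel:, javascript:, malformed URLs).
function resolveSameDomain(href: string, baseUrl: string): string | null {
  let url: URL;
  try {
    url = new URL(href, baseUrl);
  } catch {
    return null;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return null;
  if (normalizeHost(url.hostname) !== normalizeHost(new URL(baseUrl).hostname)) {
    return null;
  }
  url.hash = ""; // fragments point at the same page
  return url.toString();
}

// Homepage first, then unique same-domain links until maxPages is reached.
function selectCrawlTargets(
  hrefs: string[],
  baseUrl: string,
  limits: CrawlLimits = DEFAULT_LIMITS
): string[] {
  const home = new URL(baseUrl);
  home.hash = "";
  const targets: string[] = [home.toString()];
  const seen = new Set<string>(targets);
  for (const href of hrefs) {
    if (targets.length >= limits.maxPages) break;
    const resolved = resolveSameDomain(href, baseUrl);
    if (resolved !== null && !seen.has(resolved)) {
      seen.add(resolved);
      targets.push(resolved);
    }
  }
  return targets;
}
```

In practice the `hrefs` list would likely be filtered to relevant subpages first, so the page budget is spent on Kontakt, Impressum, and similar pages rather than on arbitrary links.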
