Files
pitchfast/backlog/tasks/task-47 - Fix-evidence-verifier-audit-generation-failure.md

3.5 KiB

id, title, status, assignee, created_date, updated_date, labels, dependencies, priority, ordinal
id title status assignee created_date updated_date labels dependencies priority ordinal
TASK-47 Fix evidence verifier audit generation failure In Progress
2026-06-08 09:35 2026-06-08 10:07
high 49000

Description

Diagnose and fix the evidenceVerifier stage failure in the Convex specialist fan-out audit pipeline so live audit generation can complete or fail with actionable verifier diagnostics.

Acceptance Criteria

  • #1 Root cause is identified from persisted run or generation evidence
  • #2 Evidence verifier schema or prompt no longer fails on valid specialist outputs
  • #3 Audit generation preserves strict evidence gates without schema-induced false failures
  • #4 Focused and full regression tests pass

Implementation Plan

  1. Pull the failing evidenceVerifier error details from Convex run/generation records
  2. Add a RED regression test for the root cause
  3. Fix the verifier schema/prompt or fallback behavior at the source
  4. Run focused fan-out tests and full pnpm test
  5. Record verification notes and keep task In Progress until user confirms live audit works

Implementation Notes

Root cause from Convex auditGenerations/agentRunEvents: all specialist structured-output calls failed before content generation because Azure rejected the response_format schema. The shared evidenceRef object declared sourceUrl as an optional property, but Azure/OpenAI strict structured outputs require every declared property to be listed in required. The verifier then received an empty findings array and failed on the same schema issue. Fix: made Specialist/Verifier output schemas strict-output compatible by requiring sourceUrl and required array fields, added explicit prompt guidance for sourceUrl/status/findings/notes, and replaced rejectedFindings with a narrow rejection schema so unknown/generic rejected claims do not have to pass the publishable finding schema. Verification: RED test reproduced schema.findings[].evidenceRefs[].sourceUrl missing from required; focused schema tests now pass; fan-out/persistence/UI tests pass; pnpm test passes 386/386; git diff --check passes; ESLint on touched source/test files passes.

Second live failure root cause: after the strict schema fix, specialist stages succeeded, but evidenceVerifier failed with "No object generated: could not parse the response." The persisted verifier prompt contained about 10 full specialist findings and the verifier schema required echoing full verifiedFindings objects back. With the classification profile capped at 1200 output tokens, this made verifier output too large/fragile to parse. Context7 AI SDK docs confirmed AI SDK 6 uses strict OpenAI JSON schema behavior by default; the issue was now output shape/size rather than schema rejection. Fix: changed evidenceVerifier output to compact verifiedFindingIds plus small rejected decisions, then deterministically map accepted IDs back to original specialist findings in the action. This preserves strict evidence gates while removing verifier echoing/mutation of findings. Verification: added RED schema regression for compact verifier IDs and many findings; focused schema/action tests pass; adjacent audit persistence/schema/UI/evidence tests pass; pnpm test passes 387/387; git diff --check passes; ESLint on touched files passes; npx convex dev --once synced the fix to dev deployment.