finanzen/.agents/skills/convex-performance-audit/references/hot-path-rules.md
2026-06-15 11:33:23 +02:00

Hot Path Rules

Use these rules when the top-level workflow points to read amplification, denormalization, index rollout, reactive query cost, or invalidation-heavy writes.

Contents

  • Core Principle
  • Consistency Rule
    1. Push Filters To Storage (indexes, migration rule, redundant indexes)
    2. Minimize Data Sources (denormalization, fallback rule)
    3. Minimize Row Size (digest tables)
    4. Isolate Frequently-Updated Fields
    5. Match Consistency To Read Patterns (high-read/low-write, high-read/high-write)
  • Convex-Specific Notes (reactive queries, point-in-time reads, triggers, aggregates, backfills)
  • Verification

Core Principle

Every byte read or written multiplies with concurrency.

Think:

cost x calls_per_second x 86400 (seconds per day)

In Convex, every write can also fan out into reactive invalidation, replication work, and downstream sync.
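
As a rough worked example of the formula above, the sketch below turns a per-call cost into a daily total. The helper name and the sample numbers are illustrative assumptions, not measurements.

```typescript
// Hypothetical helper: daily cost of a hot path, in the same unit as costPerCall.
const SECONDS_PER_DAY = 86_400;

function dailyCost(costPerCall: number, callsPerSecond: number): number {
  return costPerCall * callsPerSecond * SECONDS_PER_DAY;
}

// Assumed numbers: a 3,000 byte read at 50 calls per second.
const bytesPerDay = dailyCost(3_000, 50);
console.log(bytesPerDay); // 12960000000 bytes, roughly 13 GB per day
```

Shaving even a few hundred bytes off a path at this call rate changes the daily total by gigabytes, which is why small per-call savings matter on hot paths.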

Consistency Rule

If you fix a hot-path pattern for one function, audit sibling functions touching the same tables for the same pattern.

Do this especially for:

  • multiple list queries over the same table
  • multiple writers to the same table
  • public browse and search queries over the same records
  • helper functions reused by more than one endpoint

1. Push Filters To Storage

Both JavaScript .filter() and the Convex query .filter() method run after the DB scan, so you have already paid for the read. The Convex .filter() method performs the same as filtering in JS; it does not push the predicate to the storage layer. Only .withIndex() and .withSearchIndex() actually reduce the documents scanned.

Prefer:

  • .withIndex(...)
  • .withSearchIndex(...) for text search
  • narrower tables
  • summary tables

before accepting a scan-plus-filter pattern.

// Bad: scans then filters in JavaScript
export const listOpen = query({
  args: {},
  handler: async (ctx) => {
    const tasks = await ctx.db.query("tasks").collect();
    return tasks.filter((task) => task.status === "open");
  },
});

// Also bad: Convex .filter() does not push to storage either
export const listOpen = query({
  args: {},
  handler: async (ctx) => {
    return await ctx.db
      .query("tasks")
      .filter((q) => q.eq(q.field("status"), "open"))
      .collect();
  },
});

// Good: use an index so storage does the filtering
export const listOpen = query({
  args: {},
  handler: async (ctx) => {
    return await ctx.db
      .query("tasks")
      .withIndex("by_status", (q) => q.eq("status", "open"))
      .collect();
  },
});

Migration rule for indexes

New indexes on partially backfilled fields can create correctness bugs during rollout.

Important Convex detail:

undefined !== false

If an older document is missing a field entirely, it will not match a compound index entry that expects false.

Do not trust old comments saying a field is "not backfilled" or "already backfilled". Verify.

If correctness depends on handling old and new states during rollout, do not improvise a partial-backfill workaround in the hot path. Use a migration-safe rollout and consult skills/convex-migration-helper/SKILL.md.

// Bad: optional booleans can miss older rows where the field is undefined
const projects = await ctx.db
  .query("projects")
  .withIndex("by_archived_and_updated", (q) => q.eq("isArchived", false))
  .order("desc")
  .take(20);

// Good: switch hot-path reads only after the rollout is migration-safe
// See the migration helper skill for dual-read / backfill / cutover patterns.
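
The undefined !== false detail can be reproduced in plain TypeScript. This is a minimal in-memory model, not a Convex query: the Project type and the matchesArchivedFalse predicate are hypothetical names, and the strict equality check stands in for the index equality match described above.

```typescript
// Hypothetical document shape with an optional, partially backfilled field.
type Project = { name: string; isArchived?: boolean };

const projects: Project[] = [
  { name: "new-active", isArchived: false },
  { name: "new-archived", isArchived: true },
  { name: "old-unbackfilled" }, // field missing entirely: isArchived is undefined
];

// Stand-in for an equality match on false: undefined is not equal to false.
const matchesArchivedFalse = (p: Project): boolean => p.isArchived === false;

const matched = projects.filter(matchesArchivedFalse).map((p) => p.name);
console.log(matched); // ["new-active"]; "old-unbackfilled" is silently dropped
```

The old row is not archived, yet it disappears from the result. That is the correctness bug to rule out before switching a hot-path read to the new index.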

Check for redundant indexes

Indexes like by_foo and by_foo_and_bar are usually redundant. You only need by_foo_and_bar, since you can query it with just the foo condition and omit bar. Extra indexes add storage cost and write overhead on every insert, patch, and delete.

// Bad: two indexes where one would do
defineTable({ team: v.id("teams"), user: v.id("users") })
  .index("by_team", ["team"])
  .index("by_team_and_user", ["team", "user"]);

// Good: single compound index serves both query patterns
defineTable({ team: v.id("teams"), user: v.id("users") }).index(
  "by_team_and_user",
  ["team", "user"],
);

Exception: .index("by_foo", ["foo"]) is really an index on foo + _creationTime, while .index("by_foo_and_bar", ["foo", "bar"]) is on foo + bar + _creationTime. If you need results sorted by foo then _creationTime, you need the single-field index because the compound one would sort by bar first.
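
The ordering difference in this exception can be seen by sorting the same rows by the two key tuples. This is an in-memory sketch with made-up rows; the comparators only imitate the key order described above and are not Convex APIs.

```typescript
type Row = { foo: string; bar: number; _creationTime: number };

const rows: Row[] = [
  { foo: "a", bar: 2, _creationTime: 1 },
  { foo: "a", bar: 1, _creationTime: 2 },
  { foo: "a", bar: 3, _creationTime: 3 },
];

// Key order of by_foo: foo, then _creationTime.
const bySingleField = (x: Row, y: Row): number =>
  x.foo.localeCompare(y.foo) || x._creationTime - y._creationTime;

// Key order of by_foo_and_bar: foo, then bar, then _creationTime.
const byCompound = (x: Row, y: Row): number =>
  x.foo.localeCompare(y.foo) || x.bar - y.bar || x._creationTime - y._creationTime;

const singleOrder = [...rows].sort(bySingleField).map((r) => r._creationTime);
const compoundOrder = [...rows].sort(byCompound).map((r) => r._creationTime);

console.log(singleOrder); // [1, 2, 3]
console.log(compoundOrder); // [2, 1, 3]
```

With the same foo value, the compound key returns rows in bar order, so creation order is only preserved within each bar value.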

2. Minimize Data Sources

Trace every read.

If a function resolves a foreign key for a tiny display field and a denormalized copy already exists, prefer the denormalized field on the hot path.

When to denormalize

Denormalize when all of these are true:

  • the path is hot
  • the joined document is much larger than the field you need
  • many readers are paying that join cost repeatedly

Useful mental model:

join_cost = rows_per_page x foreign_doc_size x pages_per_second

Small-table joins are often fine. Large-document joins for tiny fields on hot list pages are usually not.
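
A quick calculation makes the mental model concrete. The function name and all numbers below are assumptions chosen for illustration.

```typescript
// Hypothetical helper mirroring the join_cost formula above (bytes per second).
function joinCost(
  rowsPerPage: number,
  foreignDocSize: number,
  pagesPerSecond: number,
): number {
  return rowsPerPage * foreignDocSize * pagesPerSecond;
}

// Assumed: 20 rows per page, 10 pages per second.
const liveJoin = joinCost(20, 3_000, 10); // reading a 3,000 byte owner document per row
const denormalized = joinCost(20, 30, 10); // reading a 30 byte copied name per row

console.log(liveJoin); // 600000
console.log(denormalized); // 6000
```

Under these assumptions the join reads 100 times more bytes per second than the denormalized field, for the same rendered output.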

Fallback rule

Denormalized data is an optimization. Live data is the correctness path.

Rules:

  • If the denormalized field is missing or null, fall back to the live read
  • Do not show placeholders instead of falling back
  • In lookup maps, only include fully populated entries

// Bad: missing denormalized data becomes a placeholder and blocks correctness
const ownerName = project.ownerName ?? "Unknown owner";

// Good: denormalized data is an optimization, not the only source of truth
const ownerName =
  project.ownerName ?? (await ctx.db.get(project.ownerId))?.name ?? null;

Bad lookup map pattern:

const ownersById = {
  [project.ownerId]: { ownerName: null },
};

That blocks fallback because the map says "I have data" when it does not.

Good lookup map pattern:

const ownersById =
  project.ownerName !== undefined && project.ownerName !== null
    ? { [project.ownerId]: { ownerName: project.ownerName } }
    : {};
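
The fallback rule can be exercised without a database. In this sketch, getOwner is a hypothetical synchronous stand-in for the awaited ctx.db.get(project.ownerId) call, and liveReads counts how often the live path runs.

```typescript
type Project = { ownerId: string; ownerName?: string | null };
type Owner = { name: string };

// Hypothetical in-memory "users" table.
const owners: Record<string, Owner> = {
  u1: { name: "Ada" },
  u2: { name: "Grace" },
};

let liveReads = 0;

// Stand-in for the live read; a real handler would await ctx.db.get(...).
function getOwner(ownerId: string): Owner | null {
  liveReads += 1;
  return owners[ownerId] ?? null;
}

function resolveOwnerName(project: Project): string | null {
  // Fast path: use the denormalized copy only when it is fully populated.
  if (project.ownerName !== undefined && project.ownerName !== null) {
    return project.ownerName;
  }
  // Correctness path: fall back to the live document, never to a placeholder.
  return getOwner(project.ownerId)?.name ?? null;
}

console.log(resolveOwnerName({ ownerId: "u1", ownerName: "Ada" })); // "Ada", no live read
console.log(resolveOwnerName({ ownerId: "u2" })); // "Grace", one live read
```

Returning null only when the live read also finds nothing keeps "missing" distinct from "not backfilled yet".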

No denormalized copy yet

Prefer adding fields to an existing summary, companion, or digest table instead of bloating the primary hot-path table.

If introducing the new field or table requires a staged rollout, backfill, or old/new-shape handling, use the migration helper skill for the rollout plan.

Rollout order:

  1. Update schema
  2. Update write path
  3. Backfill
  4. Switch read path

3. Minimize Row Size

Hot list pages should read the smallest document shape that still answers the UI.

Prefer summary or digest tables over full source tables when:

  • the list page only needs a subset of fields
  • source documents are large
  • the query is high volume

An 800-byte summary row is materially cheaper than a 3 KB full document on a hot page.

Digest tables are a tradeoff, not a default:

  • Worth it when the path is clearly hot, the source rows are much larger than the UI needs, or many readers are repeatedly paying the same join and payload cost
  • Probably not worth it when an indexed read on the source table is already cheap enough, the table is still small, or the extra write and migration complexity would dominate the benefit

// Bad: list page reads source docs, then joins owner data per row
const projects = await ctx.db
  .query("projects")
  .withIndex("by_public", (q) => q.eq("isPublic", true))
  .collect();

// Good: list page reads the smaller digest shape first
const projects = await ctx.db
  .query("projectDigests")
  .withIndex("by_public_and_updated", (q) => q.eq("isPublic", true))
  .order("desc")
  .take(20);
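
One way to keep a digest table in sync is a small projection function called from every writer of the source table. The sketch below assumes hypothetical ProjectDoc and ProjectDigest shapes; the field names are illustrative and not taken from a real schema.

```typescript
// Hypothetical source document: the list page never renders `description`.
type ProjectDoc = {
  _id: string;
  name: string;
  ownerName: string;
  isPublic: boolean;
  updatedAt: number;
  description: string;
};

// Hypothetical digest row: only what the list page needs.
type ProjectDigest = {
  projectId: string;
  name: string;
  ownerName: string;
  isPublic: boolean;
  updatedAt: number;
};

// Project the full document down to the digest shape.
function toDigest(doc: ProjectDoc): ProjectDigest {
  return {
    projectId: doc._id,
    name: doc.name,
    ownerName: doc.ownerName,
    isPublic: doc.isPublic,
    updatedAt: doc.updatedAt,
  };
}

const fullDoc: ProjectDoc = {
  _id: "p1",
  name: "Budget 2026",
  ownerName: "Ada",
  isPublic: true,
  updatedAt: 1_700_000_000_000,
  description: "x".repeat(3_000),
};

// JSON length is only a rough proxy for stored document size.
const fullSize = JSON.stringify(fullDoc).length;
const digestSize = JSON.stringify(toDigest(fullDoc)).length;
console.log(digestSize < fullSize); // true
```

Keeping the projection in one function makes it easier to apply the same shape from sibling writers and from the backfill.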

4. Isolate Frequently-Updated Fields

Convex already no-ops unchanged writes. The invalidation problem here is real writes hitting documents that many queries subscribe to.

Move high-churn fields like lastSeen, counters, presence, or ephemeral status off widely-read documents when most readers do not need them.

Apply this across sibling writers too. Splitting one write path does not help much if three other mutations still update the same widely-read document.

// Bad: every presence heartbeat invalidates subscribers to the whole profile
await ctx.db.patch(user._id, {
  name: args.name,
  avatarUrl: args.avatarUrl,
  lastSeen: Date.now(),
});

// Good: keep profile reads stable, move heartbeat updates to a separate document
await ctx.db.patch(user._id, {
  name: args.name,
  avatarUrl: args.avatarUrl,
});

await ctx.db.patch(presence._id, {
  lastSeen: Date.now(),
});
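
The effect of the split can be illustrated with a toy invalidation model. This is a deliberate simplification that tracks read sets by document id only; the Subscription type, the write function, and the ids are all hypothetical.

```typescript
// Toy model: a subscription is invalidated when a document it read is written.
type Subscription = { name: string; readSet: Set<string>; invalidations: number };

function write(subs: Subscription[], docId: string): void {
  for (const sub of subs) {
    if (sub.readSet.has(docId)) sub.invalidations += 1;
  }
}

const profileCard: Subscription = {
  name: "profileCard",
  readSet: new Set(["users:1"]),
  invalidations: 0,
};
const onlineBadge: Subscription = {
  name: "onlineBadge",
  readSet: new Set(["presence:1"]),
  invalidations: 0,
};
const subs = [profileCard, onlineBadge];

// 100 heartbeats go to the separate presence document.
for (let i = 0; i < 100; i++) write(subs, "presence:1");
// One real profile edit.
write(subs, "users:1");

console.log(profileCard.invalidations); // 1
console.log(onlineBadge.invalidations); // 100
```

Had the heartbeats been written to users:1, the profile card would have re-run 101 times for a single visible change.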

5. Match Consistency To Read Patterns

Choose read strategy based on traffic shape.

High-read, low-write

Examples:

  • public browse pages
  • search results
  • landing pages
  • directory listings

Prefer:

  • point-in-time reads where appropriate
  • explicit refresh
  • local state for pagination
  • caching where appropriate

Do not treat subscriptions as automatically wrong here. Prefer point-in-time reads only when the product does not need live freshness and the reactive cost is material. See subscription-cost.md for detailed patterns.

High-read, high-write

Examples:

  • collaborative editors
  • live dashboards
  • presence-heavy views

Reactive queries may be worth the ongoing cost.

Convex-Specific Notes

Reactive queries

Every ctx.db.get() and ctx.db.query() contributes to the invalidation set for the query.

On the client:

  • useQuery creates a live subscription
  • usePaginatedQuery creates a live subscription per page

For low-freshness flows, consider replacing a live subscription with a point-in-time read, but only when the product does not need updates pushed automatically.

Point-in-time reads

Framework helpers, server-rendered fetches, or one-shot client reads can avoid ongoing subscription cost when live updates are not useful.

Use them for:

  • aggregate snapshots
  • reports
  • low-churn listings
  • pages where explicit refresh is fine

Triggers and fan-out

Triggers fire on every write, including writes that did not materially change the document.

When a write exists only to keep derived state in sync:

  • diff before patching
  • move expensive non-blocking work to ctx.scheduler.runAfter when appropriate
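
"Diff before patching" can be as small as a shallow field comparison. The changedFields helper below is a hypothetical sketch: it compares with strict equality, so nested objects and arrays would need a deep comparison instead.

```typescript
type Fields = Record<string, unknown>;

// Return only the fields in `next` whose values differ from `current` (shallow).
function changedFields(current: Fields, next: Fields): Fields {
  const patch: Fields = {};
  for (const key of Object.keys(next)) {
    if (next[key] !== current[key]) {
      patch[key] = next[key];
    }
  }
  return patch;
}

// Assumed derived-state row kept in sync by a trigger-style write.
const summary = { taskCount: 3, lastTitle: "Ship it" };

const noChange = changedFields(summary, { taskCount: 3, lastTitle: "Ship it" });
const oneChange = changedFields(summary, { taskCount: 4, lastTitle: "Ship it" });

console.log(Object.keys(noChange).length); // 0, so skip the patch entirely
console.log(oneChange); // { taskCount: 4 }
```

A caller would issue the patch only when the returned object is non-empty, which avoids firing downstream work for writes that change nothing.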

Aggregates

Reactive global counts invalidate frequently on busy tables.

Prefer:

  • one-shot aggregate fetches
  • periodic recomputation
  • precomputed summary rows

for global stats that do not need live updates every second.

Backfills

For larger backfills, use cursor-based, self-scheduling internalMutation jobs or the migrations component.

Deploy code that can handle both states before running the backfill.

During the gap:

  • writes should populate the new shape
  • reads should fall back safely
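
The cursor-based batching idea can be sketched in memory. In a real job each batch would presumably be its own internalMutation that schedules the next run; here a plain loop stands in for that scheduling, and the Doc shape, backfillBatch, and runBackfill are hypothetical names.

```typescript
type Doc = { id: number; ownerName?: string };

// Process one batch starting at `cursor`; skip rows that are already populated
// so the job is safe to re-run.
function backfillBatch(
  docs: Doc[],
  cursor: number,
  batchSize: number,
): { nextCursor: number; isDone: boolean } {
  const end = Math.min(cursor + batchSize, docs.length);
  for (const doc of docs.slice(cursor, end)) {
    if (doc.ownerName === undefined) {
      doc.ownerName = `owner-${doc.id}`;
    }
  }
  return { nextCursor: end, isDone: end >= docs.length };
}

// Stand-in for self-scheduling: each loop iteration represents one scheduled run.
function runBackfill(docs: Doc[], batchSize: number): number {
  let cursor = 0;
  let batches = 0;
  let done = docs.length === 0;
  while (!done) {
    const result = backfillBatch(docs, cursor, batchSize);
    cursor = result.nextCursor;
    done = result.isDone;
    batches += 1;
  }
  return batches;
}

const docs: Doc[] = Array.from({ length: 7 }, (_, i) => ({ id: i }));
docs[2] = { id: 2, ownerName: "already-set" }; // written by new code during the gap

console.log(runBackfill(docs, 3)); // 3 batches for 7 documents
```

Because the batch skips populated rows, documents written in the new shape during the gap are left untouched.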

Verification

Before closing the audit, confirm:

  1. Same results as before, no dropped records
  2. The removed table or lookup is no longer in the hot-path read set
  3. Tests or validation cover fallback behavior
  4. Migration safety is preserved while fields or indexes are unbackfilled
  5. Sibling functions were fixed consistently
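
For the first check, a small comparison of document ids from the old and new read paths is often enough to catch dropped records. The diffIds helper and the sample ids below are hypothetical.

```typescript
// Compare the ids returned by the old and new read paths.
function diffIds(
  before: string[],
  after: string[],
): { dropped: string[]; added: string[] } {
  const beforeSet = new Set(before);
  const afterSet = new Set(after);
  return {
    dropped: before.filter((id) => !afterSet.has(id)),
    added: after.filter((id) => !beforeSet.has(id)),
  };
}

// Assumed sample: the new indexed query misses an unbackfilled row.
const oldPath = ["p1", "p2", "p3"];
const newPath = ["p1", "p3"];

console.log(diffIds(oldPath, newPath)); // { dropped: ["p2"], added: [] }
```

An empty dropped list on representative data is a necessary signal, not a proof; it should be paired with the fallback tests from item 3.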