Infrastructure

Incremental Search Indexing for Product Workspaces

By Journal, June 11, 2026

Search quality depends on freshness as much as ranking. In a collaborative workspace, documents, comments, tasks, and attachments can change quickly, so a full reindex after every write is too expensive.

This sample post describes an incremental indexing loop that favors small jobs, durable checkpoints, and idempotent workers.

Architecture sketch

Writer service
    |
    v
Change log table ---> Index queue ---> Index worker ---> Search backend
    |                                      |
    +----------- checkpoint store <--------+

Each write appends a change record to the change log. Workers consume ordered ranges of that log, record their progress in the checkpoint store, and update the search backend with the latest projection of each changed entity.

Change record

type ChangeRecord = {
  sequence: number;
  workspaceId: string;
  entityType: 'document' | 'task' | 'comment';
  entityId: string;
  operation: 'upsert' | 'delete';
  committedAt: string;
};
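
Because only the latest projection of an entity needs to reach the search backend, a worker can collapse a batch before applying it. The following is a minimal sketch of that idea, not part of the pipeline described here: `coalesceChanges` is a hypothetical helper, and the type is repeated so the snippet stands alone.

```typescript
// Same shape as the change record above, repeated so the sketch is self-contained.
type ChangeRecord = {
  sequence: number;
  workspaceId: string;
  entityType: 'document' | 'task' | 'comment';
  entityId: string;
  operation: 'upsert' | 'delete';
  committedAt: string;
};

// Hypothetical helper: keep only the newest change per entity in a batch.
function coalesceChanges(changes: ChangeRecord[]): ChangeRecord[] {
  const latest = new Map<string, ChangeRecord>();
  for (const change of changes) {
    const key = `${change.workspaceId}:${change.entityType}:${change.entityId}`;
    const seen = latest.get(key);
    if (seen === undefined || change.sequence > seen.sequence) {
      latest.set(key, change);
    }
  }
  // Return survivors in sequence order. The highest sequence in the batch
  // always survives, so the caller can still checkpoint on it.
  return Array.from(latest.values()).sort((a, b) => a.sequence - b.sequence);
}
```

An upsert followed by a delete of the same entity collapses to the delete alone, which saves one backend write without changing the final index state.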

Worker loop

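// Index one workspace from its last checkpoint until the change log is drained.
// `checkpoints`, `changeLog`, and `applyChange` are assumed to be provided elsewhere.
async function indexWorkspace(workspaceId: string) {
  // Resume from the last sequence number that was durably recorded.
  let cursor = await checkpoints.read(workspaceId);

  while (true) {
    // Read the next ordered batch of changes after the cursor.
    const changes = await changeLog.readAfter({ workspaceId, cursor, limit: 100 });

    if (changes.length === 0) {
      return; // Caught up: nothing newer than the cursor.
    }

    for (const change of changes) {
      // applyChange must be idempotent: if the worker fails before the
      // checkpoint write below, this batch is applied again on the next run.
      await applyChange(change);
      cursor = change.sequence;
    }

    // Persist progress only after the whole batch has been applied.
    await checkpoints.write(workspaceId, cursor);
  }
}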
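
To see why the checkpoint write comes after the batch, it can help to run the same loop shape against in-memory stand-ins. The sketch below is illustrative only: `MemoryCheckpoints`, `MemoryChangeLog`, and `runIndexer` are hypothetical names, the starting cursor of 0 is an assumption, and the batch limit is shrunk to 2 so the effect is visible with a handful of records.

```typescript
type Change = { sequence: number; entityId: string };

// Hypothetical in-memory checkpoint store; a missing checkpoint reads as 0.
class MemoryCheckpoints {
  private cursors = new Map<string, number>();
  async read(workspaceId: string): Promise<number> {
    const cursor = this.cursors.get(workspaceId);
    return cursor === undefined ? 0 : cursor;
  }
  async write(workspaceId: string, cursor: number): Promise<void> {
    this.cursors.set(workspaceId, cursor);
  }
}

// Hypothetical in-memory change log; records are assumed to be in sequence order.
class MemoryChangeLog {
  private records: Change[];
  constructor(records: Change[]) {
    this.records = records;
  }
  async readAfter(cursor: number, limit: number): Promise<Change[]> {
    return this.records.filter((r) => r.sequence > cursor).slice(0, limit);
  }
}

// Same loop shape as indexWorkspace, with its dependencies passed in.
async function runIndexer(
  workspaceId: string,
  checkpoints: MemoryCheckpoints,
  changeLog: MemoryChangeLog,
  applyChange: (change: Change) => Promise<void>,
  limit: number = 2,
): Promise<void> {
  let cursor = await checkpoints.read(workspaceId);
  while (true) {
    const changes = await changeLog.readAfter(cursor, limit);
    if (changes.length === 0) {
      return;
    }
    for (const change of changes) {
      await applyChange(change); // may throw; the checkpoint is then left untouched
      cursor = change.sequence;
    }
    await checkpoints.write(workspaceId, cursor);
  }
}
```

If `applyChange` throws partway through a batch, the stored checkpoint still points at the end of the previous batch, so the next run re-applies the changes that had already succeeded in the failed batch. That replay is harmless only when applying a change is idempotent.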

Retry policy

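Use bounded retries with backoff for transient failures, and move poison messages (changes that still fail after the final attempt) into a review queue so they do not block the rest of the partition.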

{
  "maxAttempts": 5,
  "backoff": "exponential",
  "initialDelayMs": 500,
  "maxDelayMs": 30000,
  "deadLetterQueue": "search-indexing-review"
}
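
One way to read this policy is as a function from the number of failed attempts to the next delay. The sketch below assumes that "exponential" means doubling from `initialDelayMs` and capping at `maxDelayMs`, and that the message is dead-lettered once `maxAttempts` is reached; `nextDelayMs` is a hypothetical helper, not an API of any particular queue.

```typescript
type RetryPolicy = {
  maxAttempts: number;
  backoff: 'exponential';
  initialDelayMs: number;
  maxDelayMs: number;
  deadLetterQueue: string;
};

// Values copied from the policy above.
const indexingRetryPolicy: RetryPolicy = {
  maxAttempts: 5,
  backoff: 'exponential',
  initialDelayMs: 500,
  maxDelayMs: 30000,
  deadLetterQueue: 'search-indexing-review',
};

// Delay before the next try, given how many attempts have failed so far (1-based).
// Returns null when attempts are exhausted and the message should be dead-lettered.
// Assumption: the delay doubles on each attempt and is capped at maxDelayMs.
function nextDelayMs(policy: RetryPolicy, failedAttempts: number): number | null {
  if (failedAttempts >= policy.maxAttempts) {
    return null;
  }
  const delay = policy.initialDelayMs * Math.pow(2, failedAttempts - 1);
  return Math.min(delay, policy.maxDelayMs);
}
```

Under these assumptions the delays are 500, 1000, 2000, and 4000 ms before the fifth failure sends the message to `search-indexing-review`. The 30-second cap never engages with five attempts and only matters if `maxAttempts` is raised.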

Failure modes

Out-of-order delivery: apply changes in sequence order and advance the checkpoint only after they have been applied successfully.
Duplicate work: make search backend writes idempotent by document ID and version.
Deleted entities: emit tombstones so stale documents are removed from the index.
Large workspaces: partition queue work by workspace and entity type.
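
The duplicate-work and deleted-entity cases above can be handled by the same versioned write. The sketch below uses a hypothetical in-memory `MemoryIndex` in place of a real search backend and assumes the change sequence number doubles as the document version; `partitionKey` is likewise an illustrative name.

```typescript
type IndexedDoc = { id: string; version: number; deleted: boolean };

// Hypothetical in-memory stand-in for the search backend.
class MemoryIndex {
  private docs = new Map<string, IndexedDoc>();

  // Apply a write only if it is newer than what is stored.
  // Returns true when the index changed, false for duplicate or stale writes.
  apply(doc: IndexedDoc): boolean {
    const current = this.docs.get(doc.id);
    if (current !== undefined && current.version >= doc.version) {
      return false;
    }
    // Tombstones are stored like any other version, so an older upsert
    // that arrives late cannot resurrect a deleted document.
    this.docs.set(doc.id, doc);
    return true;
  }

  // IDs that a search would be allowed to return.
  visibleIds(): string[] {
    return Array.from(this.docs.values())
      .filter((doc) => !doc.deleted)
      .map((doc) => doc.id)
      .sort();
  }
}

// Assumed partitioning scheme: one queue partition per workspace and entity type.
function partitionKey(workspaceId: string, entityType: string): string {
  return `${workspaceId}:${entityType}`;
}
```

Applying the same upsert twice leaves the index unchanged the second time, and a delete at version 3 wins over an upsert at version 2 regardless of arrival order.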

Freshness service-level objective

Workspace tier	Target freshness	Rebuild cadence
Free	15 minutes	Weekly
Team	5 minutes	Daily
Enterprise	60 seconds	Hourly safety scan
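
These targets are only useful if lag is measured against them. A minimal sketch, assuming lag is defined as the age of the oldest committed change that has not yet been indexed and that `committedAt` is an ISO 8601 timestamp; the function names and tier keys are illustrative.

```typescript
type Tier = 'free' | 'team' | 'enterprise';

// Target freshness per tier, taken from the table above.
const targetFreshnessMs: Record<Tier, number> = {
  free: 15 * 60 * 1000,
  team: 5 * 60 * 1000,
  enterprise: 60 * 1000,
};

// Lag is the age of the oldest change that has been committed but not indexed.
// A null argument means the workspace has no pending changes.
function freshnessLagMs(oldestPendingCommittedAt: string | null, now: Date): number {
  if (oldestPendingCommittedAt === null) {
    return 0;
  }
  return Math.max(0, now.getTime() - Date.parse(oldestPendingCommittedAt));
}

// True when the measured lag is within the tier's target.
function meetsFreshnessTarget(tier: Tier, lagMs: number): boolean {
  return lagMs <= targetFreshnessMs[tier];
}
```

A workspace whose oldest pending change was committed two minutes ago is within the Team target but already outside the Enterprise one.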

Operator runbook

When indexing falls behind:

Check worker error rate and queue age.
Pause non-critical rebuild jobs.
Increase workers for hot partitions.
Inspect dead-letter records for schema drift.
Re-run affected ranges after the fix ships, for example:

journal-search replay --workspace ws_123 --from-sequence 98110 --to-sequence 98420
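
A long replay is easier to retry if it is split into the same small batches the worker uses. The sketch below is not a description of how `journal-search replay` works internally. `planReplayBatches` is a hypothetical helper, and it assumes both sequence bounds are inclusive.

```typescript
type SequenceRange = { from: number; to: number };

// Split an inclusive sequence range into consecutive batches of at most batchSize.
// Assumption: both bounds are inclusive; an empty range yields no batches.
function planReplayBatches(range: SequenceRange, batchSize: number): SequenceRange[] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches: SequenceRange[] = [];
  for (let start = range.from; start <= range.to; start += batchSize) {
    batches.push({ from: start, to: Math.min(start + batchSize - 1, range.to) });
  }
  return batches;
}
```

With a batch size of 100, the range 98110 to 98420 splits into four batches, the last one covering 98410 to 98420. Each batch can then be retried on its own if it fails.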

Design tradeoffs

Incremental indexing is an eventual-consistency strategy. It accepts that search can briefly lag behind writes, then makes that lag observable and recoverable.

A good indexing pipeline should be easy to replay, safe to retry, and explicit about freshness expectations. Those constraints matter more than a clever ranking algorithm when users simply need their newest work to appear.