Designing Scalable Projections with the KurrentDB Projection Engine

Tony Young

Introduction

Projections are a foundational capability of KurrentDB, enabling transformations of events into new streams or schemas. When designed correctly, projections can unlock many benefits for developers, simplify application architectures, and scale reliably with growing event volumes.

However, projections are also one of the most common sources of performance, scalability, and operational issues. Overly broad subscriptions, oversized state, and misuse of emitted events can lead to excessive compute load, memory pressure, and operational fragility.

This whitepaper outlines best practices and anti-patterns for building projections with the KurrentDB Projection Engine. It focuses on how to design projections that are efficient, maintainable, and production-safe, while also highlighting when projections should not be used and which alternative architectural approaches to consider.

Understanding the Role of Projections in KurrentDB

At their core, projections are event-driven functions that:

  • Subscribe to one or more event streams or categories
  • Process events sequentially
  • Maintain derived state or emit events

Projections can therefore be thought of as providing materialized views optimized for specific access patterns. They are not intended to be:

  • A general-purpose compute engine
  • A replacement for external analytics or batch processing
  • A dumping ground for all historical data

Treating projections as targeted, purpose-built components is the key to success.

Best Practices

1. Narrow the Scope of Your Projections

Design projections to listen only to the events they truly need.

Every projection subscription defines how much work the system must do. Broad subscriptions - especially those using fromAll - increase processing costs and complexity.

Best practice:

  • Prefer fromStream or fromCategories over fromAll
  • Create multiple focused projections rather than one monolithic one
  • Align projections to a single responsibility or need

Why this matters:

  • Reduces unnecessary event processing
  • Improves replay performance
  • Simplifies reasoning about correctness and behavior
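
As a rough sketch of what a narrowly scoped projection might look like, the handlers below react to only two event types from a single category. The category name `order`, the event type names, and the payload fields are invented for illustration. The engine's `fromCategory(...).when(...)` call is shown in a comment so that the snippet can also be run outside the server, and the small `replay` helper is purely a local test aid, not part of the engine.

```javascript
// Hypothetical handlers for a projection that counts open orders.
// Event type names and payload fields are illustrative assumptions.
const openOrderHandlers = {
  // $init supplies the starting state.
  $init: function () {
    return { open: 0 };
  },
  OrderPlaced: function (state, event) {
    state.open += 1;
    return state;
  },
  OrderCompleted: function (state, event) {
    state.open -= 1;
    return state;
  },
};

// Inside KurrentDB the projection body would be roughly:
//   fromCategory('order').when(openOrderHandlers);
// so that only events in the 'order' category reach the handlers.

// Local dry run with hand-written events (test aid only).
function replay(handlers, events) {
  let state = handlers.$init();
  for (const event of events) {
    const handler = handlers[event.eventType];
    if (handler) state = handler(state, event) || state;
  }
  return state;
}

const openOrders = replay(openOrderHandlers, [
  { eventType: 'OrderPlaced', data: { orderId: 'o-1' } },
  { eventType: 'OrderPlaced', data: { orderId: 'o-2' } },
  { eventType: 'OrderCompleted', data: { orderId: 'o-1' } },
]);
console.log(openOrders); // { open: 1 }
```

Keeping the handlers in a plain object like this also makes a single-responsibility projection easy to unit test before it is deployed.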

2. Store Only the Required Data in Projection State

Projection state should represent the minimal information required to drive downstream behavior.

Best practice:

  • Store derived values, not raw event payloads
  • Avoid copying entire events into state
  • Use aggregation and summarization aggressively

Example:
Instead of storing every order line item, store:

  • Total count
  • Total value
  • Current status
  • Any other small derived values that consumers actually need

Why this matters:

  • Smaller state is processed more efficiently by both the projection engine and its consumers
  • Reduces memory pressure
  • Avoids hitting size limits (see below)
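
The order example above could be sketched as follows. This is a minimal illustration rather than a reference implementation: the event types (`LineItemAdded`, `OrderShipped`) and payload fields (`quantity`, `unitPrice`) are assumptions, and the handlers are exercised directly instead of being registered with the engine.

```javascript
// Hypothetical order-summary handlers: keep aggregates, not line items.
const orderSummaryHandlers = {
  $init: function () {
    return { itemCount: 0, totalValue: 0, status: 'new' };
  },
  // Field names (quantity, unitPrice) are assumptions about the payload.
  LineItemAdded: function (state, event) {
    state.itemCount += event.data.quantity;
    state.totalValue += event.data.quantity * event.data.unitPrice;
    return state; // the line item itself is never copied into state
  },
  OrderShipped: function (state, event) {
    state.status = 'shipped';
    return state;
  },
};

// Local dry run with hand-written events.
let summary = orderSummaryHandlers.$init();
summary = orderSummaryHandlers.LineItemAdded(summary, { data: { quantity: 2, unitPrice: 5 } });
summary = orderSummaryHandlers.LineItemAdded(summary, { data: { quantity: 1, unitPrice: 20 } });
summary = orderSummaryHandlers.OrderShipped(summary, { data: {} });
console.log(JSON.stringify(summary));
// {"itemCount":3,"totalValue":30,"status":"shipped"}
```

The state stays the same size no matter how many line items an order receives, which is the property that matters here.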

3. Decide Carefully Between Emitting Events vs. Producing State

Projections can:

  • Maintain state
  • Emit events

These are distinct responsibilities and should be chosen intentionally.

Consider emitting events when:

  • You need to trigger downstream workflows
  • The output represents a new business fact
  • The event will be consumed by other systems or services

Prefer state-only projections when:

  • You need retrievable, derived data
  • The result is not a business event
  • The output is ephemeral or view-specific

Why this matters:
Emitted events and state become part of your event log and must be treated as durable and immutable.
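
One way to picture the distinction is a single handler that does both, with a different justification for each. In the sketch below, the running total is view data and stays in state, while a payment above a threshold is treated as a new business fact and emitted. The threshold, stream name, event types, and fields are all assumptions. The `emit` function defined at the top is a local stand-in that mimics the engine's global `emit(streamId, eventType, eventBody, metadata)` so the snippet runs outside KurrentDB; inside the engine it would not be defined by the projection, and emitting generally has to be enabled on the projection.

```javascript
// Local stand-in for the engine's global emit(), defined here only so the
// sketch can run outside KurrentDB.
const emitted = [];
function emit(streamId, eventType, body, metadata) {
  emitted.push({ streamId, eventType, body, metadata });
}

const REVIEW_THRESHOLD = 10000; // assumed business rule

// Hypothetical handlers: state holds a derived view, emit() records a new fact.
const paymentHandlers = {
  $init: function () {
    return { totalPaid: 0 };
  },
  PaymentReceived: function (state, event) {
    // View-specific, retrievable data: keep it in state.
    state.totalPaid += event.data.amount;
    // New business fact that other services consume: emit it.
    if (event.data.amount > REVIEW_THRESHOLD) {
      emit('payments-for-review', 'PaymentFlaggedForReview', {
        paymentId: event.data.paymentId,
        amount: event.data.amount,
      });
    }
    return state;
  },
};

let paymentState = paymentHandlers.$init();
for (const amount of [250, 12000]) {
  paymentState = paymentHandlers.PaymentReceived(paymentState, {
    data: { paymentId: 'p-' + amount, amount: amount },
  });
}
console.log(paymentState.totalPaid, emitted.length); // 12250 1
```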

4. Monitor Projection State and Checkpoint Size

KurrentDB logs a warning when projection state or checkpoint size exceeds 8 MB.

Best practice:

  • Actively monitor server logs
  • Treat size warnings as early indicators that projection design needs to be altered
  • Refactor projections before limits are exceeded

Key limits to remember:

  • 16 MB hard limit for projection state
  • 16 MB limit for event payloads

Why this matters:
Exceeding these limits can cause:

  • Projection faults
  • Replay instability
  • Operational incidents under load
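
Besides watching the server logs, it can help to check state size in tests before a projection is deployed. The helper below is a minimal sketch that runs in Node.js, not inside the engine. It approximates state size by the UTF-8 length of the JSON serialization and compares it to the 8 MB and 16 MB figures above. Treating "MB" as 1024 * 1024 bytes and the function names themselves are assumptions of this example, and the server's own accounting may differ slightly.

```javascript
// Thresholds taken from the limits above; interpreting "MB" as
// 1024 * 1024 bytes is an assumption of this sketch.
const WARN_BYTES = 8 * 1024 * 1024;
const HARD_LIMIT_BYTES = 16 * 1024 * 1024;

// Approximate the stored size of a state object by its UTF-8 JSON length.
function stateSizeBytes(state) {
  return Buffer.byteLength(JSON.stringify(state), 'utf8');
}

// Classify a state object against the warning and hard-limit thresholds.
function classifyStateSize(state) {
  const bytes = stateSizeBytes(state);
  if (bytes > HARD_LIMIT_BYTES) return 'over-limit';
  if (bytes > WARN_BYTES) return 'warning';
  return 'ok';
}

const smallState = { itemCount: 3, totalValue: 30 };
const bloatedState = { history: 'x'.repeat(9 * 1024 * 1024) };
console.log(classifyStateSize(smallState));   // ok
console.log(classifyStateSize(bloatedState)); // warning
```

Running a check like this against state produced from a realistic replay gives an early signal well before the warning shows up in production logs.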

5. Offload Large or Unbounded Projections to External Compute

For projections that require very complex lookups or transformations, or that compute large state, running inside KurrentDB is not the right model.

Best practice:

  • Build a user projection that runs externally
  • Use a KurrentDB client to stream events
  • Persist derived state in an external store, or append emitted events back to KurrentDB

Why this matters:

  • Keeps compute pressure out of the database
  • Removes state size constraints
  • Enables richer processing and storage strategies

If your projection state grows without a clear upper bound, it likely does not belong inside the Projection Engine.
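
The overall shape of an externally run projection can be sketched as a fold over a stream of events, with the result persisted somewhere other than the database. In the example below, `runExternalProjection`, the flat event shape (`type`, `revision`, `data`), the fake subscription, and the in-memory store are all stand-ins invented for illustration. In a real service the events would come from a subscription opened with a KurrentDB client library, and the store would be a real database or cache.

```javascript
// Fold an async stream of events into state and hand each result to a store.
// `events` is any async iterable; client wiring is deliberately left out.
async function runExternalProjection(events, handlers, saveState) {
  let state = handlers.$init();
  for await (const event of events) {
    const handler = handlers[event.type];
    if (!handler) continue;
    state = handler(state, event);
    // Persist outside the database, together with the last processed
    // revision so a restarted process would know where to resume.
    await saveState(state, event.revision);
  }
  return state;
}

// Stand-in for a client subscription (simplified event shape).
async function* fakeSubscription() {
  yield { type: 'ItemViewed', revision: 0, data: { sku: 'a' } };
  yield { type: 'ItemViewed', revision: 1, data: { sku: 'b' } };
  yield { type: 'ItemViewed', revision: 2, data: { sku: 'a' } };
}

// Stand-in for an external store.
const externalStore = { state: null, revision: -1 };

// A per-SKU map has no natural upper bound, which is what makes this
// kind of state a candidate for living outside the engine.
const viewHandlers = {
  $init: () => ({ viewsBySku: {} }),
  ItemViewed: (state, event) => {
    const sku = event.data.sku;
    state.viewsBySku[sku] = (state.viewsBySku[sku] || 0) + 1;
    return state;
  },
};

const done = runExternalProjection(fakeSubscription(), viewHandlers, async (state, revision) => {
  externalStore.state = JSON.parse(JSON.stringify(state)); // store a copy
  externalStore.revision = revision;
});
done.then((finalState) => console.log(finalState.viewsBySku)); // { a: 2, b: 1 }
```

Because the handlers keep the same `(state, event)` shape as engine-hosted projections, logic can often be moved out of the database without being rewritten.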

Anti-Patterns: What You Should Not Do

1. Do Not Process the Same Event Multiple Times in a Single Pipeline

Repeatedly transforming or rehydrating the same event within one projection pipeline is inefficient and error-prone.

Symptoms:

  • Excessive CPU usage
  • Complex, brittle logic
  • Hard-to-debug inconsistencies

Best practice:

  • Normalize processing once per event
  • Derive all needed values in a single pass
  • Split projections by responsibility if needed
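
A single-pass handler might look roughly like the sketch below: the event is normalized once, and every derived value is updated from that one normalized result. The `normalize` helper, the `ItemSold` event type, and the payload fields are hypothetical.

```javascript
// Normalize the raw payload once. Field names are illustrative assumptions.
function normalize(event) {
  const quantity = Number(event.data.quantity) || 0;
  const unitPrice = Number(event.data.unitPrice) || 0;
  return { quantity, lineTotal: quantity * unitPrice };
}

const salesHandlers = {
  $init: function () {
    return { units: 0, revenue: 0, largestLine: 0 };
  },
  ItemSold: function (state, event) {
    const line = normalize(event); // done exactly once per event
    // All derived values are updated from the same normalized result.
    state.units += line.quantity;
    state.revenue += line.lineTotal;
    state.largestLine = Math.max(state.largestLine, line.lineTotal);
    return state;
  },
};

let sales = salesHandlers.$init();
sales = salesHandlers.ItemSold(sales, { data: { quantity: '3', unitPrice: 4 } });
sales = salesHandlers.ItemSold(sales, { data: { quantity: 1, unitPrice: 50 } });
console.log(sales); // { units: 4, revenue: 62, largestLine: 50 }
```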

2. Do Not Exceed the 16 MB State or Event Payload Limit

Both projection state and emitted events have a hard 16 MB size limit.

Common causes:

  • Storing full event histories in state
  • Accumulating arrays or maps indefinitely
  • Emitting large, denormalized events

Best practice:

  • Summarize, aggregate, and trim unnecessary data wherever possible
  • Emit links, or keep emitted events and state compact
  • Offload large data elsewhere
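
As one illustration of keeping state bounded, the sketch below replaces an ever-growing history array with a running count plus a capped list of the most recent entries. The cap of 3, the `UserLoggedIn` event type, and the `userId` field are assumptions chosen for the example.

```javascript
const MAX_RECENT = 3; // assumed cap; tune to what consumers need

const recentLoginHandlers = {
  $init: function () {
    return { totalLogins: 0, recent: [] };
  },
  UserLoggedIn: function (state, event) {
    state.totalLogins += 1; // summary instead of full history
    state.recent.push(event.data.userId);
    if (state.recent.length > MAX_RECENT) {
      state.recent.shift(); // drop the oldest entry so the array stays bounded
    }
    return state;
  },
};

let logins = recentLoginHandlers.$init();
for (const userId of ['u1', 'u2', 'u3', 'u4', 'u5']) {
  logins = recentLoginHandlers.UserLoggedIn(logins, { data: { userId: userId } });
}
console.log(logins); // { totalLogins: 5, recent: [ 'u3', 'u4', 'u5' ] }
```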

3. Do Not Write to Emitted Event Streams or System Streams

Projection streams are append-only outputs of projections.

Never:

  • Write directly to emitted streams
  • Write to any system stream (streams starting with $)

Why this matters:

  • Breaks projection guarantees and causes faults
  • Can corrupt state or cause replay problems
  • May cause system faults where the database expects specific schemas or element values

For more information on this very important topic, see "Introduction to projections" in the Kurrent Docs.

4. Do Not Store All Data in Projection State

Projection state is not a database.

Storing all historical or raw data in state:

  • Defeats the purpose of event sourcing
  • Makes state the source of truth, where values can be overwritten and lost
  • Causes state bloat (see anti-pattern 2 above, the 16 MB limit)
  • Makes projections fragile and slow

Best practice:

  • Treat state as a cache or summary
  • Rely on the event log as the source of truth
  • Rebuild state from events when needed

5. Do Not Use fromAll with Post-Filtering

Using fromAll and filtering events inside the projection code is a common anti-pattern.

Why this matters:

  • The projection will process every appended event
  • Filtering logic can become complex and brittle
  • Replay times increase dramatically

Best practice:

  • Split logic into multiple projections
  • Use fromCategories, fromStream, or fromStreams
  • Let the subscription do the filtering
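
The difference in work can be made concrete with a toy model. The sketch below does not call the engine at all. It simply counts how many events a handler would see under each approach, using a small hand-written log. The stream names, event types, and the rule that a category is the text before the first `-` are assumptions made for the example.

```javascript
// Toy event log: most events are unrelated to invoices.
const log = [
  { streamId: 'invoice-1', eventType: 'InvoiceIssued' },
  { streamId: 'cart-7', eventType: 'ItemAdded' },
  { streamId: 'cart-7', eventType: 'ItemAdded' },
  { streamId: 'user-3', eventType: 'UserLoggedIn' },
  { streamId: 'invoice-2', eventType: 'InvoiceIssued' },
];

// Category = text before the first '-' (a simplifying assumption).
const categoryOf = (streamId) => streamId.split('-')[0];

// Anti-pattern: every event reaches the handler, which then filters.
let deliveredToFromAll = 0;
let invoicesViaFilter = 0;
for (const event of log) {
  deliveredToFromAll += 1;
  if (categoryOf(event.streamId) === 'invoice') invoicesViaFilter += 1;
}

// Preferred: the subscription selects the category, so only matching
// events reach the handler (in the engine, roughly fromCategory('invoice')).
let deliveredToCategory = 0;
let invoicesViaCategory = 0;
for (const event of log.filter((e) => categoryOf(e.streamId) === 'invoice')) {
  deliveredToCategory += 1;
  invoicesViaCategory += 1;
}

console.log(deliveredToFromAll, deliveredToCategory); // 5 2
```

Both approaches arrive at the same answer, but the first one pays for every event in the log, and that cost is paid again on every replay.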

6. Do Not Track Emitted Streams Unless You Plan to Delete the Projection

Tracking emitted streams creates tight coupling between the projection and its outputs, and requires KurrentDB to do more processing and store more information for each projection stream.

Track emitted streams if:

  • You intend to delete or reset the projection
  • You need deterministic cleanup behavior

Otherwise:

  • Leave emitted streams untracked
  • Manage lifecycle independently

Design Principles for Sustainable Projections

To summarize, effective projections in KurrentDB follow a few core principles:

  1. Focus – Each projection does one thing well
  2. Minimalism – Store and emit only what is necessary
  3. Boundedness – State should have clear bounds and hold only limited information
  4. Observability – Monitor logs and warnings proactively
  5. Appropriate Placement – More complex / less efficient projections belong outside the database

Conclusion

The KurrentDB Projection Engine is a powerful tool for performing real-time, event-driven transformations. By narrowing scope, minimizing stored state, respecting size limits, and offloading large-scale processing when appropriate, teams can build projections that are fast, reliable, and easy to operate.

Used correctly, projections become a strategic advantage - turning raw event streams into actionable insights without compromising the integrity or performance of your system.