Event-Driven Architecture - Building Scalable, Loosely Coupled Production Systems
Event-driven architecture (EDA) enables building highly scalable, loosely coupled systems by having components communicate through events rather than direct calls. Companies like Netflix (processing billions of events daily), Uber (event sourcing for trip state), and LinkedIn (event streaming with Kafka) use EDA to achieve massive scale and system resilience.
This guide covers event fundamentals, event sourcing for state management, CQRS pattern for read/write separation, event streaming with Kafka, publish-subscribe messaging patterns, event choreography vs orchestration, handling eventual consistency, and production deployment strategies for building robust event-driven systems.
Event-Driven Architecture Fundamentals
What are Events?
Events are immutable facts that represent something that happened in the system. Unlike commands (requests to do something), events describe what already occurred and cannot be changed or rejected.
Event characteristics:
- Immutable - Once created, events never change
- Past tense - Named for what happened (OrderPlaced, PaymentCompleted)
- Timestamped - Include when the event occurred
- Self-contained - Carry all necessary data
Example event:
{
  "eventId": "evt_123",
  "eventType": "OrderPlaced",
  "timestamp": "2026-03-17T10:30:00Z",
  "aggregateId": "order_456",
  "data": {
    "userId": "user_789",
    "items": [
      { "productId": "prod_1", "quantity": 2, "price": 29.99 }
    ],
    "total": 59.98,
    "shippingAddress": { "street": "123 Main St", "city": "Seattle" }
  },
  "metadata": {
    "causationId": "cmd_abc",
    "correlationId": "trace_xyz"
  }
}
Benefits of Event-Driven Architecture
Loose Coupling:
Services don't need to know about each other. Publishers emit events without knowing who consumes them.
Scalability:
Event consumers can be scaled independently based on workload. Multiple instances process events in parallel.
Resilience:
If a consumer is down, events are persisted and processed when it recovers. No data loss.
Audit Trail:
Event log provides complete history of what happened in the system for debugging and compliance.
Temporal Decoupling:
Producers and consumers don't need to be online simultaneously. Events persist in queue/log.
Event Sourcing
Storing Events Instead of Current State
Traditional approach stores current state:
// Database: orders table
{
  id: "order_456",
  status: "SHIPPED", // Current state only
  total: 59.98,
  updatedAt: "2026-03-17T14:00:00Z"
}
Event sourcing stores all events:
// Event store: order events
[
  { type: "OrderPlaced", timestamp: "10:30:00", data: {...} },
  { type: "PaymentCompleted", timestamp: "10:31:00", data: {...} },
  { type: "OrderShipped", timestamp: "14:00:00", data: {...} }
]

// Current state is derived by replaying events
const order = events.reduce((state, event) => {
  switch (event.type) {
    case 'OrderPlaced':
      return { status: 'PENDING', ...event.data };
    case 'PaymentCompleted':
      return { ...state, status: 'PAID', paymentId: event.data.paymentId };
    case 'OrderShipped':
      return { ...state, status: 'SHIPPED', trackingNumber: event.data.trackingNumber };
    default:
      return state; // Ignore unknown event types instead of dropping state
  }
}, {});
Benefits:
- Complete audit trail - Never lose historical data
- Time travel - Rebuild state at any point in time
- Bug fixes - Replay events with fixed logic
- Analytics - Process historical events for insights
Challenges:
- Complexity - More complex than CRUD
- Event schema evolution - Must handle old event formats
- Performance - Replaying many events can be slow (solved with snapshots)
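The schema-evolution challenge is often handled with "upcasters": small functions that migrate old event versions to the current shape at read time, so replay logic only ever sees the latest schema. A minimal sketch follows; the version numbers and the v1 `amount` field are hypothetical, chosen only to illustrate the mechanism.

```javascript
// Each upcaster migrates one version step; keys are "eventType:version".
const upcasters = {
  // Hypothetical v1 OrderPlaced stored a flat "amount"; v2 uses items + total
  'OrderPlaced:1': (event) => ({
    ...event,
    schemaVersion: 2,
    data: { items: event.data.items || [], total: event.data.amount }
  })
};

function upcast(event) {
  let current = event;
  let key = `${current.eventType}:${current.schemaVersion || 1}`;
  // Apply upcasters in sequence until the event reaches the latest version
  while (upcasters[key]) {
    current = upcasters[key](current);
    key = `${current.eventType}:${current.schemaVersion}`;
  }
  return current;
}

const legacy = { eventType: 'OrderPlaced', schemaVersion: 1, data: { amount: 59.98 } };
const migrated = upcast(legacy);
// migrated.schemaVersion === 2, migrated.data.total === 59.98
```

Because events are immutable, the stored v1 records are never rewritten; only their in-memory representation is migrated during replay.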
Event Store Implementation
class EventStore {
  async appendEvent(streamId, event) {
    // Append event to stream (immutable)
    await db.events.insert({
      streamId,
      eventId: event.eventId,
      eventType: event.eventType,
      timestamp: event.timestamp,
      data: event.data,
      metadata: event.metadata,
      version: await this.getNextVersion(streamId)
    });
    // Publish event to subscribers
    await eventBus.publish(event.eventType, event);
  }

  async getEvents(streamId, fromVersion = 0) {
    return db.events.find({
      streamId,
      version: { $gte: fromVersion }
    }).sort({ version: 1 });
  }

  async getCurrentState(streamId) {
    // Check snapshot cache first
    let state = await this.getSnapshot(streamId);
    const fromVersion = state ? state.version + 1 : 0;
    // Replay only the events since the snapshot
    const events = await this.getEvents(streamId, fromVersion);
    return events.reduce((s, e) => this.applyEvent(s, e), state || {});
  }

  async saveSnapshot(streamId, state, version) {
    // Save a snapshot every N events for performance
    await db.snapshots.upsert({ streamId }, { state, version });
  }

  // getNextVersion, getSnapshot, and applyEvent are elided for brevity
}
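To make the append/replay cycle concrete, here is a self-contained in-memory variant. It is a sketch, not the class above: the database is replaced by a `Map`, and an `expectedVersion` parameter (an assumption, common in event stores) demonstrates optimistic concurrency control on append.

```javascript
// Minimal in-memory event store illustrating append, versioning, and replay.
class InMemoryEventStore {
  constructor() { this.streams = new Map(); }

  appendEvent(streamId, event, expectedVersion) {
    const stream = this.streams.get(streamId) || [];
    // Optimistic concurrency: reject if another writer appended first
    if (expectedVersion !== undefined && stream.length !== expectedVersion) {
      throw new Error(`Version conflict: expected ${expectedVersion}, got ${stream.length}`);
    }
    stream.push({ ...event, version: stream.length });
    this.streams.set(streamId, stream);
  }

  getCurrentState(streamId, applyEvent) {
    // Replay every event in order to rebuild current state
    return (this.streams.get(streamId) || []).reduce(applyEvent, {});
  }
}

const store = new InMemoryEventStore();
store.appendEvent('order-456', { type: 'OrderPlaced', data: { total: 59.98 } }, 0);
store.appendEvent('order-456', { type: 'OrderShipped', data: {} }, 1);

const state = store.getCurrentState('order-456', (s, e) => {
  switch (e.type) {
    case 'OrderPlaced': return { status: 'PENDING', ...e.data };
    case 'OrderShipped': return { ...s, status: 'SHIPPED' };
    default: return s;
  }
});
// state is { status: 'SHIPPED', total: 59.98 }
```

The version check is what prevents two concurrent commands from silently interleaving their events into one stream.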
CQRS Pattern
Command Query Responsibility Segregation
Separate read and write models for different optimization strategies:
Write Model (Commands):
- Optimized for consistency and validation
- Enforces business rules
- Writes to event store
Read Model (Queries):
- Optimized for fast queries
- Denormalized views
- Eventually consistent with write model
Implementation:
// Write side: Handle commands, emit events
class OrderCommandHandler {
  async placeOrder(command) {
    // Validate command
    if (!command.items.length) {
      throw new Error('Order must have items');
    }
    // Create event
    const event = {
      eventId: uuid(),
      eventType: 'OrderPlaced',
      timestamp: new Date(),
      aggregateId: command.orderId,
      data: {
        userId: command.userId,
        items: command.items,
        total: command.items.reduce((sum, item) => sum + item.price * item.quantity, 0)
      }
    };
    // Append to event store
    await eventStore.appendEvent(`order-${command.orderId}`, event);
    return { orderId: command.orderId };
  }
}
// Read side: Project events into query models
class OrderProjection {
  async onOrderPlaced(event) {
    // Update read model (denormalized for fast queries)
    await db.orderViews.insert({
      orderId: event.aggregateId,
      userId: event.data.userId,
      status: 'PENDING',
      total: event.data.total,
      itemCount: event.data.items.length,
      placedAt: event.timestamp
    });
    // Update user's order list view
    await db.userOrderLists.push(event.data.userId, {
      orderId: event.aggregateId,
      total: event.data.total,
      status: 'PENDING'
    });
  }

  async onOrderShipped(event) {
    // Update multiple read models
    await db.orderViews.update(event.aggregateId, {
      status: 'SHIPPED',
      shippedAt: event.timestamp
    });
    await db.userOrderLists.update(
      { userId: event.data.userId, orderId: event.aggregateId },
      { status: 'SHIPPED' }
    );
  }
}
// Query side: Fast reads from denormalized views
class OrderQueryService {
  async getOrderById(orderId) {
    // Read from optimized view (no joins needed)
    return db.orderViews.findOne({ orderId });
  }

  async getUserOrders(userId, limit = 10) {
    // Pre-computed user order list
    return db.userOrderLists.find({ userId }).limit(limit);
  }
}
Benefits:
- Performance - Reads don't impact writes and vice versa
- Scalability - Scale read and write sides independently
- Flexibility - Different data models for different needs
Event Streaming with Kafka
Kafka for High-Throughput Event Processing
Kafka provides a durable, ordered event log with high throughput:
const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'order-service',
  brokers: ['kafka1:9092', 'kafka2:9092', 'kafka3:9092']
});

// Producer: Publish events
const producer = kafka.producer();
await producer.connect();

async function publishEvent(event) {
  await producer.send({
    topic: 'orders',
    messages: [{
      key: event.aggregateId, // Ensures same order's events go to same partition
      value: JSON.stringify(event),
      headers: {
        'event-type': event.eventType,
        'correlation-id': event.metadata.correlationId
      }
    }]
  });
}

// Consumer: Subscribe to events
const consumer = kafka.consumer({ groupId: 'order-projections' });
await consumer.connect();
await consumer.subscribe({ topic: 'orders', fromBeginning: false });

await consumer.run({
  eachMessage: async ({ topic, partition, message }) => {
    const event = JSON.parse(message.value.toString());
    console.log(`Processing ${event.eventType} from partition ${partition}`);
    // Project event into read models
    await orderProjection.handle(event);
    // Offset is committed automatically after the handler resolves
  }
});
Kafka Benefits:
- Durability - Events persisted to disk, replicated across brokers
- Ordering - Guaranteed order within partition
- Scalability - Horizontal scaling with partitions
- Replay - Consumers can replay events from any point
- Multiple consumers - Many services consume same events
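The ordering guarantee above is why the producer keys messages by `aggregateId`: a message's key determines its partition, and order is preserved only within a partition. The toy partitioner below illustrates the idea; Kafka's real default partitioner uses a murmur2 hash, so this sketch is for intuition only, not a compatible implementation.

```javascript
// Toy key-based partitioner: same key always maps to the same partition,
// so all events for one aggregate stay in order relative to each other.
function pickPartition(key, partitionCount) {
  let hash = 0;
  for (const ch of key) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  return hash % partitionCount;
}

const partitions = 3;
const events = [
  { key: 'order_456', type: 'OrderPlaced' },
  { key: 'order_789', type: 'OrderPlaced' },
  { key: 'order_456', type: 'OrderShipped' }
];

// Both order_456 events land in the same partition, preserving their order;
// order_789 may land elsewhere and be processed concurrently.
const assignments = events.map(e => pickPartition(e.key, partitions));
```

Choosing the aggregate ID as the key therefore trades global ordering (which Kafka does not offer across partitions) for per-aggregate ordering plus parallelism across aggregates.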
Publish-Subscribe Pattern
Decoupled Event Distribution
Publishers emit events to topics without knowing subscribers:
// Event bus with Redis Pub/Sub
const Redis = require('ioredis');
const pub = new Redis();
const sub = new Redis();
class EventBus {
  async publish(eventType, event) {
    await pub.publish(eventType, JSON.stringify(event));
  }

  subscribe(eventType, handler) {
    sub.subscribe(eventType);
    sub.on('message', async (channel, message) => {
      if (channel === eventType) {
        const event = JSON.parse(message);
        await handler(event);
      }
    });
  }
}
// Publishers don't know about subscribers
class OrderService {
  async placeOrder(orderData) {
    const order = await db.orders.create(orderData);
    // Emit event
    await eventBus.publish('OrderPlaced', {
      orderId: order.id,
      userId: order.userId,
      total: order.total
    });
    return order;
  }
}

// Multiple independent subscribers
class InventoryService {
  constructor() {
    eventBus.subscribe('OrderPlaced', this.reserveInventory.bind(this));
  }

  async reserveInventory(event) {
    console.log('Reserving inventory for order:', event.orderId);
    // Reserve stock for order items
  }
}

class NotificationService {
  constructor() {
    eventBus.subscribe('OrderPlaced', this.sendConfirmation.bind(this));
  }

  async sendConfirmation(event) {
    console.log('Sending confirmation for order:', event.orderId);
    // Send email confirmation
  }
}

class AnalyticsService {
  constructor() {
    eventBus.subscribe('OrderPlaced', this.trackOrder.bind(this));
  }

  async trackOrder(event) {
    console.log('Tracking order for analytics:', event.orderId);
    // Update analytics dashboard
  }
}
Choreography vs Orchestration
Event Choreography (Decentralized)
Services react to events independently without central coordinator:
// The OrderPlaced event starts the saga; multiple services react independently

// Inventory service reserves stock
inventoryService.on('OrderPlaced', async (event) => {
  const reserved = await inventoryService.reserve(event.items);
  if (reserved) {
    await eventBus.publish('InventoryReserved', { orderId: event.orderId });
  } else {
    await eventBus.publish('InventoryReserveFailed', { orderId: event.orderId });
  }
});

// Payment service processes payment
paymentService.on('InventoryReserved', async (event) => {
  try {
    const payment = await paymentService.charge(event.orderId);
    await eventBus.publish('PaymentCompleted', { orderId: event.orderId, payment });
  } catch (error) {
    await eventBus.publish('PaymentFailed', { orderId: event.orderId });
  }
});

// Shipping service ships order
shippingService.on('PaymentCompleted', async (event) => {
  await shippingService.ship(event.orderId);
  await eventBus.publish('OrderShipped', { orderId: event.orderId });
});

// Handle failures (compensating transactions)
inventoryService.on('PaymentFailed', async (event) => {
  await inventoryService.releaseReservation(event.orderId);
  await eventBus.publish('InventoryReleased', { orderId: event.orderId });
});
Pros: Loose coupling, easy to add new participants
Cons: Hard to track workflow, implicit dependencies
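One common mitigation for choreography's traceability problem is to propagate a single `correlationId` through every event in a flow, while `causationId` points to the immediate parent. The helper below is a sketch using the event envelope fields shown earlier in this guide; the `deriveEvent` name and the random-suffix ID scheme are illustrative assumptions.

```javascript
// Derive a follow-up event that inherits tracing metadata from its parent.
function deriveEvent(parentEvent, eventType, data) {
  return {
    eventId: `evt_${Math.random().toString(36).slice(2, 10)}`, // illustrative ID scheme
    eventType,
    timestamp: new Date().toISOString(),
    data,
    metadata: {
      causationId: parentEvent.eventId,                  // the event that caused this one
      correlationId: parentEvent.metadata.correlationId  // constant across the whole flow
    }
  };
}

const placed = {
  eventId: 'evt_1', eventType: 'OrderPlaced',
  metadata: { causationId: 'cmd_abc', correlationId: 'trace_xyz' }
};
const reserved = deriveEvent(placed, 'InventoryReserved', { orderId: 'order_456' });
const paid = deriveEvent(reserved, 'PaymentCompleted', { orderId: 'order_456' });
// paid.metadata.correlationId === 'trace_xyz' (shared by the whole saga)
```

Filtering logs by `correlationId` then reconstructs the end-to-end workflow even though no component ever saw it whole.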
Event Orchestration (Centralized)
Central orchestrator manages workflow:
class OrderSagaOrchestrator {
  async executeOrderSaga(orderId) {
    const saga = {
      orderId,
      status: 'STARTED',
      steps: []
    };
    try {
      // Step 1: Reserve inventory
      saga.steps.push({ name: 'ReserveInventory', status: 'PENDING' });
      await this.reserveInventory(orderId);
      saga.steps[0].status = 'COMPLETED';

      // Step 2: Process payment
      saga.steps.push({ name: 'ProcessPayment', status: 'PENDING' });
      await this.processPayment(orderId);
      saga.steps[1].status = 'COMPLETED';

      // Step 3: Ship order
      saga.steps.push({ name: 'ShipOrder', status: 'PENDING' });
      await this.shipOrder(orderId);
      saga.steps[2].status = 'COMPLETED';

      saga.status = 'COMPLETED';
      await eventBus.publish('OrderCompleted', { orderId });
    } catch (error) {
      saga.status = 'FAILED';
      // Compensate completed steps in reverse order
      await this.compensate(saga);
    }
    return saga;
  }

  async compensate(saga) {
    const completedSteps = saga.steps
      .filter(s => s.status === 'COMPLETED')
      .reverse();
    for (const step of completedSteps) {
      switch (step.name) {
        case 'ShipOrder':
          await this.cancelShipment(saga.orderId);
          break;
        case 'ProcessPayment':
          await this.refundPayment(saga.orderId);
          break;
        case 'ReserveInventory':
          await this.releaseInventory(saga.orderId);
          break;
      }
    }
    await eventBus.publish('OrderFailed', { orderId: saga.orderId });
  }
}
Pros: Clear workflow visibility, easier debugging
Cons: Central point of failure, tighter coupling
Eventual Consistency
Handling Async State Propagation
Event-driven systems are eventually consistent - read models update asynchronously:
// Write happens immediately
app.post('/orders', async (req, res) => {
  const command = { orderId: uuid(), ...req.body };
  await orderCommandHandler.placeOrder(command);
  // Return immediately (write completed)
  res.status(202).json({
    orderId: command.orderId,
    message: 'Order placed successfully'
  });
});

// Read might not show order yet (eventual consistency)
app.get('/orders/:id', async (req, res) => {
  const order = await orderQueryService.getOrderById(req.params.id);
  if (!order) {
    // Order might be in event queue, not yet projected
    return res.status(404).json({ error: 'Order not found (processing)' });
  }
  res.json(order);
});

// Solution: Include processing status in the 202 response
res.status(202).json({
  orderId: command.orderId,
  status: 'PROCESSING',
  _links: {
    self: `/orders/${command.orderId}`,
    status: `/orders/${command.orderId}/status`
  }
});
// Client polls or uses WebSocket for updates
Strategies for handling eventual consistency:
- Accept it - Design UX for async operations
- Optimistic UI - Show expected state immediately
- Status polling - Client checks until ready
- WebSockets - Push notifications when ready
- Inbox pattern - Track processed events to ensure idempotency
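The inbox pattern from the list above can be sketched as a consumer that records processed event IDs and skips redeliveries. This is a simplified in-memory version: a production system would persist the inbox and update it in the same transaction as the read model, which this sketch deliberately omits.

```javascript
// Idempotent event consumer: side effects run at most once per eventId,
// even when the broker delivers the same event more than once.
class IdempotentConsumer {
  constructor(handler) {
    this.processed = new Set(); // "inbox" of already-seen eventIds
    this.handler = handler;
  }

  async handle(event) {
    if (this.processed.has(event.eventId)) {
      return false; // duplicate delivery: skip side effects
    }
    await this.handler(event);
    this.processed.add(event.eventId); // mark done only after success
    return true;
  }
}

let applied = 0;
const consumer = new IdempotentConsumer(async () => { applied += 1; });

(async () => {
  const event = { eventId: 'evt_123', eventType: 'OrderPlaced' };
  await consumer.handle(event); // processed: applied becomes 1
  await consumer.handle(event); // duplicate: ignored, applied stays 1
})();
```

Marking the event as processed only after the handler succeeds means a crash mid-handler causes a retry rather than a lost event, which is the at-least-once behavior most brokers provide.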
Real-World Examples
Netflix Event-Driven Architecture
Scale:
- Billions of events processed daily
- Event streaming with Apache Kafka
- Event sourcing for user viewing history
- CQRS for recommendations (write to master, read from replicas)
Patterns:
- Change Data Capture (CDC) from databases
- Event-driven microservices communication
- Real-time stream processing with Flink
Uber Trip State Management
Event Sourcing:
- Trip lifecycle tracked as events
- TripRequested → DriverMatched → TripStarted → TripCompleted
- Complete audit trail for disputes
- Replay events for analytics
Benefits:
- Handle 100M+ trips
- Complete trip history
- Support for complex workflows (cancellations, route changes)
Conclusion
Event-driven architecture enables building scalable, loosely coupled systems through asynchronous event communication. Key takeaways:
Core Concepts:
- Events are immutable facts about what happened
- Publishers and subscribers are decoupled
- Temporal decoupling for resilience
Patterns:
- Event sourcing stores all events (complete audit trail)
- CQRS separates read/write models (optimized for each)
- Choreography for loose coupling, orchestration for visibility
- Eventual consistency requires UX design
Technology:
- Kafka for high-throughput event streaming
- Event stores for event sourcing
- Message queues for pub/sub patterns
Trade-offs:
- Complexity vs scalability
- Eventual consistency vs strong consistency
- Debugging distributed workflows
Companies like Netflix and Uber demonstrate that EDA scales to billions of events when implemented correctly. Start simple with pub/sub, and adopt event sourcing and CQRS when the benefits outweigh the complexity.
Written by StaticBlock
StaticBlock is a technical writer and software engineer specializing in web development, performance optimization, and developer tooling.