Streaming Kappa Architecture

Real-time Kappa-pattern architecture with Kafka/Kinesis, stream processing, serving layer, and operational patterns for low-latency analytics.

Data Engineering ArchitecturesAdvancedWorkflow Template

Architecture Diagram

AWS reference layout with grouped regions, numbered flows, and official service icons.

Streaming Kappa Architecture on AWSSingle stream path with no separate batch layer
Event ProducersStream BusStream ProcessingServing LayerSchema & Error Handling1234567archivepoison eventsAWS LambdaApp eventsCDCAmazon RDSChange data captureAmazon MSKKafka topicsAmazon KinesisData streamsWindow + aggregateManaged FlinkAWS LambdaMicro-batchOLAP queriesAmazon OpenSearchSub-ms cacheAmazon ElastiCacheArchiveAmazon S3Replay / complianceSchema RegistryAWS GlueDead letter queueAmazon S3

Produce → validate schema → process → serve (OpenSearch/ElastiCache) → optional S3 archive for replay

Code preview

54 lines

Replace {{PLACEHOLDERS}} with your environment values, then deploy to your stack.

# Streaming Lambda Architecture (Kappa Pattern)

> DE Architecture · {{ORGANIZATION_NAME}}

## Overview

Real-time analytics architecture using Kappa pattern: single stream processing path, no separate batch layer for speed-critical use cases.

## Architecture

```
 Producers                Stream Bus              Processing              Serving
┌──────────┐            ┌──────────┐            ┌──────────┐            ┌──────────┐
│ Apps/IoT │──events──▶ │ {{KAFKA}} │──consume──▶│ {{FLINK}} │──sink────▶│ {{OLAP}}  │
│ DB CDC   │            │ /Kinesis │            │ /Spark SS│            │ /Cache   │
└──────────┘            └──────────┘            └──────────┘            └──────────┘
                               │                       │
                               ▼                       ▼
                         Schema Registry         Dead Letter Queue
                         ({{REGISTRY}})          s3://{{DLQ_PATH}}
```

## Event Workflow

1. **Produce** - Events keyed by `{{PARTITION_KEY}}` with Avro/Protobuf schema
2. **Validate** - Schema registry enforces compatibility (BACKWARD)
3. **Process** - Stream job: parse → enrich → window → aggregate
4. **Serve** - Push to Pinot/Druid/Redis for sub-second queries
5. **Archive** - Optional compacted topic → S3 for replay/compliance

## Windowing & Aggregation Example

- Tumbling 5-minute windows for transaction counts
- Session windows for user journey analytics
- Late data handling: allowed lateness {{LATENESS_MINUTES}} min

## Operational Concerns

| Concern | Pattern |
|---------|---------|
| Exactly-once | Idempotent sinks + transactional offsets |
| Backfill | Reset offsets or replay from S3 archive |
| Schema change | Registry review + dual-write migration |
| Monitoring | Lag, throughput, error rate dashboards |

## When to Use Kappa vs Lambda

- **Kappa:** Low-latency dashboards, fraud detection, operational metrics
- **Lambda (batch+stream):** Complex historical corrections, heavy transforms

## {{ORGANIZATION_NAME}} Stack

Replace {{KAFKA}}, {{FLINK}}, {{OLAP}} with your streaming stack (MSK, Flink, ClickHouse, etc.).

How to use this architecture

  • Use in architecture review meetings or RFC documents
  • Map each component to your cloud accounts, teams, and tools
  • Replace {{PLACEHOLDERS}} with environment-specific values
  • Extend workflow steps with your org's SLAs and governance gates
streamingkappakafkaflinkreal-time
Downloads41
UpdatedJul 2, 2026
Login to share feedback