Modern Data Stack Reference Architecture

End-to-end reference architecture covering ingestion, dbt transformation, a Snowflake warehouse, Airflow orchestration, and BI consumption, with workflow steps and an ownership matrix.

Data Engineering Architectures · Intermediate · Workflow Template

Architecture Diagram

[Figure: Modern Data Stack on AWS (ingestion → transform → warehouse → BI). AWS reference layout with grouped regions and numbered flows, orchestrated by MWAA, Step Functions, and EventBridge. Redshift can be swapped for Snowflake or BigQuery if needed.]

Replace {{PLACEHOLDERS}} with your environment values, then deploy to your stack.

# Modern Data Stack Reference Architecture

> DE Architecture · Workflow template for {{ORGANIZATION_NAME}}

## Purpose

Reference architecture for a modern analytics stack: ingestion → raw landing in the warehouse (Snowflake/BigQuery) → transformation (dbt) → consumption (BI/ML), with Airflow orchestrating each stage.

## Architecture Diagram

```
┌─────────────┐    ┌──────────────┐    ┌─────────────┐    ┌──────────────┐
│   Sources   │───▶│  Ingestion   │───▶│  Raw/Bronze │───▶│ dbt Staging  │
│ SaaS, DBs   │    │ Fivetran/Air │    │  S3/BQ/SF   │    │   stg_*      │
└─────────────┘    └──────────────┘    └─────────────┘    └──────┬───────┘
                                                                  │
┌─────────────┐    ┌──────────────┐    ┌─────────────┐           │
│  BI / ML    │◀───│    Marts     │◀───│dbt Int/Marts│◀──────────┘
│ Looker/Hex  │    │  fct_/dim_   │    │ int_*, marts│
└─────────────┘    └──────────────┘    └─────────────┘
                          ▲
                   ┌──────┴───────┐
                   │   Airflow    │
                   │  Scheduler   │
                   └──────────────┘
```

## Component Responsibilities

| Layer | Tool | Owner | SLA |
|-------|------|-------|-----|
| Ingestion | {{INGESTION_TOOL}} | {{INGESTION_TEAM}} | {{INGESTION_SLA}} |
| Storage | {{WAREHOUSE}} | {{PLATFORM_TEAM}} | 99.9% |
| Transform | dbt | {{ANALYTICS_ENG_TEAM}} | {{DBT_SLA}} |
| Orchestrate | Airflow | {{DE_TEAM}} | {{ORCHESTRATION_SLA}} |
| Consume | {{BI_TOOL}} | {{ANALYTICS_TEAM}} | Business hours |

## End-to-End Workflow

### Step 1 - Source onboarding
1. Register source in catalog with owner and classification
2. Configure connector with least-privilege credentials
3. Land raw data in `{{RAW_DATABASE}}.{{RAW_SCHEMA}}`
4. Validate row counts and schema against contract
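
Item 4 above can be sketched as a small check that compares a landed batch against a data contract. This is a minimal sketch rather than a prescribed implementation: the `Contract` shape, the `validate_batch` helper, and the 1% row-count tolerance are all hypothetical, and the observed columns and row counts are assumed to come from the warehouse's information schema and the connector's sync log.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Hypothetical data contract for one raw table."""
    columns: dict[str, str]            # column name -> expected type
    row_count_tolerance: float = 0.01  # allowed relative drift vs. the source


def validate_batch(contract: Contract, observed_columns: dict[str, str],
                   source_rows: int, landed_rows: int) -> list[str]:
    """Return human-readable violations; an empty list means the batch passes."""
    problems = []
    # Schema: every contracted column must exist with the expected type.
    for name, expected in contract.columns.items():
        actual = observed_columns.get(name)
        if actual is None:
            problems.append(f"missing column: {name}")
        elif actual.upper() != expected.upper():
            problems.append(
                f"type mismatch on {name}: expected {expected}, got {actual}")
    # Row counts: compare landed rows with what the source reported.
    if source_rows == 0:
        if landed_rows != 0:
            problems.append("rows landed although the source reported none")
    else:
        drift = abs(landed_rows - source_rows) / source_rows
        if drift > contract.row_count_tolerance:
            problems.append(f"row count drift {drift:.1%} exceeds tolerance")
    return problems
```

Extra columns in the landed table are deliberately not flagged here, so additive schema changes at the source do not fail the check; tighten this if your contracts are strict.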

### Step 2 - Staging & modeling
1. Create `stg_{{ENTITY}}` with source-aligned cleaning
2. Build `int_{{ENTITY}}` for joins and business logic
3. Publish `fct_{{ENTITY}}` / `dim_{{ENTITY}}` marts
4. Run dbt tests (unique, not_null, relationships, custom DQ)
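
The layering in this step implies a dependency direction: staging feeds intermediate models, which feed marts. A minimal sketch of a lint for that rule follows. The `layer_of` and `find_layer_violations` helpers and the input shape (model name mapped to its upstream models) are hypothetical; in practice the dependency map could be derived from dbt's `manifest.json` artifact.

```python
from __future__ import annotations

# Layer rank by name prefix; a model may depend only on the same or a lower rank.
LAYER_RANK = {"stg_": 0, "int_": 1, "fct_": 2, "dim_": 2}


def layer_of(model: str) -> int | None:
    """Return the layer rank for a model name, or None if it has no known prefix."""
    for prefix, rank in LAYER_RANK.items():
        if model.startswith(prefix):
            return rank
    return None


def find_layer_violations(deps: dict[str, list[str]]) -> list[str]:
    """Flag unprefixed models and models that depend on a higher layer."""
    violations = []
    for model, parents in sorted(deps.items()):
        rank = layer_of(model)
        if rank is None:
            violations.append(f"{model}: name lacks a stg_/int_/fct_/dim_ prefix")
            continue
        for parent in parents:
            parent_rank = layer_of(parent)
            # Parents without a known prefix (e.g. raw sources) are ignored.
            if parent_rank is not None and parent_rank > rank:
                violations.append(
                    f"{model}: depends on higher layer model {parent}")
    return violations
```

This sketch allows a mart to read directly from staging; if every mart must go through an `int_` model, compare ranks for an exact step instead.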

### Step 3 - Orchestration
1. Airflow DAG starts once a sensor confirms that ingestion has completed
2. Run `dbt run` and `dbt test` from CI/CD or an Airflow `BashOperator`
3. On failure: page {{ONCALL_CHANNEL}} and block downstream marts
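
The failure rule in item 3 amounts to fail-fast gating: a stage runs only if every earlier stage succeeded. The sketch below models that in plain Python rather than as an Airflow DAG, so it can be read without an Airflow installation. `run_pipeline`, `shell_runner`, the stage names, and the `notify` callback are hypothetical, and the `--select` arguments are illustrative. In a real deployment each stage would typically be an Airflow task and `notify` a failure callback that pages {{ONCALL_CHANNEL}}.

```python
from __future__ import annotations

import subprocess
from typing import Callable, Sequence

# Ordered stages; the selector values are placeholders for your project layout.
STAGES: list[tuple[str, list[str]]] = [
    ("run_staging", ["dbt", "run", "--select", "staging"]),
    ("test_staging", ["dbt", "test", "--select", "staging"]),
    ("run_marts", ["dbt", "run", "--select", "marts"]),
    ("test_marts", ["dbt", "test", "--select", "marts"]),
]


def shell_runner(cmd: Sequence[str]) -> bool:
    """Run a command and report success; requires dbt on PATH."""
    return subprocess.run(list(cmd), check=False).returncode == 0


def run_pipeline(stages: Sequence[tuple[str, list[str]]],
                 runner: Callable[[Sequence[str]], bool],
                 notify: Callable[[str], None]) -> dict[str, str]:
    """Run stages in order, stop at the first failure, and report each status."""
    status: dict[str, str] = {}
    failed = False
    for name, cmd in stages:
        if failed:
            status[name] = "blocked"  # downstream work never starts
            continue
        if runner(cmd):
            status[name] = "success"
        else:
            status[name] = "failed"
            failed = True
            notify(name)  # page on-call exactly once, for the failing stage
    return status
```

Passing the runner as an argument keeps the gating logic testable: `run_pipeline(STAGES, shell_runner, print)` would execute dbt for real, while a stub runner exercises the failure path without a warehouse.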

### Step 4 - Consumption & feedback
1. Expose marts to BI semantic layer / metrics store
2. Track query adoption and dataset freshness SLAs
3. Review unused models quarterly for deprecation

## Non-Functional Requirements

- **Security:** RBAC on warehouse; PII masked in marts
- **Cost:** Warehouse auto-suspend enabled; models incremental by default
- **Governance:** All models documented in dbt + catalog
- **Observability:** Freshness checks on the 20 most critical marts
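
The observability requirement can be sketched as a per-mart freshness check. This is a minimal sketch under stated assumptions: `stale_marts` is a hypothetical helper, the SLA values are placeholders, and the last-load timestamps are assumed to be collected separately, for example from a load-timestamp column on each mart.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def stale_marts(last_loaded: dict[str, datetime],
                slas: dict[str, timedelta],
                now: datetime) -> dict[str, timedelta | None]:
    """Return {mart: lag} for marts whose lag exceeds their freshness SLA.

    A lag of None means no load timestamp was found for that mart at all.
    """
    stale: dict[str, timedelta | None] = {}
    for mart, max_lag in slas.items():
        loaded = last_loaded.get(mart)
        lag = None if loaded is None else now - loaded
        if lag is None or lag > max_lag:
            stale[mart] = lag
    return stale
```

Only marts listed in `slas` are checked, which matches the requirement to cover a defined set of critical marts rather than every model.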

## Implementation Checklist

- [ ] Define naming conventions (stg_, int_, fct_, dim_)
- [ ] Set up dev/staging/prod environments
- [ ] Configure dbt Cloud or CI pipeline
- [ ] Wire Airflow sensors to ingestion completion
- [ ] Establish on-call runbook for pipeline failures

## {{ORGANIZATION_NAME}} Customization Notes

Replace tool names, team owners, and SLAs above. Add domain-specific marts and lineage diagrams as appendices.
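
Filling in the template can be scripted so that no placeholder is silently left behind. A minimal sketch follows, assuming the `{{NAME}}` syntax used throughout this document; `render_template` and the sample values are hypothetical.

```python
from __future__ import annotations

import re

# Matches tokens such as {{WAREHOUSE}}: upper-case letters and underscores only.
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def render_template(text: str, values: dict[str, str]) -> tuple[str, list[str]]:
    """Substitute {{NAME}} tokens; return (rendered_text, missing_names)."""
    missing: list[str] = []

    def substitute(match: re.Match[str]) -> str:
        name = match.group(1)
        if name in values:
            return values[name]
        if name not in missing:
            missing.append(name)
        return match.group(0)  # leave unknown placeholders visible in the output

    return PLACEHOLDER.sub(substitute, text), missing


row = "| Storage | {{WAREHOUSE}} | {{PLATFORM_TEAM}} | 99.9% |"
rendered, missing = render_template(row, {"WAREHOUSE": "Snowflake"})
```

Returning the list of missing names lets a CI step fail the build while any placeholder remains unfilled.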

How to use this architecture

- Use in architecture review meetings or RFC documents
- Map each component to your cloud accounts, teams, and tools
- Replace {{PLACEHOLDERS}} with environment-specific values
- Extend workflow steps with your org's SLAs and governance gates

Updated: Jul 2, 2026