Modern Data Stack Reference Architecture

End-to-end reference architecture for ingestion, dbt transformation, Snowflake warehouse, Airflow orchestration, and BI consumption — with workflow steps and ownership matrix.


Architecture Diagram

AWS reference layout with grouped regions, numbered flows, and official service icons.

Modern Data Stack on AWS: Ingestion → transform → warehouse → BI

Analytics pipeline (flows 1-7): Amazon AppFlow (SaaS / APIs), AWS DMS (database replication), Amazon S3 bronze (raw zone), AWS Glue transform (dbt / Spark), Amazon S3 silver (curated), Amazon Redshift marts (warehouse), Amazon QuickSight consumption (dashboards).
Orchestration layer: Amazon MWAA (Airflow DAGs), AWS Step Functions (workflows), Amazon EventBridge (scheduling).

Orchestrated by MWAA + Step Functions + EventBridge · Swap Redshift for Snowflake/BigQuery if needed

Code preview


Replace {{PLACEHOLDERS}} with your environment values, then deploy to your stack.

# Modern Data Stack Reference Architecture

> DE Architecture · Workflow template for {{ORGANIZATION_NAME}}

## Purpose

Reference architecture for a modern analytics stack: ingestion → warehouse (Snowflake/BigQuery) → transformation (dbt) → consumption (BI/ML), with Airflow orchestrating the pipeline.

## Architecture Diagram

```
┌─────────────┐    ┌──────────────┐    ┌─────────────┐    ┌──────────────┐
│   Sources   │───▶│  Ingestion   │───▶│  Raw/Bronze │───▶│ dbt Staging  │
│ SaaS, DBs   │    │ Fivetran/Air │    │  S3/BQ/SF   │    │   stg_*      │
└─────────────┘    └──────────────┘    └─────────────┘    └──────┬───────┘
                                                                 │
┌─────────────┐    ┌──────────────┐    ┌──────────────┐          │
│  BI / ML    │◀───│    Marts     │◀───│ dbt Int/Marts│◀─────────┘
│ Looker/Hex  │    │  fct_/dim_   │    │  int_*, marts│
└─────────────┘    └──────────────┘    └──────────────┘
                          ▲
                   ┌──────┴───────┐
                   │   Airflow    │
                   │  Scheduler   │
                   └──────────────┘
```

## Component Responsibilities

| Layer | Tool | Owner | SLA |
|-------|------|-------|-----|
| Ingestion | {{INGESTION_TOOL}} | {{INGESTION_TEAM}} | {{INGESTION_SLA}} |
| Storage | {{WAREHOUSE}} | {{PLATFORM_TEAM}} | 99.9% |
| Transform | dbt | {{ANALYTICS_ENG_TEAM}} | {{DBT_SLA}} |
| Orchestrate | Airflow | {{DE_TEAM}} | {{ORCHESTRATION_SLA}} |
| Consume | {{BI_TOOL}} | {{ANALYTICS_TEAM}} | Business hours |

## End-to-End Workflow

### Step 1 - Source onboarding
1. Register source in catalog with owner and classification
2. Configure connector with least-privilege credentials
3. Land raw data in `{{RAW_DATABASE}}.{{RAW_SCHEMA}}`
4. Validate row counts and schema against contract
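The validation in item 4 can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: `SourceContract`, `validate_landing`, and the shape of the contract (expected column types plus a minimum row count) are assumptions, and the column metadata would come from your warehouse's information schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceContract:
    """Hypothetical contract: expected columns/types and a minimum row count."""
    columns: dict[str, str]  # column name -> expected type name
    min_rows: int


def validate_landing(contract: SourceContract,
                     landed_columns: dict[str, str],
                     landed_rows: int) -> list[str]:
    """Return human-readable violations; an empty list means the load passes."""
    problems = []
    # Every contracted column must exist with the expected type.
    for name, expected in contract.columns.items():
        actual = landed_columns.get(name)
        if actual is None:
            problems.append(f"missing column: {name}")
        elif actual != expected:
            problems.append(f"type drift on {name}: expected {expected}, got {actual}")
    # Columns that landed but are not in the contract are reported, not dropped.
    for name in sorted(landed_columns.keys() - contract.columns.keys()):
        problems.append(f"unexpected column: {name}")
    if landed_rows < contract.min_rows:
        problems.append(f"row count {landed_rows} below minimum {contract.min_rows}")
    return problems
```

Returning a list of violations rather than raising on the first one lets the onboarding check report every contract breach in a single run.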

### Step 2 - Staging & modeling
1. Create `stg_{{ENTITY}}` with source-aligned cleaning
2. Build `int_{{ENTITY}}` for joins and business logic
3. Publish `fct_{{ENTITY}}` / `dim_{{ENTITY}}` marts
4. Run dbt tests (unique, not_null, relationships, custom DQ)
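The layer prefixes above can be enforced with a small lint step, for example in CI before dbt runs. The sketch below is one possible check; the regular expression (lowercase snake_case after the prefix) and the function name `classify_models` are assumptions, not part of dbt.

```python
import re

# Layer prefixes come from the workflow above; the snake_case rule is an assumption.
MODEL_NAME = re.compile(r"^(stg|int|fct|dim)_[a-z][a-z0-9_]*$")


def classify_models(model_names):
    """Group model names by layer and collect names that break the convention."""
    by_layer = {"stg": [], "int": [], "fct": [], "dim": []}
    invalid = []
    for name in model_names:
        match = MODEL_NAME.match(name)
        if match:
            by_layer[match.group(1)].append(name)
        else:
            invalid.append(name)
    return by_layer, invalid


by_layer, invalid = classify_models(["stg_orders", "fct_orders", "orders_final"])
print(invalid)
```

A CI job could fail the build whenever `invalid` is non-empty.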

### Step 3 - Orchestration
1. Airflow DAG starts once the ingestion-completion sensor succeeds
2. Run `dbt run`, then `dbt test`, from CI/CD or an Airflow BashOperator
3. On failure: page {{ONCALL_CHANNEL}}, block downstream marts
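The gating logic in this step can be sketched in plain Python, independent of Airflow. This is not Airflow code: `run_pipeline`, `shell_runner`, and the `ingestion_done` / `notify` callbacks are hypothetical names for illustration. In a real DAG the same shape would typically be a sensor task followed by dbt tasks, with an alerting callback on failure.

```python
import subprocess
from typing import Callable, Sequence

# `dbt run` and `dbt test` are real dbt CLI subcommands; flags such as
# targets or selectors are deliberately left out of this sketch.
DBT_STEPS: tuple[tuple[str, ...], ...] = (("dbt", "run"), ("dbt", "test"))


def shell_runner(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_pipeline(ingestion_done: Callable[[], bool],
                 notify: Callable[[str], None],
                 runner: Callable[[Sequence[str]], int] = shell_runner) -> bool:
    """Return True only if marts may be exposed to downstream consumers."""
    if not ingestion_done():
        # Mirrors the ingestion-completion sensor: nothing runs until data has landed.
        return False
    for cmd in DBT_STEPS:
        if runner(cmd) != 0:
            # Alert on-call and stop, so later steps never execute.
            notify(f"pipeline failed at: {' '.join(cmd)}")
            return False
    return True
```

Passing the runner and notifier as arguments keeps the control flow testable without a dbt project or a paging integration.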

### Step 4 - Consumption & feedback
1. Expose marts to BI semantic layer / metrics store
2. Track query adoption and dataset freshness SLAs
3. Review unused models quarterly for deprecation
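The quarterly review can start from a list of models with no recent queries. The sketch below assumes you can export a mapping of model name to last-query date from your warehouse's query history; the function name `deprecation_candidates` and the 90-day default are illustrative choices.

```python
from datetime import date, timedelta


def deprecation_candidates(last_queried, today, max_idle_days=90):
    """Return model names with no recorded query in the last `max_idle_days` days.

    `last_queried` maps model name -> date of the most recent query,
    or None if the model has never been queried.
    """
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(
        name for name, last in last_queried.items()
        if last is None or last < cutoff
    )


# Hypothetical usage data for illustration only.
usage = {
    "fct_orders": date(2026, 6, 30),
    "dim_region": date(2026, 1, 15),
    "fct_legacy": None,
}
print(deprecation_candidates(usage, today=date(2026, 7, 2)))
```

The output is a candidate list for human review, not an automatic drop list; a model queried rarely may still back a quarterly report.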

## Non-Functional Requirements

- **Security:** RBAC on warehouse; PII masked in marts
- **Cost:** Warehouse auto-suspend; incremental models by default
- **Governance:** All models documented in dbt + catalog
- **Observability:** Freshness checks on top 20 critical marts
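The observability requirement can be illustrated with a minimal freshness check. This sketch assumes you already collect the last load timestamp per mart (for example from a load-audit column); `stale_marts` and its argument names are hypothetical, and dbt's own source freshness feature may cover part of this in practice.

```python
from datetime import datetime, timedelta, timezone


def stale_marts(last_loaded, max_age, now=None):
    """Return {mart: age} for every mart whose latest load is older than its SLA.

    `last_loaded` maps mart name -> timezone-aware datetime of the last load.
    `max_age` maps mart name -> timedelta SLA; marts without an SLA are skipped.
    """
    now = now or datetime.now(timezone.utc)
    stale = {}
    for mart, loaded_at in last_loaded.items():
        sla = max_age.get(mart)
        if sla is not None and now - loaded_at > sla:
            stale[mart] = now - loaded_at
    return stale
```

Keeping the SLA map separate from the load timestamps makes it easy to limit checks to the critical marts while still recording loads for everything.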

## Implementation Checklist

- [ ] Define naming conventions (stg_, int_, fct_, dim_)
- [ ] Set up dev/staging/prod environments
- [ ] Configure dbt Cloud or CI pipeline
- [ ] Wire Airflow sensors to ingestion completion
- [ ] Establish on-call runbook for pipeline failures

## {{ORGANIZATION_NAME}} Customization Notes

Replace tool names, team owners, and SLAs above. Add domain-specific marts and lineage diagrams as appendices.

How to use this architecture

  • Use in architecture review meetings or RFC documents
  • Map each component to your cloud accounts, teams, and tools
  • Replace {{PLACEHOLDERS}} with environment-specific values
  • Extend workflow steps with your org's SLAs and governance gates
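Placeholder replacement can be scripted so that no `{{...}}` token is missed. The following is a minimal sketch: `render_template` is a hypothetical helper, and the pattern assumes placeholders are upper-case names with digits and underscores, as in the template above.

```python
import re

# Matches tokens such as {{WAREHOUSE}} or {{INGESTION_SLA}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z][A-Z0-9_]*)\}\}")


def render_template(text, values):
    """Substitute {{NAME}} tokens from `values`; leave unknown tokens in place.

    Returns the rendered text and a sorted list of unresolved placeholder names.
    """
    missing = set()

    def substitute(match):
        name = match.group(1)
        if name in values:
            return values[name]
        missing.add(name)
        return match.group(0)

    return PLACEHOLDER.sub(substitute, text), sorted(missing)


row = "| Storage | {{WAREHOUSE}} | {{PLATFORM_TEAM}} | 99.9% |"
rendered, missing = render_template(row, {"WAREHOUSE": "Snowflake"})
print(rendered)
print(missing)
```

Leaving unknown tokens untouched and reporting them makes a half-filled template obvious before it reaches an architecture review.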
Tags: modern data stack, dbt, airflow, snowflake, workflow
Updated: Jul 2, 2026