Data Architecture

Purpose

This document defines the data storage, caching, and messaging architecture for the Farmer1st platform.

Current State

Data Stores Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                          DATA ARCHITECTURE                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐  │
│   │                        APPLICATION LAYER                             │  │
│   │                         (Python APIs)                                │  │
│   └───────────┬─────────────────┬─────────────────┬─────────────────────┘  │
│               │                 │                 │                        │
│               ▼                 ▼                 ▼                        │
│   ┌───────────────┐ ┌───────────────────┐ ┌─────────────────────────────┐ │
│   │               │ │                   │ │                             │ │
│   │    Redis      │ │   PostgreSQL      │ │     MSK (Kafka)             │ │
│   │               │ │                   │ │                             │ │
│   │  • Cache      │ │  • Primary data   │ │  • Event streaming          │ │
│   │  • Sessions   │ │  • Transactional  │ │  • Async messaging          │ │
│   │  • Rate       │ │  • Relational     │ │  • Event sourcing           │ │
│   │    limiting   │ │  • ACID           │ │  • Service decoupling       │ │
│   │  • Pub/Sub    │ │                   │ │                             │ │
│   │               │ │                   │ │                             │ │
│   └───────────────┘ └───────────────────┘ └─────────────────────────────┘ │
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐  │
│   │                          TEMPORAL                                    │  │
│   │                                                                      │  │
│   │  • Workflow state persistence                                        │  │
│   │  • Activity history                                                  │  │
│   │  • Scheduled tasks                                                   │  │
│   │  • Long-running process coordination                                 │  │
│   │                                                                      │  │
│   │  (Uses its own PostgreSQL or Cassandra for persistence)              │  │
│   └─────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Decisions and Rationale

PostgreSQL - Primary Database

Decision	Rationale
PostgreSQL for primary data	ACID compliance, complex queries, relational integrity, proven scale
AWS RDS or Aurora	Managed service, automated backups, read replicas for scale

Use Cases:
- User accounts and profiles
- Farmer data and records
- Stakeholder data
- Application metadata
- SuperTokens auth data
- Transactional data

Redis - Cache Layer

Decision	Rationale
Redis for caching	Sub-millisecond latency, versatile data structures, proven reliability
AWS ElastiCache or self-managed	TBD based on operational preference

Use Cases:
- API response caching
- Session storage (backup to JWT)
- Rate limiting counters
- Real-time leaderboards/metrics
- Pub/Sub for real-time features
- Distributed locks
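
As an illustration of the rate limiting use case, here is a minimal fixed-window counter sketch. It is not the platform's implementation: the class name, key format, and limits are assumptions, and a plain dict stands in for Redis so the example runs on its own. With Redis, the counter update would typically be an INCR on the key plus an EXPIRE when the key is first created.

```python
import time


class FixedWindowRateLimiter:
    """Fixed-window counter. The dict stands in for Redis keys with a TTL."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.counters = {}  # key -> request count for that window

    def allow(self, client_id):
        # One counter per client per window (hypothetical key format).
        window = int(self.clock() // self.window_seconds)
        key = f"rate:{client_id}:{window}"
        # With Redis this would be INCR on the key plus EXPIRE on first use.
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= self.limit


# Usage with a fake clock so the behaviour is deterministic.
now = [1000.0]
limiter = FixedWindowRateLimiter(limit=2, window_seconds=60, clock=lambda: now[0])
results = [limiter.allow("alice") for _ in range(3)]  # third call exceeds the limit
now[0] += 60  # move into the next window
after_reset = limiter.allow("alice")
```

Unlike Redis keys with a TTL, the dict never drops old windows, so this stand-in is only suitable for illustration.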

Kafka (MSK) - Event Streaming

Decision	Rationale
Kafka for messaging	Durable, replayable, high throughput, event sourcing capability
AWS MSK (Managed)	Reduced operational burden, AWS integration

Use Cases:
- Async processing (notifications, reports)
- Event sourcing for audit trails
- Service-to-service communication
- Data pipeline ingestion
- Analytics event streaming
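
To make the replay and event sourcing idea concrete, the sketch below rebuilds state by re-reading an append-only log. A Python list stands in for a single Kafka topic partition, and the event types (`farmer_registered`, `farmer_updated`) and payload fields are hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Append-only event log; a list stands in for a Kafka topic partition.
event_log = []


def publish(event_type, farmer_id, payload):
    """Append an event and return its offset (its position in the log)."""
    event_log.append({"type": event_type, "farmer_id": farmer_id, "payload": payload})
    return len(event_log) - 1


def replay(from_offset=0):
    """Rebuild per-farmer state by re-reading events from a given offset."""
    state = defaultdict(dict)
    for event in event_log[from_offset:]:
        if event["type"] == "farmer_registered":
            state[event["farmer_id"]] = dict(event["payload"])
        elif event["type"] == "farmer_updated":
            state[event["farmer_id"]].update(event["payload"])
    return dict(state)


publish("farmer_registered", "f-1", {"name": "Amina", "region": "north"})
publish("farmer_updated", "f-1", {"region": "east"})
current = replay()
```

Because the log is never mutated, the same events can serve as an audit trail and as the source for rebuilding a read model after a bug fix.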

Temporal - Workflow Orchestration

Decision	Rationale
Temporal for workflows	Durable execution, built-in retries, complex coordination, visibility

Use Cases:
- Multi-step farmer onboarding
- Complex approval workflows
- Scheduled batch processing
- Long-running data processing
- Saga patterns for distributed transactions
- Retry logic for external integrations
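
The saga pattern listed above can be sketched in plain Python as a list of (action, compensation) pairs. This is a simplified illustration rather than Temporal SDK code: in Temporal, each action and compensation would typically be an activity invoked from a workflow. The step names are hypothetical.

```python
def run_saga(steps):
    """Run (action, compensation) pairs; undo completed steps on failure."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Roll back in reverse order, newest step first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True


log = []


def create_profile():
    log.append("profile created")


def delete_profile():
    log.append("profile deleted")


def verify_land_record():
    raise RuntimeError("land registry unavailable")


def noop():
    pass


ok = run_saga([(create_profile, delete_profile), (verify_land_record, noop)])
```

Here the second step fails, so the first step's compensation runs and the saga reports failure.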

Data Flow Patterns

Synchronous (Simple Requests)

Client → API → PostgreSQL/Redis → Response

Asynchronous (Background Processing)

Client → API → Kafka (publish) → Response (accepted)
                    ↓
              Worker (consume) → Process → PostgreSQL
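
A minimal sketch of this asynchronous flow, assuming a `queue.Queue` as a stand-in for the Kafka topic and a dict as a stand-in for PostgreSQL. The handler name, message fields, and the `None` stop sentinel are illustrative assumptions, not part of the platform's API.

```python
import queue
import threading
import uuid

topic = queue.Queue()   # stand-in for a Kafka topic
database = {}           # stand-in for PostgreSQL


def handle_request(report_type):
    """API handler: publish a job and return immediately."""
    job_id = str(uuid.uuid4())
    topic.put({"job_id": job_id, "report_type": report_type})
    return {"status": "accepted", "job_id": job_id}


def worker():
    """Consume messages until a None sentinel arrives, then stop."""
    while True:
        message = topic.get()
        if message is None:
            break
        database[message["job_id"]] = f"{message['report_type']} report ready"


thread = threading.Thread(target=worker)
thread.start()
response = handle_request("harvest")
topic.put(None)   # tell the worker to stop
thread.join()
```

The API returns the job ID before any processing happens; the client would use that ID later to look up the result.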

Complex Workflow

Client → API → Temporal (start workflow) → Response (workflow ID)
                    ↓
              Temporal orchestrates:
                → Activity 1 (API call)
                → Activity 2 (DB update)
                → Activity 3 (Kafka publish)
                → Activity 4 (External service)
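
The value of orchestrating these activities through a workflow engine is that completed steps are recorded, so a retry resumes where the previous attempt stopped. The sketch below illustrates that idea in plain Python with a dict as the recorded history. It is a deliberately simplified model, not the Temporal SDK, and the activity names are hypothetical.

```python
def run_workflow(activities, history):
    """Run activities in order, skipping any already recorded in history."""
    for name, activity in activities:
        if name in history:
            continue  # completed in an earlier run; reuse the recorded result
        history[name] = activity()
    return history


calls = []
attempts = {"publish_event": 0}


def update_db():
    calls.append("update_db")
    return "row saved"


def publish_event():
    calls.append("publish_event")
    attempts["publish_event"] += 1
    if attempts["publish_event"] == 1:
        raise ConnectionError("broker unreachable")  # fails on the first attempt
    return "event sent"


activities = [("update_db", update_db), ("publish_event", publish_event)]
history = {}  # stand-in for the workflow's persisted activity history

try:
    run_workflow(activities, history)   # first run stops at publish_event
except ConnectionError:
    pass
run_workflow(activities, history)       # resume: update_db is not repeated
```

After the second run, `update_db` has executed once and `publish_event` twice, which is the behaviour that makes retries safe for steps with side effects.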

Trade-offs Considered

PostgreSQL vs MongoDB

PostgreSQL	MongoDB
✅ ACID transactions	✅ Flexible schema
✅ Complex joins	❌ Limited joins
✅ Mature ecosystem	✅ Document model
⚠️ Schema migrations needed	⚠️ Eventual consistency on secondary reads

Decision: PostgreSQL, because data integrity and relational queries are important for agricultural data.

Kafka vs SQS/SNS

Kafka (MSK)	SQS/SNS
✅ Event replay	❌ No replay
✅ Ordering guarantees (per partition)	⚠️ Ordering only with FIFO queues, which have throughput limits
✅ High throughput	✅ Simpler
❌ More complex	✅ Fully managed

Decision: Kafka for event replay and future event sourcing patterns.

Temporal vs Step Functions

Temporal	Step Functions
✅ Code-based workflows	⚠️ JSON/YAML definitions
✅ Portable (not AWS-locked)	❌ AWS only
✅ Complex logic support	⚠️ Limited expressiveness
❌ Self-managed (unless Temporal Cloud is chosen)	✅ Fully managed

Decision: Temporal for code-based workflows and portability.

Caching Strategy

Cache-Aside Pattern (Primary)

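import json

# Assumed to exist elsewhere: `redis_client` (a redis.Redis instance),
# `db` (a SQLAlchemy session), the `Farmer` model, and `Farmer.to_dict()`.
FARMER_TTL_SECONDS = 300  # assumed TTL; tune per data type

def get_farmer(farmer_id):
    key = f"farmer:{farmer_id}"

    # 1. Check cache (Redis stores bytes/strings, so deserialize on a hit)
    cached = redis_client.get(key)
    if cached is not None:
        return json.loads(cached)

    # 2. Cache miss: query database
    farmer = db.query(Farmer).get(farmer_id)
    if farmer is None:
        return None  # do not cache missing records

    # 3. Populate cache with a JSON-serialized copy and a TTL
    data = farmer.to_dict()
    redis_client.setex(key, FARMER_TTL_SECONDS, json.dumps(data))

    return data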

Cache Invalidation

TTL-based expiration (default)
Event-driven invalidation via Kafka for critical data
Write-through for frequently accessed data
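
The three invalidation approaches above can be sketched together. This is an illustration under stated assumptions: dicts stand in for Redis and PostgreSQL, the 300-second TTL is an arbitrary default, and `on_change_event` represents a hypothetical Kafka consumer callback whose event carries the cache key to drop.

```python
import time

cache = {}      # key -> (value, expires_at); stand-in for Redis
database = {}   # stand-in for PostgreSQL
TTL_SECONDS = 300  # assumed default TTL


def cache_get(key, now=None):
    """Return a cached value, treating expired entries as misses."""
    now = time.time() if now is None else now
    entry = cache.get(key)
    if entry is None or entry[1] <= now:
        cache.pop(key, None)
        return None
    return entry[0]


def write_through(key, value, now=None):
    """Write to the database and the cache in the same operation."""
    now = time.time() if now is None else now
    database[key] = value
    cache[key] = (value, now + TTL_SECONDS)


def on_change_event(event):
    """Event-driven invalidation: drop the key named in a change event."""
    cache.pop(event["key"], None)


write_through("farmer:1", {"name": "Amina"}, now=0)
hit = cache_get("farmer:1", now=10)
expired = cache_get("farmer:1", now=TTL_SECONDS + 1)
write_through("farmer:2", {"name": "Ben"}, now=0)
on_change_event({"key": "farmer:2"})
invalidated = cache_get("farmer:2", now=10)
```

The explicit `now` parameter exists only to make expiry testable without waiting; Redis handles expiry itself when a TTL is set on the key.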

Open Questions

PostgreSQL: RDS vs Aurora?
Redis: ElastiCache vs self-managed?
Temporal: Self-hosted vs Temporal Cloud?
Data retention policies?
Backup and disaster recovery strategy?
Data partitioning strategy for scale?
Read replica strategy for global distribution?

Dependencies

AWS infrastructure (see 03-infrastructure-architecture.md)
Network connectivity via Cloudflare Tunnel

Last Updated: 2025-12-25