# Online Movie Ticket Booking Platform (Backend)

Solution Architecture

<Callout type="note">
This post focuses on **backend system design only** (no UI). It covers architecture, service boundaries, booking correctness, scalability, observability, deployment, and API contracts for one read + one write scenario.
</Callout>

---

## 1. System Overview

### 1.1 Problem Statement

The objective is to design and implement a scalable, highly available **Online Movie Ticket Booking Platform** that serves both theatre partners and end customers, while supporting:

- High-volume concurrent bookings  
- Multiple cities and languages  
- Secure payment processing  

The platform must:

- Allow theatre partners to onboard, manage theatres, screens, seat layouts, and show schedules.
- Enable customers to browse movies, theatres, and show timings across cities and dates.
- Support real-time seat availability, ticket booking, payments, and notifications.
- Scale reliably during peak demand (e.g., movie releases, weekends).
- Ensure data consistency, prevent double booking, and meet stringent non-functional requirements such as **99.99% availability**, security, and compliance.

The solution is expected to be **service-oriented, cloud-native, and production-grade**, focusing on backend services only.

---

### 1.2 Key Stakeholders

- **Customers (B2C)**  
  Browse movies and showtimes, select seats, make payments, and receive booking confirmations.

- **Theatre Partners (B2B)**  
  Onboard theatres, define screens and seat maps, create/update shows, and manage availability.

- **Platform Operations Team**  
  Monitor system health, handle deployments, manage configurations, and ensure SLA compliance.

- **External Systems**  
  Payment gateways, notification providers (SMS/Email/WhatsApp), and analytics platforms.

---

### 1.3 Core Functional Capabilities

#### Theatre & Partner Management
- Partner onboarding and authentication  
- Theatre, screen, and seat-map configuration  
- Show creation, updates, and cancellations  

#### Movie Discovery & Browsing
- Browse movies by city, language, genre, and date  
- View theatres currently running a selected movie  
- Retrieve show timings and seat availability  

#### Ticket Booking & Payments
- Seat selection with real-time availability  
- Temporary seat locking to avoid double booking  
- Secure payment initiation and confirmation  
- Booking confirmation and ticket issuance  

#### Offers & Pricing
- Rule-based promotional offers (e.g., third ticket at 50% discount)  
- Flexible pricing strategies configurable per city or theatre  

#### Notifications
- Booking confirmations, cancellations, and refunds  
- Asynchronous notification delivery via messaging systems  

---

### 1.4 Key Non-Functional Requirements

- **High Availability:**  
  Achieve **99.99% uptime**, especially during peak booking windows.

- **Scalability:**  
  Horizontal scalability for spikes in read traffic (browsing) and write traffic (bookings).

- **Consistency & Concurrency Control:**  
  Strong consistency for seat booking to prevent overselling and race conditions.

- **Performance:**  
  Low-latency responses for browsing and booking flows.

- **Security & Compliance:**  
  Protection against OWASP Top 10, secure handling of PII, and compliant payment processing.

- **Extensibility:**  
  Easy onboarding of new cities, theatres, languages, and future features like recommendations or dynamic pricing.

---

### 1.5 In-Scope and Out-of-Scope

#### In Scope
- Backend service design and implementation  
- API design for **one read scenario** (browse shows)  
- API design for **one write scenario** (ticket booking)  
- Database schema design  
- Concurrency, transactions, and failure handling  
- High-level architecture and deployment strategy  

#### Out of Scope
- UI / frontend implementation  
- Recommendation engines or advanced AI models  
- Detailed financial settlement with theatre partners  
- Manual customer support workflows  

---

### 1.6 Solution Approach Summary

The proposed solution adopts a **microservices-based architecture** with clear separation of concerns across catalog, search, booking, payment, and notification domains.

It leverages **event-driven communication**, **caching**, and **search indexing** to achieve scalability and resilience, while ensuring strong transactional guarantees for seat booking and payments.

---

## 2. Architecture Diagram

### 2.1 High-Level Architecture

The platform is designed as a cloud-native microservices system optimized for:

- **Read-heavy browsing** (movie discovery, theatres, show timings) using **Search + Cache**
- **Write-critical booking** (seat selection → lock → payment → confirmation) using **strong consistency + idempotency**
- **High availability (99.99%)** via stateless services, horizontal scaling, and multi-AZ persistence
- **Loose coupling** via an event-driven backbone for non-blocking workflows like notifications, analytics, and booking transitions


---

### 2.2 Component Responsibilities

#### API Gateway
- Central entry point for customers and partners  
- Authentication/authorization, request validation, rate limiting, WAF rules, routing  
- Supports versioned APIs and throttling during peak traffic  

#### Browse Path (Read Optimized)
Search Service + Cache handle most browse traffic:
- Search service provides low-latency browse APIs (movie → city → theatres → shows)
- Uses OpenSearch/Elasticsearch index populated via events from Catalog/Show services
- Redis caches hot keys (popular movies, cities, show listings) to reduce DB load

#### Booking Path (Write / Consistency Critical)
Booking Service owns:
- Seat selection validation and seat lock with TTL
- Booking state machine: `INITIATED → SEATS_LOCKED → PAYMENT_PENDING → CONFIRMED`, with `FAILED`, `EXPIRED`, and `CANCELLED` as exit states (full lifecycle in Section 4.2)
- Double-booking prevention via two layers:
  - Redis seat locks (fast, TTL-based) as the first line of defense
  - DB-level constraints/transactions during confirmation (final authority)

#### Payment Processing
Payment Service integrates with external gateways:
- Creates payment intent for a booking (idempotent)
- Handles gateway webhooks (signature verification)
- Publishes payment events (`payment.succeeded`, `payment.failed`) to Kafka

#### Offers & Pricing
Offer/Pricing Service:
- Centralized discount rules (e.g., “3rd ticket 50%”)
- Pluggable rule execution (Strategy pattern)
- Ensures consistent pricing during booking creation/confirmation
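
A minimal Python sketch of how pluggable discount rules could look under the Strategy pattern. The names `OfferRule`, `ThirdTicketHalfOff`, and `calculate_total` are hypothetical, and the rule interpretation (only the third ticket in an order gets 50% off, and rules stack additively) is an assumption made for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Protocol, Sequence


class OfferRule(Protocol):
    """Strategy interface: each rule returns the discount it contributes."""

    def discount(self, ticket_prices: Sequence[Decimal]) -> Decimal: ...


@dataclass(frozen=True)
class ThirdTicketHalfOff:
    """Hypothetical rule: the third ticket in the order is 50% off."""

    def discount(self, ticket_prices: Sequence[Decimal]) -> Decimal:
        if len(ticket_prices) < 3:
            return Decimal("0")
        return ticket_prices[2] / 2


def calculate_total(ticket_prices: Sequence[Decimal], rules: Sequence[OfferRule]) -> Decimal:
    subtotal = sum(ticket_prices, Decimal("0"))
    # Assumption: discounts from all rules are added together.
    total_discount = sum((rule.discount(ticket_prices) for rule in rules), Decimal("0"))
    # Never return a negative total, whatever the rules say.
    return max(subtotal - total_discount, Decimal("0"))
```

Adding a new promotion then means adding a new rule class, without touching `calculate_total` or the Booking Service.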

#### Notifications
Notification Service:
- Asynchronous confirmation/cancellation messages triggered by booking/payment events
- Multi-channel (SMS/Email/WhatsApp) with retries and DLQ

---

### 2.3 Key Data and Event Flows

#### Read Scenario: Browse Shows (Fast, Scalable)
1. Client calls Search APIs via Gateway (e.g., movie + city + date)
2. Search service fetches from OpenSearch + Redis cache
3. Index and cache freshness is maintained asynchronously by show-update events consumed from Kafka
4. Returns theatre list + show timings + availability summary

Design goal: keep browsing highly available even under heavy load.
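
The browse read path can be sketched as a cache-aside lookup. This Python example is illustrative only: `TtlCache` is an in-memory stand-in for Redis, `search_index` is any callable standing in for the OpenSearch query, and the key format and 30-second TTL are assumptions.

```python
import time
from typing import Callable


class TtlCache:
    """In-memory stand-in for Redis, used only to illustrate the flow."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: dict = {}

    def get(self, key: str):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # lazily drop expired entries
            return None
        return value

    def set(self, key: str, value, ttl_seconds: float) -> None:
        self._entries[key] = (self._clock() + ttl_seconds, value)


def browse_shows(cache, search_index, city_id, movie_id, date, ttl_seconds=30):
    key = f"shows:{city_id}:{movie_id}:{date}"
    cached = cache.get(key)
    if cached is not None:
        return cached  # cache hit: no search query needed
    result = search_index(city_id, movie_id, date)
    cache.set(key, result, ttl_seconds)  # short TTL bounds staleness
    return result
```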

#### Write Scenario: Book Tickets (Correctness First)
1. Client requests booking with `showId + seatIds`
2. Booking service locks seats (Redis TTL) and creates the booking (`SEATS_LOCKED`)
3. Pricing is calculated via the Offer/Pricing service; the booking moves to `PAYMENT_PENDING`
4. Client initiates payment → Payment service creates payment intent
5. Gateway webhook triggers payment status update
6. Payment service publishes payment event to Kafka
7. Booking service consumes event and finalizes booking (`CONFIRMED`) or releases locks (`FAILED`)

Design goal: no double-booking, idempotent retries, consistent final state.

---

### 2.4 Architecture Decisions (Why This Works)

- **Read/Write separation:** browsing uses Search + Cache; booking uses strong consistency  
- **Event-driven workflows:** decouple booking, payment, notifications, indexing  
- **Idempotency everywhere:** protects against retries/timeouts in booking/payment  
- **Horizontal scalability:** stateless services scale behind load balancers  
- **High availability:** multi-AZ, retries/backoff, circuit breakers, graceful degradation  

---

## 3. Component Descriptions

This section describes major components, responsibilities, data ownership, key APIs, and interactions.

### 3.1 Core Application Services

#### 3.1.1 API Gateway

| Aspect | Description |
|---|---|
| Purpose | Single entry point for external client and partner requests |
| Responsibilities | AuthN/AuthZ, rate limiting, request validation, WAF rules, API versioning |
| Data Store | None (stateless) |
| Key Interactions | Routes requests to backend microservices |
| Availability Strategy | Stateless, horizontally scalable behind load balancer |

#### 3.1.2 Catalog Service

| Aspect | Description |
|---|---|
| Purpose | Manages static and semi-static metadata |
| Responsibilities | Movies, genres, languages, cities, availability metadata |
| Data Store | Document or relational DB (MongoDB / PostgreSQL) |
| Key APIs | Get movies by city/language, retrieve movie metadata |
| Events | Publishes catalog update events |
| Scaling Considerations | Read-heavy; aggressively cached |

#### 3.1.3 Theatre & Partner Service

| Aspect | Description |
|---|---|
| Purpose | Manages theatre partners and infrastructure |
| Responsibilities | Partner onboarding, theatre creation, screen setup, seat layout configuration |
| Data Store | Relational database |
| Key APIs | Create/update theatre, screens, seat maps |
| Security | Partner-role based access control |
| Events | Theatre and seat layout change events |

#### 3.1.4 Show Service

| Aspect | Description |
|---|---|
| Purpose | Manages movie show scheduling |
| Responsibilities | Create/update/cancel shows; maintain show status |
| Data Store | Relational database |
| Key APIs | Create show, fetch shows by theatre/movie/date |
| Events | Show published/updated events for indexing |
| Consistency Model | Strong consistency per show |

#### 3.1.5 Search Service

| Aspect | Description |
|---|---|
| Purpose | Fast, browse-optimized read APIs |
| Responsibilities | Movie → city → theatre → show discovery |
| Data Store | OpenSearch / Elasticsearch |
| Key APIs | Browse shows, theatres by filters |
| Data Source | Event-driven updates from Catalog and Show services |
| Performance | Millisecond-level latency for high traffic |

#### 3.1.6 Booking Service

| Aspect | Description |
|---|---|
| Purpose | Core transactional service for ticket booking |
| Responsibilities | Seat locking, booking lifecycle, concurrency control |
| Data Store | Relational database (strong consistency) |
| Cache | Redis locks with TTL |
| Key APIs | Create booking, confirm booking, cancel booking |
| Events | Booking created/confirmed/failed/expired |
| Design Patterns | State Machine, Idempotency |

#### 3.1.7 Offer & Pricing Service

| Aspect | Description |
|---|---|
| Purpose | Centralized pricing and promotions engine |
| Responsibilities | Discount rules, price calculation |
| Data Store | Relational/config store |
| Key APIs | Calculate price |
| Design Patterns | Strategy pattern |
| Extensibility | Plug-and-play rule addition |

#### 3.1.8 Payment Service

| Aspect | Description |
|---|---|
| Purpose | Payment orchestration |
| Responsibilities | Payment intent, webhook processing, status tracking |
| Data Store | Relational database |
| Key APIs | Initiate payment, webhook endpoint |
| External Integrations | Payment gateways |
| Reliability | Idempotent processing, retries/backoff |

#### 3.1.9 Notification Service

| Aspect | Description |
|---|---|
| Purpose | Asynchronous communication |
| Responsibilities | Confirmations, cancellations, refunds |
| Data Store | None / lightweight delivery status store |
| Channels | SMS, Email, WhatsApp |
| Trigger Mechanism | Kafka events |
| Fault Handling | Retry + DLQ |

---

### 3.2 Infrastructure Components

#### 3.2.1 Cache (Redis)

| Aspect | Description |
|---|---|
| Purpose | Reduce latency and DB load |
| Usage | Seat locks, hot browse data |
| TTL Strategy | Short TTL for locks; longer TTL for catalog |
| Availability | Clustered, multi-AZ |

#### 3.2.2 Message Broker (Kafka)

| Aspect | Description |
|---|---|
| Purpose | Asynchronous communication backbone |
| Usage | Booking/payment/notification/indexing events |
| Benefits | Loose coupling, retryability, replay |
| Consumers | Booking, Payment, Search, Notification services |

#### 3.2.3 Databases

| Database Type | Usage |
|---|---|
| Relational DB | Bookings, payments, theatres, shows |
| Document DB | Catalog metadata (optional) |
| Search Index | Browse-optimized queries |

---

### 3.3 Component Interaction Summary

- **Browse path:** API Gateway → Search Service → Search Index / Cache  
- **Booking path:** API Gateway → Booking Service → Offer Service → Payment Service  
- **Async path:** Core services → Kafka → Search/Notification consumers  
- **Failure isolation:** booking/payment issues don’t impact browse functionality  

---

### 3.4 Design Rationale

- Clear service boundaries enable independent scaling and deployments.
- Event-driven design improves resilience and extensibility.
- Strong transactional guarantees are limited to booking + payment.
- Read-optimized components protect the system during traffic spikes.

---

## 4. Booking & Seat Locking Deep Dive

This section explains how correctness and consistency are ensured under high concurrency.

### 4.1 Core Challenges Addressed

- Prevent double booking under concurrent requests  
- Handle partial failures (payment outcomes, timeouts)  
- Ensure idempotent behavior for retries  
- Support high throughput while maintaining correctness  
- Auto-recover abandoned bookings  

The solution prioritizes **data integrity** for booking and **availability** for browsing.

---

### 4.2 Booking Lifecycle (State Machine)

Each booking follows a deterministic state machine owned by the Booking Service:

- `INITIATED`
  → `SEATS_LOCKED`
  → `PAYMENT_PENDING`
  → `CONFIRMED`
  → `COMPLETED`

Failure / Exit states:
- `FAILED` – payment failure or validation error  
- `EXPIRED` – seat lock TTL expired before payment completion  
- `CANCELLED` – user-initiated cancellation  

Rules:
- Transitions are monotonic (a booking never moves back to an earlier state) and idempotent (replaying a transition is a no-op)  
- Only valid transitions allowed (e.g., `EXPIRED → CONFIRMED` rejected)  
- Transitions persisted atomically in the booking DB  
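
One way to enforce these rules is an explicit transition table. The Python sketch below is an illustration, not the definitive lifecycle: the exact set of allowed edges in `ALLOWED` is an assumption derived from the states listed above, and `transition` is a hypothetical helper.

```python
from enum import Enum


class BookingState(Enum):
    INITIATED = "INITIATED"
    SEATS_LOCKED = "SEATS_LOCKED"
    PAYMENT_PENDING = "PAYMENT_PENDING"
    CONFIRMED = "CONFIRMED"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    EXPIRED = "EXPIRED"
    CANCELLED = "CANCELLED"


S = BookingState

# Assumed edges; terminal states have no outgoing transitions.
ALLOWED = {
    S.INITIATED: {S.SEATS_LOCKED, S.FAILED},
    S.SEATS_LOCKED: {S.PAYMENT_PENDING, S.FAILED, S.EXPIRED, S.CANCELLED},
    S.PAYMENT_PENDING: {S.CONFIRMED, S.FAILED, S.EXPIRED, S.CANCELLED},
    S.CONFIRMED: {S.COMPLETED, S.CANCELLED},
    S.COMPLETED: set(),
    S.FAILED: set(),
    S.EXPIRED: set(),
    S.CANCELLED: set(),
}


class InvalidTransition(Exception):
    pass


def transition(current: BookingState, target: BookingState) -> BookingState:
    if current == target:
        return current  # idempotent: replaying the same transition is a no-op
    if target not in ALLOWED[current]:
        raise InvalidTransition(f"{current.value} -> {target.value}")
    return target
```

In a real service the check and the state update would run in the same database transaction, so two concurrent events cannot both pass validation.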

---

### 4.3 Seat Locking Strategy

**Design goal:** prevent two users from booking the same seat, while allowing temporary reservation during payment.

#### Chosen Approach: Redis Seat Lock with TTL

Each seat lock key:
- `seatlock:{showId}:{seatId}`

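Lock acquisition:
- Booking Service sets the key and its TTL in one atomic command (`SET key value NX EX <ttl>`); a plain `SETNX` followed by a separate `EXPIRE` is not atomic and can leave a lock without a TTL if the service crashes in between
- The lock value should identify the owner (e.g., the `bookingId`) so that a lock is only released by the booking that holds it
- Lock TTL (example: 5 minutes)
- All seats must lock successfully, or booking fails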

Properties:
- Atomic (exclusive ownership)  
- Time-bound (TTL prevents deadlocks)  
- Fast (high concurrency without DB hotspot)  

Failure handling:
- If any seat lock fails:
  - Release all acquired locks
  - Reject with conflict
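
The all-or-nothing acquisition can be sketched as follows. This is a Python illustration under stated assumptions: `FakeRedis` is an in-memory stand-in whose `set(key, value, nx=True, ex=ttl)` mimics the semantics of Redis `SET ... NX EX`, and `delete_if_owner` is a hypothetical helper (against a real Redis, compare-and-delete would need a server-side script to stay atomic).

```python
import time


class FakeRedis:
    """In-memory stand-in for Redis; not a real client."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}

    def _evict(self, key):
        entry = self._data.get(key)
        if entry and entry[1] is not None and self._clock() >= entry[1]:
            del self._data[key]

    def set(self, key, value, nx=False, ex=None):
        self._evict(key)
        if nx and key in self._data:
            return None  # key already held by someone else
        expires_at = self._clock() + ex if ex is not None else None
        self._data[key] = (value, expires_at)
        return True

    def get(self, key):
        self._evict(key)
        entry = self._data.get(key)
        return entry[0] if entry else None

    def delete_if_owner(self, key, owner):
        if self.get(key) == owner:
            del self._data[key]
            return True
        return False


def lock_seats(redis, show_id, seat_ids, booking_id, ttl_seconds=300):
    acquired = []
    for seat_id in sorted(seat_ids):  # fixed order keeps behaviour predictable
        key = f"seatlock:{show_id}:{seat_id}"
        if redis.set(key, booking_id, nx=True, ex=ttl_seconds):
            acquired.append(key)
        else:
            # One seat is taken: roll back everything acquired so far.
            for held in acquired:
                redis.delete_if_owner(held, booking_id)
            return False
    return True
```

A `False` result maps to the conflict response; the caller does not need to clean up anything.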

---

### 4.4 Booking Creation Flow

1. Client selects show and seats  
2. Booking Service validates show status + seatIds  
3. Acquire Redis seat locks  
4. Create booking record in `SEATS_LOCKED`  
5. Store booking expiry timestamp (lock TTL)  
6. Calculate pricing via Offer & Pricing  
7. Transition to `PAYMENT_PENDING`

The whole flow is idempotent: a retry with the same client-provided `Idempotency-Key` returns the existing booking instead of creating a new one.

---

### 4.5 Payment Orchestration

#### Payment initiation
- Payment Service creates payment intent with gateway
- `bookingId` used as idempotency key
- Payment state tracked independently

#### Webhook processing
- Gateway sends signed webhook callbacks
- Payment Service verifies signature and publishes:
  - `payment.succeeded`
  - `payment.failed`
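
A sketch of the webhook step in Python. The signing scheme is an assumption (HMAC-SHA256 over the raw request body, hex-encoded); real gateways define their own scheme, header names, and payload shape. The payload fields `status`, `bookingId`, and `paymentId` and the `publish` callback are hypothetical.

```python
import hashlib
import hmac
import json


def sign(secret: bytes, body: bytes) -> str:
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def handle_webhook(secret: bytes, body: bytes, signature: str, publish) -> bool:
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(sign(secret, body), signature):
        return False  # reject: caller should respond with 401/403
    payload = json.loads(body)
    topic = "payment.succeeded" if payload["status"] == "SUCCEEDED" else "payment.failed"
    publish(topic, {"bookingId": payload["bookingId"], "paymentId": payload["paymentId"]})
    return True
```

Verifying against the raw bytes, before any JSON parsing, matters: re-serializing the payload can change whitespace or key order and break the signature.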

#### Booking finalization
- Booking Service consumes payment events
- On success:
  - booking → `CONFIRMED`
  - seat inventory marked sold
  - seat locks released
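- On failure:
  - booking → `FAILED`
  - locks released
- On success for a booking that has already `EXPIRED`:
  - the `EXPIRED → CONFIRMED` transition is rejected (see 4.2)
  - the payment should be refunded, since the seats may already have been sold to someone else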

---

### 4.6 Database Consistency Guarantees

**Final authority:** relational database.

Constraints:
- Unique constraint on `(show_id, seat_id)` for sold seats  
- Transactional updates during confirmation  
- Defensive checks during finalization  

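The two layers are complementary: Redis locks absorb contention and keep most conflicts away from the database, while the DB constraint guarantees correctness even if a lock is lost (e.g., TTL expiry or a Redis failover).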
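
The database acting as final authority can be demonstrated with a small Python sketch. SQLite stands in for PostgreSQL/MySQL here, and the `sold_seats` table and `confirm_booking` helper are hypothetical names chosen for illustration.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE sold_seats (
            show_id    TEXT NOT NULL,
            seat_id    TEXT NOT NULL,
            booking_id TEXT NOT NULL,
            PRIMARY KEY (show_id, seat_id)  -- one sale per seat per show
        )
        """
    )
    return conn


def confirm_booking(conn, booking_id, show_id, seat_ids) -> bool:
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany(
                "INSERT INTO sold_seats (show_id, seat_id, booking_id) VALUES (?, ?, ?)",
                [(show_id, seat_id, booking_id) for seat_id in seat_ids],
            )
        return True
    except sqlite3.IntegrityError:
        return False  # at least one seat was already sold; nothing was written
```

Because all seats are inserted in one transaction, a conflict on any single seat leaves no partial sale behind.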

---

### 4.7 Handling Expired Bookings

Scheduled job:
- Detect expired bookings
- Transition to `EXPIRED`
- Release remaining locks
- Publish `booking.expired`

Covers abandoned payments, client crashes, timeouts.
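
A minimal Python sketch of one sweep of such a job. The `Booking` dataclass, the `release_locks` and `publish` callbacks, and the choice of which states are eligible for expiry are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Booking:
    booking_id: str
    state: str
    expires_at: float  # epoch seconds, stored when the seats were locked


EXPIRABLE_STATES = ("SEATS_LOCKED", "PAYMENT_PENDING")


def expire_stale(bookings, now, release_locks, publish):
    """Run one sweep and return the IDs of bookings that were expired."""
    expired = []
    for booking in bookings:
        if booking.state in EXPIRABLE_STATES and booking.expires_at <= now:
            booking.state = "EXPIRED"
            release_locks(booking.booking_id)  # no-op if the TTL already fired
            publish("booking.expired", {"bookingId": booking.booking_id})
            expired.append(booking.booking_id)
    return expired
```

Confirmed or already-terminal bookings are skipped, so running the sweep repeatedly is harmless.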

---

### 4.8 Idempotency & Retry Handling

Retries happen due to timeouts, duplicates, webhook replays.

Implementation:
- All write APIs accept `Idempotency-Key`
- Booking and payment requests are deduplicated by key
- Webhooks are delivered at-least-once; deduplicating by event/payment ID makes their processing effectively exactly-once
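
Key-based deduplication can be sketched like this in Python. `IdempotencyStore` is a hypothetical in-memory stand-in; rejecting a reused key with a different payload is an assumed policy, and a production store would need an atomic insert (for example a unique constraint on the key) to handle two identical requests arriving concurrently.

```python
import hashlib
import json


class IdempotencyConflict(Exception):
    """Same key was reused with a different request body."""


class IdempotencyStore:
    def __init__(self):
        self._records = {}  # key -> (request fingerprint, stored response)

    def execute(self, key, request, handler):
        fingerprint = hashlib.sha256(
            json.dumps(request, sort_keys=True).encode()
        ).hexdigest()
        record = self._records.get(key)
        if record is not None:
            if record[0] != fingerprint:
                raise IdempotencyConflict(key)
            return record[1]  # replay: return the original outcome
        response = handler(request)
        self._records[key] = (fingerprint, response)
        return response
```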

---

### 4.9 Failure Scenarios & Recovery

| Scenario | Handling |
|---|---|
| Payment gateway timeout | Booking stays pending until webhook or expiry |
| Duplicate payment callback | Idempotent event handling |
| Booking service crash | Redis TTL releases locks |
| Notification failure | Retried async |
| Partial system outage | Browsing remains available |

---

### 4.10 Why This Design Works

- Strong correctness guarantees  
- High concurrency without DB hot-spotting  
- Self-healing via TTL + background jobs  
- Clear ownership of state transitions  
- Scales horizontally; contention is confined to individual hot shows  

---

## 5. Technology Stack

### 5.1 Application & Service Layer

| Technology | Purpose | Rationale |
|---|---|---|
| Java 17+ | Backend services | Mature ecosystem, strong concurrency, reliability |
| Spring Boot | Microservices framework | Production-ready defaults, observability/security |
| Spring Web / REST | API development | Standard REST integration |
| Spring Security | AuthN/AuthZ | OAuth2/JWT, role-based access |
| Spring State Machine (optional) | Booking lifecycle | Enforced state transitions |

---

### 5.2 API & Integration Layer

| Technology | Purpose | Rationale |
|---|---|---|
| REST (OpenAPI/Swagger) | Service contracts | Clear documentation, contract-first |
| JSON | Data format | Lightweight, widely supported |
| Webhooks | Payment callbacks | Async, reliable confirmations |

---

### 5.3 Data Storage Layer

| Data Type | Technology | Usage |
|---|---|---|
| Transactional | PostgreSQL / MySQL | Bookings, payments, shows, seat inventory |
| Catalog | MongoDB / PostgreSQL | Movies, cities, languages, genres |
| Search | OpenSearch / Elasticsearch | Browse-optimized queries |
| Cache / Locks | Redis | Seat locks, hot caching |

**Rationale**
- Relational DB for ACID booking/payment
- Search index for low-latency browsing
- Redis for fast TTL locks and reduced contention

---

### 5.4 Messaging & Event Streaming

| Technology | Purpose | Rationale |
|---|---|---|
| Apache Kafka | Event streaming | High throughput, durable, replayable |
| Schema Registry (optional) | Event contracts | Backward/forward compatibility |
| DLQs | Failure handling | Safe retries and diagnostics |

Events for:
- Booking state changes
- Payment outcomes
- Show/catalog updates
- Notification triggers
- Search indexing updates

---

### 5.5 Security & Compliance

| Technology | Purpose | Rationale |
|---|---|---|
| OAuth2 / JWT | Authentication | Stateless and secure |
| Gateway WAF | Threat protection | OWASP mitigation |
| TLS (HTTPS) | In-transit security | End-to-end encryption |
| Secrets Manager / Vault | Secret storage | No secrets in code |
| Encryption at Rest | PII protection | Compliance needs |

---

### 5.6 Observability & Monitoring

| Technology | Purpose | Rationale |
|---|---|---|
| OpenTelemetry | Tracing | End-to-end visibility |
| Prometheus | Metrics | Golden signals |
| Grafana | Dashboards | Live visibility |
| ELK / Loki | Logs | Faster debugging + audits |
| Alertmanager | Alerts | SLA/SLO enforcement |

Key metrics:
- API latency/error rates
- Seat lock contention
- Booking success/failure
- Payment success rate
- Booking expiry %

---

### 5.7 Deployment & Platform

| Technology | Purpose | Rationale |
|---|---|---|
| Docker | Containers | Consistent environments |
| Kubernetes (EKS) | Orchestration | Auto-scaling, self-healing |
| CI/CD | Automation | Safer, faster releases |
| Blue-Green / Canary | Release strategy | Zero downtime |
| Multi-AZ | HA | Fault tolerance |

---

### 5.8 Cloud Infrastructure

| Component | Technology | Usage |
|---|---|---|
| Compute | AWS EKS / EC2 | Service runtime |
| Database | AWS RDS / Aurora | Multi-AZ relational |
| Cache | ElastiCache (Redis) | HA caching/locks |
| Search | AWS OpenSearch | Distributed search |
| Messaging | MSK / Managed Kafka | Event streaming |
| Load Balancer | ALB / NLB | Traffic distribution |

---

### 5.9 AI & Intelligent Capabilities (Suggested)

- Search relevance: typo tolerance, fuzzy matching  
- Customer support: FAQ / booking-status bot  
- Risk signals: fraud/anomaly detection  
- Ops insights: demand forecasting, capacity planning  

AI is pluggable and does not impact booking consistency.

---

### 5.10 Key Technology Drivers Summary

- Correctness first: ACID DB + controlled concurrency  
- Scale on demand: stateless + cache + search  
- Resilience: event-driven + graceful degradation  
- Security by design: OWASP-compliant, encrypted, auditable  
- Future-ready: modular services, extensible pricing + AI  

---

## 6. Scalability & 99.99% Availability

### 6.1 Availability Target & Design Philosophy

**99.99% uptime** allows roughly **4.4 minutes of downtime per month** (about 52.6 minutes per year).

Principles:
- Stateless services
- Failure isolation (browse vs booking)
- Graceful degradation
- Automated recovery

Correctness is prioritized for booking; availability for browsing.
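
The downtime budget behind the 99.99% target is simple arithmetic, shown here as a small Python helper (`allowed_downtime_minutes` is a hypothetical name). A 30-day month gives 4.32 minutes; an average-length month comes out slightly higher.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: float) -> float:
    """Minutes of downtime permitted in the period at the given availability."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


monthly = allowed_downtime_minutes(99.99, 30)    # 30-day month
yearly = allowed_downtime_minutes(99.99, 365)    # non-leap year
print(f"{monthly:.2f} min/month, {yearly:.2f} min/year")
```

This budget covers everything: deployments, failovers, and dependency outages all draw from the same few minutes.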

---

### 6.2 Horizontal Scalability Strategy

- Stateless microservices behind load balancers
- Kubernetes HPA based on CPU, latency, and custom metrics

| Workload | Scaling Strategy |
|---|---|
| Browsing | Scale Search + Cache aggressively |
| Booking | Scale Booking carefully to avoid lock contention |
| Payments | Scale independently based on gateway latency |
| Notifications | Fully async and elastic |

---

### 6.3 Read-Heavy Workload Optimization

Techniques:
- Search index for denormalized reads
- Redis hot caching
- Event-driven index updates
- TTL eviction to balance freshness

Result:
- Browse APIs stay up even if transactional services degrade
- Sub-second latency during spikes

---

### 6.4 Write Path Scalability (Booking)

Challenges:
- Contention on popular shows
- Strict correctness

Solutions:
- Redis locks to minimize DB contention
- Shard booking data by city or showId
- Short-lived TTL locks
- Backoff retries to avoid thundering herd

Trade-off: write throughput is intentionally constrained to preserve correctness.
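
Backoff retries are most effective with jitter, so that clients rejected at the same moment do not all retry at the same moment. A Python sketch of the "full jitter" variant follows; the base delay, cap, and function name are assumptions.

```python
import random


def backoff_delays(attempts, base_seconds=0.1, cap_seconds=5.0, rng=None):
    """Return one randomized delay per retry attempt (full jitter)."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        # Exponential ceiling, capped so late retries do not wait unboundedly.
        ceiling = min(cap_seconds, base_seconds * 2 ** attempt)
        delays.append(rng.uniform(0, ceiling))
    return delays
```

A client retrying a seat-lock conflict would sleep for each delay in turn and give up once the list is exhausted.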

---

### 6.5 Database High Availability

Relational DB:
- Multi-AZ failover
- Read replicas for reporting
- Partitioning by city/time

Cache:
- Redis cluster with replicas + failover
- TTL recovery

Search:
- Multi-node cluster
- Shard replication and rebalancing

---

### 6.6 Event-Driven Resilience

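Kafka and its consumers together provide:
- At-least-once delivery (offsets are committed only after processing)
- Idempotent consumers (an application-level responsibility, so redelivered events are harmless)
- DLQ topics for poison messages (a consumer-side pattern rather than a broker feature)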

Ensures:
- Booking confirmation doesn’t block notifications
- Payment processing survives transient failures
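
The consumer-side pattern can be sketched in Python without a broker. Everything here is a stand-in: `events` is a plain list instead of a Kafka topic, `processed_ids` would be a durable store in practice, and `dead_letters` represents a DLQ topic. The `eventId` field and the retry count are assumptions.

```python
def consume(events, handler, processed_ids, dead_letters, max_attempts=3):
    """Process events at-least-once, skipping duplicates and parking poison messages."""
    for event in events:
        if event["eventId"] in processed_ids:
            continue  # duplicate delivery: already handled, safe to skip
        for attempt in range(1, max_attempts + 1):
            try:
                handler(event)
                processed_ids.add(event["eventId"])
                break
            except Exception as exc:
                if attempt == max_attempts:
                    # Give up and park the event so the partition keeps moving.
                    dead_letters.append({"event": event, "error": str(exc)})
```

Parking the failing event instead of retrying forever is what keeps one bad message from blocking every booking behind it.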

---

### 6.7 Fault Tolerance & Graceful Degradation

| Failure Scenario | System Behavior |
|---|---|
| Search down | Fallback to cache |
| Payment slow | Booking pending until webhook |
| Notification failure | Async retries |
| Booking down | Browsing unaffected |
| Cache failure | Browse reads fall back to the search index/DB; booking correctness is still enforced by DB constraints |

---

### 6.8 Disaster Recovery & Multi-AZ Strategy

- Services across multiple AZs
- Load balancers spread traffic
- Automated pod restarts/rescheduling
- DB failover in minutes

Targets:
- **RTO:** < 5 minutes  
- **RPO:** near-zero for bookings/payments  

---

### 6.9 Observability-Driven Reliability

Focus:
- Golden signals: latency, traffic, errors, saturation
- Booking metrics: lock success/failure, payment rate, expiry rate

Alerts:
- SLA breach risk
- Error spikes
- Capacity thresholds

---

### 6.10 Why This Meets 99.99%

- No single point of failure
- Read/write isolation prevents cascading outages
- Stateless compute + auto-scaling
- Strong transactional boundaries only where needed
- Event-driven retries/recovery

---

## 7. Monitoring & Observability

### 7.1 Monitoring Objectives

- Ensure 99.99% SLA compliance
- Detect issues before customer impact
- Deep visibility into booking + payment correctness
- Faster RCA
- Capacity planning and forecasting

Monitoring is a first-class production feature.

---

### 7.2 Observability Pillars

1. **Metrics** (numbers)  
2. **Logs** (structured debugging + audit)  
3. **Traces** (end-to-end flow visibility)  

---

### 7.3 Metrics Monitoring

Infrastructure:
- CPU/memory/disk/network
- Pod restarts, node health
- HPA activity

Golden signals:

| Signal | Examples |
|---|---|
| Latency | p95/p99 response times |
| Traffic | RPS per service |
| Errors | HTTP 4xx/5xx |
| Saturation | thread/connection pool exhaustion |

Booking metrics:
- Seat lock success vs failure
- Booking creation and confirmation rate
- Booking expiry %
- Double-booking prevention triggers

Payment metrics:
- Payment initiation success
- Gateway latency
- Webhook success/failure
- Payment-to-booking conversion

---

### 7.4 Logging Strategy

- Structured JSON logs
- Centralized aggregation

Types:
- Application logs
- Audit logs (state transitions, payment events)
- Security logs (auth failures)

Principles:
- Correlation IDs everywhere
- Mask PII/payment data
- Retention aligned with compliance
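
A small Python sketch of a structured log line that carries a correlation ID and masks sensitive fields. The field names in `SENSITIVE_FIELDS`, the masking rule (keep the last four characters), and the `log_line` helper are assumptions for illustration.

```python
import json

SENSITIVE_FIELDS = {"email", "phone", "cardNumber"}  # hypothetical field names


def mask(value: str) -> str:
    # Keep the last four characters only when the value is long enough.
    visible = value[-4:] if len(value) > 4 else ""
    return "*" * (len(value) - len(visible)) + visible


def log_line(level: str, message: str, correlation_id: str, **fields) -> str:
    safe = {
        key: mask(str(value)) if key in SENSITIVE_FIELDS else value
        for key, value in fields.items()
    }
    record = {"level": level, "message": message, "correlationId": correlation_id, **safe}
    return json.dumps(record, sort_keys=True)
```

Masking at the point where the line is built means no downstream log pipeline ever sees the raw value.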

---

### 7.5 Distributed Tracing

Coverage:
- Browse → search → cache → response
- Booking → lock → pricing → payment
- Webhook → confirmation → notification

Benefits:
- Find latency bottlenecks
- Detect dependency failures
- Speed up RCA

---

### 7.6 Alerting & Incident Response

Alerts are actionable:
- SLA near-breach
- Error spikes
- Payment gateway down
- Seat lock contention anomalies
- Resource saturation

Response:
- On-call routing
- Runbooks
- Automated recovery (restart/scale)

---

### 7.7 SLA & SLO Tracking

SLIs:
- API availability
- Booking success rate
- Payment completion rate

SLOs:
- ≥ 99.99% availability
- p99 latency thresholds
- acceptable booking failure %

Error budgets:
- guide release velocity vs stability
- freeze/rollback on exhaustion

---

### 7.8 Capacity Planning & Trend Analysis

Use monitoring data for:
- peak forecasting (releases/weekends)
- scaling validation
- performance regression detection
- cost optimization

---

### 7.9 Why This Works

- Combines technical + business metrics
- Proactive detection
- Fast troubleshooting
- SLA visibility always
- Scales with platform growth

---

## 8. Deployment & Release Strategy

### 8.1 Deployment Environments

| Environment | Purpose |
|---|---|
| Development | Local feature development |
| QA / Integration | Functional + integration tests |
| Staging / Pre-Prod | Production-like validation |
| Production | Live traffic |

Each environment is isolated (infra/config/data boundaries).

---

### 8.2 CI/CD Pipeline

CI:
- Build + unit tests
- Static analysis + security scans
- Container image build
- Versioned artifacts stored in registry

CD:
- Auto deploy to lower envs
- Manual approval to prod
- Config injection per env
- Immutable image promotion

---

### 8.3 Containerization & Orchestration

Docker:
- Each microservice packaged as image

Kubernetes (EKS):
- Rolling updates
- Liveness/readiness probes
- Auto-scaling

---

### 8.4 Release Strategies

- **Blue-Green**
  - Two identical environments
  - Switch traffic after validation
  - Instant rollback

- **Canary**
  - Small % traffic first
  - Gradual ramp using health metrics
  - Auto rollback on error thresholds

- **Feature Flags**
  - Toggle offers/pricing without redeploy
  - City/theatre rollouts

---

### 8.5 Zero-Downtime Guarantee

Achieved via:
- Stateless services behind LBs
- Rolling/canary
- Backward-compatible API/schema changes
- Safe DB migrations with versioning tools

---

### 8.6 Database Migration Strategy

- Managed via migration tools
- Prefer backward-compatible migrations
- Multi-step for breaking changes:
  1. Expand schema
  2. Deploy app changes
  3. Contract old fields
- Maintain rollback scripts

---

### 8.7 Multi-Region & Multi-AZ

AZ strategy:
- Services spread across AZs
- LBs distribute traffic

DR:
- DB failover
- Backup/restore drills
- **RTO < 5 minutes**, **RPO near zero**

---

### 8.8 Configuration & Secrets

- Centralized config
- Env-specific externalized settings
- Secrets in vault/secrets manager
- No secrets in code or images

---

### 8.9 Rollback & Recovery

Triggers:
- SLA breach
- Error spikes
- Payment/booking failures

Methods:
- Traffic switch (blue-green)
- Abort canary
- Disable feature flags

Fast + reversible + automated.

---

### 8.10 Why This Works

- Frequent safe releases
- Minimal customer impact
- Controlled experimentation
- Operational stability
- Scales with growth

---

## 9. API Contracts (Read & Write)

This section defines the backend API contracts for **one Read scenario** (browse theatres & show timings for a selected movie in a city on a given date) and **one Write scenario** (book tickets by selecting theatre/timing/seats). APIs are **RESTful**, **versioned**, and designed for **idempotency**, **concurrency safety**, and clear **error handling**.

---

### 9.1 API Standards (Common)

#### Base
- **Base URL:** `/api/v1`
- **Content-Type:** `application/json`
- **Auth:** `Authorization: Bearer <JWT>`

#### Idempotency (Write APIs)
- **Header:** `Idempotency-Key: <uuid>`
- Re-sending the same request with the same key returns the same logical outcome (same booking/payment intent).

#### Correlation
- **Header:** `X-Correlation-Id: <uuid>` (optional; server generates if absent)
- Returned in response headers for traceability.

#### Standard Error Format
```json
{
  "error": {
    "code": "SEAT_LOCK_FAILED",
    "message": "One or more seats are no longer available.",
    "details": {
      "showId": "sh_123",
      "unavailableSeatIds": ["A10", "A11"]
    },
    "correlationId": "c0b7f2d0-6c8d-4e6f-9c5f-3ff2f2f0f9b1"
  }
}
```

#### Common Error Codes

* `VALIDATION_ERROR`, `UNAUTHORIZED`, `FORBIDDEN`, `NOT_FOUND`
* `SEAT_LOCK_FAILED`, `BOOKING_EXPIRED`, `PAYMENT_PENDING`, `PAYMENT_FAILED`
* `RATE_LIMITED`, `INTERNAL_ERROR`

---

### 9.2 Read Scenario APIs (Browse)

#### 9.2.1 Browse theatres running a movie in a city (with show timings for a date)

**Endpoint**

* `GET /api/v1/cities/{cityId}/movies/{movieId}/shows?date=YYYY-MM-DD&language=<optional>&format=<optional>`

**Description**
Returns theatres currently running the selected movie in the given city on the specified date, along with show timings and a lightweight availability summary.

**Path Params**

* `cityId` (required): City identifier (e.g., `blr`)
* `movieId` (required): Movie identifier (e.g., `mv_901`)

**Query Params**

* `date` (required): `YYYY-MM-DD`
* `language` (optional): e.g., `en`, `hi`, `te`
* `format` (optional): e.g., `2D`, `3D`, `IMAX`

**Response 200 (Example)**

```json
{
  "cityId": "blr",
  "movie": {
    "movieId": "mv_901",
    "title": "Example Movie",
    "language": "en",
    "genres": ["Thriller", "Drama"],
    "durationMins": 128
  },
  "date": "2026-02-01",
  "theatres": [
    {
      "theatreId": "th_101",
      "name": "PVR Orion",
      "area": "Malleshwaram",
      "shows": [
        {
          "showId": "sh_5001",
          "screenName": "Screen 3",
          "startTime": "2026-02-01T18:30:00+05:30",
          "format": "2D",
          "priceFrom": 180,
          "availability": {
            "availableSeats": 72,
            "status": "AVAILABLE"
          }
        },
        {
          "showId": "sh_5002",
          "screenName": "Screen 3",
          "startTime": "2026-02-01T21:45:00+05:30",
          "format": "2D",
          "priceFrom": 220,
          "availability": {
            "availableSeats": 12,
            "status": "FAST_FILLING"
          }
        }
      ]
    }
  ]
}
```

**Errors**

* `400 VALIDATION_ERROR` (invalid date format)
* `404 NOT_FOUND` (city/movie not found)
* `429 RATE_LIMITED`
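
As an illustration of the `date` validation behind `400 VALIDATION_ERROR`, the sketch below parses the query parameter strictly and builds the request path. The helper names `parse_show_date` and `browse_shows_path` are hypothetical, and omitting optional parameters when they are not set is an assumption.

```python
import re
from datetime import date
from typing import Optional
from urllib.parse import quote, urlencode

# Strict YYYY-MM-DD shape; the calendar check happens in date.fromisoformat.
DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_show_date(raw: str) -> date:
    """Parse a YYYY-MM-DD string, raising ValueError on bad shape or bad dates."""
    if not DATE_PATTERN.fullmatch(raw):
        raise ValueError(f"date must be YYYY-MM-DD, got {raw!r}")
    return date.fromisoformat(raw)  # rejects impossible dates such as 2026-02-30


def browse_shows_path(
    city_id: str,
    movie_id: str,
    raw_date: str,
    language: Optional[str] = None,
    show_format: Optional[str] = None,
) -> str:
    """Build the browse path and query string for the endpoint above."""
    params = {"date": parse_show_date(raw_date).isoformat()}
    if language:
        params["language"] = language
    if show_format:
        params["format"] = show_format
    city = quote(city_id, safe="")
    movie = quote(movie_id, safe="")
    return f"/api/v1/cities/{city}/movies/{movie}/shows?{urlencode(params)}"
```

For example, `browse_shows_path("blr", "mv_901", "2026-02-01", show_format="IMAX")` yields `/api/v1/cities/blr/movies/mv_901/shows?date=2026-02-01&format=IMAX`.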

---

#### 9.2.2 Get seat map + real-time availability for a show

**Endpoint**

* `GET /api/v1/shows/{showId}/seats`

**Description**
Returns the seat layout and per-seat availability as of the request time. Availability is derived from:

* **sold seats** (DB truth)
* **locked seats** (Redis locks, TTL)

**Path Params**

* `showId` (required): Show identifier (e.g., `sh_5001`)

**Response 200 (Example)**

```json
{
  "showId": "sh_5001",
  "theatreId": "th_101",
  "screenName": "Screen 3",
  "startTime": "2026-02-01T18:30:00+05:30",
  "seatMap": {
    "rows": [
      {
        "rowLabel": "A",
        "seats": [
          { "seatId": "A1", "type": "REGULAR", "price": 180, "status": "AVAILABLE" },
          { "seatId": "A2", "type": "REGULAR", "price": 180, "status": "LOCKED", "lockExpiresAt": "2026-02-01T18:05:00+05:30" },
          { "seatId": "A3", "type": "REGULAR", "price": 180, "status": "SOLD" }
        ]
      }
    ]
  }
}
```

**Errors**

* `404 NOT_FOUND` (show not found)
* `429 RATE_LIMITED`

<details>
  <summary><strong>Notes on seat status</strong></summary>

* `AVAILABLE`: Not sold and not locked
* `LOCKED`: Temporarily reserved by another booking; includes `lockExpiresAt` when available
* `SOLD`: Finalized purchase (DB truth)

</details>
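
The status rules can be expressed as one small function. This is a sketch under stated assumptions: `sold` stands in for the set of seat IDs read from the database, `locks` for a seat-to-expiry mapping read from Redis, and the name `seat_status` is hypothetical. Letting `SOLD` take precedence over a lingering lock is also an assumption, based on the DB being described as the source of truth.

```python
from datetime import datetime
from typing import Any


def seat_status(
    seat_id: str,
    sold: set[str],
    locks: dict[str, datetime],
    now: datetime,
) -> dict[str, Any]:
    """Derive the status of one seat from sold seats and active locks."""
    if seat_id in sold:
        # DB truth wins even if a stale lock entry still exists.
        return {"seatId": seat_id, "status": "SOLD"}
    expires_at = locks.get(seat_id)
    if expires_at is not None and expires_at > now:
        return {
            "seatId": seat_id,
            "status": "LOCKED",
            "lockExpiresAt": expires_at.isoformat(),
        }
    # Not sold, and either never locked or the lock has expired.
    return {"seatId": seat_id, "status": "AVAILABLE"}
```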

---

### 9.3 Write Scenario APIs (Booking)

#### 9.3.1 Create booking + lock seats (Seat Reservation)

**Endpoint**

* `POST /api/v1/bookings`

**Headers**

* `Authorization: Bearer <JWT>`
* `Idempotency-Key: <uuid>`
* `X-Correlation-Id: <uuid>` (optional)

**Request Body**

```json
{
  "userId": "u_2001",
  "showId": "sh_5001",
  "seatIds": ["A10", "A11"],
  "couponCode": "THIRD50",
  "client": {
    "platform": "WEB",
    "appVersion": "1.0.0"
  }
}
```

**Behavior**

* Validates `showId` is active and `seatIds` are valid
* Attempts to lock all requested seats with TTL (e.g., 5 minutes)
* Calculates price via Offer & Pricing Service
* Creates booking in `PAYMENT_PENDING` (or `SEATS_LOCKED` then transitions internally)
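
The all-or-nothing locking step can be sketched with an in-memory stand-in for Redis. This is not the platform's implementation: `SeatLockStore` is a hypothetical class, and a real Redis-backed version would need an atomic multi-key operation (for example a server-side script, or per-seat `SET ... NX PX` with rollback on partial failure).

```python
import threading
import time
from typing import Callable


class SeatLockStore:
    """In-memory stand-in for the Redis seat-lock store (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # (show_id, seat_id) -> (booking_id, expiry on the injected clock)
        self._locks: dict[tuple[str, str], tuple[str, float]] = {}
        self._clock = clock
        self._mutex = threading.Lock()

    def try_lock_all(
        self, show_id: str, seat_ids: list[str], booking_id: str, ttl_seconds: float = 300
    ) -> list[str]:
        """Lock every seat or none; return the seats that blocked the attempt."""
        with self._mutex:  # check and write happen as one atomic step
            now = self._clock()
            unavailable = [s for s in seat_ids if self._is_held(show_id, s, now)]
            if unavailable:
                return unavailable  # nothing was locked
            for seat_id in seat_ids:
                self._locks[(show_id, seat_id)] = (booking_id, now + ttl_seconds)
            return []

    def _is_held(self, show_id: str, seat_id: str, now: float) -> bool:
        entry = self._locks.get((show_id, seat_id))
        return entry is not None and entry[1] > now
```

A non-empty return value corresponds to `409 SEAT_LOCK_FAILED`, with the list feeding `details.unavailableSeatIds`.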

**Response 201 (Example)**

```json
{
  "bookingId": "bk_90001",
  "status": "PAYMENT_PENDING",
  "expiresAt": "2026-02-01T18:10:00+05:30",
  "show": {
    "showId": "sh_5001",
    "theatreId": "th_101",
    "startTime": "2026-02-01T18:30:00+05:30"
  },
  "seats": ["A10", "A11"],
  "pricing": {
    "currency": "INR",
    "subtotal": 360,
    "discount": 90,
    "fees": 20,
    "tax": 12,
    "total": 302,
    "appliedOffers": [
      { "code": "THIRD50", "description": "50% off on eligible ticket" }
    ]
  }
}
```

**Errors**

* `400 VALIDATION_ERROR` (invalid seatIds, show not bookable)
* `404 NOT_FOUND` (show not found)
* `409 SEAT_LOCK_FAILED` (one or more seats are already sold or locked)
* `410 BOOKING_EXPIRED` (a replayed `Idempotency-Key` refers to a booking that has since expired)
* `429 RATE_LIMITED`
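
To illustrate how `Idempotency-Key` replays and the `410 BOOKING_EXPIRED` edge case might fit together, here is a minimal in-memory sketch. The class name `BookingIdempotencyCache`, scoping keys per user, and replaying the original `201` response are assumptions; a production version would also need persistence, key expiry, and protection against concurrent first requests.

```python
from datetime import datetime
from typing import Any, Callable


class BookingIdempotencyCache:
    """Remembers the booking created for each (user, Idempotency-Key) pair."""

    def __init__(self) -> None:
        self._by_key: dict[tuple[str, str], dict[str, Any]] = {}

    def create_once(
        self,
        user_id: str,
        idempotency_key: str,
        create_booking: Callable[[], dict[str, Any]],
        now: datetime,
    ) -> tuple[int, dict[str, Any]]:
        cache_key = (user_id, idempotency_key)
        booking = self._by_key.get(cache_key)
        if booking is None:
            # First time this key is seen: run the real create + lock flow.
            booking = create_booking()
            self._by_key[cache_key] = booking
            return 201, booking
        if now >= booking["expiresAt"]:
            # Replay of a key whose booking has already expired.
            return 410, {"error": {"code": "BOOKING_EXPIRED",
                                   "message": "Booking has expired."}}
        # Assumption: a replay returns the original response unchanged.
        return 201, booking
```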

---

#### 9.3.2 Initiate payment for a booking (Create Payment Intent)

**Endpoint**

* `POST /api/v1/bookings/{bookingId}/payments`

**Headers**

* `Authorization: Bearer <JWT>`
* `Idempotency-Key: <uuid>`
* `X-Correlation-Id: <uuid>` (optional)

**Request Body**

```json
{
  "provider": "RAZORPAY",
  "returnUrl": "https://app.example.com/payment/return"
}
```

**Response 201 (Example)**

```json
{
  "paymentId": "pay_771",
  "bookingId": "bk_90001",
  "status": "INITIATED",
  "amount": 302,
  "currency": "INR",
  "provider": "RAZORPAY",
  "providerPayload": {
    "paymentIntentId": "pi_abc123",
    "redirectUrl": "https://gateway.example.com/pay/pi_abc123"
  }
}
```

**Errors**

* `404 NOT_FOUND` (booking not found)
* `409 PAYMENT_PENDING` (payment already initiated)
* `410 BOOKING_EXPIRED` (seat lock expired)
* `422` (booking not in payable state)

<details>
  <summary><strong>State validation rules (payable)</strong></summary>

A booking is payable only if:

* status is `PAYMENT_PENDING`
* and current time is **before** `expiresAt`

</details>
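
The payable rule maps directly onto a small check. In this sketch the function name `payment_precheck` is hypothetical, and evaluating the state check before the expiry check is an assumption; the returned HTTP statuses follow the error list above.

```python
from datetime import datetime
from typing import Optional


def payment_precheck(status: str, expires_at: datetime, now: datetime) -> Optional[int]:
    """Return None if the booking is payable, else the HTTP status to respond with."""
    if status != "PAYMENT_PENDING":
        return 422  # booking not in a payable state
    if now >= expires_at:
        return 410  # BOOKING_EXPIRED: the seat lock has lapsed
    return None
```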

---

#### 9.3.3 Payment webhook (Provider → Platform)

This endpoint is called by the payment gateway and **not** by the client.

**Endpoint**

* `POST /api/v1/payments/webhooks/{provider}`

**Security**

* Signature validation (provider-specific headers)
* Optional IP allowlist
* Idempotent handling based on `eventId` / provider event identifier

**Webhook Payload (Example)**

```json
{
  "eventId": "evt_99001",
  "paymentIntentId": "pi_abc123",
  "status": "SUCCEEDED",
  "amount": 302,
  "currency": "INR",
  "bookingReference": "bk_90001",
  "timestamp": "2026-02-01T18:06:30+05:30"
}
```

**Webhook Response**

* `200 OK` if processed (or already processed)
* `400` if signature invalid

**Downstream Effect**

* Payment Service publishes `payment.succeeded` / `payment.failed` event
* Booking Service consumes event and finalizes booking
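
The webhook flow (verify signature, deduplicate by `eventId`, publish an event) might look like the sketch below. The signature scheme shown, a hex HMAC-SHA256 over the raw body, is an assumption; each provider defines its own headers and algorithm. `WebhookProcessor` is a hypothetical class, the `FAILED` status name is assumed, and the `published` list stands in for a message broker.

```python
import hashlib
import hmac
import json

# Provider status -> platform event; "FAILED" is an assumed status name.
EVENT_BY_STATUS = {"SUCCEEDED": "payment.succeeded", "FAILED": "payment.failed"}


class WebhookProcessor:
    def __init__(self, secret: bytes) -> None:
        self._secret = secret
        self._seen_event_ids: set[str] = set()
        self.published: list[tuple[str, str]] = []  # stand-in for the event bus

    def handle(self, raw_body: bytes, signature_hex: str) -> int:
        """Process one webhook call and return the HTTP status to send back."""
        expected = hmac.new(self._secret, raw_body, hashlib.sha256).hexdigest()
        # Constant-time comparison avoids leaking how much of the value matched.
        if not hmac.compare_digest(expected.encode(), signature_hex.encode()):
            return 400
        event = json.loads(raw_body)
        if event["eventId"] in self._seen_event_ids:
            return 200  # duplicate delivery: acknowledge without reprocessing
        self._seen_event_ids.add(event["eventId"])
        topic = EVENT_BY_STATUS.get(event["status"])
        if topic is not None:
            self.published.append((topic, event["bookingReference"]))
        return 200
```

Verifying against the raw request bytes, before any JSON parsing, matters because re-serialising the payload can change whitespace or key order and invalidate the signature.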

---

#### 9.3.4 Get booking status (Client polling fallback)

**Endpoint**

* `GET /api/v1/bookings/{bookingId}`

**Response 200 (Example)**

```json
{
  "bookingId": "bk_90001",
  "status": "CONFIRMED",
  "confirmedAt": "2026-02-01T18:06:45+05:30",
  "ticket": {
    "ticketId": "tkt_50001",
    "qrCodeData": "BASE64_OR_TOKEN",
    "seatIds": ["A10", "A11"]
  }
}
```

**Errors**

* `404 NOT_FOUND`
* `403 FORBIDDEN` (booking does not belong to user)
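
Since this endpoint serves as a polling fallback, a client-side loop with capped exponential backoff is one way to use it. The sketch is an assumption about client behaviour, not part of the API contract: `poll_booking` is a hypothetical helper, `fetch_status` stands in for a call to `GET /api/v1/bookings/{bookingId}`, and the default terminal statuses include only those named in this section.

```python
import time
from typing import Callable


def poll_booking(
    fetch_status: Callable[[], str],
    max_attempts: int = 6,
    initial_delay: float = 1.0,
    max_delay: float = 8.0,
    terminal_statuses: frozenset[str] = frozenset({"CONFIRMED", "CANCELLED"}),
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until a terminal status is seen or attempts run out; return the last status."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = initial_delay
    status = fetch_status()
    for _ in range(max_attempts - 1):
        if status in terminal_statuses:
            break
        sleep(delay)
        delay = min(delay * 2, max_delay)  # back off, but never beyond max_delay
        status = fetch_status()
    return status
```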

---

#### 9.3.5 Cancel booking (Optional but production-realistic)

**Endpoint**

* `POST /api/v1/bookings/{bookingId}/cancel`

**Behavior**

* If `PAYMENT_PENDING` and lock valid → cancel + release locks
* If already `CONFIRMED` → initiate refund workflow (if supported)
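
The two branches above can be captured as a small decision function. This is a sketch only: `plan_cancellation` and the action names are hypothetical, and returning no actions for any other state is an assumption, since the behaviour for those cases is not specified here.

```python
def plan_cancellation(status: str, lock_valid: bool, refunds_supported: bool) -> list[str]:
    """Return the ordered actions a cancel request should trigger."""
    if status == "PAYMENT_PENDING" and lock_valid:
        return ["CANCEL_BOOKING", "RELEASE_SEAT_LOCKS"]
    if status == "CONFIRMED" and refunds_supported:
        return ["CANCEL_BOOKING", "INITIATE_REFUND"]
    # Assumption: anything else (expired lock, unsupported refund) is a no-op.
    return []
```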

**Response 200 (Example: cancelling a `CONFIRMED` booking)**

```json
{
  "bookingId": "bk_90001",
  "status": "CANCELLED",
  "refund": {
    "status": "INITIATED",
    "amount": 302
  }
}
```

---

### 9.4 Status Codes Summary

| API              | Success Codes | Common Failure Codes    |
| ---------------- | ------------: | ----------------------- |
| Browse shows     |           200 | 400, 404, 429           |
| Seat map         |           200 | 404, 429                |
| Create booking   |           201 | 400, 404, 409, 410, 429 |
| Initiate payment |           201 | 404, 409, 410, 422      |
| Webhook          |           200 | 400                     |
| Get booking      |           200 | 403, 404                |


