AWS DynamoDB — NoSQL, Keys, Indexes, Streams, and DAX
A full walkthrough of AWS DynamoDB — tables, partition and sort keys, read/write capacity, GSIs, DynamoDB Streams, DAX, TTL, transactions, and design patterns
What is DynamoDB?
Amazon DynamoDB is a fully managed, serverless, key-value and document NoSQL database. It delivers single-digit millisecond latency at any scale, from a few requests per second to millions. There is no server to manage, no OS to patch, and (in on-demand mode) no capacity to pre-allocate.
DynamoDB is designed for workloads where:
- You need consistent, low-latency reads and writes at any scale
- Access patterns are known and relatively simple
- You need a schemaless, flexible data model
- You want zero operational overhead
It is not a replacement for relational databases — it has no joins, no aggregation functions, no complex queries across arbitrary columns.
Core Concepts
| Term | Meaning |
|---|---|
| Table | The container for data — equivalent to a database table |
| Item | A single record in a table — equivalent to a row |
| Attribute | A data field on an item — equivalent to a column, but schemaless |
| Partition Key | The required part of the primary key; DynamoDB hashes its value to decide which partition stores the item |
| Sort Key | Optional second part of a composite key — used to sort items within a partition |
| Item size limit | 400 KB per item |
Primary Keys
The primary key uniquely identifies each item in a table. DynamoDB supports two types:
Simple Primary Key (Partition Key only)
Each item’s partition key must be unique.
Table: Users
Partition Key: user_id (String)
user_id │ name │ email
──────────────┼──────────────┼─────────────────
u_001 │ Alice │ alice@example.com
u_002 │ Bob │ bob@example.com
Composite Primary Key (Partition Key + Sort Key)
Multiple items can share the same partition key, but the combination of partition key + sort key must be unique. Items with the same partition key are stored together, sorted by sort key. Most DynamoDB modelling patterns build on this design, because a single Query can return a sorted group of related items.
Table: Orders
Partition Key: customer_id (String)
Sort Key: order_date (String, ISO format)
customer_id │ order_date │ total
──────────────┼──────────────────────┼──────
c_001 │ 2026-05-01T10:00:00 │ 49.99
c_001 │ 2026-05-15T14:30:00 │ 129.50
c_002 │ 2026-05-02T09:00:00 │ 75.00
With this design, you can query all orders for a customer (partition key = c_001) and narrow them to a date range with a sort key condition (for example, order_date between two dates). DynamoDB reads only the matching items, without scanning the whole table.
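Range conditions on order_date only work because ISO 8601 strings sort in chronological order. DynamoDB compares String keys by their UTF-8 bytes, and the sketch below imitates that ordering locally with a bytewise sort. The helper name sort_keys is hypothetical and used only for illustration.

```shell
# sort_keys: read candidate sort-key values on stdin, print them in byte order.
# LC_ALL=C forces bytewise comparison, which mirrors how String keys are ordered.
sort_keys() {
  LC_ALL=C sort
}

# ISO 8601 timestamps come out in chronological order.
printf '%s\n' \
  2026-05-15T14:30:00 \
  2026-05-01T10:00:00 \
  2026-05-02T09:00:00 | sort_keys

# Day-first dates do not: 15/5/2026 lands between 1/5/2026 and 2/5/2026.
printf '%s\n' 15/5/2026 1/5/2026 2/5/2026 | sort_keys
```

This is why the Orders table stores order_date as an ISO-formatted string rather than a locale-specific date.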
Data Types
Scalar Types
| Type | Example |
|---|---|
| String (S) | "hello", "2026-05-01" |
| Number (N) | 42, 3.14 (sent as a string in API requests; up to 38 digits of precision) |
| Binary (B) | Base64-encoded binary data |
| Boolean (BOOL) | true, false |
| Null (NULL) | true (represents a null value) |
Document Types
| Type | Example |
|---|---|
| Map (M) | {"street": "123 Main St", "city": "London"} (nested object) |
| List (L) | ["red", "green", "blue"] (ordered list, mixed types) |
Set Types
| Type | Example |
|---|---|
| String Set (SS) | {"a", "b", "c"} (unique strings) |
| Number Set (NS) | {1, 2, 3} (unique numbers) |
| Binary Set (BS) | Set of unique binary values |
Read/Write Capacity Modes
On-Demand Mode
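Pay per request. DynamoDB scales with traffic and needs no capacity planning, and there is no minimum capacity. Throttling is rare but still possible, for example when traffic more than doubles the previous peak within 30 minutes or when a single partition becomes hot. On-demand costs more per request than provisioned mode, but you never pay for idle capacity. Use it for unpredictable traffic, new tables, or low-traffic workloads.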
# Create table in on-demand mode
aws dynamodb create-table \
--table-name Orders \
--attribute-definitions \
AttributeName=customer_id,AttributeType=S \
AttributeName=order_date,AttributeType=S \
--key-schema \
AttributeName=customer_id,KeyType=HASH \
AttributeName=order_date,KeyType=RANGE \
--billing-mode PAY_PER_REQUEST
Provisioned Mode
You set read and write capacity units (RCU/WCU) in advance. Traffic that exceeds provisioned capacity is throttled (requests fail with ProvisionedThroughputExceededException). Cheaper per unit than on-demand — cost-effective for predictable, consistent workloads. Use Auto Scaling with provisioned mode to adjust capacity automatically based on utilisation targets.
| Capacity Unit | Definition |
|---|---|
| 1 RCU | 1 strongly consistent read per second (or 2 eventually consistent) for items up to 4 KB |
| 1 WCU | 1 write per second for items up to 1 KB |
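The rounding rules in the table can be turned into a rough capacity estimate. The sketch below is a minimal calculator, assuming 1 KB = 1024 bytes and a single uniform item size; the function names are hypothetical.

```shell
# ceil_div A B -> ceiling of A / B using integer arithmetic
ceil_div() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# rcu_needed ITEM_BYTES READS_PER_SEC MODE   (MODE: strong | eventual)
rcu_needed() {
  units_per_read=$(ceil_div "$1" 4096)   # each read rounds up to whole 4 KB blocks
  total=$(( units_per_read * $2 ))
  if [ "$3" = "eventual" ]; then
    total=$(ceil_div "$total" 2)         # eventually consistent reads cost half
  fi
  echo "$total"
}

# wcu_needed ITEM_BYTES WRITES_PER_SEC
wcu_needed() {
  units_per_write=$(ceil_div "$1" 1024)  # each write rounds up to whole 1 KB blocks
  echo $(( units_per_write * $2 ))
}

rcu_needed 6144 100 strong     # 6 KB items, 2 RCU each   -> prints 200
rcu_needed 6144 100 eventual   # half of the above        -> prints 100
wcu_needed 2560 50             # 2.5 KB items, 3 WCU each -> prints 150
```

Because sizes round up per request, a 4.1 KB item costs as much to read as an 8 KB one, so keeping items just under a block boundary can matter at high request rates.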
# Create table with provisioned capacity (auto scaling is configured separately, through Application Auto Scaling)
aws dynamodb create-table \
--table-name Products \
--attribute-definitions AttributeName=product_id,AttributeType=S \
--key-schema AttributeName=product_id,KeyType=HASH \
--billing-mode PROVISIONED \
--provisioned-throughput ReadCapacityUnits=100,WriteCapacityUnits=50
CRUD Operations
# Put item (create or replace)
aws dynamodb put-item \
--table-name Orders \
--item '{
"customer_id": {"S": "c_001"},
"order_date": {"S": "2026-06-01T10:00:00"},
"total": {"N": "99.99"},
"status": {"S": "pending"},
"items": {"L": [
{"M": {"sku": {"S": "ABC123"}, "qty": {"N": "2"}}}
]}
}'
# Get item (by primary key — fastest operation)
aws dynamodb get-item \
--table-name Orders \
--key '{"customer_id": {"S": "c_001"}, "order_date": {"S": "2026-06-01T10:00:00"}}'
# Update a specific attribute (partial update — no full replace)
aws dynamodb update-item \
--table-name Orders \
--key '{"customer_id": {"S": "c_001"}, "order_date": {"S": "2026-06-01T10:00:00"}}' \
--update-expression "SET #s = :status" \
--expression-attribute-names '{"#s": "status"}' \
--expression-attribute-values '{":status": {"S": "shipped"}}'
# Delete item
aws dynamodb delete-item \
--table-name Orders \
--key '{"customer_id": {"S": "c_001"}, "order_date": {"S": "2026-06-01T10:00:00"}}'
# Query: fetch all orders for a customer placed on or after 2026-05-01
aws dynamodb query \
--table-name Orders \
--key-condition-expression "customer_id = :cid AND order_date >= :start" \
--expression-attribute-values '{
":cid": {"S": "c_001"},
":start": {"S": "2026-05-01"}
}'
# Scan — reads every item (expensive — avoid in production)
aws dynamodb scan --table-name Products --filter-expression "price < :p" \
--expression-attribute-values '{":p": {"N": "50"}}'
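Query vs Scan: Query reads only the items under one partition key, optionally narrowed by a sort key condition, so it is efficient. Scan reads every item in the table, which is expensive and slow at scale. A filter expression is applied after the items are read, so it shrinks the response but not the capacity consumed. Design your keys and indexes so you never need a full table scan in your hot path.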
Global Secondary Indexes (GSI)
A GSI lets you query on attributes other than the primary key. It’s a separate projection of the table with its own partition key and optional sort key. GSIs are updated asynchronously, so reads from a GSI are always eventually consistent. A table can have up to 20 GSIs by default.
Table: Orders (PK: customer_id, SK: order_date)
GSI: StatusIndex (PK: status, SK: order_date)
→ Query: "give me all pending orders placed after May 1"
→ Uses StatusIndex: status = "pending" AND order_date > "2026-05-01"
# Create table with a GSI
aws dynamodb create-table \
--table-name Orders \
--attribute-definitions \
AttributeName=customer_id,AttributeType=S \
AttributeName=order_date,AttributeType=S \
AttributeName=status,AttributeType=S \
--key-schema \
AttributeName=customer_id,KeyType=HASH \
AttributeName=order_date,KeyType=RANGE \
--global-secondary-indexes '[{
"IndexName": "StatusIndex",
"KeySchema": [
{"AttributeName": "status", "KeyType": "HASH"},
{"AttributeName": "order_date", "KeyType": "RANGE"}
],
"Projection": {"ProjectionType": "ALL"},
"ProvisionedThroughput": {"ReadCapacityUnits": 50, "WriteCapacityUnits": 25}
}]' \
--billing-mode PROVISIONED \
--provisioned-throughput ReadCapacityUnits=100,WriteCapacityUnits=50
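Once the index exists, querying it looks like a normal Query with --index-name added. The sketch below assumes the Orders table and StatusIndex defined above; the two function names are hypothetical. Because status is a DynamoDB reserved word, the key condition refers to it through the #s placeholder.

```shell
# status_query_values STATUS START_DATE -> JSON for --expression-attribute-values
status_query_values() {
  printf '{":s": {"S": "%s"}, ":d": {"S": "%s"}}\n' "$1" "$2"
}

# query_orders_by_status STATUS START_DATE
# Hypothetical wrapper: it is only defined here and calls AWS when invoked.
query_orders_by_status() {
  aws dynamodb query \
    --table-name Orders \
    --index-name StatusIndex \
    --key-condition-expression "#s = :s AND order_date > :d" \
    --expression-attribute-names '{"#s": "status"}' \
    --expression-attribute-values "$(status_query_values "$1" "$2")"
}

# Show the values payload that would be sent for "pending orders after May 1".
status_query_values pending 2026-05-01
```

Running `query_orders_by_status pending 2026-05-01` against a real table would return the matching orders sorted by order_date.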
Local Secondary Indexes (LSI)
An LSI shares the same partition key as the table but uses a different sort key. LSIs must be defined at table creation; you cannot add one later. A table can have up to 5 LSIs. LSIs share the table’s provisioned capacity (unlike GSIs, which have their own).
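As a rough illustration, an LSI is declared with --local-secondary-indexes in the same create-table call as the table itself. The sketch below assumes an Orders table keyed on customer_id and order_date, plus a hypothetical index named TotalIndex that sorts a customer's orders by total; the function names are also hypothetical.

```shell
# lsi_definition -> JSON for --local-secondary-indexes
# The index reuses the table's partition key and swaps in "total" as sort key.
lsi_definition() {
  cat <<'JSON'
[{
  "IndexName": "TotalIndex",
  "KeySchema": [
    {"AttributeName": "customer_id", "KeyType": "HASH"},
    {"AttributeName": "total", "KeyType": "RANGE"}
  ],
  "Projection": {"ProjectionType": "ALL"}
}]
JSON
}

# create_orders_with_lsi: only defined here; it calls AWS when invoked.
create_orders_with_lsi() {
  aws dynamodb create-table \
    --table-name Orders \
    --attribute-definitions \
      AttributeName=customer_id,AttributeType=S \
      AttributeName=order_date,AttributeType=S \
      AttributeName=total,AttributeType=N \
    --key-schema \
      AttributeName=customer_id,KeyType=HASH \
      AttributeName=order_date,KeyType=RANGE \
    --local-secondary-indexes "$(lsi_definition)" \
    --billing-mode PAY_PER_REQUEST
}

# Print the index definition for review before creating anything.
lsi_definition
```

Note that every key attribute used by the index (here total) has to appear in --attribute-definitions.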
DynamoDB Streams
DynamoDB Streams captures a time-ordered sequence of all item-level modifications in a table. Each record contains the old and/or new image of the item. Records are retained for 24 hours.
Use streams for:
- Triggering Lambda on data changes (change data capture)
- Replicating data to other tables or services
- Building audit logs
- Cache invalidation
[Screenshot: DynamoDB → Table → Exports and streams → DynamoDB stream details, showing the stream enabled with the “New and old images” view type and the Lambda trigger attached to it.]
# Enable streams on an existing table
aws dynamodb update-table \
--table-name Orders \
--stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES
# Add a Lambda trigger (Lambda processes records from the stream)
aws lambda create-event-source-mapping \
--event-source-arn arn:aws:dynamodb:eu-west-1:123456789012:table/Orders/stream/2026-06-01T00:00:00.000 \
--function-name process-order-changes \
--batch-size 100 \
--starting-position LATEST
TTL — Time to Live
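TTL automatically deletes items once the Unix timestamp (in seconds) stored in a designated Number attribute has passed. Deletion is a background process and typically happens within a few days of expiry, not exactly at expiry time, so expired items can still show up in reads until they are removed. There is no additional cost for TTL deletes.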
Use TTL for session data, temporary records, cache tables, or compliance data with retention limits.
# Enable TTL on a table (specify which attribute holds the expiry timestamp)
aws dynamodb update-time-to-live \
--table-name Sessions \
--time-to-live-specification Enabled=true,AttributeName=expires_at
# When writing an item, include the expiry timestamp (Unix epoch)
aws dynamodb put-item \
--table-name Sessions \
--item '{
"session_id": {"S": "sess_abc123"},
"user_id": {"S": "u_001"},
"expires_at": {"N": "1780000000"}
}'
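In practice the expiry value is computed rather than hard-coded. The sketch below derives an epoch-seconds timestamp relative to now and builds the item JSON for the Sessions table above; the helper names ttl_epoch and session_item are hypothetical.

```shell
# ttl_epoch SECONDS_FROM_NOW [NOW_EPOCH]
# Prints an expiry time in epoch seconds. NOW_EPOCH is optional and mainly
# useful for testing; it defaults to the current time.
ttl_epoch() {
  now=${2:-$(date +%s)}
  echo $(( now + $1 ))
}

# session_item SESSION_ID USER_ID EXPIRES_AT -> item JSON for put-item
session_item() {
  printf '{"session_id": {"S": "%s"}, "user_id": {"S": "%s"}, "expires_at": {"N": "%s"}}\n' \
    "$1" "$2" "$3"
}

# A session that expires one hour from now.
expires_at=$(ttl_epoch 3600)
session_item sess_abc123 u_001 "$expires_at"
```

The printed JSON can be passed to `aws dynamodb put-item --table-name Sessions --item "..."`. The value has to be in seconds; a millisecond timestamp would point far into the future and the item would effectively never expire.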
Transactions
DynamoDB supports ACID transactions across multiple items and tables in the same region. Use transactions when you need all-or-nothing writes — for example, transferring credits between two user accounts.
# Transact write — atomically update two items
aws dynamodb transact-write-items \
--transact-items '[
{
"Update": {
"TableName": "Accounts",
"Key": {"account_id": {"S": "acc_001"}},
"UpdateExpression": "SET balance = balance - :amount",
"ConditionExpression": "balance >= :amount",
"ExpressionAttributeValues": {":amount": {"N": "100"}}
}
},
{
"Update": {
"TableName": "Accounts",
"Key": {"account_id": {"S": "acc_002"}},
"UpdateExpression": "SET balance = balance + :amount",
"ExpressionAttributeValues": {":amount": {"N": "100"}}
}
}
]'
Transactions consume 2× the RCUs/WCUs of regular operations. For example, a transactional write of a 1 KB item consumes 2 WCUs instead of 1.
DynamoDB Accelerator (DAX)
DAX is an in-memory, write-through cache for DynamoDB with a DynamoDB-compatible API. It can reduce read latency from milliseconds to microseconds with minimal code changes: the application swaps in the DAX client and points it at the cluster endpoint. DAX is ideal for read-heavy workloads (product catalogues, leaderboards, social feeds).
DAX does NOT help with:
- Write-heavy workloads (writes pass through DAX to DynamoDB, so they are no faster)
- Strongly consistent reads (DAX passes them through to DynamoDB and does not cache the result)
- Table scans and queries across large datasets
Application → DAX Cluster → DynamoDB
              ├── cache hit: <1 ms
              └── cache miss: reads from DynamoDB and caches the result
[Screenshot: DynamoDB → DAX → Create cluster, showing the cluster configuration with node type, number of nodes, subnet group, and the security group selection.]
Point-in-Time Recovery (PITR)
PITR continuously backs up your table and lets you restore it to any second within the last 35 days. Enable it on every production table.
# Enable PITR
aws dynamodb update-continuous-backups \
--table-name Orders \
--point-in-time-recovery-specification PointInTimeRecoveryEnabled=true
# Restore to a point in time (creates a new table)
aws dynamodb restore-table-to-point-in-time \
--source-table-name Orders \
--target-table-name Orders-restored-2026-06-01 \
--restore-date-time 2026-06-01T12:00:00Z
DynamoDB Design Principles
DynamoDB rewards good key design and punishes bad key design. The goal is to spread traffic evenly across partitions — a hot partition (one partition getting all the traffic) causes throttling.
Access Pattern First
Unlike relational databases where you design the schema first, DynamoDB requires you to know your access patterns up front. Start with: “what queries does my application run?” — then design keys and indexes to serve those queries efficiently.
Single-Table Design
Advanced DynamoDB users often put multiple entity types in one table (called single-table design). Different item types use different prefixes in the key: USER#u_001, ORDER#o_001, PRODUCT#p_001. GSIs allow multiple access patterns across these entity types. This can reduce cost and latency, because one Query can return related items of different types in a single request.
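To make the prefix convention concrete, the sketch below builds key JSON for a user item and for an order stored in the same partition. The generic attribute names PK and SK are an assumption (a common convention, not something DynamoDB requires), and the helper names are hypothetical.

```shell
# entity_key TYPE ID -> prefixed key value, e.g. USER#u_001
entity_key() {
  printf '%s#%s' "$1" "$2"
}

# key_json PK SK -> key JSON, assuming attributes named PK and SK
key_json() {
  printf '{"PK": {"S": "%s"}, "SK": {"S": "%s"}}\n' "$1" "$2"
}

user=$(entity_key USER u_001)

# The user's own profile item: PK and SK carry the same value.
key_json "$user" "$user"

# An order stored under that user, so one Query on PK returns both items.
key_json "$user" "$(entity_key ORDER o_001)"
```

With this layout, a Query on PK = USER#u_001 returns the profile and all of the user's orders, and a sort key condition such as begins_with(SK, "ORDER#") narrows it to the orders alone.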
Hot Partition Prevention
Avoid partition keys with low cardinality (e.g. status = "active", where all active items hit the same partition). The same caution applies to GSI partition keys, such as a status attribute. Use high-cardinality keys (user IDs, UUIDs, order IDs). For write-heavy workloads, add a random suffix (sharding) to distribute writes: user_id#0, user_id#1, …, user_id#9.
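The suffix scheme can be sketched in a few lines. The example below picks a random shard for writes and lists every shard key for reads; the shard count of 10 matches the example above, the function names are hypothetical, and /dev/urandom is assumed to be available.

```shell
# shard_key BASE_KEY SHARD_NUMBER -> e.g. user_42#7
shard_key() {
  printf '%s#%s\n' "$1" "$2"
}

# random_shard SHARD_COUNT -> integer in [0, SHARD_COUNT)
random_shard() {
  r=$(od -An -N2 -tu2 /dev/urandom)   # two random bytes as an unsigned integer
  echo $(( $r % $1 ))
}

# all_shard_keys BASE_KEY SHARD_COUNT -> one key per line, for fan-out reads
all_shard_keys() {
  i=0
  while [ "$i" -lt "$2" ]; do
    shard_key "$1" "$i"
    i=$(( i + 1 ))
  done
}

# Write path: pick one shard at random.
shard_key user_42 "$(random_shard 10)"

# Read path: every shard has to be queried and the results merged.
all_shard_keys user_42 10
```

The trade-off is on the read side: spreading writes over ten keys means a full read issues ten queries, so the shard count is best kept as small as the write load allows.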
Quick Reference
# Tables
aws dynamodb list-tables
aws dynamodb describe-table --table-name Orders
aws dynamodb describe-table --table-name Orders \
--query 'Table.[TableName,TableStatus,ItemCount,TableSizeBytes]'
# Capacity
aws dynamodb describe-table --table-name Orders \
--query 'Table.BillingModeSummary'
# Backup
aws dynamodb list-backups --table-name Orders
aws dynamodb create-backup --table-name Orders --backup-name orders-backup-$(date +%Y%m%d)
# Export to S3 (for analytics without consuming table capacity)
aws dynamodb export-table-to-point-in-time \
--table-arn arn:aws:dynamodb:eu-west-1:123456789012:table/Orders \
--s3-bucket my-ddb-exports \
--export-format DYNAMODB_JSON
# Check CloudWatch metrics (the date flags below are GNU date syntax)
aws cloudwatch get-metric-statistics \
--namespace AWS/DynamoDB \
--metric-name ConsumedReadCapacityUnits \
--dimensions Name=TableName,Value=Orders \
--start-time $(date -d '1 hour ago' --iso-8601=seconds) \
--end-time $(date --iso-8601=seconds) \
--period 60 \
--statistics Sum
| When to use DynamoDB | When to use RDS/Aurora |
|---|---|
| Single-digit ms latency at any scale | Complex queries, JOINs, aggregations |
| Known, simple access patterns | Ad-hoc querying and reporting |
| Schemaless, evolving data model | Strong relational integrity (foreign keys) |
| Serverless, event-driven architectures | Existing relational application |
| Massive scale with auto-scaling | ACID transactions across many entities |