# senior-database-engineer-nosql

> Expert NoSQL database engineering including MongoDB, Redis, document design, caching strategies, and distributed data patterns

- Author: cahyo40
- Repository: cahyo40/agent
- Version: 20260202183836
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/cahyo40/agent
- Web: https://mule.run/skillshub/@@cahyo40/agent~senior-database-engineer-nosql:20260202183836

---


# Senior Database Engineer (NoSQL)

## Overview

This skill transforms you into an experienced Senior Database Engineer specializing in NoSQL databases. You'll design efficient document schemas, implement caching strategies, handle distributed data patterns, and optimize performance for various NoSQL systems.

## When to Use This Skill

- Use when designing NoSQL database schemas
- Use when choosing between NoSQL database types
- Use when implementing caching with Redis
- Use when working with MongoDB, DynamoDB, or Cassandra
- Use when handling unstructured or semi-structured data
- Use when scaling horizontally

## How It Works

### Step 1: Choose the Right NoSQL Type

```
NoSQL DATABASE TYPES
├── DOCUMENT (MongoDB, CouchDB, Firestore)
│   ├── Best for: Flexible schemas, nested data
│   ├── Data model: JSON/BSON documents
│   └── Use case: CMS, catalogs, user profiles
│
├── KEY-VALUE (Redis, DynamoDB, Memcached)
│   ├── Best for: Caching, sessions, real-time data
│   ├── Data model: Key → Value pairs
│   └── Use case: Sessions, leaderboards, queues
│
├── COLUMN-FAMILY (Cassandra, HBase)
│   ├── Best for: Time-series, write-heavy workloads
│   ├── Data model: Row key → Column families
│   └── Use case: IoT, logs, analytics
│
└── GRAPH (Neo4j, Amazon Neptune)
    ├── Best for: Highly connected data
    ├── Data model: Nodes + Edges
    └── Use case: Social networks, recommendations
```

### Step 2: Design MongoDB Documents

```javascript
// Document Design Patterns

// 1. EMBEDDED DOCUMENTS (denormalized)
// ✅ Good for: Data accessed together, 1:few relationships
{
  "_id": ObjectId("..."),
  "email": "user@example.com",
  "profile": {
    "name": "John Doe",
    "avatar": "https://..."
  },
  "addresses": [
    { "type": "home", "street": "123 Main St", "city": "NYC" },
    { "type": "work", "street": "456 Office Blvd", "city": "NYC" }
  ]
}

// 2. REFERENCES (normalized)
// ✅ Good for: Large/growing subdocuments, many-to-many
{
  "_id": ObjectId("order123"),
  "userId": ObjectId("user456"),  // Reference
  "productIds": [ObjectId("prod1"), ObjectId("prod2")],
  "total": 299.99
}

// 3. BUCKET PATTERN (for time-series)
{
  "_id": ObjectId("..."),
  "sensorId": "temp-001",
  "date": ISODate("2025-01-30"),
  "measurements": [
    { "ts": ISODate("..."), "value": 23.5 },
    { "ts": ISODate("..."), "value": 24.1 }
  ],
  "count": 2,
  "sum": 47.6
}
```
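The bucket document above is usually maintained with a single upsert per reading. The sketch below only builds the filter and update documents; it assumes UTC timestamps, one bucket per sensor per day, and a hypothetical cap of 200 readings per bucket (`MAX_BUCKET_SIZE` and `bucket_upsert` are illustrative names, not part of any driver).

```python
from datetime import datetime, timezone

MAX_BUCKET_SIZE = 200  # assumed cap; tune to your reading size and query window


def bucket_upsert(sensor_id, ts, value, max_size=MAX_BUCKET_SIZE):
    """Build the (filter, update) pair that appends one reading to a bucket.

    Intended to be passed to something like
    collection.update_one(filter, update, upsert=True).
    """
    day = datetime(ts.year, ts.month, ts.day, tzinfo=timezone.utc)
    query = {
        "sensorId": sensor_id,
        "date": day,
        "count": {"$lt": max_size},  # only match buckets that still have room
    }
    update = {
        "$push": {"measurements": {"ts": ts, "value": value}},
        "$inc": {"count": 1, "sum": value},  # keep pre-aggregates in step
    }
    return query, update


reading_time = datetime(2025, 1, 30, 12, 0, tzinfo=timezone.utc)
query, update = bucket_upsert("temp-001", reading_time, 23.5)
```

When every matching bucket is full, the filter matches nothing and the upsert starts a new bucket, which keeps each document bounded.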

### Step 3: Master Redis Patterns

```redis
# STRING: Simple key-value
SET user:1:name "John Doe"
GET user:1:name
SETEX session:abc123 3600 "user_data"  # Expires in 1 hour

# HASH: Object-like structure
HSET user:1 name "John" email "john@example.com" age 30
HGET user:1 name
HGETALL user:1

# LIST: Queues, recent items
LPUSH queue:emails "email1" "email2"
RPOP queue:emails
LRANGE recent:products 0 9  # Last 10 products

# SET: Unique collections
SADD tags:article:1 "tech" "news" "trending"
SMEMBERS tags:article:1
SINTER tags:article:1 tags:article:2  # Common tags

# SORTED SET: Leaderboards, rankings
ZADD leaderboard 1500 "player1" 1200 "player2"
ZREVRANGE leaderboard 0 9 WITHSCORES  # Top 10
ZINCRBY leaderboard 100 "player1"

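```

```python
# CACHING PATTERN (cache-aside read)
# Assumes `redis` is a connected client and `db` is a MongoDB database handle.
import json

def get_user(user_id):
    cached = redis.get(f"user:{user_id}")
    if cached is not None:
        return json.loads(cached)

    user = db.users.find_one({"_id": user_id})
    if user is not None:
        # default=str keeps ObjectId and datetime fields JSON-serializable
        redis.setex(f"user:{user_id}", 3600, json.dumps(user, default=str))
    return user
```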

## Examples

### Example 1: MongoDB Indexing

```javascript
// Create indexes
db.products.createIndex({ "sku": 1 }, { unique: true })
db.products.createIndex({ "category": 1, "price": 1 })
db.products.createIndex({ "name": "text", "description": "text" })
db.orders.createIndex({ "userId": 1, "createdAt": -1 })

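// Compound index for a filter-and-sort query
db.orders.find({ userId: "123", status: "pending" })
    .sort({ createdAt: -1 })
// The { userId: 1, createdAt: -1 } index above does not cover the status filter.
// Put equality fields first, then the sort field:
// { userId: 1, status: 1, createdAt: -1 }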
```
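The ordering used in that comment follows the common "equality, sort, range" guideline for compound indexes. A minimal sketch of that ordering as a helper; `esr_index_keys` is a hypothetical name, and the returned list of `(field, direction)` pairs is the shape PyMongo's `create_index` accepts.

```python
def esr_index_keys(equality, sort, range_fields=()):
    """Order index keys as Equality, then Sort, then Range.

    equality / range_fields: iterables of field names (indexed ascending).
    sort: list of (field, direction) pairs, direction 1 or -1.
    """
    keys = [(field, 1) for field in equality]
    keys += [(field, direction) for field, direction in sort]
    seen = {field for field, _ in keys}
    # A range field that is already a sort key does not need a second entry.
    keys += [(field, 1) for field in range_fields if field not in seen]
    return keys


# The query from the example above: filter on userId and status, sort by createdAt.
print(esr_index_keys(["userId", "status"], [("createdAt", -1)]))
# [('userId', 1), ('status', 1), ('createdAt', -1)]
```

This is a rule of thumb rather than a guarantee; checking the query plan with `explain()` remains the way to confirm an index is actually used.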

### Example 2: Cache-Aside Pattern

```python
import redis
import json

class CacheService:
    def __init__(self, redis_client, db, ttl=3600):
        self.redis = redis_client
        self.db = db
        self.ttl = ttl
    
    def get(self, key, fetch_fn):
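        # Try cache first; compare against None so only a real miss falls through
        cached = self.redis.get(key)
        if cached is not None:
            return json.loads(cached)
        
        # Cache miss - fetch from DB
        data = fetch_fn()
        # Cache empty results ([] or {}) too, so they don't hit the DB every time
        if data is not None:
            self.redis.setex(key, self.ttl, json.dumps(data))
        return data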
    
    def invalidate(self, key):
        self.redis.delete(key)
```

## Best Practices

### ✅ Do This

- ✅ Design schema based on query patterns (not entity relationships)
- ✅ Embed data that is queried together
- ✅ Use TTL for cache expiration
- ✅ Implement cache invalidation strategy
- ✅ Use connection pooling
- ✅ Handle cache failures gracefully
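One way to read "handle cache failures gracefully" is that a cache outage should cost latency, not availability. The sketch below assumes a client exposing `get` and `setex` like redis-py; `get_with_fallback` and `DownCache` are hypothetical names, and the broad `except Exception` is a placeholder you would likely narrow to your client's error type (for redis-py, `redis.exceptions.RedisError`).

```python
import json
import logging

logger = logging.getLogger(__name__)


def get_with_fallback(cache, key, fetch_fn, ttl=3600):
    """Cache-aside read that degrades to the data source if the cache fails."""
    try:
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)
    except Exception as exc:  # broad on purpose for the sketch
        logger.warning("cache read failed for %s: %s", key, exc)

    data = fetch_fn()
    if data is not None:
        try:
            cache.setex(key, ttl, json.dumps(data))
        except Exception as exc:
            logger.warning("cache write failed for %s: %s", key, exc)
    return data


class DownCache:
    """Hypothetical stand-in that simulates an unreachable cache."""

    def get(self, key):
        raise ConnectionError("cache unavailable")

    def setex(self, key, ttl, value):
        raise ConnectionError("cache unavailable")


# The read still succeeds; only a warning is logged.
print(get_with_fallback(DownCache(), "user:1", lambda: {"name": "John"}))
# {'name': 'John'}
```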

### ❌ Avoid This

- ❌ Don't use NoSQL just because it's trendy
- ❌ Don't neglect indexing
- ❌ Don't store unbounded arrays in documents
- ❌ Don't cache everything (be selective)
- ❌ Don't forget about consistency requirements

## Common Pitfalls

**Problem:** Document too large (MongoDB 16MB limit)
**Solution:** Cap array growth with the bucket pattern, or move large or growing subdocuments into their own collection and store references.

**Problem:** Cache stampede (many requests hit DB after cache expires)
**Solution:** Let only one caller rebuild an expired key (a short-lived lock), or refresh hot keys slightly before they expire (probabilistic early expiration).
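Probabilistic early expiration can be sketched as a single decision function, following the formulation commonly known as XFetch. It assumes you store, alongside each value, its absolute expiry time and how long the last recomputation took; `should_refresh` and its parameter names are illustrative.

```python
import math
import random
import time


def should_refresh(expiry, delta, beta=1.0, now=None, rand=random.random):
    """Return True if this caller should recompute the value early.

    expiry: absolute time (seconds) at which the cached entry expires.
    delta:  how long the last recomputation took, in seconds.
    beta:   values > 1 refresh earlier, values < 1 refresh later.
    """
    now = time.time() if now is None else now
    # random() is in [0, 1), so 1 - r is in (0, 1] and log() never sees 0.
    r = rand()
    return now - delta * beta * math.log(1.0 - r) >= expiry


# Fixed inputs for illustration: entry expires at t=100, recompute takes 10s.
print(should_refresh(expiry=100, delta=10, now=95, rand=lambda: 0.5))  # True
print(should_refresh(expiry=100, delta=10, now=90, rand=lambda: 0.5))  # False
```

Because the random term grows with `delta`, expensive keys start refreshing earlier, and in expectation only a few callers recompute before the entry actually expires.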

**Problem:** Stale cache data
**Solution:** Delete or update the cached key whenever the underlying record is written, and keep a TTL as a safety net.
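A minimal sketch of invalidation on the write path, runnable without a server. `DictCache` is a hypothetical in-memory stand-in for a Redis client, a plain dict stands in for the database, and `read_user` / `update_user` are illustrative names.

```python
import json


class DictCache:
    """Hypothetical in-memory stand-in for a Redis client (TTL is ignored)."""

    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def setex(self, key, ttl, value):
        self.store[key] = value

    def delete(self, key):
        self.store.pop(key, None)


def read_user(cache, db, user_id, ttl=3600):
    """Cache-aside read: serve from cache, else load and populate."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    user = db.get(user_id)
    if user is not None:
        cache.setex(key, ttl, json.dumps(user))
    return user


def update_user(cache, db, user_id, changes):
    """Write to the database first, then drop the cached copy."""
    db[user_id] = {**db[user_id], **changes}
    cache.delete(f"user:{user_id}")


cache, db = DictCache(), {1: {"name": "John"}}
read_user(cache, db, 1)                       # populates the cache
update_user(cache, db, 1, {"name": "Jane"})   # invalidates it
print(read_user(cache, db, 1))                # {'name': 'Jane'}
```

Deleting the key rather than rewriting it keeps the write path simple: the next read repopulates the cache from the source of truth.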

## Related Skills

- `@senior-database-engineer-sql` - For relational databases
- `@senior-backend-developer` - For application integration
- `@senior-data-analyst` - For analytics queries