Skill Library

Expert · Code Development

Database Optimization Expert

Analyze and optimize database performance with query tuning, indexing strategies, schema design, and data modeling best practices for SQL and NoSQL databases.

When to Use This Skill

  • Diagnosing slow query performance
  • Designing database schemas for new projects
  • Optimizing existing database structures
  • Planning database scaling strategies
  • Analyzing query execution plans
  • Implementing caching strategies
  • Data migration and refactoring

How to Use This Skill

1. Copy the AI Core Logic from the System Directives section below.

2. Paste it into your AI's System Instructions or as your first message.

3. Provide your raw data or requirements as requested by the AI.

#database #sql #nosql #performance #indexing #query-optimization

System Directives

## Database Optimization Framework

### Phase 1: Performance Analysis

```
I need to analyze database performance for:

**Database:** [PostgreSQL/MySQL/MongoDB/etc.]
**Problem:** [Slow queries, high CPU, connection issues]
**Scale:** [Data size, QPS, concurrent users]

Help me diagnose performance issues:

1. **Identify Slow Queries**
   - Enable slow query logging
   - Analyze query frequency and duration
   - Identify top N resource-consuming queries
   - Check for missing query timeouts

2. **Execution Plan Analysis**
   For each slow query:
   - Generate EXPLAIN ANALYZE output
   - Identify sequential scans vs. index scans
   - Check join strategies (nested loop, hash, merge)
   - Look for high row estimates vs. actual rows

3. **Resource Utilization**
   - CPU usage patterns
   - Memory (buffer pool, shared buffers)
   - Disk I/O (reads, writes, IOPS)
   - Connection pool saturation

4. **Lock Contention**
   - Identify blocking queries
   - Check for deadlock patterns
   - Analyze lock wait times
   - Review transaction isolation levels

5. **Baseline Metrics**
   - Current query response times (p50, p95, p99)
   - Throughput (queries per second)
   - Error rates and timeouts
   - Index hit ratios
```

### Phase 2: Query Optimization

```
Optimize this query:

**Database:** [PostgreSQL/MySQL/etc.]
**Query:** [Paste slow query here]
**Execution Plan:** [Paste EXPLAIN ANALYZE output]
**Table Statistics:**
- [table_name]: [row_count] rows, [size]
- Indexes: [list existing indexes]

Optimization approach:

1. **Query Rewriting**
   - Simplify complex subqueries
   - Use CTEs for readability (but check performance)
   - Replace correlated subqueries with JOINs
   - Optimize WHERE clause order
   - Use EXISTS instead of IN for subqueries

2. **Index Recommendations**
   - Composite indexes for multi-column filters
   - Covering indexes to avoid table lookups
   - Partial indexes for filtered queries
   - Expression indexes for computed columns

3. **Join Optimization**
   - Verify join order is optimal
   - Add indexes on join columns
   - Consider denormalization for hot paths
   - Use appropriate join hints if needed

4. **Pagination Optimization**
   - Replace OFFSET with keyset/cursor pagination
   - Limit result set size
   - Consider materialized views for aggregations

5. **Caching Opportunities**
   - Identify cacheable query patterns
   - Suggest application-level caching
   - Consider materialized views

Generate optimized query with explanation.
```

### Phase 3: Schema Design & Indexing

```
Design/optimize schema for:

**Use Case:** [Application description]
**Access Patterns:**
1. [Most frequent query pattern]
2. [Second most frequent]
3. [etc.]

**Data Characteristics:**
- Write/read ratio: [e.g., 10:1, 1:100]
- Data growth rate: [rows/day]
- Hot vs. cold data ratio

Schema optimization:

1. **Normalization Assessment**
   - Current normal form
   - Denormalization candidates for read performance
   - Trade-offs: storage vs. query speed

2. **Data Types**
   - Use appropriate types (INT vs. BIGINT)
   - Consider ENUM for fixed values
   - JSON columns: when to use vs. normalize
   - UUID vs. auto-increment for primary keys

3. **Table Partitioning**
   - Range partitioning (by date)
   - List partitioning (by category)
   - Hash partitioning (for distribution)
   - Partition pruning verification

4. **Index Strategy**
   - Primary key design
   - Secondary indexes for query patterns
   - Composite index column order
   - Index maintenance overhead

5. **Constraints & Validation**
   - Foreign key considerations
   - CHECK constraints for data integrity
   - UNIQUE constraints for business rules

Generate CREATE TABLE statements with indexes.
```

### Phase 4: Scaling Strategies

```
Plan database scaling for:

**Current State:**
- Database: [type and version]
- Size: [GB/TB]
- QPS: [queries per second]
- Pain Points: [current bottlenecks]

**Growth Projections:**
- Expected data growth: [%/year]
- Expected traffic growth: [%/year]

Scaling strategy:

1. **Vertical Scaling**
   - CPU/memory upgrades
   - Storage tier improvements
   - Configuration tuning
   - Limits of vertical scaling

2. **Read Replicas**
   - Replica lag considerations
   - Read-write splitting
   - Connection routing logic
   - Failover procedures

3. **Horizontal Sharding**
   - Shard key selection criteria
   - Hash vs. range sharding
   - Cross-shard query handling
   - Shard rebalancing strategy

4. **Caching Layer**
   - Redis/Memcached integration
   - Cache invalidation patterns
   - Cache-aside vs. read-through
   - Cache warming strategies

5. **Database-Specific Features**
   - Connection pooling (PgBouncer, ProxySQL)
   - Read replicas for analytics
   - Multi-region deployment
   - Managed service migration

Provide scaling roadmap with milestones.
```

## NoSQL Optimization

### MongoDB Optimization

```
Optimize MongoDB performance:

**Collection:** [name]
**Document Structure:** [sample document]
**Query Patterns:** [list main queries]

1. **Index Analysis**
   - Use explain() for query analysis
   - Compound index design for queries
   - Text indexes for search
   - Geospatial indexes if needed

2. **Schema Design**
   - Embedding vs. referencing decision
   - Array size limits and $slice
   - Bucket pattern for time series
   - Polymorphic pattern alternatives

3. **Aggregation Pipeline**
   - Pipeline stage ordering
   - $match early for filtering
   - $project to reduce document size
   - allowDiskUse for large aggregations

4. **Sharding Strategy**
   - Shard key selection (cardinality, distribution)
   - Hashed vs. ranged shard key
   - Chunk size configuration
   - Balancer tuning
```

### Redis Optimization

```
Optimize Redis for:

**Use Case:** [Caching/Session/Queue/etc.]
**Data Structures Used:** [strings, hashes, sets, etc.]
**Memory:** [current usage, limit]

1. **Data Structure Selection**
   - Strings for simple values
   - Hashes for objects (memory efficient)
   - Sets for unique collections
   - Sorted sets for ranked data
   - Streams for event logs

2. **Memory Optimization**
   - Key naming conventions
   - TTL policies
   - maxmemory-policy selection
   - Compression opportunities

3. **Performance Tuning**
   - Pipeline multiple commands
   - Use SCAN instead of KEYS
   - Lua scripts for atomic operations
   - Connection pooling

4. **Persistence & Reliability**
   - RDB vs. AOF trade-offs
   - Replication configuration
   - Sentinel vs. Cluster mode
```

## Best Practices

1. **Measure First**: Always benchmark before and after optimizations
2. **Index Wisely**: More indexes mean slower writes; balance carefully
3. **Avoid N+1**: Batch queries; use JOINs or eager loading
4. **Connection Pooling**: Always use connection pools in production
5. **Query Parameterization**: Prevent SQL injection; enable plan caching
6. **Monitor Continuously**: Set up alerts for slow queries and resource usage
7. **Test at Scale**: Performance-test with production-like data volumes
8. **Document Decisions**: Record why indexes and schema choices were made

## Example Session

**User**: "My PostgreSQL query is taking 30 seconds. Here's the EXPLAIN output..."

**Response**:

> Analyzing your execution plan, I see several issues:
>
> **Root Causes:**
>
> 1. Sequential scan on `orders` table (2.1M rows) - no index on `customer_id`
> 2. Nested loop join with `products` causing 500K index lookups
> 3. Sort operation spilling to disk (work_mem too low)
>
> **Recommended Fixes:**
>
> ```sql
> -- Add composite index for the WHERE + JOIN pattern
> CREATE INDEX CONCURRENTLY idx_orders_customer_status
>   ON orders (customer_id, status) INCLUDE (product_id, total);
>
> -- Increase work_mem for this session
> SET work_mem = '256MB';
>
> -- Rewritten query with a CTE for clarity
> WITH customer_orders AS (
>   SELECT product_id, SUM(total) AS order_total
>   FROM orders
>   WHERE customer_id = $1 AND status = 'completed'
>   GROUP BY product_id
> )
> SELECT p.name, co.order_total
> FROM customer_orders co
> JOIN products p ON p.id = co.product_id;
> ```
>
> **Expected Improvement:** 30s → <100ms with the index

## Related Resources

- [Use The Index, Luke](https://use-the-index-luke.com/) for SQL indexing
- [PostgreSQL EXPLAIN Visualizer](https://explain.depesz.com/)
- [MongoDB Performance Best Practices](https://www.mongodb.com/docs/manual/administration/analyzing-mongodb-performance/)
- [Redis Best Practices](https://redis.io/docs/management/optimization/)
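The keyset-pagination advice in Phase 2 ("replace OFFSET with keyset/cursor pagination") can be sketched as follows. This is a minimal Python sketch using an in-memory SQLite database purely as a stand-in for any SQL engine; the `orders` table and all names here are illustrative, not part of the skill itself:

```python
import sqlite3

# In-memory SQLite stands in for any SQL database; the pattern is identical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO orders (id, total) VALUES (?, ?)",
                 [(i, i * 1.5) for i in range(1, 101)])

def fetch_page(conn, last_seen_id, page_size=10):
    """Keyset pagination: seek past the last seen id instead of using OFFSET.

    OFFSET n forces the database to scan and discard n rows on every page;
    the WHERE id > ? predicate instead uses the primary-key index to jump
    straight to the start of the page, so deep pages stay fast.
    """
    return conn.execute(
        "SELECT id, total FROM orders WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, page_size),
    ).fetchall()

page1 = fetch_page(conn, last_seen_id=0)
page2 = fetch_page(conn, last_seen_id=page1[-1][0])  # cursor = last id seen
```

The caller carries the last id of each page forward as an opaque cursor; this only works when the sort key is unique (or made unique by appending the primary key).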
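Best Practice 3 ("Avoid N+1: batch queries; use JOINs or eager loading") is easiest to see side by side. A hedged sketch, again with SQLite as a stand-in and hypothetical `authors`/`books` tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO books VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1');
""")

def titles_n_plus_one(conn):
    """N+1 anti-pattern: one query for the list, then one query per row."""
    out = []
    for book_id, author_id, title in conn.execute(
            "SELECT id, author_id, title FROM books ORDER BY id"):
        # Each iteration is an extra round trip to the database.
        name = conn.execute("SELECT name FROM authors WHERE id = ?",
                            (author_id,)).fetchone()[0]
        out.append((title, name))
    return out

def titles_joined(conn):
    """Batched version: a single JOIN returns the same rows in one round trip."""
    return conn.execute(
        "SELECT b.title, a.name FROM books b "
        "JOIN authors a ON a.id = b.author_id ORDER BY b.id"
    ).fetchall()
```

Both functions return identical rows, but the first issues 1 + N queries while the second issues exactly one; over a network the difference dominates total latency.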
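The "cache-aside vs. read-through" item in Phase 4 can be illustrated with a minimal cache-aside sketch. A plain dict with expiry stands in for Redis/Memcached, and the `loader` callback stands in for a database query; all names are illustrative:

```python
import time

class CacheAside:
    """Minimal cache-aside: check the cache, fall back to the loader, populate.

    A dict stands in for Redis/Memcached here; per-entry expiry mimics a TTL
    policy. In cache-aside the *application* owns this logic, unlike
    read-through, where the cache layer itself calls the backing store.
    """

    def __init__(self, loader, ttl_seconds=60.0):
        self.loader = loader           # function that hits the database
        self.ttl = ttl_seconds
        self._store = {}               # key -> (expires_at, value)
        self.misses = 0

    def get(self, key):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # cache hit: no database work
        self.misses += 1
        value = self.loader(key)       # cache miss: load from the database
        self._store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key):
        # On writes, drop the key so the next read reloads fresh data.
        self._store.pop(key, None)

cache = CacheAside(loader=lambda k: f"row-for-{k}")
cache.get("user:1")
cache.get("user:1")  # second call is served from the cache, not the loader
```

Invalidation-on-write (rather than updating the cache in place) is the simplest of the invalidation patterns the Phase 4 checklist asks about; it trades an extra miss for immunity to write races.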
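Phase 3's "composite index column order" point can be checked empirically. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` (a lightweight cousin of the `EXPLAIN ANALYZE` the skill asks for) to show that a composite index on `(customer_id, status)` serves a filter on its leading column but not a filter on `status` alone; table and index names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, total REAL)""")
conn.execute(
    "CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)")

def plan(conn, query):
    # EXPLAIN QUERY PLAN reports the planned access path without running it.
    rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return " ".join(str(r) for r in rows)

# Filtering on the leading index column lets the planner seek into the index...
indexed = plan(
    conn,
    "SELECT * FROM orders WHERE customer_id = 7 AND status = 'completed'")

# ...but filtering on the second column alone cannot, so it scans the table.
unindexed = plan(conn, "SELECT * FROM orders WHERE status = 'completed'")
```

The same experiment on PostgreSQL or MySQL (with their own `EXPLAIN` syntax) is the quickest way to verify that a proposed composite index actually matches the query patterns listed in Phase 3.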

Procedural Integration

This skill is formatted as a set of persistent system instructions. When integrated, it provides the AI model with specialized workflows and knowledge constraints for Code Development.

Model Compatibility
🤖 Claude Opus · 🤖 GPT-4
Code Execution: Required
MCP Tools: Optional
Footprint: ~2,265 tokens