Database Designer - POWERFUL Tier Skill
A database design and analysis toolkit providing schema analysis, index optimization, and migration generation for modern database systems.
Features
🔍 Schema Analyzer
- Normalization Analysis: Automated detection of 1NF through BCNF violations
- Data Type Optimization: Identifies antipatterns and inappropriate types
- Constraint Analysis: Finds missing foreign keys, unique constraints, and checks
- ERD Generation: Creates Mermaid diagrams from DDL or JSON schema
- Naming Convention Validation: Ensures consistent naming patterns
⚡ Index Optimizer
- Missing Index Detection: Identifies indexes needed for query patterns
- Composite Index Design: Optimizes column ordering for maximum efficiency
- Redundancy Analysis: Finds duplicate and overlapping indexes
- Performance Modeling: Estimates selectivity and query performance impact
- Covering Index Recommendations: Eliminates table lookups
🚀 Migration Generator
- Zero-Downtime Migrations: Implements expand-contract patterns
- Schema Evolution: Handles column changes, table renames, constraint updates
- Data Migration Scripts: Automated data transformation and validation
- Rollback Planning: Complete reversal capabilities for all changes
- Execution Orchestration: Dependency-aware migration ordering
Quick Start
Prerequisites
- Python 3.7+ (no external dependencies required)
- Database schema in SQL DDL format or JSON
- Query patterns (for index optimization)
Installation
# Clone or download the database-designer skill
cd engineering/database-designer/
# Make scripts executable
chmod +x *.py
Usage Examples
Schema Analysis
Analyze SQL DDL file:
python schema_analyzer.py --input assets/sample_schema.sql --output-format text
Generate ERD diagram:
python schema_analyzer.py --input assets/sample_schema.sql --generate-erd --output analysis.txt
JSON schema analysis:
python schema_analyzer.py --input assets/sample_schema.json --output-format json --output results.json
Index Optimization
Basic index analysis:
python index_optimizer.py --schema assets/sample_schema.json --queries assets/sample_query_patterns.json
High-priority recommendations only:
python index_optimizer.py --schema assets/sample_schema.json --queries assets/sample_query_patterns.json --min-priority 2
JSON output with existing index analysis:
python index_optimizer.py --schema assets/sample_schema.json --queries assets/sample_query_patterns.json --format json --analyze-existing
Migration Generation
Generate migration between schemas:
python migration_generator.py --current assets/current_schema.json --target assets/target_schema.json
Zero-downtime migration:
python migration_generator.py --current current.json --target target.json --zero-downtime --format sql
Include validation queries:
python migration_generator.py --current current.json --target target.json --include-validations --output migration_plan.txt
Tool Documentation
Schema Analyzer
Input Formats:
- SQL DDL files (.sql)
- JSON schema definitions (.json)
Key Capabilities:
- Detects 1NF violations (non-atomic values, repeating groups)
- Identifies 2NF issues (partial dependencies in composite keys)
- Finds 3NF problems (transitive dependencies)
- Checks BCNF compliance (determinant key requirements)
- Validates data types (VARCHAR(255) antipattern, inappropriate types)
- Missing constraints (NOT NULL, UNIQUE, CHECK, foreign keys)
- Naming convention adherence
Sample Command:
python schema_analyzer.py \
--input sample_schema.sql \
--generate-erd \
--output-format text \
--output analysis.txt
Output:
- Comprehensive text or JSON analysis report
- Mermaid ERD diagram
- Prioritized recommendations
- SQL statements for improvements
Index Optimizer
Input Requirements:
- Schema definition (JSON format)
- Query patterns with frequency and selectivity data
Analysis Features:
- Selectivity estimation based on column patterns
- Composite index column ordering optimization
- Covering index recommendations for SELECT queries
- Foreign key index validation
- Redundancy detection (duplicates, overlaps, unused indexes)
- Performance impact modeling
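The composite-index column-ordering heuristic above can be sketched as follows. This is an illustrative implementation of the common rule (equality predicates first, most selective leading, then range predicates, then ORDER BY columns), not the tool's exact algorithm; the condition dictionaries reuse the query-patterns field names shown later in this document.

```python
# Sketch of composite-index column ordering: equality predicates first
# (most selective leading), then range predicates, then ORDER BY columns.
# Illustrative only -- not index_optimizer.py's actual implementation.
def order_index_columns(conditions, order_by):
    eq = sorted((c for c in conditions if c["operator"] == "="),
                key=lambda c: c["selectivity"], reverse=True)
    rng = [c for c in conditions
           if c["operator"] in ("<", ">", "<=", ">=", "BETWEEN")]
    cols = [c["column"] for c in eq + rng]
    # Append sort columns not already covered by the predicates
    cols += [o["column"] for o in order_by if o["column"] not in cols]
    return cols

conds = [
    {"column": "created_at", "operator": ">", "selectivity": 0.3},
    {"column": "tenant_id", "operator": "=", "selectivity": 0.9},
]
print(order_index_columns(conds,
                          [{"column": "created_at", "direction": "DESC"}]))
# ['tenant_id', 'created_at']
```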
Sample Command:
python index_optimizer.py \
--schema schema.json \
--queries query_patterns.json \
--format text \
--min-priority 3 \
--output recommendations.txt
Output:
- Prioritized index recommendations
- CREATE INDEX statements
- Drop statements for redundant indexes
- Performance impact analysis
- Storage size estimates
Migration Generator
Input Requirements:
- Current schema (JSON format)
- Target schema (JSON format)
Migration Strategies:
- Standard migrations with ALTER statements
- Zero-downtime expand-contract patterns
- Data migration and transformation scripts
- Constraint management (add/drop in correct order)
- Index management with timing estimates
Sample Command:
python migration_generator.py \
--current current_schema.json \
--target target_schema.json \
--zero-downtime \
--include-validations \
--format text
Output:
- Step-by-step migration plan
- Forward and rollback SQL statements
- Risk assessment for each step
- Validation queries
- Execution time estimates
File Structure
database-designer/
├── README.md # This file
├── SKILL.md # Comprehensive database design guide
├── schema_analyzer.py # Schema analysis tool
├── index_optimizer.py # Index optimization tool
├── migration_generator.py # Migration generation tool
├── references/ # Reference documentation
│ ├── normalization_guide.md # Normalization principles and patterns
│ ├── index_strategy_patterns.md # Index design and optimization guide
│ └── database_selection_decision_tree.md # Database technology selection
├── assets/ # Sample files and test data
│ ├── sample_schema.sql # Sample DDL with various issues
│ ├── sample_schema.json # JSON schema definition
│ └── sample_query_patterns.json # Query patterns for index analysis
└── expected_outputs/ # Example tool outputs
├── schema_analysis_sample.txt # Sample schema analysis report
├── index_optimization_sample.txt # Sample index recommendations
└── migration_sample.txt # Sample migration plan
JSON Schema Format
The tools use a standardized JSON format for schema definitions:
{
"tables": {
"table_name": {
"columns": {
"column_name": {
"type": "VARCHAR(255)",
"nullable": true,
"unique": false,
"foreign_key": "other_table.column",
"default": "default_value",
"cardinality_estimate": 1000
}
},
"primary_key": ["id"],
"unique_constraints": [["email"], ["username"]],
"check_constraints": {
"chk_positive_price": "price > 0"
},
"indexes": [
{
"name": "idx_table_column",
"columns": ["column_name"],
"unique": false,
"partial_condition": "status = 'active'"
}
]
}
}
}
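A quick sanity check on a schema file in this format can catch the most common mistake (malformed foreign key references) before running the tools. The field names below (tables, columns, foreign_key) follow the example above; this is a minimal sketch, not part of the toolkit itself.

```python
# Minimal validation for the schema JSON format above: checks that
# every foreign_key uses "table.column" notation and points at a
# table that exists. Illustrative helper, not part of the toolkit.
def validate_schema(schema):
    errors = []
    tables = schema.get("tables", {})
    for table, tdef in tables.items():
        for col, cdef in tdef.get("columns", {}).items():
            fk = cdef.get("foreign_key")
            if fk is None:
                continue
            parts = fk.split(".")
            if len(parts) != 2 or not all(parts):
                errors.append("%s.%s: bad foreign_key %r" % (table, col, fk))
            elif parts[0] not in tables:
                errors.append("%s.%s: unknown table %r" % (table, col, parts[0]))
    return errors

schema = {
    "tables": {
        "users": {"columns": {"id": {"type": "INT"}}},
        "orders": {"columns": {"user_id": {"type": "INT",
                                           "foreign_key": "users.id"}}},
    }
}
print(validate_schema(schema))  # []
```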
Query Patterns Format
For index optimization, provide query patterns in this format:
{
"queries": [
{
"id": "user_lookup",
"type": "SELECT",
"table": "users",
"where_conditions": [
{
"column": "email",
"operator": "=",
"selectivity": 0.95
}
],
"join_conditions": [
{
"local_column": "user_id",
"foreign_table": "orders",
"foreign_column": "id",
"join_type": "INNER"
}
],
"order_by": [
{"column": "created_at", "direction": "DESC"}
],
"frequency": 1000,
"avg_execution_time_ms": 5.2
}
]
}
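To see why frequency and selectivity both matter for prioritization, consider a naive scoring heuristic over patterns in this format. This is illustrative only (not the optimizer's actual algorithm), and it assumes higher selectivity values mean a predicate eliminates more rows, as in the sample data above.

```python
# Naive priority score for a query pattern: frequency weighted by the
# best predicate selectivity. Illustrative only -- not the optimizer's
# actual algorithm. Assumes higher selectivity = more rows eliminated.
def score_query(q):
    sel = max((c["selectivity"] for c in q.get("where_conditions", [])),
              default=0.0)
    return q["frequency"] * sel

queries = [
    {"id": "user_lookup", "frequency": 1000,
     "where_conditions": [{"column": "email", "operator": "=",
                           "selectivity": 0.95}]},
    {"id": "status_scan", "frequency": 200,
     "where_conditions": [{"column": "status", "operator": "=",
                           "selectivity": 0.2}]},
]
ranked = sorted(queries, key=score_query, reverse=True)
print([q["id"] for q in ranked])  # ['user_lookup', 'status_scan']
```

A frequent, highly selective lookup outranks an infrequent scan, which is why both fields are required in the query-patterns file.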
Best Practices
Schema Analysis
- Start with DDL: Use actual CREATE TABLE statements when possible
- Include Constraints: Capture all existing constraints and indexes
- Consider History: Some denormalization may be intentional for performance
- Validate Results: Review recommendations against business requirements
Index Optimization
- Real Query Patterns: Use actual application queries, not theoretical ones
- Include Frequency: Query frequency is crucial for prioritization
- Monitor Performance: Validate recommendations with actual performance testing
- Gradual Implementation: Add indexes incrementally and monitor impact
Migration Planning
- Test Migrations: Always test on non-production environments first
- Backup First: Ensure complete backups before running migrations
- Monitor Progress: Watch for locks and performance impacts during execution
- Rollback Ready: Have rollback procedures tested and ready
Advanced Usage
Custom Selectivity Estimation
The index optimizer uses pattern-based selectivity estimation. You can improve accuracy by providing cardinality estimates in your schema JSON; for example, a status column with only five distinct values (note that JSON does not allow comments, so document such facts elsewhere):
{
  "columns": {
    "status": {
      "type": "VARCHAR(20)",
      "cardinality_estimate": 5
    }
  }
}
Zero-Downtime Migration Strategy
For production systems, use the zero-downtime flag to generate expand-contract migrations:
- Expand Phase: Add new columns/tables without constraints
- Dual Write: Application writes to both old and new structures
- Backfill: Populate new structures with existing data
- Contract Phase: Remove old structures after validation
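The four phases above map to concrete SQL like the following sketch, which uses a hypothetical rename of users.fullname to users.display_name. The statements and ordering are illustrative, not the generator's exact output.

```python
# Expand-contract phases for a hypothetical column rename
# (users.fullname -> users.display_name). Illustrative SQL only,
# not migration_generator.py's exact output.
phases = {
    "expand": [
        "ALTER TABLE users ADD COLUMN display_name VARCHAR(100);",
    ],
    "dual_write": [
        "-- application writes to both fullname and display_name",
    ],
    "backfill": [
        "UPDATE users SET display_name = fullname"
        " WHERE display_name IS NULL;",
    ],
    "contract": [
        "ALTER TABLE users DROP COLUMN fullname;",
    ],
}
# Phases must execute strictly in this order
for phase, statements in phases.items():
    print("-- " + phase)
    for stmt in statements:
        print(stmt)
```

The contract phase runs only after the backfill has been validated; until then, every step remains reversible.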
Integration with CI/CD
Integrate these tools into your deployment pipeline:
# Schema validation in CI
issues=$(python schema_analyzer.py --input schema.sql --output-format json | \
  jq '.constraint_analysis.total_issues')
test "$issues" -eq 0 || exit 1
# Generate migrations automatically
python migration_generator.py \
--current prod_schema.json \
--target new_schema.json \
--zero-downtime \
--output migration.sql
Troubleshooting
Common Issues
"No tables found in input file"
- Ensure SQL DDL uses standard CREATE TABLE syntax
- Check for syntax errors in DDL
- Verify file encoding (UTF-8 recommended)
"Invalid JSON schema"
- Validate JSON syntax with a JSON validator
- Ensure all required fields are present
- Check that foreign key references use "table.column" format
"Analysis shows no issues but problems exist"
- Tools use heuristic analysis - review recommendations carefully
- Some design decisions may be intentional (denormalization for performance)
- Consider domain-specific requirements not captured by general rules
Performance Tips
Large Schemas:
- Use --output-format json for machine processing
- Consider analyzing subsets of tables for very large schemas
- Provide cardinality estimates for better index recommendations
Complex Queries:
- Include actual execution times in query patterns
- Provide realistic frequency estimates
- Consider seasonal or usage pattern variations
Contributing
This is a self-contained skill with no external dependencies. To extend functionality:
- Follow the existing code patterns
- Maintain Python standard library only requirement
- Add comprehensive test cases for new features
- Update documentation and examples
License
This database designer skill is part of the claude-skills collection and follows the same licensing terms.