Database Designer - POWERFUL Tier Skill

A comprehensive database design and analysis toolkit that provides expert-level schema analysis, index optimization, and migration generation capabilities for modern database systems.

Features

🔍 Schema Analyzer

  • Normalization Analysis: Automated detection of 1NF through BCNF violations
  • Data Type Optimization: Identifies antipatterns and inappropriate types
  • Constraint Analysis: Finds missing foreign keys, unique constraints, and checks
  • ERD Generation: Creates Mermaid diagrams from DDL or JSON schema
  • Naming Convention Validation: Ensures consistent naming patterns

Index Optimizer

  • Missing Index Detection: Identifies indexes needed for query patterns
  • Composite Index Design: Optimizes column ordering for maximum efficiency
  • Redundancy Analysis: Finds duplicate and overlapping indexes
  • Performance Modeling: Estimates selectivity and query performance impact
  • Covering Index Recommendations: Eliminates table lookups

🚀 Migration Generator

  • Zero-Downtime Migrations: Implements expand-contract patterns
  • Schema Evolution: Handles column changes, table renames, constraint updates
  • Data Migration Scripts: Automated data transformation and validation
  • Rollback Planning: Generates rollback SQL alongside the forward statements
  • Execution Orchestration: Dependency-aware migration ordering

Quick Start

Prerequisites

  • Python 3.7+ (no external dependencies required)
  • Database schema in SQL DDL format or JSON
  • Query patterns (for index optimization)

Installation

# Clone or download the database-designer skill
cd engineering/database-designer/

# Make scripts executable
chmod +x *.py

Usage Examples

Schema Analysis

Analyze SQL DDL file:

python schema_analyzer.py --input assets/sample_schema.sql --output-format text

Generate an ERD:

python schema_analyzer.py --input assets/sample_schema.sql --generate-erd --output analysis.txt

JSON schema analysis:

python schema_analyzer.py --input assets/sample_schema.json --output-format json --output results.json

Index Optimization

Basic index analysis:

python index_optimizer.py --schema assets/sample_schema.json --queries assets/sample_query_patterns.json

High-priority recommendations only:

python index_optimizer.py --schema assets/sample_schema.json --queries assets/sample_query_patterns.json --min-priority 2

JSON output with existing index analysis:

python index_optimizer.py --schema assets/sample_schema.json --queries assets/sample_query_patterns.json --format json --analyze-existing

Migration Generation

Generate migration between schemas:

python migration_generator.py --current assets/current_schema.json --target assets/target_schema.json

Zero-downtime migration:

python migration_generator.py --current current.json --target target.json --zero-downtime --format sql

Include validation queries:

python migration_generator.py --current current.json --target target.json --include-validations --output migration_plan.txt

Tool Documentation

Schema Analyzer

Input Formats:

  • SQL DDL files (.sql)
  • JSON schema definitions (.json)

Key Capabilities:

  • Detects 1NF violations (non-atomic values, repeating groups)
  • Identifies 2NF issues (partial dependencies in composite keys)
  • Finds 3NF problems (transitive dependencies)
  • Checks BCNF compliance (every determinant must be a candidate key)
  • Validates data types (VARCHAR(255) antipattern, inappropriate types)
  • Flags missing constraints (NOT NULL, UNIQUE, CHECK, foreign keys)
  • Checks adherence to naming conventions

Sample Command:

python schema_analyzer.py \
  --input sample_schema.sql \
  --generate-erd \
  --output-format text \
  --output analysis.txt

Output:

  • Comprehensive text or JSON analysis report
  • Mermaid ERD diagram
  • Prioritized recommendations
  • SQL statements for improvements
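
To make the ERD output concrete, here is a minimal sketch of how a Mermaid diagram could be derived from the JSON schema format described later in this README. It is not the analyzer's actual implementation; the function name `schema_to_mermaid` and the choice to draw every foreign key as a one-to-many relationship are assumptions made for illustration.

```python
def schema_to_mermaid(schema: dict) -> str:
    """Render a Mermaid erDiagram from the JSON schema format (illustrative only)."""
    lines = ["erDiagram"]
    relations = []
    for table, spec in schema["tables"].items():
        lines.append(f"    {table} {{")
        for column, meta in spec["columns"].items():
            # Drop the length, e.g. VARCHAR(255) -> varchar, to keep the type a simple word.
            col_type = meta["type"].split("(")[0].lower()
            lines.append(f"        {col_type} {column}")
            fk = meta.get("foreign_key")
            if fk:
                parent = fk.split(".")[0]
                # Assumption: each foreign key is drawn as parent one-to-many child.
                relations.append(f'    {parent} ||--o{{ {table} : "{column}"')
        lines.append("    }")
    return "\n".join(lines + relations)


example_schema = {
    "tables": {
        "users": {"columns": {"id": {"type": "INT"}}},
        "orders": {
            "columns": {
                "id": {"type": "INT"},
                "user_id": {"type": "INT", "foreign_key": "users.id"},
            }
        },
    }
}
erd_text = schema_to_mermaid(example_schema)
```

Printing `erd_text` gives a diagram that can be pasted into any Mermaid renderer.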

Index Optimizer

Input Requirements:

  • Schema definition (JSON format)
  • Query patterns with frequency and selectivity data

Analysis Features:

  • Selectivity estimation based on column patterns
  • Composite index column ordering optimization
  • Covering index recommendations for SELECT queries
  • Foreign key index validation
  • Redundancy detection (duplicates, overlaps, unused indexes)
  • Performance impact modeling
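
As an illustration of the redundancy check, the sketch below flags an index whose columns are a leading prefix of another index on the same table. This is one common rule, not necessarily the optimizer's exact logic. The function name is hypothetical, and the sketch assumes that unique indexes are never redundant (they enforce a constraint) and that only indexes with the same `partial_condition` can cover each other.

```python
def find_redundant_indexes(indexes: list) -> list:
    """Return (redundant_index, covering_index) name pairs for one table."""
    redundant = []
    for i, a in enumerate(indexes):
        for j, b in enumerate(indexes):
            if i == j:
                continue
            a_cols, b_cols = a["columns"], b["columns"]
            same_filter = a.get("partial_condition") == b.get("partial_condition")
            is_prefix = b_cols[:len(a_cols)] == a_cols
            # For exact duplicates, report only the later definition.
            if same_filter and is_prefix and (len(a_cols) < len(b_cols) or i > j):
                # Assumption: a unique index enforces a constraint, so keep it.
                if not a.get("unique", False):
                    redundant.append((a["name"], b["name"]))
                break
    return redundant


sample_indexes = [
    {"name": "idx_a", "columns": ["a"]},
    {"name": "idx_a_b", "columns": ["a", "b"]},
    {"name": "idx_b", "columns": ["b"]},
    {"name": "idx_a_dup", "columns": ["a"]},
]
redundant_pairs = find_redundant_indexes(sample_indexes)
```

Detecting unused indexes needs runtime statistics from the database and is outside the scope of this sketch.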

Sample Command:

python index_optimizer.py \
  --schema schema.json \
  --queries query_patterns.json \
  --format text \
  --min-priority 3 \
  --output recommendations.txt

Output:

  • Prioritized index recommendations
  • CREATE INDEX statements
  • Drop statements for redundant indexes
  • Performance impact analysis
  • Storage size estimates

Migration Generator

Input Requirements:

  • Current schema (JSON format)
  • Target schema (JSON format)

Migration Strategies:

  • Standard migrations with ALTER statements
  • Zero-downtime expand-contract patterns
  • Data migration and transformation scripts
  • Constraint management (add/drop in correct order)
  • Index management with timing estimates
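
Adding and dropping constraints "in correct order" comes down to a dependency ordering over foreign keys. The following sketch shows one way to compute a safe table creation order with a topological sort over the JSON schema format. It is an assumption about how such ordering can be done, not the generator's actual code; `creation_order` is a hypothetical name, and self-references are simply ignored.

```python
from collections import deque


def creation_order(schema: dict) -> list:
    """Order tables so every referenced (parent) table comes before its children."""
    tables = schema["tables"]
    parents = {name: set() for name in tables}
    for name, spec in tables.items():
        for meta in spec["columns"].values():
            fk = meta.get("foreign_key")
            if fk:
                parent = fk.split(".")[0]
                if parent != name and parent in tables:  # ignore self-references
                    parents[name].add(parent)

    # Kahn's algorithm: start with tables that depend on nothing.
    ready = deque(sorted(n for n, p in parents.items() if not p))
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for name in sorted(parents):
            if current in parents[name]:
                parents[name].discard(current)
                if not parents[name]:
                    ready.append(name)
    if len(order) != len(tables):
        raise ValueError("circular foreign key dependency detected")
    return order


shop = {
    "tables": {
        "order_items": {"columns": {
            "order_id": {"type": "INT", "foreign_key": "orders.id"},
            "product_id": {"type": "INT", "foreign_key": "products.id"},
        }},
        "orders": {"columns": {"user_id": {"type": "INT", "foreign_key": "users.id"}}},
        "products": {"columns": {"id": {"type": "INT"}}},
        "users": {"columns": {"id": {"type": "INT"}}},
    }
}
table_order = creation_order(shop)
```

Dropping tables or constraints would use the reverse of this order.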

Sample Command:

python migration_generator.py \
  --current current_schema.json \
  --target target_schema.json \
  --zero-downtime \
  --include-validations \
  --format text

Output:

  • Step-by-step migration plan
  • Forward and rollback SQL statements
  • Risk assessment for each step
  • Validation queries
  • Execution time estimates

File Structure

database-designer/
├── README.md                          # This file
├── SKILL.md                          # Comprehensive database design guide
├── schema_analyzer.py                # Schema analysis tool
├── index_optimizer.py                # Index optimization tool  
├── migration_generator.py            # Migration generation tool
├── references/                       # Reference documentation
│   ├── normalization_guide.md        # Normalization principles and patterns
│   ├── index_strategy_patterns.md    # Index design and optimization guide
│   └── database_selection_decision_tree.md # Database technology selection
├── assets/                           # Sample files and test data
│   ├── sample_schema.sql            # Sample DDL with various issues
│   ├── sample_schema.json           # JSON schema definition
│   └── sample_query_patterns.json   # Query patterns for index analysis
└── expected_outputs/                 # Example tool outputs
    ├── schema_analysis_sample.txt   # Sample schema analysis report
    ├── index_optimization_sample.txt # Sample index recommendations
    └── migration_sample.txt         # Sample migration plan

JSON Schema Format

The tools use a standardized JSON format for schema definitions:

{
  "tables": {
    "table_name": {
      "columns": {
        "column_name": {
          "type": "VARCHAR(255)",
          "nullable": true,
          "unique": false,
          "foreign_key": "other_table.column",
          "default": "default_value",
          "cardinality_estimate": 1000
        }
      },
      "primary_key": ["id"],
      "unique_constraints": [["email"], ["username"]],
      "check_constraints": {
        "chk_positive_price": "price > 0"
      },
      "indexes": [
        {
          "name": "idx_table_column",
          "columns": ["column_name"],
          "unique": false,
          "partial_condition": "status = 'active'"
        }
      ]
    }
  }
}
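
The format above is straightforward to consume from Python. As a small example, this sketch lists foreign key columns that no index leads with, which is the kind of finding the foreign key index validation feature is meant to surface. The function name is hypothetical, and treating partial indexes as "not covering" is an assumption of this sketch.

```python
def unindexed_foreign_keys(schema: dict) -> list:
    """Return (table, column) pairs for foreign keys that no index leads with."""
    missing = []
    for table, spec in schema["tables"].items():
        # A column counts as indexed when it is the first column of a
        # non-partial index or of the primary key.
        leading = {
            idx["columns"][0]
            for idx in spec.get("indexes", [])
            if idx["columns"] and not idx.get("partial_condition")
        }
        primary_key = spec.get("primary_key", [])
        if primary_key:
            leading.add(primary_key[0])
        for column, meta in spec["columns"].items():
            if meta.get("foreign_key") and column not in leading:
                missing.append((table, column))
    return missing


orders_schema = {
    "tables": {
        "orders": {
            "columns": {
                "id": {"type": "INT"},
                "user_id": {"type": "INT", "foreign_key": "users.id"},
                "product_id": {"type": "INT", "foreign_key": "products.id"},
            },
            "primary_key": ["id"],
            "indexes": [
                {"name": "idx_orders_user", "columns": ["user_id"], "unique": False}
            ],
        }
    }
}
missing_fk_indexes = unindexed_foreign_keys(orders_schema)
```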

Query Patterns Format

For index optimization, provide query patterns in this format:

{
  "queries": [
    {
      "id": "user_lookup",
      "type": "SELECT",
      "table": "users",
      "where_conditions": [
        {
          "column": "email",
          "operator": "=",
          "selectivity": 0.95
        }
      ],
      "join_conditions": [
        {
          "local_column": "id",
          "foreign_table": "orders",
          "foreign_column": "user_id",
          "join_type": "INNER"
        }
      ],
      "order_by": [
        {"column": "created_at", "direction": "DESC"}
      ],
      "frequency": 1000,
      "avg_execution_time_ms": 5.2
    }
  ]
}
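
To show how a query pattern can drive composite index design, the sketch below applies one common heuristic: equality columns first, then range columns, then ORDER BY columns. It is not claimed to be the optimizer's algorithm, and `suggest_index_columns` is a hypothetical name. Note that in a B-tree index, columns placed after a range column generally cannot supply sort order, so the trailing ORDER BY columns here mainly help with covering.

```python
def suggest_index_columns(query: dict) -> list:
    """Order candidate index columns: equality, then range, then ORDER BY."""
    equality, ranges = [], []
    for cond in query.get("where_conditions", []):
        # Assumption: "=" and "IN" are treated as equality; everything else as range.
        target = equality if cond["operator"] in ("=", "IN") else ranges
        if cond["column"] not in equality + ranges:
            target.append(cond["column"])
    ordered = equality + ranges
    for item in query.get("order_by", []):
        if item["column"] not in ordered:
            ordered.append(item["column"])
    return ordered


user_lookup = {
    "id": "user_lookup",
    "table": "users",
    "where_conditions": [{"column": "email", "operator": "=", "selectivity": 0.95}],
    "order_by": [{"column": "created_at", "direction": "DESC"}],
}
lookup_columns = suggest_index_columns(user_lookup)
```

For the `user_lookup` pattern this yields `email` followed by `created_at`, which would translate to an index on `users (email, created_at)`.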

Best Practices

Schema Analysis

  1. Start with DDL: Use actual CREATE TABLE statements when possible
  2. Include Constraints: Capture all existing constraints and indexes
  3. Consider History: Some denormalization may be intentional for performance
  4. Validate Results: Review recommendations against business requirements

Index Optimization

  1. Real Query Patterns: Use actual application queries, not theoretical ones
  2. Include Frequency: Query frequency is crucial for prioritization
  3. Monitor Performance: Validate recommendations with actual performance testing
  4. Gradual Implementation: Add indexes incrementally and monitor impact
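
One simple way to see why frequency matters is to rank query patterns by total load, taken here as `frequency` multiplied by `avg_execution_time_ms`. This scoring is an assumption for illustration and may differ from how the optimizer assigns priorities; `rank_queries` is a hypothetical helper.

```python
def rank_queries(patterns: dict) -> list:
    """Return (query_id, estimated_load_ms) pairs, heaviest first."""
    scored = []
    for query in patterns["queries"]:
        # Assumed load metric: how often the query runs times how long it takes.
        load = query.get("frequency", 0) * query.get("avg_execution_time_ms", 0.0)
        scored.append((query["id"], load))
    return sorted(scored, key=lambda item: item[1], reverse=True)


sample_patterns = {
    "queries": [
        {"id": "user_lookup", "frequency": 1000, "avg_execution_time_ms": 5.2},
        {"id": "monthly_report", "frequency": 2, "avg_execution_time_ms": 900.0},
        {"id": "order_history", "frequency": 300, "avg_execution_time_ms": 40.0},
    ]
}
ranking = rank_queries(sample_patterns)
```

In this sample the slow but rare `monthly_report` ranks last, while the moderately slow and frequent `order_history` ranks first.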

Migration Planning

  1. Test Migrations: Always test on non-production environments first
  2. Backup First: Ensure complete backups before running migrations
  3. Monitor Progress: Watch for locks and performance impacts during execution
  4. Rollback Ready: Have rollback procedures tested and ready

Advanced Usage

Custom Selectivity Estimation

The index optimizer uses pattern-based selectivity estimation. You can improve accuracy by providing cardinality estimates in your schema JSON, for example for a status column with only five distinct values:

{
  "columns": {
    "status": {
      "type": "VARCHAR(20)",
      "cardinality_estimate": 5
    }
  }
}
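
As a rough guide to what a cardinality estimate implies, the sketch below computes the fraction of rows an equality predicate would match under a uniform-distribution assumption (one divided by the number of distinct values). The function names are hypothetical. This README does not say whether the tool's `selectivity` field uses this fraction or a different convention, so treat the mapping as an assumption.

```python
def equality_match_fraction(cardinality_estimate: int) -> float:
    """Fraction of rows matched by `column = value`, assuming uniform values."""
    if cardinality_estimate <= 0:
        raise ValueError("cardinality_estimate must be positive")
    return 1.0 / cardinality_estimate


def expected_rows(table_rows: int, cardinality_estimate: int) -> float:
    """Expected number of rows returned by an equality lookup."""
    return table_rows * equality_match_fraction(cardinality_estimate)


# A status column with 5 distinct values in a table of one million rows.
status_fraction = equality_match_fraction(5)
status_rows = expected_rows(1000000, 5)
```

A lookup that still returns about a fifth of the table is a weak candidate for a standalone index, which is why low-cardinality columns usually work better as part of a composite or partial index.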

Zero-Downtime Migration Strategy

For production systems, use the zero-downtime flag to generate expand-contract migrations:

  1. Expand Phase: Add new columns/tables without constraints
  2. Dual Write: Application writes to both old and new structures
  3. Backfill: Populate new structures with existing data
  4. Contract Phase: Remove old structures after validation
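
The four phases above can be sketched for a column rename as follows. The generated SQL is generic and illustrative rather than the tool's actual output: `expand_contract_rename` is a hypothetical helper, the validation step is an addition of this sketch, and `IS DISTINCT FROM` assumes a database that supports it (PostgreSQL does).

```python
def expand_contract_rename(table: str, old: str, new: str, col_type: str) -> dict:
    """Return SQL (or notes) for each phase of a zero-downtime column rename."""
    return {
        # Expand: add the new column as nullable, with no constraints yet.
        "expand": [f"ALTER TABLE {table} ADD COLUMN {new} {col_type};"],
        # Dual write happens in application code, not in SQL.
        "dual_write": [f"-- application writes both {table}.{old} and {table}.{new}"],
        # Backfill rows written before dual writes started.
        "backfill": [f"UPDATE {table} SET {new} = {old} WHERE {new} IS NULL;"],
        # Validate: this count should be 0 before contracting.
        "validate": [
            f"SELECT COUNT(*) FROM {table} WHERE {new} IS DISTINCT FROM {old};"
        ],
        # Contract: remove the old column once nothing reads it.
        "contract": [f"ALTER TABLE {table} DROP COLUMN {old};"],
    }


rename_plan = expand_contract_rename("users", "fullname", "display_name", "VARCHAR(100)")
```

On large tables the backfill would normally run in batches to avoid long-held locks; that detail is omitted here.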

Integration with CI/CD

Integrate these tools into your deployment pipeline:

# Schema validation in CI: fail the build unless the report shows zero constraint issues
python schema_analyzer.py --input schema.sql --output-format json | \
  jq -e '.constraint_analysis.total_issues == 0' > /dev/null || exit 1

# Generate migrations automatically
python migration_generator.py \
  --current prod_schema.json \
  --target new_schema.json \
  --zero-downtime \
  --format sql \
  --output migration.sql
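
If `jq` is not available on the CI runner, the same gate can be written with the Python standard library. This sketch assumes the JSON report exposes `constraint_analysis.total_issues`, as the shell example does; the function names are hypothetical.

```python
import json


def issue_count(report_text: str) -> int:
    """Extract the constraint issue count from a schema analyzer JSON report."""
    report = json.loads(report_text)
    # A missing field raises KeyError on purpose, so a malformed report fails the build.
    return int(report["constraint_analysis"]["total_issues"])


def gate(report_text: str) -> int:
    """Return a process exit code: 0 when the report is clean, 1 otherwise."""
    return 0 if issue_count(report_text) == 0 else 1


clean_exit = gate('{"constraint_analysis": {"total_issues": 0}}')
dirty_exit = gate('{"constraint_analysis": {"total_issues": 3}}')
```

In a pipeline, a small wrapper script would end with `sys.exit(gate(sys.stdin.read()))` and receive the analyzer's JSON output on standard input.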

Troubleshooting

Common Issues

"No tables found in input file"

  • Ensure SQL DDL uses standard CREATE TABLE syntax
  • Check for syntax errors in DDL
  • Verify file encoding (UTF-8 recommended)

"Invalid JSON schema"

  • Validate JSON syntax with a JSON validator
  • Ensure all required fields are present
  • Check that foreign key references use "table.column" format
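
A quick pre-flight check can catch malformed foreign key references before the tools run. The sketch below verifies the "table.column" form and that the referenced column exists in the same schema file. It is a standalone helper with a hypothetical name, not part of the shipped scripts.

```python
def invalid_foreign_keys(schema: dict) -> list:
    """Return human-readable problems found in foreign_key references."""
    problems = []
    tables = schema.get("tables", {})
    for table, spec in tables.items():
        for column, meta in spec.get("columns", {}).items():
            fk = meta.get("foreign_key")
            if fk is None:
                continue
            parts = fk.split(".")
            if len(parts) != 2 or not all(parts):
                problems.append(f"{table}.{column}: '{fk}' is not in table.column form")
            elif parts[1] not in tables.get(parts[0], {}).get("columns", {}):
                problems.append(f"{table}.{column}: '{fk}' does not exist in the schema")
    return problems


fk_check_schema = {
    "tables": {
        "users": {"columns": {"id": {"type": "INT"}}},
        "orders": {
            "columns": {
                "user_id": {"type": "INT", "foreign_key": "users.id"},
                "coupon_id": {"type": "INT", "foreign_key": "coupons"},
                "agent_id": {"type": "INT", "foreign_key": "users.agent"},
            }
        },
    }
}
fk_problems = invalid_foreign_keys(fk_check_schema)
```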

"Analysis shows no issues but problems exist"

  • Tools use heuristic analysis - review recommendations carefully
  • Some design decisions may be intentional (denormalization for performance)
  • Consider domain-specific requirements not captured by general rules

Performance Tips

Large Schemas:

  • Use --output-format json for machine processing
  • Consider analyzing subsets of tables for very large schemas
  • Provide cardinality estimates for better index recommendations

Complex Queries:

  • Include actual execution times in query patterns
  • Provide realistic frequency estimates
  • Consider seasonal or usage pattern variations

Contributing

This is a self-contained skill with no external dependencies. To extend functionality:

  1. Follow the existing code patterns
  2. Keep the tools standard-library only (no third-party packages)
  3. Add comprehensive test cases for new features
  4. Update documentation and examples

License

This database designer skill is part of the claude-skills collection and follows the same licensing terms.