Files
CleanArchitecture-template/.brain/.agent/skills/database-optimization/mysql/references/json-column-patterns.md
2026-03-12 15:17:52 +07:00

3.1 KiB

title, description, tags
title description tags
JSON Column Best Practices When and how to use JSON columns safely mysql, json, generated-columns, indexes, data-modeling

JSON Column Patterns

MySQL 5.7+ supports native JSON columns. Useful, but with important caveats.

When JSON Is Appropriate

  • Truly schema-less data (user preferences, metadata bags, webhook payloads).
  • Rarely filtered/joined — if you query a JSON path frequently, extract it to a real column.

Indexing JSON: Use Generated Columns

You cannot index a JSON column directly. Create a virtual generated column and index that:

ALTER TABLE events
  ADD COLUMN event_type VARCHAR(50) GENERATED ALWAYS AS (data->>'$.type') VIRTUAL,
  ADD INDEX idx_event_type (event_type);

Extraction Operators

Syntax Returns Use for
JSON_EXTRACT(col, '$.key') JSON type value (e.g., "foo" for strings) When you need JSON type semantics
col->'$.key' Same as JSON_EXTRACT(col, '$.key') Shorthand
col->>'$.key' Unquoted scalar (equivalent to JSON_UNQUOTE(JSON_EXTRACT(col, '$.key'))) WHERE comparisons, display

Always use ->> (unquote) in WHERE clauses, otherwise you compare against "foo" (with quotes).

Tip: the generated column example above can be written more concisely as:

ALTER TABLE events
  ADD COLUMN event_type VARCHAR(50) GENERATED ALWAYS AS (data->>'$.type') VIRTUAL,
  ADD INDEX idx_event_type (event_type);

Multi-Valued Indexes (MySQL 8.0.17+)

If you store arrays in JSON (e.g., tags: ["electronics","sale"]), MySQL 8.0.17+ supports multi-valued indexes to index array elements:

ALTER TABLE products
  ADD INDEX idx_tags ((CAST(tags AS CHAR(50) ARRAY)));

This can accelerate membership queries such as:

SELECT * FROM products WHERE 'electronics' MEMBER OF (tags);

Collation and Type Casting Pitfalls

  • JSON type comparisons: JSON_EXTRACT returns JSON type. Comparing directly to strings can be wrong for numbers/dates.
-- WRONG: lexicographic string comparison
WHERE data->>'$.price' <= '1200'

-- CORRECT: cast to numeric
WHERE CAST(data->>'$.price' AS UNSIGNED) <= 1200
  • Collation: values extracted with ->> behave like strings and use a collation. Use COLLATE when you need a specific comparison behavior.
WHERE data->>'$.status' COLLATE utf8mb4_0900_as_cs = 'Active'

Common Pitfalls

  • Heavy update cost: JSON_SET/JSON_REPLACE can touch large portions of a JSON document and generate significant redo/undo work on large blobs.
  • No partial indexes: You can only index extracted scalar paths via generated columns.
  • Large documents hurt: JSON stored inline in the row. Documents >8 KB spill to overflow pages, hurting read performance.
  • Type mismatches: JSON_EXTRACT returns a JSON type. Comparing with = 'foo' may not match — use ->> or JSON_UNQUOTE.
  • VIRTUAL vs STORED generated columns: VIRTUAL columns compute on read (less storage, more CPU). STORED columns materialize on write (more storage, faster reads if selected often). Both can be indexed; for indexed paths, the index stores the computed value either way.