add brain
This commit is contained in:
@@ -0,0 +1,77 @@
|
||||
---
|
||||
title: JSON Column Best Practices
|
||||
description: When and how to use JSON columns safely
|
||||
tags: mysql, json, generated-columns, indexes, data-modeling
|
||||
---
|
||||
|
||||
# JSON Column Patterns
|
||||
|
||||
MySQL 5.7+ supports native JSON columns. Useful, but with important caveats.
|
||||
|
||||
## When JSON Is Appropriate
|
||||
- Truly schema-less data (user preferences, metadata bags, webhook payloads).
|
||||
- Rarely filtered/joined — if you query a JSON path frequently, extract it to a real column.
|
||||
|
||||
## Indexing JSON: Use Generated Columns
|
||||
You **cannot** index a JSON column directly. Create a virtual generated column and index that:
|
||||
```sql
|
||||
ALTER TABLE events
|
||||
ADD COLUMN event_type VARCHAR(50) GENERATED ALWAYS AS (data->>'$.type') VIRTUAL,
|
||||
ADD INDEX idx_event_type (event_type);
|
||||
```
|
||||
|
||||
## Extraction Operators
|
||||
| Syntax | Returns | Use for |
|
||||
|---|---|---|
|
||||
| `JSON_EXTRACT(col, '$.key')` | JSON type value (e.g., `"foo"` for strings) | When you need JSON type semantics |
|
||||
| `col->'$.key'` | Same as `JSON_EXTRACT(col, '$.key')` | Shorthand |
|
||||
| `col->>'$.key'` | Unquoted scalar (equivalent to `JSON_UNQUOTE(JSON_EXTRACT(col, '$.key'))`) | WHERE comparisons, display |
|
||||
|
||||
Always use `->>` (unquote) in WHERE clauses, otherwise you compare against `"foo"` (with quotes).
|
||||
|
||||
Tip: the generated column example above can be written more concisely as:
|
||||
|
||||
```sql
|
||||
ALTER TABLE events
|
||||
ADD COLUMN event_type VARCHAR(50) GENERATED ALWAYS AS (data->>'$.type') VIRTUAL,
|
||||
ADD INDEX idx_event_type (event_type);
|
||||
```
|
||||
|
||||
## Multi-Valued Indexes (MySQL 8.0.17+)
|
||||
If you store arrays in JSON (e.g., `tags: ["electronics","sale"]`), MySQL 8.0.17+ supports multi-valued indexes to index array elements:
|
||||
|
||||
```sql
|
||||
ALTER TABLE products
|
||||
ADD INDEX idx_tags ((CAST(tags AS CHAR(50) ARRAY)));
|
||||
```
|
||||
|
||||
This can accelerate membership queries such as:
|
||||
|
||||
```sql
|
||||
SELECT * FROM products WHERE 'electronics' MEMBER OF (tags);
|
||||
```
|
||||
|
||||
## Collation and Type Casting Pitfalls
|
||||
- **JSON type comparisons**: `JSON_EXTRACT` returns JSON type. Comparing directly to strings can be wrong for numbers/dates.
|
||||
|
||||
```sql
|
||||
-- WRONG: lexicographic string comparison
|
||||
WHERE data->>'$.price' <= '1200'
|
||||
|
||||
-- CORRECT: cast to numeric
|
||||
WHERE CAST(data->>'$.price' AS UNSIGNED) <= 1200
|
||||
```
|
||||
|
||||
- **Collation**: values extracted with `->>` behave like strings and use a collation. Use `COLLATE` when you need a specific comparison behavior.
|
||||
|
||||
```sql
|
||||
WHERE data->>'$.status' COLLATE utf8mb4_0900_as_cs = 'Active'
|
||||
```
|
||||
|
||||
## Common Pitfalls
|
||||
- **Heavy update cost**: `JSON_SET`/`JSON_REPLACE` can touch large portions of a JSON document and generate significant redo/undo work on large blobs.
|
||||
- **No partial indexes**: You can only index extracted scalar paths via generated columns.
|
||||
- **Large documents hurt**: JSON stored inline in the row. Documents >8 KB spill to overflow pages, hurting read performance.
|
||||
- **Type mismatches**: `JSON_EXTRACT` returns a JSON type. Comparing with `= 'foo'` may not match — use `->>` or `JSON_UNQUOTE`.
|
||||
- **VIRTUAL vs STORED generated columns**: VIRTUAL columns compute on read (less storage, more CPU). STORED columns materialize on write (more storage, faster reads if selected often). Both can be indexed; for indexed paths, the index stores the computed value either way.
|
||||
|
||||
Reference in New Issue
Block a user