chore(skills): add Django migrations skills (#10260)

This commit is contained in:
Pepe Fagoaga
2026-03-12 17:37:43 +00:00
committed by GitHub
parent 80a814afce
commit b8c6f3ba67
4 changed files with 860 additions and 2 deletions


@@ -46,6 +46,8 @@ Use these skills for detailed patterns on-demand:
| `prowler-commit` | Professional commits (conventional-commits) | [SKILL.md](skills/prowler-commit/SKILL.md) |
| `prowler-pr` | Pull request conventions | [SKILL.md](skills/prowler-pr/SKILL.md) |
| `prowler-docs` | Documentation style guide | [SKILL.md](skills/prowler-docs/SKILL.md) |
| `django-migration-psql` | Django migration best practices for PostgreSQL | [SKILL.md](skills/django-migration-psql/SKILL.md) |
| `postgresql-indexing` | PostgreSQL indexing, EXPLAIN, monitoring, maintenance | [SKILL.md](skills/postgresql-indexing/SKILL.md) |
| `prowler-attack-paths-query` | Create Attack Paths openCypher queries | [SKILL.md](skills/prowler-attack-paths-query/SKILL.md) |
| `gh-aw` | GitHub Agentic Workflows (gh-aw) | [SKILL.md](skills/gh-aw/SKILL.md) |
| `skill-creator` | Create new AI agent skills | [SKILL.md](skills/skill-creator/SKILL.md) |
@@ -85,15 +87,15 @@ When performing these actions, ALWAYS invoke the corresponding skill FIRST:
| Fixing bug | `tdd` |
| General Prowler development questions | `prowler` |
| Implementing JSON:API endpoints | `django-drf` |
| Implementing feature | `tdd` |
| Importing Copilot Custom Agents into workflows | `gh-aw` |
| Inspect PR CI checks and gates (.github/workflows/*) | `prowler-ci` |
| Inspect PR CI workflows (.github/workflows/*): conventional-commit, pr-check-changelog, pr-conflict-checker, labeler | `prowler-pr` |
| Mapping checks to compliance controls | `prowler-compliance` |
| Mocking AWS with moto in tests | `prowler-test-sdk` |
| Modifying API responses | `jsonapi` |
| Modifying component | `tdd` |
| Modifying gh-aw workflow frontmatter or safe-outputs | `gh-aw` |
| Refactoring code | `tdd` |
| Regenerate AGENTS.md Auto-invoke tables (sync.sh) | `skill-sync` |
| Review PR requirements: template, title conventions, changelog gate | `prowler-pr` |


@@ -4,6 +4,8 @@
> - [`prowler-api`](../skills/prowler-api/SKILL.md) - Models, Serializers, Views, RLS patterns
> - [`prowler-test-api`](../skills/prowler-test-api/SKILL.md) - Testing patterns (pytest-django)
> - [`prowler-attack-paths-query`](../skills/prowler-attack-paths-query/SKILL.md) - Attack Paths openCypher queries
> - [`django-migration-psql`](../skills/django-migration-psql/SKILL.md) - Migration best practices for PostgreSQL
> - [`postgresql-indexing`](../skills/postgresql-indexing/SKILL.md) - PostgreSQL indexing, EXPLAIN, monitoring, maintenance
> - [`django-drf`](../skills/django-drf/SKILL.md) - Generic DRF patterns
> - [`jsonapi`](../skills/jsonapi/SKILL.md) - Strict JSON:API v1.1 spec compliance
> - [`pytest`](../skills/pytest/SKILL.md) - Generic pytest patterns
@@ -16,14 +18,20 @@ When performing these actions, ALWAYS invoke the corresponding skill FIRST:
|--------|-------|
| Add changelog entry for a PR or feature | `prowler-changelog` |
| Adding DRF pagination or permissions | `django-drf` |
| Adding indexes or constraints to database tables | `django-migration-psql` |
| Adding privilege escalation detection queries | `prowler-attack-paths-query` |
| Analyzing query performance with EXPLAIN | `postgresql-indexing` |
| Committing changes | `prowler-commit` |
| Create PR that requires changelog entry | `prowler-changelog` |
| Creating API endpoints | `jsonapi` |
| Creating Attack Paths queries | `prowler-attack-paths-query` |
| Creating ViewSets, serializers, or filters in api/ | `django-drf` |
| Creating a git commit | `prowler-commit` |
| Creating or modifying PostgreSQL indexes | `postgresql-indexing` |
| Creating or reviewing Django migrations | `django-migration-psql` |
| Creating/modifying models, views, serializers | `prowler-api` |
| Debugging slow queries or missing indexes | `postgresql-indexing` |
| Dropping or reindexing PostgreSQL indexes | `postgresql-indexing` |
| Fixing bug | `tdd` |
| Implementing JSON:API endpoints | `django-drf` |
| Implementing feature | `tdd` |
@@ -32,12 +40,14 @@ When performing these actions, ALWAYS invoke the corresponding skill FIRST:
| Refactoring code | `tdd` |
| Review changelog format and conventions | `prowler-changelog` |
| Reviewing JSON:API compliance | `jsonapi` |
| Running makemigrations or pgmakemigrations | `django-migration-psql` |
| Testing RLS tenant isolation | `prowler-test-api` |
| Update CHANGELOG.md in any component | `prowler-changelog` |
| Updating existing Attack Paths queries | `prowler-attack-paths-query` |
| Working on task | `tdd` |
| Writing Prowler API tests | `prowler-test-api` |
| Writing Python tests with pytest | `pytest` |
| Writing data backfill or data migration | `django-migration-psql` |
---


@@ -0,0 +1,454 @@
---
name: django-migration-psql
description: >
  Reviews Django migration files for PostgreSQL best practices specific to Prowler.
  Trigger: When creating migrations, running makemigrations/pgmakemigrations, reviewing migration PRs,
  adding indexes or constraints to database tables, modifying existing migration files, or writing
  data backfill migrations. Always use this skill when you see AddIndex, CreateModel, AddConstraint,
  RunPython, bulk_create, bulk_update, or backfill operations in migration files.
license: Apache-2.0
metadata:
  author: prowler-cloud
  version: "1.0"
  scope: [api, root]
  auto_invoke:
    - "Creating or reviewing Django migrations"
    - "Adding indexes or constraints to database tables"
    - "Running makemigrations or pgmakemigrations"
    - "Writing data backfill or data migration"
allowed-tools: Read, Grep, Glob, Edit, Write, Bash
---
## When to use
- Creating a new Django migration
- Running `makemigrations` or `pgmakemigrations`
- Reviewing a PR that adds or modifies migrations
- Adding indexes, constraints, or models to the database
## Why this matters
A bad migration can lock a production table for minutes, block all reads/writes, or silently skip index creation on partitioned tables.
## Auto-generated migrations need splitting
`makemigrations` and `pgmakemigrations` bundle everything into one file: `CreateModel`, `AddIndex`, `AddConstraint`, sometimes across multiple tables. This is the default Django behavior and it violates every rule below.
After generating a migration, ALWAYS review it and split it:
1. Read the generated file and identify every operation
2. Group operations by concern:
- `CreateModel` + `AddConstraint` for each new table → one migration per table
- `AddIndex` per table → one migration per table
- `AddIndex` on partitioned tables → two migrations (partition + parent)
- `AlterField`, `AddField`, `RemoveField` for each table → one migration per table
3. Rewrite the generated file into separate migration files with correct dependencies
4. Delete the original auto-generated migration
When adding fields or indexes to an existing model, `makemigrations` may also bundle `AddIndex` for unrelated tables that had pending model changes. Always check for stowaways from other tables.
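The grouping in step 2 is mechanical. A minimal sketch in plain Python, using a hypothetical simplified `(operation, model_name)` view of the generated file (real operations are Django `migrations.*` objects):

```python
from collections import defaultdict

# Hypothetical simplified view of an auto-generated migration's operations
generated_ops = [
    ("CreateModel", "findinggroupdailysummary"),
    ("AddConstraint", "findinggroupdailysummary"),
    ("AddIndex", "findinggroupdailysummary"),
    ("AddIndex", "resource"),  # stowaway from another table
    ("AddIndex", "findinggroupdailysummary"),
]

def split_by_concern(ops):
    """Group operations into the separate migration files the rules require."""
    groups = defaultdict(list)
    for op_type, model in ops:
        # Structural ops (CreateModel/AddConstraint) stay together per table;
        # AddIndex goes into its own per-table migration.
        concern = "indexes" if op_type == "AddIndex" else "structure"
        groups[(model, concern)].append(op_type)
    return dict(groups)

print(split_by_concern(generated_ops))
```

Each resulting group becomes one migration file; the `resource` stowaway above would surface as its own `(model, concern)` key.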
## Rule 1: separate indexes from model creation
`CreateModel` + `AddConstraint` = same migration (structural).
`AddIndex` = separate migration file (performance).
Django runs each migration inside a transaction (unless `atomic = False`). If an index operation fails, it rolls back everything, including the model creation. Splitting means a failed index doesn't prevent the table from existing. It also lets you `--fake` index migrations independently (see Rule 4).
### Bad
```python
# 0081_finding_group_daily_summary.py — DON'T DO THIS
class Migration(migrations.Migration):
    operations = [
        migrations.CreateModel(name="FindingGroupDailySummary", ...),
        migrations.AddIndex(model_name="findinggroupdailysummary", ...),  # separate this
        migrations.AddIndex(model_name="findinggroupdailysummary", ...),  # separate this
        migrations.AddConstraint(model_name="findinggroupdailysummary", ...),  # this is fine here
    ]
```
### Good
```python
# 0081_create_finding_group_daily_summary.py
class Migration(migrations.Migration):
    operations = [
        migrations.CreateModel(name="FindingGroupDailySummary", ...),
        # Constraints belong with the model — they define its integrity rules
        migrations.AddConstraint(model_name="findinggroupdailysummary", ...),  # unique
        migrations.AddConstraint(model_name="findinggroupdailysummary", ...),  # RLS
    ]


# 0082_finding_group_daily_summary_indexes.py
class Migration(migrations.Migration):
    dependencies = [("api", "0081_create_finding_group_daily_summary")]
    operations = [
        migrations.AddIndex(model_name="findinggroupdailysummary", ...),
        migrations.AddIndex(model_name="findinggroupdailysummary", ...),
        migrations.AddIndex(model_name="findinggroupdailysummary", ...),
    ]
```
Flag any migration with both `CreateModel` and `AddIndex` in `operations`.
## Rule 2: one table's indexes per migration
Each table's indexes must live in their own migration file. Never mix `AddIndex` for different `model_name` values in one migration.
If the index on table B fails, the rollback also drops the index on table A. The migration name gives no hint that it touches unrelated tables. You lose the ability to `--fake` one table's indexes without affecting the other.
### Bad
```python
# 0081_finding_group_daily_summary.py — DON'T DO THIS
class Migration(migrations.Migration):
    operations = [
        migrations.CreateModel(name="FindingGroupDailySummary", ...),
        migrations.AddIndex(model_name="findinggroupdailysummary", ...),  # table A
        migrations.AddIndex(model_name="resource", ...),  # table B!
        migrations.AddIndex(model_name="resource", ...),  # table B!
        migrations.AddIndex(model_name="finding", ...),  # table C!
    ]
```
### Good
```python
# 0081_create_finding_group_daily_summary.py — model + constraints
# 0082_finding_group_daily_summary_indexes.py — only FindingGroupDailySummary indexes
# 0083_resource_trigram_indexes.py — only Resource indexes
# 0084_finding_check_index_partitions.py — only Finding partition indexes (step 1)
# 0085_finding_check_index_parent.py — only Finding parent index (step 2)
```
Name each migration file after the table it affects. A reviewer should know which table a migration touches without opening the file.
Flag any migration where `AddIndex` operations reference more than one `model_name`.
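Both flags (Rule 1 and Rule 2) are mechanical checks. A minimal sketch, again assuming a simplified `(operation, model_name)` view of `Migration.operations`:

```python
def lint_migration(ops):
    """Flag Rule 1 / Rule 2 violations in a list of (operation, model_name) pairs."""
    violations = []
    op_types = {op for op, _ in ops}
    # Rule 1: structural creation and index creation must not share a migration
    if "CreateModel" in op_types and "AddIndex" in op_types:
        violations.append("Rule 1: CreateModel and AddIndex in the same migration")
    # Rule 2: all AddIndex operations must target a single table
    indexed_models = {model for op, model in ops if op == "AddIndex"}
    if len(indexed_models) > 1:
        violations.append(
            "Rule 2: AddIndex targets multiple tables: " + ", ".join(sorted(indexed_models))
        )
    return violations

print(lint_migration([
    ("CreateModel", "findinggroupdailysummary"),
    ("AddIndex", "findinggroupdailysummary"),
    ("AddIndex", "resource"),
]))
```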
## Rule 3: partitioned table indexes require the two-step pattern
Tables `findings` and `resource_finding_mappings` are range-partitioned. A plain `AddIndex` issues a regular `CREATE INDEX` on the parent, which builds every missing partition index inline, inside one transaction and with locks (never CONCURRENTLY). On tables this size that blocks writes for the entire build, so the partition indexes must be created concurrently first.
Use the helpers in `api.db_utils`.
### Step 1: create indexes on actual partitions
```python
# 0084_finding_check_index_partitions.py
from functools import partial

from django.db import migrations

from api.db_utils import create_index_on_partitions, drop_index_on_partitions


class Migration(migrations.Migration):
    atomic = False  # REQUIRED — CREATE INDEX CONCURRENTLY can't run inside a transaction
    dependencies = [("api", "0083_resource_trigram_indexes")]
    operations = [
        migrations.RunPython(
            partial(
                create_index_on_partitions,
                parent_table="findings",
                index_name="find_tenant_check_ins_idx",
                columns="tenant_id, check_id, inserted_at",
            ),
            reverse_code=partial(
                drop_index_on_partitions,
                parent_table="findings",
                index_name="find_tenant_check_ins_idx",
            ),
        )
    ]
```
Key details:
- `atomic = False` is mandatory. `CREATE INDEX CONCURRENTLY` cannot run inside a transaction.
- Always provide `reverse_code` using `drop_index_on_partitions` so rollbacks work.
- The default is `all_partitions=True`, which creates indexes on every partition CONCURRENTLY (no locks). This is the safe default.
- Do NOT use `all_partitions=False` unless you understand the consequence: Step 2's `AddIndex` on the parent will create indexes on the skipped partitions **with locks** (not CONCURRENTLY), because PostgreSQL fills in missing partition indexes inline during parent index creation.
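The SQL a helper like `create_index_on_partitions` plausibly emits per partition can be sketched as follows (an assumption for illustration — the real implementation lives in `api.db_utils` and may differ):

```python
def partition_index_sql(parent_table, index_name, columns, partitions, concurrently=True):
    """Sketch: build one CREATE INDEX statement per partition.

    Hypothetical helper; the real one also discovers partitions (e.g. via
    pg_inherits) and executes the statements outside a transaction.
    """
    keyword = "CONCURRENTLY " if concurrently else ""
    return [
        f"CREATE INDEX {keyword}IF NOT EXISTS {partition}_{index_name} "
        f"ON {partition} USING BTREE ({columns})"
        for partition in partitions
    ]

for stmt in partition_index_sql(
    "findings",
    "find_tenant_check_ins_idx",
    "tenant_id, check_id, inserted_at",
    ["findings_2026_jan", "findings_2026_feb"],
):
    print(stmt)
```

Note the naming convention `<partition>_<index_name>`, which matches the manual SQL shown in Rule 4 below.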
### Step 2: register the index with Django
```python
# 0085_finding_check_index_parent.py
from django.db import migrations, models
class Migration(migrations.Migration):
dependencies = [("api", "0084_finding_check_index_partitions")]
operations = [
migrations.AddIndex(
model_name="finding",
index=models.Index(
fields=["tenant_id", "check_id", "inserted_at"],
name="find_tenant_check_ins_idx",
),
),
]
```
This second migration tells Django "this index exists" so it doesn't try to recreate it. New partitions created after this point inherit the index definition from the parent.
### Existing examples in the codebase
| Partition migration | Parent migration |
|---|---|
| `0020_findings_new_performance_indexes_partitions.py` | `0021_findings_new_performance_indexes_parent.py` |
| `0024_findings_uid_index_partitions.py` | `0025_findings_uid_index_parent.py` |
| `0028_findings_check_index_partitions.py` | `0029_findings_check_index_parent.py` |
| `0036_rfm_tenant_finding_index_partitions.py` | `0037_rfm_tenant_finding_index_parent.py` |
Flag any plain `AddIndex` on `finding` or `resourcefindingmapping` without a preceding partition migration.
## Rule 4: large table indexes — fake the migration, apply manually
For huge tables (findings has millions of rows), even `CREATE INDEX CONCURRENTLY` can take minutes and consume significant I/O. In production, you may want to decouple the migration from the actual index creation.
### Procedure
1. Write the migration normally following the two-step pattern above.
2. Fake the migration so Django marks it as applied without executing it:
```bash
python manage.py migrate api 0084_finding_check_index_partitions --fake
python manage.py migrate api 0085_finding_check_index_parent --fake
```
3. Create the index manually during a low-traffic window via `psql` or `python manage.py dbshell --database admin`:
```sql
-- For each partition you care about:
CREATE INDEX CONCURRENTLY IF NOT EXISTS findings_2026_jan_find_tenant_check_ins_idx
ON findings_2026_jan USING BTREE (tenant_id, check_id, inserted_at);
CREATE INDEX CONCURRENTLY IF NOT EXISTS findings_2026_feb_find_tenant_check_ins_idx
ON findings_2026_feb USING BTREE (tenant_id, check_id, inserted_at);
-- Then register on the parent (this is fast, no data scan):
CREATE INDEX IF NOT EXISTS find_tenant_check_ins_idx
ON findings USING BTREE (tenant_id, check_id, inserted_at);
```
4. Verify the index exists on the partitions you need:
```sql
SELECT indexrelid::regclass, indrelid::regclass
FROM pg_index
WHERE indexrelid::regclass::text LIKE '%find_tenant_check_ins%';
```
### When to use this approach
- The table is huge or keeps growing quickly (e.g., `findings`).
- You want to control exactly when the I/O hit happens (e.g., during a maintenance window).
This is optional. For smaller tables or non-production environments, letting the migration run normally is fine.
## Rule 5: data backfills — never inline, always batched
Data backfills (updating existing rows, populating new columns, generating summary data) are the most dangerous migrations. A naive `Model.objects.all().update(...)` on a multi-million row table will hold a transaction lock for minutes, blow out WAL, and potentially OOM the worker.
### Never backfill inline in the migration
The migration should only dispatch the work. The actual backfill runs asynchronously via Celery tasks, outside the migration transaction.
```python
# 0090_backfill_finding_group_summaries.py
from django.db import migrations


def trigger_backfill(apps, schema_editor):
    from api.db_router import MainRouter
    from tasks.jobs.backfill import backfill_finding_group_summaries_task

    Tenant = apps.get_model("api", "Tenant")
    tenant_ids = Tenant.objects.using(MainRouter.admin_db).values_list("id", flat=True)
    for tenant_id in tenant_ids:
        backfill_finding_group_summaries_task.delay(tenant_id=str(tenant_id))


class Migration(migrations.Migration):
    dependencies = [("api", "0089_previous_migration")]
    operations = [
        migrations.RunPython(trigger_backfill, migrations.RunPython.noop),
    ]
```
The migration finishes in seconds. The backfill runs in the background per-tenant.
### Exception: trivial updates
Single-statement bulk updates on small result sets are OK inline:
```python
# Fine — single UPDATE, small result set, no iteration
from api.db_router import MainRouter


def backfill_graph_data_ready(apps, schema_editor):
    AttackPathsScan = apps.get_model("api", "AttackPathsScan")
    AttackPathsScan.objects.using(MainRouter.admin_db).filter(
        state="completed", graph_data_ready=False,
    ).update(graph_data_ready=True)
```
Use inline only when you're confident the affected row count is small (< ~10K rows).
### Batch processing in the Celery task
The actual backfill task must process data in batches. Use the helpers in `api.db_utils`:
```python
from api.db_utils import create_objects_in_batches, update_objects_in_batches, batch_delete
# Creating objects in batches (500 per transaction)
create_objects_in_batches(tenant_id, ScanCategorySummary, summaries, batch_size=500)
# Updating objects in batches
update_objects_in_batches(tenant_id, Finding, findings, fields=["status"], batch_size=500)
# Deleting in batches
batch_delete(tenant_id, queryset, batch_size=settings.DJANGO_DELETION_BATCH_SIZE)
```
Each batch runs in its own `rls_transaction()` so:
- A failure in batch N doesn't roll back batches 1 through N-1
- Lock duration is bounded to the batch size
- Memory stays constant regardless of total row count
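The bounded-batch behavior is just chunking; a minimal sketch (hypothetical helper — the real batch utilities live in `api/db_utils.py`):

```python
def chunks(items, batch_size=500):
    """Yield successive batches; each batch would run in its own rls_transaction."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# 1200 rows → three bounded transactions instead of one giant one
rows = list(range(1200))
print([len(batch) for batch in chunks(rows)])  # [500, 500, 200]
```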
### Rules for backfill tasks
1. **One RLS transaction per batch.** Never wrap the entire backfill in a single transaction. Each batch gets its own `rls_transaction(tenant_id)`.
2. **Use `bulk_create` / `bulk_update` with explicit `batch_size`.** Never `.save()` in a loop. The default batch_size is 500.
3. **Use `.iterator()` for reads.** When reading source data, use `queryset.iterator()` to avoid loading the entire result set into memory.
4. **Use `.only()` / `.values_list()` for reads.** Fetch only the columns you need, not full model instances.
5. **Catch and skip per-item failures.** Don't let one bad row kill the entire backfill. Log the error, count it, continue.
```python
scans_processed = 0
scans_skipped = 0
for scan_id in scan_ids:
    try:
        result = process_scan(tenant_id, scan_id)
        scans_processed += 1
    except Exception:
        logger.warning("Failed to process scan %s", scan_id)
        scans_skipped += 1
logger.info("Backfill done: %d processed, %d skipped", scans_processed, scans_skipped)
```
6. **Log totals at start and end, not per-batch.** Per-batch logging floods the logs. Log the total count at the start, and the processed/skipped counts at the end.
7. **Use `ignore_conflicts=True` for idempotent creates.** Makes the backfill safe to re-run if interrupted.
```python
Model.objects.bulk_create(objects, batch_size=500, ignore_conflicts=True)
```
8. **Iterate per-tenant.** Dispatch one Celery task per tenant. This gives you natural parallelism, bounded memory per task, and the ability to retry a single tenant without re-running everything.
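The rules above combine into a loop shaped like this (hypothetical names; real tasks live in `tasks/jobs/backfill.py`):

```python
import logging

logger = logging.getLogger(__name__)


def run_backfill(scan_ids, process_scan):
    """Skeleton of a per-tenant backfill loop following the rules above.

    `process_scan` stands in for one unit of work (e.g. one batched
    rls_transaction); names are illustrative, not the real task API.
    """
    logger.info("Backfill starting: %d scans", len(scan_ids))  # rule 6: log totals
    processed = skipped = 0
    for scan_id in scan_ids:
        try:
            process_scan(scan_id)
            processed += 1
        except Exception:
            # rule 5: skip the bad row, count it, keep going
            logger.warning("Failed to process scan %s", scan_id)
            skipped += 1
    logger.info("Backfill done: %d processed, %d skipped", processed, skipped)
    return processed, skipped
```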
### Existing examples
| Migration | Task |
|---|---|
| `0062_backfill_daily_severity_summaries.py` | `backfill_daily_severity_summaries_task` |
| `0080_backfill_attack_paths_graph_data_ready.py` | Inline (trivial update) |
| `0082_backfill_finding_group_summaries.py` | `backfill_finding_group_summaries_task` |
Task implementations: `tasks/jobs/backfill.py`
Batch utilities: `api/db_utils.py` (`batch_delete`, `create_objects_in_batches`, `update_objects_in_batches`)
## Decision tree
```
Auto-generated migration?
├── Yes → Split it following the rules above
└── No → Review it against the rules above

New model?
├── Yes → CreateModel + AddConstraint in one migration
│         AddIndex in separate migration(s), one per table
└── No, just indexes?
    ├── Regular table → AddIndex in its own migration
    └── Partitioned table (findings, resource_finding_mappings)?
        ├── Step 1: RunPython + create_index_on_partitions (atomic=False)
        └── Step 2: AddIndex on parent (separate migration)
            └── Large table? → Consider --fake + manual apply

Data backfill?
├── Trivial update (< ~10K rows)? → Inline RunPython is OK
└── Large backfill? → Migration dispatches Celery task(s)
    ├── One task per tenant
    ├── Batch processing (bulk_create/bulk_update, batch_size=500)
    ├── One rls_transaction per batch
    └── Catch + skip per-item failures, log totals
```
## Quick reference
| Scenario | Approach |
|---|---|
| Auto-generated migration | Split by concern and table before committing |
| New model + constraints/RLS | Same migration (constraints are structural) |
| Indexes on a regular table | Separate migration, one table per file |
| Indexes on a partitioned table | Two migrations: partitions first (`RunPython` + `atomic=False`), then parent (`AddIndex`) |
| Index on a huge partitioned table | Same two migrations, but fake + apply manually in production |
| Trivial data backfill (< ~10K rows) | Inline `RunPython` with single `.update()` call |
| Large data backfill | Migration dispatches Celery task per tenant, task batches with `rls_transaction` |
## Review output format
1. List each violation with rule number and one-line explanation
2. Show corrected migration file(s)
3. For partitioned tables, show both partition and parent migrations
If migration passes all checks, say so.
## Context7 lookups
**Prerequisite:** Install Context7 MCP server for up-to-date documentation lookup.
When implementing or debugging migration patterns, query these libraries via `mcp_context7_query-docs`:
| Library | Context7 ID | Use for |
|---------|-------------|---------|
| Django 5.1 | `/websites/djangoproject_en_5_1` | Migration operations, indexes, constraints, `SchemaEditor` |
| PostgreSQL | `/websites/postgresql_org_docs_current` | `CREATE INDEX CONCURRENTLY`, partitioned tables, `pg_inherits` |
| django-postgres-extra | `/SectorLabs/django-postgres-extra` | Partitioned models, `PostgresPartitionedModel`, partition management |
**Example queries:**
```
mcp_context7_query-docs(libraryId="/websites/djangoproject_en_5_1", query="migration operations AddIndex RunPython atomic")
mcp_context7_query-docs(libraryId="/websites/djangoproject_en_5_1", query="database indexes Meta class concurrently")
mcp_context7_query-docs(libraryId="/websites/postgresql_org_docs_current", query="CREATE INDEX CONCURRENTLY partitioned table")
mcp_context7_query-docs(libraryId="/SectorLabs/django-postgres-extra", query="partitioned model range partition index")
```
> **Note:** Use `mcp_context7_resolve-library-id` first if you need to find the correct library ID.
## Commands
```bash
# Generate migrations (ALWAYS review output before committing)
python manage.py makemigrations
python manage.py pgmakemigrations
# Apply migrations
python manage.py migrate
# Fake a migration (mark as applied without running)
python manage.py migrate api <migration_name> --fake
# Manage partitions
python manage.py pgpartition --using admin
```
## Resources
- **Partition helpers**: `api/src/backend/api/db_utils.py` (`create_index_on_partitions`, `drop_index_on_partitions`)
- **Partition config**: `api/src/backend/api/partitions.py`
- **RLS constraints**: `api/src/backend/api/rls.py`
- **Existing examples**: `0028` + `0029`, `0024` + `0025`, `0036` + `0037`


@@ -0,0 +1,392 @@
---
name: postgresql-indexing
description: >
  PostgreSQL indexing best practices for Prowler: index design, partial indexes, partitioned table
  indexing, EXPLAIN ANALYZE validation, concurrent operations, monitoring, and maintenance.
  Trigger: When creating or modifying PostgreSQL indexes, analyzing query performance with EXPLAIN,
  debugging slow queries, reviewing index usage statistics, reindexing, dropping indexes, or working
  with partitioned table indexes. Also trigger when discussing index strategies, partial indexes,
  or index maintenance operations like VACUUM or ANALYZE.
license: Apache-2.0
metadata:
  author: prowler-cloud
  version: "1.0"
  scope: [api]
  auto_invoke:
    - "Creating or modifying PostgreSQL indexes"
    - "Analyzing query performance with EXPLAIN"
    - "Debugging slow queries or missing indexes"
    - "Dropping or reindexing PostgreSQL indexes"
allowed-tools: Read, Grep, Glob, Bash
---
## When to use
- Creating or modifying PostgreSQL indexes
- Analyzing query plans with `EXPLAIN`
- Debugging slow queries or missing index usage
- Dropping, reindexing, or validating indexes
- Working with indexes on partitioned tables (findings, resource_finding_mappings)
- Running VACUUM or ANALYZE after index changes
## Index design
### Partial indexes: constant columns go in WHERE, not in the key
When a column has a fixed value for the query (e.g., `state = 'completed'`), put it in the `WHERE` clause of the index, not in the indexed columns. Otherwise the planner cannot exploit the ordering of the other columns.
```sql
-- Bad: state in the key wastes space and breaks ordering
CREATE INDEX idx_scans_tenant_state ON scans (tenant_id, state, inserted_at DESC);
-- Good: state as a filter, planner uses tenant_id + inserted_at ordering
CREATE INDEX idx_scans_tenant_ins_completed ON scans (tenant_id, inserted_at DESC)
WHERE state = 'completed';
```
### Column order matters
Put high-selectivity columns first (columns that filter out the most rows). For composite indexes, the leftmost column must appear in the query's WHERE clause for the index to be used.
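The leftmost-prefix rule of thumb can be sketched as a simple check (illustrative only — the planner's real decision also weighs statistics and costs):

```python
def index_usable(index_columns, where_columns):
    """Return the prefix of a composite B-tree index a query can exploit.

    Simplified leftmost-prefix rule: the index helps for the leading run of
    columns that appear in the WHERE clause; ordering beyond the first
    unfiltered column is lost.
    """
    usable_prefix = []
    for col in index_columns:
        if col not in where_columns:
            break
        usable_prefix.append(col)
    return usable_prefix

print(index_usable(["tenant_id", "state", "inserted_at"], {"tenant_id", "state"}))
# ['tenant_id', 'state']
```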
## Validating index effectiveness
### Always EXPLAIN (ANALYZE, BUFFERS) after adding indexes
Never assume an index is being used. Run `EXPLAIN (ANALYZE, BUFFERS)` to confirm.
```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM users
WHERE email = 'user@example.com';
```
Use [Postgres EXPLAIN Visualizer (pev)](https://tatiyants.com/pev/) to visualize query plans and identify bottlenecks.
### Force index usage for testing
The planner may choose a sequential scan on small datasets. Toggle `enable_seqscan = off` to confirm the index path works, then re-enable it.
```sql
SET enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT ON (provider_id) provider_id
FROM scans
WHERE tenant_id = '95383b24-da01-44b5-a713-0d9920d554db'
AND state = 'completed'
ORDER BY provider_id, inserted_at DESC;
SET enable_seqscan = on; -- always re-enable after testing
```
This is for validation only. Never leave `enable_seqscan = off` in production.
## Over-indexing
Every extra index has three costs that compound:
1. **Write overhead.** Every INSERT and UPDATE must maintain all indexes. Extra indexes also kill HOT (Heap-Only-Tuple) updates, which normally skip index maintenance when unindexed columns change.
2. **Planning time.** The planner evaluates more execution paths per index. On simple OLTP queries, planning time can exceed execution time by 4x when index count is high.
3. **Lock contention (fastpath limit).** PostgreSQL uses a fast path for the first 16 locks per backend. After 16 relations (table + its indexes), it falls back to slower LWLock mechanisms. At high QPS (100+), this causes `LockManager` wait events.
Rules:
- Drop unused and redundant indexes regularly
- Be especially careful with partitioned tables (each partition multiplies the index count)
- Use prepared statements to reduce planning overhead when index count is high
## Finding redundant indexes
Two indexes are redundant when:
- They have the same columns in the same order (duplicates)
- One is a prefix of the other: index `(a)` is redundant to `(a, b)`, but NOT to `(b, a)`
Column order matters. For partial indexes, the WHERE clause must also match.
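The prefix rule is mechanical; a minimal sketch (ignores partial-index WHERE clauses, which would also need to match):

```python
def is_redundant(index_a, index_b):
    """True when index_a's columns are a leading prefix of index_b's columns,
    in the same order — including exact duplicates."""
    return list(index_b[:len(index_a)]) == list(index_a)

print(is_redundant(("a",), ("a", "b")))  # True — (a) is a prefix of (a, b)
print(is_redundant(("a",), ("b", "a")))  # False — order matters
```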
```sql
-- Quick check: find indexes that share a leading column on the same table
SELECT
    a.indrelid::regclass AS table_name,
    a.indexrelid::regclass AS index_a,
    b.indexrelid::regclass AS index_b,
    pg_size_pretty(pg_relation_size(a.indexrelid)) AS size_a,
    pg_size_pretty(pg_relation_size(b.indexrelid)) AS size_b
FROM pg_index a
JOIN pg_index b
    ON a.indrelid = b.indrelid
    AND a.indexrelid != b.indexrelid
    AND a.indkey::text = (
        SELECT string_agg(x::text, ' ')
        FROM unnest(b.indkey[:array_length(a.indkey, 1)]) AS x
    )
WHERE NOT a.indisunique;
```
Before dropping: verify on all workload nodes (primary + replicas), use `DROP INDEX CONCURRENTLY`, and monitor for plan regressions.
## Monitoring index usage
### Identify unused indexes
Query `pg_stat_all_indexes` to find indexes that are never or rarely scanned:
```sql
SELECT
    idxstat.schemaname AS schema_name,
    idxstat.relname AS table_name,
    idxstat.indexrelname AS index_name,
    idxstat.idx_scan AS index_scans_count,
    idxstat.last_idx_scan AS last_idx_scan_timestamp,
    pg_size_pretty(pg_relation_size(idxstat.indexrelid)) AS index_size
FROM pg_stat_all_indexes AS idxstat
JOIN pg_index i ON idxstat.indexrelid = i.indexrelid
WHERE idxstat.schemaname NOT IN ('pg_catalog', 'information_schema', 'pg_toast')
    AND NOT i.indisunique
ORDER BY idxstat.idx_scan ASC, idxstat.last_idx_scan ASC;
```
Indexes with `idx_scan = 0` and no recent `last_idx_scan` are candidates for removal.
Before dropping, verify:
- Stats haven't been reset recently (check `stats_reset` in `pg_stat_database`)
- Stats cover at least 1 month of production traffic
- All workload nodes (primary + replicas) have been checked
- The index isn't used by a periodic job that runs infrequently
```sql
-- Check when stats were last reset
SELECT stats_reset, age(now(), stats_reset)
FROM pg_stat_database
WHERE datname = current_database();
```
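The drop checklist can be expressed as a simple predicate (a sketch; the one-month threshold is the guideline above, not enforced project policy):

```python
from datetime import timedelta


def drop_candidate(idx_scan, stats_age, is_unique, min_age=timedelta(days=30)):
    """Apply the checklist: never drop unique indexes, require at least one
    month of statistics, and require zero recorded scans."""
    return (not is_unique) and idx_scan == 0 and stats_age >= min_age

print(drop_candidate(idx_scan=0, stats_age=timedelta(days=45), is_unique=False))  # True
```

Even when this returns True, still check replicas and infrequent periodic jobs before dropping.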
### Monitor index creation progress
Do not assume index creation succeeded. Use `pg_stat_progress_create_index` (Postgres 12+) to watch progress live:
```sql
SELECT * FROM pg_stat_progress_create_index;
```
In psql, use `\watch 5` to refresh every 5 seconds for a live dashboard view. `CREATE INDEX CONCURRENTLY` and `REINDEX CONCURRENTLY` have more phases than standard operations: monitor for blocking sessions and wait events.
### Validate index integrity
Check for invalid indexes regularly:
```sql
SELECT c.relname AS index_name, i.indisvalid
FROM pg_class c
JOIN pg_index i ON i.indexrelid = c.oid
WHERE i.indisvalid = false;
```
Invalid indexes are ignored by the planner. They waste space and cause inconsistent query performance, especially on partitioned tables where some partitions may have valid indexes and others do not.
## Concurrent operations
### Always use CONCURRENTLY in production
Never create or drop indexes without `CONCURRENTLY` on live tables. Without it, the operation holds a lock that blocks all writes.
```sql
-- Create
CREATE INDEX CONCURRENTLY IF NOT EXISTS index_name ON table_name (column_name);
-- Drop
DROP INDEX CONCURRENTLY IF EXISTS index_name;
```
`DROP INDEX CONCURRENTLY` cannot run inside a transaction block.
### Always use IF NOT EXISTS / IF EXISTS
Makes scripts idempotent. Safe to re-run without errors from duplicate or missing indexes.
### Concurrent indexing can fail silently
`CREATE INDEX CONCURRENTLY` can fail without raising an error. The result is an invalid index that the planner ignores. This is particularly dangerous on partitioned tables: some partitions get valid indexes, others don't, causing inconsistent query performance.
After any concurrent index creation, always validate:
```sql
SELECT c.relname, i.indisvalid
FROM pg_class c
JOIN pg_index i ON i.indexrelid = c.oid
WHERE c.relname LIKE '%your_index_name%';
```
## Reindexing invalid indexes
Rebuild invalid indexes without locking writes:
```sql
REINDEX INDEX CONCURRENTLY index_name;
```
### Understanding _ccnew and _ccold artifacts
When `CREATE INDEX CONCURRENTLY` or `REINDEX INDEX CONCURRENTLY` is interrupted, temporary indexes may remain:
| Suffix | Meaning | Action |
|--------|---------|--------|
| `_ccnew` | New index being built, incomplete | Drop it and retry `REINDEX CONCURRENTLY` |
| `_ccold` | Old index being replaced, rebuild succeeded | Safe to drop |
```sql
-- Example: both original and temp are invalid
-- users_emails_2019 btree (col) INVALID
-- users_emails_2019_ccnew btree (col) INVALID
-- Drop the failed new one, then retry
DROP INDEX CONCURRENTLY IF EXISTS users_emails_2019_ccnew;
REINDEX INDEX CONCURRENTLY users_emails_2019;
```
These leftovers clutter the schema, confuse developers, and waste disk space. Clean them up.
## Indexing partitioned tables
### Do NOT use ALTER INDEX ATTACH PARTITION
The PostgreSQL documentation warns that `ALTER INDEX ... ATTACH PARTITION` prevents dropping malfunctioning or non-performant indexes from individual partitions: an attached index cannot be dropped by itself and is dropped automatically when its parent index is dropped.
This removes the ability to manage indexes per-partition, which we need for:
- Dropping broken indexes on specific partitions
- Skipping indexes on old partitions to save storage
- Rebuilding indexes on individual partitions without affecting others
### Correct approach: create on partitions, then on parent
1. Create the index on each child partition concurrently:
```sql
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_child_partition
ON child_partition (column_name);
```
2. Create the index on the parent table (metadata-only, fast):
```sql
CREATE INDEX IF NOT EXISTS idx_parent
ON parent_table (column_name);
```
PostgreSQL automatically attaches an existing partition-level index to the parent index when their definitions match (same columns, expressions, and operator classes); the index names do not need to match.
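To confirm which partition indexes ended up attached to the parent, note that `pg_inherits` tracks index inheritance as well as table inheritance. A sketch, assuming a hypothetical parent index named `idx_parent`:
```sql
-- List partition indexes attached to a parent partitioned index
-- ('idx_parent' is a placeholder name)
SELECT child.relname AS partition_index
FROM pg_inherits
JOIN pg_class child ON child.oid = pg_inherits.inhrelid
JOIN pg_class parent ON parent.oid = pg_inherits.inhparent
WHERE parent.relname = 'idx_parent';
```
If a partition is missing from the result, create its index concurrently and re-run `CREATE INDEX IF NOT EXISTS` on the parent.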
### Prioritize active partitions
For time-based partitions (the `findings` table is partitioned monthly):
- Create indexes on recent/current partitions where data is actively queried
- Skip older partitions that are rarely accessed
- The `all_partitions=False` default in `create_index_on_partitions` handles this automatically
## Index maintenance and bloat
Over time, B-tree indexes accumulate bloat from updates and deletes. VACUUM reclaims heap space but does NOT rebalance B-tree pages. Periodic reindexing is necessary for heavily updated tables.
### Detecting bloat
Indexes with estimated bloat above 50% are candidates for `REINDEX CONCURRENTLY`. Check bloat with tools like `pgstattuple` or bloat estimation queries.
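One concrete measurement is the `pgstattuple` extension's `pgstatindex` function. A sketch, assuming you have privileges to install the extension and substituting your own index name:
```sql
-- Requires the pgstattuple extension
CREATE EXTENSION IF NOT EXISTS pgstattuple;
-- avg_leaf_density is roughly the inverse of bloat: ~90% is healthy,
-- values below ~50% suggest the index is a REINDEX candidate
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('your_index_name');
```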
### Reducing bloat buildup
Three things slow degradation:
1. **Upgrade to PostgreSQL 14+** for B-tree deduplication (added in 13) and bottom-up index deletion (added in 14)
2. **Maximize HOT updates** by not indexing frequently-updated columns
3. **Tune autovacuum** to run more aggressively on high-churn tables
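Autovacuum can be tuned per table with storage parameters. A sketch with illustrative thresholds for a high-churn table (the table name and values are assumptions to adapt):
```sql
-- Vacuum after ~2% of rows change instead of the 20% default,
-- and analyze after ~1% instead of the 10% default
ALTER TABLE findings SET (
    autovacuum_vacuum_scale_factor = 0.02,
    autovacuum_analyze_scale_factor = 0.01
);
```
Lower scale factors mean more frequent, smaller vacuums, which keeps dead tuples from accumulating between runs.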
### Rebuilding many indexes without deadlocks
If you rebuild two indexes on the same table in parallel, PostgreSQL detects a deadlock and kills one session. To rebuild many indexes across multiple sessions safely, assign all indexes for a given table to the same session:
```sql
\set NUMBER_OF_SESSIONS 10
SELECT
format('%I.%I', n.nspname, c.relname) AS table_fqn,
format('%I.%I', n.nspname, i.relname) AS index_fqn,
mod(
hashtext(format('%I.%I', n.nspname, c.relname)) & 2147483647,
:NUMBER_OF_SESSIONS
) AS session_id
FROM pg_index idx
JOIN pg_class c ON idx.indrelid = c.oid
JOIN pg_class i ON idx.indexrelid = i.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE n.nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
ORDER BY table_fqn, index_fqn;
```
Then run each session's indexes in a separate `REINDEX INDEX CONCURRENTLY` call. Set `NUMBER_OF_SESSIONS` based on `max_parallel_maintenance_workers` and available I/O.
## Dropping indexes
### Post-drop maintenance
After dropping an index, run VACUUM and ANALYZE on the table to reclaim any heap bloat and refresh planner statistics:
```sql
-- Full vacuum + analyze (can be heavy on large tables)
VACUUM (ANALYZE) your_table;
-- Lightweight alternative for huge tables: just update statistics
ANALYZE your_table;
```
## Commands
```sql
-- Validate query uses an index
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;
-- Check index creation progress
SELECT * FROM pg_stat_progress_create_index;
-- Find invalid indexes
SELECT c.relname, i.indisvalid
FROM pg_class c JOIN pg_index i ON i.indexrelid = c.oid
WHERE i.indisvalid = false;
-- Find unused indexes
SELECT relname, indexrelname, idx_scan, pg_size_pretty(pg_relation_size(indexrelid))
FROM pg_stat_all_indexes
WHERE schemaname = 'public' AND idx_scan = 0;
-- Create index safely
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_name ON table_name (columns);
-- Drop index safely
DROP INDEX CONCURRENTLY IF EXISTS idx_name;
-- Rebuild invalid index
REINDEX INDEX CONCURRENTLY idx_name;
-- Post-drop maintenance
VACUUM (ANALYZE) table_name;
```
## Context7 lookups
**Prerequisite:** Install Context7 MCP server for up-to-date documentation lookup.
| Library | Context7 ID | Use for |
|---------|-------------|---------|
| PostgreSQL | `/websites/postgresql_org_docs_current` | Index types, EXPLAIN, partitioned table indexing, REINDEX |
**Example queries:**
```
mcp_context7_query-docs(libraryId="/websites/postgresql_org_docs_current", query="CREATE INDEX CONCURRENTLY partitioned table")
mcp_context7_query-docs(libraryId="/websites/postgresql_org_docs_current", query="EXPLAIN ANALYZE BUFFERS query plan")
mcp_context7_query-docs(libraryId="/websites/postgresql_org_docs_current", query="partial index WHERE clause")
mcp_context7_query-docs(libraryId="/websites/postgresql_org_docs_current", query="REINDEX CONCURRENTLY invalid index")
mcp_context7_query-docs(libraryId="/websites/postgresql_org_docs_current", query="pg_stat_all_indexes monitoring")
```
> **Note:** Use `mcp_context7_resolve-library-id` first if you need to find the correct library ID.
## Resources
- **EXPLAIN Visualizer**: [pev](https://tatiyants.com/pev/)