mirror of https://github.com/prowler-cloud/prowler.git
synced 2026-05-14 16:25:13 +00:00

Compare commits (4 commits)

| SHA1 |
|---|
| c1f0760e0f |
| 4b81bccade |
| 5b791be018 |
| 7c71038e1f |

+49 −491
@@ -1,8 +1,6 @@
---
name: django-drf
description: >
  Django REST Framework patterns.
  Trigger: When implementing generic DRF APIs (ViewSets, serializers, routers, permissions, filtersets). For Prowler API specifics (RLS/RBAC/Providers), also use prowler-api.
description: "Trigger: When implementing generic DRF APIs such as viewsets, serializers, routers, permissions, pagination, or filtersets, including JSON:API-capable endpoints. Applies the shared DRF execution patterns used in Prowler."
license: Apache-2.0
metadata:
  author: prowler-cloud
@@ -15,491 +13,51 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task
---
## Critical Patterns

- ALWAYS separate serializers by operation: Read / Create / Update / Include
- ALWAYS use `filterset_class` for complex filtering (not `filterset_fields`)
- ALWAYS validate unknown fields in write serializers (inherit `BaseWriteSerializer`)
- ALWAYS use `select_related`/`prefetch_related` in `get_queryset()` to avoid N+1
- ALWAYS handle `swagger_fake_view` in `get_queryset()` for schema generation
- ALWAYS use `@extend_schema_field` for OpenAPI docs on `SerializerMethodField`
- NEVER put business logic in serializers - use services/utils
- NEVER use auto-increment PKs - use UUIDv4 or UUIDv7
- NEVER use trailing slashes in URLs (`trailing_slash=False`)

> **Note:** `swagger_fake_view` is specific to **drf-spectacular** for OpenAPI schema generation.

---
## Implementation Checklist

When implementing a new endpoint, review these patterns in order:

| # | Pattern | Reference | Key Points |
|---|---------|-----------|------------|
| 1 | **Models** | `api/models.py` | UUID PK, `inserted_at`/`updated_at`, `JSONAPIMeta.resource_name` |
| 2 | **ViewSets** | `api/base_views.py`, `api/v1/views.py` | Inherit `BaseRLSViewSet`, `get_queryset()` with N+1 prevention |
| 3 | **Serializers** | `api/v1/serializers.py` | Separate Read/Create/Update/Include, inherit `BaseWriteSerializer` |
| 4 | **Filters** | `api/filters.py` | Use `filterset_class`, inherit base filter classes |
| 5 | **Permissions** | `api/base_views.py` | `required_permissions`, `set_required_permissions()` |
| 6 | **Pagination** | `api/pagination.py` | Custom pagination class if needed |
| 7 | **URL Routing** | `api/v1/urls.py` | `trailing_slash=False`, kebab-case paths |
| 8 | **OpenAPI Schema** | `api/v1/views.py` | `@extend_schema_view` with drf-spectacular |
| 9 | **Tests** | `api/tests/test_views.py` | JSON:API content type, fixture patterns |

> **Full file paths**: See [references/file-locations.md](references/file-locations.md)

---
## Decision Trees

### Which Serializer?

```
GET list/retrieve → <Model>Serializer
POST create → <Model>CreateSerializer
PATCH update → <Model>UpdateSerializer
?include=... → <Model>IncludeSerializer
```

### Which Base Serializer?

```
Read-only serializer → BaseModelSerializerV1
Create with tenant_id → RLSSerializer + BaseWriteSerializer (auto-injects tenant_id on create)
Update with validation → BaseWriteSerializer (tenant_id already exists on object)
Non-model data → BaseSerializerV1
```

### Which Filter Base?

```
Direct FK to Provider → BaseProviderFilter
FK via Scan → BaseScanProviderFilter
No provider relation → FilterSet
```

### Which Base ViewSet?

```
RLS-protected model → BaseRLSViewSet (most common)
Tenant operations → BaseTenantViewset
User operations → BaseUserViewset
No RLS required → BaseViewSet (rare)
```

### Resource Name Format?

```
Single word model → plural lowercase (Provider → providers)
Multi-word model → plural lowercase kebab (ProviderGroup → provider-groups)
Through/join model → parent-child pattern (UserRoleRelationship → user-roles)
Aggregation/overview → descriptive kebab plural (ComplianceOverview → compliance-overviews)
```
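The resource-name rules above are declared by hand on each model's `JSONAPIMeta.resource_name`. For the single-model cases, the convention can be mechanized as a rough helper (illustrative only; through/join models such as `UserRoleRelationship → user-roles` are named manually):

```python
import re


def resource_name(model_name: str) -> str:
    """Derive a JSON:API resource name per the decision tree above:
    split CamelCase into kebab-case words, then pluralize the last word.
    """
    words = re.findall(r"[A-Z][a-z0-9]*", model_name)
    return "-".join(w.lower() for w in words) + "s"


# On the model it is simply declared (django-rest-framework-json-api):
#
#   class ProviderGroup(models.Model):
#       class JSONAPIMeta:
#           resource_name = "provider-groups"
```

`resource_name("ComplianceOverview")` yields `"compliance-overviews"`, matching the aggregation example above.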
---

## Serializer Patterns

### Base Class Hierarchy

```python
# Read serializer (most common)
class ProviderSerializer(RLSSerializer):
    class Meta:
        model = Provider
        fields = ["id", "provider", "uid", "alias", "connected", "inserted_at"]


# Write serializer (validates unknown fields)
class ProviderCreateSerializer(RLSSerializer, BaseWriteSerializer):
    class Meta:
        model = Provider
        fields = ["provider", "uid", "alias"]


# Include serializer (sparse fields for ?include=)
class ProviderIncludeSerializer(RLSSerializer):
    class Meta:
        model = Provider
        fields = ["id", "alias"]  # Minimal fields
```
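The unknown-field validation that `BaseWriteSerializer` contributes boils down to a set difference between the incoming payload and the declared fields. A minimal sketch of the core check (assumed shape; the repo's actual implementation lives in `api/v1/serializers.py` and may differ):

```python
def unknown_write_fields(payload: dict, declared_fields: set) -> list:
    """Keys sent by the client that the write serializer does not declare."""
    return sorted(set(payload) - declared_fields)


# Inside a DRF serializer this would run in validate():
#
#     unknown = unknown_write_fields(self.initial_data, set(self.fields))
#     if unknown:
#         raise serializers.ValidationError(f"Invalid fields: {unknown}")
```

A client POSTing an undeclared `role` key against `fields = ["uid", "alias"]` would be rejected instead of silently ignored, which is the point of the pattern.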
### SerializerMethodField with OpenAPI

```python
from drf_spectacular.utils import extend_schema_field


class ProviderSerializer(RLSSerializer):
    connection = serializers.SerializerMethodField(read_only=True)

    @extend_schema_field({
        "type": "object",
        "properties": {
            "connected": {"type": "boolean"},
            "last_checked_at": {"type": "string", "format": "date-time"},
        },
    })
    def get_connection(self, obj):
        return {
            "connected": obj.connected,
            "last_checked_at": obj.connection_last_checked_at,
        }
```
### Included Serializers (JSON:API)

```python
class ScanSerializer(RLSSerializer):
    included_serializers = {
        "provider": "api.v1.serializers.ProviderIncludeSerializer",
    }
```
### Sensitive Data Masking

```python
def to_representation(self, instance):
    data = super().to_representation(instance)
    # Mask by default, expose only on explicit request
    fields_param = self.context.get("request").query_params.get("fields[my-model]", "")
    if "api_key" in fields_param:
        data["api_key"] = instance.api_key_decoded
    else:
        data["api_key"] = "****" if instance.api_key else None
    return data
```
---

## ViewSet Patterns

### get_queryset() with N+1 Prevention

**Always combine** the `swagger_fake_view` check with `select_related`/`prefetch_related`:

```python
def get_queryset(self):
    # REQUIRED: Return empty queryset for OpenAPI schema generation
    if getattr(self, "swagger_fake_view", False):
        return Provider.objects.none()

    # N+1 prevention: eager load relationships
    return Provider.objects.select_related(
        "tenant",
    ).prefetch_related(
        "provider_groups",
        Prefetch("tags", queryset=ProviderTag.objects.filter(tenant_id=self.request.tenant_id)),
    )
```

> **Why swagger_fake_view?** drf-spectacular introspects ViewSets to generate OpenAPI schemas. Without this check, it executes real queries and can fail without request context.
### Action-Specific Serializers

```python
def get_serializer_class(self):
    if self.action == "create":
        return ProviderCreateSerializer
    elif self.action == "partial_update":
        return ProviderUpdateSerializer
    elif self.action in ["connection", "destroy"]:
        return TaskSerializer
    return ProviderSerializer
```
### Dynamic Permissions per Action

```python
class ProviderViewSet(BaseRLSViewSet):
    required_permissions = [Permissions.MANAGE_PROVIDERS]

    def set_required_permissions(self):
        if self.action in ["list", "retrieve"]:
            self.required_permissions = []  # Read-only = no permission
        else:
            self.required_permissions = [Permissions.MANAGE_PROVIDERS]
```
### Cache Decorator

```python
from django.conf import settings as django_settings  # assuming the cache values live in Django settings
from django.utils.decorators import method_decorator
from django.views.decorators.cache import cache_control

CACHE_DECORATOR = cache_control(
    max_age=django_settings.CACHE_MAX_AGE,
    stale_while_revalidate=django_settings.CACHE_STALE_WHILE_REVALIDATE,
)


@method_decorator(CACHE_DECORATOR, name="list")
@method_decorator(CACHE_DECORATOR, name="retrieve")
class ProviderViewSet(BaseRLSViewSet):
    pass
```
### Custom Actions

```python
# Detail action (operates on single object)
@action(detail=True, methods=["post"], url_name="connection")
def connection(self, request, pk=None):
    instance = self.get_object()
    # Process instance...


# List action (operates on collection)
@action(detail=False, methods=["get"], url_name="metadata")
def metadata(self, request):
    queryset = self.filter_queryset(self.get_queryset())
    # Aggregate over queryset...
```
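Custom actions surface at kebab-case paths without trailing slashes once the viewset is registered on a router built with `trailing_slash=False`. The wiring in the comments is hypothetical (it mirrors the `api/v1/urls.py` conventions named in the checklist); the runnable helper only illustrates the URL shapes a DRF router produces:

```python
from typing import Optional

# Hypothetical router wiring (names assumed):
#
#   from rest_framework.routers import DefaultRouter
#   router = DefaultRouter(trailing_slash=False)
#   router.register("providers", ProviderViewSet, basename="provider")


def action_path(resource: str, action_url_path: str, pk: Optional[str] = None) -> str:
    """Build the URL a DRF router generates for an @action (no trailing slash)."""
    parts = ["", resource]
    if pk is not None:  # detail=True actions include the object pk
        parts.append(pk)
    parts.append(action_url_path)
    return "/".join(parts)
```

So the two actions above resolve to `/providers/<uuid>/connection` and `/findings/metadata` style paths.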
---

## Filter Patterns

### Base Filter Classes

```python
class BaseProviderFilter(FilterSet):
    """For models with direct FK to Provider"""

    provider_id = UUIDFilter(field_name="provider__id", lookup_expr="exact")
    provider_id__in = UUIDInFilter(field_name="provider__id", lookup_expr="in")
    provider_type = ChoiceFilter(field_name="provider__provider", choices=Provider.ProviderChoices.choices)


class BaseScanProviderFilter(FilterSet):
    """For models with FK to Scan (Scan has FK to Provider)"""

    provider_id = UUIDFilter(field_name="scan__provider__id", lookup_expr="exact")
```
### Custom Multi-Value Filters

```python
class UUIDInFilter(BaseInFilter, UUIDFilter):
    pass


class CharInFilter(BaseInFilter, CharFilter):
    pass


class ChoiceInFilter(BaseInFilter, ChoiceFilter):
    pass
```
### ArrayField Filtering

```python
# Single value contains
region = CharFilter(method="filter_region")

def filter_region(self, queryset, name, value):
    return queryset.filter(resource_regions__contains=[value])


# Multi-value overlap
region__in = CharInFilter(field_name="resource_regions", lookup_expr="overlap")
```
### Date Range Validation

```python
def filter_queryset(self, queryset):
    # Require date filter for performance
    if not (date_filters_provided):
        raise ValidationError([{
            "detail": "At least one date filter is required",
            "status": 400,
            "source": {"pointer": "/data/attributes/inserted_at"},
            "code": "required",
        }])

    # Validate max range
    if date_range > settings.FINDINGS_MAX_DAYS_IN_RANGE:
        raise ValidationError(...)

    return super().filter_queryset(queryset)
```
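The guard above is deliberately pseudocode (`date_filters_provided` and `date_range` are placeholders). A concrete sketch of the same rule with plain dates, names assumed rather than taken from the repo:

```python
from datetime import date


def validate_date_range(start, end, max_days):
    """Require at least one bound and cap the span (mirrors filter_queryset above)."""
    if start is None and end is None:
        raise ValueError("At least one date filter is required")
    if start is not None and end is not None and (end - start).days > max_days:
        raise ValueError(f"Date range exceeds {max_days} days")
```

A single bound such as `validate_date_range(date(2026, 1, 1), None, 30)` passes; no bound, or a span wider than `max_days`, raises.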
### Dynamic FilterSet Selection

```python
def get_filterset_class(self):
    if self.action in ["latest", "metadata_latest"]:
        return LatestFindingFilter
    return FindingFilter
```
### Enum Field Override

```python
class Meta:
    model = Finding
    filter_overrides = {
        FindingDeltaEnumField: {"filter_class": CharFilter},
        StatusEnumField: {"filter_class": CharFilter},
        SeverityEnumField: {"filter_class": CharFilter},
    }
```
---

## Performance Patterns

### PaginateByPkMixin

For large querysets with expensive joins:

```python
class PaginateByPkMixin:
    def paginate_by_pk(self, request, base_queryset, manager,
                       select_related=None, prefetch_related=None):
        # 1. Get PKs only (cheap)
        pk_list = base_queryset.values_list("id", flat=True)
        page = self.paginate_queryset(pk_list)

        # 2. Fetch full objects for just the page
        queryset = manager.filter(id__in=page)
        if select_related:
            queryset = queryset.select_related(*select_related)
        if prefetch_related:
            queryset = queryset.prefetch_related(*prefetch_related)

        # 3. Re-sort to preserve DB ordering
        queryset = sorted(queryset, key=lambda obj: page.index(obj.id))
        return self.get_paginated_response(self.get_serializer(queryset, many=True).data)
```
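The pk-first strategy can be exercised outside Django. A toy version of the three steps above, with plain lists standing in for querysets (illustrative only):

```python
def paginate_pk_first(rows, page_size, page_number, hydrate):
    """rows: ordered (pk, ...) tuples; hydrate: pk list -> full objects, any order."""
    pks = [pk for pk, *_ in rows]                 # 1. cheap pk-only pass
    start = (page_number - 1) * page_size
    page = pks[start:start + page_size]
    objs = hydrate(page)                          # 2. fetch full rows for the page only
    # 3. re-sort, since the hydrating query does not preserve the original order
    return sorted(objs, key=lambda o: page.index(o["id"]))
```

The expensive joins run only for `page_size` rows instead of the whole result set, which is the entire point of the mixin.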
### Prefetch in Serializers

```python
def get_tags(self, obj):
    # Use prefetched tags if available
    if hasattr(obj, "prefetched_tags"):
        return {tag.key: tag.value for tag in obj.prefetched_tags}
    # Fallback (causes N+1 if not prefetched)
    return obj.get_tags(self.context.get("tenant_id"))
```
---

## Naming Conventions

| Entity | Pattern | Example |
|--------|---------|---------|
| Serializer (read) | `<Model>Serializer` | `ProviderSerializer` |
| Serializer (create) | `<Model>CreateSerializer` | `ProviderCreateSerializer` |
| Serializer (update) | `<Model>UpdateSerializer` | `ProviderUpdateSerializer` |
| Serializer (include) | `<Model>IncludeSerializer` | `ProviderIncludeSerializer` |
| Filter | `<Model>Filter` | `ProviderFilter` |
| ViewSet | `<Model>ViewSet` | `ProviderViewSet` |

---
## OpenAPI Documentation

```python
from drf_spectacular.utils import extend_schema, extend_schema_view


@extend_schema_view(
    list=extend_schema(tags=["Provider"], summary="List all providers"),
    retrieve=extend_schema(tags=["Provider"], summary="Retrieve provider"),
    create=extend_schema(tags=["Provider"], summary="Create provider"),
)
@extend_schema(tags=["Provider"])
class ProviderViewSet(BaseRLSViewSet):
    pass
```

---
## API Security Patterns

> **Full examples**: See [assets/security_patterns.py](assets/security_patterns.py)

| Pattern | Key Points |
|---------|------------|
| **Input Validation** | Use `validate_<field>()` for sanitization, `validate()` for cross-field |
| **Prevent Mass Assignment** | ALWAYS use explicit `fields` list, NEVER `__all__` or `exclude` |
| **Object-Level Permissions** | Implement `has_object_permission()` for ownership checks |
| **Rate Limiting** | Configure `DEFAULT_THROTTLE_RATES`, use per-view throttles for sensitive endpoints |
| **Prevent Info Disclosure** | Generic error messages, return 404 not 403 for unauthorized (prevents enumeration) |
| **SQL Injection** | ALWAYS use ORM parameterization, NEVER string interpolation in raw SQL |
### Quick Reference

```python
import re

from django.http import Http404
from rest_framework import serializers
from rest_framework.exceptions import NotFound
from rest_framework.permissions import SAFE_METHODS, BasePermission
from rest_framework.throttling import UserRateThrottle


# Input validation in serializer
def validate_uid(self, value):
    value = value.strip().lower()
    if not re.match(r'^[a-z0-9-]+$', value):
        raise serializers.ValidationError("Invalid format")
    return value


# Explicit fields (prevent mass assignment)
class Meta:
    fields = ["name", "email"]  # GOOD: whitelist
    read_only_fields = ["id", "inserted_at"]  # System fields


# Object permission
class IsOwnerOrReadOnly(BasePermission):
    def has_object_permission(self, request, view, obj):
        if request.method in SAFE_METHODS:
            return True
        return obj.owner == request.user


# Throttling for sensitive endpoints
class BurstRateThrottle(UserRateThrottle):
    rate = "10/minute"


# Safe error messages (prevent enumeration)
def get_object(self):
    try:
        return super().get_object()
    except Http404:
        raise NotFound("Resource not found")  # Generic, no internal IDs
```
---

## Commands

```bash
# Development
cd api && poetry run python src/backend/manage.py runserver
cd api && poetry run python src/backend/manage.py shell

# Database
cd api && poetry run python src/backend/manage.py makemigrations
cd api && poetry run python src/backend/manage.py migrate

# Testing
cd api && poetry run pytest -x --tb=short
cd api && poetry run make lint
```

---
## Resources

### Local References

- **File Locations**: See [references/file-locations.md](references/file-locations.md)
- **JSON:API Conventions**: See [references/json-api-conventions.md](references/json-api-conventions.md)
- **Security Patterns**: See [assets/security_patterns.py](assets/security_patterns.py)

### Context7 MCP (Recommended)

**Prerequisite:** Install the Context7 MCP server for up-to-date documentation lookup.

When implementing or debugging, query these libraries via `mcp_context7_query-docs`:

| Library | Context7 ID | Use For |
|---------|-------------|---------|
| **Django** | `/websites/djangoproject_en_5_2` | Models, ORM, migrations |
| **DRF** | `/websites/django-rest-framework` | ViewSets, serializers, permissions |
| **drf-spectacular** | `/tfranzel/drf-spectacular` | OpenAPI schema, `@extend_schema` |

**Example queries:**

```
mcp_context7_query-docs(libraryId="/websites/django-rest-framework", query="ViewSet get_queryset best practices")
mcp_context7_query-docs(libraryId="/tfranzel/drf-spectacular", query="extend_schema examples for custom actions")
mcp_context7_query-docs(libraryId="/websites/djangoproject_en_5_2", query="model constraints and indexes")
```

> **Note:** Use `mcp_context7_resolve-library-id` first if you need to find the correct library ID.
### External Docs

- **DRF Docs**: https://www.django-rest-framework.org/
- **DRF JSON:API**: https://django-rest-framework-json-api.readthedocs.io/
- **drf-spectacular**: https://drf-spectacular.readthedocs.io/
- **django-filter**: https://django-filter.readthedocs.io/
|
||||
## Activation Contract

Use this skill for generic DRF implementation structure: serializer layering, viewset composition, filtersets, routing, pagination, schema annotations, and query efficiency. Pair it with `jsonapi` for spec compliance and `prowler-api` when tenant isolation, RBAC, providers, or Celery-specific behavior enters the picture.

## Hard Rules

- Always separate serializer responsibilities by operation instead of one serializer doing everything.
- Always use `filterset_class` for meaningful filtering logic.
- Always validate unknown write fields through the repo’s write-serializer pattern.
- Always protect `get_queryset()` with `swagger_fake_view` handling and N+1 prevention.
- Always prefer UUID-based identifiers and kebab-case API paths.
- Never hide business logic in serializers when it belongs in services, utilities, or domain code.

## Decision Gates

| Question | Action |
|---|---|
| Is this a generic DRF endpoint concern? | Use this skill as the primary implementation guide. |
| Is the task about payload compliance rather than mechanics? | Load `jsonapi` too. |
| Is the endpoint Prowler-specific because of RLS, RBAC, or providers? | Load `prowler-api` too. |
| Do reads and writes have different responsibilities? | Split read, create, update, and include serializers. |
| Could the queryset explode into N+1 queries or schema-generation failures? | Fix `get_queryset()` with eager loading and `swagger_fake_view` handling. |

## Execution Steps

1. Identify the endpoint surface: model, serializer set, filterset, router path, permission rule, or schema annotation.
2. Choose the correct base classes for read, write, include, and viewset behavior.
3. Design `get_queryset()` for correctness first, then add eager loading and schema-safety.
4. Add filtersets, pagination, and action-specific serializers instead of overloading one class.
5. Cross-check response shape with `jsonapi` and any tenant/provider behavior with `prowler-api`.
6. Return the concrete DRF patterns that should be applied in code.

## Output Contract

- State which DRF layer is being guided: serializer, viewset, filterset, router, schema, or permission.
- Mention the main pattern chosen, such as split serializers, `filterset_class`, or safe `get_queryset()`.
- Name any companion skills required.
- Flag the main correctness risk: N+1, schema-generation failure, weak validation, or over-coupled serializer logic.

## References

- [Repository agent rules](../../AGENTS.md)
- [API component guidance](../../api/AGENTS.md)
- [DRF file locations](references/file-locations.md)
- [JSON:API conventions](references/json-api-conventions.md)
- [Security patterns asset](assets/security_patterns.py)
- [JSON:API skill](../jsonapi/SKILL.md)
- [Prowler API skill](../prowler-api/SKILL.md)
@@ -1,11 +1,6 @@
---
name: django-migration-psql
description: >
  Reviews Django migration files for PostgreSQL best practices specific to Prowler.
  Trigger: When creating migrations, running makemigrations/pgmakemigrations, reviewing migration PRs,
  adding indexes or constraints to database tables, modifying existing migration files, or writing
  data backfill migrations. Always use this skill when you see AddIndex, CreateModel, AddConstraint,
  RunPython, bulk_create, bulk_update, or backfill operations in migration files.
description: "Trigger: When creating, reviewing, or splitting Django/PostgreSQL migrations with AddIndex, CreateModel, AddConstraint, RunPython, or backfill logic. Enforces Prowler-safe migration structure for indexes, partitioned tables, and large data moves."
license: Apache-2.0
metadata:
  author: prowler-cloud
@@ -19,436 +14,55 @@ metadata:
allowed-tools: Read, Grep, Glob, Edit, Write, Bash
---
## Activation Contract

Use this skill when a migration changes schema, indexes, partitioned tables, or existing data in the Prowler API database. Typical triggers:

- Creating a new Django migration
- Running `makemigrations` or `pgmakemigrations`
- Reviewing a PR that adds or modifies migrations
- Adding indexes, constraints, or models to the database
## Hard Rules

A bad migration can lock a production table for minutes, block all reads/writes, or silently skip index creation on partitioned tables. Therefore:

- Never trust auto-generated migrations as-is; review and split them by concern and table.
- Keep `CreateModel` plus integrity constraints together, but move every `AddIndex` into separate migration files.
- Never mix indexes for multiple tables in one migration.
- `finding` and `resourcefindingmapping` partitioned indexes require the two-step pattern: partition creation first, parent `AddIndex` second.
- Partition index creation with `RunPython(create_index_on_partitions, ...)` MUST use `atomic = False` and include `reverse_code`.
- Large backfills must dispatch Celery work; do not iterate millions of rows inside the migration transaction.
- Inline backfills are allowed only for trivial, single-statement updates on small result sets.
- Backfill tasks must batch writes, use one `rls_transaction()` per batch, and avoid `.save()` loops.
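The backfill rules above imply a batched Celery task. A heavily hedged sketch follows: the task shape, `rls_transaction`, and model names are assumptions rather than the repo's actual code, while the batching helper itself is runnable:

```python
def batched(ids, batch_size):
    """Yield successive id batches so each batch gets its own transaction."""
    for i in range(0, len(ids), batch_size):
        yield ids[i:i + batch_size]


# Hypothetical task shape (names assumed):
#
#   @shared_task
#   def backfill_field(tenant_id, batch_size=1000):
#       ids = list(MyModel.objects.values_list("id", flat=True))
#       for batch in batched(ids, batch_size):
#           with rls_transaction(tenant_id):  # one transaction per batch
#               MyModel.objects.filter(id__in=batch).update(new_field=...)  # no .save() loop
```

A bulk `update()` per batch keeps each transaction short and avoids the per-row `.save()` pattern the rules forbid.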
## Auto-generated migrations need splitting

`makemigrations` and `pgmakemigrations` bundle everything into one file: `CreateModel`, `AddIndex`, `AddConstraint`, sometimes across multiple tables. This is the default Django behavior and it violates every rule below.

After generating a migration, ALWAYS review it and split it:

1. Read the generated file and identify every operation
2. Group operations by concern:
   - `CreateModel` + `AddConstraint` for each new table → one migration per table
   - `AddIndex` per table → one migration per table
   - `AddIndex` on partitioned tables → two migrations (partition + parent)
   - `AlterField`, `AddField`, `RemoveField` for each table → one migration per table
3. Rewrite the generated file into separate migration files with correct dependencies
4. Delete the original auto-generated migration

When adding fields or indexes to an existing model, `makemigrations` may also bundle `AddIndex` for unrelated tables that had pending model changes. Always check for stowaways from other tables.
## Rule 1: separate indexes from model creation

`CreateModel` + `AddConstraint` = same migration (structural).
`AddIndex` = separate migration file (performance).

Django runs each migration inside a transaction (unless `atomic = False`). If an index operation fails, it rolls back everything, including the model creation. Splitting means a failed index doesn't prevent the table from existing. It also lets you `--fake` index migrations independently (see Rule 4).

### Bad

```python
# 0081_finding_group_daily_summary.py — DON'T DO THIS
class Migration(migrations.Migration):
    operations = [
        migrations.CreateModel(name="FindingGroupDailySummary", ...),
        migrations.AddIndex(model_name="findinggroupdailysummary", ...),  # separate this
        migrations.AddIndex(model_name="findinggroupdailysummary", ...),  # separate this
        migrations.AddConstraint(model_name="findinggroupdailysummary", ...),  # this is fine here
    ]
```
### Good

```python
# 0081_create_finding_group_daily_summary.py
class Migration(migrations.Migration):
    operations = [
        migrations.CreateModel(name="FindingGroupDailySummary", ...),
        # Constraints belong with the model — they define its integrity rules
        migrations.AddConstraint(model_name="findinggroupdailysummary", ...),  # unique
        migrations.AddConstraint(model_name="findinggroupdailysummary", ...),  # RLS
    ]


# 0082_finding_group_daily_summary_indexes.py
class Migration(migrations.Migration):
    dependencies = [("api", "0081_create_finding_group_daily_summary")]
    operations = [
        migrations.AddIndex(model_name="findinggroupdailysummary", ...),
        migrations.AddIndex(model_name="findinggroupdailysummary", ...),
        migrations.AddIndex(model_name="findinggroupdailysummary", ...),
    ]
```

Flag any migration with both `CreateModel` and `AddIndex` in `operations`.
## Rule 2: one table's indexes per migration

Each table's indexes must live in their own migration file. Never mix `AddIndex` for different `model_name` values in one migration.

If the index on table B fails, the rollback also drops the index on table A. The migration name gives no hint that it touches unrelated tables. You lose the ability to `--fake` one table's indexes without affecting the other.

### Bad

```python
# 0081_finding_group_daily_summary.py — DON'T DO THIS
class Migration(migrations.Migration):
    operations = [
        migrations.CreateModel(name="FindingGroupDailySummary", ...),
        migrations.AddIndex(model_name="findinggroupdailysummary", ...),  # table A
        migrations.AddIndex(model_name="resource", ...),  # table B!
        migrations.AddIndex(model_name="resource", ...),  # table B!
        migrations.AddIndex(model_name="finding", ...),  # table C!
    ]
```
### Good

```python
# 0081_create_finding_group_daily_summary.py — model + constraints
# 0082_finding_group_daily_summary_indexes.py — only FindingGroupDailySummary indexes
# 0083_resource_trigram_indexes.py — only Resource indexes
# 0084_finding_check_index_partitions.py — only Finding partition indexes (step 1)
# 0085_finding_check_index_parent.py — only Finding parent index (step 2)
```

Name each migration file after the table it affects. A reviewer should know which table a migration touches without opening the file.

Flag any migration where `AddIndex` operations reference more than one `model_name`.
## Rule 3: partitioned table indexes require the two-step pattern

Tables `findings` and `resource_finding_mappings` are range-partitioned. Plain `AddIndex` only creates the index definition on the parent table. Postgres does NOT propagate it to existing partitions. New partitions inherit it, but all current data stays unindexed.

Use the helpers in `api.db_utils`.

### Step 1: create indexes on actual partitions
```python
# 0084_finding_check_index_partitions.py
from functools import partial

from django.db import migrations

from api.db_utils import create_index_on_partitions, drop_index_on_partitions


class Migration(migrations.Migration):
    atomic = False  # REQUIRED — CREATE INDEX CONCURRENTLY can't run inside a transaction

    dependencies = [("api", "0083_resource_trigram_indexes")]

    operations = [
        migrations.RunPython(
            partial(
                create_index_on_partitions,
                parent_table="findings",
                index_name="find_tenant_check_ins_idx",
                columns="tenant_id, check_id, inserted_at",
            ),
            reverse_code=partial(
                drop_index_on_partitions,
                parent_table="findings",
                index_name="find_tenant_check_ins_idx",
            ),
        )
    ]
```
Key details:

- `atomic = False` is mandatory. `CREATE INDEX CONCURRENTLY` cannot run inside a transaction.
- Always provide `reverse_code` using `drop_index_on_partitions` so rollbacks work.
- The default is `all_partitions=True`, which creates indexes on every partition CONCURRENTLY (no locks). This is the safe default.
- Do NOT use `all_partitions=False` unless you understand the consequence: Step 2's `AddIndex` on the parent will create indexes on the skipped partitions **with locks** (not CONCURRENTLY), because PostgreSQL fills in missing partition indexes inline during parent index creation.
### Step 2: register the index with Django

```python
# 0085_finding_check_index_parent.py
from django.db import migrations, models


class Migration(migrations.Migration):
    dependencies = [("api", "0084_finding_check_index_partitions")]

    operations = [
        migrations.AddIndex(
            model_name="finding",
            index=models.Index(
                fields=["tenant_id", "check_id", "inserted_at"],
                name="find_tenant_check_ins_idx",
            ),
        ),
    ]
```
This second migration tells Django "this index exists" so it doesn't try to recreate it. New partitions created after this point inherit the index definition from the parent.
|
||||
|
||||
### Existing examples in the codebase
|
||||
|
||||
| Partition migration | Parent migration |
|
||||
| Question | Action |
|
||||
|---|---|
|
||||
| `0020_findings_new_performance_indexes_partitions.py` | `0021_findings_new_performance_indexes_parent.py` |
|
||||
| `0024_findings_uid_index_partitions.py` | `0025_findings_uid_index_parent.py` |
|
||||
| `0028_findings_check_index_partitions.py` | `0029_findings_check_index_parent.py` |
|
||||
| `0036_rfm_tenant_finding_index_partitions.py` | `0037_rfm_tenant_finding_index_parent.py` |
|
||||
|
||||
Flag any plain `AddIndex` on `finding` or `resourcefindingmapping` without a preceding partition migration.
|
||||
|
||||
## Rule 4: large table indexes — fake the migration, apply manually

For huge tables (findings has millions of rows), even `CREATE INDEX CONCURRENTLY` can take minutes and consume significant I/O. In production, you may want to decouple the migration from the actual index creation.

### Procedure

1. Write the migration normally, following the two-step pattern above.

2. Fake the migration so Django marks it as applied without executing it:

   ```bash
   python manage.py migrate api 0084_finding_check_index_partitions --fake
   python manage.py migrate api 0085_finding_check_index_parent --fake
   ```

3. Create the index manually during a low-traffic window via `psql` or `python manage.py dbshell --database admin`:

   ```sql
   -- For each partition you care about:
   CREATE INDEX CONCURRENTLY IF NOT EXISTS findings_2026_jan_find_tenant_check_ins_idx
   ON findings_2026_jan USING BTREE (tenant_id, check_id, inserted_at);

   CREATE INDEX CONCURRENTLY IF NOT EXISTS findings_2026_feb_find_tenant_check_ins_idx
   ON findings_2026_feb USING BTREE (tenant_id, check_id, inserted_at);

   -- Then register on the parent (this is fast, no data scan):
   CREATE INDEX IF NOT EXISTS find_tenant_check_ins_idx
   ON findings USING BTREE (tenant_id, check_id, inserted_at);
   ```

4. Verify the index exists on the partitions you need:

   ```sql
   SELECT indexrelid::regclass, indrelid::regclass
   FROM pg_index
   WHERE indexrelid::regclass::text LIKE '%find_tenant_check_ins%';
   ```

### When to use this approach

- The table is expected to keep growing rapidly (e.g., `findings`).
- You want to control exactly when the I/O hit happens (e.g., during a maintenance window).

This is optional. For smaller tables or non-production environments, letting the migration run normally is fine.
## Rule 5: data backfills — never inline, always batched

Data backfills (updating existing rows, populating new columns, generating summary data) are the most dangerous migrations. A naive `Model.objects.all().update(...)` on a multi-million row table will hold a transaction lock for minutes, blow out WAL, and potentially OOM the worker.

### Never backfill inline in the migration

The migration should only dispatch the work. The actual backfill runs asynchronously via Celery tasks, outside the migration transaction.

```python
# 0090_backfill_finding_group_summaries.py
from django.db import migrations


def trigger_backfill(apps, schema_editor):
    from tasks.jobs.backfill import backfill_finding_group_summaries_task

    Tenant = apps.get_model("api", "Tenant")
    from api.db_router import MainRouter

    tenant_ids = Tenant.objects.using(MainRouter.admin_db).values_list("id", flat=True)
    for tenant_id in tenant_ids:
        backfill_finding_group_summaries_task.delay(tenant_id=str(tenant_id))


class Migration(migrations.Migration):
    dependencies = [("api", "0089_previous_migration")]
    operations = [
        migrations.RunPython(trigger_backfill, migrations.RunPython.noop),
    ]
```

The migration finishes in seconds. The backfill runs in the background per-tenant.
### Exception: trivial updates

Single-statement bulk updates on small result sets are OK inline:

```python
# Fine — single UPDATE, small result set, no iteration
def backfill_graph_data_ready(apps, schema_editor):
    from api.db_router import MainRouter

    AttackPathsScan = apps.get_model("api", "AttackPathsScan")
    AttackPathsScan.objects.using(MainRouter.admin_db).filter(
        state="completed", graph_data_ready=False,
    ).update(graph_data_ready=True)
```

Use inline only when you're confident the affected row count is small (< ~10K rows).
### Batch processing in the Celery task

The actual backfill task must process data in batches. Use the helpers in `api.db_utils`:

```python
from api.db_utils import create_objects_in_batches, update_objects_in_batches, batch_delete

# Creating objects in batches (500 per transaction)
create_objects_in_batches(tenant_id, ScanCategorySummary, summaries, batch_size=500)

# Updating objects in batches
update_objects_in_batches(tenant_id, Finding, findings, fields=["status"], batch_size=500)

# Deleting in batches
batch_delete(tenant_id, queryset, batch_size=settings.DJANGO_DELETION_BATCH_SIZE)
```

Each batch runs in its own `rls_transaction()` so:
- A failure in batch N doesn't roll back batches 1 through N-1
- Lock duration is bounded to the batch size
- Memory stays constant regardless of total row count
### Rules for backfill tasks

1. **One RLS transaction per batch.** Never wrap the entire backfill in a single transaction. Each batch gets its own `rls_transaction(tenant_id)`.

2. **Use `bulk_create` / `bulk_update` with explicit `batch_size`.** Never `.save()` in a loop. The default batch_size is 500.

3. **Use `.iterator()` for reads.** When reading source data, use `queryset.iterator()` to avoid loading the entire result set into memory.

4. **Use `.only()` / `.values_list()` for reads.** Fetch only the columns you need, not full model instances.

5. **Catch and skip per-item failures.** Don't let one bad row kill the entire backfill. Log the error, count it, continue.

   ```python
   scans_processed = 0
   scans_skipped = 0

   for scan_id in scan_ids:
       try:
           process_scan(tenant_id, scan_id)
           scans_processed += 1
       except Exception:
           logger.warning("Failed to process scan %s", scan_id)
           scans_skipped += 1

   logger.info("Backfill done: %d processed, %d skipped", scans_processed, scans_skipped)
   ```

6. **Log totals at start and end, not per-batch.** Per-batch logging floods the logs. Log the total count at the start, and the processed/skipped counts at the end.

7. **Use `ignore_conflicts=True` for idempotent creates.** Makes the backfill safe to re-run if interrupted.

   ```python
   Model.objects.bulk_create(objects, batch_size=500, ignore_conflicts=True)
   ```

8. **Iterate per-tenant.** Dispatch one Celery task per tenant. This gives you natural parallelism, bounded memory per task, and the ability to retry a single tenant without re-running everything.
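The rules above combine into a task skeleton roughly like the following. This is a minimal sketch: `rls_transaction`, the models, and the Celery wiring are assumed and omitted so the control flow (batching, per-item skip, start/end logging) is shown in isolation.

```python
import logging

logger = logging.getLogger(__name__)


def chunked(iterable, batch_size=500):
    """Yield lists of at most batch_size items without materializing everything."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def run_backfill(tenant_id, source_ids, process_item, batch_size=500):
    """Process source_ids in batches, skipping per-item failures (rules 1, 5, 6)."""
    processed = skipped = 0
    logger.info("Backfill start for tenant %s: %d items", tenant_id, len(source_ids))
    for batch in chunked(source_ids, batch_size):
        # In the real task, each batch would run inside rls_transaction(tenant_id)
        # and use bulk_create/bulk_update rather than per-item writes.
        for item_id in batch:
            try:
                process_item(tenant_id, item_id)
                processed += 1
            except Exception:
                logger.warning("Failed to process %s", item_id)
                skipped += 1
    logger.info("Backfill done: %d processed, %d skipped", processed, skipped)
    return processed, skipped
```

A failure in one item (or one batch) never aborts the rest, and totals are logged exactly twice.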
### Existing examples

| Migration | Task |
|---|---|
| `0062_backfill_daily_severity_summaries.py` | `backfill_daily_severity_summaries_task` |
| `0080_backfill_attack_paths_graph_data_ready.py` | Inline (trivial update) |
| `0082_backfill_finding_group_summaries.py` | `backfill_finding_group_summaries_task` |

Task implementations: `tasks/jobs/backfill.py`
Batch utilities: `api/db_utils.py` (`batch_delete`, `create_objects_in_batches`, `update_objects_in_batches`)

## Decision tree

```
Auto-generated migration?
├── Yes → Split it following the rules below
└── No → Review it against the rules below

New model?
├── Yes → CreateModel + AddConstraint in one migration
│         AddIndex in separate migration(s), one per table
└── No, just indexes?
    ├── Regular table → AddIndex in its own migration
    └── Partitioned table (findings, resource_finding_mappings)?
        ├── Step 1: RunPython + create_index_on_partitions (atomic=False)
        └── Step 2: AddIndex on parent (separate migration)
            └── Large table? → Consider --fake + manual apply

Data backfill?
├── Trivial update (< ~10K rows)? → Inline RunPython is OK
└── Large backfill? → Migration dispatches Celery task(s)
    ├── One task per tenant
    ├── Batch processing (bulk_create/bulk_update, batch_size=500)
    ├── One rls_transaction per batch
    └── Catch + skip per-item failures, log totals
```
## Quick reference

| Scenario | Approach |
|---|---|
| Auto-generated migration | Split by concern and table before committing |
| New model + constraints/RLS | Same migration (constraints are structural) |
| Indexes on a regular table | Separate migration, one table per file |
| Indexes on a partitioned table | Two migrations: partitions first (`RunPython` + `atomic=False`), then parent (`AddIndex`) |
| Index on a huge partitioned table | Same two migrations, but fake + apply manually in production |
| Trivial data backfill (< ~10K rows) | Inline `RunPython` with single `.update()` call |
| Large data backfill | Migration dispatches Celery task per tenant, task batches with `rls_transaction` |

## Review output format

1. List each violation with rule number and one-line explanation
2. Show corrected migration file(s)
3. For partitioned tables, show both partition and parent migrations

If migration passes all checks, say so.

## Context7 lookups

**Prerequisite:** Install Context7 MCP server for up-to-date documentation lookup.

When implementing or debugging migration patterns, query these libraries via `mcp_context7_query-docs`:

| Library | Context7 ID | Use for |
|---------|-------------|---------|
| Django 5.1 | `/websites/djangoproject_en_5_1` | Migration operations, indexes, constraints, `SchemaEditor` |
| PostgreSQL | `/websites/postgresql_org_docs_current` | `CREATE INDEX CONCURRENTLY`, partitioned tables, `pg_inherits` |
| django-postgres-extra | `/SectorLabs/django-postgres-extra` | Partitioned models, `PostgresPartitionedModel`, partition management |

**Example queries:**

```
mcp_context7_query-docs(libraryId="/websites/djangoproject_en_5_1", query="migration operations AddIndex RunPython atomic")
mcp_context7_query-docs(libraryId="/websites/djangoproject_en_5_1", query="database indexes Meta class concurrently")
mcp_context7_query-docs(libraryId="/websites/postgresql_org_docs_current", query="CREATE INDEX CONCURRENTLY partitioned table")
mcp_context7_query-docs(libraryId="/SectorLabs/django-postgres-extra", query="partitioned model range partition index")
```

> **Note:** Use `mcp_context7_resolve-library-id` first if you need to find the correct library ID.
## Commands

```bash
# Generate migrations (ALWAYS review output before committing)
python manage.py makemigrations
python manage.py pgmakemigrations

# Apply migrations
python manage.py migrate

# Fake a migration (mark as applied without running)
python manage.py migrate api <migration_name> --fake

# Manage partitions
python manage.py pgpartition --using admin
```

## Resources

- **Partition helpers**: `api/src/backend/api/db_utils.py` (`create_index_on_partitions`, `drop_index_on_partitions`)
- **Partition config**: `api/src/backend/api/partitions.py`
- **RLS constraints**: `api/src/backend/api/rls.py`
- **Existing examples**: `0028` + `0029`, `0024` + `0025`, `0036` + `0037`
## Decision Gates

| Question | Action |
|---|---|
| Did `makemigrations` bundle unrelated operations together? | Rewrite into focused files and delete the generated catch-all migration. |
| New table plus indexes? | Put `CreateModel` and constraints together, then create separate index migration(s). |
| Multiple `model_name` values in one `AddIndex` migration? | Split into one migration per table. |
| Indexing `finding` or `resourcefindingmapping`? | Use partition helper migration first, then parent `AddIndex` migration. |
| Very large partitioned table in production? | Consider fake-applying the migration and creating indexes manually during a maintenance window. |
| Data migration larger than trivial update scope? | Dispatch one Celery task per tenant and batch inside the task. |
## Execution Steps

1. Read the generated or reviewed migration and list every operation by table and by concern.
2. Split structural work (`CreateModel`, `AddConstraint`, field changes) from performance work (`AddIndex`).
3. For regular-table indexes, create one migration per table with only that table's `AddIndex` operations.
4. For partitioned tables, write migration A with `RunPython(create_index_on_partitions, reverse_code=drop_index_on_partitions)` and `atomic = False`, then migration B with the parent `AddIndex` so Django registers the index definition.
5. If the table is huge, document whether the safe path is normal execution or `--fake` plus manual concurrent index creation.
6. For data backfills, keep the migration as a dispatcher only unless the change is a tiny single `UPDATE`; move real work into Celery tasks that batch reads/writes, use `.iterator()`, fetch only needed columns, and tolerate per-item failures.
7. Re-verify dependencies, migration names, and rollback behavior before finishing.

## Output Contract

- State whether the migration was accepted as-is or rewritten.
- List each violation found with the governing rule: mixed concerns, mixed tables, missing partition step, unsafe backfill, or transaction misuse.
- Show the final migration shape expected: structural file, per-table index file(s), and partition/parent pair when applicable.
- If a backfill is involved, say whether it is inline-trivial or Celery-dispatched and why.
- Mention the exact validation command or migration command used.

## References

- `api/src/backend/api/db_utils.py`
- `api/src/backend/api/partitions.py`
- `api/src/backend/api/rls.py`
- `api/src/backend/tasks/jobs/backfill.py`
- `api/src/backend/**/migrations/`
- Existing partition examples: `0024` + `0025`, `0028` + `0029`, `0036` + `0037`

+35
-247
@@ -1,8 +1,6 @@
---
name: jsonapi
description: >
  Strict JSON:API v1.1 specification compliance.
  Trigger: When creating or modifying API endpoints, reviewing API responses, or validating JSON:API compliance.
description: "Trigger: When creating or modifying API endpoints, reviewing API responses, or validating JSON:API behavior in Prowler. Enforces JSON:API v1.1 response, relationship, and media-type compliance."
license: Apache-2.0
metadata:
  author: prowler-cloud
@@ -14,258 +12,48 @@ metadata:
  - "Reviewing JSON:API compliance"
---

## Use With django-drf
## Activation Contract

This skill focuses on **spec compliance**. For **implementation patterns** (ViewSets, Serializers, Filters), use `django-drf` skill together with this one.
Use this skill when the task is about what the JSON:API contract MUST look like: document shape, media types, relationship linkage, sparse fields, includes, errors, and status-code semantics. Pair it with `django-drf` for implementation mechanics and `prowler-api` for Prowler tenant or provider rules.

| Skill | Focus |
|-------|-------|
| `jsonapi` | What the spec requires (MUST/MUST NOT rules) |
| `django-drf` | How to implement it in DRF (code patterns) |
## Hard Rules

**When creating/modifying endpoints, invoke BOTH skills.**
- Never return `data` and `errors` in the same document.
- Always return JSON:API media types and document members consistent with the spec.
- Always model resource identifiers with string `id` values and kebab-case `type` values.
- Always represent relationships with JSON:API linkage objects, not raw foreign keys.
- Always emit error objects as an array and keep `status` as a string.
- Never hide spec violations behind framework defaults; verify the final payload shape.
---
## Decision Gates

## Before Implementing/Reviewing
| Question | Action |
|---|---|
| Are you designing endpoint structure or reviewing payload correctness? | Use this skill as the compliance authority. |
| Are you implementing DRF serializers/viewsets/filters too? | Load `django-drf` as a companion skill. |
| Does tenant visibility affect whether a resource should appear? | Load `prowler-api` too. |
| Is the change about relationship payloads or compound documents? | Validate linkage, `include`, and deduplication rules explicitly. |
| Is the response async or task-based? | Confirm status codes and response shape still satisfy JSON:API rules. |

**ALWAYS validate against the latest spec** before creating or modifying endpoints:
## Execution Steps

### Option 1: Context7 MCP (Preferred)
1. Identify the document type involved: success, error, relationship update, compound document, or sparse fieldset response.
2. Check media type, top-level members, and status code semantics first.
3. Validate resource object shape: `type`, string `id`, `attributes`, and `relationships`.
4. Verify query parameter behavior for `include`, `fields`, `filter`, `sort`, and pagination.
5. Review error payloads for array shape, string status, and pointers when field-specific.
6. Hand implementation details back to `django-drf` once compliance constraints are clear.

If Context7 MCP is available, query the JSON:API spec directly:
## Output Contract

```
mcp_context7_resolve-library-id(query="jsonapi specification")
mcp_context7_query-docs(libraryId="<resolved-id>", query="[specific topic: relationships, errors, etc.]")
```
- State the JSON:API rule or family of rules that governs the task.
- Mention the endpoint or payload surface being validated.
- Name companion skills needed for implementation or tenant-aware behavior.
- Call out the concrete violation risk if the current shape is wrong.

### Option 2: WebFetch (Fallback)
## References

If Context7 is not available, fetch from the official spec:

```
WebFetch(url="https://jsonapi.org/format/", prompt="Extract rules for [specific topic]")
```

This ensures compliance with the latest JSON:API version, even after spec updates.
---

## Critical Rules (NEVER Break)

### Document Structure
- NEVER include both `data` and `errors` in the same response
- ALWAYS include at least one of: `data`, `errors`, `meta`
- ALWAYS use `type` and `id` (string) in resource objects
- NEVER include `id` when creating resources (server generates it)

### Content-Type
- ALWAYS use `Content-Type: application/vnd.api+json`
- ALWAYS use `Accept: application/vnd.api+json`
- NEVER add parameters to media type without `ext`/`profile`

### Resource Objects
- ALWAYS use **string** for `id` (even if UUID)
- ALWAYS use **lowercase kebab-case** for `type`
- NEVER put `id` or `type` inside `attributes`
- NEVER include foreign keys in `attributes` - use `relationships`

### Relationships
- ALWAYS include at least one of: `links`, `data`, or `meta`
- ALWAYS use resource linkage format: `{"type": "...", "id": "..."}`
- NEVER use raw IDs in relationships - always use linkage objects

### Error Objects
- ALWAYS return errors as array: `{"errors": [...]}`
- ALWAYS include `status` as **string** (e.g., `"400"`, not `400`)
- ALWAYS include `source.pointer` for field-specific errors

---
## HTTP Status Codes (Mandatory)

| Operation | Success | Async | Conflict | Not Found | Forbidden | Bad Request |
|-----------|---------|-------|----------|-----------|-----------|-------------|
| **GET** | `200` | - | - | `404` | `403` | `400` |
| **POST** | `201` | `202` | `409` | `404` | `403` | `400` |
| **PATCH** | `200` | `202` | `409` | `404` | `403` | `400` |
| **DELETE** | `200`/`204` | `202` | - | `404` | `403` | - |

### When to Use Each

| Code | Use When |
|------|----------|
| `200 OK` | Successful GET, PATCH with response body, DELETE with response |
| `201 Created` | POST created resource (MUST include `Location` header) |
| `202 Accepted` | Async operation started (return task reference) |
| `204 No Content` | Successful DELETE, PATCH with no response body |
| `400 Bad Request` | Invalid query params, malformed request, unknown fields |
| `403 Forbidden` | Authentication ok but no permission, client-generated ID rejected |
| `404 Not Found` | Resource doesn't exist OR RLS hides it (never reveal which) |
| `409 Conflict` | Duplicate ID, type mismatch, relationship conflict |
| `415 Unsupported Media Type` | Wrong Content-Type header |

---
## Document Structure

### Success Response (Single)

```json
{
  "data": {
    "type": "providers",
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "attributes": {
      "alias": "Production",
      "connected": true
    },
    "relationships": {
      "tenant": {
        "data": {"type": "tenants", "id": "..."}
      }
    },
    "links": {
      "self": "/api/v1/providers/550e8400-..."
    }
  },
  "links": {
    "self": "/api/v1/providers/550e8400-..."
  }
}
```

### Success Response (List)

```json
{
  "data": [
    {"type": "providers", "id": "...", "attributes": {...}},
    {"type": "providers", "id": "...", "attributes": {...}}
  ],
  "links": {
    "self": "/api/v1/providers?page[number]=1",
    "first": "/api/v1/providers?page[number]=1",
    "last": "/api/v1/providers?page[number]=5",
    "prev": null,
    "next": "/api/v1/providers?page[number]=2"
  },
  "meta": {
    "pagination": {"count": 100, "pages": 5}
  }
}
```

### Error Response

```json
{
  "errors": [
    {
      "status": "400",
      "code": "invalid",
      "title": "Invalid attribute",
      "detail": "UID must be 12 digits for AWS accounts",
      "source": {"pointer": "/data/attributes/uid"}
    }
  ]
}
```

---
## Query Parameters

| Family | Format | Example |
|--------|--------|---------|
| `page` | `page[number]`, `page[size]` | `?page[number]=2&page[size]=25` |
| `filter` | `filter[field]`, `filter[field__op]` | `?filter[status]=FAIL` |
| `sort` | Comma-separated, `-` for desc | `?sort=-inserted_at,name` |
| `fields` | `fields[type]` | `?fields[providers]=id,alias` |
| `include` | Comma-separated paths | `?include=provider,scan.task` |

### Rules

- MUST return `400` for unsupported query parameters
- MUST return `400` for unsupported `include` paths
- MUST return `400` for unsupported `sort` fields
- MUST NOT include extra fields when `fields[type]` is specified
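As a rough illustration of the `family[key]` convention, the parameter names can be split like this. This is a hypothetical helper for review purposes only; in Prowler these parameters are handled by DRF and django-filter, not hand-rolled parsing.

```python
import re

# Illustrative only: the recognized families come from the table above,
# and anything else should surface as a 400 Bad Request upstream.
PARAM_RE = re.compile(
    r"^(?P<family>page|filter|sort|fields|include)(?:\[(?P<key>[^\]]+)\])?$"
)


def parse_jsonapi_param(name):
    """Split a query parameter name into (family, key), e.g. page[number]."""
    match = PARAM_RE.match(name)
    if match is None:
        raise ValueError(f"unsupported query parameter: {name}")
    return match.group("family"), match.group("key")
```

For example, `page[number]` splits into `("page", "number")`, while a bare `sort` yields `("sort", None)`.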
---

## Common Violations (AVOID)

| Violation | Wrong | Correct |
|-----------|-------|---------|
| ID as integer | `"id": 123` | `"id": "123"` |
| Type as camelCase | `"type": "providerGroup"` | `"type": "provider-groups"` |
| FK in attributes | `"tenant_id": "..."` | `"relationships": {"tenant": {...}}` |
| Errors not array | `{"error": "..."}` | `{"errors": [{"detail": "..."}]}` |
| Status as number | `"status": 400` | `"status": "400"` |
| Data + errors | `{"data": ..., "errors": ...}` | Only one or the other |
| Missing pointer | `{"detail": "Invalid"}` | `{"detail": "...", "source": {"pointer": "..."}}` |
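Several of these violations can be spot-checked mechanically on a decoded payload. The function below is an illustrative review helper, not part of the Prowler codebase, and it covers only a subset of the table above.

```python
def jsonapi_violations(document):
    """Return a list of violation names found in a decoded JSON:API document."""
    found = []
    if "data" in document and "errors" in document:
        found.append("data and errors in the same document")
    errors = document.get("errors")
    if errors is not None and not isinstance(errors, list):
        found.append("errors is not an array")
        errors = []
    for error in errors or []:
        if "status" in error and not isinstance(error["status"], str):
            found.append("status is not a string")
    resources = document.get("data")
    if isinstance(resources, dict):
        resources = [resources]
    for resource in resources or []:
        if not isinstance(resource.get("id", ""), str):
            found.append("id is not a string")
        for member in ("id", "type"):
            if member in resource.get("attributes", {}):
                found.append(f"{member} inside attributes")
    return found
```

An empty list means none of the checked violations were found; a non-empty list names each one for the review output.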
---

## Relationship Updates

### To-One Relationship

```http
PATCH /api/v1/providers/123/relationships/tenant
Content-Type: application/vnd.api+json

{"data": {"type": "tenants", "id": "456"}}
```

To clear: `{"data": null}`

### To-Many Relationship

| Operation | Method | Body |
|-----------|--------|------|
| Replace all | PATCH | `{"data": [{...}, {...}]}` |
| Add members | POST | `{"data": [{...}]}` |
| Remove members | DELETE | `{"data": [{...}]}` |

---

## Compound Documents (`include`)

When using `?include=provider`:

```json
{
  "data": {
    "type": "scans",
    "id": "...",
    "relationships": {
      "provider": {
        "data": {"type": "providers", "id": "prov-123"}
      }
    }
  },
  "included": [
    {
      "type": "providers",
      "id": "prov-123",
      "attributes": {"alias": "Production"}
    }
  ]
}
```

### Rules

- Every included resource MUST be reachable via relationship chain from primary data
- MUST NOT include orphan resources
- MUST NOT duplicate resources (same type+id)
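These rules can also be spot-checked on a decoded payload. The sketch below is illustrative only and checks direct (one-hop) linkage from primary data rather than full relationship chains through other included resources.

```python
def check_included(document):
    """Flag orphan and duplicate resources in a compound document's included array."""
    linked = set()
    primary = document.get("data") or []
    if isinstance(primary, dict):
        primary = [primary]
    for resource in primary:
        for rel in resource.get("relationships", {}).values():
            linkage = rel.get("data")
            if isinstance(linkage, dict):
                linkage = [linkage]
            for ref in linkage or []:
                linked.add((ref["type"], ref["id"]))
    seen = set()
    problems = []
    for resource in document.get("included", []):
        key = (resource["type"], resource["id"])
        if key in seen:
            problems.append(f"duplicate included resource {key}")
        seen.add(key)
        if key not in linked:
            problems.append(f"orphan included resource {key}")
    return problems
```

An included resource that no primary resource links to is reported as an orphan; a repeated `(type, id)` pair is reported as a duplicate.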
---

## Spec Reference

- **Full Specification**: https://jsonapi.org/format/
- **Implementation**: Use `django-drf` skill for DRF-specific patterns
- **Testing**: Use `prowler-test-api` skill for test patterns
- [Repository agent rules](../../AGENTS.md)
- [API component guidance](../../api/AGENTS.md)
- [DRF implementation skill](../django-drf/SKILL.md)
- [Prowler API skill](../prowler-api/SKILL.md)
@@ -1,12 +1,6 @@
---
name: postgresql-indexing
description: >
  PostgreSQL indexing best practices for Prowler: index design, partial indexes, partitioned table
  indexing, EXPLAIN ANALYZE validation, concurrent operations, monitoring, and maintenance.
  Trigger: When creating or modifying PostgreSQL indexes, analyzing query performance with EXPLAIN,
  debugging slow queries, reviewing index usage statistics, reindexing, dropping indexes, or working
  with partitioned table indexes. Also trigger when discussing index strategies, partial indexes,
  or index maintenance operations like VACUUM or ANALYZE.
description: "Trigger: When designing, validating, dropping, or repairing PostgreSQL indexes, including EXPLAIN analysis and partitioned-table indexing. Enforces Prowler-safe index design, concurrent operations, and partition maintenance rules."
license: Apache-2.0
metadata:
  author: prowler-cloud
@@ -20,373 +14,53 @@ metadata:
allowed-tools: Read, Grep, Glob, Bash
---

## When to use

- Creating or modifying PostgreSQL indexes
- Analyzing query plans with `EXPLAIN`
- Debugging slow queries or missing index usage
- Dropping, reindexing, or validating indexes
- Working with indexes on partitioned tables (findings, resource_finding_mappings)
- Running VACUUM or ANALYZE after index changes
## Index design

### Partial indexes: constant columns go in WHERE, not in the key

When a column has a fixed value for the query (e.g., `state = 'completed'`), put it in the `WHERE` clause of the index, not in the indexed columns. Otherwise the planner cannot exploit the ordering of the other columns.

```sql
-- Bad: state in the key wastes space and breaks ordering
CREATE INDEX idx_scans_tenant_state ON scans (tenant_id, state, inserted_at DESC);

-- Good: state as a filter, planner uses tenant_id + inserted_at ordering
CREATE INDEX idx_scans_tenant_ins_completed ON scans (tenant_id, inserted_at DESC)
WHERE state = 'completed';
```

### Column order matters

Put high-selectivity columns first (columns that filter out the most rows). For composite indexes, the leftmost column must appear in the query's WHERE clause for the index to be used.

## Validating index effectiveness

### Always EXPLAIN (ANALYZE, BUFFERS) after adding indexes

Never assume an index is being used. Run `EXPLAIN (ANALYZE, BUFFERS)` to confirm.

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM users
WHERE email = 'user@example.com';
```

Use [Postgres EXPLAIN Visualizer (pev)](https://tatiyants.com/pev/) to visualize query plans and identify bottlenecks.
### Force index usage for testing

The planner may choose a sequential scan on small datasets. Toggle `enable_seqscan = off` to confirm the index path works, then re-enable it.

```sql
SET enable_seqscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT ON (provider_id) provider_id
FROM scans
WHERE tenant_id = '95383b24-da01-44b5-a713-0d9920d554db'
  AND state = 'completed'
ORDER BY provider_id, inserted_at DESC;

SET enable_seqscan = on;  -- always re-enable after testing
```

This is for validation only. Never leave `enable_seqscan = off` in production.
## Over-indexing

Every extra index has three costs that compound:

1. **Write overhead.** Every INSERT and UPDATE must maintain all indexes. Extra indexes also kill HOT (Heap-Only-Tuple) updates, which normally skip index maintenance when unindexed columns change.

2. **Planning time.** The planner evaluates more execution paths per index. On simple OLTP queries, planning time can exceed execution time by 4x when index count is high.

3. **Lock contention (fastpath limit).** PostgreSQL uses a fast path for the first 16 locks per backend. After 16 relations (table + its indexes), it falls back to slower LWLock mechanisms. At high QPS (100+), this causes `LockManager` wait events.

Rules:
- Drop unused and redundant indexes regularly
- Be especially careful with partitioned tables (each partition multiplies the index count)
- Use prepared statements to reduce planning overhead when index count is high
## Finding redundant indexes
|
||||
|
||||
Two indexes are redundant when:
|
||||
- They have the same columns in the same order (duplicates)
|
||||
- One is a prefix of the other: index `(a)` is redundant to `(a, b)`, but NOT to `(b, a)`
|
||||
|
||||
Column order matters. For partial indexes, the WHERE clause must also match.
|
||||
|
||||
```sql
|
||||
-- Quick check: find indexes that share a leading column on the same table
|
||||
SELECT
|
||||
a.indrelid::regclass AS table_name,
|
||||
a.indexrelid::regclass AS index_a,
|
||||
b.indexrelid::regclass AS index_b,
|
||||
pg_size_pretty(pg_relation_size(a.indexrelid)) AS size_a,
|
||||
pg_size_pretty(pg_relation_size(b.indexrelid)) AS size_b
|
||||
FROM pg_index a
|
||||
JOIN pg_index b ON a.indrelid = b.indrelid
|
||||
AND a.indexrelid != b.indexrelid
|
||||
AND a.indkey::text = (
|
||||
SELECT string_agg(x::text, ' ')
|
||||
FROM unnest(b.indkey[:array_length(a.indkey, 1)]) AS x
|
||||
)
|
||||
WHERE NOT a.indisunique;
|
||||
```
|
||||
|
||||
Before dropping: verify on all workload nodes (primary + replicas), use `DROP INDEX CONCURRENTLY`, and monitor for plan regressions.
|
||||
|
||||
## Monitoring index usage
|
||||
|
||||
### Identify unused indexes
|
||||
|
||||
Query `pg_stat_all_indexes` to find indexes that are never or rarely scanned:
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
idxstat.schemaname AS schema_name,
|
||||
idxstat.relname AS table_name,
|
||||
idxstat.indexrelname AS index_name,
|
||||
idxstat.idx_scan AS index_scans_count,
|
||||
idxstat.last_idx_scan AS last_idx_scan_timestamp,
|
||||
pg_size_pretty(pg_relation_size(idxstat.indexrelid)) AS index_size
|
||||
FROM pg_stat_all_indexes AS idxstat
|
||||
JOIN pg_index i ON idxstat.indexrelid = i.indexrelid
|
||||
WHERE idxstat.schemaname NOT IN ('pg_catalog', 'information_schema', 'pg_toast')
|
||||
AND NOT i.indisunique
|
||||
ORDER BY idxstat.idx_scan ASC, idxstat.last_idx_scan ASC;
|
||||
```
|
||||
|
||||
Indexes with `idx_scan = 0` and no recent `last_idx_scan` are candidates for removal.
|
||||
|
||||
Before dropping, verify:
|
||||
- Stats haven't been reset recently (check `stats_reset` in `pg_stat_database`)
|
||||
- Stats cover at least 1 month of production traffic
|
||||
- All workload nodes (primary + replicas) have been checked
|
||||
- The index isn't used by a periodic job that runs infrequently
|
||||
|
||||
```sql
|
||||
-- Check when stats were last reset
|
||||
SELECT stats_reset, age(now(), stats_reset)
|
||||
FROM pg_stat_database
|
||||
WHERE datname = current_database();
|
||||
```
|
||||
|
||||
### Monitor index creation progress
|
||||
|
||||
Do not assume index creation succeeded. Use `pg_stat_progress_create_index` (Postgres 12+) to watch progress live:
|
||||
|
||||
```sql
|
||||
SELECT * FROM pg_stat_progress_create_index;
|
||||
```
|
||||
|
||||
In psql, use `\watch 5` to refresh every 5 seconds for a live dashboard view. `CREATE INDEX CONCURRENTLY` and `REINDEX CONCURRENTLY` have more phases than standard operations: monitor for blocking sessions and wait events.
|
||||
|
||||
### Validate index integrity
|
||||
|
||||
Check for invalid indexes regularly:
|
||||
|
||||
```sql
|
||||
SELECT c.relname AS index_name, i.indisvalid
|
||||
FROM pg_class c
|
||||
JOIN pg_index i ON i.indexrelid = c.oid
|
||||
WHERE i.indisvalid = false;
|
||||
```
|
||||
|
||||
Invalid indexes are ignored by the planner. They waste space and cause inconsistent query performance, especially on partitioned tables where some partitions may have valid indexes and others do not.
|
||||
|
||||
## Concurrent operations
|
||||
|
||||
### Always use CONCURRENTLY in production
|
||||
|
||||
Never create or drop indexes without `CONCURRENTLY` on live tables. Without it, the operation holds a lock that blocks all writes.
|
||||
|
||||
```sql
|
||||
-- Create
|
||||
CREATE INDEX CONCURRENTLY IF NOT EXISTS index_name ON table_name (column_name);
|
||||
|
||||
-- Drop
|
||||
DROP INDEX CONCURRENTLY IF EXISTS index_name;
|
||||
```
|
||||
|
||||
`DROP INDEX CONCURRENTLY` cannot run inside a transaction block.
|
||||
|
||||
### Always use IF NOT EXISTS / IF EXISTS
|
||||
|
||||
Makes scripts idempotent. Safe to re-run without errors from duplicate or missing indexes.
|
||||
|
||||
### Concurrent indexing can fail silently
|
||||
|
||||
`CREATE INDEX CONCURRENTLY` can fail without raising an error. The result is an invalid index that the planner ignores. This is particularly dangerous on partitioned tables: some partitions get valid indexes, others don't, causing inconsistent query performance.
|
||||
|
||||
After any concurrent index creation, always validate:
|
||||
|
||||
```sql
|
||||
SELECT c.relname, i.indisvalid
|
||||
FROM pg_class c
|
||||
JOIN pg_index i ON i.indexrelid = c.oid
|
||||
WHERE c.relname LIKE '%your_index_name%';
|
||||
```
|
||||
|
||||
## Reindexing invalid indexes
|
||||
|
||||
Rebuild invalid indexes without locking writes:
|
||||
|
||||
```sql
|
||||
REINDEX INDEX CONCURRENTLY index_name;
|
||||
```
|
||||
|
||||
### Understanding _ccnew and _ccold artifacts
|
||||
|
||||
When `CREATE INDEX CONCURRENTLY` or `REINDEX INDEX CONCURRENTLY` is interrupted, temporary indexes may remain:
|
||||
|
||||
| Suffix | Meaning | Action |
|
||||
|--------|---------|--------|
|
||||
| `_ccnew` | New index being built, incomplete | Drop it and retry `REINDEX CONCURRENTLY` |
|
||||
| `_ccold` | Old index being replaced, rebuild succeeded | Safe to drop |
|
||||
|
||||
```sql
|
||||
-- Example: both original and temp are invalid
|
||||
-- users_emails_2019 btree (col) INVALID
|
||||
-- users_emails_2019_ccnew btree (col) INVALID
|
||||
|
||||
-- Drop the failed new one, then retry
|
||||
DROP INDEX CONCURRENTLY IF EXISTS users_emails_2019_ccnew;
|
||||
REINDEX INDEX CONCURRENTLY users_emails_2019;
|
||||
```
|
||||
|
||||
These leftovers clutter the schema, confuse developers, and waste disk space. Clean them up.
|
||||
|
||||
## Indexing partitioned tables
|
||||
|
||||
### Do NOT use ALTER INDEX ATTACH PARTITION
|
||||
|
||||
As stated in PostgreSQL documentation, `ALTER INDEX ... ATTACH PARTITION` prevents dropping malfunctioning or non-performant indexes from individual partitions. An attached index cannot be dropped by itself and is automatically dropped if its parent index is dropped.
|
||||
|
||||
This removes the ability to manage indexes per-partition, which we need for:
|
||||
- Dropping broken indexes on specific partitions
|
||||
- Skipping indexes on old partitions to save storage
|
||||
- Rebuilding indexes on individual partitions without affecting others
|
||||
|
||||
### Correct approach: create on partitions, then on parent
|
||||
|
||||
1. Create the index on each child partition concurrently:
|
||||
|
||||
```sql
|
||||
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_child_partition
|
||||
ON child_partition (column_name);
|
||||
```
|
||||
|
||||
2. Create the index on the parent table (metadata-only, fast):
|
||||
|
||||
```sql
|
||||
CREATE INDEX IF NOT EXISTS idx_parent
|
||||
ON parent_table (column_name);
|
||||
```
|
||||
|
||||
PostgreSQL will automatically recognize partition-level indexes as part of the parent index definition when the index names and definitions match.
|
||||
|
||||
### Prioritize active partitions
|
||||
|
||||
For time-based partitions (findings uses monthly partitions):
|
||||
|
||||
- Create indexes on recent/current partitions where data is actively queried
|
||||
- Skip older partitions that are rarely accessed
|
||||
- The `all_partitions=False` default in `create_index_on_partitions` handles this automatically
|
||||
|
||||
## Index maintenance and bloat
|
||||
|
||||
Over time, B-tree indexes accumulate bloat from updates and deletes. VACUUM reclaims heap space but does NOT rebalance B-tree pages. Periodic reindexing is necessary for heavily updated tables.
|
||||
|
||||
### Detecting bloat
|
||||
|
||||
Indexes with estimated bloat above 50% are candidates for `REINDEX CONCURRENTLY`. Check bloat with tools like `pgstattuple` or bloat estimation queries.
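
One way to measure B-tree density directly, assuming the `pgstattuple` contrib extension is available and can be enabled (index name illustrative):

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- avg_leaf_density well below ~60% suggests significant bloat
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('idx_users_email');
```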

### Reducing bloat buildup

Three things slow degradation:
1. **Upgrade to PostgreSQL 14+** for B-tree deduplication and bottom-up deletion
2. **Maximize HOT updates** by not indexing frequently-updated columns
3. **Tune autovacuum** to run more aggressively on high-churn tables

### Rebuilding many indexes without deadlocks

If you rebuild two indexes on the same table in parallel, PostgreSQL detects a deadlock and kills one session. To rebuild many indexes across multiple sessions safely, assign all indexes for a given table to the same session:

```sql
\set NUMBER_OF_SESSIONS 10

SELECT
  format('%I.%I', n.nspname, c.relname) AS table_fqn,
  format('%I.%I', n.nspname, i.relname) AS index_fqn,
  mod(
    hashtext(format('%I.%I', n.nspname, c.relname)) & 2147483647,
    :NUMBER_OF_SESSIONS
  ) AS session_id
FROM pg_index idx
JOIN pg_class c ON idx.indrelid = c.oid
JOIN pg_class i ON idx.indexrelid = i.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE n.nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
ORDER BY table_fqn, index_fqn;
```

Then run each session's indexes in a separate `REINDEX INDEX CONCURRENTLY` call. Set `NUMBER_OF_SESSIONS` based on `max_parallel_maintenance_workers` and available I/O.

## Dropping indexes

### Post-drop maintenance

After dropping an index, run VACUUM and ANALYZE to reclaim space and update planner statistics:

```sql
-- Vacuum + analyze (can be heavy on large tables)
VACUUM (ANALYZE) your_table;

-- Lightweight alternative for huge tables: just update statistics
ANALYZE your_table;
```

## Commands

```sql
-- Validate query uses an index
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;

-- Check index creation progress
SELECT * FROM pg_stat_progress_create_index;

-- Find invalid indexes
SELECT c.relname, i.indisvalid
FROM pg_class c JOIN pg_index i ON i.indexrelid = c.oid
WHERE i.indisvalid = false;

-- Find unused indexes
SELECT relname, indexrelname, idx_scan, pg_size_pretty(pg_relation_size(indexrelid))
FROM pg_stat_all_indexes
WHERE schemaname = 'public' AND idx_scan = 0;

-- Create index safely
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_name ON table_name (columns);

-- Drop index safely
DROP INDEX CONCURRENTLY IF EXISTS idx_name;

-- Rebuild invalid index
REINDEX INDEX CONCURRENTLY idx_name;

-- Post-drop maintenance
VACUUM (ANALYZE) table_name;
```

## Context7 lookups

**Prerequisite:** Install Context7 MCP server for up-to-date documentation lookup.

| Library | Context7 ID | Use for |
|---------|-------------|---------|
| PostgreSQL | `/websites/postgresql_org_docs_current` | Index types, EXPLAIN, partitioned table indexing, REINDEX |

**Example queries:**
```
mcp_context7_query-docs(libraryId="/websites/postgresql_org_docs_current", query="CREATE INDEX CONCURRENTLY partitioned table")
mcp_context7_query-docs(libraryId="/websites/postgresql_org_docs_current", query="EXPLAIN ANALYZE BUFFERS query plan")
mcp_context7_query-docs(libraryId="/websites/postgresql_org_docs_current", query="partial index WHERE clause")
mcp_context7_query-docs(libraryId="/websites/postgresql_org_docs_current", query="REINDEX CONCURRENTLY invalid index")
mcp_context7_query-docs(libraryId="/websites/postgresql_org_docs_current", query="pg_stat_all_indexes monitoring")
```

> **Note:** Use `mcp_context7_resolve-library-id` first if you need to find the correct library ID.

## Resources

- **EXPLAIN Visualizer**: [pev](https://tatiyants.com/pev/)

## Activation Contract

Use this skill when query performance or schema work depends on PostgreSQL index design, validation, integrity, or partition-aware maintenance.

## Hard Rules

- Put constant filters in partial-index `WHERE` clauses, not in the indexed key.
- Validate every proposed index with `EXPLAIN (ANALYZE, BUFFERS)`; never assume the planner will use it.
- On live tables, create, drop, and rebuild indexes with `CONCURRENTLY` and idempotent guards where possible.
- After `CREATE INDEX CONCURRENTLY`, always verify `indisvalid`; concurrent builds can leave unusable invalid indexes behind.
- Do not use `ALTER INDEX ... ATTACH PARTITION`; Prowler needs independent partition index control.
- For partitioned tables, build child-partition indexes first, then create the parent definition.
- Watch over-indexing: extra indexes increase write cost, planning cost, and lock pressure.
- After dropping indexes, refresh planner stats with `VACUUM (ANALYZE)` or at least `ANALYZE`.
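
The partial-index rule in the list above can be sketched with the completed-scans query from earlier in this guide (index name and column set illustrative):

```sql
-- The constant filter lives in the WHERE clause, not in the indexed key
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_scans_completed_provider
  ON scans (tenant_id, provider_id, inserted_at DESC)
  WHERE state = 'completed';
```

The index stays small because rows in other states are never stored, and every query that repeats the `state = 'completed'` predicate can use it.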

## Decision Gates

| Question | Action |
|---|---|
| Query always filters on a fixed value like `state = 'completed'`? | Use a partial index with the constant in `WHERE`. |
| Planner still chooses a seq scan on a tiny dataset? | Toggle `enable_seqscan = off` only for validation, then turn it back on. |
| Creating or dropping on a live table? | Use `CONCURRENTLY`; avoid transaction wrappers that would invalidate the command. |
| Working on `findings` or another partitioned table? | Create matching indexes on children first, then register the parent index. |
| Index build succeeded syntactically but performance is still bad? | Check `pg_stat_progress_create_index`, `pg_index.indisvalid`, and redundant/unused index patterns. |
| Need to remove an index? | Confirm workload coverage, drop concurrently, then run post-drop maintenance. |

## Execution Steps

1. Start from the query shape: filters, ordering, distinct/grouping, and whether a partial predicate can shrink the index.
2. Choose column order by selectivity and leftmost-filter usage; avoid indexing frequently updated columns unless justified.
3. Run `EXPLAIN (ANALYZE, BUFFERS)` before and after, and use `enable_seqscan = off` only as a temporary proof that the index path is valid.
4. For production changes, use `CREATE INDEX CONCURRENTLY IF NOT EXISTS`, `DROP INDEX CONCURRENTLY IF EXISTS`, or `REINDEX INDEX CONCURRENTLY` as appropriate.
5. Validate the result with `pg_index.indisvalid`; if `_ccnew` or `_ccold` artifacts appear, clean them up deliberately before retrying.
6. On partitioned tables, create the same definition on child partitions first and only then add the parent metadata index; skip `ATTACH PARTITION`.
7. Review redundant, unused, and bloated indexes, then run `VACUUM (ANALYZE)` or `ANALYZE` after drops or major churn.

## Output Contract

- Describe the target query pattern and the chosen index shape.
- State whether the final design is full, partial, composite, or partitioned.
- Report the validation evidence used: `EXPLAIN`, `indisvalid`, progress view, unused-index stats, or bloat checks.
- If partitioned tables are involved, explicitly say child indexes were handled before the parent definition.
- Mention any operational risk: over-indexing, invalid concurrent build, deadlock risk during parallel reindex, or stats refresh required.

## References

- `api/src/backend/api/db_utils.py`
- `api/src/backend/api/partitions.py`
- `skills/django-migration-psql/SKILL.md`
- `api/src/backend/**/migrations/`

+50
-494
@@ -1,8 +1,6 @@

---
name: prowler-api
description: >
Prowler API patterns: RLS, RBAC, providers, Celery tasks.
Trigger: When working in api/ on models/serializers/viewsets/filters/tasks involving tenant isolation (RLS), RBAC, or provider lifecycle.
description: "Trigger: When working in `api/` on Prowler-specific models, serializers, viewsets, filters, Celery tasks, provider lifecycle, RBAC, or tenant isolation. Applies the repository’s RLS-first API contract."
license: Apache-2.0
metadata:
author: prowler-cloud
@@ -12,494 +10,52 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task
---

## When to Use

Use this skill for **Prowler-specific** patterns:
- Row-Level Security (RLS) / tenant isolation
- RBAC permissions and role checks
- Provider lifecycle and validation
- Celery tasks with tenant context
- Multi-database architecture (4-database setup)

For **generic DRF patterns** (ViewSets, Serializers, Filters, JSON:API), use the `django-drf` skill.

---

## Critical Rules

- ALWAYS use `rls_transaction(tenant_id)` when querying outside ViewSet context
- ALWAYS use `get_role()` before checking permissions (returns FIRST role only)
- ALWAYS use `@set_tenant` then `@handle_provider_deletion` decorator order
- ALWAYS use explicit through models for M2M relationships (required for RLS)
- NEVER access `Provider.objects` without RLS context in Celery tasks
- NEVER bypass RLS by using raw SQL or `connection.cursor()`
- NEVER use Django's default M2M - RLS requires through models with `tenant_id`

> **Note**: `rls_transaction()` accepts both UUID objects and strings - it converts internally via `str(value)`.

---

## Architecture Overview

### 4-Database Architecture

| Database | Alias | Purpose | RLS |
|----------|-------|---------|-----|
| `default` | `prowler_user` | Standard API queries | **Yes** |
| `admin` | `admin` | Migrations, auth bypass | No |
| `replica` | `prowler_user` | Read-only queries | **Yes** |
| `admin_replica` | `admin` | Admin read replica | No |

```python
# When to use admin (bypasses RLS)
from api.db_router import MainRouter
User.objects.using(MainRouter.admin_db).get(id=user_id)  # Auth lookups

# Standard queries use default (RLS enforced)
Provider.objects.filter(connected=True)  # Requires rls_transaction context
```

### RLS Transaction Flow

```
Request → Authentication → BaseRLSViewSet.initial()
    │
    ├─ Extract tenant_id from JWT
    ├─ SET api.tenant_id = 'uuid' (PostgreSQL)
    └─ All queries now tenant-scoped
```

---

## Implementation Checklist

When implementing Prowler-specific API features:

| # | Pattern | Reference | Key Points |
|---|---------|-----------|------------|
| 1 | **RLS Models** | `api/rls.py` | Inherit `RowLevelSecurityProtectedModel`, add constraint |
| 2 | **RLS Transactions** | `api/db_utils.py` | Use `rls_transaction(tenant_id)` context manager |
| 3 | **RBAC Permissions** | `api/rbac/permissions.py` | `get_role()`, `get_providers()`, `Permissions` enum |
| 4 | **Provider Validation** | `api/models.py` | `validate_<provider>_uid()` methods on `Provider` model |
| 5 | **Celery Tasks** | `tasks/tasks.py`, `api/decorators.py`, `config/celery.py` | Task definitions, decorators (`@set_tenant`, `@handle_provider_deletion`), `RLSTask` base |
| 6 | **RLS Serializers** | `api/v1/serializers.py` | Inherit `RLSSerializer` to auto-inject `tenant_id` |
| 7 | **Through Models** | `api/models.py` | ALL M2M must use explicit through with `tenant_id` |

> **Full file paths**: See [references/file-locations.md](references/file-locations.md)

---

## Decision Trees

### Which Base Model?
```
Tenant-scoped data       → RowLevelSecurityProtectedModel
Global/shared data       → models.Model + BaseSecurityConstraint (rare)
Partitioned time-series  → PostgresPartitionedModel + RowLevelSecurityProtectedModel
Soft-deletable           → Add is_deleted + ActiveProviderManager
```

### Which Manager?
```
Normal queries           → Model.objects (excludes deleted)
Include deleted records  → Model.all_objects
Celery task context      → Must use rls_transaction() first
```

### Which Database?
```
Standard API queries   → default (automatic via ViewSet)
Read-only operations   → replica (automatic for GET in BaseRLSViewSet)
Auth/admin operations  → MainRouter.admin_db
Cross-tenant lookups   → MainRouter.admin_db (use sparingly!)
```

### Celery Task Decorator Order?
```
@shared_task(base=RLSTask, name="...", queue="...")
@set_tenant                # First: sets tenant context
@handle_provider_deletion  # Second: handles deleted providers
def my_task(tenant_id, provider_id):
    pass
```

---

## RLS Model Pattern

```python
from uuid import uuid4

from django.db import models

from api.rls import RowLevelSecurityProtectedModel, RowLevelSecurityConstraint


class MyModel(RowLevelSecurityProtectedModel):
    # tenant FK inherited from parent
    id = models.UUIDField(primary_key=True, default=uuid4, editable=False)
    name = models.CharField(max_length=255)
    inserted_at = models.DateTimeField(auto_now_add=True, editable=False)
    updated_at = models.DateTimeField(auto_now=True, editable=False)

    class Meta(RowLevelSecurityProtectedModel.Meta):
        db_table = "my_models"
        constraints = [
            RowLevelSecurityConstraint(
                field="tenant_id",
                name="rls_on_%(class)s",
                statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
            ),
        ]

    class JSONAPIMeta:
        resource_name = "my-models"
```

### M2M Relationships (MUST use through models)

```python
class Resource(RowLevelSecurityProtectedModel):
    tags = models.ManyToManyField(
        ResourceTag,
        through="ResourceTagMapping",  # REQUIRED for RLS
    )


class ResourceTagMapping(RowLevelSecurityProtectedModel):
    # Through model MUST have tenant_id for RLS
    resource = models.ForeignKey(Resource, on_delete=models.CASCADE)
    tag = models.ForeignKey(ResourceTag, on_delete=models.CASCADE)

    class Meta:
        constraints = [
            RowLevelSecurityConstraint(
                field="tenant_id",
                name="rls_on_%(class)s",
                statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
            ),
        ]
```

---

## Async Task Response Pattern (202 Accepted)

For long-running operations, return 202 with a task reference:

```python
@action(detail=True, methods=["post"], url_name="connection")
def connection(self, request, pk=None):
    with transaction.atomic():
        task = check_provider_connection_task.delay(
            provider_id=pk, tenant_id=self.request.tenant_id
        )
    prowler_task = Task.objects.get(id=task.id)
    serializer = TaskSerializer(prowler_task)
    return Response(
        data=serializer.data,
        status=status.HTTP_202_ACCEPTED,
        headers={"Content-Location": reverse("task-detail", kwargs={"pk": prowler_task.id})},
    )
```

---

## Providers (11 Supported)

| Provider | UID Format | Example |
|----------|-----------|---------|
| AWS | 12 digits | `123456789012` |
| Azure | UUID v4 | `a1b2c3d4-e5f6-...` |
| GCP | 6-30 chars, lowercase, letter start | `my-gcp-project` |
| M365 | Valid domain | `contoso.onmicrosoft.com` |
| Kubernetes | 2-251 chars | `arn:aws:eks:...` |
| GitHub | 1-39 chars | `my-org` |
| IaC | Git URL | `https://github.com/user/repo.git` |
| Oracle Cloud | OCID format | `ocid1.tenancy.oc1..` |
| MongoDB Atlas | 24-char hex | `507f1f77bcf86cd799439011` |
| Alibaba Cloud | 16 digits | `1234567890123456` |

**Adding a new provider**: add to the `ProviderChoices` enum and create a `validate_<provider>_uid()` staticmethod.

---

## RBAC Permissions

| Permission | Controls |
|------------|----------|
| `MANAGE_USERS` | User CRUD, role assignments |
| `MANAGE_ACCOUNT` | Tenant settings |
| `MANAGE_BILLING` | Billing/subscription |
| `MANAGE_PROVIDERS` | Provider CRUD |
| `MANAGE_INTEGRATIONS` | Integration config |
| `MANAGE_SCANS` | Scan execution |
| `UNLIMITED_VISIBILITY` | See all providers (bypasses provider_groups) |

### RBAC Visibility Pattern

```python
def get_queryset(self):
    user_role = get_role(self.request.user)
    if user_role.unlimited_visibility:
        return Model.objects.filter(tenant_id=self.request.tenant_id)
    else:
        # Filter by provider_groups assigned to role
        return Model.objects.filter(provider__in=get_providers(user_role))
```

---

## Celery Queues

| Queue | Purpose |
|-------|---------|
| `scans` | Prowler scan execution |
| `overview` | Dashboard aggregations (severity, attack surface) |
| `compliance` | Compliance report generation |
| `integrations` | External integrations (Jira, S3, Security Hub) |
| `deletion` | Provider/tenant deletion (async) |
| `backfill` | Historical data backfill operations |
| `scan-reports` | Output generation (CSV, JSON, HTML, PDF) |

---

## Task Composition (Canvas)

Use Celery's Canvas primitives for complex workflows:

| Primitive | Use For |
|-----------|---------|
| `chain()` | Sequential execution: A → B → C |
| `group()` | Parallel execution: A, B, C simultaneously |
| Combined | Chain with nested groups for complex workflows |

> **Note:** Use `.si()` (immutable signature) to prevent result passing. Use `.s()` if you need to pass results.

> **Examples:** See [assets/celery_patterns.py](assets/celery_patterns.py) for chain, group, and combined patterns.

---

## Beat Scheduling (Periodic Tasks)

| Operation | Key Points |
|-----------|------------|
| **Create schedule** | `IntervalSchedule.objects.get_or_create(every=24, period=HOURS)` |
| **Create periodic task** | Use task name (not function), `kwargs=json.dumps(...)` |
| **Delete scheduled task** | `PeriodicTask.objects.filter(name=...).delete()` |
| **Avoid race conditions** | Use `countdown=5` to wait for DB commit |

> **Examples:** See [assets/celery_patterns.py](assets/celery_patterns.py) for the `schedule_provider_scan` pattern.

---

## Advanced Task Patterns

### `@set_tenant` Behavior

| Mode | `tenant_id` in kwargs | `tenant_id` passed to function |
|------|----------------------|-------------------------------|
| `@set_tenant` (default) | Popped (removed) | NO - function doesn't receive it |
| `@set_tenant(keep_tenant=True)` | Read but kept | YES - function receives it |

### Key Patterns

| Pattern | Description |
|---------|-------------|
| `bind=True` | Access `self.request.id`, `self.request.retries` |
| `get_task_logger(__name__)` | Proper logging in Celery tasks |
| `SoftTimeLimitExceeded` | Catch to save progress before hard kill |
| `countdown=30` | Defer execution by N seconds |
| `eta=datetime(...)` | Execute at specific time |

> **Examples:** See [assets/celery_patterns.py](assets/celery_patterns.py) for all advanced patterns.

---

## Celery Configuration

| Setting | Value | Purpose |
|---------|-------|---------|
| `BROKER_VISIBILITY_TIMEOUT` | `86400` (24h) | Prevent re-queue for long tasks |
| `CELERY_RESULT_BACKEND` | `django-db` | Store results in PostgreSQL |
| `CELERY_TASK_TRACK_STARTED` | `True` | Track when tasks start |
| `soft_time_limit` | Task-specific | Raises `SoftTimeLimitExceeded` |
| `time_limit` | Task-specific | Hard kill (SIGKILL) |

> **Full config:** See [assets/celery_patterns.py](assets/celery_patterns.py) and actual files at `config/celery.py`, `config/settings/celery.py`.

---

## UUIDv7 for Partitioned Tables

`Finding` and `ResourceFindingMapping` use UUIDv7 for time-based partitioning:

```python
from uuid6 import uuid7
from api.uuid_utils import uuid7_start, uuid7_end, datetime_to_uuid7

# Partition-aware filtering
start = uuid7_start(datetime_to_uuid7(date_from))
end = uuid7_end(datetime_to_uuid7(date_to), settings.FINDINGS_TABLE_PARTITION_MONTHS)
queryset.filter(id__gte=start, id__lt=end)
```

**Why UUIDv7?** Time-ordered UUIDs enable PostgreSQL to prune partitions during range queries.
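
The mechanics can be sketched in plain Python. This is an illustrative stand-in for the repo's `uuid7_start`/`datetime_to_uuid7` helpers, not their actual implementation; it also ignores the UUID version/variant bits that real UUIDv7 values carry:

```python
import datetime
import uuid


def uuid7_timestamp(u: uuid.UUID) -> datetime.datetime:
    # A UUIDv7 encodes milliseconds since the Unix epoch in its top 48 bits,
    # so sorting by UUID also sorts by creation time
    ms = u.int >> 80
    return datetime.datetime.fromtimestamp(ms / 1000, tz=datetime.timezone.utc)


def uuid7_lower_bound(dt: datetime.datetime) -> uuid.UUID:
    # Smallest UUID whose embedded timestamp is >= dt: timestamp bits set,
    # all lower bits zeroed. Filtering id >= this bound skips older partitions.
    ms = int(dt.timestamp() * 1000)
    return uuid.UUID(int=ms << 80)


start = datetime.datetime(2024, 6, 1, tzinfo=datetime.timezone.utc)
bound = uuid7_lower_bound(start)
assert uuid7_timestamp(bound) == start  # the timestamp round-trips
```

Because partitions are ranged on `id`, a `id >= bound` filter lets PostgreSQL prove that partitions holding only older UUIDs cannot match, and skip them entirely.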

---

## Batch Operations with RLS

```python
from api.db_utils import batch_delete, create_objects_in_batches, update_objects_in_batches

# Delete in batches (RLS-aware)
batch_delete(tenant_id, queryset, batch_size=1000)

# Bulk create with RLS
create_objects_in_batches(tenant_id, Finding, objects, batch_size=500)

# Bulk update with RLS
update_objects_in_batches(tenant_id, Finding, objects, fields=["status"], batch_size=500)
```

---

## Security Patterns

> **Full examples**: See [assets/security_patterns.py](assets/security_patterns.py)

### Tenant Isolation Summary

| Pattern | Rule |
|---------|------|
| **RLS in ViewSets** | Automatic via `BaseRLSViewSet` - tenant_id from JWT |
| **RLS in Celery** | MUST use `@set_tenant` + `rls_transaction(tenant_id)` |
| **Cross-tenant validation** | Defense-in-depth: verify `obj.tenant_id == request.tenant_id` |
| **Never trust user input** | Use `request.tenant_id` from JWT, never `request.data.get("tenant_id")` |
| **Admin DB bypass** | Only for cross-tenant admin ops - exposes ALL tenants' data |

### Celery Task Security Summary

| Pattern | Rule |
|---------|------|
| **Named tasks only** | NEVER use dynamic task names from user input |
| **Validate arguments** | Check UUID format before database queries |
| **Safe queuing** | Use `transaction.on_commit()` to enqueue AFTER commit |
| **Modern retries** | Use `autoretry_for`, `retry_backoff`, `retry_jitter` |
| **Time limits** | Set `soft_time_limit` and `time_limit` to prevent hung tasks |
| **Idempotency** | Use `update_or_create` or idempotency keys |
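
The argument-validation rule can be enforced with a small guard at the top of each task body (helper name is illustrative, not an existing repo function):

```python
import uuid


def require_uuid(value) -> uuid.UUID:
    # Reject malformed IDs before they reach any database query
    try:
        return uuid.UUID(str(value))
    except ValueError:
        raise ValueError(f"invalid UUID task argument: {value!r}")
```

Called first thing inside a task, this turns a poisoned message into a fast, loggable failure instead of a malformed query.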
|
||||
|
||||
### Quick Reference
|
||||
|
||||
```python
|
||||
# Safe task queuing - task only enqueued after transaction commits
|
||||
with transaction.atomic():
|
||||
provider = Provider.objects.create(**data)
|
||||
transaction.on_commit(
|
||||
lambda: verify_provider_connection.delay(
|
||||
tenant_id=str(request.tenant_id),
|
||||
provider_id=str(provider.id)
|
||||
)
|
||||
)
|
||||
|
||||
# Modern retry pattern
|
||||
@shared_task(
|
||||
base=RLSTask,
|
||||
bind=True,
|
||||
autoretry_for=(ConnectionError, TimeoutError, OperationalError),
|
||||
retry_backoff=True,
|
||||
retry_backoff_max=600,
|
||||
retry_jitter=True,
|
||||
max_retries=5,
|
||||
soft_time_limit=300,
|
||||
time_limit=360,
|
||||
)
|
||||
@set_tenant
|
||||
def sync_provider_data(self, tenant_id, provider_id):
|
||||
with rls_transaction(tenant_id):
|
||||
# ... task logic
|
||||
pass
|
||||
|
||||
# Idempotent task - safe to retry
|
||||
@shared_task(base=RLSTask, acks_late=True)
|
||||
@set_tenant
|
||||
def process_finding(tenant_id, finding_uid, data):
|
||||
with rls_transaction(tenant_id):
|
||||
Finding.objects.update_or_create(uid=finding_uid, defaults=data)
|
||||
```

---

## Production Deployment Checklist

> **Full settings**: See [references/production-settings.md](references/production-settings.md)

Run before every production deployment:

```bash
cd api && poetry run python src/backend/manage.py check --deploy
```

### Critical Settings

| Setting | Production Value | Risk if Wrong |
|---------|-----------------|---------------|
| `DEBUG` | `False` | Exposes stack traces, settings, SQL queries |
| `SECRET_KEY` | Env var, rotated | Session hijacking, CSRF bypass |
| `ALLOWED_HOSTS` | Explicit list | Host header attacks |
| `SECURE_SSL_REDIRECT` | `True` | Credentials sent over HTTP |
| `SESSION_COOKIE_SECURE` | `True` | Session cookies over HTTP |
| `CSRF_COOKIE_SECURE` | `True` | CSRF tokens over HTTP |
| `SECURE_HSTS_SECONDS` | `31536000` (1 year) | Downgrade attacks |
| `CONN_MAX_AGE` | `60` or higher | Connection pool exhaustion |
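A hedged sketch of wiring the critical settings from environment variables with fail-safe defaults (the env var names are illustrative; check the repo's actual settings module before copying):

```python
import os

def env_bool(name: str, default: bool) -> bool:
    """Parse a boolean env var, treating unset as the (safe) default."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")

# Fail safe: production-hardened defaults apply when the env var is absent.
DEBUG = env_bool("DJANGO_DEBUG", default=False)
SECURE_SSL_REDIRECT = env_bool("DJANGO_SECURE_SSL_REDIRECT", default=True)
ALLOWED_HOSTS = [h for h in os.environ.get("DJANGO_ALLOWED_HOSTS", "").split(",") if h]
CONN_MAX_AGE = int(os.environ.get("DJANGO_CONN_MAX_AGE", "60"))
```

The key design choice is that a missing variable yields the hardened value, so a forgotten deployment env file cannot silently re-enable `DEBUG`.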

---

## Commands

```bash
# Development
cd api && poetry run python src/backend/manage.py runserver
cd api && poetry run python src/backend/manage.py shell

# Celery
cd api && poetry run celery -A config.celery worker -l info -Q scans,overview
cd api && poetry run celery -A config.celery beat -l info

# Testing
cd api && poetry run pytest -x --tb=short

# Production checks
cd api && poetry run python src/backend/manage.py check --deploy
```

---

## Resources

### Local References

- **File Locations**: See [references/file-locations.md](references/file-locations.md)
- **Modeling Decisions**: See [references/modeling-decisions.md](references/modeling-decisions.md)
- **Configuration**: See [references/configuration.md](references/configuration.md)
- **Production Settings**: See [references/production-settings.md](references/production-settings.md)
- **Security Patterns**: See [assets/security_patterns.py](assets/security_patterns.py)

### Related Skills

- **Generic DRF Patterns**: Use the `django-drf` skill
- **API Testing**: Use the `prowler-test-api` skill

### Context7 MCP (Recommended)

**Prerequisite:** Install the Context7 MCP server for up-to-date documentation lookup.

When implementing or debugging Prowler-specific patterns, query these libraries via `mcp_context7_query-docs`:

| Library | Context7 ID | Use For |
|---------|-------------|---------|
| **Celery** | `/websites/celeryq_dev_en_stable` | Task patterns, queues, error handling |
| **django-celery-beat** | `/celery/django-celery-beat` | Periodic task scheduling |
| **Django** | `/websites/djangoproject_en_5_2` | Models, ORM, constraints, indexes |

**Example queries:**

```
mcp_context7_query-docs(libraryId="/websites/celeryq_dev_en_stable", query="shared_task decorator retry patterns")
mcp_context7_query-docs(libraryId="/celery/django-celery-beat", query="periodic task database scheduler")
mcp_context7_query-docs(libraryId="/websites/djangoproject_en_5_2", query="model constraints CheckConstraint UniqueConstraint")
```

> **Note:** Use `mcp_context7_resolve-library-id` first if you need to find the correct library ID.

## Activation Contract

Use this skill for Prowler API behavior that depends on tenant isolation, RBAC visibility, provider orchestration, or Celery execution semantics. Pair it with `django-drf` for generic DRF patterns and `jsonapi` for response-shape compliance.

## Hard Rules

- Always preserve RLS boundaries; queries outside request-scoped viewsets must run inside `rls_transaction(tenant_id)`.
- Always check permissions through the repo's RBAC helpers before assuming provider visibility.
- Always model tenant-scoped M2M relations with explicit through models carrying `tenant_id`.
- Always keep Celery tenant setup and provider-deletion handling in the established decorator/base-task flow.
- Never bypass RLS with raw SQL, unmanaged cursors, or admin connections unless the design explicitly requires cross-tenant access.
- Never invent generic DRF patterns here when `django-drf` already owns them.

## Decision Gates

| Question | Action |
|---|---|
| Is the behavior tenant-scoped data access? | Use RLS-safe models, serializers, and `rls_transaction()` where request context is absent. |
| Is the endpoint mostly generic DRF plumbing? | Load `django-drf` alongside this skill. |
| Is the concern response/media-type compliance? | Load `jsonapi` alongside this skill. |
| Is this async provider or scan orchestration? | Use Celery patterns with tenant-aware task setup. |
| Does the query need admin or cross-tenant access? | Escalate the reason explicitly and use the admin path sparingly. |

## Execution Steps

1. Classify the change: RLS model, RBAC/viewset flow, provider lifecycle, serializer boundary, or Celery workflow.
2. Identify where tenant context comes from and where it could be lost.
3. Choose the correct base abstractions for models, serializers, viewsets, and tasks.
4. Validate relationship modeling, provider visibility, and async handoff against existing Prowler patterns.
5. Cross-check the implementation with `django-drf` and `jsonapi` when endpoint behavior is involved.
6. Return only the repo-specific constraints that materially affect the change.

## Output Contract

- State the Prowler-specific API constraint that governs the task: RLS, RBAC, provider lifecycle, or Celery tenant handling.
- Name any companion skills required, especially `django-drf` and `jsonapi`.
- Call out the exact files or modules to inspect next.
- Mention any high-risk boundary where tenant isolation or provider visibility could break.

## References

- [Repository agent rules](../../AGENTS.md)
- [API component guidance](../../api/AGENTS.md)
- [API file locations](references/file-locations.md)
- [API modeling decisions](references/modeling-decisions.md)
- [API configuration](references/configuration.md)
- [Production settings notes](references/production-settings.md)
- [Celery patterns asset](assets/celery_patterns.py)
- [Security patterns asset](assets/security_patterns.py)
@@ -1,10 +1,6 @@
---
name: prowler-attack-paths-query
description: >
  Creates Prowler Attack Paths openCypher queries using the Cartography schema as the source of truth
  for node labels, properties, and relationships. Also covers Prowler-specific additions (Internet node,
  ProwlerFinding, internal isolation labels) and $provider_uid scoping for predefined queries.
description: "Trigger: When creating or updating Prowler Attack Paths openCypher queries. Governs provider scoping, Cartography-schema grounding, Prowler-specific labels, and openCypher-safe query patterns."
license: Apache-2.0
metadata:
  author: prowler-cloud
@@ -17,471 +13,55 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, Task
---

## Overview

Attack Paths queries are openCypher queries that analyze cloud infrastructure graphs (ingested via Cartography) to detect security risks like privilege escalation paths, network exposure, and misconfigurations.

Queries are written in **openCypher Version 9** for compatibility with both Neo4j and Amazon Neptune.

---

## Two query audiences

This skill covers two types of queries with different isolation mechanisms:

| | Predefined queries | Custom queries |
|---|---|---|
| **Where they live** | `api/src/backend/api/attack_paths/queries/{provider}.py` | User/LLM-supplied via the custom query API endpoint |
| **Provider isolation** | `AWSAccount {id: $provider_uid}` anchor + path connectivity | Automatic `_Provider_{uuid}` label injection via `cypher_sanitizer.py` |
| **What to write** | Chain every MATCH from the `aws` variable | Plain Cypher, no isolation boilerplate needed |
| **Internal labels** | Never use (`_ProviderResource`, `_Tenant_*`, `_Provider_*`) | Never use (injected automatically by the system) |

**For predefined queries**: every node must be reachable from the `AWSAccount` root via graph traversal. This is the isolation boundary.

**For custom queries**: write natural Cypher without isolation concerns. The query runner injects a `_Provider_{uuid}` label into every node pattern before execution, and a post-query filter catches edge cases.
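Conceptually, the injection rewrites each node pattern's label list. A deliberately simplified sketch of the idea (this is NOT the real `cypher_sanitizer.py`, which must also handle anonymous nodes, multiple labels, quoting, and many edge cases):

```python
import re

# Simplified illustration: append an isolation label to each `(var:Label` pattern.
NODE_PATTERN = re.compile(r"\((\w*):(\w+)")

def inject_provider_label(cypher: str, provider_label: str) -> str:
    """Rewrite `(n:Label` into `(n:Label:<provider_label>` throughout the query."""
    return NODE_PATTERN.sub(
        lambda m: f"({m.group(1)}:{m.group(2)}:{provider_label}", cypher
    )

query = "MATCH (n:AWSRole)--(m:AWSUser) RETURN n, m"
print(inject_provider_label(query, "_Provider_1234"))
# → MATCH (n:AWSRole:_Provider_1234)--(m:AWSUser:_Provider_1234) RETURN n, m
```

Because every node pattern gains the provider label, even an unanchored `MATCH` in a custom query can only touch that provider's subgraph.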

---

## Input Sources

Queries can be created from:

1. **pathfinding.cloud ID** (e.g., `ECS-001`, `GLUE-001`)
   - Reference: https://github.com/DataDog/pathfinding.cloud
   - The aggregated `paths.json` is too large for WebFetch. Use Bash:

   ```bash
   # Fetch a single path by ID
   curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
     | jq '.[] | select(.id == "ecs-002")'

   # List all path IDs and names
   curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
     | jq -r '.[] | "\(.id): \(.name)"'

   # Filter by service prefix
   curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
     | jq -r '.[] | select(.id | startswith("ecs")) | "\(.id): \(.name)"'
   ```

   If `jq` is not available, use `python3 -c "import json,sys; ..."` as a fallback.
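If `jq` is missing, a small Python script can do the same filtering once `paths.json` has been downloaded. The helper names below are illustrative, and the sample data is an in-memory stand-in for the real file:

```python
import json

def select_path(paths, path_id):
    """Return the first path entry matching the given ID, or None."""
    return next((p for p in paths if p.get("id") == path_id), None)

def list_paths(paths, prefix=""):
    """Return 'id: name' lines, optionally filtered by service prefix."""
    return [f"{p['id']}: {p['name']}" for p in paths if p["id"].startswith(prefix)]

# Tiny in-memory sample; in practice: paths = json.load(open("paths.json"))
sample = json.loads(
    '[{"id": "ecs-002", "name": "ECS task privesc"},'
    ' {"id": "glue-001", "name": "Glue job privesc"}]'
)
print(select_path(sample, "ecs-002")["name"])  # → ECS task privesc
print(list_paths(sample, prefix="ecs"))        # → ['ecs-002: ECS task privesc']
```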

2. **Natural language description** from the user

---

## Query Structure

### Provider scoping parameter

One parameter is injected automatically by the query runner:

| Parameter | Property it matches | Used on | Purpose |
| --------------- | ------------------- | ------------ | -------------------------------- |
| `$provider_uid` | `id` | `AWSAccount` | Scopes to a specific AWS account |

All other nodes are isolated by path connectivity from the `AWSAccount` anchor.

### Imports

All query files start with these imports:

```python
from api.attack_paths.queries.types import (
    AttackPathsQueryAttribution,
    AttackPathsQueryDefinition,
    AttackPathsQueryParameterDefinition,
)
from tasks.jobs.attack_paths.config import PROWLER_FINDING_LABEL
```

The `PROWLER_FINDING_LABEL` constant (value: `"ProwlerFinding"`) is used via f-string interpolation in all queries. Never hardcode the label string.

### Privilege escalation sub-patterns

There are four distinct privilege escalation patterns. Choose based on the attack type:

| Sub-pattern | Target | `path_target` shape | Example |
|---|---|---|---|
| Self-escalation | Principal's own policies | `(aws)--(target_policy:AWSPolicy)--(principal)` | IAM-001 |
| Lateral to user | Other IAM users | `(aws)--(target_user:AWSUser)` | IAM-002 |
| Assume-role lateral | Assumable roles | `(aws)--(target_role:AWSRole)<-[:STS_ASSUMEROLE_ALLOW]-(principal)` | IAM-014 |
| PassRole + service | Service-trusting roles | `(aws)--(target_role:AWSRole)-[:TRUSTS_AWS_PRINCIPAL]->(...)` | EC2-001 |

#### Self-escalation (e.g., IAM-001)

The principal modifies resources attached to itself. `path_target` loops back to `principal`:

```python
AWS_{QUERY_NAME} = AttackPathsQueryDefinition(
    id="aws-{kebab-case-name}",
    name="{Human-friendly label} ({REFERENCE_ID})",
    short_description="{Brief explanation, no technical permissions.}",
    description="{Detailed description of the attack vector and impact.}",
    attribution=AttackPathsQueryAttribution(
        text="pathfinding.cloud - {REFERENCE_ID} - {permission}",
        link="https://pathfinding.cloud/paths/{reference_id_lowercase}",
    ),
    provider="aws",
    cypher=f"""
    // Find principals with {permission}
    MATCH path_principal = (aws:AWSAccount {{id: $provider_uid}})--(principal:AWSPrincipal)--(policy:AWSPolicy)--(stmt:AWSPolicyStatement)
    WHERE stmt.effect = 'Allow'
      AND any(action IN stmt.action WHERE
        toLower(action) = '{permission_lowercase}'
        OR toLower(action) = '{service}:*'
        OR action = '*'
      )

    // Find target resources attached to the same principal
    MATCH path_target = (aws)--(target_policy:AWSPolicy)--(principal)
    WHERE target_policy.arn CONTAINS $provider_uid
      AND any(resource IN stmt.resource WHERE
        resource = '*'
        OR target_policy.arn CONTAINS resource
      )

    WITH collect(path_principal) + collect(path_target) AS paths
    UNWIND paths AS p
    UNWIND nodes(p) AS n

    WITH paths, collect(DISTINCT n) AS unique_nodes
    UNWIND unique_nodes AS n
    OPTIONAL MATCH (n)-[pfr]-(pf:{PROWLER_FINDING_LABEL} {{status: 'FAIL'}})

    RETURN paths, collect(DISTINCT pf) as dpf, collect(DISTINCT pfr) as dpfr
    """,
    parameters=[],
)
```

#### Other sub-pattern `path_target` shapes

The other three sub-patterns share the same `path_principal`, deduplication tail, and RETURN as self-escalation. Only the `path_target` MATCH differs:

```cypher
// Lateral to user (e.g., IAM-002) - targets other IAM users
MATCH path_target = (aws)--(target_user:AWSUser)
WHERE any(resource IN stmt.resource WHERE resource = '*' OR target_user.arn CONTAINS resource OR resource CONTAINS target_user.name)

// Assume-role lateral (e.g., IAM-014) - targets roles the principal can assume
MATCH path_target = (aws)--(target_role:AWSRole)<-[:STS_ASSUMEROLE_ALLOW]-(principal)
WHERE any(resource IN stmt.resource WHERE resource = '*' OR target_role.arn CONTAINS resource OR resource CONTAINS target_role.name)

// PassRole + service (e.g., EC2-001) - targets roles trusting a service
MATCH path_target = (aws)--(target_role:AWSRole)-[:TRUSTS_AWS_PRINCIPAL]->(:AWSPrincipal {arn: '{service}.amazonaws.com'})
WHERE any(resource IN stmt.resource WHERE resource = '*' OR target_role.arn CONTAINS resource OR resource CONTAINS target_role.name)
```

**Multi-permission**: PassRole queries require a second permission. Add `MATCH (principal)--(policy2:AWSPolicy)--(stmt2:AWSPolicyStatement)` with its own WHERE before `path_target`, then check BOTH `stmt.resource` AND `stmt2.resource` against the target. See IAM-015 or EC2-001 in `aws.py` for examples.

### Network exposure pattern

The Internet node is reached via `CAN_ACCESS` through the already-scoped resource, not via a standalone lookup:

```python
AWS_{QUERY_NAME} = AttackPathsQueryDefinition(
    id="aws-{kebab-case-name}",
    name="{Human-friendly label}",
    short_description="{Brief explanation.}",
    description="{Detailed description.}",
    provider="aws",
    cypher=f"""
    // Match exposed resources (MUST chain from `aws`)
    MATCH path = (aws:AWSAccount {{id: $provider_uid}})--(resource:EC2Instance)
    WHERE resource.exposed_internet = true

    // Internet node reached via path connectivity through the resource
    OPTIONAL MATCH (internet:Internet)-[can_access:CAN_ACCESS]->(resource)

    WITH collect(path) AS paths, head(collect(internet)) AS internet, collect(can_access) AS can_access
    UNWIND paths AS p
    UNWIND nodes(p) AS n

    WITH paths, internet, can_access, collect(DISTINCT n) AS unique_nodes
    UNWIND unique_nodes AS n
    OPTIONAL MATCH (n)-[pfr]-(pf:{PROWLER_FINDING_LABEL} {{status: 'FAIL'}})

    RETURN paths, collect(DISTINCT pf) as dpf, collect(DISTINCT pfr) as dpfr,
           internet, can_access
    """,
    parameters=[],
)
```

### Register in query list

Add to the `{PROVIDER}_QUERIES` list at the bottom of the file:

```python
AWS_QUERIES: list[AttackPathsQueryDefinition] = [
    # ... existing queries ...
    AWS_{NEW_QUERY_NAME},  # Add here
]
```

---

## Step-by-step creation process

### 1. Read the queries module

**FIRST**, read all files in the queries module to understand the structure, type definitions, registration, and existing style:

```
api/src/backend/api/attack_paths/queries/
├── __init__.py      # Module exports
├── types.py         # AttackPathsQueryDefinition, AttackPathsQueryParameterDefinition
├── registry.py      # Query registry logic
└── {provider}.py    # Provider-specific queries (e.g., aws.py)
```

**DO NOT** use generic templates. Match the exact style of existing queries in the file.

### 2. Fetch and consult the Cartography schema

**This is the most important step.** Every node label, property, and relationship in the query must exist in the Cartography schema for the pinned version. Do not guess or rely on memory.

Check `api/pyproject.toml` for the Cartography dependency, then fetch the schema:

```bash
grep cartography api/pyproject.toml
```

Build the schema URL (ALWAYS use the specific tag, not master/main):

```
# Git dependency (prowler-cloud/cartography@0.126.1):
https://raw.githubusercontent.com/prowler-cloud/cartography/refs/tags/0.126.1/docs/root/modules/{provider}/schema.md

# PyPI dependency (cartography = "^0.126.0"):
https://raw.githubusercontent.com/cartography-cncf/cartography/refs/tags/0.126.0/docs/root/modules/{provider}/schema.md
```

Read the schema to discover available node labels, properties, and relationships for the target resources. Internal labels (`_ProviderResource`, `_AWSResource`, `_Tenant_*`, `_Provider_*`) exist for isolation but should never appear in queries.

### 3. Create query definition

Use the appropriate pattern (privilege escalation or network exposure) with:

- **id**: `{provider}-{kebab-case-description}`
- **name**: Short, human-friendly label. For sourced queries, append the reference ID: `"EC2 Instance Launch with Privileged Role (EC2-001)"`.
- **short_description**: Brief explanation, no technical permissions.
- **description**: Full technical explanation. Plain text only.
- **provider**: Provider identifier (aws, azure, gcp, kubernetes, github)
- **cypher**: The openCypher query with proper escaping
- **parameters**: Optional list of user-provided parameters (`parameters=[]` if none)
- **attribution**: Optional `AttackPathsQueryAttribution(text, link)` for sourced queries. The `text` includes source, reference ID, and permissions. The `link` uses a lowercase ID. Omit for non-sourced queries.

### 4. Add query to provider list

Add the constant to the `{PROVIDER}_QUERIES` list.

---

## Query naming conventions

### Query ID

```
{provider}-{category}-{description}
```

Examples: `aws-ec2-privesc-passrole-iam`, `aws-ec2-instances-internet-exposed`

### Query constant name

```
{PROVIDER}_{CATEGORY}_{DESCRIPTION}
```

Examples: `AWS_EC2_PRIVESC_PASSROLE_IAM`, `AWS_EC2_INSTANCES_INTERNET_EXPOSED`
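The two conventions map onto each other mechanically; a small illustrative helper (not part of the repo) makes the relationship explicit:

```python
def query_id_to_constant(query_id: str) -> str:
    """Derive the uppercase constant name from a kebab-case query ID."""
    return query_id.replace("-", "_").upper()

print(query_id_to_constant("aws-ec2-privesc-passrole-iam"))
# → AWS_EC2_PRIVESC_PASSROLE_IAM
```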

---

## Query categories

| Category | Description | Example |
| -------------------- | ------------------------------ | ------------------------- |
| Basic Resource | List resources with properties | RDS instances, S3 buckets |
| Network Exposure | Internet-exposed resources | EC2 with public IPs |
| Privilege Escalation | IAM privilege escalation paths | PassRole + RunInstances |
| Data Access | Access to sensitive data | EC2 with S3 access |

---

## Common openCypher patterns

### Match account and principal

```cypher
MATCH path_principal = (aws:AWSAccount {id: $provider_uid})--(principal:AWSPrincipal)--(policy:AWSPolicy)--(stmt:AWSPolicyStatement)
```

### Check IAM action permissions

```cypher
WHERE stmt.effect = 'Allow'
  AND any(action IN stmt.action WHERE
    toLower(action) = 'iam:passrole'
    OR toLower(action) = 'iam:*'
    OR action = '*'
  )
```

### Find roles trusting a service

```cypher
MATCH path_target = (aws)--(target_role:AWSRole)-[:TRUSTS_AWS_PRINCIPAL]->(:AWSPrincipal {arn: 'ec2.amazonaws.com'})
```

### Find roles the principal can assume

Note the arrow direction - `STS_ASSUMEROLE_ALLOW` points from the role to the principal:

```cypher
MATCH path_target = (aws)--(target_role:AWSRole)<-[:STS_ASSUMEROLE_ALLOW]-(principal)
```

### Check resource scope

```cypher
WHERE any(resource IN stmt.resource WHERE
  resource = '*'
  OR target_role.arn CONTAINS resource
  OR resource CONTAINS target_role.name
)
```

### Internet node via path connectivity

The Internet node is reached through `CAN_ACCESS` relationships to already-scoped resources. No standalone lookup needed:

```cypher
OPTIONAL MATCH (internet:Internet)-[can_access:CAN_ACCESS]->(resource)
```

### Multi-label OR (match multiple resource types)

```cypher
MATCH path = (aws:AWSAccount {id: $provider_uid})-[r]-(x)-[q]-(y)
WHERE (x:EC2PrivateIp AND x.public_ip = $ip)
   OR (x:EC2Instance AND x.publicipaddress = $ip)
   OR (x:NetworkInterface AND x.public_ip = $ip)
   OR (x:ElasticIPAddress AND x.public_ip = $ip)
```

### Include Prowler findings

Deduplicate nodes before the ProwlerFinding lookup to avoid redundant OPTIONAL MATCH calls on nodes that appear in multiple paths:

```cypher
WITH collect(path_principal) + collect(path_target) AS paths
UNWIND paths AS p
UNWIND nodes(p) AS n

WITH paths, collect(DISTINCT n) AS unique_nodes
UNWIND unique_nodes AS n
OPTIONAL MATCH (n)-[pfr]-(pf:{PROWLER_FINDING_LABEL} {{status: 'FAIL'}})

RETURN paths, collect(DISTINCT pf) as dpf, collect(DISTINCT pfr) as dpfr
```

For network exposure queries, aggregate the internet node and relationship alongside paths:

```cypher
WITH collect(path) AS paths, head(collect(internet)) AS internet, collect(can_access) AS can_access
UNWIND paths AS p
UNWIND nodes(p) AS n

WITH paths, internet, can_access, collect(DISTINCT n) AS unique_nodes
UNWIND unique_nodes AS n
OPTIONAL MATCH (n)-[pfr]-(pf:{PROWLER_FINDING_LABEL} {{status: 'FAIL'}})

RETURN paths, collect(DISTINCT pf) as dpf, collect(DISTINCT pfr) as dpfr,
       internet, can_access
```

---

## Prowler-specific labels and relationships

These are added by the sync task, not part of the Cartography schema. For all other node labels, properties, and relationships, **always consult the Cartography schema** (see step 2 above).

| Label/Relationship | Description |
| ---------------------- | -------------------------------------------------- |
| `ProwlerFinding` | Finding node (`status`, `severity`, `check_id`) |
| `Internet` | Internet sentinel node |
| `CAN_ACCESS` | Internet-to-resource exposure (relationship) |
| `HAS_FINDING` | Resource-to-finding link (relationship) |
| `TRUSTS_AWS_PRINCIPAL` | Role trust relationship |
| `STS_ASSUMEROLE_ALLOW` | Can assume role (direction: role -> principal) |

---

## Parameters

For queries requiring user input:

```python
parameters=[
    AttackPathsQueryParameterDefinition(
        name="ip",
        label="IP address",
        # data_type defaults to "string", cast defaults to str.
        # For non-string params, set both: data_type="integer", cast=int
        description="Public IP address, e.g. 192.0.2.0.",
        placeholder="192.0.2.0",
    ),
],
```
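A hypothetical illustration of how a declared `cast` might be applied to raw user input before it reaches the query runner (the actual mechanics in the repo may differ):

```python
def apply_cast(raw_value: str, cast=str):
    """Coerce a raw query parameter using its declared cast callable."""
    return cast(raw_value)

print(apply_cast("192.0.2.0"))       # → 192.0.2.0 (default str cast)
print(apply_cast("8080", cast=int))  # → 8080
```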

---

## Best practices

1. **Chain all MATCHes from the root account node**: Every `MATCH` clause must connect to the `aws` variable (or another variable already bound to the account's subgraph). An unanchored `MATCH` would return nodes from all providers.

   ```cypher
   // WRONG: matches ALL AWSRoles across all providers
   MATCH (role:AWSRole) WHERE role.name = 'admin'

   // CORRECT: scoped to the specific account's subgraph
   MATCH (aws)--(role:AWSRole) WHERE role.name = 'admin'
   ```

   **Exception**: A second-permission MATCH like `MATCH (principal)--(policy2:AWSPolicy)--(stmt2:AWSPolicyStatement)` is safe because `principal` is already bound to the account's subgraph by the first MATCH. It does not need to chain from `aws` again.

2. **Include Prowler findings**: Always add `OPTIONAL MATCH (n)-[pfr]-(pf:{PROWLER_FINDING_LABEL} {{status: 'FAIL'}})` with `collect(DISTINCT pf)`.

3. **Comment the query purpose**: Add inline comments explaining each MATCH clause.

4. **Never use internal labels in queries**: `_ProviderResource`, `_AWSResource`, `_Tenant_*`, `_Provider_*` are for system isolation. They should never appear in predefined or custom query text.

5. **Internet node uses path connectivity**: Reach it via `OPTIONAL MATCH (internet:Internet)-[can_access:CAN_ACCESS]->(resource)` where `resource` is already scoped by the account anchor. No standalone lookup.

---

## openCypher compatibility

Queries must be written in **openCypher Version 9** for compatibility with both Neo4j and Amazon Neptune.

### Avoid these (not in openCypher spec)

| Feature | Use instead |
| -------------------------- | ------------------------------------------------------ |
| APOC procedures (`apoc.*`) | Real nodes and relationships in the graph |
| Neptune extensions | Standard openCypher |
| `reduce()` function | `UNWIND` + `collect()` |
| `FOREACH` clause | `WITH` + `UNWIND` + `SET` |
| Regex operator (`=~`) | `toLower()` + exact match, or `CONTAINS`/`STARTS WITH`. One legacy query uses `=~` - do not add new usages |
| `CALL () { UNION }` | Multi-label OR in WHERE (see patterns section) |
---

## Reference

- **pathfinding.cloud**: https://github.com/DataDog/pathfinding.cloud (use `curl | jq`, not WebFetch)
- **Cartography schema**: `https://raw.githubusercontent.com/{org}/cartography/refs/tags/{version}/docs/root/modules/{provider}/schema.md`
- **Neptune openCypher compliance**: https://docs.aws.amazon.com/neptune/latest/userguide/feature-opencypher-compliance.html
- **openCypher spec**: https://github.com/opencypher/openCypher
## Activation Contract
|
||||
|
||||
Use this skill when editing predefined Attack Paths queries in `api/src/backend/api/attack_paths/queries/` or when designing query logic that must remain compatible with Neptune and Neo4j.
|
||||
|
||||
## Hard Rules
|
||||
|
||||
- Write openCypher Version 9 only; no APOC, Neptune-only extensions, fresh regex usage, `reduce()`, or `CALL ... UNION` shortcuts.
|
||||
- For predefined queries, anchor on `MATCH (aws:AWSAccount {id: $provider_uid})...` and keep every additional `MATCH` chained from `aws` or a variable already proven to belong to that scoped subgraph.
|
||||
- For custom-query behavior, do not add manual isolation boilerplate; the sanitizer injects provider labels automatically.
|
||||
- Never use internal isolation labels in query text: `_ProviderResource`, `_AWSResource`, `_Tenant_*`, `_Provider_*`.
|
||||
- Read the pinned Cartography dependency from `api/pyproject.toml` and ground every non-Prowler label, property, and relationship in that schema before writing Cypher.
|
||||
- Use `PROWLER_FINDING_LABEL` via f-string interpolation; never hardcode `ProwlerFinding`.
|
||||
- Deduplicate path nodes before the `OPTIONAL MATCH` that loads Prowler findings.
|
||||
- Register every new predefined query in the provider query list.
|
||||
|
||||
## Decision Gates

| Question | Action |
|---|---|
| Is this a predefined repository query? | Manually anchor on `AWSAccount {id: $provider_uid}` and preserve path connectivity for every matched node. |
| Is this a custom query endpoint scenario? | Write plain Cypher and let the sanitizer inject provider isolation. |
| Is the attack path privilege escalation? | Pick the correct target pattern: self-escalation, lateral user, assumable role, or PassRole + service. |
| Is it a network exposure path? | Scope the resource first, then reach `(:Internet)-[:CAN_ACCESS]->(resource)` from that already-scoped node. |
| Need another permission check? | Add a second policy/statement match from the already-bound principal, not a new unscoped root match. |
| Unsure whether a label or property exists? | Stop guessing and verify against the pinned Cartography schema before proceeding. |

## Execution Steps

1. Read `types.py`, `registry.py`, and the provider query module to match existing constant names, registration style, and return shape.
2. Inspect `api/pyproject.toml` for the pinned Cartography source/version and verify every non-Prowler label, property, and relationship against that schema.
3. Choose the correct audience and pattern: predefined scoped query, custom sanitized query, privilege-escalation sub-pattern, or network-exposure pattern.
4. Build the query definition with stable naming: `{provider}-{category}-{description}` id, uppercase constant, plain-language descriptions, and attribution only when sourced.
5. Add inline comments, keep all matches scoped, use `PROWLER_FINDING_LABEL`, deduplicate nodes before `OPTIONAL MATCH`, and include `internet, can_access` only for network-exposure returns.
6. Add any parameter definitions with explicit typing when non-string input is required.
7. Register the constant in `{PROVIDER}_QUERIES` and re-check openCypher compatibility before finishing.
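
As a rough sketch of steps 4 through 7, a query definition and its registration might look like the following. The dataclass fields, the `AttackPathQuery` name, and the `AWS_QUERIES` list are assumptions standing in for the real definitions in `types.py` and the provider module; only the id format, the scoped anchor, and the `PROWLER_FINDING_LABEL` interpolation come from this skill:

```python
from dataclasses import dataclass

# Assumed constant; always interpolate it, never hardcode the label in Cypher text.
PROWLER_FINDING_LABEL = "ProwlerFinding"


@dataclass(frozen=True)
class AttackPathQuery:
    """Stand-in for the real query type in types.py."""

    id: str  # stable naming: {provider}-{category}-{description}
    description: str
    cypher: str


AWS_IAM_SELF_ESCALATION = AttackPathQuery(
    id="aws-privilege-escalation-self-admin",
    description="IAM users that can escalate their own privileges to administrator.",
    cypher=f"""
    // Scoped anchor first, then only chained matches.
    MATCH (aws:AWSAccount {{id: $provider_uid}})-[:RESOURCE]->(user:AWSUser)
    OPTIONAL MATCH (user)<-[:AFFECTS]-(finding:{PROWLER_FINDING_LABEL})
    RETURN user, finding
    """,
)

# Registration list; the {PROVIDER}_QUERIES name follows the skill's convention.
AWS_QUERIES = [AWS_IAM_SELF_ESCALATION]
```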
## Output Contract

- State whether the query is predefined or custom-query-oriented.
- Name the scoping anchor used and the attack pattern selected.
- Report which schema sources were verified locally (`api/pyproject.toml`, provider query module, pinned schema target/version).
- Mention whether the query includes Prowler finding enrichment, network exposure return values, or multi-permission logic.
- Confirm registration location and any parameters added.

## References

- `api/src/backend/api/attack_paths/queries/types.py`
- `api/src/backend/api/attack_paths/queries/registry.py`
- `api/src/backend/api/attack_paths/queries/{provider}.py`
- `api/src/backend/api/attack_paths/cypher_sanitizer.py`
- `api/src/backend/tasks/jobs/attack_paths/config.py`
- `api/pyproject.toml`
+62 -1033 (file diff suppressed because it is too large)

+49 -313
@@ -1,8 +1,6 @@
---
name: prowler-ui
description: >
  Prowler UI-specific patterns. For generic patterns, see: typescript, react-19, nextjs-15, tailwind-4.
  Trigger: When working inside ui/ on Prowler-specific conventions (shadcn vs HeroUI legacy, folder placement, actions/adapters, shared types/hooks/lib).
description: "Trigger: When working inside `ui/` on Prowler-specific app structure, folder placement, shared UI conventions, shadcn adoption, or display-layer patterns beyond generic React/Next.js guidance. Applies the repo’s UI architecture rules."
license: Apache-2.0
metadata:
  author: prowler-cloud
@@ -14,313 +12,51 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task
---
## Related Generic Skills

- `typescript` - Const types, flat interfaces
- `react-19` - No useMemo/useCallback, compiler
- `nextjs-15` - App Router, Server Actions
- `tailwind-4` - cn() utility, styling rules
- `zod-4` - Schema validation
- `zustand-5` - State management
- `ai-sdk-5` - Chat/AI features
- `playwright` - E2E testing (see also `prowler-test-ui`)

## Tech Stack (Versions)

```
Next.js 15.5.9 | React 19.2.2 | Tailwind 4.1.13 | shadcn/ui
Zod 4.1.11 | React Hook Form 7.62.0 | Zustand 5.0.8
NextAuth 5.0.0-beta.30 | Recharts 2.15.4
HeroUI 2.8.4 (LEGACY - do not add new components)
```

## CRITICAL: Component Library Rule

- **ALWAYS**: Use `shadcn/ui` + Tailwind (`components/shadcn/`)
- **NEVER**: Add new HeroUI components (`components/ui/` is legacy only)

## DECISION TREES

### Component Placement

```
New feature UI?          → shadcn/ui + Tailwind
Existing HeroUI feature? → Keep HeroUI (don't mix)
Used 1 feature?          → features/{feature}/components/
Used 2+ features?        → components/shared/
Needs state/hooks?       → "use client"
Server component?        → No directive needed
```

### Code Location

```
Server action      → actions/{feature}/{feature}.ts
Data transform     → actions/{feature}/{feature}.adapter.ts
Types (shared 2+)  → types/{domain}.ts
Types (local 1)    → {feature}/types.ts
Utils (shared 2+)  → lib/
Utils (local 1)    → {feature}/utils/
Hooks (shared 2+)  → hooks/
Hooks (local 1)    → {feature}/hooks.ts
shadcn components  → components/shadcn/
HeroUI components  → components/ui/ (LEGACY)
```

### Styling Decision

```
Tailwind class exists? → className
Dynamic value?         → style prop
Conditional styles?    → cn()
Static only?           → className (no cn())
Recharts/library?      → CHART_COLORS constant + var()
```

### Scope Rule (ABSOLUTE)

- Used 2+ places → `lib/` or `types/` or `hooks/` (components go in `components/{domain}/`)
- Used 1 place → keep local in feature directory
- **This determines ALL folder structure decisions**

## Project Structure

```
ui/
├── app/
│   ├── (auth)/              # Auth pages (login, signup)
│   └── (prowler)/           # Main app
│       ├── compliance/
│       ├── findings/
│       ├── providers/
│       ├── scans/
│       ├── services/
│       └── integrations/
├── components/
│   ├── shadcn/              # shadcn/ui (USE THIS)
│   ├── ui/                  # HeroUI (LEGACY)
│   ├── {domain}/            # Domain-specific (compliance, findings, providers, etc.)
│   ├── filters/             # Filter components
│   ├── graphs/              # Chart components
│   └── icons/               # Icon components
├── actions/                 # Server actions
├── types/                   # Shared types
├── hooks/                   # Shared hooks
├── lib/                     # Utilities
├── store/                   # Zustand state
├── tests/                   # Playwright E2E
└── styles/                  # Global CSS
```

## Recharts (Special Case)

For Recharts props that don't accept className:

```typescript
const CHART_COLORS = {
  primary: "var(--color-primary)",
  secondary: "var(--color-secondary)",
  text: "var(--color-text)",
  gridLine: "var(--color-border)",
};

// Only use var() for library props, NEVER in className
<XAxis tick={{ fill: CHART_COLORS.text }} />
<CartesianGrid stroke={CHART_COLORS.gridLine} />
```

## Form + Validation Pattern

```typescript
"use client";
import { useForm } from "react-hook-form";
import { zodResolver } from "@hookform/resolvers/zod";
import { z } from "zod";

const schema = z.object({
  email: z.email(), // Zod 4 syntax
  name: z.string().min(1),
});

type FormData = z.infer<typeof schema>;

export function MyForm() {
  const { register, handleSubmit, formState: { errors } } = useForm<FormData>({
    resolver: zodResolver(schema),
  });

  const onSubmit = async (data: FormData) => {
    await serverAction(data);
  };

  return (
    <form onSubmit={handleSubmit(onSubmit)}>
      <input {...register("email")} />
      {errors.email && <span>{errors.email.message}</span>}
      <button type="submit">Submit</button>
    </form>
  );
}
```

## Commands

```bash
# Development
cd ui && pnpm install
cd ui && pnpm run dev

# Code Quality
cd ui && pnpm run typecheck
cd ui && pnpm run lint:fix
cd ui && pnpm run format:write
cd ui && pnpm run healthcheck   # typecheck + lint

# Testing
cd ui && pnpm run test:e2e
cd ui && pnpm run test:e2e:ui
cd ui && pnpm run test:e2e:debug

# Build
cd ui && pnpm run build
cd ui && pnpm start
```
## Batch vs Instant Component API (REQUIRED)

When a component supports both **batch** (deferred, submit-based) and **instant** (immediate callback) behavior, model the coupling with a discriminated union — never as independent optionals. Coupled props must be all-or-nothing.

```typescript
// ❌ NEVER: Independent optionals — allows invalid half-states
interface FilterProps {
  onBatchApply?: (values: string[]) => void;
  onInstantChange?: (value: string) => void;
  isBatchMode?: boolean;
}

// ✅ ALWAYS: Discriminated union — one valid shape per mode
type BatchProps = {
  mode: "batch";
  onApply: (values: string[]) => void;
  onCancel: () => void;
};

type InstantProps = {
  mode: "instant";
  onChange: (value: string) => void;
  // onApply/onCancel are forbidden here via structural exclusion
  onApply?: never;
  onCancel?: never;
};

type FilterProps = BatchProps | InstantProps;
```

This makes invalid prop combinations a compile error, not a runtime surprise.

## Reuse Shared Display Utilities First (REQUIRED)

Before adding **local** display maps (labels, provider names, status strings, category formatters), search `ui/types/*` and `ui/lib/*` for existing helpers.

```typescript
// ✅ CHECK THESE FIRST before creating a new map:
// ui/lib/utils.ts        → general formatters
// ui/types/providers.ts  → provider display names, icons
// ui/types/findings.ts   → severity/status display maps
// ui/types/compliance.ts → category/group formatters

// ❌ NEVER add a local map that already exists:
const SEVERITY_LABELS: Record<string, string> = {
  critical: "Critical",
  high: "High",
  // ...duplicating an existing shared map
};

// ✅ Import and reuse instead:
import { severityLabel } from "@/types/findings";
```

If a helper doesn't exist and will be used in 2+ places, add it to `ui/lib/` or `ui/types/` and reuse it. Keep local only if used in exactly one place.

## Derived State Rule (REQUIRED)

Avoid `useState` + `useEffect` patterns that mirror props or searchParams — they create sync bugs and unnecessary re-renders. Derive values directly from the source of truth.

```typescript
// ❌ NEVER: Mirror props into state via effect
const [localFilter, setLocalFilter] = useState(filter);
useEffect(() => { setLocalFilter(filter); }, [filter]);

// ✅ ALWAYS: Derive directly
const localFilter = filter; // or compute inline
```

If local state is genuinely needed (e.g., optimistic UI, pending edits before submit), add a short comment:

```typescript
// Local state needed: user edits are buffered until "Apply" is clicked
const [pending, setPending] = useState(initialValues);
```

## Strict Key Typing for Label Maps (REQUIRED)

Avoid `Record<string, string>` when the key set is known. Use an explicit union type or a const-key object so typos are caught at compile time.

```typescript
// ❌ Loose — typos compile silently
const STATUS_LABELS: Record<string, string> = {
  actve: "Active", // typo, no error
};

// ✅ Tight — union key
type Status = "active" | "inactive" | "pending";
const STATUS_LABELS: Record<Status, string> = {
  active: "Active",
  inactive: "Inactive",
  pending: "Pending",
  // actve: "Active" ← compile error
};

// ✅ Also fine — const satisfies
const STATUS_LABELS = {
  active: "Active",
  inactive: "Inactive",
  pending: "Pending",
} as const satisfies Record<Status, string>;
```

## QA Checklist Before Commit

- [ ] `pnpm run typecheck` passes
- [ ] `pnpm run lint:fix` passes
- [ ] `pnpm run format:write` passes
- [ ] Relevant E2E tests pass
- [ ] All UI states handled (loading, error, empty)
- [ ] No secrets in code (use `.env.local`)
- [ ] Error messages sanitized (no stack traces to users)
- [ ] Server-side validation present (don't trust client)
- [ ] Accessibility: keyboard navigation, ARIA labels
- [ ] Mobile responsive (if applicable)

## Pre-Re-Review Checklist (Review Thread Hygiene)

Before requesting re-review from a reviewer:

- [ ] Every unresolved inline thread has been either fixed or explicitly answered with a rationale
- [ ] If you agreed with a comment: the change is committed and the commit hash is mentioned in the reply
- [ ] If you disagreed: the reply explains why with clear reasoning — do not leave threads silently open
- [ ] Re-request review only after all threads are in a clean state

## Migrations Reference

| From | To | Key Changes |
|------|-----|-------------|
| React 18 | 19.1 | Async components, React Compiler (no useMemo/useCallback) |
| Next.js 14 | 15.5 | Improved App Router, better streaming |
| NextUI | HeroUI 2.8.4 | Package rename only, same API |
| Zod 3 | 4 | `z.email()` not `z.string().email()`, `error` not `message` |
| AI SDK 4 | 5 | `@ai-sdk/react`, `sendMessage` not `handleSubmit`, `parts` not `content` |

## Resources

- **Documentation**: See [references/](references/) for links to local developer guide

## Activation Contract

Use this skill when the work depends on Prowler UI structure rather than generic framework syntax: component placement, action/adapter boundaries, shared-vs-local scope decisions, legacy HeroUI avoidance, or shared display utilities. Pair it with `react-19`, `nextjs-15`, `tailwind-4`, `typescript`, `zod-4`, or `zustand-5` when those implementation details matter.

## Hard Rules

- Always prefer `components/shadcn/` for new UI; do not introduce new HeroUI usage.
- Always apply the scope rule first: code reused in 2+ places becomes shared, otherwise keep it local.
- Always keep server actions, adapters, types, hooks, and utilities in their intended folders.
- Always derive state directly when possible; do not mirror props or search params into effect-driven local state without a real buffering reason.
- Always reuse shared label, formatter, and display helpers before adding local maps.
- Never encode invalid prop combinations with unrelated optional fields when a discriminated union can model the API correctly.

## Decision Gates

| Question | Action |
|---|---|
| Is this a new component? | Build with shadcn + Tailwind conventions. |
| Is logic reused across multiple features? | Promote it to `components/`, `types/`, `hooks/`, or `lib/` as appropriate. |
| Is it only used in one feature? | Keep it inside that feature boundary. |
| Is styling conditional or compositional? | Use `cn()`; use plain `className` for static classes. |
| Does a third-party prop reject Tailwind classes? | Use a constant or `style` value, not `var()` inside `className`. |
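
For the conditional-styling gate, a minimal sketch of the `cn()` pattern. The inline helper here is a simplified stand-in for the shared utility (the real one typically composes `clsx` with `tailwind-merge`), and the class names are illustrative:

```typescript
// Simplified stand-in for the shared cn() helper in ui/lib.
function cn(...classes: Array<string | false | null | undefined>): string {
  return classes.filter(Boolean).join(" ");
}

// Conditional styles go through cn(); static-only styles use className directly.
export function badgeClasses(active: boolean): string {
  return cn(
    "rounded px-2 py-1 text-sm", // static base classes
    active ? "bg-success" : "bg-muted", // conditional branch
  );
}
```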
## Execution Steps

1. Identify whether the change is component structure, state modeling, display formatting, or action/data flow.
2. Apply the scope rule to decide local versus shared placement.
3. Choose shadcn-first component patterns and keep legacy HeroUI isolated.
4. Check shared helpers in `ui/types`, `ui/lib`, and `ui/hooks` before adding duplicates.
5. Validate prop APIs, derived state, and styling decisions against the established UI rules.
6. Pull in generic framework skills only for the parts they specifically own.

## Output Contract

- State where the code should live in `ui/` and why.
- Call out the main UI rule applied: shadcn-first, scope rule, derived state, shared helper reuse, or discriminated unions.
- Mention any companion generic skills required.
- Flag any legacy HeroUI or state-sync risk that must be preserved or removed carefully.

## References

- [Repository agent rules](../../AGENTS.md)
- [UI component guidance](../../ui/AGENTS.md)
- [UI references](references/ui-docs.md)
- [TypeScript skill](../typescript/SKILL.md)
- [React 19 skill](../react-19/SKILL.md)
- [Next.js 15 skill](../nextjs-15/SKILL.md)
- [Tailwind 4 skill](../tailwind-4/SKILL.md)
+34 -43

@@ -1,8 +1,6 @@
---
name: prowler
description: >
  Main entry point for Prowler development - quick reference for all components.
  Trigger: General Prowler development questions, project overview, component navigation (NOT PR CI gates or GitHub Actions workflows).
description: "Trigger: When the task is general Prowler development, repository navigation, component selection, or project overview work outside PR CI workflow details. Routes the model to the right Prowler surface fast."
license: Apache-2.0
metadata:
  author: prowler-cloud
@@ -12,54 +10,47 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task
---
## Components

| Component | Stack | Location |
|-----------|-------|----------|
| SDK | Python 3.10+, Poetry | `prowler/` |
| API | Django 5.1, DRF, Celery | `api/` |
| UI | Next.js 15, React 19, Tailwind 4 | `ui/` |
| MCP | FastMCP 2.13.1 | `mcp_server/` |

## Quick Commands

```bash
# SDK
poetry install --with dev
poetry run python prowler-cli.py aws --check check_name
poetry run pytest tests/

# API
cd api && poetry run python src/backend/manage.py runserver
cd api && poetry run pytest

# UI
cd ui && pnpm run dev
cd ui && pnpm run healthcheck

# MCP
cd mcp_server && uv run prowler-mcp

# Full Stack
docker-compose up -d
```

## Providers

AWS, Azure, GCP, Kubernetes, GitHub, M365, OCI, AlibabaCloud, Cloudflare, MongoDB Atlas, NHN, LLM, IaC

## Commit Style

`feat:`, `fix:`, `docs:`, `chore:`, `perf:`, `refactor:`, `test:`

## Related Skills

- `prowler-sdk-check` - Create security checks
- `prowler-api` - Django/DRF patterns
- `prowler-ui` - Next.js/React patterns
- `prowler-mcp` - MCP server tools
- `prowler-test` - Testing patterns

## Resources

- **Documentation**: See [references/](references/) for links to local developer guide

## Activation Contract

Use this skill first when the model needs to orient itself in the Prowler monorepo, choose the correct component, or point to the follow-up skill that should own the task.

## Hard Rules

- Treat this skill as a router, not the final authority for API, UI, SDK, MCP, CI, or testing implementation details.
- Redirect specialized work to the matching Prowler skill before giving deep guidance.
- Keep component guidance anchored to real repo paths and current stack names.
- Do not use this skill for PR workflow gates or GitHub Actions analysis; those belong to `prowler-pr` or `prowler-ci`.
- Prefer concise orientation over long cookbook explanations.

## Decision Gates

| Question | Action |
|---|---|
| Is the task about monorepo orientation or “where does this live”? | Use this skill and route to the right component. |
| Is the task inside `api/` with RLS, RBAC, providers, or Celery? | Load `prowler-api`. |
| Is the task inside `ui/` with app structure or component conventions? | Load `prowler-ui`. |
| Is the task about checks, providers, compliance, docs, CI, or PR gates? | Hand off to the corresponding specialized Prowler skill. |
| Is the task only about testing strategy? | Load `tdd` plus the matching test skill. |

## Execution Steps

1. Identify the affected surface: `prowler/`, `api/`, `ui/`, `mcp_server/`, or cross-cutting docs/CI.
2. Confirm the stack and runtime boundary for that surface.
3. Route to the correct specialized skill before proposing implementation details.
4. If multiple surfaces are involved, call out the primary owner and the supporting skills.
5. Return repo paths, component names, and the next best skill to load.

## Output Contract

- State the target component or components.
- Name the follow-up skill or skills that should own the work.
- Mention the canonical repo path(s) to inspect next.
- If the task is out of scope for this router skill, say so explicitly.

## References

- [Repository agent rules](../../AGENTS.md)
- [Prowler skill references](references/prowler-docs.md)
- [API component guidance](../../api/AGENTS.md)
- [UI component guidance](../../ui/AGENTS.md)
- [MCP component guidance](../../mcp_server/AGENTS.md)
+35 -171
@@ -1,8 +1,6 @@
---
name: pytest
description: >
  Pytest testing patterns for Python.
  Trigger: When writing or refactoring pytest tests (fixtures, mocking, parametrize, markers). For Prowler-specific API/SDK testing conventions, also use prowler-test-api or prowler-test-sdk.
description: "Trigger: When writing or refactoring pytest tests in Python, including fixtures, mocking, parametrization, async tests, and markers. Provides generic pytest structure before component-specific API or SDK rules."
license: Apache-2.0
metadata:
  author: prowler-cloud
@@ -12,183 +10,49 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task
---
## Basic Test Structure

```python
import pytest

class TestUserService:
    def test_create_user_success(self):
        user = create_user(name="John", email="john@test.com")
        assert user.name == "John"
        assert user.email == "john@test.com"

    def test_create_user_invalid_email_fails(self):
        with pytest.raises(ValueError, match="Invalid email"):
            create_user(name="John", email="invalid")
```

## Fixtures

```python
import pytest

@pytest.fixture
def user():
    """Create a test user."""
    return User(name="Test User", email="test@example.com")

@pytest.fixture
def authenticated_client(client, user):
    """Client with authenticated user."""
    client.force_login(user)
    return client

# Fixture with teardown
@pytest.fixture
def temp_file():
    path = Path("/tmp/test_file.txt")
    path.write_text("test content")
    yield path  # Test runs here
    path.unlink()  # Cleanup after test

# Fixture scopes
@pytest.fixture(scope="module")   # Once per module
@pytest.fixture(scope="class")    # Once per class
@pytest.fixture(scope="session")  # Once per test session
```

## conftest.py

```python
# tests/conftest.py - Shared fixtures
import pytest

@pytest.fixture
def db_session():
    session = create_session()
    yield session
    session.rollback()

@pytest.fixture
def api_client():
    return TestClient(app)
```

## Mocking

```python
from unittest.mock import patch, MagicMock

class TestPaymentService:
    def test_process_payment_success(self):
        with patch("services.payment.stripe_client") as mock_stripe:
            mock_stripe.charge.return_value = {"id": "ch_123", "status": "succeeded"}

            result = process_payment(amount=100)

            assert result["status"] == "succeeded"
            mock_stripe.charge.assert_called_once_with(amount=100)

    def test_process_payment_failure(self):
        with patch("services.payment.stripe_client") as mock_stripe:
            mock_stripe.charge.side_effect = PaymentError("Card declined")

            with pytest.raises(PaymentError):
                process_payment(amount=100)

# MagicMock for complex objects
def test_with_mock_object():
    mock_user = MagicMock()
    mock_user.id = "user-123"
    mock_user.name = "Test User"
    mock_user.is_active = True

    result = get_user_info(mock_user)
    assert result["name"] == "Test User"
```

## Parametrize

```python
@pytest.mark.parametrize("input,expected", [
    ("hello", "HELLO"),
    ("world", "WORLD"),
    ("pytest", "PYTEST"),
])
def test_uppercase(input, expected):
    assert input.upper() == expected

@pytest.mark.parametrize("email,is_valid", [
    ("user@example.com", True),
    ("invalid-email", False),
    ("", False),
    ("user@.com", False),
])
def test_email_validation(email, is_valid):
    assert validate_email(email) == is_valid
```

## Markers

```python
# pytest.ini or pyproject.toml
[tool.pytest.ini_options]
markers = [
    "slow: marks tests as slow",
    "integration: marks integration tests",
]

# Usage
@pytest.mark.slow
def test_large_data_processing():
    ...

@pytest.mark.integration
def test_database_connection():
    ...

@pytest.mark.skip(reason="Not implemented yet")
def test_future_feature():
    ...

@pytest.mark.skipif(sys.platform == "win32", reason="Unix only")
def test_unix_specific():
    ...

# Run specific markers
# pytest -m "not slow"
# pytest -m "integration"
```

## Async Tests

```python
import pytest

@pytest.mark.asyncio
async def test_async_function():
    result = await async_fetch_data()
    assert result is not None
```

## Commands

```bash
pytest                  # Run all tests
pytest -v               # Verbose output
pytest -x               # Stop on first failure
pytest -k "test_user"   # Filter by name
pytest -m "not slow"    # Filter by marker
pytest --cov=src        # With coverage
pytest -n auto          # Parallel (pytest-xdist)
pytest --tb=short       # Short traceback
```

## Activation Contract

Use this skill for generic pytest structure and patterns; if the test touches Prowler API or SDK specifics, pair it with `prowler-test-api` or `prowler-test-sdk`.

## Hard Rules

- Keep tests behavior-focused and name them after expected outcomes.
- Extract reusable setup into fixtures instead of repeating inline construction.
- Use `pytest.raises` for failure expectations and `@pytest.mark.parametrize` for matrix coverage.
- Mock external boundaries, not the logic under test.
- Register and use markers intentionally; do not invent silent marker names.
- Prefer local references only; do not rely on external documentation links inside the skill.

## Decision Gates

| Question | Action |
|---|---|
| Shared setup across tests? | Move it into a fixture or `conftest.py`. |
| Same assertion logic over many inputs? | Use `@pytest.mark.parametrize`. |
| Need to verify an exception? | Use `pytest.raises(..., match=...)`. |
| Testing async behavior? | Use `@pytest.mark.asyncio` or the repo's async test pattern. |
| Working in `api/` or `prowler/`? | Load the component-specific testing skill too. |

## Execution Steps

1. Identify whether the test is generic pytest, API-specific, or SDK-specific.
2. Read neighboring tests and `conftest.py` before adding new fixtures.
3. Write focused test functions or test classes with clear outcome-based names.
4. Promote repeated setup into fixtures and shared helpers only when duplication appears twice or more.
5. Use parametrization, markers, and mocks deliberately to keep coverage broad but readable.
6. Run the narrowest relevant pytest target and inspect failures before widening scope.
7. Report the exact command used and any fixture or marker introduced.

## Output Contract

- State whether the change relied on fixtures, parametrization, mocking, markers, or async support.
- Mention any component-specific skill paired with pytest.
- Report the exact pytest command used for validation.
- Call out any test isolation or fixture-scope decision that affects future contributors.

## References

For general pytest documentation, see:
- **Official Docs**: https://docs.pytest.org/en/stable/

For Prowler SDK testing with provider-specific patterns (moto, MagicMock), see:
- **Documentation**: [references/prowler-testing.md](references/prowler-testing.md)

- [TDD skill](../tdd/SKILL.md)
- [Prowler API testing skill](../prowler-test-api/SKILL.md)
- [Prowler SDK testing skill](../prowler-test-sdk/SKILL.md)
- [Repository agent rules](../../AGENTS.md)
+33 -102
@@ -1,8 +1,6 @@
---
name: react-19
description: >
  React 19 patterns with React Compiler.
  Trigger: When writing React 19 components/hooks in .tsx (React Compiler rules, hook patterns, refs as props). If using Next.js App Router/Server Actions, also use nextjs-15.
description: "Trigger: When writing React 19 components, hooks, or `.tsx` files, especially with React Compiler, `use()`, actions, or ref-as-prop patterns. Applies React 19 runtime and composition rules."
license: Apache-2.0
metadata:
  author: prowler-cloud
@@ -12,113 +10,46 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task
---
## No Manual Memoization (REQUIRED)

```typescript
// ✅ React Compiler handles optimization automatically
function Component({ items }) {
  const filtered = items.filter(x => x.active);
  const sorted = filtered.sort((a, b) => a.name.localeCompare(b.name));

  const handleClick = (id) => {
    console.log(id);
  };

  return <List items={sorted} onClick={handleClick} />;
}

// ❌ NEVER: Manual memoization
const filtered = useMemo(() => items.filter(x => x.active), [items]);
const handleClick = useCallback((id) => console.log(id), []);
```

## Imports (REQUIRED)

```typescript
// ✅ ALWAYS: Named imports
import { useState, useEffect, useRef } from "react";

// ❌ NEVER
import React from "react";
import * as React from "react";
```

## Server Components First

```typescript
// ✅ Server Component (default) - no directive
export default async function Page() {
  const data = await fetchData();
  return <ClientComponent data={data} />;
}

// ✅ Client Component - only when needed
"use client";
export function Interactive() {
  const [state, setState] = useState(false);
  return <button onClick={() => setState(!state)}>Toggle</button>;
}
```

## Activation Contract

Use this skill when the change is inside React 19 component code and the agent must choose between Server Components, Client Components, compiler-friendly patterns, or modern hook APIs.

## Hard Rules

- Do not add `useMemo` or `useCallback` for routine render-path optimization; React Compiler handles the common case.
- Prefer Server Components by default; add `"use client"` only for client-only behavior.
- Import named React APIs; do not use default `React` imports.
- Use `ref` as a prop in React 19 instead of introducing `forwardRef` by habit.
- If the task also involves App Router or Server Actions integration details, load `nextjs-15` too.

## Decision Gates

| Question | Action |
|---|---|
| Does the component use state, effects, browser APIs, or event handlers? | Mark it as a Client Component with `"use client"`. |
| Does the component only fetch or compose data for rendering? | Keep it as a Server Component. |
| Are you reading a promise or conditional context? | Consider `use()` instead of older workarounds. |
| Are you wiring form actions or pending state? | Prefer actions and `useActionState`. |
| Are you about to add memoization for performance? | Stop and justify it; default to compiler-friendly plain code first. |

## Execution Steps

1. Identify whether the file should stay server-side or become client-side.
2. Remove legacy React imports and manual memoization unless there is a proven exception.
3. Keep render logic direct and compiler-friendly.
4. Use `use()` for supported promise/context reads when it simplifies the flow.
5. Use action-based form patterns for mutation flows when relevant.
6. Pass refs as props in new React 19 component APIs.
7. Validate that the final component model matches the feature's runtime needs.

## Output Contract

- State whether the component is server or client and why.
- Call out any React 19 modernization applied, such as removing manual memoization, using `use()`, or replacing `forwardRef`.
- Mention whether `nextjs-15` was also required.
|
||||
}
|
||||
```
|
||||
## References
|
||||
|
||||
## When to use "use client"
|
||||
|
||||
- useState, useEffect, useRef, useContext
|
||||
- Event handlers (onClick, onChange)
|
||||
- Browser APIs (window, localStorage)
|
||||
|
||||
## use() Hook
|
||||
|
||||
```typescript
|
||||
import { use } from "react";
|
||||
|
||||
// Read promises (suspends until resolved)
|
||||
function Comments({ promise }) {
|
||||
const comments = use(promise);
|
||||
return comments.map(c => <div key={c.id}>{c.text}</div>);
|
||||
}
|
||||
|
||||
// Conditional context (not possible with useContext!)
|
||||
function Theme({ showTheme }) {
|
||||
if (showTheme) {
|
||||
const theme = use(ThemeContext);
|
||||
return <div style={{ color: theme.primary }}>Themed</div>;
|
||||
}
|
||||
return <div>Plain</div>;
|
||||
}
|
||||
```
|
||||
|
||||
## Actions & useActionState
|
||||
|
||||
```typescript
|
||||
"use server";
|
||||
async function submitForm(formData: FormData) {
|
||||
await saveToDatabase(formData);
|
||||
revalidatePath("/");
|
||||
}
|
||||
|
||||
// With pending state
|
||||
import { useActionState } from "react";
|
||||
|
||||
function Form() {
|
||||
const [state, action, isPending] = useActionState(submitForm, null);
|
||||
return (
|
||||
<form action={action}>
|
||||
<button disabled={isPending}>
|
||||
{isPending ? "Saving..." : "Save"}
|
||||
</button>
|
||||
</form>
|
||||
);
|
||||
}
|
||||
```
|
||||
|
||||
## ref as Prop (No forwardRef)
|
||||
|
||||
```typescript
|
||||
// ✅ React 19: ref is just a prop
|
||||
function Input({ ref, ...props }) {
|
||||
return <input ref={ref} {...props} />;
|
||||
}
|
||||
|
||||
// ❌ Old way (unnecessary now)
|
||||
const Input = forwardRef((props, ref) => <input ref={ref} {...props} />);
|
||||
```
|
||||
- [Next.js 15 skill](../nextjs-15/SKILL.md)
|
||||
- [TypeScript skill](../typescript/SKILL.md)
|
||||
- [Repository agent rules](../../AGENTS.md)
|
||||
|
||||
+36
-149
@@ -1,8 +1,6 @@
---
name: skill-creator
description: >
  Creates new AI agent skills following the Agent Skills spec.
  Trigger: When user asks to create a new skill, add agent instructions, or document patterns for AI.
description: "Trigger: When user asks to create a new skill, add agent instructions, or document patterns for AI. Creates new AI agent skills following the Agent Skills spec."
license: Apache-2.0
metadata:
author: prowler-cloud
@@ -12,160 +10,49 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task
---

## When to Create a Skill
## Activation Contract

Create a skill when:
- A pattern is used repeatedly and AI needs guidance
- Project-specific conventions differ from generic best practices
- Complex workflows need step-by-step instructions
- Decision trees help AI choose the right approach
Use this skill when the task is to create a new skill or reshape rough agent guidance into a reusable skill package.

**Don't create a skill when:**
- Documentation already exists (create a reference instead)
- Pattern is trivial or self-explanatory
- It's a one-off task
## Hard Rules

---
- Create a skill only for reusable, non-trivial patterns.
- Keep `description` on one quoted physical line with `Trigger:` first.
- Use local references only; never point `references/` at web URLs.
- Prefer short rules, decision tables, and minimal examples over tutorials.
- Add `metadata.scope` and `metadata.auto_invoke` when the skill should surface in `AGENTS.md` auto-invoke tables.
- Do not duplicate long docs inside the skill; point to local references instead.
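
The rules above can be condensed into a frontmatter skeleton; every value here is a placeholder, not an existing skill:

```yaml
---
name: example-skill                # placeholder, lowercase with hyphens
description: "Trigger: When <activation condition>. <What the skill enforces>."
license: Apache-2.0
metadata:
  author: prowler-cloud
  version: "1.0"
  scope: [ui]                      # only when the skill should surface in AGENTS.md
  auto_invoke: "Creating/modifying example components"
---
```

Note the `description` stays on one quoted physical line with `Trigger:` first.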

## Skill Structure
## Decision Gates

```
skills/{skill-name}/
├── SKILL.md          # Required - main skill file
├── assets/           # Optional - templates, schemas, examples
│   ├── template.py
│   └── schema.json
└── references/       # Optional - links to local docs
    └── docs.md       # Points to docs/developer-guide/*.mdx
```
| Question | Action |
|---|---|
| Is the pattern already documented well enough? | Reuse or reference the existing doc instead of creating a new skill. |
| Is the guidance specific to this repo or workflow? | Create a project-specific skill name such as `prowler-{component}` or `{action}-{target}`. |
| Do you need templates, schemas, or example configs? | Put them in `assets/`. |
| Do you need supporting documentation? | Link only local files from `references/`. |
| Will the skill be auto-invoked from `AGENTS.md`? | Add or update `metadata.scope` and `metadata.auto_invoke`, then decide whether `skill-sync` must run. |

---
## Execution Steps

## SKILL.md Template
1. Confirm the skill does not already exist under `skills/`.
2. Choose a reusable name that matches the repo naming conventions.
3. Create `skills/{skill-name}/SKILL.md` and required support folders only if needed (`assets/`, `references/`).
4. Write frontmatter with `name`, one-line quoted `description`, `license`, and metadata.
5. Write the body in this order: Activation Contract, Hard Rules, Decision Gates, Execution Steps, Output Contract, References.
6. Keep the body compact: operational instructions first, examples only when they unblock execution.
7. If auto-invoke metadata changed, run the `skill-sync` workflow appropriate to the scope.
8. Update any non-generated skill index entries the repository expects.
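
Steps 1-3 can be sketched as a shell scaffold; the skill name is a placeholder and the demo writes under `/tmp` rather than the real `skills/` tree:

```shell
# Scaffold a new skill directory with a minimal SKILL.md (demo paths only).
BASE="${TMPDIR:-/tmp}/skills-demo"
SKILL_DIR="$BASE/example-skill"
rm -rf "$SKILL_DIR"    # clean slate for the demo
mkdir -p "$SKILL_DIR"
cat > "$SKILL_DIR/SKILL.md" <<'EOF'
---
name: example-skill
description: "Trigger: When <activation condition>. <What the skill enforces>."
license: Apache-2.0
metadata:
  author: prowler-cloud
  version: "1.0"
---
EOF
grep -c '^name: example-skill$' "$SKILL_DIR/SKILL.md"   # prints 1
```

In the repository the directory would be `skills/{skill-name}/`, adding `assets/` and `references/` only when needed.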

```markdown
---
name: {skill-name}
description: >
  {One-line description of what this skill does}.
  Trigger: {When the AI should load this skill}.
license: Apache-2.0
metadata:
  author: prowler-cloud
  version: "1.0"
---
## Output Contract

## When to Use
- Return the created or updated skill path(s).
- State whether auto-invoke metadata changed and whether `skill-sync` was run, dry-run, or intentionally skipped.
- Summarize the reusable pattern the skill captures in 1-3 bullets.
- Call out any follow-up files the human should review, such as `AGENTS.md` or assets/templates.

{Bullet points of when to use this skill}
## References

## Critical Patterns

{The most important rules - what AI MUST know}

## Code Examples

{Minimal, focused examples}

## Commands

```bash
{Common commands}
```

## Resources

- **Templates**: See [assets/](assets/) for {description}
- **Documentation**: See [references/](references/) for local docs
```

---

## Naming Conventions

| Type | Pattern | Examples |
|------|---------|----------|
| Generic skill | `{technology}` | `pytest`, `playwright`, `typescript` |
| Prowler-specific | `prowler-{component}` | `prowler-api`, `prowler-ui`, `prowler-sdk-check` |
| Testing skill | `prowler-test-{component}` | `prowler-test-sdk`, `prowler-test-api` |
| Workflow skill | `{action}-{target}` | `skill-creator`, `jira-task` |

---

## Decision: assets/ vs references/

```
Need code templates?     → assets/
Need JSON schemas?       → assets/
Need example configs?    → assets/
Link to existing docs?   → references/
Link to external guides? → references/ (with local path)
```

**Key Rule**: `references/` should point to LOCAL files (`docs/developer-guide/*.mdx`), not web URLs.

---

## Decision: Prowler-Specific vs Generic

```
Patterns apply to ANY project?    → Generic skill (e.g., pytest, typescript)
Patterns are Prowler-specific?    → prowler-{name} skill
Generic skill needs Prowler info? → Add references/ pointing to Prowler docs
```

---

## Frontmatter Fields

| Field | Required | Description |
|-------|----------|-------------|
| `name` | Yes | Skill identifier (lowercase, hyphens) |
| `description` | Yes | What + Trigger in one block |
| `license` | Yes | Always `Apache-2.0` for Prowler |
| `metadata.author` | Yes | `prowler-cloud` |
| `metadata.version` | Yes | Semantic version as string |

---

## Content Guidelines

### DO
- Start with the most critical patterns
- Use tables for decision trees
- Keep code examples minimal and focused
- Include Commands section with copy-paste commands

### DON'T
- Add Keywords section (agent searches frontmatter, not body)
- Duplicate content from existing docs (reference instead)
- Include lengthy explanations (link to docs)
- Add troubleshooting sections (keep focused)
- Use web URLs in references (use local paths)

---

## Registering the Skill

After creating the skill, add it to `AGENTS.md`:

```markdown
| `{skill-name}` | {Description} | [SKILL.md](skills/{skill-name}/SKILL.md) |
```

---

## Checklist Before Creating

- [ ] Skill doesn't already exist (check `skills/`)
- [ ] Pattern is reusable (not one-off)
- [ ] Name follows conventions
- [ ] Frontmatter is complete (description includes trigger keywords)
- [ ] Critical patterns are clear
- [ ] Code examples are minimal
- [ ] Commands section exists
- [ ] Added to AGENTS.md

## Resources

- **Templates**: See [assets/](assets/) for SKILL.md template
- [Template](assets/SKILL-TEMPLATE.md)
- [Skills overview](../README.md)
- [Repository agent rules](../../AGENTS.md)

+33
-96
@@ -1,8 +1,6 @@
---
name: skill-sync
description: >
  Syncs skill metadata to AGENTS.md Auto-invoke sections.
  Trigger: When updating skill metadata (metadata.scope/metadata.auto_invoke), regenerating Auto-invoke tables, or running ./skills/skill-sync/assets/sync.sh (including --dry-run/--scope).
description: "Trigger: When updating skill metadata (metadata.scope/metadata.auto_invoke), regenerating Auto-invoke tables, or running ./skills/skill-sync/assets/sync.sh. Syncs skill metadata to AGENTS.md Auto-invoke sections."
license: Apache-2.0
metadata:
author: prowler-cloud
@@ -15,107 +13,46 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash
---

## Purpose
## Activation Contract

Keeps AGENTS.md Auto-invoke sections in sync with skill metadata. When you create or modify a skill, run the sync script to automatically update all affected AGENTS.md files.
Use this skill when a skill's `metadata.scope` or `metadata.auto_invoke` changes, when auto-invoke tables need regeneration, or when a skill is missing from `AGENTS.md` auto-invoke output.

## Required Skill Metadata
## Hard Rules

Each skill that should appear in Auto-invoke sections needs these fields in `metadata`.
- Treat `./skills/skill-sync/assets/sync.sh` as the source of truth for generated auto-invoke tables.
- Do not hand-edit generated auto-invoke sections unless the workflow itself is being fixed.
- Run `--dry-run` first when you only need verification or when metadata impact is uncertain.
- Only `metadata.scope` and `metadata.auto_invoke` should drive sync decisions.
- Keep scope values aligned to real targets: `root`, `ui`, `api`, `sdk`, `mcp_server`.

`auto_invoke` can be either a single string **or** a list of actions:
## Decision Gates

```yaml
metadata:
  author: prowler-cloud
  version: "1.0"
  scope: [ui]          # Which AGENTS.md: ui, api, sdk, root
| Question | Action |
|---|---|
| Did `metadata.scope` or `metadata.auto_invoke` change? | Run `sync.sh` for real, or `--scope` if the blast radius is intentionally narrow. |
| Did only body text or examples change? | Skip sync and say why; generated tables are unaffected. |
| Are you checking expected output without modifying files? | Run `sync.sh --dry-run`. |
| Is one surface affected? | Use `sync.sh --scope <scope>`. |
| Is a skill missing from auto-invoke output? | Inspect its frontmatter first, then run `--dry-run` to confirm what the script sees. |

  # Option A: single action
  auto_invoke: "Creating/modifying components"
## Execution Steps

  # Option B: multiple actions
  # auto_invoke:
  #   - "Creating/modifying components"
  #   - "Refactoring component folder placement"
```
1. Read the changed skill frontmatter and confirm `metadata.scope` and `metadata.auto_invoke` are present and well-formed.
2. Decide whether the task needs a real sync, a dry-run, or a documented no-op.
3. If validating only, run `./skills/skill-sync/assets/sync.sh --dry-run`.
4. If updating one target, run `./skills/skill-sync/assets/sync.sh --scope <scope>`.
5. If updating all affected targets, run `./skills/skill-sync/assets/sync.sh`.
6. Verify the expected `AGENTS.md` surfaces changed only where metadata demanded it.
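
Step 1 can be mechanized with a small check; this is an illustrative helper, not part of `sync.sh`, and it only does naive frontmatter matching:

```python
import re

def frontmatter(text: str) -> str:
    """Return the YAML frontmatter between the first two '---' fences."""
    match = re.match(r"^---\n(.*?)\n---", text, re.DOTALL)
    return match.group(1) if match else ""

def has_sync_metadata(skill_md: str) -> bool:
    """True when metadata.scope and metadata.auto_invoke both appear."""
    fm = frontmatter(skill_md)
    return bool(re.search(r"^\s+scope:", fm, re.M)) and bool(
        re.search(r"^\s+auto_invoke:", fm, re.M)
    )

sample = """---
name: prowler-ui
metadata:
  author: prowler-cloud
  scope: [ui]
  auto_invoke: "Creating/modifying React components"
---
body
"""
print(has_sync_metadata(sample))  # True
```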

### Scope Values
## Output Contract

| Scope | Updates |
|-------|---------|
| `root` | `AGENTS.md` (repo root) |
| `ui` | `ui/AGENTS.md` |
| `api` | `api/AGENTS.md` |
| `sdk` | `prowler/AGENTS.md` |
| `mcp_server` | `mcp_server/AGENTS.md` |
- State whether sync was executed, dry-run only, or skipped as a no-op.
- List the scope(s) evaluated and the `AGENTS.md` file(s) affected or intentionally untouched.
- If the issue was missing auto-invoke output, explain the root cause in the skill metadata or script behavior.
- Return the exact command used for verification or update.

Skills can have multiple scopes: `scope: [ui, api]`
## References

---

## Usage

### After Creating/Modifying a Skill

```bash
./skills/skill-sync/assets/sync.sh
```

### What It Does

1. Reads all `skills/*/SKILL.md` files
2. Extracts `metadata.scope` and `metadata.auto_invoke`
3. Generates Auto-invoke tables for each AGENTS.md
4. Updates the `### Auto-invoke Skills` section in each file
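
Those four steps can be sketched as a toy generator; the real logic lives in `assets/sync.sh`, and these helper names are hypothetical:

```python
import re

def parse_meta(skill_md: str) -> dict:
    """Naively extract the scope list and auto_invoke string from SKILL.md text."""
    scope = re.search(r"scope:\s*\[([^\]]*)\]", skill_md)
    action = re.search(r'auto_invoke:\s*"([^"]*)"', skill_md)
    return {
        "scope": [s.strip() for s in scope.group(1).split(",")] if scope else [],
        "auto_invoke": action.group(1) if action else None,
    }

def render_table(actions: dict) -> str:
    """Render one Auto-invoke table from {skill_name: action}."""
    rows = [f"| {action} | `{name}` |" for name, action in actions.items()]
    return "\n".join(["| Action | Skill |", "|--------|-------|", *rows])

meta = parse_meta('scope: [ui]\nauto_invoke: "Creating/modifying React components"')
print(render_table({"prowler-ui": meta["auto_invoke"]}))
```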

---

## Example

Given this skill metadata:

```yaml
# skills/prowler-ui/SKILL.md
metadata:
  author: prowler-cloud
  version: "1.0"
  scope: [ui]
  auto_invoke: "Creating/modifying React components"
```

The sync script generates in `ui/AGENTS.md`:

```markdown
### Auto-invoke Skills

When performing these actions, ALWAYS invoke the corresponding skill FIRST:

| Action | Skill |
|--------|-------|
| Creating/modifying React components | `prowler-ui` |
```

---

## Commands

```bash
# Sync all AGENTS.md files
./skills/skill-sync/assets/sync.sh

# Dry run (show what would change)
./skills/skill-sync/assets/sync.sh --dry-run

# Sync specific scope only
./skills/skill-sync/assets/sync.sh --scope ui
```

---

## Checklist After Modifying Skills

- [ ] Added `metadata.scope` to new/modified skill
- [ ] Added `metadata.auto_invoke` with action description
- [ ] Ran `./skills/skill-sync/assets/sync.sh`
- [ ] Verified AGENTS.md files updated correctly
- [Sync script](assets/sync.sh)
- [Sync script test helper](assets/sync_test.sh)
- [Repository agent rules](../../AGENTS.md)

+32
-177
@@ -1,8 +1,6 @@
---
name: tailwind-4
description: >
  Tailwind CSS 4 patterns and best practices.
  Trigger: When styling with Tailwind (className, variants, cn()), especially when dynamic styling or CSS variables are involved (no var() in className).
description: "Trigger: When styling with Tailwind CSS 4, especially in `className`, variant composition, `cn()`, or dynamic-value decisions. Enforces Tailwind-first styling rules and escape hatches."
license: Apache-2.0
metadata:
author: prowler-cloud
@@ -12,188 +10,45 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task
---

## Styling Decision Tree
## Activation Contract

```
Tailwind class exists?   → className="..."
Dynamic value?           → style={{ width: `${x}%` }}
Conditional styles?      → cn("base", condition && "variant")
Static only?             → className="..." (no cn() needed)
Library can't use class? → style prop with var() constants
```
Use this skill when UI styling decisions involve Tailwind class composition, semantic theme usage, or choosing between `className`, `cn()`, and inline styles.

## Critical Rules
## Hard Rules

### Never Use var() in className
- Prefer Tailwind utility classes directly in `className` for static styling.
- Do not put `var(...)` expressions inside `className`; use semantic Tailwind tokens or inline styles where needed.
- Do not use hex colors in class strings; use theme or Tailwind palette classes.
- Use `cn()` only when conditional or merge behavior is real.
- Use inline `style` only for truly dynamic values or third-party APIs that cannot consume class names.

```typescript
// ❌ NEVER: var() in className
<div className="bg-[var(--color-primary)]" />
<div className="text-[var(--text-color)]" />
## Decision Gates

// ✅ ALWAYS: Use Tailwind semantic classes
<div className="bg-primary" />
<div className="text-slate-400" />
```
| Question | Action |
|---|---|
| Static styling only? | Use plain `className="..."`. |
| Conditional or override-prone classes? | Use `cn(...)`. |
| Dynamic numeric or percentage values? | Use the `style` prop. |
| Third-party library prop cannot accept classes? | Pass CSS custom property values or inline style constants. |
| Need a one-off dimension not in the design system? | Use an arbitrary value sparingly, but never for colors. |

### Never Use Hex Colors
## Execution Steps

```typescript
// ❌ NEVER: Hex colors in className
<p className="text-[#ffffff]" />
<div className="bg-[#1e293b]" />
1. Classify the styling need as static, conditional, dynamic, or third-party-only.
2. Prefer semantic Tailwind utilities and theme tokens first.
3. Introduce `cn()` only if merge logic or conditions justify it.
4. Move dynamic measurements or library-only values into `style` constants.
5. Replace color escape hatches with palette or theme classes.
6. Review the final markup and remove unnecessary wrappers or styling indirection.

// ✅ ALWAYS: Use Tailwind color classes
<p className="text-white" />
<div className="bg-slate-800" />
```
## Output Contract

## The cn() Utility
- State which styling path was chosen: plain `className`, `cn()`, or inline `style`.
- Call out any removed anti-pattern such as `var(...)` in `className` or hex colors.
- Mention any remaining escape hatch and why it was necessary.

```typescript
import { clsx } from "clsx";
import { twMerge } from "tailwind-merge";
## References

export function cn(...inputs: ClassValue[]) {
  return twMerge(clsx(inputs));
}
```

### When to Use cn()

```typescript
// ✅ Conditional classes
<div className={cn("base-class", isActive && "active-class")} />

// ✅ Merging with potential conflicts
<button className={cn("px-4 py-2", className)} /> // className might override

// ✅ Multiple conditions
<div className={cn(
  "rounded-lg border",
  variant === "primary" && "bg-blue-500 text-white",
  variant === "secondary" && "bg-gray-200 text-gray-800",
  disabled && "opacity-50 cursor-not-allowed"
)} />
```

### When NOT to Use cn()

```typescript
// ❌ Static classes - unnecessary wrapper
<div className={cn("flex items-center gap-2")} />

// ✅ Just use className directly
<div className="flex items-center gap-2" />
```

## Style Constants for Charts/Libraries

When libraries don't accept className (like Recharts):

```typescript
// ✅ Constants with var() - ONLY for library props
const CHART_COLORS = {
  primary: "var(--color-primary)",
  secondary: "var(--color-secondary)",
  text: "var(--color-text)",
  gridLine: "var(--color-border)",
};

// Usage with Recharts (can't use className)
<XAxis tick={{ fill: CHART_COLORS.text }} />
<CartesianGrid stroke={CHART_COLORS.gridLine} />
```

## Dynamic Values

```typescript
// ✅ style prop for truly dynamic values
<div style={{ width: `${percentage}%` }} />
<div style={{ opacity: isVisible ? 1 : 0 }} />

// ✅ CSS custom properties for theming
<div style={{ "--progress": `${value}%` } as React.CSSProperties} />
```

## Common Patterns

### Flexbox

```typescript
<div className="flex items-center justify-between gap-4" />
<div className="flex flex-col gap-2" />
<div className="inline-flex items-center" />
```

### Grid

```typescript
<div className="grid grid-cols-3 gap-4" />
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-6" />
```

### Spacing

```typescript
// Padding
<div className="p-4" />          // All sides
<div className="px-4 py-2" />    // Horizontal, vertical
<div className="pt-4 pb-2" />    // Top, bottom

// Margin
<div className="m-4" />
<div className="mx-auto" />      // Center horizontally
<div className="mt-8 mb-4" />
```

### Typography

```typescript
<h1 className="text-2xl font-bold text-white" />
<p className="text-sm text-slate-400" />
<span className="text-xs font-medium uppercase tracking-wide" />
```

### Borders & Shadows

```typescript
<div className="rounded-lg border border-slate-700" />
<div className="rounded-full shadow-lg" />
<div className="ring-2 ring-blue-500 ring-offset-2" />
```

### States

```typescript
<button className="hover:bg-blue-600 focus:ring-2 active:scale-95" />
<input className="focus:border-blue-500 focus:outline-none" />
<div className="group-hover:opacity-100" />
```

### Responsive

```typescript
<div className="w-full md:w-1/2 lg:w-1/3" />
<div className="hidden md:block" />
<div className="text-sm md:text-base lg:text-lg" />
```

### Dark Mode

```typescript
<div className="bg-white dark:bg-slate-900" />
<p className="text-gray-900 dark:text-white" />
```

## Arbitrary Values (Escape Hatch)

```typescript
// ✅ OK for one-off values not in design system
<div className="w-[327px]" />
<div className="top-[117px]" />
<div className="grid-cols-[1fr_2fr_1fr]" />

// ❌ Don't use for colors - use theme instead
<div className="bg-[#1e293b]" />  // NO
```
- [Prowler UI skill](../prowler-ui/SKILL.md)
- [React 19 skill](../react-19/SKILL.md)
- [Repository agent rules](../../AGENTS.md)

+35
-344
@@ -1,9 +1,6 @@
---
name: tdd
description: >
  Test-Driven Development workflow for ALL Prowler components (UI, SDK, API).
  Trigger: ALWAYS when implementing features, fixing bugs, or refactoring - regardless of component.
  This is a MANDATORY workflow, not optional.
description: "Trigger: ALWAYS when implementing features, fixing bugs, refactoring, or modifying behavior in Prowler. Enforces the RED -> GREEN -> REFACTOR workflow across UI, API, and SDK work."
license: Apache-2.0
metadata:
author: prowler-cloud
@@ -18,354 +15,48 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, Task
---

## TDD Cycle (MANDATORY)
## Activation Contract

```
+------------------------------------+
|  RED  ->  GREEN  ->  REFACTOR      |
|   ^                     |          |
|   +---------------------+          |
+------------------------------------+
```
Use this skill before changing production code whenever the task adds behavior, fixes a bug, or refactors existing logic.

**The question is NOT "should I write tests?" but "what tests do I need?"**
## Hard Rules

---
- Start with a failing test; no production change before RED is proven.
- Run the smallest relevant test scope, not the whole suite, unless the refactor safety net requires broader coverage.
- Add only enough code to pass the current failing test.
- After GREEN, refactor with tests still passing.
- Load the stack-specific testing skill when applicable: `vitest`, `prowler-test-ui`, `pytest`, `prowler-test-api`, or `prowler-test-sdk`.

## The Three Laws of TDD
## Decision Gates

1. **No production code** until you have a failing test
2. **No more test** than necessary to fail
3. **No more code** than necessary to pass
| Question | Action |
|---|---|
| Working in `ui/`? | Use Vitest conventions and co-located `*.test.{ts,tsx}` files. |
| Working in `api/`? | Use pytest + Django patterns and the API testing skill. |
| Working in `prowler/`? | Use pytest + provider-specific SDK testing patterns. |
| Refactoring without new behavior? | Capture current behavior first by running the closest existing tests before editing. |
| No relevant test exists? | Create the narrowest new test that demonstrates the target behavior or bug. |

---
## Execution Steps

## Detect Your Stack
1. Identify the component and matching test runner.
2. Read nearby tests first to match naming, fixtures, and assertion style.
3. Write or extend one test that fails for the intended behavior.
4. Run that focused test and confirm RED.
5. Implement the minimum change to reach GREEN.
6. Add triangulation cases when one test could be satisfied by a fake or hardcoded implementation.
7. Refactor only after the behavior is protected by passing tests.
8. Re-run the focused suite and report the exact validation command used.
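
Step 6 (triangulation) in miniature, using a hypothetical `calculate_discount`: a single test could be passed by a hardcoded `return 0`, so a second case with a different expected value forces real logic:

```python
def calculate_discount(quantity: int) -> int:
    """Hypothetical rule: 2-per-unit discount once quantity reaches 5."""
    # A hardcoded `return 0` would satisfy the first case alone;
    # the triangulating second case rules that fake out.
    return quantity * 2 if quantity >= 5 else 0

# Case 1 (the original RED test): below threshold, no discount.
assert calculate_discount(3) == 0
# Case 2 (triangulation): forces the actual computation.
assert calculate_discount(5) == 10
```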
|
||||
|
||||
Before starting, identify which component you're working on:
|
||||
## Output Contract
|
||||
|
||||
| Working in | Stack | Runner | Test pattern | Details |
|
||||
|------------|-------|--------|-------------|---------|
|
||||
| `ui/` | TypeScript / React | Vitest + RTL | `*.test.{ts,tsx}` (co-located) | See `vitest` skill |
|
||||
| `prowler/` | Python | pytest + moto | `*_test.py` (suffix) in `tests/` | See `prowler-test-sdk` skill |
|
||||
| `api/` | Python / Django | pytest + django | `test_*.py` (prefix) in `api/src/backend/**/tests/` | See `prowler-test-api` skill |
|
||||
- State the RED evidence: which test failed and why.
|
||||
- State the GREEN evidence: which command passed after the change.
|
||||
- Name the stack and test skill used.
|
||||
- Call out any blocker if RED or GREEN could not be executed exactly as intended.
|
||||
|
||||
---
|
||||
## References
|
||||
|
||||
## Phase 0: Assessment (ALWAYS FIRST)
|
||||
|
||||
Before writing ANY code:
|
||||
|
||||
### UI (`ui/`)
|
||||
|
||||
```bash
|
||||
# 1. Find existing tests
|
||||
fd "*.test.tsx" ui/components/feature/
|
||||
|
||||
# 2. Check coverage
|
||||
pnpm test:coverage -- components/feature/
|
||||
|
||||
# 3. Read existing tests
|
||||
```
|
||||
|
||||
### SDK (`prowler/`)
|
||||
|
||||
```bash
|
||||
# 1. Find existing tests
|
||||
fd "*_test.py" tests/providers/aws/services/ec2/
|
||||
|
||||
# 2. Run specific test
|
||||
poetry run pytest tests/providers/aws/services/ec2/ec2_ami_public/ -v
|
||||
|
||||
# 3. Read existing tests
|
||||
```
|
||||
|
||||
### API (`api/`)
|
||||
|
||||
```bash
|
||||
# 1. Find existing tests
|
||||
fd "test_*.py" api/src/backend/api/tests/
|
||||
|
||||
# 2. Run specific test
|
||||
poetry run pytest api/src/backend/api/tests/test_models.py -v
|
||||
|
||||
# 3. Read existing tests
|
||||
```
|
||||
|
||||
### Decision Tree (All Stacks)

```
+------------------------------------------+
| Does test file exist for this code?      |
+----------+-----------------------+-------+
           | NO                    | YES
           v                       v
+------------------+      +------------------+
| CREATE test file |      | Check coverage   |
| -> Phase 1: RED  |      | for your change  |
+------------------+      +--------+---------+
                                   |
                          +--------+--------+
                          | Missing cases?  |
                          +---+---------+---+
                              | YES     | NO
                              v         v
                      +-----------+  +-----------+
                      | ADD tests |  | Proceed   |
                      | Phase 1   |  | Phase 2   |
                      +-----------+  +-----------+
```

---

## Phase 1: RED - Write Failing Tests

### For NEW Functionality

**UI (Vitest)**

```typescript
describe("PriceCalculator", () => {
  it("should return 0 for quantities below threshold", () => {
    // Given
    const quantity = 3;

    // When
    const result = calculateDiscount(quantity);

    // Then
    expect(result).toBe(0);
  });
});
```

**SDK (pytest)**

```python
class Test_ec2_ami_public:
    @mock_aws
    def test_no_public_amis(self):
        # Given - No AMIs exist
        aws_provider = set_mocked_aws_provider([AWS_REGION_US_EAST_1])

        with mock.patch("prowler...ec2_service", new=EC2(aws_provider)):
            from prowler...ec2_ami_public import ec2_ami_public

            # When
            check = ec2_ami_public()
            result = check.execute()

            # Then
            assert len(result) == 0
```

**API (pytest-django)**

```python
@pytest.mark.django_db
class TestResourceModel:
    def test_create_resource_with_tags(self, providers_fixture):
        # Given
        provider, *_ = providers_fixture
        tenant_id = provider.tenant_id

        # When
        resource = Resource.objects.create(
            tenant_id=tenant_id, provider=provider,
            uid="arn:aws:ec2:us-east-1:123456789:instance/i-1234",
            name="test", region="us-east-1", service="ec2", type="instance",
        )

        # Then
        assert resource.uid == "arn:aws:ec2:us-east-1:123456789:instance/i-1234"
```

**Run -> MUST fail:** Test references code that doesn't exist yet.

### For BUG FIXES

Write a test that **reproduces the bug** first:

**UI:** `expect(() => render(<DatePicker value={null} />)).not.toThrow();`

**SDK:** `assert result[0].status == "FAIL"  # Currently returns PASS incorrectly`

**API:** `assert response.status_code == 403  # Currently returns 200`

**Run -> Should FAIL (reproducing the bug)**

### For REFACTORING

Capture ALL current behavior BEFORE refactoring:

```
# Any stack: run ALL existing tests, they should PASS
# This is your safety net - if any fail after refactoring, you broke something
```

**Run -> All should PASS (baseline)**

---

## Phase 2: GREEN - Minimum Code

Write the MINIMUM code to make the test pass. Hardcoding is valid for the first test.

**UI:**

```typescript
// Test expects calculateDiscount(100, 10) === 10
function calculateDiscount() {
  return 10; // FAKE IT - hardcoded is valid for first test
}
```

**Python (SDK/API):**

```python
# Test expects check.execute() returns 0 results
def execute(self):
    return []  # FAKE IT - hardcoded is valid for first test
```

**This passes. But we're not done...**

---

## Phase 3: Triangulation (CRITICAL)

**One test allows faking. Multiple tests FORCE real logic.**

Add tests with different inputs that break the hardcoded value:

| Scenario | Required? |
|----------|-----------|
| Happy path | YES |
| Zero/empty values | YES |
| Boundary values | YES |
| Different valid inputs | YES (breaks fake) |
| Error conditions | YES |

**UI:**

```typescript
it("should calculate 10% discount", () => {
  expect(calculateDiscount(100, 10)).toBe(10);
});

// ADD - breaks the fake:
it("should calculate 15% on 200", () => {
  expect(calculateDiscount(200, 15)).toBe(30);
});

it("should return 0 for 0% rate", () => {
  expect(calculateDiscount(100, 0)).toBe(0);
});
```

**Python:**

```python
def test_single_public_ami(self):
    # Different input -> breaks hardcoded empty list
    assert len(result) == 1
    assert result[0].status == "FAIL"

def test_private_ami(self):
    assert result[0].status == "PASS"
```

**Now fake BREAKS -> Real implementation required.**

---

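Once the triangulating tests break the fake, the minimal real implementation replaces the hardcoded return. A sketch for the discount example above (the percentage semantics are inferred from the tests, not from project code):

```typescript
// Real implementation forced by the triangulating tests:
// calculateDiscount(100, 10) -> 10, calculateDiscount(200, 15) -> 30,
// calculateDiscount(100, 0) -> 0.
function calculateDiscount(amount: number, rate: number): number {
  return (amount * rate) / 100;
}

console.log(calculateDiscount(200, 15)); // 30
```

Note that the fake from Phase 2 cannot satisfy all three expectations at once; only the general formula can.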
## Phase 4: REFACTOR

Tests GREEN -> Improve code quality WITHOUT changing behavior.

- Extract functions/methods
- Improve naming
- Add types/validation
- Reduce duplication

**Run tests after EACH change -> Must stay GREEN**

---

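A minimal sketch of a behavior-preserving refactor (the pricing example is illustrative, not project code): extract a helper while the observable result stays identical, so the tests written in Phases 1-3 stay green.

```typescript
// Before: discount logic inlined into the total calculation.
function totalBefore(price: number, quantity: number): number {
  const discount = quantity >= 10 ? price * quantity * 0.1 : 0;
  return price * quantity - discount;
}

// After: the discount rule is extracted into a named function.
// Behavior is unchanged for every input, so existing tests stay green.
function bulkDiscount(price: number, quantity: number): number {
  return quantity >= 10 ? price * quantity * 0.1 : 0;
}

function totalAfter(price: number, quantity: number): number {
  return price * quantity - bulkDiscount(price, quantity);
}

console.log(totalBefore(5, 10) === totalAfter(5, 10)); // true
```

Run the suite after the extraction, again after the rename, and so on: each micro-step is individually verified.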
## Quick Reference

```
+------------------------------------------------+
|                  TDD WORKFLOW                  |
+------------------------------------------------+
| 0. ASSESS: What tests exist? What's missing?   |
|                                                |
| 1. RED: Write ONE failing test                 |
|    +-- Run -> Must fail with clear error       |
|                                                |
| 2. GREEN: Write MINIMUM code to pass           |
|    +-- Fake It is valid for first test         |
|                                                |
| 3. TRIANGULATE: Add tests that break the fake  |
|    +-- Different inputs, edge cases            |
|                                                |
| 4. REFACTOR: Improve with confidence           |
|    +-- Tests stay green throughout             |
|                                                |
| 5. REPEAT: Next behavior/requirement           |
+------------------------------------------------+
```

---

## Anti-Patterns (NEVER DO)

```
# ANY language:

# 1. Code first, tests after
def new_feature(): ...  # Then writing tests = USELESS

# 2. Skip triangulation
# Single test allows faking forever

# 3. Test implementation details
assert component.state.is_loading == True  # BAD - test behavior, not internals
assert mock_service.call_count == 3        # BAD - brittle coupling

# 4. All tests at once before any code
# Write ONE test, make it pass, THEN write the next

# 5. Giant test methods
# Each test should verify ONE behavior
```

---

## Commands by Stack

### UI (`ui/`)

```bash
pnpm test                 # Watch mode
pnpm test:run             # Single run (CI)
pnpm test:coverage        # Coverage report
pnpm test ComponentName   # Filter by name
```

### SDK (`prowler/`)

```bash
poetry run pytest tests/path/ -v                  # Run specific tests
poetry run pytest tests/path/ -v -k "test_name"   # Filter by name
poetry run pytest -n auto tests/                  # Parallel run
poetry run pytest --cov=./prowler tests/          # Coverage
```

### API (`api/`)

```bash
poetry run pytest -x --tb=short                            # Run all (stop on first fail)
poetry run pytest api/src/backend/api/tests/test_file.py   # Specific file
poetry run pytest -k "test_name" -v                        # Filter by name
```

## References

- [Vitest skill](../vitest/SKILL.md)
- [Pytest skill](../pytest/SKILL.md)
- [Repository agent rules](../../AGENTS.md)

+33
-120
@@ -1,8 +1,6 @@
---
name: typescript
description: >
  TypeScript strict patterns and best practices.
  Trigger: When implementing or refactoring TypeScript in .ts/.tsx (types, interfaces, generics, const maps, type guards, removing any, tightening unknown).
description: "Trigger: When implementing or refactoring TypeScript in `.ts` or `.tsx`, including types, interfaces, generics, type guards, const maps, and stricter unknown handling. Enforces strict TypeScript modeling patterns."
license: Apache-2.0
metadata:
  author: prowler-cloud
@@ -12,131 +10,46 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task
---

## Activation Contract

Use this skill when the work changes TypeScript types or when runtime behavior depends on better compile-time modeling.

## Hard Rules

- Prefer strict, expressive types over `any`; use `unknown`, generics, or narrow unions instead.
- Model reusable literals from `as const` objects when values exist at runtime.
- Keep interfaces flat; extract nested object shapes into named types.
- Use discriminated unions when props or fields are only valid in coordinated sets.
- Import types with `import type` when only the type is needed.

## Decision Gates

| Question | Action |
|---|---|
| Need both runtime values and a type union? | Create a const object and derive the type from it. |
| Is a value shape deeply nested inline? | Extract dedicated named interfaces or types. |
| Are multiple optional props semantically coupled? | Replace them with discriminated union branches. |
| Is the input truly unknown? | Accept `unknown` and narrow with a type guard. |
| Are you duplicating a mapped or transformed shape manually? | Reach for utility types before inventing parallel interfaces. |

## Execution Steps

1. Identify the domain shape that needs stronger typing.
2. Replace `any` or weak optionals with precise unions, generics, or guards.
3. Convert literal unions to const-derived types when runtime values matter.
4. Flatten nested inline objects into named interfaces.
5. Use utility types for projections, partials, and derived shapes.
6. Re-check imports and convert type-only imports to `import type` where appropriate.
7. Validate that invalid states are now rejected by the type system.

## Output Contract

- Summarize the type-system improvement made.
- Call out any invalid state now prevented at compile time.
- Mention the main pattern used: const-derived type, discriminated union, utility type, or type guard.

## Const Types Pattern (REQUIRED)

```typescript
// ✅ ALWAYS: Create const object first, then extract type
const STATUS = {
  ACTIVE: "active",
  INACTIVE: "inactive",
  PENDING: "pending",
} as const;

type Status = (typeof STATUS)[keyof typeof STATUS];

// ❌ NEVER: Direct union types
type Status = "active" | "inactive" | "pending";
```

**Why?** Single source of truth, runtime values, autocomplete, easier refactoring.

## Flat Interfaces (REQUIRED)

```typescript
// ✅ ALWAYS: One level depth, nested objects → dedicated interface
interface UserAddress {
  street: string;
  city: string;
}

interface User {
  id: string;
  name: string;
  address: UserAddress; // Reference, not inline
}

interface Admin extends User {
  permissions: string[];
}

// ❌ NEVER: Inline nested objects
interface User {
  address: { street: string; city: string }; // NO!
}
```

## Never Use `any`

```typescript
// ✅ Use unknown for truly unknown types
function parse(input: unknown): User {
  if (isUser(input)) return input;
  throw new Error("Invalid input");
}

// ✅ Use generics for flexible types
function first<T>(arr: T[]): T | undefined {
  return arr[0];
}

// ❌ NEVER
function parse(input: any): any { }
```

## Utility Types

```typescript
Pick<User, "id" | "name">   // Select fields
Omit<User, "id">            // Exclude fields
Partial<User>               // All optional
Required<User>              // All required
Readonly<User>              // All readonly
Record<string, User>        // Object type
Extract<Union, "a" | "b">   // Extract from union
Exclude<Union, "a">         // Exclude from union
NonNullable<T | null>       // Remove null/undefined
ReturnType<typeof fn>       // Function return type
Parameters<typeof fn>       // Function params tuple
```

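A small sketch of how these utilities compose in practice (the `User` shape here is illustrative):

```typescript
interface User {
  id: string;
  name: string;
  email: string | null;
}

// Projection: only the fields a list view needs.
type UserSummary = Pick<User, "id" | "name">;

// Partial update payload: every field optional, id never patchable.
type UserPatch = Partial<Omit<User, "id">>;

const summary: UserSummary = { id: "u1", name: "Ada" };
const patch: UserPatch = { email: "ada@example.com" };
```

Deriving `UserSummary` and `UserPatch` from `User` means a change to the base interface propagates automatically instead of drifting across hand-written copies.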
## Type Guards

```typescript
function isUser(value: unknown): value is User {
  return (
    typeof value === "object" &&
    value !== null &&
    "id" in value &&
    "name" in value
  );
}
```

## Coupled Optional Props (REQUIRED)

Do not model semantically coupled props as independent optionals — this allows invalid half-states that compile but break at runtime. Use discriminated unions with `never` to make invalid combinations impossible.

```typescript
// ❌ BEFORE: Independent optionals — half-states allowed
interface PaginationProps {
  onPageChange?: (page: number) => void;
  pageSize?: number;
  currentPage?: number;
}

// ✅ AFTER: Discriminated union — shape is all-or-nothing
type ControlledPagination = {
  controlled: true;
  currentPage: number;
  pageSize: number;
  onPageChange: (page: number) => void;
};

type UncontrolledPagination = {
  controlled: false;
  currentPage?: never;
  pageSize?: never;
  onPageChange?: never;
};

type PaginationProps = ControlledPagination | UncontrolledPagination;
```

**Key rule:** If two or more props are only meaningful together, they belong to the same discriminated union branch. Mixing them as independent optionals shifts correctness responsibility from the type system to runtime guards.

## Import Types

```typescript
import type { User } from "./types";
import { createUser, type Config } from "./utils";
```

## References

- [React 19 skill](../react-19/SKILL.md)
- [Zod 4 skill](../zod-4/SKILL.md)
- [Repository agent rules](../../AGENTS.md)

+35
-175
@@ -1,8 +1,6 @@
---
name: vitest
description: >
  Vitest unit testing patterns with React Testing Library.
  Trigger: When writing unit tests for React components, hooks, or utilities.
description: "Trigger: When writing or refactoring Vitest tests for React components, hooks, or UI utilities. Defines unit and integration testing patterns with React Testing Library."
license: Apache-2.0
metadata:
  author: prowler-cloud
@@ -16,186 +14,48 @@ metadata:
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, Task
---

> **For E2E tests**: Use `prowler-test-ui` skill (Playwright).
> This skill covers **unit/integration tests** with Vitest + React Testing Library.

## Activation Contract

Use this skill for UI unit and integration tests built with Vitest and React Testing Library; for browser E2E flows, switch to `prowler-test-ui` instead.

## Hard Rules

- Structure tests with Given/When/Then intent.
- Prefer behavior-oriented `describe` blocks grouped by condition, not by implementation method.
- Query the screen by accessibility priority first: role, label, placeholder, text, then test id.
- Use `userEvent` for interactions unless a lower-level event is explicitly required.
- Keep async assertions focused: one expectation per `waitFor` block.
- Restore mocks between tests.

## Decision Gates

| Question | Action |
|---|---|
| Testing a browser flow across pages? | Use `prowler-test-ui`, not Vitest. |
| Need to interact like a user? | Use `userEvent.setup()` and await the interaction. |
| Element appears later? | Use `findBy*` or `waitFor` appropriately. |
| Need a selector? | Prefer accessible queries before `getByTestId`. |
| Thinking about testing internals? | Stop and assert user-visible behavior instead. |

## Execution Steps

1. Confirm the test belongs in unit/integration scope, not Playwright.
2. Read nearby tests to match file placement and helper patterns.
3. Write or update the spec using AAA comments when clarity helps.
4. Render through public component APIs and interact through accessible queries.
5. Use `userEvent` for user actions and async queries for delayed UI.
6. Isolate mocks and restore them after each test.
7. Run only the relevant Vitest target and verify the expected behavior.

## Output Contract

- State whether the test covers a component, hook, or utility.
- Report the main query and interaction patterns used.
- Mention the exact Vitest command or filter used for validation.
- Call out if E2E coverage was intentionally out of scope.

## Test Structure (REQUIRED)

Use **Given/When/Then** (AAA) pattern with comments:

```typescript
it("should update user name when form is submitted", async () => {
  // Given - Arrange
  const user = userEvent.setup();
  const onSubmit = vi.fn();
  render(<UserForm onSubmit={onSubmit} />);

  // When - Act
  await user.type(screen.getByLabelText(/name/i), "John");
  await user.click(screen.getByRole("button", { name: /submit/i }));

  // Then - Assert
  expect(onSubmit).toHaveBeenCalledWith({ name: "John" });
});
```

---

## Describe Block Organization

```typescript
describe("ComponentName", () => {
  describe("when [condition]", () => {
    it("should [expected behavior]", () => {});
  });
});
```

**Group by behavior, NOT by method.**

---

## Query Priority (REQUIRED)

| Priority | Query | Use Case |
|----------|-------|----------|
| 1 | `getByRole` | Buttons, inputs, headings |
| 2 | `getByLabelText` | Form fields |
| 3 | `getByPlaceholderText` | Inputs without label |
| 4 | `getByText` | Static text |
| 5 | `getByTestId` | Last resort only |

```typescript
// ✅ GOOD
screen.getByRole("button", { name: /submit/i });
screen.getByLabelText(/email/i);

// ❌ BAD
container.querySelector(".btn-primary");
```

---

## userEvent over fireEvent (REQUIRED)

```typescript
// ✅ ALWAYS use userEvent
const user = userEvent.setup();
await user.click(button);
await user.type(input, "hello");

// ❌ NEVER use fireEvent for interactions
fireEvent.click(button);
```

---

## Async Testing Patterns

```typescript
// ✅ findBy for elements that appear async
const element = await screen.findByText(/loaded/i);

// ✅ waitFor for assertions
await waitFor(() => {
  expect(screen.getByText(/success/i)).toBeInTheDocument();
});

// ✅ ONE assertion per waitFor
await waitFor(() => expect(mockFn).toHaveBeenCalled());
await waitFor(() => expect(screen.getByText(/done/i)).toBeVisible());

// ❌ NEVER multiple assertions in waitFor
await waitFor(() => {
  expect(mockFn).toHaveBeenCalled();
  expect(screen.getByText(/done/i)).toBeVisible(); // Slower failures
});
```

---

## Mocking

```typescript
// Basic mock
const handleClick = vi.fn();

// Mock with return value
const fetchUser = vi.fn().mockResolvedValue({ name: "John" });

// Always clean up
afterEach(() => {
  vi.restoreAllMocks();
});
```

### vi.spyOn vs vi.mock

| Method | When to Use |
|--------|-------------|
| `vi.spyOn` | Observe without replacing (PREFERRED) |
| `vi.mock` | Replace entire module (use sparingly) |

---

## Common Matchers

```typescript
// Presence
expect(element).toBeInTheDocument();
expect(element).toBeVisible();

// State
expect(button).toBeDisabled();
expect(input).toHaveValue("text");
expect(checkbox).toBeChecked();

// Content
expect(element).toHaveTextContent(/hello/i);
expect(element).toHaveAttribute("href", "/home");

// Functions
expect(fn).toHaveBeenCalledWith(arg1, arg2);
expect(fn).toHaveBeenCalledTimes(2);
```

---

## What NOT to Test

```typescript
// ❌ Internal state
expect(component.state.isLoading).toBe(true);

// ❌ Third-party libraries
expect(axios.get).toHaveBeenCalled();

// ❌ Static content (unless conditional)
expect(screen.getByText("Welcome")).toBeInTheDocument();

// ✅ User-visible behavior
expect(screen.getByRole("button")).toBeDisabled();
```

---

## File Organization

```
components/
├── Button/
│   ├── Button.tsx
│   ├── Button.test.tsx   # Co-located
│   └── index.ts
```

---

## Commands

```bash
pnpm test              # Watch mode
pnpm test:run          # Single run
pnpm test:coverage     # With coverage
pnpm test Button       # Filter by name
```

## References

- [TDD skill](../tdd/SKILL.md)
- [Prowler UI E2E skill](../prowler-test-ui/SKILL.md)
- [Repository agent rules](../../AGENTS.md)