From dce05295ef417fd4a8c1d84c3718d8c820fabd26 Mon Sep 17 00:00:00 2001 From: Pepe Fagoaga Date: Thu, 22 Jan 2026 13:54:06 +0100 Subject: [PATCH] chore(skills): Improve Django and DRF skills (#9831) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Adrián Jesús Peña Rodríguez --- AGENTS.md | 11 +- api/AGENTS.md | 10 +- mcp_server/AGENTS.md | 2 + skills/django-drf/SKILL.md | 599 ++++++++++++++---- skills/django-drf/assets/security_patterns.py | 159 +++++ .../django-drf/references/file-locations.md | 154 +++++ .../references/json-api-conventions.md | 116 ++++ skills/jsonapi/SKILL.md | 271 ++++++++ skills/prowler-api/SKILL.md | 506 +++++++++++++-- skills/prowler-api/assets/celery_patterns.py | 319 ++++++++++ .../prowler-api/assets/security_patterns.py | 207 ++++++ skills/prowler-api/references/api-docs.md | 21 - .../prowler-api/references/configuration.md | 282 +++++++++ .../prowler-api/references/file-locations.md | 128 ++++ .../references/modeling-decisions.md | 274 ++++++++ .../references/production-settings.md | 180 ++++++ skills/prowler-commit/SKILL.md | 180 ++++++ skills/prowler-test-api/SKILL.md | 211 +++--- skills/prowler-test-api/assets/api_test.py | 371 +++++++++++ .../references/test-api-docs.md | 222 ++++++- skills/skill-sync/assets/sync.sh | 2 + ui/AGENTS.md | 2 + 22 files changed, 3887 insertions(+), 340 deletions(-) create mode 100644 skills/django-drf/assets/security_patterns.py create mode 100644 skills/django-drf/references/file-locations.md create mode 100644 skills/django-drf/references/json-api-conventions.md create mode 100644 skills/jsonapi/SKILL.md create mode 100644 skills/prowler-api/assets/celery_patterns.py create mode 100644 skills/prowler-api/assets/security_patterns.py delete mode 100644 skills/prowler-api/references/api-docs.md create mode 100644 skills/prowler-api/references/configuration.md create mode 100644 skills/prowler-api/references/file-locations.md create mode 100644 
skills/prowler-api/references/modeling-decisions.md create mode 100644 skills/prowler-api/references/production-settings.md create mode 100644 skills/prowler-commit/SKILL.md create mode 100644 skills/prowler-test-api/assets/api_test.py diff --git a/AGENTS.md b/AGENTS.md index d7bc86b5b1..e203deb845 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -20,6 +20,7 @@ Use these skills for detailed patterns on-demand: | `playwright` | Page Object Model, MCP workflow, selectors | [SKILL.md](skills/playwright/SKILL.md) | | `pytest` | Fixtures, mocking, markers, parametrize | [SKILL.md](skills/pytest/SKILL.md) | | `django-drf` | ViewSets, Serializers, Filters | [SKILL.md](skills/django-drf/SKILL.md) | +| `jsonapi` | Strict JSON:API v1.1 spec compliance | [SKILL.md](skills/jsonapi/SKILL.md) | | `zod-4` | New API (z.email(), z.uuid()) | [SKILL.md](skills/zod-4/SKILL.md) | | `zustand-5` | Persist, selectors, slices | [SKILL.md](skills/zustand-5/SKILL.md) | | `ai-sdk-5` | UIMessage, streaming, LangChain | [SKILL.md](skills/ai-sdk-5/SKILL.md) | @@ -40,6 +41,7 @@ Use these skills for detailed patterns on-demand: | `prowler-provider` | Add new cloud providers | [SKILL.md](skills/prowler-provider/SKILL.md) | | `prowler-changelog` | Changelog entries (keepachangelog.com) | [SKILL.md](skills/prowler-changelog/SKILL.md) | | `prowler-ci` | CI checks and PR gates (GitHub Actions) | [SKILL.md](skills/prowler-ci/SKILL.md) | +| `prowler-commit` | Professional commits (conventional-commits) | [SKILL.md](skills/prowler-commit/SKILL.md) | | `prowler-pr` | Pull request conventions | [SKILL.md](skills/prowler-pr/SKILL.md) | | `prowler-docs` | Documentation style guide | [SKILL.md](skills/prowler-docs/SKILL.md) | | `skill-creator` | Create new AI agent skills | [SKILL.md](skills/skill-creator/SKILL.md) | @@ -51,14 +53,19 @@ When performing these actions, ALWAYS invoke the corresponding skill FIRST: | Action | Skill | |--------|-------| | Add changelog entry for a PR or feature | `prowler-changelog` | +| 
Adding DRF pagination or permissions | `django-drf` | | Adding new providers | `prowler-provider` | | Adding services to existing providers | `prowler-provider` | | After creating/modifying a skill | `skill-sync` | | App Router / Server Actions | `nextjs-15` | | Building AI chat features | `ai-sdk-5` | +| Committing changes | `prowler-commit` | | Create PR that requires changelog entry | `prowler-changelog` | | Create a PR with gh pr create | `prowler-pr` | +| Creating API endpoints | `jsonapi` | +| Creating ViewSets, serializers, or filters in api/ | `django-drf` | | Creating Zod schemas | `zod-4` | +| Creating a git commit | `prowler-commit` | | Creating new checks | `prowler-sdk-check` | | Creating new skills | `skill-creator` | | Creating/modifying Prowler UI components | `prowler-ui` | @@ -67,14 +74,16 @@ When performing these actions, ALWAYS invoke the corresponding skill FIRST: | Debug why a GitHub Actions job is failing | `prowler-ci` | | Fill .github/pull_request_template.md (Context/Description/Steps to review/Checklist) | `prowler-pr` | | General Prowler development questions | `prowler` | -| Generic DRF patterns | `django-drf` | +| Implementing JSON:API endpoints | `django-drf` | | Inspect PR CI checks and gates (.github/workflows/*) | `prowler-ci` | | Inspect PR CI workflows (.github/workflows/*): conventional-commit, pr-check-changelog, pr-conflict-checker, labeler | `prowler-pr` | | Mapping checks to compliance controls | `prowler-compliance` | | Mocking AWS with moto in tests | `prowler-test-sdk` | +| Modifying API responses | `jsonapi` | | Regenerate AGENTS.md Auto-invoke tables (sync.sh) | `skill-sync` | | Review PR requirements: template, title conventions, changelog gate | `prowler-pr` | | Review changelog format and conventions | `prowler-changelog` | +| Reviewing JSON:API compliance | `jsonapi` | | Reviewing compliance framework PRs | `prowler-compliance-review` | | Testing RLS tenant isolation | `prowler-test-api` | | Troubleshoot why a skill 
is missing from AGENTS.md auto-invoke | `skill-sync` | diff --git a/api/AGENTS.md b/api/AGENTS.md index b4a488a12a..399f12b691 100644 --- a/api/AGENTS.md +++ b/api/AGENTS.md @@ -4,6 +4,7 @@ > - [`prowler-api`](../skills/prowler-api/SKILL.md) - Models, Serializers, Views, RLS patterns > - [`prowler-test-api`](../skills/prowler-test-api/SKILL.md) - Testing patterns (pytest-django) > - [`django-drf`](../skills/django-drf/SKILL.md) - Generic DRF patterns +> - [`jsonapi`](../skills/jsonapi/SKILL.md) - Strict JSON:API v1.1 spec compliance > - [`pytest`](../skills/pytest/SKILL.md) - Generic pytest patterns ### Auto-invoke Skills @@ -13,10 +14,17 @@ When performing these actions, ALWAYS invoke the corresponding skill FIRST: | Action | Skill | |--------|-------| | Add changelog entry for a PR or feature | `prowler-changelog` | +| Adding DRF pagination or permissions | `django-drf` | +| Committing changes | `prowler-commit` | | Create PR that requires changelog entry | `prowler-changelog` | +| Creating API endpoints | `jsonapi` | +| Creating ViewSets, serializers, or filters in api/ | `django-drf` | +| Creating a git commit | `prowler-commit` | | Creating/modifying models, views, serializers | `prowler-api` | -| Generic DRF patterns | `django-drf` | +| Implementing JSON:API endpoints | `django-drf` | +| Modifying API responses | `jsonapi` | | Review changelog format and conventions | `prowler-changelog` | +| Reviewing JSON:API compliance | `jsonapi` | | Testing RLS tenant isolation | `prowler-test-api` | | Update CHANGELOG.md in any component | `prowler-changelog` | | Writing Prowler API tests | `prowler-test-api` | diff --git a/mcp_server/AGENTS.md b/mcp_server/AGENTS.md index 24621c2755..c8f77bd4b1 100644 --- a/mcp_server/AGENTS.md +++ b/mcp_server/AGENTS.md @@ -9,7 +9,9 @@ When performing these actions, ALWAYS invoke the corresponding skill FIRST: | Action | Skill | |--------|-------| | Add changelog entry for a PR or feature | `prowler-changelog` | +| Committing changes 
| `prowler-commit` | | Create PR that requires changelog entry | `prowler-changelog` | +| Creating a git commit | `prowler-commit` | | Review changelog format and conventions | `prowler-changelog` | | Update CHANGELOG.md in any component | `prowler-changelog` | | Working on MCP server tools | `prowler-mcp` | diff --git a/skills/django-drf/SKILL.md b/skills/django-drf/SKILL.md index df740a3e4d..93ea6219f1 100644 --- a/skills/django-drf/SKILL.md +++ b/skills/django-drf/SKILL.md @@ -2,185 +2,504 @@ name: django-drf description: > Django REST Framework patterns. - Trigger: When implementing generic DRF APIs (ViewSets, serializers, routers, permissions, filtersets). For Prowler API specifics (RLS/JSON:API), also use prowler-api. + Trigger: When implementing generic DRF APIs (ViewSets, serializers, routers, permissions, filtersets). For Prowler API specifics (RLS/RBAC/Providers), also use prowler-api. license: Apache-2.0 metadata: author: prowler-cloud - version: "1.0" + version: "1.2.0" scope: [root, api] - auto_invoke: "Generic DRF patterns" + auto_invoke: + - "Creating ViewSets, serializers, or filters in api/" + - "Implementing JSON:API endpoints" + - "Adding DRF pagination or permissions" allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task --- -## ViewSet Pattern +## Critical Patterns -```python -from rest_framework import viewsets, status -from rest_framework.response import Response -from rest_framework.decorators import action +- ALWAYS separate serializers by operation: Read / Create / Update / Include +- ALWAYS use `filterset_class` for complex filtering (not `filterset_fields`) +- ALWAYS validate unknown fields in write serializers (inherit `BaseWriteSerializer`) +- ALWAYS use `select_related`/`prefetch_related` in `get_queryset()` to avoid N+1 +- ALWAYS handle `swagger_fake_view` in `get_queryset()` for schema generation +- ALWAYS use `@extend_schema_field` for OpenAPI docs on `SerializerMethodField` +- NEVER put business logic in 
serializers - use services/utils +- NEVER use auto-increment PKs - use UUIDv4 or UUIDv7 +- NEVER use trailing slashes in URLs (`trailing_slash=False`) -class UserViewSet(viewsets.ModelViewSet): - queryset = User.objects.all() - serializer_class = UserSerializer - filterset_class = UserFilter - permission_classes = [IsAuthenticated] +> **Note:** `swagger_fake_view` is specific to **drf-spectacular** for OpenAPI schema generation. - def get_serializer_class(self): - if self.action == "create": - return UserCreateSerializer - if self.action in ["update", "partial_update"]: - return UserUpdateSerializer - return UserSerializer +--- - @action(detail=True, methods=["post"]) - def activate(self, request, pk=None): - user = self.get_object() - user.is_active = True - user.save() - return Response({"status": "activated"}) +## Implementation Checklist + +When implementing a new endpoint, review these patterns in order: + +| # | Pattern | Reference | Key Points | +|---|---------|-----------|------------| +| 1 | **Models** | `api/models.py` | UUID PK, `inserted_at`/`updated_at`, `JSONAPIMeta.resource_name` | +| 2 | **ViewSets** | `api/base_views.py`, `api/v1/views.py` | Inherit `BaseRLSViewSet`, `get_queryset()` with N+1 prevention | +| 3 | **Serializers** | `api/v1/serializers.py` | Separate Read/Create/Update/Include, inherit `BaseWriteSerializer` | +| 4 | **Filters** | `api/filters.py` | Use `filterset_class`, inherit base filter classes | +| 5 | **Permissions** | `api/base_views.py` | `required_permissions`, `set_required_permissions()` | +| 6 | **Pagination** | `api/pagination.py` | Custom pagination class if needed | +| 7 | **URL Routing** | `api/v1/urls.py` | `trailing_slash=False`, kebab-case paths | +| 8 | **OpenAPI Schema** | `api/v1/views.py` | `@extend_schema_view` with drf-spectacular | +| 9 | **Tests** | `api/tests/test_views.py` | JSON:API content type, fixture patterns | + +> **Full file paths**: See [references/file-locations.md](references/file-locations.md) 
+ +--- + +## Decision Trees + +### Which Serializer? ``` +GET list/retrieve → Serializer +POST create → CreateSerializer +PATCH update → UpdateSerializer +?include=... → IncludeSerializer +``` + +### Which Base Serializer? +``` +Read-only serializer → BaseModelSerializerV1 +Create with tenant_id → RLSSerializer + BaseWriteSerializer (auto-injects tenant_id on create) +Update with validation → BaseWriteSerializer (tenant_id already exists on object) +Non-model data → BaseSerializerV1 +``` + +### Which Filter Base? +``` +Direct FK to Provider → BaseProviderFilter +FK via Scan → BaseScanProviderFilter +No provider relation → FilterSet +``` + +### Which Base ViewSet? +``` +RLS-protected model → BaseRLSViewSet (most common) +Tenant operations → BaseTenantViewset +User operations → BaseUserViewset +No RLS required → BaseViewSet (rare) +``` + +### Resource Name Format? +``` +Single word model → plural lowercase (Provider → providers) +Multi-word model → plural lowercase kebab (ProviderGroup → provider-groups) +Through/join model → parent-child pattern (UserRoleRelationship → user-roles) +Aggregation/overview → descriptive kebab plural (ComplianceOverview → compliance-overviews) +``` + +--- ## Serializer Patterns +### Base Class Hierarchy + ```python -from rest_framework import serializers - -# Read Serializer -class UserSerializer(serializers.ModelSerializer): - full_name = serializers.SerializerMethodField() - +# Read serializer (most common) +class ProviderSerializer(RLSSerializer): class Meta: - model = User - fields = ["id", "email", "full_name", "created_at"] - read_only_fields = ["id", "created_at"] - - def get_full_name(self, obj): - return f"{obj.first_name} {obj.last_name}" - -# Create Serializer -class UserCreateSerializer(serializers.ModelSerializer): - password = serializers.CharField(write_only=True) + model = Provider + fields = ["id", "provider", "uid", "alias", "connected", "inserted_at"] +# Write serializer (validates unknown fields) +class 
ProviderCreateSerializer(RLSSerializer, BaseWriteSerializer): class Meta: - model = User - fields = ["email", "password", "first_name", "last_name"] + model = Provider + fields = ["provider", "uid", "alias"] - def create(self, validated_data): - password = validated_data.pop("password") - user = User(**validated_data) - user.set_password(password) - user.save() - return user - -# Update Serializer -class UserUpdateSerializer(serializers.ModelSerializer): +# Include serializer (sparse fields for ?include=) +class ProviderIncludeSerializer(RLSSerializer): class Meta: - model = User - fields = ["first_name", "last_name"] + model = Provider + fields = ["id", "alias"] # Minimal fields ``` -## Filters +### SerializerMethodField with OpenAPI ```python -from django_filters import rest_framework as filters +from drf_spectacular.utils import extend_schema_field -class UserFilter(filters.FilterSet): - email = filters.CharFilter(lookup_expr="icontains") - is_active = filters.BooleanFilter() - created_after = filters.DateTimeFilter( - field_name="created_at", - lookup_expr="gte" - ) - created_before = filters.DateTimeFilter( - field_name="created_at", - lookup_expr="lte" - ) +class ProviderSerializer(RLSSerializer): + connection = serializers.SerializerMethodField(read_only=True) - class Meta: - model = User - fields = ["email", "is_active"] + @extend_schema_field({ + "type": "object", + "properties": { + "connected": {"type": "boolean"}, + "last_checked_at": {"type": "string", "format": "date-time"}, + }, + }) + def get_connection(self, obj): + return { + "connected": obj.connected, + "last_checked_at": obj.connection_last_checked_at, + } ``` -## Permissions +### Included Serializers (JSON:API) ```python -from rest_framework.permissions import BasePermission +class ScanSerializer(RLSSerializer): + included_serializers = { + "provider": "api.v1.serializers.ProviderIncludeSerializer", + } +``` -class IsOwner(BasePermission): +### Sensitive Data Masking + +```python +def 
to_representation(self, instance): + data = super().to_representation(instance) + # Mask by default, expose only on explicit request + fields_param = self.context.get("request").query_params.get("fields[my-model]", "") + if "api_key" in fields_param: + data["api_key"] = instance.api_key_decoded + else: + data["api_key"] = "****" if instance.api_key else None + return data +``` + +--- + +## ViewSet Patterns + +### get_queryset() with N+1 Prevention + +**Always combine** `swagger_fake_view` check with `select_related`/`prefetch_related`: + +```python +def get_queryset(self): + # REQUIRED: Return empty queryset for OpenAPI schema generation + if getattr(self, "swagger_fake_view", False): + return Provider.objects.none() + + # N+1 prevention: eager load relationships + return Provider.objects.select_related( + "tenant", + ).prefetch_related( + "provider_groups", + Prefetch("tags", queryset=ProviderTag.objects.filter(tenant_id=self.request.tenant_id)), + ) +``` + +> **Why swagger_fake_view?** drf-spectacular introspects ViewSets to generate OpenAPI schemas. Without this check, it executes real queries and can fail without request context. 
+ +### Action-Specific Serializers + +```python +def get_serializer_class(self): + if self.action == "create": + return ProviderCreateSerializer + elif self.action == "partial_update": + return ProviderUpdateSerializer + elif self.action in ["connection", "destroy"]: + return TaskSerializer + return ProviderSerializer +``` + +### Dynamic Permissions per Action + +```python +class ProviderViewSet(BaseRLSViewSet): + required_permissions = [Permissions.MANAGE_PROVIDERS] + + def set_required_permissions(self): + if self.action in ["list", "retrieve"]: + self.required_permissions = [] # Read-only = no permission + else: + self.required_permissions = [Permissions.MANAGE_PROVIDERS] +``` + +### Cache Decorator + +```python +from django.utils.decorators import method_decorator +from django.views.decorators.cache import cache_control + +CACHE_DECORATOR = cache_control( + max_age=django_settings.CACHE_MAX_AGE, + stale_while_revalidate=django_settings.CACHE_STALE_WHILE_REVALIDATE, +) + +@method_decorator(CACHE_DECORATOR, name="list") +@method_decorator(CACHE_DECORATOR, name="retrieve") +class ProviderViewSet(BaseRLSViewSet): + pass +``` + +### Custom Actions + +```python +# Detail action (operates on single object) +@action(detail=True, methods=["post"], url_name="connection") +def connection(self, request, pk=None): + instance = self.get_object() + # Process instance... + +# List action (operates on collection) +@action(detail=False, methods=["get"], url_name="metadata") +def metadata(self, request): + queryset = self.filter_queryset(self.get_queryset()) + # Aggregate over queryset... 
+``` + +--- + +## Filter Patterns + +### Base Filter Classes + +```python +class BaseProviderFilter(FilterSet): + """For models with direct FK to Provider""" + provider_id = UUIDFilter(field_name="provider__id", lookup_expr="exact") + provider_id__in = UUIDInFilter(field_name="provider__id", lookup_expr="in") + provider_type = ChoiceFilter(field_name="provider__provider", choices=Provider.ProviderChoices.choices) + +class BaseScanProviderFilter(FilterSet): + """For models with FK to Scan (Scan has FK to Provider)""" + provider_id = UUIDFilter(field_name="scan__provider__id", lookup_expr="exact") +``` + +### Custom Multi-Value Filters + +```python +class UUIDInFilter(BaseInFilter, UUIDFilter): + pass + +class CharInFilter(BaseInFilter, CharFilter): + pass + +class ChoiceInFilter(BaseInFilter, ChoiceFilter): + pass +``` + +### ArrayField Filtering + +```python +# Single value contains +region = CharFilter(method="filter_region") + +def filter_region(self, queryset, name, value): + return queryset.filter(resource_regions__contains=[value]) + +# Multi-value overlap +region__in = CharInFilter(field_name="resource_regions", lookup_expr="overlap") +``` + +### Date Range Validation + +```python +def filter_queryset(self, queryset): + # Require date filter for performance + if not (date_filters_provided): + raise ValidationError([{ + "detail": "At least one date filter is required", + "status": 400, + "source": {"pointer": "/data/attributes/inserted_at"}, + "code": "required", + }]) + + # Validate max range + if date_range > settings.FINDINGS_MAX_DAYS_IN_RANGE: + raise ValidationError(...) 
+ + return super().filter_queryset(queryset) +``` + +### Dynamic FilterSet Selection + +```python +def get_filterset_class(self): + if self.action in ["latest", "metadata_latest"]: + return LatestFindingFilter + return FindingFilter +``` + +### Enum Field Override + +```python +class Meta: + model = Finding + filter_overrides = { + FindingDeltaEnumField: {"filter_class": CharFilter}, + StatusEnumField: {"filter_class": CharFilter}, + SeverityEnumField: {"filter_class": CharFilter}, + } +``` + +--- + +## Performance Patterns + +### PaginateByPkMixin + +For large querysets with expensive joins: + +```python +class PaginateByPkMixin: + def paginate_by_pk(self, request, base_queryset, manager, + select_related=None, prefetch_related=None): + # 1. Get PKs only (cheap) + pk_list = base_queryset.values_list("id", flat=True) + page = self.paginate_queryset(pk_list) + + # 2. Fetch full objects for just the page + queryset = manager.filter(id__in=page) + if select_related: + queryset = queryset.select_related(*select_related) + if prefetch_related: + queryset = queryset.prefetch_related(*prefetch_related) + + # 3. 
Re-sort to preserve DB ordering + queryset = sorted(queryset, key=lambda obj: page.index(obj.id)) + return self.get_paginated_response(self.get_serializer(queryset, many=True).data) +``` + +### Prefetch in Serializers + +```python +def get_tags(self, obj): + # Use prefetched tags if available + if hasattr(obj, "prefetched_tags"): + return {tag.key: tag.value for tag in obj.prefetched_tags} + # Fallback (causes N+1 if not prefetched) + return obj.get_tags(self.context.get("tenant_id")) +``` + +--- + +## Naming Conventions + +| Entity | Pattern | Example | +|--------|---------|---------| +| Serializer (read) | `Serializer` | `ProviderSerializer` | +| Serializer (create) | `CreateSerializer` | `ProviderCreateSerializer` | +| Serializer (update) | `UpdateSerializer` | `ProviderUpdateSerializer` | +| Serializer (include) | `IncludeSerializer` | `ProviderIncludeSerializer` | +| Filter | `Filter` | `ProviderFilter` | +| ViewSet | `ViewSet` | `ProviderViewSet` | + +--- + +## OpenAPI Documentation + +```python +from drf_spectacular.utils import extend_schema, extend_schema_view + +@extend_schema_view( + list=extend_schema(tags=["Provider"], summary="List all providers"), + retrieve=extend_schema(tags=["Provider"], summary="Retrieve provider"), + create=extend_schema(tags=["Provider"], summary="Create provider"), +) +@extend_schema(tags=["Provider"]) +class ProviderViewSet(BaseRLSViewSet): + pass +``` + +--- + +## API Security Patterns + +> **Full examples**: See [assets/security_patterns.py](assets/security_patterns.py) + +| Pattern | Key Points | +|---------|------------| +| **Input Validation** | Use `validate_()` for sanitization, `validate()` for cross-field | +| **Prevent Mass Assignment** | ALWAYS use explicit `fields` list, NEVER `__all__` or `exclude` | +| **Object-Level Permissions** | Implement `has_object_permission()` for ownership checks | +| **Rate Limiting** | Configure `DEFAULT_THROTTLE_RATES`, use per-view throttles for sensitive endpoints | +| **Prevent 
Info Disclosure** | Generic error messages, return 404 not 403 for unauthorized (prevents enumeration) | +| **SQL Injection** | ALWAYS use ORM parameterization, NEVER string interpolation in raw SQL | + +### Quick Reference + +```python +# Input validation in serializer +def validate_uid(self, value): + value = value.strip().lower() + if not re.match(r'^[a-z0-9-]+$', value): + raise serializers.ValidationError("Invalid format") + return value + +# Explicit fields (prevent mass assignment) +class Meta: + fields = ["name", "email"] # GOOD: whitelist + read_only_fields = ["id", "inserted_at"] # System fields + +# Object permission +class IsOwnerOrReadOnly(BasePermission): def has_object_permission(self, request, view, obj): + if request.method in SAFE_METHODS: + return True return obj.owner == request.user -class IsAdminOrReadOnly(BasePermission): - def has_permission(self, request, view): - if request.method in ["GET", "HEAD", "OPTIONS"]: - return True - return request.user.is_staff +# Throttling for sensitive endpoints +class BurstRateThrottle(UserRateThrottle): + rate = "10/minute" + +# Safe error messages (prevent enumeration) +def get_object(self): + try: + return super().get_object() + except Http404: + raise NotFound("Resource not found") # Generic, no internal IDs ``` -## Pagination - -```python -from rest_framework.pagination import PageNumberPagination - -class StandardPagination(PageNumberPagination): - page_size = 20 - page_size_query_param = "page_size" - max_page_size = 100 - -# settings.py -REST_FRAMEWORK = { - "DEFAULT_PAGINATION_CLASS": "api.pagination.StandardPagination", -} -``` - -## URL Routing - -```python -from rest_framework.routers import DefaultRouter - -router = DefaultRouter() -router.register(r"users", UserViewSet, basename="user") -router.register(r"posts", PostViewSet, basename="post") - -urlpatterns = [ - path("api/v1/", include(router.urls)), -] -``` - -## Testing - -```python -import pytest -from rest_framework import status -from 
rest_framework.test import APIClient - -@pytest.fixture -def api_client(): - return APIClient() - -@pytest.fixture -def authenticated_client(api_client, user): - api_client.force_authenticate(user=user) - return api_client - -@pytest.mark.django_db -class TestUserViewSet: - def test_list_users(self, authenticated_client): - response = authenticated_client.get("/api/v1/users/") - assert response.status_code == status.HTTP_200_OK - - def test_create_user(self, authenticated_client): - data = {"email": "new@test.com", "password": "pass123"} - response = authenticated_client.post("/api/v1/users/", data) - assert response.status_code == status.HTTP_201_CREATED -``` +--- ## Commands ```bash -python manage.py runserver -python manage.py makemigrations -python manage.py migrate -python manage.py createsuperuser -python manage.py shell +# Development +cd api && poetry run python src/backend/manage.py runserver +cd api && poetry run python src/backend/manage.py shell + +# Database +cd api && poetry run python src/backend/manage.py makemigrations +cd api && poetry run python src/backend/manage.py migrate + +# Testing +cd api && poetry run pytest -x --tb=short +cd api && poetry run make lint ``` + +--- + +## Resources + +### Local References +- **File Locations**: See [references/file-locations.md](references/file-locations.md) +- **JSON:API Conventions**: See [references/json-api-conventions.md](references/json-api-conventions.md) +- **Security Patterns**: See [assets/security_patterns.py](assets/security_patterns.py) + +### Context7 MCP (Recommended) + +**Prerequisite:** Install Context7 MCP server for up-to-date documentation lookup. 
+ +When implementing or debugging, query these libraries via `mcp_context7_query-docs`: + +| Library | Context7 ID | Use For | +|---------|-------------|---------| +| **Django** | `/websites/djangoproject_en_5_2` | Models, ORM, migrations | +| **DRF** | `/websites/django-rest-framework` | ViewSets, serializers, permissions | +| **drf-spectacular** | `/tfranzel/drf-spectacular` | OpenAPI schema, `@extend_schema` | + +**Example queries:** +``` +mcp_context7_query-docs(libraryId="/websites/django-rest-framework", query="ViewSet get_queryset best practices") +mcp_context7_query-docs(libraryId="/tfranzel/drf-spectacular", query="extend_schema examples for custom actions") +mcp_context7_query-docs(libraryId="/websites/djangoproject_en_5_2", query="model constraints and indexes") +``` + +> **Note:** Use `mcp_context7_resolve-library-id` first if you need to find the correct library ID. + +### External Docs +- **DRF Docs**: https://www.django-rest-framework.org/ +- **DRF JSON:API**: https://django-rest-framework-json-api.readthedocs.io/ +- **drf-spectacular**: https://drf-spectacular.readthedocs.io/ +- **django-filter**: https://django-filter.readthedocs.io/ diff --git a/skills/django-drf/assets/security_patterns.py b/skills/django-drf/assets/security_patterns.py new file mode 100644 index 0000000000..17f453091a --- /dev/null +++ b/skills/django-drf/assets/security_patterns.py @@ -0,0 +1,159 @@ +# Example: DRF API Security Patterns +# Reference for django-drf skill + +import re + +from rest_framework import serializers, status, viewsets +from rest_framework.exceptions import NotFound +from rest_framework.permissions import SAFE_METHODS, BasePermission, IsAuthenticated +from rest_framework.throttling import UserRateThrottle + + +# ============================================================================= +# INPUT VALIDATION +# ============================================================================= + + +class ProviderCreateSerializer(serializers.Serializer): + 
"""Example: Input validation in serializers.""" + + uid = serializers.CharField(max_length=255) + provider = serializers.CharField() + + def validate_uid(self, value): + """Field-level validation with sanitization.""" + # Sanitize: strip whitespace, normalize + value = value.strip().lower() + # Validate format + if not re.match(r"^[a-z0-9-]+$", value): + raise serializers.ValidationError( + "UID must be alphanumeric with hyphens only" + ) + return value + + def validate(self, attrs): + """Cross-field validation.""" + if attrs.get("provider") == "aws" and len(attrs.get("uid", "")) != 12: + raise serializers.ValidationError( + {"uid": "AWS account ID must be 12 digits"} + ) + return attrs + + +# ============================================================================= +# PREVENT MASS ASSIGNMENT +# ============================================================================= + + +class UserUpdateSerializer(serializers.ModelSerializer): + """Example: Explicit field whitelist prevents mass assignment.""" + + class Meta: + # GOOD: Explicit whitelist + fields = ["name", "email"] + # BAD: fields = "__all__" # Exposes is_staff, is_superuser + # BAD: exclude = ["password"] # New fields auto-exposed + + +class ProviderSerializer(serializers.ModelSerializer): + """Example: Read-only fields for computed/system values.""" + + class Meta: + fields = ["id", "uid", "alias", "connected", "inserted_at"] + # Cannot be set via API - only read + read_only_fields = ["id", "connected", "inserted_at"] + + +# ============================================================================= +# OBJECT-LEVEL PERMISSIONS +# ============================================================================= + + +class IsOwnerOrReadOnly(BasePermission): + """Example: Object-level permission check.""" + + def has_object_permission(self, request, view, obj): + # Read permissions for any authenticated request + if request.method in SAFE_METHODS: + return True + # Write permissions only for owner + return 
obj.owner == request.user + + +class DocumentViewSet(viewsets.ModelViewSet): + """Example: ViewSet with object-level permissions.""" + + permission_classes = [IsAuthenticated, IsOwnerOrReadOnly] + + +# ============================================================================= +# RATE LIMITING (THROTTLING) +# ============================================================================= + +# In settings.py: +# REST_FRAMEWORK = { +# "DEFAULT_THROTTLE_CLASSES": [ +# "rest_framework.throttling.AnonRateThrottle", +# "rest_framework.throttling.UserRateThrottle", +# ], +# "DEFAULT_THROTTLE_RATES": { +# "anon": "100/hour", +# "user": "1000/hour", +# }, +# } + + +class BurstRateThrottle(UserRateThrottle): + """Example: Custom throttle for sensitive endpoints.""" + + rate = "10/minute" + + +class PasswordResetViewSet(viewsets.ViewSet): + """Example: Per-view throttling for sensitive endpoints.""" + + throttle_classes = [BurstRateThrottle] + + +# ============================================================================= +# PREVENT INFORMATION DISCLOSURE +# ============================================================================= + + +class SecureViewSet(viewsets.ModelViewSet): + """Example: Prevent information disclosure patterns.""" + + def get_object(self): + try: + return super().get_object() + except Exception: + # GOOD: Generic message - doesn't leak internal IDs or tenant info + raise NotFound("Resource not found") + # BAD: raise NotFound(f"Provider {pk} not found in tenant {tenant_id}") + + def get_queryset(self): + # Use 404 not 403 for unauthorized access (prevents enumeration) + # Filter by tenant - unauthorized users get 404, not 403 + return self.queryset.filter(tenant_id=self.request.tenant_id) + + +# ============================================================================= +# SQL INJECTION PREVENTION +# ============================================================================= + + +def safe_query_examples(user_input): + """Example: SQL injection 
prevention patterns.""" + from django.db import connection + + # GOOD: Parameterized via ORM + # Provider.objects.filter(uid=user_input) + # Provider.objects.extra(where=["uid = %s"], params=[user_input]) + + # GOOD: If raw SQL unavoidable, use parameterized queries + with connection.cursor() as cursor: + cursor.execute("SELECT * FROM providers WHERE uid = %s", [user_input]) + + # BAD: String interpolation = SQL injection vulnerability + # Provider.objects.raw(f"SELECT * FROM providers WHERE uid = '{user_input}'") + # cursor.execute(f"SELECT * FROM providers WHERE uid = '{user_input}'") diff --git a/skills/django-drf/references/file-locations.md b/skills/django-drf/references/file-locations.md new file mode 100644 index 0000000000..30dab71550 --- /dev/null +++ b/skills/django-drf/references/file-locations.md @@ -0,0 +1,154 @@ +# Django-DRF File Locations + +## Core API Files + +| Pattern | File Path | Key Classes | +|---------|-----------|-------------| +| **Models** | `api/src/backend/api/models.py` | `Provider`, `Scan`, `Finding`, `Resource`, `StateChoices`, `StatusChoices` | +| **ViewSets** | `api/src/backend/api/v1/views.py` | `BaseViewSet`, `BaseRLSViewSet`, `BaseTenantViewset`, `BaseUserViewset` | +| **Serializers** | `api/src/backend/api/v1/serializers.py` | `BaseModelSerializerV1`, `BaseWriteSerializer`, `RLSSerializer` | +| **Filters** | `api/src/backend/api/filters.py` | `BaseProviderFilter`, `BaseScanProviderFilter`, `CommonFindingFilters` | +| **URL Routing** | `api/src/backend/api/v1/urls.py` | Router setup, nested routes | +| **Pagination** | `api/src/backend/api/pagination.py` | `LimitedJsonApiPageNumberPagination` | +| **Permissions** | `api/src/backend/api/decorators.py` | `HasPermissions`, `@check_permissions` | +| **RBAC** | `api/src/backend/api/rbac/permissions.py` | `Permissions` enum, `get_role()`, `get_providers()` | +| **Settings** | `api/src/backend/config/settings.py` | `REST_FRAMEWORK` config | + +## ViewSet Hierarchy + +``` +BaseViewSet 
(minimal - no RLS/auth) + │ + ├── BaseRLSViewSet (+ tenant filtering, RLS-protected models) + │ └── Most ViewSets inherit this + │ + ├── BaseTenantViewset (+ Tenant-specific logic) + │ └── TenantViewSet + │ + └── BaseUserViewset (+ User-specific logic) + └── UserViewSet +``` + +## Serializer Hierarchy + +``` +BaseModelSerializerV1 (JSON:API defaults, read_only_fields) + │ + ├── RLSSerializer (auto-injects tenant_id from request) + │ └── Most model serializers inherit this + │ + └── BaseWriteSerializer (rejects unknown fields) + └── Create/Update serializers + ++ Mixins: + - IncludedResourcesValidationMixin (validates ?include= param) + - JSONAPIRelatedLinksSerializerMixin (adds related links) +``` + +## Filter Hierarchy + +``` +FilterSet (django-filter) + │ + ├── CommonFindingFilters (mixin for date ranges, delta, status) + │ + ├── BaseProviderFilter (provider_type, provider_uid, provider_alias) + │ │ + │ └── BaseScanProviderFilter (+ scan_id, scan filters) + │ + └── Resource-specific filters (ProviderFilter, ScanFilter, etc.) 
+ +Custom Filter Types: + - UUIDInFilter: Comma-separated UUIDs + - CharInFilter: Comma-separated strings + - DateFilter: ISO date parsing + - DateTimeFilter: ISO datetime parsing +``` + +## Testing Files + +| Pattern | File Path | Key Classes | +|---------|-----------|-------------| +| **ViewSet Tests** | `api/src/backend/api/tests/test_views.py` | Test patterns, fixtures | +| **RBAC Tests** | `api/src/backend/api/tests/test_rbac.py` | Permission tests | +| **Serializer Tests** | `api/src/backend/api/tests/test_serializers.py` | Validation tests | +| **Conftest** | `api/src/backend/conftest.py` | Shared fixtures | + +## Key Patterns + +### Filter Usage + +```python +# In filters.py +class ProviderFilter(BaseProviderFilter): + class Meta: + model = Provider + fields = { + "provider": ["exact", "in"], + "connected": ["exact"], + } + +# Custom filter method +def filter_severity(self, queryset, name, value): + if not value: + return queryset + return queryset.filter(severity__in=value) +``` + +### Serializer Usage + +```python +# Read serializer +class ProviderSerializer(RLSSerializer): + class Meta: + model = Provider + fields = ["id", "provider", "uid", "alias", "connected"] + +# Write serializer +class ProviderCreateSerializer(BaseWriteSerializer, RLSSerializer): + class Meta: + model = Provider + fields = ["provider", "uid", "alias"] +``` + +### ViewSet Action Pattern + +```python +@action(detail=True, methods=["post"], url_path="scan") +def trigger_scan(self, request, pk=None): + provider = self.get_object() + task = perform_scan_task.delay(...) 
+ return Response(status=status.HTTP_202_ACCEPTED) +``` + +## REST_FRAMEWORK Settings + +Located in `api/src/backend/config/settings.py`: + +```python +REST_FRAMEWORK = { + "PAGE_SIZE": 10, + "DEFAULT_PAGINATION_CLASS": "api.pagination.LimitedJsonApiPageNumberPagination", + "DEFAULT_PARSER_CLASSES": [ + "rest_framework_json_api.parsers.JSONParser", + "rest_framework.parsers.JSONParser", + ], + "DEFAULT_FILTER_BACKENDS": [ + "rest_framework_json_api.filters.QueryParameterValidationFilter", + "rest_framework_json_api.filters.OrderingFilter", + "rest_framework_json_api.django_filters.DjangoFilterBackend", + "rest_framework.filters.SearchFilter", + ], + "EXCEPTION_HANDLER": "rest_framework_json_api.exceptions.exception_handler", + # ... more settings +} +``` + +## JSON:API Resource Names + +Find all `JSONAPIMeta` declarations: +```bash +rg "resource_name" api/src/backend/api/models.py +``` + +Convention: kebab-case, plural (e.g., `provider-groups`, `mute-rules`) diff --git a/skills/django-drf/references/json-api-conventions.md b/skills/django-drf/references/json-api-conventions.md new file mode 100644 index 0000000000..c51326b0b2 --- /dev/null +++ b/skills/django-drf/references/json-api-conventions.md @@ -0,0 +1,116 @@ +# JSON:API Conventions + +## Content Type + +``` +Content-Type: application/vnd.api+json +Accept: application/vnd.api+json +``` + +## Query Parameters + +| Feature | Format | Example | +|---------|--------|---------| +| **Pagination** | `page[number]`, `page[size]` | `?page[number]=2&page[size]=20` | +| **Filtering** | `filter[field]`, `filter[field__lookup]` | `?filter[status]=FAIL&filter[inserted_at__gte]=2024-01-01` | +| **Sorting** | `sort` (prefix `-` for desc) | `?sort=-inserted_at,name` | +| **Sparse fields** | `fields[type]` | `?fields[providers]=id,alias,uid` | +| **Includes** | `include` | `?include=provider,scan` | +| **Search** | `filter[search]` | `?filter[search]=production` | + +## Filter Naming + +| Lookup | Django Filter | JSON:API 
Query | +|--------|--------------|----------------| +| Exact | `field` | `filter[field]=value` | +| Contains | `field__icontains` | `filter[field__icontains]=val` | +| In list | `field__in` | `filter[field__in]=a,b,c` | +| Greater/equal | `field__gte` | `filter[field__gte]=2024-01-01` | +| Less/equal | `field__lte` | `filter[field__lte]=2024-12-31` | +| Related field | `relation__field` | `filter[provider_id]=uuid` | + +## Request Format + +```json +{ + "data": { + "type": "providers", + "attributes": { + "provider": "aws", + "uid": "123456789012", + "alias": "Production" + } + } +} +``` + +## Response Format + +```json +{ + "data": { + "type": "providers", + "id": "550e8400-e29b-41d4-a716-446655440000", + "attributes": { + "provider": "aws", + "uid": "123456789012", + "alias": "Production", + "inserted_at": "2024-01-15T10:30:00Z" + }, + "relationships": { + "provider_groups": { + "data": [{"type": "provider-groups", "id": "..."}] + } + }, + "links": { + "self": "/api/v1/providers/550e8400-e29b-41d4-a716-446655440000" + } + }, + "meta": { + "version": "v1" + } +} +``` + +## Error Response Format + +```json +{ + "errors": [ + { + "detail": "Error message here", + "status": "400", + "source": {"pointer": "/data/attributes/field_name"}, + "code": "error_code" + } + ] +} +``` + +## Resource Naming Rules + +- Use **lowercase kebab-case** (hyphens, not underscores) +- Use **plural nouns** for collections +- Resource name in `JSONAPIMeta` MUST match URL path segment + +| Model | resource_name | URL Path | +|-------|---------------|----------| +| `Provider` | `providers` | `/api/v1/providers` | +| `ProviderGroup` | `provider-groups` | `/api/v1/provider-groups` | +| `ProviderSecret` | `provider-secrets` | `/api/v1/providers/secrets` | +| `ComplianceOverview` | `compliance-overviews` | `/api/v1/compliance-overviews` | +| `AttackPathsScan` | `attack-paths-scans` | `/api/v1/attack-paths-scans` | +| `TenantAPIKey` | `api-keys` | `/api/v1/api-keys` | +| `MuteRule` | `mute-rules` 
| `/api/v1/mute-rules` | + +## URL Endpoints + +| Operation | Method | URL Pattern | +|-----------|--------|-------------| +| List | GET | `/{resources}` | +| Create | POST | `/{resources}` | +| Retrieve | GET | `/{resources}/{id}` | +| Update | PATCH | `/{resources}/{id}` | +| Delete | DELETE | `/{resources}/{id}` | +| Relationship | * | `/{resources}/{id}/relationships/{relation}` | +| Nested list | GET | `/{parent}/{parent_id}/{resources}` | diff --git a/skills/jsonapi/SKILL.md b/skills/jsonapi/SKILL.md new file mode 100644 index 0000000000..a8959a2199 --- /dev/null +++ b/skills/jsonapi/SKILL.md @@ -0,0 +1,271 @@ +--- +name: jsonapi +description: > + Strict JSON:API v1.1 specification compliance. + Trigger: When creating or modifying API endpoints, reviewing API responses, or validating JSON:API compliance. +license: Apache-2.0 +metadata: + author: prowler-cloud + version: "1.0.0" + scope: [root, api] + auto_invoke: + - "Creating API endpoints" + - "Modifying API responses" + - "Reviewing JSON:API compliance" +--- + +## Use With django-drf + +This skill focuses on **spec compliance**. For **implementation patterns** (ViewSets, Serializers, Filters), use `django-drf` skill together with this one. 
+ +| Skill | Focus | +|-------|-------| +| `jsonapi` | What the spec requires (MUST/MUST NOT rules) | +| `django-drf` | How to implement it in DRF (code patterns) | + +**When creating/modifying endpoints, invoke BOTH skills.** + +--- + +## Before Implementing/Reviewing + +**ALWAYS validate against the latest spec** before creating or modifying endpoints: + +### Option 1: Context7 MCP (Preferred) + +If Context7 MCP is available, query the JSON:API spec directly: + +``` +mcp_context7_resolve-library-id(query="jsonapi specification") +mcp_context7_query-docs(libraryId="", query="[specific topic: relationships, errors, etc.]") +``` + +### Option 2: WebFetch (Fallback) + +If Context7 is not available, fetch from the official spec: + +``` +WebFetch(url="https://jsonapi.org/format/", prompt="Extract rules for [specific topic]") +``` + +This ensures compliance with the latest JSON:API version, even after spec updates. + +--- + +## Critical Rules (NEVER Break) + +### Document Structure +- NEVER include both `data` and `errors` in the same response +- ALWAYS include at least one of: `data`, `errors`, `meta` +- ALWAYS use `type` and `id` (string) in resource objects +- NEVER include `id` when creating resources (server generates it) + +### Content-Type +- ALWAYS use `Content-Type: application/vnd.api+json` +- ALWAYS use `Accept: application/vnd.api+json` +- NEVER add parameters to media type without `ext`/`profile` + +### Resource Objects +- ALWAYS use **string** for `id` (even if UUID) +- ALWAYS use **lowercase kebab-case** for `type` +- NEVER put `id` or `type` inside `attributes` +- NEVER include foreign keys in `attributes` - use `relationships` + +### Relationships +- ALWAYS include at least one of: `links`, `data`, or `meta` +- ALWAYS use resource linkage format: `{"type": "...", "id": "..."}` +- NEVER use raw IDs in relationships - always use linkage objects + +### Error Objects +- ALWAYS return errors as array: `{"errors": [...]}` +- ALWAYS include `status` as 
**string** (e.g., `"400"`, not `400`) +- ALWAYS include `source.pointer` for field-specific errors + +--- + +## HTTP Status Codes (Mandatory) + +| Operation | Success | Async | Conflict | Not Found | Forbidden | Bad Request | +|-----------|---------|-------|----------|-----------|-----------|-------------| +| **GET** | `200` | - | - | `404` | `403` | `400` | +| **POST** | `201` | `202` | `409` | `404` | `403` | `400` | +| **PATCH** | `200` | `202` | `409` | `404` | `403` | `400` | +| **DELETE** | `200`/`204` | `202` | - | `404` | `403` | - | + +### When to Use Each + +| Code | Use When | +|------|----------| +| `200 OK` | Successful GET, PATCH with response body, DELETE with response | +| `201 Created` | POST created resource (MUST include `Location` header) | +| `202 Accepted` | Async operation started (return task reference) | +| `204 No Content` | Successful DELETE, PATCH with no response body | +| `400 Bad Request` | Invalid query params, malformed request, unknown fields | +| `403 Forbidden` | Authentication ok but no permission, client-generated ID rejected | +| `404 Not Found` | Resource doesn't exist OR RLS hides it (never reveal which) | +| `409 Conflict` | Duplicate ID, type mismatch, relationship conflict | +| `415 Unsupported` | Wrong Content-Type header | + +--- + +## Document Structure + +### Success Response (Single) + +```json +{ + "data": { + "type": "providers", + "id": "550e8400-e29b-41d4-a716-446655440000", + "attributes": { + "alias": "Production", + "connected": true + }, + "relationships": { + "tenant": { + "data": {"type": "tenants", "id": "..."} + } + }, + "links": { + "self": "/api/v1/providers/550e8400-..." + } + }, + "links": { + "self": "/api/v1/providers/550e8400-..." 
+ } +} +``` + +### Success Response (List) + +```json +{ + "data": [ + {"type": "providers", "id": "...", "attributes": {...}}, + {"type": "providers", "id": "...", "attributes": {...}} + ], + "links": { + "self": "/api/v1/providers?page[number]=1", + "first": "/api/v1/providers?page[number]=1", + "last": "/api/v1/providers?page[number]=5", + "prev": null, + "next": "/api/v1/providers?page[number]=2" + }, + "meta": { + "pagination": {"count": 100, "pages": 5} + } +} +``` + +### Error Response + +```json +{ + "errors": [ + { + "status": "400", + "code": "invalid", + "title": "Invalid attribute", + "detail": "UID must be 12 digits for AWS accounts", + "source": {"pointer": "/data/attributes/uid"} + } + ] +} +``` + +--- + +## Query Parameters + +| Family | Format | Example | +|--------|--------|---------| +| `page` | `page[number]`, `page[size]` | `?page[number]=2&page[size]=25` | +| `filter` | `filter[field]`, `filter[field__op]` | `?filter[status]=FAIL` | +| `sort` | Comma-separated, `-` for desc | `?sort=-inserted_at,name` | +| `fields` | `fields[type]` | `?fields[providers]=id,alias` | +| `include` | Comma-separated paths | `?include=provider,scan.task` | + +### Rules + +- MUST return `400` for unsupported query parameters +- MUST return `400` for unsupported `include` paths +- MUST return `400` for unsupported `sort` fields +- MUST NOT include extra fields when `fields[type]` is specified + +--- + +## Common Violations (AVOID) + +| Violation | Wrong | Correct | +|-----------|-------|---------| +| ID as integer | `"id": 123` | `"id": "123"` | +| Type as camelCase | `"type": "providerGroup"` | `"type": "provider-groups"` | +| FK in attributes | `"tenant_id": "..."` | `"relationships": {"tenant": {...}}` | +| Errors not array | `{"error": "..."}` | `{"errors": [{"detail": "..."}]}` | +| Status as number | `"status": 400` | `"status": "400"` | +| Data + errors | `{"data": ..., "errors": ...}` | Only one or the other | +| Missing pointer | `{"detail": "Invalid"}` | 
`{"detail": "...", "source": {"pointer": "..."}}` | + +--- + +## Relationship Updates + +### To-One Relationship + +```http +PATCH /api/v1/providers/123/relationships/tenant +Content-Type: application/vnd.api+json + +{"data": {"type": "tenants", "id": "456"}} +``` + +To clear: `{"data": null}` + +### To-Many Relationship + +| Operation | Method | Body | +|-----------|--------|------| +| Replace all | PATCH | `{"data": [{...}, {...}]}` | +| Add members | POST | `{"data": [{...}]}` | +| Remove members | DELETE | `{"data": [{...}]}` | + +--- + +## Compound Documents (`include`) + +When using `?include=provider`: + +```json +{ + "data": { + "type": "scans", + "id": "...", + "relationships": { + "provider": { + "data": {"type": "providers", "id": "prov-123"} + } + } + }, + "included": [ + { + "type": "providers", + "id": "prov-123", + "attributes": {"alias": "Production"} + } + ] +} +``` + +### Rules + +- Every included resource MUST be reachable via relationship chain from primary data +- MUST NOT include orphan resources +- MUST NOT duplicate resources (same type+id) + +--- + +## Spec Reference + +- **Full Specification**: https://jsonapi.org/format/ +- **Implementation**: Use `django-drf` skill for DRF-specific patterns +- **Testing**: Use `prowler-test-api` skill for test patterns diff --git a/skills/prowler-api/SKILL.md b/skills/prowler-api/SKILL.md index a0202bd783..1e654a3e03 100644 --- a/skills/prowler-api/SKILL.md +++ b/skills/prowler-api/SKILL.md @@ -1,29 +1,205 @@ --- name: prowler-api description: > - Prowler API patterns: JSON:API, RLS, RBAC, providers, Celery tasks. - Trigger: When working in api/ on models/serializers/viewsets/filters/tasks involving tenant isolation (RLS), RBAC, JSON:API, or provider lifecycle. + Prowler API patterns: RLS, RBAC, providers, Celery tasks. + Trigger: When working in api/ on models/serializers/viewsets/filters/tasks involving tenant isolation (RLS), RBAC, or provider lifecycle. 
license: Apache-2.0 metadata: author: prowler-cloud - version: "1.0" + version: "1.2.0" scope: [root, api] auto_invoke: "Creating/modifying models, views, serializers" allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task --- +## When to Use + +Use this skill for **Prowler-specific** patterns: +- Row-Level Security (RLS) / tenant isolation +- RBAC permissions and role checks +- Provider lifecycle and validation +- Celery tasks with tenant context +- Multi-database architecture (4-database setup) + +For **generic DRF patterns** (ViewSets, Serializers, Filters, JSON:API), use `django-drf` skill. + +--- + ## Critical Rules - ALWAYS use `rls_transaction(tenant_id)` when querying outside ViewSet context - ALWAYS use `get_role()` before checking permissions (returns FIRST role only) -- NEVER access `Provider.objects` without RLS context in Celery tasks - ALWAYS use `@set_tenant` then `@handle_provider_deletion` decorator order +- ALWAYS use explicit through models for M2M relationships (required for RLS) +- NEVER access `Provider.objects` without RLS context in Celery tasks +- NEVER bypass RLS by using raw SQL or `connection.cursor()` +- NEVER use Django's default M2M - RLS requires through models with `tenant_id` + +> **Note**: `rls_transaction()` accepts both UUID objects and strings - it converts internally via `str(value)`. --- -## 1. 
Providers (10 Supported) +## Architecture Overview -UID validation is dynamic: `getattr(self, f"validate_{self.provider}_uid")(self.uid)` +### 4-Database Architecture + +| Database | Alias | Purpose | RLS | +|----------|-------|---------|-----| +| `default` | `prowler_user` | Standard API queries | **Yes** | +| `admin` | `admin` | Migrations, auth bypass | No | +| `replica` | `prowler_user` | Read-only queries | **Yes** | +| `admin_replica` | `admin` | Admin read replica | No | + +```python +# When to use admin (bypasses RLS) +from api.db_router import MainRouter +User.objects.using(MainRouter.admin_db).get(id=user_id) # Auth lookups + +# Standard queries use default (RLS enforced) +Provider.objects.filter(connected=True) # Requires rls_transaction context +``` + +### RLS Transaction Flow + +``` +Request → Authentication → BaseRLSViewSet.initial() + │ + ├─ Extract tenant_id from JWT + ├─ SET api.tenant_id = 'uuid' (PostgreSQL) + └─ All queries now tenant-scoped +``` + +--- + +## Implementation Checklist + +When implementing Prowler-specific API features: + +| # | Pattern | Reference | Key Points | +|---|---------|-----------|------------| +| 1 | **RLS Models** | `api/rls.py` | Inherit `RowLevelSecurityProtectedModel`, add constraint | +| 2 | **RLS Transactions** | `api/db_utils.py` | Use `rls_transaction(tenant_id)` context manager | +| 3 | **RBAC Permissions** | `api/rbac/permissions.py` | `get_role()`, `get_providers()`, `Permissions` enum | +| 4 | **Provider Validation** | `api/models.py` | `validate__uid()` methods on `Provider` model | +| 5 | **Celery Tasks** | `tasks/tasks.py`, `api/decorators.py`, `config/celery.py` | Task definitions, decorators (`@set_tenant`, `@handle_provider_deletion`), `RLSTask` base | +| 6 | **RLS Serializers** | `api/v1/serializers.py` | Inherit `RLSSerializer` to auto-inject `tenant_id` | +| 7 | **Through Models** | `api/models.py` | ALL M2M must use explicit through with `tenant_id` | + +> **Full file paths**: See 
[references/file-locations.md](references/file-locations.md) + +--- + +## Decision Trees + +### Which Base Model? +``` +Tenant-scoped data → RowLevelSecurityProtectedModel +Global/shared data → models.Model + BaseSecurityConstraint (rare) +Partitioned time-series → PostgresPartitionedModel + RowLevelSecurityProtectedModel +Soft-deletable → Add is_deleted + ActiveProviderManager +``` + +### Which Manager? +``` +Normal queries → Model.objects (excludes deleted) +Include deleted records → Model.all_objects +Celery task context → Must use rls_transaction() first +``` + +### Which Database? +``` +Standard API queries → default (automatic via ViewSet) +Read-only operations → replica (automatic for GET in BaseRLSViewSet) +Auth/admin operations → MainRouter.admin_db +Cross-tenant lookups → MainRouter.admin_db (use sparingly!) +``` + +### Celery Task Decorator Order? +``` +@shared_task(base=RLSTask, name="...", queue="...") +@set_tenant # First: sets tenant context +@handle_provider_deletion # Second: handles deleted providers +def my_task(tenant_id, provider_id): + pass +``` + +--- + +## RLS Model Pattern + +```python +from api.rls import RowLevelSecurityProtectedModel, RowLevelSecurityConstraint + +class MyModel(RowLevelSecurityProtectedModel): + # tenant FK inherited from parent + id = models.UUIDField(primary_key=True, default=uuid4, editable=False) + name = models.CharField(max_length=255) + inserted_at = models.DateTimeField(auto_now_add=True, editable=False) + updated_at = models.DateTimeField(auto_now=True, editable=False) + + class Meta(RowLevelSecurityProtectedModel.Meta): + db_table = "my_models" + constraints = [ + RowLevelSecurityConstraint( + field="tenant_id", + name="rls_on_%(class)s", + statements=["SELECT", "INSERT", "UPDATE", "DELETE"], + ), + ] + + class JSONAPIMeta: + resource_name = "my-models" +``` + +### M2M Relationships (MUST use through models) + +```python +class Resource(RowLevelSecurityProtectedModel): + tags = models.ManyToManyField( + 
ResourceTag, + through="ResourceTagMapping", # REQUIRED for RLS + ) + +class ResourceTagMapping(RowLevelSecurityProtectedModel): + # Through model MUST have tenant_id for RLS + resource = models.ForeignKey(Resource, on_delete=models.CASCADE) + tag = models.ForeignKey(ResourceTag, on_delete=models.CASCADE) + + class Meta: + constraints = [ + RowLevelSecurityConstraint( + field="tenant_id", + name="rls_on_%(class)s", + statements=["SELECT", "INSERT", "UPDATE", "DELETE"], + ), + ] +``` + +--- + +## Async Task Response Pattern (202 Accepted) + +For long-running operations, return 202 with task reference: + +```python +@action(detail=True, methods=["post"], url_name="connection") +def connection(self, request, pk=None): + with transaction.atomic(): + task = check_provider_connection_task.delay( + provider_id=pk, tenant_id=self.request.tenant_id + ) + prowler_task = Task.objects.get(id=task.id) + serializer = TaskSerializer(prowler_task) + return Response( + data=serializer.data, + status=status.HTTP_202_ACCEPTED, + headers={"Content-Location": reverse("task-detail", kwargs={"pk": prowler_task.id})} + ) +``` + +--- + +## Providers (11 Supported) | Provider | UID Format | Example | |----------|-----------|---------| @@ -42,98 +218,288 @@ UID validation is dynamic: `getattr(self, f"validate_{self.provider}_uid")(self. --- -## 2. 
Row-Level Security (RLS) +## RBAC Permissions + +| Permission | Controls | +|------------|----------| +| `MANAGE_USERS` | User CRUD, role assignments | +| `MANAGE_ACCOUNT` | Tenant settings | +| `MANAGE_BILLING` | Billing/subscription | +| `MANAGE_PROVIDERS` | Provider CRUD | +| `MANAGE_INTEGRATIONS` | Integration config | +| `MANAGE_SCANS` | Scan execution | +| `UNLIMITED_VISIBILITY` | See all providers (bypasses provider_groups) | + +### RBAC Visibility Pattern ```python -from api.db_utils import rls_transaction - -with rls_transaction(tenant_id): - providers = Provider.objects.filter(connected=True) - # PostgreSQL enforces tenant_id automatically -``` - -Models inherit from `RowLevelSecurityProtectedModel` with `RowLevelSecurityConstraint`. - ---- - -## 3. Managers - -```python -Provider.objects.all() # Only is_deleted=False -Provider.all_objects.all() # All including deleted -Finding.objects.all() # Only from active providers +def get_queryset(self): + user_role = get_role(self.request.user) + if user_role.unlimited_visibility: + return Model.objects.filter(tenant_id=self.request.tenant_id) + else: + # Filter by provider_groups assigned to role + return Model.objects.filter(provider__in=get_providers(user_role)) ``` --- -## 4. 
RBAC +## Celery Queues -```python -from api.rbac.permissions import get_role, get_providers, Permissions - -user_role = get_role(self.request.user) # Returns FIRST role only - -if user_role.unlimited_visibility: - queryset = Provider.objects.filter(tenant_id=tenant_id) -else: - queryset = get_providers(user_role) # Filtered by provider_groups -``` - -**Permissions**: `MANAGE_USERS`, `MANAGE_ACCOUNT`, `MANAGE_BILLING`, `MANAGE_PROVIDERS`, `MANAGE_INTEGRATIONS`, `MANAGE_SCANS`, `UNLIMITED_VISIBILITY` +| Queue | Purpose | +|-------|---------| +| `scans` | Prowler scan execution | +| `overview` | Dashboard aggregations (severity, attack surface) | +| `compliance` | Compliance report generation | +| `integrations` | External integrations (Jira, S3, Security Hub) | +| `deletion` | Provider/tenant deletion (async) | +| `backfill` | Historical data backfill operations | +| `scan-reports` | Output generation (CSV, JSON, HTML, PDF) | --- -## 5. Celery Tasks +## Task Composition (Canvas) + +Use Celery's Canvas primitives for complex workflows: + +| Primitive | Use For | +|-----------|---------| +| `chain()` | Sequential execution: A → B → C | +| `group()` | Parallel execution: A, B, C simultaneously | +| Combined | Chain with nested groups for complex workflows | + +> **Note:** Use `.si()` (signature immutable) to prevent result passing. Use `.s()` if you need to pass results. + +> **Examples:** See [assets/celery_patterns.py](assets/celery_patterns.py) for chain, group, and combined patterns. 
+ +--- + +## Beat Scheduling (Periodic Tasks) + +| Operation | Key Points | +|-----------|------------| +| **Create schedule** | `IntervalSchedule.objects.get_or_create(every=24, period=HOURS)` | +| **Create periodic task** | Use task name (not function), `kwargs=json.dumps(...)` | +| **Delete scheduled task** | `PeriodicTask.objects.filter(name=...).delete()` | +| **Avoid race conditions** | Use `countdown=5` to wait for DB commit | + +> **Examples:** See [assets/celery_patterns.py](assets/celery_patterns.py) for schedule_provider_scan pattern. + +--- + +## Advanced Task Patterns + +### `@set_tenant` Behavior + +| Mode | `tenant_id` in kwargs | `tenant_id` passed to function | +|------|----------------------|-------------------------------| +| `@set_tenant` (default) | Popped (removed) | NO - function doesn't receive it | +| `@set_tenant(keep_tenant=True)` | Read but kept | YES - function receives it | + +### Key Patterns + +| Pattern | Description | +|---------|-------------| +| `bind=True` | Access `self.request.id`, `self.request.retries` | +| `get_task_logger(__name__)` | Proper logging in Celery tasks | +| `SoftTimeLimitExceeded` | Catch to save progress before hard kill | +| `countdown=30` | Defer execution by N seconds | +| `eta=datetime(...)` | Execute at specific time | + +> **Examples:** See [assets/celery_patterns.py](assets/celery_patterns.py) for all advanced patterns. 
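The kwargs-popping behavior in the table above can be sketched with a simplified stand-in decorator (illustrative only — the real `@set_tenant` in `api/decorators.py` also sets the PostgreSQL tenant context, which this sketch omits):

```python
import functools


# Simplified illustration of the two @set_tenant modes; NOT the real decorator.
def set_tenant(func=None, *, keep_tenant=False):
    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            if keep_tenant:
                tenant_id = kwargs["tenant_id"]   # read but kept in kwargs
            else:
                tenant_id = kwargs.pop("tenant_id")  # popped: f never sees it
            # real decorator: opens RLS context for tenant_id here
            return f(*args, **kwargs)
        return wrapper

    return decorator(func) if func is not None else decorator


@set_tenant
def task_a(provider_id):  # does NOT receive tenant_id
    return provider_id


@set_tenant(keep_tenant=True)
def task_b(tenant_id, provider_id):  # DOES receive tenant_id
    return (tenant_id, provider_id)


assert task_a(tenant_id="t-1", provider_id="p-1") == "p-1"
assert task_b(tenant_id="t-1", provider_id="p-1") == ("t-1", "p-1")
```

This is why a task decorated with plain `@set_tenant` must not declare `tenant_id` in its signature, while a `keep_tenant=True` task must.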
+ +--- + +## Celery Configuration + +| Setting | Value | Purpose | +|---------|-------|---------| +| `BROKER_VISIBILITY_TIMEOUT` | `86400` (24h) | Prevent re-queue for long tasks | +| `CELERY_RESULT_BACKEND` | `django-db` | Store results in PostgreSQL | +| `CELERY_TASK_TRACK_STARTED` | `True` | Track when tasks start | +| `soft_time_limit` | Task-specific | Raises `SoftTimeLimitExceeded` | +| `time_limit` | Task-specific | Hard kill (SIGKILL) | + +> **Full config:** See [assets/celery_patterns.py](assets/celery_patterns.py) and actual files at `config/celery.py`, `config/settings/celery.py`. + +--- + +## UUIDv7 for Partitioned Tables + +`Finding` and `ResourceFindingMapping` use UUIDv7 for time-based partitioning: ```python -@shared_task(base=RLSTask, name="task-name", queue="scans") +from uuid6 import uuid7 +from api.uuid_utils import uuid7_start, uuid7_end, datetime_to_uuid7 + +# Partition-aware filtering +start = uuid7_start(datetime_to_uuid7(date_from)) +end = uuid7_end(datetime_to_uuid7(date_to), settings.FINDINGS_TABLE_PARTITION_MONTHS) +queryset.filter(id__gte=start, id__lt=end) +``` + +**Why UUIDv7?** Time-ordered UUIDs enable PostgreSQL to prune partitions during range queries. 
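The mechanism can be shown with stdlib-only code: UUIDv7 stores the Unix timestamp in milliseconds in its top 48 bits, so a datetime maps directly to a lower bound on the id range. The helper below is an illustration of the idea, not the actual `api.uuid_utils` implementation:

```python
import uuid
from datetime import datetime, timezone


def uuid7_lower_bound(dt: datetime) -> uuid.UUID:
    """Illustrative sketch: smallest UUIDv7-style value for a given instant."""
    ms = int(dt.timestamp() * 1000)
    return uuid.UUID(int=ms << 80)  # 128 - 48 = 80 bits sit below the timestamp


jan = uuid7_lower_bound(datetime(2024, 1, 1, tzinfo=timezone.utc))
feb = uuid7_lower_bound(datetime(2024, 2, 1, tzinfo=timezone.utc))

# UUIDs compare on their 128-bit integer value, i.e. in timestamp order
# for UUIDv7, so `id__gte=jan, id__lt=feb` only touches January partitions.
assert jan < feb
assert jan.int >> 80 == int(datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp() * 1000)
```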
+ +--- + +## Batch Operations with RLS + +```python +from api.db_utils import batch_delete, create_objects_in_batches, update_objects_in_batches + +# Delete in batches (RLS-aware) +batch_delete(tenant_id, queryset, batch_size=1000) + +# Bulk create with RLS +create_objects_in_batches(tenant_id, Finding, objects, batch_size=500) + +# Bulk update with RLS +update_objects_in_batches(tenant_id, Finding, objects, fields=["status"], batch_size=500) +``` + +--- + +## Security Patterns + +> **Full examples**: See [assets/security_patterns.py](assets/security_patterns.py) + +### Tenant Isolation Summary + +| Pattern | Rule | +|---------|------| +| **RLS in ViewSets** | Automatic via `BaseRLSViewSet` - tenant_id from JWT | +| **RLS in Celery** | MUST use `@set_tenant` + `rls_transaction(tenant_id)` | +| **Cross-tenant validation** | Defense-in-depth: verify `obj.tenant_id == request.tenant_id` | +| **Never trust user input** | Use `request.tenant_id` from JWT, never `request.data.get("tenant_id")` | +| **Admin DB bypass** | Only for cross-tenant admin ops - exposes ALL tenants' data | + +### Celery Task Security Summary + +| Pattern | Rule | +|---------|------| +| **Named tasks only** | NEVER use dynamic task names from user input | +| **Validate arguments** | Check UUID format before database queries | +| **Safe queuing** | Use `transaction.on_commit()` to enqueue AFTER commit | +| **Modern retries** | Use `autoretry_for`, `retry_backoff`, `retry_jitter` | +| **Time limits** | Set `soft_time_limit` and `time_limit` to prevent hung tasks | +| **Idempotency** | Use `update_or_create` or idempotency keys | + +### Quick Reference + +```python +# Safe task queuing - task only enqueued after transaction commits +with transaction.atomic(): + provider = Provider.objects.create(**data) + transaction.on_commit( + lambda: verify_provider_connection.delay( + tenant_id=str(request.tenant_id), + provider_id=str(provider.id) + ) + ) + +# Modern retry pattern +@shared_task( + base=RLSTask, 
+ bind=True, + autoretry_for=(ConnectionError, TimeoutError, OperationalError), + retry_backoff=True, + retry_backoff_max=600, + retry_jitter=True, + max_retries=5, + soft_time_limit=300, + time_limit=360, +) @set_tenant -@handle_provider_deletion -def my_task(tenant_id: str, provider_id: str): - pass -``` +def sync_provider_data(self, tenant_id, provider_id): + with rls_transaction(tenant_id): + # ... task logic + pass -**Queues**: Check `tasks/tasks.py`. Common: `scans`, `overview`, `compliance`, `integrations`. - -**Orchestration**: Use `chain()` for sequential, `group()` for parallel. - ---- - -## 6. JSON:API Format - -```python -content_type = "application/vnd.api+json" - -# Request -{"data": {"type": "providers", "attributes": {"provider": "aws", "uid": "123456789012"}}} - -# Response access -response.json()["data"]["attributes"]["alias"] +# Idempotent task - safe to retry +@shared_task(base=RLSTask, acks_late=True) +@set_tenant +def process_finding(tenant_id, finding_uid, data): + with rls_transaction(tenant_id): + Finding.objects.update_or_create(uid=finding_uid, defaults=data) ``` --- -## 7. 
Serializers +## Production Deployment Checklist -| Pattern | Usage | -|---------|-------| -| `ProviderSerializer` | Read (list/retrieve) | -| `ProviderCreateSerializer` | POST | -| `ProviderUpdateSerializer` | PATCH | -| `RLSSerializer` | Auto-injects tenant_id | +> **Full settings**: See [references/production-settings.md](references/production-settings.md) + +Run before every production deployment: + +```bash +cd api && poetry run python src/backend/manage.py check --deploy +``` + +### Critical Settings + +| Setting | Production Value | Risk if Wrong | +|---------|-----------------|---------------| +| `DEBUG` | `False` | Exposes stack traces, settings, SQL queries | +| `SECRET_KEY` | Env var, rotated | Session hijacking, CSRF bypass | +| `ALLOWED_HOSTS` | Explicit list | Host header attacks | +| `SECURE_SSL_REDIRECT` | `True` | Credentials sent over HTTP | +| `SESSION_COOKIE_SECURE` | `True` | Session cookies over HTTP | +| `CSRF_COOKIE_SECURE` | `True` | CSRF tokens over HTTP | +| `SECURE_HSTS_SECONDS` | `31536000` (1 year) | Downgrade attacks | +| `CONN_MAX_AGE` | `60` or higher | Connection pool exhaustion | --- ## Commands ```bash -cd api && poetry run python manage.py migrate # Run migrations -cd api && poetry run python manage.py shell # Django shell -cd api && poetry run celery -A config.celery worker -l info # Start worker +# Development +cd api && poetry run python src/backend/manage.py runserver +cd api && poetry run python src/backend/manage.py shell + +# Celery +cd api && poetry run celery -A config.celery worker -l info -Q scans,overview +cd api && poetry run celery -A config.celery beat -l info + +# Testing +cd api && poetry run pytest -x --tb=short + +# Production checks +cd api && poetry run python src/backend/manage.py check --deploy ``` --- ## Resources -- **Documentation**: See [references/api-docs.md](references/api-docs.md) for local file paths and documentation +### Local References +- **File Locations**: See 
[references/file-locations.md](references/file-locations.md) +- **Modeling Decisions**: See [references/modeling-decisions.md](references/modeling-decisions.md) +- **Configuration**: See [references/configuration.md](references/configuration.md) +- **Production Settings**: See [references/production-settings.md](references/production-settings.md) +- **Security Patterns**: See [assets/security_patterns.py](assets/security_patterns.py) + +### Related Skills +- **Generic DRF Patterns**: Use `django-drf` skill +- **API Testing**: Use `prowler-test-api` skill + +### Context7 MCP (Recommended) + +**Prerequisite:** Install Context7 MCP server for up-to-date documentation lookup. + +When implementing or debugging Prowler-specific patterns, query these libraries via `mcp_context7_query-docs`: + +| Library | Context7 ID | Use For | +|---------|-------------|---------| +| **Celery** | `/websites/celeryq_dev_en_stable` | Task patterns, queues, error handling | +| **django-celery-beat** | `/celery/django-celery-beat` | Periodic task scheduling | +| **Django** | `/websites/djangoproject_en_5_2` | Models, ORM, constraints, indexes | + +**Example queries:** +``` +mcp_context7_query-docs(libraryId="/websites/celeryq_dev_en_stable", query="shared_task decorator retry patterns") +mcp_context7_query-docs(libraryId="/celery/django-celery-beat", query="periodic task database scheduler") +mcp_context7_query-docs(libraryId="/websites/djangoproject_en_5_2", query="model constraints CheckConstraint UniqueConstraint") +``` + +> **Note:** Use `mcp_context7_resolve-library-id` first if you need to find the correct library ID. 
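
The `retry_backoff`, `retry_backoff_max`, and `retry_jitter` settings shown in the retry examples combine into a per-retry countdown. A minimal sketch of Celery's documented exponential-backoff behavior — `backoff_countdown` is a hypothetical helper written here for illustration, not a Prowler or Celery API:

```python
import random


def backoff_countdown(retries: int, factor: int = 1, maximum: int = 600, jitter: bool = True) -> int:
    """Delay in seconds before the next retry attempt.

    Mirrors Celery's documented exponential backoff: retry_backoff=True
    uses a base factor of 1, retry_backoff_max caps the delay, and
    retry_jitter replaces the delay with a random value in [0, delay]
    to avoid thundering-herd retries.
    """
    countdown = min(maximum, factor * (2 ** retries))
    return random.randrange(countdown + 1) if jitter else countdown


# Without jitter the schedule is deterministic: 1s, 2s, 4s, ... capped at 600s.
delays = [backoff_countdown(r, jitter=False) for r in range(12)]
print(delays)  # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 600, 600]
```

With `retry_jitter=True` (the recommended setting above), each worker draws a random delay in `[0, countdown]`, which spreads retry storms out instead of synchronizing them.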
diff --git a/skills/prowler-api/assets/celery_patterns.py b/skills/prowler-api/assets/celery_patterns.py
new file mode 100644
index 0000000000..700e63891b
--- /dev/null
+++ b/skills/prowler-api/assets/celery_patterns.py
@@ -0,0 +1,319 @@
+# Prowler API - Celery Patterns Reference
+# Reference for prowler-api skill
+
+from datetime import datetime, timedelta, timezone
+import json
+
+from celery import chain, group, shared_task
+from celery.exceptions import SoftTimeLimitExceeded
+from celery.utils.log import get_task_logger
+from django.db import OperationalError, transaction
+from django_celery_beat.models import IntervalSchedule, PeriodicTask
+
+from api.db_utils import rls_transaction
+from api.decorators import handle_provider_deletion, set_tenant
+from api.models import Provider, Scan
+from config.celery import RLSTask
+
+logger = get_task_logger(__name__)
+
+
+# =============================================================================
+# DECORATOR ORDER - CRITICAL
+# =============================================================================
+# @shared_task() must be first
+# @set_tenant must be second (sets RLS context)
+# @handle_provider_deletion must be third (handles deleted providers)
+
+
+# =============================================================================
+# @set_tenant BEHAVIOR
+# =============================================================================
+
+
+# Example: @set_tenant (default) - tenant_id NOT in function signature
+# The decorator pops tenant_id from kwargs after setting RLS context
+@shared_task(base=RLSTask, name="provider-connection-check")
+@set_tenant
+def check_provider_connection_task(provider_id: str):
+    """Task receives NO tenant_id param - decorator pops it from kwargs."""
+    # RLS context is already set by @set_tenant, so queries here are
+    # tenant-scoped without an explicit rls_transaction(tenant_id) call.
+    provider = Provider.objects.get(pk=provider_id)
+    return {"connected": provider.connected}
+
+
+# Example: @set_tenant(keep_tenant=True) -
tenant_id IN function signature +@shared_task(base=RLSTask, name="scan-report", queue="scan-reports") +@set_tenant(keep_tenant=True) +def generate_outputs_task(scan_id: str, provider_id: str, tenant_id: str): + """Task receives tenant_id param - use when function needs it.""" + # Can use tenant_id in function body + with rls_transaction(tenant_id): + scan = Scan.objects.get(pk=scan_id) + # ... generate outputs + return {"scan_id": scan_id, "tenant_id": tenant_id} + + +# ============================================================================= +# TASK COMPOSITION (CANVAS) +# ============================================================================= + + +# Chain: Sequential execution - A → B → C +def example_chain(tenant_id: str): + """Tasks run one after another.""" + chain( + task_a.si(tenant_id=tenant_id), + task_b.si(tenant_id=tenant_id), + task_c.si(tenant_id=tenant_id), + ).apply_async() + + +# Group: Parallel execution - A, B, C simultaneously +def example_group(tenant_id: str): + """Tasks run at the same time.""" + group( + task_a.si(tenant_id=tenant_id), + task_b.si(tenant_id=tenant_id), + task_c.si(tenant_id=tenant_id), + ).apply_async() + + +# Combined: Real pattern from Prowler (post-scan workflow) +def post_scan_workflow(tenant_id: str, scan_id: str, provider_id: str): + """Chain with nested groups for complex workflows.""" + chain( + # First: Summary + perform_scan_summary_task.si(tenant_id=tenant_id, scan_id=scan_id), + # Then: Parallel aggregation + outputs + group( + aggregate_daily_severity_task.si(tenant_id=tenant_id, scan_id=scan_id), + generate_outputs_task.si( + scan_id=scan_id, provider_id=provider_id, tenant_id=tenant_id + ), + ), + # Finally: Parallel compliance + integrations + group( + generate_compliance_reports_task.si( + tenant_id=tenant_id, scan_id=scan_id, provider_id=provider_id + ), + check_integrations_task.si( + tenant_id=tenant_id, provider_id=provider_id, scan_id=scan_id + ), + ), + ).apply_async() + + +# Note: Use .si() 
(signature immutable) to prevent result passing. +# Use .s() if you need to pass results between tasks. + + +# ============================================================================= +# BEAT SCHEDULING (PERIODIC TASKS) +# ============================================================================= + + +def schedule_provider_scan(provider_id: str, tenant_id: str): + """Create a periodic task that runs every 24 hours.""" + # 1. Create or get the schedule + schedule, _ = IntervalSchedule.objects.get_or_create( + every=24, + period=IntervalSchedule.HOURS, + ) + + # 2. Create the periodic task + PeriodicTask.objects.create( + interval=schedule, + name=f"scan-perform-scheduled-{provider_id}", # Unique name + task="scan-perform-scheduled", # Task name (not function name) + kwargs=json.dumps( + { + "tenant_id": str(tenant_id), + "provider_id": str(provider_id), + } + ), + one_off=False, + start_time=datetime.now(timezone.utc) + timedelta(hours=24), + ) + + +def delete_scheduled_scan(provider_id: str): + """Remove a periodic task.""" + PeriodicTask.objects.filter(name=f"scan-perform-scheduled-{provider_id}").delete() + + +# Avoiding race conditions with countdown +def schedule_with_countdown(provider_id: str, tenant_id: str): + """Use countdown to ensure DB transaction commits before task runs.""" + perform_scheduled_scan_task.apply_async( + kwargs={"tenant_id": tenant_id, "provider_id": provider_id}, + countdown=5, # Wait 5 seconds + ) + + +# ============================================================================= +# ADVANCED TASK PATTERNS +# ============================================================================= + + +# bind=True - Access task metadata +@shared_task(base=RLSTask, bind=True, name="scan-perform-scheduled", queue="scans") +@set_tenant(keep_tenant=True) +def perform_scheduled_scan_task(self, tenant_id: str, provider_id: str): + """bind=True provides access to self.request for task metadata.""" + task_id = self.request.id # Current task ID + 
retries = self.request.retries # Number of retries so far + + with rls_transaction(tenant_id): + scan = Scan.objects.create( + provider_id=provider_id, + task_id=task_id, # Track which task started this scan + ) + return {"scan_id": str(scan.id), "task_id": task_id} + + +# get_task_logger - Proper logging in Celery tasks +@shared_task(base=RLSTask, name="my-task") +@set_tenant +def my_task_with_logging(provider_id: str): + """Always use get_task_logger for Celery task logging.""" + logger.info(f"Processing provider {provider_id}") + logger.warning("Potential issue detected") + logger.error("Failed to process") + + # Called with tenant_id in kwargs (decorator handles it) + # my_task_with_logging.delay(provider_id="...", tenant_id="...") + + +# SoftTimeLimitExceeded - Graceful timeout handling +@shared_task( + base=RLSTask, + soft_time_limit=300, # 5 minutes - raises SoftTimeLimitExceeded + time_limit=360, # 6 minutes - hard kill (SIGKILL) +) +@set_tenant(keep_tenant=True) +def long_running_task(tenant_id: str, scan_id: str): + """Handle soft time limits gracefully to save progress.""" + try: + with rls_transaction(tenant_id): + for batch in get_large_dataset(): + process_batch(batch) + except SoftTimeLimitExceeded: + logger.warning(f"Task soft limit exceeded for scan {scan_id}, saving progress...") + save_partial_progress(scan_id) + raise # Re-raise to mark task as failed + + +# Deferred execution - countdown and eta +def deferred_examples(): + """Execute tasks at specific times.""" + # Execute after 30 seconds + my_task.apply_async(kwargs={"provider_id": "..."}, countdown=30) + + # Execute at specific time + my_task.apply_async( + kwargs={"provider_id": "..."}, + eta=datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc), + ) + + +# ============================================================================= +# CELERY CONFIGURATION (config/celery.py) +# ============================================================================= + +# Example configuration - see actual 
file for full config +""" +from celery import Celery + +celery_app = Celery("tasks") +celery_app.config_from_object("django.conf:settings", namespace="CELERY") + +# Visibility timeout - CRITICAL for long-running tasks +# If task takes longer than this, broker assumes worker died and re-queues +BROKER_VISIBILITY_TIMEOUT = 86400 # 24 hours for scan tasks + +celery_app.conf.broker_transport_options = { + "visibility_timeout": BROKER_VISIBILITY_TIMEOUT +} +celery_app.conf.result_backend_transport_options = { + "visibility_timeout": BROKER_VISIBILITY_TIMEOUT +} + +# Result settings +celery_app.conf.update( + result_extended=True, # Store additional task metadata + result_expires=None, # Never expire results (we manage cleanup) +) +""" + +# Django settings (config/settings/celery.py) +""" +CELERY_BROKER_URL = f"redis://{VALKEY_HOST}:{VALKEY_PORT}/{VALKEY_DB}" +CELERY_RESULT_BACKEND = "django-db" # Store results in PostgreSQL +CELERY_TASK_TRACK_STARTED = True # Track when tasks start +CELERY_BROKER_CONNECTION_RETRY_ON_STARTUP = True + +# Global time limits (optional) +CELERY_TASK_SOFT_TIME_LIMIT = 3600 # 1 hour soft limit +CELERY_TASK_TIME_LIMIT = 3660 # 1 hour + 1 minute hard limit +""" + + +# ============================================================================= +# ASYNC TASK RESPONSE PATTERN (202 Accepted) +# ============================================================================= + + +class ProviderViewSetExample: + """Example: Return 202 for long-running operations.""" + + def connection(self, request, pk=None): + """Trigger async connection check, return 202 with task location.""" + from django.urls import reverse + from rest_framework import status + from rest_framework.response import Response + + from api.models import Task + from api.v1.serializers import TaskSerializer + + with transaction.atomic(): + task = check_provider_connection_task.delay( + provider_id=pk, tenant_id=self.request.tenant_id + ) + prowler_task = Task.objects.get(id=task.id) + 
serializer = TaskSerializer(prowler_task) + return Response( + data=serializer.data, + status=status.HTTP_202_ACCEPTED, + headers={ + "Content-Location": reverse("task-detail", kwargs={"pk": prowler_task.id}) + }, + ) + + +# ============================================================================= +# PLACEHOLDERS (would exist in real codebase) +# ============================================================================= + +task_a = None +task_b = None +task_c = None +perform_scan_summary_task = None +aggregate_daily_severity_task = None +generate_compliance_reports_task = None +check_integrations_task = None +perform_scheduled_scan_task = None +my_task = None + + +def get_large_dataset(): + return [] + + +def process_batch(batch): + pass + + +def save_partial_progress(scan_id): + pass diff --git a/skills/prowler-api/assets/security_patterns.py b/skills/prowler-api/assets/security_patterns.py new file mode 100644 index 0000000000..981d78afbd --- /dev/null +++ b/skills/prowler-api/assets/security_patterns.py @@ -0,0 +1,207 @@ +# Example: Prowler API Security Patterns +# Reference for prowler-api skill + +import uuid + +from celery import shared_task +from celery.exceptions import SoftTimeLimitExceeded +from django.db import OperationalError, transaction +from rest_framework.exceptions import PermissionDenied + +from api.db_utils import rls_transaction +from api.decorators import handle_provider_deletion, set_tenant +from api.models import Finding, Provider +from api.rls import Tenant +from tasks.base import RLSTask + +# ============================================================================= +# TENANT ISOLATION (RLS) +# ============================================================================= + + +class ProviderViewSet: + """Example: RLS context set automatically by BaseRLSViewSet.""" + + def get_queryset(self): + # RLS already filters by tenant_id from JWT + # All queries are automatically tenant-scoped + return Provider.objects.all() + + 
+@shared_task(base=RLSTask)
+@set_tenant(keep_tenant=True)
+def process_scan_good(tenant_id, scan_id):
+    """GOOD: Explicit RLS context in Celery tasks."""
+    with rls_transaction(tenant_id):
+        # RLS enforced - only sees tenant's data
+        scan = Scan.objects.get(id=scan_id)
+        return scan
+
+
+def dangerous_function(provider_id):
+    """BAD: Bypassing RLS with admin database - exposes ALL tenants' data!"""
+    # NEVER do this unless absolutely necessary for cross-tenant admin ops
+    provider = Provider.objects.using("admin").get(id=provider_id)
+    return provider
+
+
+# =============================================================================
+# CROSS-TENANT DATA LEAKAGE PREVENTION
+# =============================================================================
+
+
+class SecureViewSet:
+    """Example: Defense-in-depth tenant validation."""
+
+    def get_object(self):
+        obj = super().get_object()
+        # Defense-in-depth: verify tenant even though RLS should filter
+        if obj.tenant_id != self.request.tenant_id:
+            raise PermissionDenied("Access denied")
+        return obj
+
+    def create_good(self, request):
+        """GOOD: Use tenant from authenticated JWT."""
+        serializer = self.get_serializer(data=request.data)
+        serializer.is_valid(raise_exception=True)
+        serializer.save(tenant_id=request.tenant_id)
+
+    def create_bad(self, request):
+        """BAD: Trust user input for tenant_id."""
+        serializer = self.get_serializer(data=request.data)
+        serializer.is_valid(raise_exception=True)
+        # NEVER trust user-provided tenant_id!
+        serializer.save(tenant_id=request.data.get("tenant_id"))
+
+
+# =============================================================================
+# CELERY TASK SECURITY
+# =============================================================================
+
+
+@shared_task(base=RLSTask)
+@set_tenant(keep_tenant=True)
+def process_provider(tenant_id, provider_id):
+    """Example: Validate task arguments before processing."""
+    # Validate UUID format before database query
+    try:
+        uuid.UUID(provider_id)
+    except ValueError:
+        # Log and return - don't expose error details
+        return {"error": "Invalid provider_id format"}
+
+    with rls_transaction(tenant_id):
+        # Now safe to query
+        provider = Provider.objects.get(id=provider_id)
+        return {"provider": str(provider.id)}
+
+
+def send_task_bad(user_provided_task_name, args):
+    """BAD: Dynamic task names from user input = arbitrary code execution."""
+    from celery import current_app
+
+    # NEVER do this!
+    current_app.send_task(user_provided_task_name, args=args)
+
+
+# =============================================================================
+# SAFE TASK QUEUING WITH TRANSACTIONS
+# =============================================================================
+
+
+def create_provider_good(request, data):
+    """GOOD: Task only enqueued AFTER transaction commits."""
+    with transaction.atomic():
+        provider = Provider.objects.create(**data)
+        # Task enqueued only if transaction succeeds
+        transaction.on_commit(
+            lambda: verify_provider_connection.delay(
+                tenant_id=str(request.tenant_id), provider_id=str(provider.id)
+            )
+        )
+    return provider
+
+
+def create_provider_bad(request, data):
+    """BAD: Task enqueued before transaction commits - race condition!"""
+    with transaction.atomic():
+        provider = Provider.objects.create(**data)
+        # Task might run before transaction commits!
+        # If transaction rolls back, task processes non-existent data
+        verify_provider_connection.delay(provider_id=str(provider.id))
+    return provider
+
+
+# =============================================================================
+# MODERN CELERY RETRY PATTERNS
+# =============================================================================
+
+
+@shared_task(
+    base=RLSTask,
+    bind=True,
+    # Automatic retry for transient errors
+    autoretry_for=(ConnectionError, TimeoutError, OperationalError),
+    retry_backoff=True,  # Exponential: 1s, 2s, 4s, 8s...
+    retry_backoff_max=600,  # Cap at 10 minutes
+    retry_jitter=True,  # Randomize to prevent thundering herd
+    max_retries=5,
+    # Time limits prevent hung tasks
+    soft_time_limit=300,  # 5 min: raises SoftTimeLimitExceeded
+    time_limit=360,  # 6 min: hard kill
+)
+@set_tenant(keep_tenant=True)
+def sync_provider_data(self, tenant_id, provider_id):
+    """Example: Modern retry pattern with time limits."""
+    try:
+        with rls_transaction(tenant_id):
+            provider = Provider.objects.get(id=provider_id)
+            # ...
sync logic
+            return {"status": "synced", "provider": str(provider.id)}
+    except SoftTimeLimitExceeded:
+        # Cleanup and exit gracefully
+        return {"status": "timeout", "provider": provider_id}
+
+
+# =============================================================================
+# IDEMPOTENT TASK DESIGN
+# =============================================================================
+
+
+@shared_task(base=RLSTask, acks_late=True)
+@set_tenant(keep_tenant=True)
+def process_finding_good(tenant_id, finding_uid, data):
+    """GOOD: Idempotent - safe to retry, uses upsert pattern."""
+    with rls_transaction(tenant_id):
+        # update_or_create is idempotent - retry won't create duplicates
+        Finding.objects.update_or_create(uid=finding_uid, defaults=data)
+
+
+@shared_task(base=RLSTask)
+@set_tenant(keep_tenant=True)
+def create_notification_bad(tenant_id, message):
+    """BAD: Non-idempotent - retry creates duplicates."""
+    with rls_transaction(tenant_id):
+        # No dedup key - every retry creates a new notification!
+        Notification.objects.create(message=message)
+
+
+@shared_task(base=RLSTask, acks_late=True)
+@set_tenant(keep_tenant=True)
+def send_notification_good(tenant_id, idempotency_key, message):
+    """GOOD: Idempotency key for non-upsertable operations."""
+    with rls_transaction(tenant_id):
+        # Check if already processed
+        if ProcessedTask.objects.filter(key=idempotency_key).exists():
+            return {"status": "already_processed"}
+
+        Notification.objects.create(message=message)
+        ProcessedTask.objects.create(key=idempotency_key)
+        return {"status": "sent"}
+
+
+# Placeholder for imports that would exist in real codebase
+verify_provider_connection = None
+Scan = None
+Notification = None
+ProcessedTask = None
diff --git a/skills/prowler-api/references/api-docs.md b/skills/prowler-api/references/api-docs.md
deleted file mode 100644
index 72bc7990fd..0000000000
--- a/skills/prowler-api/references/api-docs.md
+++ /dev/null
@@ -1,21 +0,0 @@
-# API Documentation
-
-## Local Documentation
-
-For API-related patterns, see:
-
- 
`api/src/backend/api/models.py` - Models, Providers, UID validation -- `api/src/backend/api/v1/views.py` - ViewSets, RBAC patterns -- `api/src/backend/api/v1/serializers.py` - Serializers -- `api/src/backend/api/rbac/permissions.py` - RBAC functions -- `api/src/backend/tasks/tasks.py` - Celery tasks -- `api/src/backend/api/db_utils.py` - rls_transaction - -## Contents - -The documentation covers: -- Row-Level Security (RLS) implementation -- RBAC permission system -- Provider validation patterns -- Celery task orchestration -- JSON:API serialization format diff --git a/skills/prowler-api/references/configuration.md b/skills/prowler-api/references/configuration.md new file mode 100644 index 0000000000..dac5c735bf --- /dev/null +++ b/skills/prowler-api/references/configuration.md @@ -0,0 +1,282 @@ +# Prowler API Configuration Reference + +## Settings File Structure + +``` +api/src/backend/config/ +├── django/ +│ ├── base.py # Base settings (all environments) +│ ├── devel.py # Development overrides +│ ├── production.py # Production settings +│ └── testing.py # Test settings +├── settings/ +│ ├── celery.py # Celery broker/backend config +│ ├── partitions.py # Table partitioning settings +│ ├── sentry.py # Error tracking + exception filtering +│ └── social_login.py # OAuth/SAML providers +├── celery.py # Celery app instance + RLSTask +├── custom_logging.py # NDJSON/Human-readable formatters +├── env.py # django-environ setup +└── urls.py # Root URL config +``` + +--- + +## REST Framework Configuration + +### Complete `REST_FRAMEWORK` Settings + +```python +REST_FRAMEWORK = { + # Schema Generation (JSON:API compatible) + "DEFAULT_SCHEMA_CLASS": "drf_spectacular_jsonapi.schemas.openapi.JsonApiAutoSchema", + + # Authentication (JWT + API Key) + "DEFAULT_AUTHENTICATION_CLASSES": ( + "api.authentication.CombinedJWTOrAPIKeyAuthentication", + ), + + # Pagination + "PAGE_SIZE": 10, + "DEFAULT_PAGINATION_CLASS": 
"drf_spectacular_jsonapi.schemas.pagination.JsonApiPageNumberPagination", + + # Custom exception handler (JSON:API format) + "EXCEPTION_HANDLER": "api.exceptions.custom_exception_handler", + + # Parsers (JSON:API compatible) + "DEFAULT_PARSER_CLASSES": ( + "rest_framework_json_api.parsers.JSONParser", + "rest_framework.parsers.FormParser", + "rest_framework.parsers.MultiPartParser", + ), + + # Custom renderer with RLS context support + "DEFAULT_RENDERER_CLASSES": ("api.renderers.APIJSONRenderer",), + + # Metadata + "DEFAULT_METADATA_CLASS": "rest_framework_json_api.metadata.JSONAPIMetadata", + + # Filter Backends + "DEFAULT_FILTER_BACKENDS": ( + "rest_framework_json_api.filters.QueryParameterValidationFilter", + "rest_framework_json_api.filters.OrderingFilter", + "rest_framework_json_api.django_filters.backends.DjangoFilterBackend", + "rest_framework.filters.SearchFilter", + ), + + # JSON:API search parameter + "SEARCH_PARAM": "filter[search]", + + # Test settings + "TEST_REQUEST_RENDERER_CLASSES": ("rest_framework_json_api.renderers.JSONRenderer",), + "TEST_REQUEST_DEFAULT_FORMAT": "vnd.api+json", + + # Uniform exception format + "JSON_API_UNIFORM_EXCEPTIONS": True, + + # Throttling + "DEFAULT_THROTTLE_CLASSES": ["rest_framework.throttling.ScopedRateThrottle"], + "DEFAULT_THROTTLE_RATES": { + "token-obtain": env("DJANGO_THROTTLE_TOKEN_OBTAIN", default=None), + "dj_rest_auth": None, + }, +} +``` + +### Throttling Configuration + +| Scope | Environment Variable | Default | Format | +|-------|---------------------|---------|--------| +| `token-obtain` | `DJANGO_THROTTLE_TOKEN_OBTAIN` | `None` (disabled) | `"X/minute"`, `"X/hour"`, `"X/day"` | +| `dj_rest_auth` | N/A | `None` (disabled) | Same | + +**To enable throttling:** +```bash +DJANGO_THROTTLE_TOKEN_OBTAIN="10/minute" # Limit token endpoint to 10 requests/minute +``` + +--- + +## JWT Configuration (SIMPLE_JWT) + +```python +SIMPLE_JWT = { + # Token Lifetimes + "ACCESS_TOKEN_LIFETIME": timedelta(minutes=30), # 
DJANGO_ACCESS_TOKEN_LIFETIME + "REFRESH_TOKEN_LIFETIME": timedelta(minutes=1440), # DJANGO_REFRESH_TOKEN_LIFETIME (24h) + + # Token Rotation + "ROTATE_REFRESH_TOKENS": True, + "BLACKLIST_AFTER_ROTATION": True, + + # Cryptographic Settings + "ALGORITHM": "RS256", # Asymmetric (requires key pair) + "SIGNING_KEY": env.str("DJANGO_TOKEN_SIGNING_KEY", ""), + "VERIFYING_KEY": env.str("DJANGO_TOKEN_VERIFYING_KEY", ""), + + # JWT Claims + "TOKEN_TYPE_CLAIM": "typ", + "JTI_CLAIM": "jti", + "USER_ID_FIELD": "id", + "USER_ID_CLAIM": "sub", + + # Issuer/Audience + "AUDIENCE": env.str("DJANGO_JWT_AUDIENCE", "https://api.prowler.com"), + "ISSUER": env.str("DJANGO_JWT_ISSUER", "https://api.prowler.com"), + + # Custom Serializers + "TOKEN_OBTAIN_SERIALIZER": "api.serializers.TokenSerializer", + "TOKEN_REFRESH_SERIALIZER": "api.serializers.TokenRefreshSerializer", +} +``` + +--- + +## Database Configuration + +### 4-Database Architecture + +```python +DATABASES = { + "default": {...}, # Alias to prowler_user (RLS enabled) + "prowler_user": {...}, # RLS-enabled connection + "admin": {...}, # Admin connection (bypasses RLS) + "replica": {...}, # Read replica (RLS enabled) + "admin_replica": {...}, # Admin on replica + "neo4j": {...}, # Graph database (attack paths) +} +``` + +### Environment Variables + +| Variable | Default | Description | +|----------|---------|-------------| +| `POSTGRES_DB` | `prowler_db` | Database name | +| `POSTGRES_USER` | `prowler_user` | API user (RLS-constrained) | +| `POSTGRES_PASSWORD` | - | API user password | +| `POSTGRES_HOST` | `postgres-db` | Database host | +| `POSTGRES_PORT` | `5432` | Database port | +| `POSTGRES_ADMIN_USER` | `prowler` | Admin user (migrations) | +| `POSTGRES_ADMIN_PASSWORD` | - | Admin password | +| `POSTGRES_REPLICA_HOST` | - | Replica host (optional) | +| `POSTGRES_REPLICA_MAX_ATTEMPTS` | `3` | Retry attempts before fallback | +| `POSTGRES_REPLICA_RETRY_BASE_DELAY` | `0.5` | Base delay for exponential backoff | + +--- + +## 
Celery Configuration + +### Broker/Backend + +```python +VALKEY_HOST = env("VALKEY_HOST", default="valkey") +VALKEY_PORT = env("VALKEY_PORT", default="6379") +VALKEY_DB = env("VALKEY_DB", default="0") + +CELERY_BROKER_URL = f"redis://{VALKEY_HOST}:{VALKEY_PORT}/{VALKEY_DB}" +CELERY_RESULT_BACKEND = "django-db" # Store results in PostgreSQL +CELERY_TASK_TRACK_STARTED = True +CELERY_BROKER_CONNECTION_RETRY_ON_STARTUP = True +``` + +### Task Visibility + +| Variable | Default | Description | +|----------|---------|-------------| +| `DJANGO_BROKER_VISIBILITY_TIMEOUT` | `86400` (24h) | Task visibility timeout | +| `DJANGO_CELERY_DEADLOCK_ATTEMPTS` | `5` | Deadlock retry attempts | + +--- + +## Partitioning Configuration + +```python +PSQLEXTRA_PARTITIONING_MANAGER = "api.partitions.manager" +FINDINGS_TABLE_PARTITION_MONTHS = env.int("FINDINGS_TABLE_PARTITION_MONTHS", 1) +FINDINGS_TABLE_PARTITION_COUNT = env.int("FINDINGS_TABLE_PARTITION_COUNT", 7) +FINDINGS_TABLE_PARTITION_MAX_AGE_MONTHS = env.int("...", None) # Optional cleanup +``` + +--- + +## Application Settings + +| Variable | Default | Description | +|----------|---------|-------------| +| `DJANGO_DEBUG` | `False` | Debug mode | +| `DJANGO_ALLOWED_HOSTS` | `["localhost"]` | Allowed hosts | +| `DJANGO_CACHE_MAX_AGE` | `3600` | HTTP cache max-age | +| `DJANGO_STALE_WHILE_REVALIDATE` | `60` | Stale-while-revalidate time | +| `DJANGO_FINDINGS_MAX_DAYS_IN_RANGE` | `7` | Max days for findings date filter | +| `DJANGO_TMP_OUTPUT_DIRECTORY` | `/tmp/prowler_api_output` | Temp output directory | +| `DJANGO_FINDINGS_BATCH_SIZE` | `1000` | Batch size for findings export | +| `DJANGO_DELETION_BATCH_SIZE` | `5000` | Batch size for deletions | +| `DJANGO_LOGGING_LEVEL` | `INFO` | Log level | +| `DJANGO_LOGGING_FORMATTER` | `ndjson` | Log format (`ndjson` or `human_readable`) | + +--- + +## Social Login (OAuth/SAML) + +| Variable | Description | +|----------|-------------| +| `SOCIAL_GOOGLE_OAUTH_CLIENT_ID` | Google OAuth client 
ID | +| `SOCIAL_GOOGLE_OAUTH_CLIENT_SECRET` | Google OAuth secret | +| `SOCIAL_GITHUB_OAUTH_CLIENT_ID` | GitHub OAuth client ID | +| `SOCIAL_GITHUB_OAUTH_CLIENT_SECRET` | GitHub OAuth secret | + +--- + +## Monitoring + +| Variable | Description | +|----------|-------------| +| `DJANGO_SENTRY_DSN` | Sentry DSN for error tracking | + +--- + +## Middleware Stack (Order Matters) + +```python +MIDDLEWARE = [ + "django_guid.middleware.guid_middleware", # 1. Transaction ID + "django.middleware.security.SecurityMiddleware", # 2. Security headers + "django.contrib.sessions.middleware.SessionMiddleware", + "corsheaders.middleware.CorsMiddleware", # 4. CORS (before Common) + "django.middleware.common.CommonMiddleware", + "django.middleware.csrf.CsrfViewMiddleware", + "django.contrib.auth.middleware.AuthenticationMiddleware", + "django.contrib.messages.middleware.MessageMiddleware", + "django.middleware.clickjacking.XFrameOptionsMiddleware", + "api.middleware.APILoggingMiddleware", # 10. Custom API logging + "allauth.account.middleware.AccountMiddleware", +] +``` + +--- + +## Security Headers + +| Setting | Value | Description | +|---------|-------|-------------| +| `SECURE_PROXY_SSL_HEADER` | `("HTTP_X_FORWARDED_PROTO", "https")` | Trust X-Forwarded-Proto | +| `SECURE_CONTENT_TYPE_NOSNIFF` | `True` | X-Content-Type-Options: nosniff | +| `X_FRAME_OPTIONS` | `"DENY"` | Prevent framing | +| `CSRF_COOKIE_SECURE` | `True` | HTTPS-only CSRF cookie | +| `SESSION_COOKIE_SECURE` | `True` | HTTPS-only session cookie | + +--- + +## Password Validators + +| Validator | Options | +|-----------|---------| +| `UserAttributeSimilarityValidator` | Default | +| `MinimumLengthValidator` | `min_length=12` | +| `MaximumLengthValidator` | `max_length=72` (bcrypt limit) | +| `CommonPasswordValidator` | Default | +| `NumericPasswordValidator` | Default | +| `SpecialCharactersValidator` | `min_special_characters=1` | +| `UppercaseValidator` | `min_uppercase=1` | +| `LowercaseValidator` | 
`min_lowercase=1` | +| `NumericValidator` | `min_numeric=1` | diff --git a/skills/prowler-api/references/file-locations.md b/skills/prowler-api/references/file-locations.md new file mode 100644 index 0000000000..8b1eff65e3 --- /dev/null +++ b/skills/prowler-api/references/file-locations.md @@ -0,0 +1,128 @@ +# Prowler API File Locations + +## Configuration + +| Purpose | File Path | Key Items | +|---------|-----------|-----------| +| **Django Settings** | `api/src/backend/config/settings.py` | REST_FRAMEWORK, SIMPLE_JWT, DATABASES | +| **Celery Config** | `api/src/backend/config/celery.py` | Celery app, queues, task routing | +| **URL Routing** | `api/src/backend/config/urls.py` | Main URL patterns | +| **Database Router** | `api/src/backend/api/db_router.py` | `MainRouter` (4-database architecture) | + +## RLS (Row-Level Security) + +| Pattern | File Path | Key Classes/Functions | +|---------|-----------|----------------------| +| **RLS Base Model** | `api/src/backend/api/rls.py` | `RowLevelSecurityProtectedModel`, `RowLevelSecurityConstraint` | +| **RLS Transaction** | `api/src/backend/api/db_utils.py` | `rls_transaction()` context manager | +| **RLS Serializer** | `api/src/backend/api/v1/serializers.py` | `RLSSerializer` - auto-injects tenant_id | +| **Tenant Model** | `api/src/backend/api/rls.py` | `Tenant` model | +| **Partitioning** | `api/src/backend/api/partitions.py` | `PartitionManager`, UUIDv7 partitioning | + +## RBAC (Role-Based Access Control) + +| Pattern | File Path | Key Classes/Functions | +|---------|-----------|----------------------| +| **Permissions** | `api/src/backend/api/rbac/permissions.py` | `Permissions` enum, `get_role()`, `get_providers()` | +| **Role Model** | `api/src/backend/api/models.py` | `Role`, `UserRoleRelationship`, `RoleProviderGroupRelationship` | +| **Permission Decorator** | `api/src/backend/api/decorators.py` | `@check_permissions`, `HasPermissions` | +| **Visibility Filter** | `api/src/backend/api/rbac/` | Provider 
group visibility filtering | + +## Providers + +| Pattern | File Path | Key Classes/Functions | +|---------|-----------|----------------------| +| **Provider Model** | `api/src/backend/api/models.py` | `Provider`, `ProviderChoices` | +| **UID Validation** | `api/src/backend/api/models.py` | `validate__uid()` staticmethods | +| **Provider Secret** | `api/src/backend/api/models.py` | `ProviderSecret` model | +| **Provider Groups** | `api/src/backend/api/models.py` | `ProviderGroup`, `ProviderGroupMembership` | + +## Serializers + +| Pattern | File Path | Key Classes/Functions | +|---------|-----------|----------------------| +| **Base Serializers** | `api/src/backend/api/v1/serializers.py` | `BaseModelSerializerV1`, `RLSSerializer`, `BaseWriteSerializer` | +| **ViewSet Helpers** | `api/src/backend/api/v1/serializers.py` | `get_serializer_class_for_view()` | + +## ViewSets + +| Pattern | File Path | Key Classes/Functions | +|---------|-----------|----------------------| +| **Base ViewSets** | `api/src/backend/api/v1/views.py` | `BaseViewSet`, `BaseRLSViewSet`, `BaseTenantViewset`, `BaseUserViewset` | +| **Custom Actions** | `api/src/backend/api/v1/views.py` | `@action(detail=True)` patterns | +| **Filters** | `api/src/backend/api/filters.py` | `BaseProviderFilter`, `BaseScanProviderFilter`, `CommonFindingFilters` | + +## Celery Tasks + +| Pattern | File Path | Key Classes/Functions | +|---------|-----------|----------------------| +| **Task Definitions** | `api/src/backend/tasks/tasks.py` | All `@shared_task` definitions | +| **RLS Task Base** | `api/src/backend/config/celery.py` | `RLSTask` base class (creates APITask on dispatch) | +| **Task Decorators** | `api/src/backend/api/decorators.py` | `@set_tenant`, `@handle_provider_deletion` | +| **Celery Config** | `api/src/backend/config/celery.py` | Celery app, broker settings, visibility timeout | +| **Django Settings** | `api/src/backend/config/settings/celery.py` | `CELERY_BROKER_URL`, `CELERY_RESULT_BACKEND` | +| 
**Beat Schedule** | `api/src/backend/tasks/beat.py` | `schedule_provider_scan()`, `PeriodicTask` creation | +| **Task Utilities** | `api/src/backend/tasks/utils.py` | `batched()`, `get_next_execution_datetime()` | + +### Task Jobs (Business Logic) + +| Job File | Purpose | +|----------|---------| +| `tasks/jobs/scan.py` | `perform_prowler_scan()`, `aggregate_findings()`, `aggregate_attack_surface()` | +| `tasks/jobs/deletion.py` | `delete_provider()`, `delete_tenant()` | +| `tasks/jobs/backfill.py` | Historical data backfill operations | +| `tasks/jobs/export.py` | Output file generation (CSV, JSON, HTML) | +| `tasks/jobs/report.py` | PDF report generation (ThreatScore, ENS, NIS2) | +| `tasks/jobs/connection.py` | Provider/integration connection checks | +| `tasks/jobs/integrations.py` | S3, Security Hub, Jira uploads | +| `tasks/jobs/muting.py` | Historical findings muting | +| `tasks/jobs/attack_paths/` | Attack paths scan (Neo4j/Cartography) | + +## Key Line References + +### RLS Transaction (api/src/backend/api/db_utils.py) +```python +# Usage pattern +from api.db_utils import rls_transaction + +with rls_transaction(tenant_id): + # All queries here are tenant-scoped + providers = Provider.objects.filter(connected=True) +``` + +### RBAC Check (api/src/backend/api/rbac/permissions.py) +```python +# Usage pattern +from api.rbac.permissions import get_role, get_providers, Permissions + +user_role = get_role(request.user) # Returns FIRST role only +if user_role.unlimited_visibility: + queryset = Provider.objects.all() +else: + queryset = get_providers(user_role) +``` + +### Celery Task (api/src/backend/tasks/tasks.py) +```python +# Usage pattern +@shared_task(base=RLSTask, name="task-name", queue="scans") +@set_tenant +@handle_provider_deletion +def my_task(tenant_id: str, provider_id: str): + with rls_transaction(tenant_id): + provider = Provider.objects.get(pk=provider_id) +``` + +## Tests + +| Type | Path | +|------|------| +| **Central Fixtures** | 
`api/src/backend/conftest.py` | +| **API Tests** | `api/src/backend/api/tests/` | +| **Integration Tests** | `api/src/backend/api/tests/integration/` | +| **Task Tests** | `api/src/backend/tasks/tests/` | + +## Related Skills + +- **Generic DRF patterns**: Use `django-drf` skill for ViewSets, Serializers, Filters, JSON:API +- **API Testing**: Use `prowler-test-api` skill for testing patterns diff --git a/skills/prowler-api/references/modeling-decisions.md b/skills/prowler-api/references/modeling-decisions.md new file mode 100644 index 0000000000..68923c4ef4 --- /dev/null +++ b/skills/prowler-api/references/modeling-decisions.md @@ -0,0 +1,274 @@ +# Django Model Design Decisions + +## When to Use What + +### Primary Keys + +| Pattern | When to Use | Example | +|---------|-------------|---------| +| `uuid4` | Default for most models | `id = models.UUIDField(primary_key=True, default=uuid4)` | +| `uuid7` | Time-ordered data (findings, scans) | `id = models.UUIDField(primary_key=True, default=uuid7)` | + +**Why uuid7 for time-series?** UUIDv7 includes timestamp, enabling efficient range queries and partitioning. 
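A minimal sketch of why that works (illustrative only — the codebase generates ids with a `uuid7` helper, not this hand-rolled constructor): the top 48 bits of a UUIDv7 hold a Unix timestamp in milliseconds, so sorting or range-filtering on `id` is effectively sorting by creation time.

```python
# Illustrative only: hand-build UUIDv7-shaped values to show that the
# embedded millisecond timestamp makes ids sort in creation order.
import time
import uuid


def fake_uuid7(ts_ms: int) -> uuid.UUID:
    # Top 48 bits = Unix timestamp (ms), next 4 bits = version 7.
    return uuid.UUID(int=(ts_ms << 80) | (0x7 << 76))


def uuid7_timestamp_ms(u: uuid.UUID) -> int:
    # Recover the embedded timestamp from the top 48 bits.
    return u.int >> 80


now_ms = int(time.time() * 1000)
earlier, later = fake_uuid7(now_ms - 60_000), fake_uuid7(now_ms)

assert earlier < later  # ids compare in creation order
assert uuid7_timestamp_ms(later) - uuid7_timestamp_ms(earlier) == 60_000
```

Because partition keys and `ORDER BY -id` both piggyback on this embedded timestamp, time-window queries can prune partitions without a separate indexed datetime column.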
+ +### Timestamps + +| Field | Pattern | Purpose | +|-------|---------|---------| +| `inserted_at` | `auto_now_add=True, editable=False` | Creation time, never changes | +| `updated_at` | `auto_now=True, editable=False` | Last modification time | + +### Soft Delete + +```python +# Model +is_deleted = models.BooleanField(default=False) + +# Custom manager (excludes deleted by default) +class ActiveProviderManager(models.Manager): + def get_queryset(self): + return super().get_queryset().filter(is_deleted=False) + +# Usage +objects = ActiveProviderManager() # Normal queries +all_objects = models.Manager() # Include deleted +``` + +### TextChoices Enums + +```python +class StateChoices(models.TextChoices): + AVAILABLE = "available", _("Available") + SCHEDULED = "scheduled", _("Scheduled") + EXECUTING = "executing", _("Executing") + COMPLETED = "completed", _("Completed") + FAILED = "failed", _("Failed") +``` + +### Constraints + +| Constraint | When to Use | +|------------|-------------| +| `UniqueConstraint` | Prevent duplicates within tenant scope | +| `UniqueConstraint + condition` | Unique only for non-deleted records | +| `RowLevelSecurityConstraint` | ALL RLS-protected models (mandatory) | + +```python +constraints = [ + # Unique provider UID per tenant (only for active providers) + models.UniqueConstraint( + fields=("tenant_id", "provider", "uid"), + condition=Q(is_deleted=False), + name="unique_provider_uids", + ), + # RLS constraint (REQUIRED for all tenant-scoped models) + RowLevelSecurityConstraint( + field="tenant_id", + name="rls_on_%(class)s", + statements=["SELECT", "INSERT", "UPDATE", "DELETE"], + ), +] +``` + +### Indexes + +| Index Type | When to Use | Example | +|------------|-------------|---------| +| `models.Index` | Frequent queries | `fields=["tenant_id", "provider_id"]` | +| `GinIndex` | Full-text search, ArrayField | `fields=["text_search"]` | +| Conditional Index | Specific query patterns | `condition=Q(state="completed")` | +| Covering 
Index | Avoid table lookups | `include=["id", "name"]` | + +```python +indexes = [ + # Common query pattern + models.Index( + fields=["tenant_id", "provider_id", "-inserted_at"], + name="scans_prov_ins_desc_idx", + ), + # Conditional: only completed scans + models.Index( + fields=["tenant_id", "provider_id", "-inserted_at"], + condition=Q(state=StateChoices.COMPLETED), + name="scans_completed_idx", + ), + # Covering: include extra columns to avoid table lookup + models.Index( + fields=["tenant_id", "provider_id"], + include=["id", "graph_database"], + name="aps_active_graph_idx", + ), + # Full-text search + GinIndex(fields=["text_search"], name="gin_resources_search_idx"), +] +``` + +### Full-Text Search + +```python +from django.contrib.postgres.search import SearchVector, SearchVectorField + +text_search = models.GeneratedField( + expression=SearchVector("uid", weight="A", config="simple") + + SearchVector("name", weight="B", config="simple"), + output_field=SearchVectorField(), + db_persist=True, + null=True, + editable=False, +) +``` + +### ArrayField + +```python +from django.contrib.postgres.fields import ArrayField + +groups = ArrayField( + models.CharField(max_length=100), + blank=True, + null=True, + help_text="Groups for categorization", +) +``` + +### JSONField + +```python +# Structured data with defaults +metadata = models.JSONField(default=dict, blank=True) +scanner_args = models.JSONField(default=dict, blank=True) +``` + +### Encrypted Fields + +```python +# Binary field for encrypted data +_secret = models.BinaryField(db_column="secret") + +@property +def secret(self): + # Decrypt on read + decrypted_data = fernet.decrypt(self._secret) + return json.loads(decrypted_data.decode()) + +@secret.setter +def secret(self, value): + # Encrypt on write + self._secret = fernet.encrypt(json.dumps(value).encode()) +``` + +### Foreign Keys + +| on_delete | When to Use | +|-----------|-------------| +| `CASCADE` | Child cannot exist without parent (Finding → 
Scan) | +| `SET_NULL` | Optional relationship, keep child (Task → PeriodicTask) | +| `PROTECT` | Prevent deletion if children exist | + +```python +# Required relationship +provider = models.ForeignKey( + Provider, + on_delete=models.CASCADE, + related_name="scans", + related_query_name="scan", +) + +# Optional relationship +scheduler_task = models.ForeignKey( + PeriodicTask, + on_delete=models.SET_NULL, + null=True, + blank=True, +) +``` + +### Many-to-Many with Through Table + +```python +# On the model +tags = models.ManyToManyField( + ResourceTag, + through="ResourceTagMapping", + related_name="resources", +) + +# Through table (for RLS + extra fields) +class ResourceTagMapping(RowLevelSecurityProtectedModel): + id = models.UUIDField(primary_key=True, default=uuid4) + resource = models.ForeignKey(Resource, on_delete=models.CASCADE) + tag = models.ForeignKey(ResourceTag, on_delete=models.CASCADE) + + class Meta: + constraints = [ + models.UniqueConstraint( + fields=("tenant_id", "resource_id", "tag_id"), + name="unique_resource_tag_mappings", + ), + RowLevelSecurityConstraint(...), + ] +``` + +### Partitioned Tables + +```python +from psqlextra.models import PostgresPartitionedModel +from psqlextra.types import PostgresPartitioningMethod + +class Finding(PostgresPartitionedModel, RowLevelSecurityProtectedModel): + class PartitioningMeta: + method = PostgresPartitioningMethod.RANGE + key = ["id"] # UUIDv7 for time-based partitioning +``` + +**Use for:** High-volume, time-series data (findings, resource mappings) + +### Model Validation + +```python +def clean(self): + super().clean() + # Dynamic validation based on field value + getattr(self, f"validate_{self.provider}_uid")(self.uid) + +def save(self, *args, **kwargs): + self.full_clean() # Always validate before save + super().save(*args, **kwargs) +``` + +### JSONAPIMeta + +```python +class JSONAPIMeta: + resource_name = "provider-groups" # kebab-case, plural +``` + +--- + +## Decision Tree: New Model + +``` 
+Is it tenant-scoped data? +├── Yes → Inherit RowLevelSecurityProtectedModel +│ Add RowLevelSecurityConstraint +│ Consider: soft-delete? partitioning? +└── No → Regular models.Model (rare in Prowler) + +Does it need time-ordering for queries? +├── Yes → Use uuid7 for primary key +└── No → Use uuid4 (default) + +Is it high-volume time-series data? +├── Yes → Use PostgresPartitionedModel +│ Partition by id (uuid7) +└── No → Regular model + +Does it reference Provider? +├── Yes → Add ActiveProviderManager +│ Use CASCADE or filter is_deleted +└── No → Standard manager + +Needs full-text search? +├── Yes → Add SearchVectorField + GinIndex +└── No → Skip +``` diff --git a/skills/prowler-api/references/production-settings.md b/skills/prowler-api/references/production-settings.md new file mode 100644 index 0000000000..ea9b59c328 --- /dev/null +++ b/skills/prowler-api/references/production-settings.md @@ -0,0 +1,180 @@ +# Production Settings Reference + +## Django Deployment Checklist Command + +```bash +cd api && poetry run python src/backend/manage.py check --deploy +``` + +This command checks for common deployment issues and missing security settings. 
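In CI, the same check can be turned into a hard gate with Django's `--fail-level` flag (the command exits non-zero for issues at or above the given level; the default threshold is `ERROR`):

```bash
# Fail the pipeline on any deployment warning, not just errors
cd api && poetry run python src/backend/manage.py check --deploy --fail-level WARNING
```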
+ +--- + +## Critical Settings Table + +| Setting | Production Value | Risk if Wrong | +|---------|-----------------|---------------| +| `DEBUG` | `False` | Exposes stack traces, settings, SQL queries | +| `SECRET_KEY` | Env var, rotated | Session hijacking, CSRF bypass | +| `ALLOWED_HOSTS` | Explicit list | Host header attacks | +| `SECURE_SSL_REDIRECT` | `True` | Credentials sent over HTTP | +| `SESSION_COOKIE_SECURE` | `True` | Session cookies over HTTP | +| `CSRF_COOKIE_SECURE` | `True` | CSRF tokens over HTTP | +| `SECURE_HSTS_SECONDS` | `31536000` (1 year) | Downgrade attacks | +| `CONN_MAX_AGE` | `60` or higher | Connection pool exhaustion | + +--- + +## Full Production Settings Example + +```python +# settings/production.py +import environ + +env = environ.Env() + +# ============================================================================= +# CORE SECURITY +# ============================================================================= + +DEBUG = False # NEVER True in production + +# Load from environment - NEVER hardcode +SECRET_KEY = env("SECRET_KEY") + +# Explicit list - no wildcards +ALLOWED_HOSTS = env.list("ALLOWED_HOSTS") +# Example: ALLOWED_HOSTS=api.prowler.com,prowler.com + +# ============================================================================= +# HTTPS ENFORCEMENT +# ============================================================================= + +# Redirect all HTTP to HTTPS +SECURE_SSL_REDIRECT = True + +# Trust X-Forwarded-Proto header from reverse proxy (nginx, ALB, etc.) 
+SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https") + +# ============================================================================= +# SECURE COOKIES +# ============================================================================= + +# Only send session cookie over HTTPS +SESSION_COOKIE_SECURE = True + +# Only send CSRF cookie over HTTPS +CSRF_COOKIE_SECURE = True + +# Prevent JavaScript access to session cookie (XSS protection) +SESSION_COOKIE_HTTPONLY = True + +# SameSite attribute for CSRF protection +CSRF_COOKIE_SAMESITE = "Strict" +SESSION_COOKIE_SAMESITE = "Strict" + +# ============================================================================= +# HTTP STRICT TRANSPORT SECURITY (HSTS) +# ============================================================================= + +# Tell browsers to always use HTTPS for this domain +SECURE_HSTS_SECONDS = 31536000 # 1 year + +# Apply HSTS to all subdomains +SECURE_HSTS_INCLUDE_SUBDOMAINS = True + +# Allow browser preload lists (requires domain submission) +SECURE_HSTS_PRELOAD = True + +# ============================================================================= +# CONTENT SECURITY +# ============================================================================= + +# Prevent clickjacking - deny all framing +X_FRAME_OPTIONS = "DENY" + +# Prevent MIME type sniffing +SECURE_CONTENT_TYPE_NOSNIFF = True + +# Enable XSS filter in older browsers +SECURE_BROWSER_XSS_FILTER = True + +# ============================================================================= +# DATABASE +# ============================================================================= + +# Connection pooling - reuse connections for 60 seconds +# Reduces connection overhead for frequent requests +CONN_MAX_AGE = 60 + +# For high-traffic: consider connection pooler like PgBouncer +# CONN_MAX_AGE = None # Let PgBouncer manage connections + +# ============================================================================= +# LOGGING +# 
============================================================================= + +LOGGING = { + "version": 1, + "disable_existing_loggers": False, + "formatters": { + "verbose": { + "format": "{levelname} {asctime} {module} {process:d} {thread:d} {message}", + "style": "{", + }, + }, + "handlers": { + "console": { + "class": "logging.StreamHandler", + "formatter": "verbose", + }, + }, + "root": { + "handlers": ["console"], + "level": "INFO", # WARNING in production to reduce noise + }, + "loggers": { + "django.security": { + "handlers": ["console"], + "level": "WARNING", + "propagate": False, + }, + }, +} +``` + +--- + +## Environment Variables Checklist + +Required environment variables for production: + +```bash +# Core +SECRET_KEY= +ALLOWED_HOSTS=api.example.com,example.com +DEBUG=False + +# Database +DATABASE_URL= +# Or individual vars: +POSTGRES_HOST=... +POSTGRES_PORT=5432 +POSTGRES_DB=... +POSTGRES_USER=... +POSTGRES_PASSWORD=... + +# Redis (for Celery) +REDIS_URL=redis://host:6379/0 + +# Optional +SENTRY_DSN=https://...@sentry.io/... +``` + +--- + +## References + +- [Django Deployment Checklist](https://docs.djangoproject.com/en/5.2/howto/deployment/checklist/) +- [Django Security Settings](https://docs.djangoproject.com/en/5.2/topics/security/) +- [OWASP Secure Headers](https://owasp.org/www-project-secure-headers/) diff --git a/skills/prowler-commit/SKILL.md b/skills/prowler-commit/SKILL.md new file mode 100644 index 0000000000..0f67cbe882 --- /dev/null +++ b/skills/prowler-commit/SKILL.md @@ -0,0 +1,180 @@ +--- +name: prowler-commit +description: > + Creates professional git commits following conventional-commits format. + Trigger: When creating commits, after completing code changes, when user asks to commit. 
+license: Apache-2.0 +metadata: + author: prowler-cloud + version: "1.1.0" + scope: [root, api, ui, prowler, mcp_server] + auto_invoke: + - "Creating a git commit" + - "Committing changes" +--- + +## Critical Rules + +- ALWAYS use conventional-commits format: `type(scope): description` +- ALWAYS keep the first line under 72 characters +- ALWAYS ask for user confirmation before committing +- NEVER be overly specific (avoid counts like "6 subsections", "3 files") +- NEVER include implementation details in the title +- NEVER use `-n` flag unless user explicitly requests it +- NEVER use `git push --force` or `git push -f` (destructive, rewrites history) +- NEVER proactively offer to commit - wait for user to explicitly request it + +--- + +## Commit Format + +``` +type(scope): concise description + +- Key change 1 +- Key change 2 +- Key change 3 +``` + +### Types + +| Type | Use When | +|------|----------| +| `feat` | New feature or functionality | +| `fix` | Bug fix | +| `docs` | Documentation only | +| `chore` | Maintenance, dependencies, configs | +| `refactor` | Code change without feature/fix | +| `test` | Adding or updating tests | +| `perf` | Performance improvement | +| `style` | Formatting, no code change | + +### Scopes + +| Scope | When | +|-------|------| +| `api` | Changes in `api/` | +| `ui` | Changes in `ui/` | +| `sdk` | Changes in `prowler/` | +| `mcp` | Changes in `mcp_server/` | +| `skills` | Changes in `skills/` | +| `ci` | Changes in `.github/` | +| `docs` | Changes in `docs/` | +| *omit* | Multiple scopes or root-level | + +--- + +## Good vs Bad Examples + +### Title Line + +``` +# GOOD - Concise and clear +feat(api): add provider connection retry logic +fix(ui): resolve dashboard loading state +chore(skills): add Celery documentation +docs: update installation guide + +# BAD - Too specific or verbose +feat(api): add provider connection retry logic with exponential backoff and jitter (3 retries max) +chore(skills): add comprehensive Celery 
documentation covering 8 topics +fix(ui): fix the bug in dashboard component on line 45 +``` + +### Body (Bullet Points) + +``` +# GOOD - High-level changes +- Add retry mechanism for failed connections +- Document task composition patterns +- Expand configuration reference + +# BAD - Too detailed +- Add retry with max_retries=3, backoff=True, jitter=True +- Add 6 subsections covering chain, group, chord +- Update lines 45-67 in dashboard.tsx +``` + +--- + +## Workflow + +1. **Analyze changes** + ```bash + git status + git diff --stat HEAD + git log -3 --oneline # Check recent commit style + ``` + +2. **Draft commit message** + - Choose appropriate type and scope + - Write concise title (< 72 chars) + - Add 2-5 bullet points for significant changes + +3. **Present to user for confirmation** + - Show files to be committed + - Show proposed message + - Wait for explicit confirmation + +4. **Execute commit** + ```bash + git add + git commit -m "$(cat <<'EOF' + type(scope): description + + - Change 1 + - Change 2 + EOF + )" + ``` + +--- + +## Decision Tree + +``` +Single file changed? +├─ Yes → May omit body, title only +└─ No → Include body with key changes + +Multiple scopes affected? +├─ Yes → Omit scope: `feat: description` +└─ No → Include scope: `feat(api): description` + +Fixing a bug? +├─ User-facing → fix(scope): description +└─ Internal/dev → chore(scope): fix description + +Adding documentation? 
+├─ Code docs (docstrings) → Part of feat/fix +└─ Standalone docs → docs: or docs(scope): +``` + +--- + +## Commands + +```bash +# Check current state +git status +git diff --stat HEAD + +# Standard commit +git add +git commit -m "type(scope): description" + +# Multi-line commit +git commit -m "$(cat <<'EOF' +type(scope): description + +- Change 1 +- Change 2 +EOF +)" + +# Amend last commit (same message) +git commit --amend --no-edit + +# Amend with new message +git commit --amend -m "new message" +``` diff --git a/skills/prowler-test-api/SKILL.md b/skills/prowler-test-api/SKILL.md index f0cf7bade8..fc00ed31cc 100644 --- a/skills/prowler-test-api/SKILL.md +++ b/skills/prowler-test-api/SKILL.md @@ -6,7 +6,7 @@ description: > license: Apache-2.0 metadata: author: prowler-cloud - version: "1.0" + version: "1.1.0" scope: [root, api] auto_invoke: - "Writing Prowler API tests" @@ -17,115 +17,136 @@ allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task ## Critical Rules - ALWAYS use `response.json()["data"]` not `response.data` -- ALWAYS use `content_type = "application/vnd.api+json"` in requests -- ALWAYS test cross-tenant isolation with `other_tenant_provider` fixture +- ALWAYS use `content_type = "application/vnd.api+json"` for PATCH/PUT requests +- ALWAYS use `format="vnd.api+json"` for POST requests +- ALWAYS test cross-tenant isolation - RLS returns 404, NOT 403 - NEVER skip RLS isolation tests when adding new endpoints - NEVER use realistic-looking API keys in tests (TruffleHog will flag them) +- ALWAYS mock BOTH `.delay()` AND `Task.objects.get` for async task tests --- -## 1. JSON:API Format (Critical) +## 1. 
Fixture Dependency Chain -```python -content_type = "application/vnd.api+json" - -payload = { - "data": { - "type": "providers", # Plural, kebab-case - "id": str(resource.id), # Required for PATCH - "attributes": {"alias": "updated"}, - } -} - -response.json()["data"]["attributes"]["alias"] +``` +create_test_user (session) ─► tenants_fixture (function) ─► authenticated_client + │ + └─► providers_fixture ─► scans_fixture ─► findings_fixture ``` ---- - -## 2. RLS Isolation Tests - -```python -def test_cross_tenant_access_denied(self, authenticated_client, other_tenant_provider): - """User cannot see resources from other tenants.""" - response = authenticated_client.get( - reverse("provider-detail", args=[other_tenant_provider.id]) - ) - assert response.status_code == status.HTTP_404_NOT_FOUND -``` - ---- - -## 3. RBAC Tests - -```python -def test_unlimited_visibility_sees_all(self, authenticated_client_admin, providers_fixture): - response = authenticated_client_admin.get(reverse("provider-list")) - assert len(response.json()["data"]) == len(providers_fixture) - -def test_limited_visibility_sees_only_assigned(self, authenticated_client_limited): - # User with unlimited_visibility=False sees only providers in their provider_groups - pass - -def test_permission_required(self, authenticated_client_readonly): - response = authenticated_client_readonly.post(reverse("provider-list"), ...) - assert response.status_code == status.HTTP_403_FORBIDDEN -``` - ---- - -## 4. Managers (objects vs all_objects) - -```python -def test_objects_excludes_deleted(self): - deleted_provider = Provider.objects.create(..., is_deleted=True) - assert deleted_provider not in Provider.objects.all() - assert deleted_provider in Provider.all_objects.all() -``` - ---- - -## 5. 
Celery Task Tests - -```python -@patch("tasks.tasks.perform_prowler_scan") -def test_task_success(self, mock_scan): - mock_scan.return_value = {"findings_count": 100} - result = perform_scan_task(tenant_id="...", scan_id="...", provider_id="...") - assert result["findings_count"] == 100 -``` - ---- - -## 6. Key Fixtures +### Key Fixtures | Fixture | Description | |---------|-------------| -| `create_test_user` | Session user (dev@prowler.com) | -| `tenants_fixture` | 3 tenants (2 with membership, 1 isolated) | -| `providers_fixture` | Providers in tenant 1 | -| `other_tenant_provider` | Provider in isolated tenant (RLS tests) | -| `authenticated_client` | Client with JWT for tenant 1 | +| `create_test_user` | Session user (`dev@prowler.com`) | +| `tenants_fixture` | 3 tenants: [0],[1] have membership, [2] isolated | +| `authenticated_client` | JWT client for tenant[0] | +| `providers_fixture` | 9 providers in tenant[0] | +| `tasks_fixture` | 2 Celery tasks with TaskResult | + +### RBAC Fixtures + +| Fixture | Permissions | +|---------|-------------| +| `authenticated_client_rbac` | All permissions (admin) | +| `authenticated_client_rbac_noroles` | Membership but NO roles | +| `authenticated_client_no_permissions_rbac` | All permissions = False | --- -## 7. Fake Secrets in Tests (TruffleHog) - -CI runs TruffleHog to detect leaked secrets. Use obviously fake values: +## 2. JSON:API Requests +### POST (Create) ```python -# BAD - TruffleHog will flag these patterns: -api_key = "sk-test1234567890T3BlbkFJtest1234567890" # OpenAI pattern -api_key = "AKIA..." # AWS pattern - -# GOOD - clearly fake values: -api_key = "sk-fake-test-key-for-unit-testing-only" -api_key = "fake-aws-key-for-testing" +response = client.post( + reverse("provider-list"), + data={"data": {"type": "providers", "attributes": {...}}}, + format="vnd.api+json", # NOT content_type! 
+) ``` -**Patterns to avoid:** -- `sk-*T3BlbkFJ*` (OpenAI) -- `AKIA[A-Z0-9]{16}` (AWS Access Key) -- `ghp_*` or `gho_*` (GitHub tokens) +### PATCH (Update) +```python +response = client.patch( + reverse("provider-detail", kwargs={"pk": provider.id}), + data={"data": {"type": "providers", "id": str(provider.id), "attributes": {...}}}, + content_type="application/vnd.api+json", # NOT format! +) +``` + +### Reading Responses +```python +data = response.json()["data"] +attrs = data["attributes"] +errors = response.json()["errors"] # For 400 responses +``` + +--- + +## 3. RLS Isolation (Cross-Tenant) + +**RLS returns 404, NOT 403** - the resource is invisible, not forbidden. + +```python +def test_cross_tenant_access_denied(self, authenticated_client, tenants_fixture): + other_tenant = tenants_fixture[2] # Isolated tenant + foreign_provider = Provider.objects.create(tenant_id=other_tenant.id, ...) + + response = authenticated_client.get(reverse("provider-detail", args=[foreign_provider.id])) + assert response.status_code == status.HTTP_404_NOT_FOUND # NOT 403! +``` + +--- + +## 4. Celery Task Testing + +### Testing Strategies + +| Strategy | Use For | +|----------|---------| +| Mock `.delay()` + `Task.objects.get` | Testing views that trigger tasks | +| `task.apply()` | Synchronous task logic testing | +| Mock `chain`/`group` | Testing Canvas orchestration | +| Mock `connection` | Testing `@set_tenant` decorator | +| Mock `apply_async` | Testing Beat scheduled tasks | + +### Why NOT `task_always_eager` + +| Problem | Impact | +|---------|--------| +| No task serialization | Misses argument type errors | +| No broker interaction | Hides connection issues | +| Different execution context | `self.request` behaves differently | + +**Instead, use:** `task.apply()` for sync execution, mocking for isolation. + +> **Full examples:** See [assets/api_test.py](assets/api_test.py) for `TestCeleryTaskLogic`, `TestCeleryCanvas`, `TestSetTenantDecorator`, `TestBeatScheduling`. 
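A dependency-free sketch of the first strategy. The names mirror the real tests, but the view body and id value here are hypothetical stand-ins; actual tests patch the module paths instead, e.g. `@patch("api.v1.views.delete_provider_task.delay")` and `@patch("api.v1.views.Task.objects.get")`.

```python
from unittest.mock import Mock

# Stand-ins for the two things an async view touches (illustrative only).
delete_provider_task = Mock()  # stands in for the @shared_task
Task = Mock()                  # stands in for api.models.Task


def delete_provider_view(provider_id):
    """Sketch of a view body that dispatches an async deletion."""
    celery_result = delete_provider_task.delay(provider_id=provider_id)
    # The view looks up the APITask row created on dispatch -- this is why
    # Task.objects.get must be mocked alongside .delay().
    api_task = Task.objects.get(id=celery_result.id)
    return {"status": 202, "content_location": f"/api/v1/tasks/{api_task.id}"}


# Mock BOTH the dispatch and the Task lookup, then assert on the response.
celery_result = Mock(id="11111111-2222-3333-4444-555555555555")
delete_provider_task.delay.return_value = celery_result
Task.objects.get.return_value = Mock(id=celery_result.id)

response = delete_provider_view("prov-1")
assert response["status"] == 202
assert response["content_location"].endswith(celery_result.id)
delete_provider_task.delay.assert_called_once_with(provider_id="prov-1")
```

Mocking only `.delay()` is the common mistake: the view then calls the real `Task.objects.get` for a row the mocked dispatch never created, and the test fails with `DoesNotExist` instead of exercising the 202 + `Content-Location` path.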
+ +--- + +## 5. Fake Secrets (TruffleHog) + +```python +# BAD - TruffleHog flags these: +api_key = "sk-test1234567890T3BlbkFJtest1234567890" + +# GOOD - obviously fake: +api_key = "sk-fake-test-key-for-unit-testing-only" +``` + +--- + +## 6. Response Status Codes + +| Scenario | Code | +|----------|------| +| Successful GET | 200 | +| Successful POST | 201 | +| Async operation (DELETE/scan trigger) | 202 | +| Sync DELETE | 204 | +| Validation error | 400 | +| Missing permission (RBAC) | 403 | +| RLS isolation / not found | 404 | --- @@ -134,11 +155,13 @@ api_key = "fake-aws-key-for-testing" ```bash cd api && poetry run pytest -x --tb=short cd api && poetry run pytest -k "test_provider" -cd api && poetry run pytest -k "TestRBAC" +cd api && poetry run pytest api/src/backend/api/tests/test_rbac.py ``` --- ## Resources -- **Documentation**: See [references/test-api-docs.md](references/test-api-docs.md) for local file paths and documentation +- **Full Examples**: See [assets/api_test.py](assets/api_test.py) for complete test patterns +- **Fixture Reference**: See [references/test-api-docs.md](references/test-api-docs.md) +- **Fixture Source**: `api/src/backend/conftest.py` diff --git a/skills/prowler-test-api/assets/api_test.py b/skills/prowler-test-api/assets/api_test.py new file mode 100644 index 0000000000..0b70cd599d --- /dev/null +++ b/skills/prowler-test-api/assets/api_test.py @@ -0,0 +1,371 @@ +# Example: Prowler API Test Patterns +# Source: api/src/backend/api/tests/test_views.py + +from unittest.mock import Mock, patch + +import pytest +from conftest import ( + API_JSON_CONTENT_TYPE, + TEST_PASSWORD, + TEST_USER, + get_api_tokens, + get_authorization_header, +) +from django.urls import reverse +from rest_framework import status + +from api.models import Provider, Scan, StateChoices +from api.rls import Tenant + + +@pytest.mark.django_db +class TestProviderViewSet: + """Example API tests for Provider endpoints.""" + + def test_list_providers(self, 
authenticated_client, providers_fixture): + """GET list returns all providers for authenticated tenant.""" + response = authenticated_client.get(reverse("provider-list")) + + assert response.status_code == status.HTTP_200_OK + assert len(response.json()["data"]) == len(providers_fixture) + + def test_create_provider(self, authenticated_client): + """POST with JSON:API format creates provider.""" + response = authenticated_client.post( + reverse("provider-list"), + data={ + "data": { + "type": "providers", + "attributes": { + "provider": "aws", + "uid": "123456789012", + "alias": "my-aws-account", + }, + } + }, + format="vnd.api+json", # Use format= for POST + ) + + assert response.status_code == status.HTTP_201_CREATED + assert response.json()["data"]["attributes"]["uid"] == "123456789012" + + def test_update_provider(self, authenticated_client, providers_fixture): + """PATCH with JSON:API format updates provider.""" + provider = providers_fixture[0] + + payload = { + "data": { + "type": "providers", + "id": str(provider.id), # ID required for PATCH + "attributes": {"alias": "updated-alias"}, + } + } + + response = authenticated_client.patch( + reverse("provider-detail", kwargs={"pk": provider.id}), + data=payload, + content_type="application/vnd.api+json", # Use content_type= for PATCH + ) + + assert response.status_code == status.HTTP_200_OK + assert response.json()["data"]["attributes"]["alias"] == "updated-alias" + + +@pytest.mark.django_db +class TestRLSIsolation: + """Example RLS cross-tenant isolation tests.""" + + def test_cross_tenant_access_returns_404( + self, authenticated_client, tenants_fixture + ): + """User cannot see resources from other tenants - returns 404 NOT 403.""" + # Create resource in tenant user has NO access to (tenant[2] is isolated) + other_tenant = tenants_fixture[2] + foreign_provider = Provider.objects.create( + provider="aws", + uid="999888777666", + alias="foreign_provider", + tenant_id=other_tenant.id, + ) + + # Try to access - 
should get 404 (not 403!) + response = authenticated_client.get( + reverse("provider-detail", args=[foreign_provider.id]) + ) + assert response.status_code == status.HTTP_404_NOT_FOUND + + def test_list_excludes_other_tenants( + self, authenticated_client, providers_fixture, tenants_fixture + ): + """List endpoints only return resources from user's tenants.""" + # Create provider in isolated tenant + other_tenant = tenants_fixture[2] + Provider.objects.create( + provider="aws", + uid="foreign123", + tenant_id=other_tenant.id, + ) + + response = authenticated_client.get(reverse("provider-list")) + assert response.status_code == status.HTTP_200_OK + + # Should only see providers_fixture (9 providers in tenant[0]) + assert len(response.json()["data"]) == len(providers_fixture) + + +@pytest.mark.django_db +class TestRBACPermissions: + """Example RBAC permission tests.""" + + def test_requires_permission(self, authenticated_client_no_permissions_rbac): + """Users without manage_providers cannot create providers.""" + response = authenticated_client_no_permissions_rbac.post( + reverse("provider-list"), + data={ + "data": { + "type": "providers", + "attributes": {"provider": "aws", "uid": "123456789012"}, + } + }, + format="vnd.api+json", + ) + assert response.status_code == status.HTTP_403_FORBIDDEN + + def test_user_with_no_roles_denied(self, authenticated_client_rbac_noroles): + """User with membership but no roles gets 403.""" + response = authenticated_client_rbac_noroles.get(reverse("user-list")) + assert response.status_code == status.HTTP_403_FORBIDDEN + + def test_admin_sees_all(self, authenticated_client_rbac, providers_fixture): + """Admin with unlimited_visibility=True sees all providers.""" + response = authenticated_client_rbac.get(reverse("provider-list")) + assert response.status_code == status.HTTP_200_OK + + +@pytest.mark.django_db +class TestAsyncOperations: + """Example async task tests - mock BOTH .delay() AND Task.objects.get.""" + + 
@patch("api.v1.views.Task.objects.get") + @patch("api.v1.views.delete_provider_task.delay") + def test_delete_provider_returns_202( + self, + mock_delete_task, + mock_task_get, + authenticated_client, + providers_fixture, + tasks_fixture, + ): + """DELETE returns 202 Accepted with Content-Location header.""" + provider = providers_fixture[0] + prowler_task = tasks_fixture[0] + + # Mock the Celery task + task_mock = Mock() + task_mock.id = prowler_task.id + mock_delete_task.return_value = task_mock + mock_task_get.return_value = prowler_task + + response = authenticated_client.delete( + reverse("provider-detail", kwargs={"pk": provider.id}) + ) + + assert response.status_code == status.HTTP_202_ACCEPTED + assert "Content-Location" in response.headers + assert f"/api/v1/tasks/{prowler_task.id}" in response.headers["Content-Location"] + + # Verify task was called + mock_delete_task.assert_called_once() + + @patch("api.v1.views.Task.objects.get") + @patch("api.v1.views.perform_scan_task.delay") + def test_trigger_scan_returns_202( + self, + mock_scan_task, + mock_task_get, + authenticated_client, + providers_fixture, + tasks_fixture, + ): + """POST to scan trigger returns 202 with task location.""" + provider = providers_fixture[0] + prowler_task = tasks_fixture[0] + + task_mock = Mock() + task_mock.id = prowler_task.id + mock_scan_task.return_value = task_mock + mock_task_get.return_value = prowler_task + + response = authenticated_client.post( + reverse("provider-scan", kwargs={"pk": provider.id}), + format="vnd.api+json", + ) + + assert response.status_code == status.HTTP_202_ACCEPTED + + +@pytest.mark.django_db +class TestJSONAPIResponses: + """Example JSON:API response handling.""" + + def test_read_single_resource(self, authenticated_client, providers_fixture): + """Read data from single resource response.""" + provider = providers_fixture[0] + response = authenticated_client.get( + reverse("provider-detail", kwargs={"pk": provider.id}) + ) + + data = 
response.json()["data"] + attrs = data["attributes"] + resource_id = data["id"] + + assert resource_id == str(provider.id) + assert attrs["provider"] == provider.provider + + def test_read_list_response(self, authenticated_client, providers_fixture): + """Read data from list response.""" + response = authenticated_client.get(reverse("provider-list")) + + items = response.json()["data"] + assert len(items) == len(providers_fixture) + + def test_read_relationships(self, authenticated_client, scans_fixture): + """Read relationship data.""" + scan = scans_fixture[0] + response = authenticated_client.get( + reverse("scan-detail", kwargs={"pk": scan.id}) + ) + + data = response.json()["data"] + relationships = data["relationships"] + provider_rel = relationships["provider"]["data"] + + assert provider_rel["type"] == "providers" + assert provider_rel["id"] == str(scan.provider_id) + + def test_error_response(self, authenticated_client): + """Read error response structure.""" + response = authenticated_client.post( + reverse("user-list"), + data={"email": "invalid"}, # Missing required fields + format="json", + ) + + assert response.status_code == status.HTTP_400_BAD_REQUEST + errors = response.json()["errors"] + # Error has source.pointer and detail + assert "source" in errors[0] + assert "detail" in errors[0] + + +@pytest.mark.django_db +class TestSoftDelete: + """Example soft-delete manager tests.""" + + def test_objects_excludes_soft_deleted(self, providers_fixture): + """Default manager excludes soft-deleted records.""" + provider = providers_fixture[0] + provider.is_deleted = True + provider.save() + + # objects manager excludes deleted + assert provider not in Provider.objects.all() + + # all_objects includes deleted + assert provider in Provider.all_objects.all() + + +# ============================================================================= +# CELERY TASK TESTING +# ============================================================================= + + 
+@pytest.mark.django_db +class TestCeleryTaskLogic: + """Example: Testing Celery task logic directly with apply().""" + + def test_task_logic_directly(self, tenants_fixture, providers_fixture): + """Use apply() for synchronous execution without Celery worker.""" + from tasks.tasks import check_provider_connection_task + + tenant = tenants_fixture[0] + provider = providers_fixture[0] + + # Execute task synchronously (no broker needed) + result = check_provider_connection_task.apply( + kwargs={"tenant_id": str(tenant.id), "provider_id": str(provider.id)} + ) + + assert result.successful() + assert result.result["connected"] is True + + +@pytest.mark.django_db +class TestCeleryCanvas: + """Example: Testing Canvas (chain/group) task orchestration.""" + + @patch("tasks.tasks.chain") + @patch("tasks.tasks.group") + def test_post_scan_workflow(self, mock_group, mock_chain, tenants_fixture): + """Mock chain/group to verify task orchestration.""" + from tasks.tasks import _perform_scan_complete_tasks + + tenant = tenants_fixture[0] + + # Mock chain.apply_async + mock_chain_instance = Mock() + mock_chain.return_value = mock_chain_instance + + _perform_scan_complete_tasks(str(tenant.id), "scan-123", "provider-456") + + # Verify chain was called + assert mock_chain.called + mock_chain_instance.apply_async.assert_called() + + +@pytest.mark.django_db +class TestSetTenantDecorator: + """Example: Testing @set_tenant decorator behavior.""" + + @patch("api.decorators.connection") + def test_sets_rls_context(self, mock_conn, tenants_fixture, providers_fixture): + """Verify @set_tenant sets RLS context via SET_CONFIG_QUERY.""" + from tasks.tasks import check_provider_connection_task + + tenant = tenants_fixture[0] + provider = providers_fixture[0] + + # Call task with tenant_id - decorator sets RLS and pops it + check_provider_connection_task.apply( + kwargs={"tenant_id": str(tenant.id), "provider_id": str(provider.id)} + ) + + # Verify SET_CONFIG_QUERY was executed + 
mock_conn.cursor.return_value.__enter__.return_value.execute.assert_called() + + +@pytest.mark.django_db +class TestBeatScheduling: + """Example: Testing Beat scheduled task creation.""" + + @patch("tasks.beat.perform_scheduled_scan_task.apply_async") + def test_schedule_provider_scan(self, mock_apply, providers_fixture): + """Verify periodic task is created with correct settings.""" + from django_celery_beat.models import PeriodicTask + + from tasks.beat import schedule_provider_scan + + provider = providers_fixture[0] + mock_apply.return_value = Mock(id="task-123") + + schedule_provider_scan(provider) + + # Verify periodic task created + assert PeriodicTask.objects.filter( + name=f"scan-perform-scheduled-{provider.id}" + ).exists() + + # Verify immediate execution with countdown + mock_apply.assert_called_once() + call_kwargs = mock_apply.call_args + assert call_kwargs.kwargs.get("countdown") == 5 diff --git a/skills/prowler-test-api/references/test-api-docs.md b/skills/prowler-test-api/references/test-api-docs.md index 2aa3e0134f..0e7d02821a 100644 --- a/skills/prowler-test-api/references/test-api-docs.md +++ b/skills/prowler-test-api/references/test-api-docs.md @@ -1,18 +1,214 @@ -# API Test Documentation +# API Test Documentation Reference -## Local Documentation +## File Locations -For API testing patterns, see: +| Type | Path | +|------|------| +| Central fixtures | `api/src/backend/conftest.py` | +| API unit tests | `api/src/backend/api/tests/` | +| Integration tests | `api/src/backend/api/tests/integration/` | +| Task tests | `api/src/backend/tasks/tests/` | +| Dev fixtures (JSON) | `api/src/backend/api/fixtures/dev/` | -- `api/src/backend/conftest.py` - All fixtures -- `api/src/backend/api/tests/` - API tests -- `api/src/backend/tasks/tests/` - Task tests +--- -## Contents +## Fixture Dependency Graph -The documentation covers: -- JSON:API format for requests/responses -- RLS isolation test patterns -- RBAC permission tests -- Celery task mocking -- Test 
fixtures and their usage +``` +create_test_user (session) + │ + └─► tenants_fixture (function) + │ + ├─► set_user_admin_roles_fixture + │ │ + │ └─► authenticated_client + │ └─► (most API tests use this) + │ + ├─► providers_fixture + │ └─► scans_fixture + │ └─► findings_fixture + │ + └─► RBAC fixtures (create their own tenants/users): + ├─► create_test_user_rbac + │ └─► authenticated_client_rbac + │ + ├─► create_test_user_rbac_no_roles + │ └─► authenticated_client_rbac_noroles + │ + ├─► create_test_user_rbac_limited + │ └─► authenticated_client_no_permissions_rbac + │ + ├─► create_test_user_rbac_manage_account + │ └─► authenticated_client_rbac_manage_account + │ + └─► create_test_user_rbac_manage_users_only + └─► authenticated_client_rbac_manage_users_only +``` + +--- + +## Test File Contents + +### `api/src/backend/api/tests/test_views.py` + +Main ViewSet tests covering: +- `TestUserViewSet` - User CRUD, password validation, deletion cascades +- `TestTenantViewSet` - Tenant operations +- `TestProviderViewSet` - Provider CRUD, async deletion, connection testing +- `TestScanViewSet` - Scan trigger, list, filter +- `TestFindingViewSet` - Finding queries, filters +- `TestResourceViewSet` - Resource listing with tags +- `TestTaskViewSet` - Celery task status +- `TestIntegrationViewSet` - S3/Security Hub integrations +- `TestComplianceOverviewViewSet` - Compliance data +- And many more... 
+ +### `api/src/backend/api/tests/test_rbac.py` + +RBAC permission tests covering: +- Permission checks for each ViewSet +- Role-based access patterns +- `unlimited_visibility` behavior +- Provider group visibility filtering +- Self-access patterns (`/me` endpoint) + +### `api/src/backend/api/tests/integration/test_rls_transaction.py` + +RLS enforcement tests: +- `rls_transaction` context manager +- Invalid UUID validation +- Custom parameter names + +### `api/src/backend/api/tests/integration/test_providers.py` + +Provider integration tests: +- Delete + recreate flow with async tasks +- End-to-end provider lifecycle + +### `api/src/backend/api/tests/integration/test_authentication.py` + +Authentication tests: +- JWT token flow +- API key authentication +- Social login (SAML, OAuth) +- Cross-tenant token isolation + +--- + +## Key Test Classes and Their Fixtures + +### Standard API Tests + +```python +@pytest.mark.django_db +class TestProviderViewSet: + def test_list(self, authenticated_client, providers_fixture): + # authenticated_client has JWT for tenant[0] + # providers_fixture has 9 providers in tenant[0] + ... +``` + +### RBAC Tests + +```python +@pytest.mark.django_db +class TestProviderRBAC: + def test_with_permission(self, authenticated_client_rbac, ...): + # Has all permissions + ... + + def test_without_permission(self, authenticated_client_no_permissions_rbac, ...): + # Has no permissions (all False) + ... 
+``` + +### Cross-Tenant Tests + +```python +@pytest.mark.django_db +class TestCrossTenantIsolation: + def test_cannot_access_other_tenant(self, authenticated_client, tenants_fixture): + other_tenant = tenants_fixture[2] # Isolated tenant + # Create resource in other_tenant + # Try to access with authenticated_client + # Expect 404 +``` + +### Async Task Tests + +```python +@pytest.mark.django_db +class TestAsyncOperations: + @patch("api.v1.views.Task.objects.get") + @patch("api.v1.views.some_task.delay") + def test_async_operation(self, mock_task, mock_task_get, tasks_fixture, ...): + prowler_task = tasks_fixture[0] + mock_task.return_value = Mock(id=prowler_task.id) + mock_task_get.return_value = prowler_task + # Execute and verify 202 response +``` + +--- + +## Constants Available from conftest + +```python +from conftest import ( + API_JSON_CONTENT_TYPE, # "application/vnd.api+json" + NO_TENANT_HTTP_STATUS, # status.HTTP_401_UNAUTHORIZED + TEST_USER, # "dev@prowler.com" + TEST_PASSWORD, # "testing_psswd" + TODAY, # str(datetime.today().date()) + today_after_n_days, # Function: (n: int) -> str + get_api_tokens, # Function: (client, email, password, tenant_id?) 
-> (access, refresh) + get_authorization_header, # Function: (token) -> {"Authorization": f"Bearer {token}"} +) +``` + +--- + +## Running Tests + +```bash +# Full test suite +cd api && poetry run pytest + +# Fast fail on first error +cd api && poetry run pytest -x + +# Short traceback +cd api && poetry run pytest --tb=short + +# Specific file +cd api && poetry run pytest api/src/backend/api/tests/test_views.py + +# Pattern match +cd api && poetry run pytest -k "Provider" + +# Verbose with print output +cd api && poetry run pytest -v -s + +# With coverage +cd api && poetry run pytest --cov=api --cov-report=html + +# Parallel execution +cd api && poetry run pytest -n auto +``` + +--- + +## pytest Configuration + +From `api/pyproject.toml`: + +```toml +[tool.pytest.ini_options] +DJANGO_SETTINGS_MODULE = "config.settings" +python_files = "test_*.py" +addopts = "--reuse-db" +``` + +Key points: +- Uses `--reuse-db` for faster test runs +- Settings from `config.settings` +- Test files must match `test_*.py` diff --git a/skills/skill-sync/assets/sync.sh b/skills/skill-sync/assets/sync.sh index 15b53a5d7c..941f7bae56 100755 --- a/skills/skill-sync/assets/sync.sh +++ b/skills/skill-sync/assets/sync.sh @@ -137,6 +137,8 @@ extract_metadata() { # On multi-line list, only accept "- item" lines. Anything else ends the list. 
line = $0 + # Stop at frontmatter delimiter (getline bypasses pattern matching) + if (line ~ /^---$/) break if (line ~ /^[[:space:]]*-[[:space:]]*/) { sub(/^[[:space:]]*-[[:space:]]*/, "", line) line = trim(line) diff --git a/ui/AGENTS.md b/ui/AGENTS.md index 94c3fab23e..edb3f53447 100644 --- a/ui/AGENTS.md +++ b/ui/AGENTS.md @@ -21,8 +21,10 @@ When performing these actions, ALWAYS invoke the corresponding skill FIRST: | Add changelog entry for a PR or feature | `prowler-changelog` | | App Router / Server Actions | `nextjs-15` | | Building AI chat features | `ai-sdk-5` | +| Committing changes | `prowler-commit` | | Create PR that requires changelog entry | `prowler-changelog` | | Creating Zod schemas | `zod-4` | +| Creating a git commit | `prowler-commit` | | Creating/modifying Prowler UI components | `prowler-ui` | | Review changelog format and conventions | `prowler-changelog` | | Update CHANGELOG.md in any component | `prowler-changelog` |