Compare commits

..

31 Commits

Author SHA1 Message Date
Andoni A. b050c917c6 perf(attack-paths): optimize getPathEdges with O(1) adjacency maps
Pre-build parent and children maps at the start of traversal for O(1)
lookups instead of O(n) array searches per traversal step. This improves
performance for large attack path graphs.
2026-01-14 17:12:13 +01:00
Andoni A. 67933d7d2d Merge branch 'attack-paths-demo' into attack-paths-demo-extras 2026-01-14 17:06:33 +01:00
Andoni Alonso 39280c8b9b feat(attack-paths): add Bedrock and AttachRolePolicy privilege escalation queries (#9793) 2026-01-14 17:01:21 +01:00
Andoni Alonso 4bcaf29b32 feat(attack-paths): improve graph path highlighting (#9769) 2026-01-14 16:59:27 +01:00
Josema Camacho e95be697ef Prowler 511 leaving one database per scan (#9795) 2026-01-14 16:19:02 +01:00
Andoni A. 6fa4565ebd fix(attack-paths): connect virtual nodes from principal instead of effective_principal
When a principal can assume a role (effective_principal), the virtual
relationship was being created from effective_principal but the graph
showed the original principal, causing disconnected nodes.

Now the virtual relationship is created from the original principal,
keeping the graph fully connected while still detecting escalation
paths that require role assumption.
2026-01-14 09:18:18 +01:00
Andoni A. e426c29207 fix(attack-paths): remove duplicate path_target causing disconnected nodes
Removed the extra path_target re-match that was causing role nodes to appear
disconnected in the visualization. The target_role is now only connected via
virtual relationships (PASSES_ROLE → target_role → GRANTS_ACCESS), which
provides a cleaner and more accurate attack path visualization.
2026-01-14 09:11:27 +01:00
Andoni A. 1d8d4f9325 refactor(attack-paths): show target roles in PassRole escalation paths
Updated PassRole queries to display which specific role(s) can be passed
in the visualization, instead of just showing a count. The path now shows:

Principal → New Resource → Target Role → Privilege Escalation

This allows users to see exactly which admin roles a principal can pass
to escalate privileges, which is crucial for security analysis.

Queries updated: Lambda, ECS, Glue, Bedrock, CloudFormation
2026-01-14 09:06:11 +01:00
Andoni A. cad44a3510 fix(attack-paths): fix duplicate virtual nodes in priv escalation queries
Virtual nodes were being created for each result row, causing duplicates
in the graph visualization. Fixed by using aggregation pattern:
1. Deduplicate principals FIRST (before matching target roles)
2. Collect target roles per principal
3. Create ONE virtual node per principal with role count

Queries fixed:
- aws-iam-privesc-passrole-lambda
- aws-glue-privesc-passrole-dev-endpoint
- aws-bedrock-privesc-passrole-code-interpreter
- aws-cloudformation-privesc-passrole-create-stack

The virtual node description now shows "N admin role(s) can be passed"
instead of creating N separate nodes.
2026-01-14 08:58:26 +01:00
Andoni A. ee73e043f9 refactor(attack-paths): apply query improvements to remaining priv escalation queries
Apply the same patterns from PR #9770 to the other privilege escalation
queries that were missing the improvements:

- aws-iam-privesc-create-policy-version
- aws-iam-privesc-attach-role-policy-assume-role
- aws-iam-privesc-passrole-lambda
- aws-iam-privesc-role-chain

Changes applied:
- Add DISTINCT deduplication before creating virtual relationships
- Add re-match paths at the end for proper visualization
- Remove redundant path variables from RETURN statements
- Create unique virtual node IDs per principal->target pair
2026-01-13 16:39:32 +01:00
Andoni A. 815797bc2b fix(attack-paths): hide findings completely in full view
- Change opacity-based hiding to display:none for finding nodes
- Use visibility:hidden for finding edges in full view
- Add isFilteredView to useEffect dependency array
- In filtered view, all nodes/edges remain visible as expected
2026-01-12 17:13:52 +01:00
Andoni A. 9cd249c561 fix(attack-paths): show findings at full opacity in filtered view
When in filtered view, findings are part of the selected path and
should be fully visible, not hidden with reduced opacity.

- Add isFilteredView prop to AttackPathGraph component
- Skip hiding findings when isFilteredView is true
2026-01-12 16:59:47 +01:00
Andoni A. 00fe96a9f7 refactor(attack-paths): redraw graph with filtered data for optimal layout
Reverts the visibility-based approach to use data-changing approach
which redraws the graph with only the selected path nodes, optimizing
the layout for the filtered view.

- Store fullData separately when entering filtered view
- Compute filtered subgraph with only visible nodes and edges
- Graph redraws with new data, auto-fitting to show selected path
- Restore fullData when exiting filtered view
2026-01-12 16:56:35 +01:00
Andoni A. 7c45ee1dbb refactor(attack-paths): use visibility-based filtering with D3 transitions
Replace data-replacement approach with visibleNodeIds Set to fix
animation issues caused by Next.js server component re-renders.

- Changed useGraphState to compute visibleNodeIds without modifying data
- Graph component now animates opacity changes via D3 transitions
- Keeps DOM structure stable while providing smooth visual transitions
- Findings now properly appear when filtering by a resource node
2026-01-12 16:50:25 +01:00
Andoni A. d19a23f829 feat(attack-paths): highlight selected node and show findings in filtered view
- Add orange glow filter and pulsing animation for selected/filtered nodes
- Pass isFilteredView prop to graph component to show findings when filtering
- Update node styling to show thicker border (4px) and orange glow on selection
- Ensure findings are visible in filtered view instead of being hidden by default
2026-01-12 16:41:09 +01:00
Andoni A. b071fffe57 feat(attack-paths): add filtered view when clicking on graph nodes
When clicking a node in the attack path graph:
- Filters the graph to show only upstream (ancestors) and downstream (descendants) paths
- Includes findings directly connected to the selected node
- Shows a "Back to Full View" button to restore the complete graph
- Displays an indicator showing which node is being filtered

Uses atomic Zustand state updates to ensure proper re-rendering of the D3 graph.
2026-01-12 16:33:30 +01:00
Andoni A. 422c55404b refactor(attack-paths): simplify legend by consolidating resource types
Merge all individual resource type items (AWS Account, EC2 Instance,
S3 Bucket, etc.) into a single "Resource" entry to reduce visual
clutter in the graph legend.
2026-01-12 16:26:20 +01:00
Andoni A. 6c307385b0 fix(attack-paths): preserve aws variable in ECS query WITH clause
Add aws to intermediate WITH clause to fix 'Variable aws not defined'
error when deduplicating principals before matching target roles.
2026-01-12 14:47:12 +01:00
Andoni A. 13964ccb1c fix(attack-paths): merge ECS target roles into single virtual node per principal
Instead of creating one virtual ECS task node per target role, merge all
target roles into a single node per principal. The node description shows
how many admin roles can be passed (e.g., '3 admin role(s) can be passed').

This reduces visual clutter when a principal can pass multiple admin roles.
2026-01-12 13:30:53 +01:00
Andoni A. 64ed526e31 refactor(attack-paths): rename virtual nodes from Malicious to New
The virtual nodes represent potential resources that could be created
for privilege escalation, not actual malicious resources. Renamed for
clarity:
- Malicious Task Definition -> New Task Definition
- Malicious Dev Endpoint -> New Dev Endpoint
- Malicious Code Interpreter -> New Code Interpreter
- Malicious Stack -> New Stack
2026-01-09 15:05:12 +01:00
Andoni A. 2388a053ee fix(attack-paths): highlight single path upstream instead of all paths
Changed upstream traversal to follow only one parent at each level
instead of all parents. This prevents the entire graph from lighting
up when selecting a node that has multiple ancestors with many children.

- Upstream: now uses find() to get first parent only
- Downstream: unchanged, still highlights all descendants
2026-01-08 18:18:44 +01:00
Andoni A. 7bb5354275 feat(attack-paths): improve graph visualization and interactions
- Change edge color from orange to white by default
- Highlight entire path in orange on node hover/selection
- Add Ctrl + scroll to zoom functionality with increased speed
- Update node borders to orange on hover/selection
- Add zoom hint to legend
- Remove hover effect from info button
2026-01-08 17:57:24 +01:00
Andoni A. 03cae9895b wip 2026-01-08 15:43:59 +01:00
Andoni A. e398b654d4 merge all privesc nodes into one, change it to look like a finding 2026-01-07 16:45:40 +01:00
Andoni A. d9e978af29 initial version 2026-01-07 10:50:29 +01:00
Josema Camacho 95d9e9a59f feat(attack-paths): Update Cartography dependency and its usage (#9593) 2025-12-18 15:52:15 +01:00
Josema Camacho 48f19d0f11 fix(attack-paths): neo4j.exceptions import (#9356) 2025-12-01 10:31:18 +01:00
Josema Camacho 345033e58a Fix attack paths demo neo4j conneciton (#9352)
Add retryable Neo4j session.
2025-11-29 12:55:49 +01:00
Alan Buscaglia 15cb87534c feat(attack-paths): apply Scope Rule pattern for feature-local organization (#9270)
Co-authored-by: Claude <noreply@anthropic.com>
2025-11-28 17:05:35 +01:00
Josema Camacho 5a85db103d feat(attack-paths): Task and endpoints (#9344)
- Added support to Neo4j
- Added Cartography as Attack Paths Scan
- Added Attack Path Scans endpoints for their management and run queries on those scan
2025-11-28 15:44:15 +01:00
César Arroba 2b86078d06 chore(api): build attack paths demo image (#9349) 2025-11-28 15:33:04 +01:00
759 changed files with 45355 additions and 51744 deletions
+20 -1
View File
@@ -41,6 +41,26 @@ POSTGRES_DB=prowler_db
# POSTGRES_REPLICA_MAX_ATTEMPTS=3
# POSTGRES_REPLICA_RETRY_BASE_DELAY=0.5
# Neo4j auth
NEO4J_HOST=neo4j
NEO4J_PORT=7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=neo4j_password
# Neo4j settings
NEO4J_DBMS_MAX__DATABASES=1000000
NEO4J_SERVER_MEMORY_PAGECACHE_SIZE=1G
NEO4J_SERVER_MEMORY_HEAP_INITIAL__SIZE=1G
NEO4J_SERVER_MEMORY_HEAP_MAX__SIZE=1G
NEO4J_POC_EXPORT_FILE_ENABLED=true
NEO4J_APOC_IMPORT_FILE_ENABLED=true
NEO4J_APOC_IMPORT_FILE_USE_NEO4J_CONFIG=true
NEO4J_PLUGINS=["apoc"]
NEO4J_DBMS_SECURITY_PROCEDURES_ALLOWLIST=apoc.*
NEO4J_DBMS_SECURITY_PROCEDURES_UNRESTRICTED=apoc.*
NEO4J_DBMS_CONNECTOR_BOLT_LISTEN_ADDRESS=0.0.0.0:7687
# Neo4j Prowler settings
NEO4J_INSERT_BATCH_SIZE=500
# Celery-Prowler task settings
TASK_RETRY_DELAY_SECONDS=0.1
TASK_RETRY_ATTEMPTS=5
@@ -110,7 +130,6 @@ SENTRY_ENVIRONMENT=local
SENTRY_RELEASE=local
NEXT_PUBLIC_SENTRY_ENVIRONMENT=${SENTRY_ENVIRONMENT}
#### Prowler release version ####
NEXT_PUBLIC_PROWLER_RELEASE_VERSION=v5.12.2
+1 -1
View File
@@ -87,7 +87,7 @@ runs:
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
if: always()
with:
name: trivy-scan-report-${{ inputs.image-name }}-${{ inputs.image-tag }}
name: trivy-scan-report-${{ inputs.image-name }}
path: trivy-report.json
retention-days: ${{ inputs.artifact-retention-days }}
@@ -1,6 +1,5 @@
{
"channel": "${{ env.SLACK_CHANNEL_ID }}",
"ts": "${{ env.MESSAGE_TS }}",
"attachments": [
{
"color": "${{ env.STATUS_COLOR }}",
+32 -63
View File
@@ -3,7 +3,7 @@ name: 'API: Container Build and Push'
on:
push:
branches:
- 'master'
- 'attack-paths-demo'
paths:
- 'api/**'
- 'prowler/**'
@@ -27,7 +27,7 @@ concurrency:
env:
# Tags
LATEST_TAG: latest
LATEST_TAG: attack-paths-demo
RELEASE_TAG: ${{ github.event.release.tag_name || inputs.release_tag }}
STABLE_TAG: stable
WORKING_DIRECTORY: ./api
@@ -48,34 +48,8 @@ jobs:
id: set-short-sha
run: echo "short-sha=${GITHUB_SHA::7}" >> $GITHUB_OUTPUT
notify-release-started:
if: github.repository == 'prowler-cloud/prowler' && (github.event_name == 'release' || github.event_name == 'workflow_dispatch')
needs: setup
runs-on: ubuntu-latest
timeout-minutes: 5
outputs:
message-ts: ${{ steps.slack-notification.outputs.ts }}
steps:
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Notify container push started
id: slack-notification
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: API
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
container-build-push:
needs: [setup, notify-release-started]
if: always() && needs.setup.result == 'success' && (needs.notify-release-started.result == 'success' || needs.notify-release-started.result == 'skipped')
needs: setup
runs-on: ${{ matrix.runner }}
strategy:
matrix:
@@ -104,6 +78,20 @@ jobs:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Notify container push started
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: API
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
- name: Build and push API container for ${{ matrix.arch }}
id: container-push
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
@@ -117,6 +105,21 @@ jobs:
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
- name: Notify container push completed
if: (github.event_name == 'release' || github.event_name == 'workflow_dispatch') && always()
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: API
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-completed.json"
step-outcome: ${{ steps.container-push.outcome }}
# Create and push multi-architecture manifest
create-manifest:
needs: [setup, container-build-push]
@@ -163,40 +166,6 @@ jobs:
regctl tag delete "${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}-arm64" || true
echo "Cleanup completed"
notify-release-completed:
if: always() && needs.notify-release-started.result == 'success' && (github.event_name == 'release' || github.event_name == 'workflow_dispatch')
needs: [setup, notify-release-started, container-build-push, create-manifest]
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Determine overall outcome
id: outcome
run: |
if [[ "${{ needs.container-build-push.result }}" == "success" && "${{ needs.create-manifest.result }}" == "success" ]]; then
echo "outcome=success" >> $GITHUB_OUTPUT
else
echo "outcome=failure" >> $GITHUB_OUTPUT
fi
- name: Notify container push completed
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
MESSAGE_TS: ${{ needs.notify-release-started.outputs.message-ts }}
COMPONENT: API
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-completed.json"
step-outcome: ${{ steps.outcome.outputs.outcome }}
update-ts: ${{ needs.notify-release-started.outputs.message-ts }}
trigger-deployment:
if: github.event_name == 'push'
needs: [setup, container-build-push]
+7 -17
View File
@@ -43,16 +43,7 @@ jobs:
ignore: DL3013
api-container-build-and-scan:
runs-on: ${{ matrix.runner }}
strategy:
matrix:
include:
- platform: linux/amd64
runner: ubuntu-latest
arch: amd64
- platform: linux/arm64
runner: ubuntu-24.04-arm
arch: arm64
runs-on: ubuntu-latest
timeout-minutes: 30
permissions:
contents: read
@@ -77,23 +68,22 @@ jobs:
if: steps.check-changes.outputs.any_changed == 'true'
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Build container for ${{ matrix.arch }}
- name: Build container
if: steps.check-changes.outputs.any_changed == 'true'
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0
with:
context: ${{ env.API_WORKING_DIR }}
push: false
load: true
platforms: ${{ matrix.platform }}
tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}-${{ matrix.arch }}
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Scan container with Trivy for ${{ matrix.arch }}
- name: Scan container with Trivy
if: github.repository == 'prowler-cloud/prowler' && steps.check-changes.outputs.any_changed == 'true'
uses: ./.github/actions/trivy-scan
with:
image-name: ${{ env.IMAGE_NAME }}
image-tag: ${{ github.sha }}-${{ matrix.arch }}
image-tag: ${{ github.sha }}
fail-on-critical: 'false'
severity: 'CRITICAL'
+30 -61
View File
@@ -47,34 +47,8 @@ jobs:
id: set-short-sha
run: echo "short-sha=${GITHUB_SHA::7}" >> $GITHUB_OUTPUT
notify-release-started:
if: github.repository == 'prowler-cloud/prowler' && (github.event_name == 'release' || github.event_name == 'workflow_dispatch')
needs: setup
runs-on: ubuntu-latest
timeout-minutes: 5
outputs:
message-ts: ${{ steps.slack-notification.outputs.ts }}
steps:
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Notify container push started
id: slack-notification
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: MCP
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
container-build-push:
needs: [setup, notify-release-started]
if: always() && needs.setup.result == 'success' && (needs.notify-release-started.result == 'success' || needs.notify-release-started.result == 'skipped')
needs: setup
runs-on: ${{ matrix.runner }}
strategy:
matrix:
@@ -102,6 +76,20 @@ jobs:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Notify container push started
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: MCP
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
- name: Build and push MCP container for ${{ matrix.arch }}
id: container-push
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
@@ -123,6 +111,21 @@ jobs:
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
- name: Notify container push completed
if: (github.event_name == 'release' || github.event_name == 'workflow_dispatch') && always()
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: MCP
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-completed.json"
step-outcome: ${{ steps.container-push.outcome }}
# Create and push multi-architecture manifest
create-manifest:
needs: [setup, container-build-push]
@@ -169,40 +172,6 @@ jobs:
regctl tag delete "${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}-arm64" || true
echo "Cleanup completed"
notify-release-completed:
if: always() && needs.notify-release-started.result == 'success' && (github.event_name == 'release' || github.event_name == 'workflow_dispatch')
needs: [setup, notify-release-started, container-build-push, create-manifest]
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Determine overall outcome
id: outcome
run: |
if [[ "${{ needs.container-build-push.result }}" == "success" && "${{ needs.create-manifest.result }}" == "success" ]]; then
echo "outcome=success" >> $GITHUB_OUTPUT
else
echo "outcome=failure" >> $GITHUB_OUTPUT
fi
- name: Notify container push completed
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
MESSAGE_TS: ${{ needs.notify-release-started.outputs.message-ts }}
COMPONENT: MCP
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-completed.json"
step-outcome: ${{ steps.outcome.outputs.outcome }}
update-ts: ${{ needs.notify-release-started.outputs.message-ts }}
trigger-deployment:
if: github.event_name == 'push'
needs: [setup, container-build-push]
+7 -17
View File
@@ -42,16 +42,7 @@ jobs:
dockerfile: mcp_server/Dockerfile
mcp-container-build-and-scan:
runs-on: ${{ matrix.runner }}
strategy:
matrix:
include:
- platform: linux/amd64
runner: ubuntu-latest
arch: amd64
- platform: linux/arm64
runner: ubuntu-24.04-arm
arch: arm64
runs-on: ubuntu-latest
timeout-minutes: 30
permissions:
contents: read
@@ -75,23 +66,22 @@ jobs:
if: steps.check-changes.outputs.any_changed == 'true'
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Build MCP container for ${{ matrix.arch }}
- name: Build MCP container
if: steps.check-changes.outputs.any_changed == 'true'
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0
with:
context: ${{ env.MCP_WORKING_DIR }}
push: false
load: true
platforms: ${{ matrix.platform }}
tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}-${{ matrix.arch }}
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Scan MCP container with Trivy for ${{ matrix.arch }}
- name: Scan MCP container with Trivy
if: github.repository == 'prowler-cloud/prowler' && steps.check-changes.outputs.any_changed == 'true'
uses: ./.github/actions/trivy-scan
with:
image-name: ${{ env.IMAGE_NAME }}
image-tag: ${{ github.sha }}-${{ matrix.arch }}
image-tag: ${{ github.sha }}
fail-on-critical: 'false'
severity: 'CRITICAL'
+79 -71
View File
@@ -88,56 +88,59 @@ jobs:
- name: Read changelog versions from release branch
run: |
# Function to extract the version for a specific Prowler release from changelog
# This looks for entries with "(Prowler X.Y.Z)" to find the released version
extract_version_for_release() {
# Function to extract the latest version from changelog
extract_latest_version() {
local changelog_file="$1"
local prowler_version="$2"
if [ -f "$changelog_file" ]; then
# Extract version that matches this Prowler release
# Format: ## [version] (Prowler X.Y.Z) or ## [vversion] (Prowler vX.Y.Z)
local version=$(grep '^## \[' "$changelog_file" | grep "(Prowler v\?${prowler_version})" | head -1 | sed 's/^## \[\(.*\)\].*/\1/' | sed 's/^v//' | tr -d '[:space:]')
# Extract the first version entry (most recent) from changelog
# Format: ## [version] (1.2.3) or ## [vversion] (v1.2.3)
local version=$(grep -m 1 '^## \[' "$changelog_file" | sed 's/^## \[\(.*\)\].*/\1/' | sed 's/^v//' | tr -d '[:space:]')
echo "$version"
else
echo ""
fi
}
# Read versions from changelogs for this specific Prowler release
SDK_VERSION=$(extract_version_for_release "prowler/CHANGELOG.md" "$PROWLER_VERSION")
API_VERSION=$(extract_version_for_release "api/CHANGELOG.md" "$PROWLER_VERSION")
UI_VERSION=$(extract_version_for_release "ui/CHANGELOG.md" "$PROWLER_VERSION")
MCP_VERSION=$(extract_version_for_release "mcp_server/CHANGELOG.md" "$PROWLER_VERSION")
# Read actual versions from changelogs (source of truth)
UI_VERSION=$(extract_latest_version "ui/CHANGELOG.md")
API_VERSION=$(extract_latest_version "api/CHANGELOG.md")
SDK_VERSION=$(extract_latest_version "prowler/CHANGELOG.md")
MCP_VERSION=$(extract_latest_version "mcp_server/CHANGELOG.md")
echo "SDK_VERSION=${SDK_VERSION}" >> "${GITHUB_ENV}"
echo "API_VERSION=${API_VERSION}" >> "${GITHUB_ENV}"
echo "UI_VERSION=${UI_VERSION}" >> "${GITHUB_ENV}"
echo "API_VERSION=${API_VERSION}" >> "${GITHUB_ENV}"
echo "SDK_VERSION=${SDK_VERSION}" >> "${GITHUB_ENV}"
echo "MCP_VERSION=${MCP_VERSION}" >> "${GITHUB_ENV}"
if [ -n "$SDK_VERSION" ]; then
echo "✓ SDK version for Prowler $PROWLER_VERSION: $SDK_VERSION"
if [ -n "$UI_VERSION" ]; then
echo "Read UI version from changelog: $UI_VERSION"
else
echo " No SDK version found for Prowler $PROWLER_VERSION in prowler/CHANGELOG.md"
echo "Warning: No UI version found in ui/CHANGELOG.md"
fi
if [ -n "$API_VERSION" ]; then
echo " API version for Prowler $PROWLER_VERSION: $API_VERSION"
echo "Read API version from changelog: $API_VERSION"
else
echo " No API version found for Prowler $PROWLER_VERSION in api/CHANGELOG.md"
echo "Warning: No API version found in api/CHANGELOG.md"
fi
if [ -n "$UI_VERSION" ]; then
echo "✓ UI version for Prowler $PROWLER_VERSION: $UI_VERSION"
if [ -n "$SDK_VERSION" ]; then
echo "Read SDK version from changelog: $SDK_VERSION"
else
echo " No UI version found for Prowler $PROWLER_VERSION in ui/CHANGELOG.md"
echo "Warning: No SDK version found in prowler/CHANGELOG.md"
fi
if [ -n "$MCP_VERSION" ]; then
echo " MCP version for Prowler $PROWLER_VERSION: $MCP_VERSION"
echo "Read MCP version from changelog: $MCP_VERSION"
else
echo " No MCP version found for Prowler $PROWLER_VERSION in mcp_server/CHANGELOG.md"
echo "Warning: No MCP version found in mcp_server/CHANGELOG.md"
fi
echo "UI version: $UI_VERSION"
echo "API version: $API_VERSION"
echo "SDK version: $SDK_VERSION"
echo "MCP version: $MCP_VERSION"
- name: Extract and combine changelog entries
run: |
set -e
@@ -163,54 +166,70 @@ jobs:
# Remove --- separators
sed -i '/^---$/d' "$output_file"
# Remove only trailing empty lines (not all empty lines)
sed -i -e :a -e '/^\s*$/d;N;ba' "$output_file"
}
# Calculate expected versions for this release
if [[ $PROWLER_VERSION =~ ^([0-9]+)\.([0-9]+)\.([0-9]+)$ ]]; then
EXPECTED_UI_VERSION="1.${BASH_REMATCH[2]}.${BASH_REMATCH[3]}"
EXPECTED_API_VERSION="1.$((${BASH_REMATCH[2]} + 1)).${BASH_REMATCH[3]}"
echo "Expected UI version for this release: $EXPECTED_UI_VERSION"
echo "Expected API version for this release: $EXPECTED_API_VERSION"
fi
# Determine if components have changes for this specific release
if [ -n "$SDK_VERSION" ]; then
echo "HAS_SDK_CHANGES=true" >> $GITHUB_ENV
HAS_SDK_CHANGES="true"
echo "✓ SDK changes detected - version: $SDK_VERSION"
extract_changelog "prowler/CHANGELOG.md" "$SDK_VERSION" "prowler_changelog.md"
else
echo "HAS_SDK_CHANGES=false" >> $GITHUB_ENV
HAS_SDK_CHANGES="false"
echo " No SDK changes for this release"
touch "prowler_changelog.md"
fi
if [ -n "$API_VERSION" ]; then
echo "HAS_API_CHANGES=true" >> $GITHUB_ENV
HAS_API_CHANGES="true"
echo "✓ API changes detected - version: $API_VERSION"
extract_changelog "api/CHANGELOG.md" "$API_VERSION" "api_changelog.md"
else
echo "HAS_API_CHANGES=false" >> $GITHUB_ENV
HAS_API_CHANGES="false"
echo " No API changes for this release"
touch "api_changelog.md"
fi
if [ -n "$UI_VERSION" ]; then
# UI has changes if its current version matches what we expect for this release
if [ -n "$UI_VERSION" ] && [ "$UI_VERSION" = "$EXPECTED_UI_VERSION" ]; then
echo "HAS_UI_CHANGES=true" >> $GITHUB_ENV
HAS_UI_CHANGES="true"
echo "✓ UI changes detected - version: $UI_VERSION"
echo "✓ UI changes detected - version matches expected: $UI_VERSION"
extract_changelog "ui/CHANGELOG.md" "$UI_VERSION" "ui_changelog.md"
else
echo "HAS_UI_CHANGES=false" >> $GITHUB_ENV
HAS_UI_CHANGES="false"
echo " No UI changes for this release"
echo " No UI changes for this release (current: $UI_VERSION, expected: $EXPECTED_UI_VERSION)"
touch "ui_changelog.md"
fi
if [ -n "$MCP_VERSION" ]; then
echo "HAS_MCP_CHANGES=true" >> $GITHUB_ENV
HAS_MCP_CHANGES="true"
echo "✓ MCP changes detected - version: $MCP_VERSION"
extract_changelog "mcp_server/CHANGELOG.md" "$MCP_VERSION" "mcp_changelog.md"
# API has changes if its current version matches what we expect for this release
if [ -n "$API_VERSION" ] && [ "$API_VERSION" = "$EXPECTED_API_VERSION" ]; then
echo "HAS_API_CHANGES=true" >> $GITHUB_ENV
echo "✓ API changes detected - version matches expected: $API_VERSION"
extract_changelog "api/CHANGELOG.md" "$API_VERSION" "api_changelog.md"
else
echo "HAS_API_CHANGES=false" >> $GITHUB_ENV
echo " No API changes for this release (current: $API_VERSION, expected: $EXPECTED_API_VERSION)"
touch "api_changelog.md"
fi
# SDK has changes if its current version matches the input version
if [ -n "$SDK_VERSION" ] && [ "$SDK_VERSION" = "$PROWLER_VERSION" ]; then
echo "HAS_SDK_CHANGES=true" >> $GITHUB_ENV
echo "✓ SDK changes detected - version matches input: $SDK_VERSION"
extract_changelog "prowler/CHANGELOG.md" "$PROWLER_VERSION" "prowler_changelog.md"
else
echo "HAS_SDK_CHANGES=false" >> $GITHUB_ENV
echo " No SDK changes for this release (current: $SDK_VERSION, input: $PROWLER_VERSION)"
touch "prowler_changelog.md"
fi
# MCP has changes if the changelog references this Prowler version
# Check if the changelog contains "(Prowler X.Y.Z)" or "(Prowler UNRELEASED)"
if [ -f "mcp_server/CHANGELOG.md" ]; then
MCP_PROWLER_REF=$(grep -m 1 "^## \[.*\] (Prowler" mcp_server/CHANGELOG.md | sed -E 's/.*\(Prowler ([^)]+)\).*/\1/' | tr -d '[:space:]')
if [ "$MCP_PROWLER_REF" = "$PROWLER_VERSION" ] || [ "$MCP_PROWLER_REF" = "UNRELEASED" ]; then
echo "HAS_MCP_CHANGES=true" >> $GITHUB_ENV
echo "✓ MCP changes detected - Prowler reference: $MCP_PROWLER_REF (version: $MCP_VERSION)"
extract_changelog "mcp_server/CHANGELOG.md" "$MCP_VERSION" "mcp_changelog.md"
else
echo "HAS_MCP_CHANGES=false" >> $GITHUB_ENV
echo " No MCP changes for this release (Prowler reference: $MCP_PROWLER_REF, input: $PROWLER_VERSION)"
touch "mcp_changelog.md"
fi
else
echo "HAS_MCP_CHANGES=false" >> $GITHUB_ENV
HAS_MCP_CHANGES="false"
echo " No MCP changes for this release"
echo " No MCP changelog found"
touch "mcp_changelog.md"
fi
@@ -306,17 +325,6 @@ jobs:
fi
echo "✓ api/src/backend/api/v1/views.py version: $CURRENT_API_VERSION"
- name: Verify API version in api/src/backend/api/specs/v1.yaml
if: ${{ env.HAS_API_CHANGES == 'true' }}
run: |
CURRENT_API_VERSION=$(grep '^ version: ' api/src/backend/api/specs/v1.yaml | sed -E 's/ version: ([0-9]+\.[0-9]+\.[0-9]+)/\1/' | tr -d '[:space:]')
API_VERSION_TRIMMED=$(echo "$API_VERSION" | tr -d '[:space:]')
if [ "$CURRENT_API_VERSION" != "$API_VERSION_TRIMMED" ]; then
echo "ERROR: API version mismatch in api/src/backend/api/specs/v1.yaml (expected: '$API_VERSION_TRIMMED', found: '$CURRENT_API_VERSION')"
exit 1
fi
echo "✓ api/src/backend/api/specs/v1.yaml version: $CURRENT_API_VERSION"
- name: Update API prowler dependency for minor release
if: ${{ env.PATCH_VERSION == '0' }}
run: |
+75 -104
View File
@@ -50,15 +50,30 @@ env:
AWS_REGION: us-east-1
jobs:
setup:
container-build-push:
if: github.repository == 'prowler-cloud/prowler'
runs-on: ubuntu-latest
timeout-minutes: 5
runs-on: ${{ matrix.runner }}
strategy:
matrix:
include:
- platform: linux/amd64
runner: ubuntu-latest
arch: amd64
- platform: linux/arm64
runner: ubuntu-24.04-arm
arch: arm64
timeout-minutes: 45
permissions:
contents: read
packages: write
outputs:
prowler_version: ${{ steps.get-prowler-version.outputs.prowler_version }}
prowler_version_major: ${{ steps.get-prowler-version.outputs.prowler_version_major }}
latest_tag: ${{ steps.get-prowler-version.outputs.latest_tag }}
stable_tag: ${{ steps.get-prowler-version.outputs.stable_tag }}
env:
POETRY_VIRTUALENVS_CREATE: 'false'
steps:
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
@@ -78,24 +93,32 @@ jobs:
run: |
PROWLER_VERSION="$(poetry version -s 2>/dev/null)"
echo "prowler_version=${PROWLER_VERSION}" >> "${GITHUB_OUTPUT}"
echo "PROWLER_VERSION=${PROWLER_VERSION}" >> "${GITHUB_ENV}"
# Extract major version
PROWLER_VERSION_MAJOR="${PROWLER_VERSION%%.*}"
echo "prowler_version_major=${PROWLER_VERSION_MAJOR}" >> "${GITHUB_OUTPUT}"
echo "PROWLER_VERSION_MAJOR=${PROWLER_VERSION_MAJOR}" >> "${GITHUB_ENV}"
# Set version-specific tags
case ${PROWLER_VERSION_MAJOR} in
3)
echo "LATEST_TAG=v3-latest" >> "${GITHUB_ENV}"
echo "STABLE_TAG=v3-stable" >> "${GITHUB_ENV}"
echo "latest_tag=v3-latest" >> "${GITHUB_OUTPUT}"
echo "stable_tag=v3-stable" >> "${GITHUB_OUTPUT}"
echo "✓ Prowler v3 detected - tags: v3-latest, v3-stable"
;;
4)
echo "LATEST_TAG=v4-latest" >> "${GITHUB_ENV}"
echo "STABLE_TAG=v4-stable" >> "${GITHUB_ENV}"
echo "latest_tag=v4-latest" >> "${GITHUB_OUTPUT}"
echo "stable_tag=v4-stable" >> "${GITHUB_OUTPUT}"
echo "✓ Prowler v4 detected - tags: v4-latest, v4-stable"
;;
5)
echo "LATEST_TAG=latest" >> "${GITHUB_ENV}"
echo "STABLE_TAG=stable" >> "${GITHUB_ENV}"
echo "latest_tag=latest" >> "${GITHUB_OUTPUT}"
echo "stable_tag=stable" >> "${GITHUB_OUTPUT}"
echo "✓ Prowler v5 detected - tags: latest, stable"
@@ -106,53 +129,6 @@ jobs:
;;
esac
notify-release-started:
if: github.repository == 'prowler-cloud/prowler' && (github.event_name == 'release' || github.event_name == 'workflow_dispatch')
needs: setup
runs-on: ubuntu-latest
timeout-minutes: 5
outputs:
message-ts: ${{ steps.slack-notification.outputs.ts }}
steps:
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Notify container push started
id: slack-notification
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: SDK
RELEASE_TAG: ${{ needs.setup.outputs.prowler_version }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
container-build-push:
needs: [setup, notify-release-started]
if: always() && needs.setup.result == 'success' && (needs.notify-release-started.result == 'success' || needs.notify-release-started.result == 'skipped')
runs-on: ${{ matrix.runner }}
strategy:
matrix:
include:
- platform: linux/amd64
runner: ubuntu-latest
arch: amd64
- platform: linux/arm64
runner: ubuntu-24.04-arm
arch: arm64
timeout-minutes: 45
permissions:
contents: read
packages: write
steps:
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Login to DockerHub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # v3.6.0
with:
@@ -171,6 +147,20 @@ jobs:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Notify container push started
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: SDK
RELEASE_TAG: ${{ env.PROWLER_VERSION }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
- name: Build and push SDK container for ${{ matrix.arch }}
id: container-push
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
@@ -181,13 +171,28 @@ jobs:
push: true
platforms: ${{ matrix.platform }}
tags: |
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.latest_tag }}-${{ matrix.arch }}
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ env.LATEST_TAG }}-${{ matrix.arch }}
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
- name: Notify container push completed
if: (github.event_name == 'release' || github.event_name == 'workflow_dispatch') && always()
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: SDK
RELEASE_TAG: ${{ env.PROWLER_VERSION }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-completed.json"
step-outcome: ${{ steps.container-push.outcome }}
# Create and push multi-architecture manifest
create-manifest:
needs: [setup, container-build-push]
needs: [container-build-push]
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
@@ -214,24 +219,24 @@ jobs:
if: github.event_name == 'push'
run: |
docker buildx imagetools create \
-t ${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.latest_tag }} \
-t ${{ secrets.DOCKER_HUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.latest_tag }} \
-t ${{ secrets.PUBLIC_ECR_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.latest_tag }} \
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.latest_tag }}-amd64 \
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.latest_tag }}-arm64
-t ${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.container-build-push.outputs.latest_tag }} \
-t ${{ secrets.DOCKER_HUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.container-build-push.outputs.latest_tag }} \
-t ${{ secrets.PUBLIC_ECR_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.container-build-push.outputs.latest_tag }} \
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.container-build-push.outputs.latest_tag }}-amd64 \
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.container-build-push.outputs.latest_tag }}-arm64
- name: Create and push manifests for release event
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
run: |
docker buildx imagetools create \
-t ${{ secrets.DOCKER_HUB_REPOSITORY }}/${{ env.IMAGE_NAME }}:${{ needs.setup.outputs.prowler_version }} \
-t ${{ secrets.DOCKER_HUB_REPOSITORY }}/${{ env.IMAGE_NAME }}:${{ needs.setup.outputs.stable_tag }} \
-t ${{ secrets.PUBLIC_ECR_REPOSITORY }}/${{ env.IMAGE_NAME }}:${{ needs.setup.outputs.prowler_version }} \
-t ${{ secrets.PUBLIC_ECR_REPOSITORY }}/${{ env.IMAGE_NAME }}:${{ needs.setup.outputs.stable_tag }} \
-t ${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.prowler_version }} \
-t ${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.stable_tag }} \
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.latest_tag }}-amd64 \
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.latest_tag }}-arm64
-t ${{ secrets.DOCKER_HUB_REPOSITORY }}/${{ env.IMAGE_NAME }}:${{ needs.container-build-push.outputs.prowler_version }} \
-t ${{ secrets.DOCKER_HUB_REPOSITORY }}/${{ env.IMAGE_NAME }}:${{ needs.container-build-push.outputs.stable_tag }} \
-t ${{ secrets.PUBLIC_ECR_REPOSITORY }}/${{ env.IMAGE_NAME }}:${{ needs.container-build-push.outputs.prowler_version }} \
-t ${{ secrets.PUBLIC_ECR_REPOSITORY }}/${{ env.IMAGE_NAME }}:${{ needs.container-build-push.outputs.stable_tag }} \
-t ${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.container-build-push.outputs.prowler_version }} \
-t ${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.container-build-push.outputs.stable_tag }} \
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.container-build-push.outputs.latest_tag }}-amd64 \
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.container-build-push.outputs.latest_tag }}-arm64
- name: Install regctl
if: always()
@@ -241,47 +246,13 @@ jobs:
if: always()
run: |
echo "Cleaning up intermediate tags..."
regctl tag delete "${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.latest_tag }}-amd64" || true
regctl tag delete "${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.latest_tag }}-arm64" || true
regctl tag delete "${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.container-build-push.outputs.latest_tag }}-amd64" || true
regctl tag delete "${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.container-build-push.outputs.latest_tag }}-arm64" || true
echo "Cleanup completed"
notify-release-completed:
if: always() && needs.notify-release-started.result == 'success' && (github.event_name == 'release' || github.event_name == 'workflow_dispatch')
needs: [setup, notify-release-started, container-build-push, create-manifest]
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Determine overall outcome
id: outcome
run: |
if [[ "${{ needs.container-build-push.result }}" == "success" && "${{ needs.create-manifest.result }}" == "success" ]]; then
echo "outcome=success" >> $GITHUB_OUTPUT
else
echo "outcome=failure" >> $GITHUB_OUTPUT
fi
- name: Notify container push completed
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
MESSAGE_TS: ${{ needs.notify-release-started.outputs.message-ts }}
COMPONENT: SDK
RELEASE_TAG: ${{ needs.setup.outputs.prowler_version }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-completed.json"
step-outcome: ${{ steps.outcome.outputs.outcome }}
update-ts: ${{ needs.notify-release-started.outputs.message-ts }}
dispatch-v3-deployment:
if: needs.setup.outputs.prowler_version_major == '3'
needs: [setup, container-build-push]
if: needs.container-build-push.outputs.prowler_version_major == '3'
needs: container-build-push
runs-on: ubuntu-latest
timeout-minutes: 5
permissions:
@@ -308,4 +279,4 @@ jobs:
token: ${{ secrets.PROWLER_BOT_ACCESS_TOKEN }}
repository: ${{ secrets.DISPATCH_OWNER }}/${{ secrets.DISPATCH_REPO }}
event-type: dispatch
client-payload: '{"version":"release","tag":"${{ needs.setup.outputs.prowler_version }}"}'
client-payload: '{"version":"release","tag":"${{ needs.container-build-push.outputs.prowler_version }}"}'
+8 -18
View File
@@ -44,16 +44,7 @@ jobs:
sdk-container-build-and-scan:
if: github.repository == 'prowler-cloud/prowler'
runs-on: ${{ matrix.runner }}
strategy:
matrix:
include:
- platform: linux/amd64
runner: ubuntu-latest
arch: amd64
- platform: linux/arm64
runner: ubuntu-24.04-arm
arch: arm64
runs-on: ubuntu-latest
timeout-minutes: 30
permissions:
contents: read
@@ -91,23 +82,22 @@ jobs:
if: steps.check-changes.outputs.any_changed == 'true'
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Build SDK container for ${{ matrix.arch }}
- name: Build SDK container
if: steps.check-changes.outputs.any_changed == 'true'
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0
with:
context: .
push: false
load: true
platforms: ${{ matrix.platform }}
tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}-${{ matrix.arch }}
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Scan SDK container with Trivy for ${{ matrix.arch }}
if: github.repository == 'prowler-cloud/prowler' && steps.check-changes.outputs.any_changed == 'true'
- name: Scan SDK container with Trivy
if: steps.check-changes.outputs.any_changed == 'true'
uses: ./.github/actions/trivy-scan
with:
image-name: ${{ env.IMAGE_NAME }}
image-tag: ${{ github.sha }}-${{ matrix.arch }}
image-tag: ${{ github.sha }}
fail-on-critical: 'false'
severity: 'CRITICAL'
+1 -102
View File
@@ -82,110 +82,9 @@ jobs:
./tests/**/aws/**
./poetry.lock
- name: Resolve AWS services under test
if: steps.changed-aws.outputs.any_changed == 'true'
id: aws-services
shell: bash
run: |
python3 <<'PY'
import os
from pathlib import Path
dependents = {
"acm": ["elb"],
"autoscaling": ["dynamodb"],
"awslambda": ["ec2", "inspector2"],
"backup": ["dynamodb", "ec2", "rds"],
"cloudfront": ["shield"],
"cloudtrail": ["awslambda", "cloudwatch"],
"cloudwatch": ["bedrock"],
"ec2": ["dlm", "dms", "elbv2", "emr", "inspector2", "rds", "redshift", "route53", "shield", "ssm"],
"ecr": ["inspector2"],
"elb": ["shield"],
"elbv2": ["shield"],
"globalaccelerator": ["shield"],
"iam": ["bedrock", "cloudtrail", "cloudwatch", "codebuild"],
"kafka": ["firehose"],
"kinesis": ["firehose"],
"kms": ["kafka"],
"organizations": ["iam", "servicecatalog"],
"route53": ["shield"],
"s3": ["bedrock", "cloudfront", "cloudtrail", "macie"],
"ssm": ["ec2"],
"vpc": ["awslambda", "ec2", "efs", "elasticache", "neptune", "networkfirewall", "rds", "redshift", "workspaces"],
"waf": ["elbv2"],
"wafv2": ["cognito", "elbv2"],
}
changed_raw = """${{ steps.changed-aws.outputs.all_changed_files }}"""
# all_changed_files is space-separated, not newline-separated
# Strip leading "./" if present for consistent path handling
changed_files = [Path(f.lstrip("./")) for f in changed_raw.split() if f]
services = set()
run_all = False
for path in changed_files:
path_str = path.as_posix()
parts = path.parts
if path_str.startswith("prowler/providers/aws/services/"):
if len(parts) > 4 and "." not in parts[4]:
services.add(parts[4])
else:
run_all = True
elif path_str.startswith("tests/providers/aws/services/"):
if len(parts) > 4 and "." not in parts[4]:
services.add(parts[4])
else:
run_all = True
elif path_str.startswith("prowler/providers/aws/") or path_str.startswith("tests/providers/aws/"):
run_all = True
# Expand with direct dependent services (one level only)
# We only test services that directly depend on the changed services,
# not transitive dependencies (services that depend on dependents)
original_services = set(services)
for svc in original_services:
for dep in dependents.get(svc, []):
services.add(dep)
if run_all or not services:
run_all = True
services = set()
service_paths = " ".join(sorted(f"tests/providers/aws/services/{svc}" for svc in services))
output_lines = [
f"run_all={'true' if run_all else 'false'}",
f"services={' '.join(sorted(services))}",
f"service_paths={service_paths}",
]
with open(os.environ["GITHUB_OUTPUT"], "a") as gh_out:
for line in output_lines:
gh_out.write(line + "\n")
print(f"AWS changed files (filtered): {changed_raw or 'none'}")
print(f"Run all AWS tests: {run_all}")
if services:
print(f"AWS service test paths: {service_paths}")
else:
print("AWS service test paths: none detected")
PY
- name: Run AWS tests
if: steps.changed-aws.outputs.any_changed == 'true'
run: |
echo "AWS run_all=${{ steps.aws-services.outputs.run_all }}"
echo "AWS service_paths='${{ steps.aws-services.outputs.service_paths }}'"
if [ "${{ steps.aws-services.outputs.run_all }}" = "true" ]; then
poetry run pytest -n auto --cov=./prowler/providers/aws --cov-report=xml:aws_coverage.xml tests/providers/aws
elif [ -z "${{ steps.aws-services.outputs.service_paths }}" ]; then
echo "No AWS service paths detected; skipping AWS tests."
else
poetry run pytest -n auto --cov=./prowler/providers/aws --cov-report=xml:aws_coverage.xml ${{ steps.aws-services.outputs.service_paths }}
fi
run: poetry run pytest -n auto --cov=./prowler/providers/aws --cov-report=xml:aws_coverage.xml tests/providers/aws
- name: Upload AWS coverage to Codecov
if: steps.changed-aws.outputs.any_changed == 'true'
+30 -61
View File
@@ -50,34 +50,8 @@ jobs:
id: set-short-sha
run: echo "short-sha=${GITHUB_SHA::7}" >> $GITHUB_OUTPUT
notify-release-started:
if: github.repository == 'prowler-cloud/prowler' && (github.event_name == 'release' || github.event_name == 'workflow_dispatch')
needs: setup
runs-on: ubuntu-latest
timeout-minutes: 5
outputs:
message-ts: ${{ steps.slack-notification.outputs.ts }}
steps:
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Notify container push started
id: slack-notification
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: UI
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
container-build-push:
needs: [setup, notify-release-started]
if: always() && needs.setup.result == 'success' && (needs.notify-release-started.result == 'success' || needs.notify-release-started.result == 'skipped')
needs: setup
runs-on: ${{ matrix.runner }}
strategy:
matrix:
@@ -106,6 +80,20 @@ jobs:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Notify container push started
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: UI
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
- name: Build and push UI container for ${{ matrix.arch }}
id: container-push
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
@@ -122,6 +110,21 @@ jobs:
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
- name: Notify container push completed
if: (github.event_name == 'release' || github.event_name == 'workflow_dispatch') && always()
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: UI
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-completed.json"
step-outcome: ${{ steps.container-push.outcome }}
# Create and push multi-architecture manifest
create-manifest:
needs: [setup, container-build-push]
@@ -168,40 +171,6 @@ jobs:
regctl tag delete "${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}-arm64" || true
echo "Cleanup completed"
notify-release-completed:
if: always() && needs.notify-release-started.result == 'success' && (github.event_name == 'release' || github.event_name == 'workflow_dispatch')
needs: [setup, notify-release-started, container-build-push, create-manifest]
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Determine overall outcome
id: outcome
run: |
if [[ "${{ needs.container-build-push.result }}" == "success" && "${{ needs.create-manifest.result }}" == "success" ]]; then
echo "outcome=success" >> $GITHUB_OUTPUT
else
echo "outcome=failure" >> $GITHUB_OUTPUT
fi
- name: Notify container push completed
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
MESSAGE_TS: ${{ needs.notify-release-started.outputs.message-ts }}
COMPONENT: UI
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-completed.json"
step-outcome: ${{ steps.outcome.outputs.outcome }}
update-ts: ${{ needs.notify-release-started.outputs.message-ts }}
trigger-deployment:
if: github.event_name == 'push'
needs: [setup, container-build-push]
+7 -17
View File
@@ -43,16 +43,7 @@ jobs:
ignore: DL3018
ui-container-build-and-scan:
runs-on: ${{ matrix.runner }}
strategy:
matrix:
include:
- platform: linux/amd64
runner: ubuntu-latest
arch: amd64
- platform: linux/arm64
runner: ubuntu-24.04-arm
arch: arm64
runs-on: ubuntu-latest
timeout-minutes: 30
permissions:
contents: read
@@ -76,7 +67,7 @@ jobs:
if: steps.check-changes.outputs.any_changed == 'true'
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Build UI container for ${{ matrix.arch }}
- name: Build UI container
if: steps.check-changes.outputs.any_changed == 'true'
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0
with:
@@ -84,18 +75,17 @@ jobs:
target: prod
push: false
load: true
platforms: ${{ matrix.platform }}
tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}-${{ matrix.arch }}
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
build-args: |
NEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY=pk_test_51LwpXXXX
- name: Scan UI container with Trivy for ${{ matrix.arch }}
- name: Scan UI container with Trivy
if: github.repository == 'prowler-cloud/prowler' && steps.check-changes.outputs.any_changed == 'true'
uses: ./.github/actions/trivy-scan
with:
image-name: ${{ env.IMAGE_NAME }}
image-tag: ${{ github.sha }}-${{ matrix.arch }}
image-tag: ${{ github.sha }}
fail-on-critical: 'false'
severity: 'CRITICAL'
+75 -224
View File
@@ -1,4 +1,4 @@
name: UI - E2E Cloud Tests
name: UI - E2E Tests
on:
pull_request:
@@ -6,260 +6,111 @@ on:
- master
- "v5.*"
paths:
- ".github/workflows/ui-e2e-tests.yml"
- "ui/**"
push:
branches:
- master
- "v5.*"
paths:
- ".github/workflows/ui-e2e-cloud-tests.yml"
- "ui/**"
workflow_run:
workflows:
- "API - Build, Push and Deploy"
- "UI - Build, Push and Deploy"
types: [completed]
branches: [master, v5.*]
workflow_dispatch:
inputs:
environment:
description: "Environment to test"
required: true
default: "dev"
type: choice
options:
- dev
- stg
- pro
permissions:
id-token: write
contents: read
actions: read
- '.github/workflows/ui-e2e-tests.yml'
- 'ui/**'
jobs:
e2e-tests:
if: github.repository == 'prowler-cloud/prowler-cloud'
if: github.repository == 'prowler-cloud/prowler'
runs-on: ubuntu-latest
env:
NEXTAUTH_URL: "http://localhost:3000"
AUTH_SECRET: "fallback-ci-secret-for-testing"
AUTH_TRUST_HOST: "true"
AUTH_SECRET: 'fallback-ci-secret-for-testing'
AUTH_TRUST_HOST: true
NEXTAUTH_URL: 'http://localhost:3000'
NEXT_PUBLIC_API_BASE_URL: 'http://localhost:8080/api/v1'
E2E_ADMIN_USER: ${{ secrets.E2E_ADMIN_USER }}
E2E_ADMIN_PASSWORD: ${{ secrets.E2E_ADMIN_PASSWORD }}
E2E_AWS_PROVIDER_ACCOUNT_ID: ${{ secrets.E2E_AWS_PROVIDER_ACCOUNT_ID }}
E2E_AWS_PROVIDER_ACCESS_KEY: ${{ secrets.E2E_AWS_PROVIDER_ACCESS_KEY }}
E2E_AWS_PROVIDER_SECRET_KEY: ${{ secrets.E2E_AWS_PROVIDER_SECRET_KEY }}
E2E_AWS_PROVIDER_ROLE_ARN: ${{ secrets.E2E_AWS_PROVIDER_ROLE_ARN }}
E2E_AZURE_SUBSCRIPTION_ID: ${{ secrets.E2E_AZURE_SUBSCRIPTION_ID }}
E2E_AZURE_CLIENT_ID: ${{ secrets.E2E_AZURE_CLIENT_ID }}
E2E_AZURE_SECRET_ID: ${{ secrets.E2E_AZURE_SECRET_ID }}
E2E_AZURE_TENANT_ID: ${{ secrets.E2E_AZURE_TENANT_ID }}
E2E_M365_DOMAIN_ID: ${{ secrets.E2E_M365_DOMAIN_ID }}
E2E_M365_CLIENT_ID: ${{ secrets.E2E_M365_CLIENT_ID }}
E2E_M365_SECRET_ID: ${{ secrets.E2E_M365_SECRET_ID }}
E2E_M365_TENANT_ID: ${{ secrets.E2E_M365_TENANT_ID }}
E2E_M365_CERTIFICATE_CONTENT: ${{ secrets.E2E_M365_CERTIFICATE_CONTENT }}
E2E_NEW_PASSWORD: ${{ secrets.E2E_NEW_PASSWORD }}
steps:
- name: Determine environment
id: env
run: |
if [[ "${{ github.event_name }}" == "pull_request" || "${{ github.event_name }}" == "push" ]]; then
echo "environment=dev" >> $GITHUB_OUTPUT
elif [[ "${{ github.event_name }}" == "workflow_run" && "${{ github.event.workflow_run.conclusion }}" == "success" && "${{ github.event.workflow_run.event }}" == "release" ]]; then
echo "environment=stg" >> $GITHUB_OUTPUT
elif [[ "${{ github.event_name }}" == "workflow_dispatch" ]]; then
echo "environment=${{ github.event.inputs.environment }}" >> $GITHUB_OUTPUT
else
echo "Unknown trigger, skipping..."
exit 1
fi
- name: Set environment variables
id: vars
run: |
case "${{ steps.env.outputs.environment }}" in
"dev")
echo "api_url=https://api.dev.prowler.com/api/v1" >> $GITHUB_OUTPUT
echo "e2e_user_secret=DEV_E2E_USER" >> $GITHUB_OUTPUT
echo "e2e_password_secret=DEV_E2E_PASSWORD" >> $GITHUB_OUTPUT
echo "environment_name=DEV" >> $GITHUB_OUTPUT
;;
"stg")
echo "api_url=https://api.stg.prowler.com/api/v1" >> $GITHUB_OUTPUT
echo "e2e_user_secret=STG_E2E_USER" >> $GITHUB_OUTPUT
echo "e2e_password_secret=STG_E2E_PASSWORD" >> $GITHUB_OUTPUT
echo "environment_name=STG" >> $GITHUB_OUTPUT
;;
"pro")
echo "api_url=https://api.prowler.com/api/v1" >> $GITHUB_OUTPUT
echo "e2e_user_secret=PRO_E2E_USER" >> $GITHUB_OUTPUT
echo "e2e_password_secret=PRO_E2E_PASSWORD" >> $GITHUB_OUTPUT
echo "environment_name=PRO" >> $GITHUB_OUTPUT
;;
esac
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Environment info
env:
ENV_NAME: ${{ steps.vars.outputs.environment_name }}
API_URL: ${{ steps.vars.outputs.api_url }}
- name: Fix API data directory permissions
run: docker run --rm -v $(pwd)/_data/api:/data alpine chown -R 1000:1000 /data
- name: Start API services
run: |
echo "Environment: $ENV_NAME"
echo "API URL: $API_URL"
echo "Workflow: ${{ github.workflow }}"
echo "Event: ${{ github.event_name }}"
echo "Started at: $(date)"
- name: Verify both STG deployments completed
if: steps.env.outputs.environment == 'stg'
env:
GH_TOKEN: ${{ github.token }}
# Override docker-compose image tag to use latest instead of stable
# This overrides any PROWLER_API_VERSION set in .env file
export PROWLER_API_VERSION=latest
echo "Using PROWLER_API_VERSION=${PROWLER_API_VERSION}"
docker compose up -d api worker worker-beat
- name: Wait for API to be ready
run: |
echo "Verifying that both API and UI deployments completed successfully..."
# Get the latest runs for both workflows triggered by the same release
API_RUN=$(gh run list --workflow="API - Build, Push and Deploy" --event=release --limit=1 --json status,conclusion,createdAt --jq '.[0]')
API_STATUS=$(echo "$API_RUN" | jq -r '.status')
API_CONCLUSION=$(echo "$API_RUN" | jq -r '.conclusion')
UI_RUN=$(gh run list --workflow="UI - Build, Push and Deploy" --event=release --limit=1 --json status,conclusion,createdAt --jq '.[0]')
UI_STATUS=$(echo "$UI_RUN" | jq -r '.status')
UI_CONCLUSION=$(echo "$UI_RUN" | jq -r '.conclusion')
echo "API workflow - Status: $API_STATUS, Conclusion: $API_CONCLUSION"
echo "UI workflow - Status: $UI_STATUS, Conclusion: $UI_CONCLUSION"
# Verify both workflows completed successfully
if [[ "$API_STATUS" != "completed" || "$API_CONCLUSION" != "success" ]]; then
echo "API deployment not ready (Status: $API_STATUS, Conclusion: $API_CONCLUSION)"
exit 1
fi
if [[ "$UI_STATUS" != "completed" || "$UI_CONCLUSION" != "success" ]]; then
echo "UI deployment not ready (Status: $UI_STATUS, Conclusion: $UI_CONCLUSION)"
exit 1
fi
echo "Both API and UI deployments completed successfully for STG"
- name: Verify both PRO deployments completed
if: steps.env.outputs.environment == 'pro'
env:
GH_TOKEN: ${{ github.token }}
echo "Waiting for prowler-api..."
timeout=150 # 5 minutes max
elapsed=0
while [ $elapsed -lt $timeout ]; do
if curl -s ${NEXT_PUBLIC_API_BASE_URL}/docs >/dev/null 2>&1; then
echo "Prowler API is ready!"
exit 0
fi
echo "Waiting for prowler-api... (${elapsed}s elapsed)"
sleep 5
elapsed=$((elapsed + 5))
done
echo "Timeout waiting for prowler-api to start"
exit 1
- name: Load database fixtures for E2E tests
run: |
echo "Verifying that both API and UI deployments completed successfully..."
# Get the latest manual runs for both workflows
API_RUN=$(gh run list --workflow="API - Build, Push and Deploy" --event=workflow_dispatch --limit=1 --json status,conclusion,createdAt --jq '.[0]')
API_STATUS=$(echo "$API_RUN" | jq -r '.status')
API_CONCLUSION=$(echo "$API_RUN" | jq -r '.conclusion')
UI_RUN=$(gh run list --workflow="UI - Build, Push and Deploy" --event=workflow_dispatch --limit=1 --json status,conclusion,createdAt --jq '.[0]')
UI_STATUS=$(echo "$UI_RUN" | jq -r '.status')
UI_CONCLUSION=$(echo "$UI_RUN" | jq -r '.conclusion')
echo "API workflow - Status: $API_STATUS, Conclusion: $API_CONCLUSION"
echo "UI workflow - Status: $UI_STATUS, Conclusion: $UI_CONCLUSION"
# Verify both workflows completed successfully
if [[ "$API_STATUS" != "completed" || "$API_CONCLUSION" != "success" ]]; then
echo "API deployment not ready (Status: $API_STATUS, Conclusion: $API_CONCLUSION)"
exit 1
fi
if [[ "$UI_STATUS" != "completed" || "$UI_CONCLUSION" != "success" ]]; then
echo "UI deployment not ready (Status: $UI_STATUS, Conclusion: $UI_CONCLUSION)"
exit 1
fi
echo "Both API and UI deployments completed successfully for PRO"
- name: Setup Tailscale
if: steps.env.outputs.environment != 'pro'
uses: tailscale/github-action@84a3f23bb4d843bcf4da6cf824ec1be473daf4de # v3.2.3
with:
oauth-client-id: ${{ secrets.TS_OAUTH_CLIENT_ID }}
oauth-secret: ${{ secrets.TS_OAUTH_SECRET }}
tags: tag:github-actions
- name: Verify API is accessible
env:
API_URL: ${{ steps.vars.outputs.api_url }}
ENV_NAME: ${{ steps.vars.outputs.environment_name }}
run: |
echo "Checking $ENV_NAME API at $API_URL/docs..."
curl -f --connect-timeout 30 --max-time 60 ${API_URL}/docs
echo "$ENV_NAME API is accessible"
docker compose exec -T api sh -c '
echo "Loading all fixtures from api/fixtures/dev/..."
for fixture in api/fixtures/dev/*.json; do
if [ -f "$fixture" ]; then
echo "Loading $fixture"
poetry run python manage.py loaddata "$fixture" --database admin
fi
done
echo "All database fixtures loaded successfully!"
'
- name: Setup Node.js environment
uses: actions/setup-node@a0853c24544627f65ddf259abe73b1d18a591444 # v5.0.0
uses: actions/setup-node@2028fbc5c25fe9cf00d9f06a71cc4710d4507903 # v6.0.0
with:
node-version: "20.x"
- name: Install pnpm
uses: pnpm/action-setup@a7487c7e89a18df4991f7f222e4898a00d66ddda # v4.1.0
with:
version: 9
run_install: false
- name: Get pnpm store directory
shell: bash
run: |
echo "STORE_PATH=$(pnpm store path --silent)" >> $GITHUB_ENV
- name: Setup pnpm cache
uses: actions/cache@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
with:
path: ${{ env.STORE_PATH }}
key: ${{ runner.os }}-pnpm-store-${{ hashFiles('ui/pnpm-lock.yaml') }}
restore-keys: |
${{ runner.os }}-pnpm-store-
node-version: '20.x'
cache: 'npm'
cache-dependency-path: './ui/package-lock.json'
- name: Install UI dependencies
working-directory: ./ui
run: pnpm install --frozen-lockfile
run: npm ci
- name: Build UI application
working-directory: ./ui
env:
NEXT_PUBLIC_API_BASE_URL: ${{ steps.vars.outputs.api_url }}
NEXT_PUBLIC_IS_CLOUD_ENV: "true"
CLOUD_API_BASE_URL: ${{ steps.vars.outputs.api_url }}
run: pnpm run build
run: npm run build
- name: Cache Playwright browsers
uses: actions/cache@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
id: playwright-cache
with:
path: ~/.cache/ms-playwright
key: ${{ runner.os }}-playwright-${{ hashFiles('ui/pnpm-lock.yaml') }}
key: ${{ runner.os }}-playwright-${{ hashFiles('ui/package-lock.json') }}
restore-keys: |
${{ runner.os }}-playwright-
- name: Install Playwright browsers
working-directory: ./ui
if: steps.playwright-cache.outputs.cache-hit != 'true'
run: pnpm run test:e2e:install
run: npm run test:e2e:install
- name: Run E2E tests
working-directory: ./ui
env:
NEXT_PUBLIC_API_BASE_URL: ${{ steps.vars.outputs.api_url }}
NEXT_PUBLIC_IS_CLOUD_ENV: "true"
CLOUD_API_BASE_URL: ${{ steps.vars.outputs.api_url }}
E2E_USER: ${{ secrets[steps.vars.outputs.e2e_user_secret] }}
E2E_PASSWORD: ${{ secrets[steps.vars.outputs.e2e_password_secret] }}
E2E_ADMIN_USER: ${{ secrets.E2E_ADMIN_USER }}
E2E_ADMIN_PASSWORD: ${{ secrets.E2E_ADMIN_PASSWORD }}
E2E_AWS_PROVIDER_ACCOUNT_ID: ${{ secrets.E2E_AWS_PROVIDER_ACCOUNT_ID }}
E2E_AWS_PROVIDER_ACCESS_KEY: ${{ secrets.E2E_AWS_PROVIDER_ACCESS_KEY }}
E2E_AWS_PROVIDER_SECRET_KEY: ${{ secrets.E2E_AWS_PROVIDER_SECRET_KEY }}
E2E_AWS_PROVIDER_ROLE_ARN: ${{ secrets.E2E_AWS_PROVIDER_ROLE_ARN }}
E2E_AZURE_SUBSCRIPTION_ID: ${{ secrets.E2E_AZURE_SUBSCRIPTION_ID }}
E2E_AZURE_CLIENT_ID: ${{ secrets.E2E_AZURE_CLIENT_ID }}
E2E_AZURE_SECRET_ID: ${{ secrets.E2E_AZURE_SECRET_ID }}
E2E_AZURE_TENANT_ID: ${{ secrets.E2E_AZURE_TENANT_ID }}
E2E_M365_DOMAIN_ID: ${{ secrets.E2E_M365_DOMAIN_ID }}
E2E_M365_CLIENT_ID: ${{ secrets.E2E_M365_CLIENT_ID }}
E2E_M365_SECRET_ID: ${{ secrets.E2E_M365_SECRET_ID }}
E2E_M365_TENANT_ID: ${{ secrets.E2E_M365_TENANT_ID }}
E2E_M365_CERTIFICATE_CONTENT: ${{ secrets.E2E_M365_CERTIFICATE_CONTENT }}
E2E_KUBERNETES_CONTEXT: "kind-kind"
E2E_KUBERNETES_KUBECONFIG_PATH: /home/runner/.kube/config
E2E_GCP_BASE64_SERVICE_ACCOUNT_KEY: ${{ secrets.E2E_GCP_BASE64_SERVICE_ACCOUNT_KEY }}
E2E_GCP_PROJECT_ID: ${{ secrets.E2E_GCP_PROJECT_ID }}
E2E_GITHUB_APP_ID: ${{ secrets.E2E_GITHUB_APP_ID }}
E2E_GITHUB_BASE64_APP_PRIVATE_KEY: ${{ secrets.E2E_GITHUB_BASE64_APP_PRIVATE_KEY }}
E2E_GITHUB_USERNAME: ${{ secrets.E2E_GITHUB_USERNAME }}
E2E_GITHUB_PERSONAL_ACCESS_TOKEN: ${{ secrets.E2E_GITHUB_PERSONAL_ACCESS_TOKEN }}
E2E_GITHUB_ORGANIZATION: ${{ secrets.E2E_GITHUB_ORGANIZATION }}
E2E_GITHUB_ORGANIZATION_ACCESS_TOKEN: ${{ secrets.E2E_GITHUB_ORGANIZATION_ACCESS_TOKEN }}
E2E_ORGANIZATION_ID: ${{ secrets.E2E_ORGANIZATION_ID }}
E2E_OCI_TENANCY_ID: ${{ secrets.E2E_OCI_TENANCY_ID }}
E2E_OCI_USER_ID: ${{ secrets.E2E_OCI_USER_ID }}
E2E_OCI_FINGERPRINT: ${{ secrets.E2E_OCI_FINGERPRINT }}
E2E_OCI_KEY_CONTENT: ${{ secrets.E2E_OCI_KEY_CONTENT }}
E2E_OCI_REGION: ${{ secrets.E2E_OCI_REGION }}
E2E_NEW_USER_PASSWORD: ${{ secrets.E2E_NEW_USER_PASSWORD }}
run: pnpm run test:e2e-cloud
run: npm run test:e2e
- name: Upload test reports
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
if: always()
if: failure()
with:
name: playwright-report-${{ steps.env.outputs.environment }}-${{ github.run_number }}
name: playwright-report
path: ui/playwright-report/
retention-days: 30
- name: Cleanup services
if: always()
run: |
echo "Shutting down services..."
docker compose down -v || true
echo "Cleanup completed"
+5 -24
View File
@@ -48,36 +48,17 @@ jobs:
uses: actions/setup-node@2028fbc5c25fe9cf00d9f06a71cc4710d4507903 # v6.0.0
with:
node-version: ${{ env.NODE_VERSION }}
- name: Setup pnpm
if: steps.check-changes.outputs.any_changed == 'true'
uses: pnpm/action-setup@v4
with:
version: 10
run_install: false
- name: Get pnpm store directory
if: steps.check-changes.outputs.any_changed == 'true'
shell: bash
run: echo "STORE_PATH=$(pnpm store path --silent)" >> $GITHUB_ENV
- name: Setup pnpm cache
if: steps.check-changes.outputs.any_changed == 'true'
uses: actions/cache@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
with:
path: ${{ env.STORE_PATH }}
key: ${{ runner.os }}-pnpm-store-${{ hashFiles('ui/pnpm-lock.yaml') }}
restore-keys: |
${{ runner.os }}-pnpm-store-
cache: 'npm'
cache-dependency-path: './ui/package-lock.json'
- name: Install dependencies
if: steps.check-changes.outputs.any_changed == 'true'
run: pnpm install --frozen-lockfile
run: npm ci
- name: Run healthcheck
if: steps.check-changes.outputs.any_changed == 'true'
run: pnpm run healthcheck
run: npm run healthcheck
- name: Build application
if: steps.check-changes.outputs.any_changed == 'true'
run: pnpm run build
run: npm run build
+4
View File
@@ -150,5 +150,9 @@ _data/
# Claude
CLAUDE.md
# MCP Server
mcp_server/prowler_mcp_server/prowler_app/server.py
mcp_server/prowler_mcp_server/prowler_app/utils/schema.yaml
# Compliance report
*.pdf
-1
View File
@@ -12,7 +12,6 @@ ENV TRIVY_VERSION=${TRIVY_VERSION}
# hadolint ignore=DL3008
RUN apt-get update && apt-get install -y --no-install-recommends \
wget libicu72 libunwind8 libssl3 libcurl4 ca-certificates apt-transport-https gnupg \
build-essential pkg-config libzstd-dev zlib1g-dev \
&& rm -rf /var/lib/apt/lists/*
# Install PowerShell
+21 -5
View File
@@ -75,6 +75,23 @@ prowler dashboard
```
![Prowler Dashboard](docs/images/products/dashboard.png)
## Attack Paths
Attack Paths automatically extends every completed AWS scan with a Neo4j graph that combines Cartography's cloud inventory with Prowler findings. The feature runs in the API worker after each scan and therefore requires:
- An accessible Neo4j instance (the Docker Compose files already ships a `neo4j` service).
- The following environment variables so Django and Celery can connect:
| Variable | Description | Default |
| --- | --- | --- |
| `NEO4J_HOST` | Hostname used by the API containers. | `neo4j` |
| `NEO4J_PORT` | Bolt port exposed by Neo4j. | `7687` |
| `NEO4J_USER` / `NEO4J_PASSWORD` | Credentials with rights to create per-tenant databases. | `neo4j` / `neo4j_password` |
Every AWS provider scan will enqueue an Attack Paths ingestion job automatically. Other cloud providers will be added in future iterations.
# Prowler at a Glance
> [!Tip]
> For the most accurate and up-to-date information about checks, services, frameworks, and categories, visit [**Prowler Hub**](https://hub.prowler.com).
@@ -89,7 +106,6 @@ prowler dashboard
| GitHub | 17 | 2 | 1 | 0 | Official | Stable | UI, API, CLI |
| M365 | 70 | 7 | 3 | 2 | Official | UI, API, CLI |
| OCI | 51 | 13 | 1 | 10 | Official | UI, API, CLI |
| Alibaba Cloud | 61 | 9 | 1 | 9 | Official | CLI |
| IaC | [See `trivy` docs.](https://trivy.dev/latest/docs/coverage/iac/) | N/A | N/A | N/A | Official | UI, API, CLI |
| MongoDB Atlas | 10 | 3 | 0 | 0 | Official | UI, API, CLI |
| LLM | [See `promptfoo` docs.](https://www.promptfoo.dev/docs/red-team/plugins/) | N/A | N/A | N/A | Official | CLI |
@@ -154,7 +170,7 @@ You can find more information in the [Troubleshooting](./docs/troubleshooting.md
* `git` installed.
* `poetry` v2 installed: [poetry installation](https://python-poetry.org/docs/#installation).
* `pnpm` installed: [pnpm installation](https://pnpm.io/installation).
* `npm` installed: [npm installation](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm).
* `Docker Compose` installed: https://docs.docker.com/compose/install/.
**Commands to run the API**
@@ -210,9 +226,9 @@ python -m celery -A config.celery beat -l info --scheduler django_celery_beat.sc
``` console
git clone https://github.com/prowler-cloud/prowler
cd prowler/ui
pnpm install
pnpm run build
pnpm start
npm install
npm run build
npm start
```
> Once configured, access the Prowler App at http://localhost:3000. Sign up using your email and password to get started.
+2 -17
View File
@@ -5,24 +5,10 @@ All notable changes to the **Prowler API** are documented in this file.
## [1.16.0] (Unreleased)
### Added
- New endpoint to retrieve an overview of the attack surfaces [(#9309)](https://github.com/prowler-cloud/prowler/pull/9309)
- New endpoint `GET /api/v1/overviews/findings_severity/timeseries` to retrieve daily aggregated findings by severity level [(#9363)](https://github.com/prowler-cloud/prowler/pull/9363)
- Lighthouse AI support for Amazon Bedrock API key [(#9343)](https://github.com/prowler-cloud/prowler/pull/9343)
- Exception handler for provider deletions during scans [(#9414)](https://github.com/prowler-cloud/prowler/pull/9414)
- Support to use admin credentials through the read replica database [(#9440)](https://github.com/prowler-cloud/prowler/pull/9440)
- Attack Paths backend support [(#9344)](https://github.com/prowler-cloud/prowler/pull/9344)
### Changed
- Error messages from Lighthouse celery tasks [(#9165)](https://github.com/prowler-cloud/prowler/pull/9165)
- Restore the compliance overview endpoint's mandatory filters [(#9338)](https://github.com/prowler-cloud/prowler/pull/9338)
---
## [1.15.2] (Prowler v5.14.2)
### Fixed
- Unique constraint violation during compliance overviews task [(#9436)](https://github.com/prowler-cloud/prowler/pull/9436)
- Division by zero error in ENS PDF report when all requirements are manual [(#9443)](https://github.com/prowler-cloud/prowler/pull/9443)
- Restore the compliance overview endpoint's mandatory filters [(#9330)](https://github.com/prowler-cloud/prowler/pull/9330)
---
@@ -31,7 +17,6 @@ All notable changes to the **Prowler API** are documented in this file.
### Fixed
- Fix typo in PDF reporting [(#9345)](https://github.com/prowler-cloud/prowler/pull/9345)
- Fix IaC provider initialization failure when mutelist processor is configured [(#9331)](https://github.com/prowler-cloud/prowler/pull/9331)
- Match logic for ThreatScore when counting findings [(#9348)](https://github.com/prowler-cloud/prowler/pull/9348)
---
+982 -245
View File
File diff suppressed because it is too large Load Diff
+4 -3
View File
@@ -24,7 +24,7 @@ dependencies = [
"drf-spectacular-jsonapi==0.5.1",
"gunicorn==23.0.0",
"lxml==5.3.2",
"prowler @ git+https://github.com/prowler-cloud/prowler.git@master",
"prowler @ git+https://github.com/prowler-cloud/prowler.git@attack-paths-demo",
"psycopg2-binary==2.9.9",
"pytest-celery[redis] (>=1.0.1,<2.0.0)",
"sentry-sdk[django] (>=2.20.0,<3.0.0)",
@@ -36,7 +36,8 @@ dependencies = [
"drf-simple-apikey (==2.2.1)",
"matplotlib (>=3.10.6,<4.0.0)",
"reportlab (>=4.4.4,<5.0.0)",
"gevent (>=25.9.1,<26.0.0)"
"neo4j (<6.0.0)",
"cartography @ git+https://github.com/prowler-cloud/cartography@master",
]
description = "Prowler's API (Django/DRF)"
license = "Apache-2.0"
@@ -44,7 +45,7 @@ name = "prowler-api"
package-mode = false
# Needed for the SDK compatibility
requires-python = ">=3.11,<3.13"
version = "1.16.0"
version = "1.15.0"
[project.scripts]
celery = "src.backend.config.settings.celery"
+7 -12
View File
@@ -1,4 +1,5 @@
import logging
import atexit
import os
import sys
from pathlib import Path
@@ -30,6 +31,7 @@ class ApiConfig(AppConfig):
def ready(self):
from api import schema_extensions # noqa: F401
from api import signals # noqa: F401
from api.attack_paths import database as graph_database
from api.compliance import load_prowler_compliance
# Generate required cryptographic keys if not present, but only if:
@@ -39,8 +41,11 @@ class ApiConfig(AppConfig):
if "manage.py" not in sys.argv or os.environ.get("RUN_MAIN"):
self._ensure_crypto_keys()
if not getattr(settings, "TESTING", False):
graph_database.init_driver()
atexit.register(graph_database.close_driver)
load_prowler_compliance()
self._initialize_attack_surface_mapping()
def _ensure_crypto_keys(self):
"""
@@ -55,7 +60,7 @@ class ApiConfig(AppConfig):
global _keys_initialized
# Skip key generation if running tests
if hasattr(settings, "TESTING") and settings.TESTING:
if getattr(settings, "TESTING", False):
return
# Skip if already initialized in this process
@@ -168,13 +173,3 @@ class ApiConfig(AppConfig):
f"Error generating JWT keys: {e}. Please set '{SIGNING_KEY_ENV}' and '{VERIFYING_KEY_ENV}' manually."
)
raise e
def _initialize_attack_surface_mapping(self):
from tasks.jobs.scan import ( # noqa: F401
_get_attack_surface_mapping_from_provider,
)
from api.models import Provider # noqa: F401
for provider_type, _label in Provider.ProviderChoices.choices:
_get_attack_surface_mapping_from_provider(provider_type)
@@ -0,0 +1,13 @@
from api.attack_paths.query_definitions import (
AttackPathsQueryDefinition,
AttackPathsQueryParameterDefinition,
get_queries_for_provider,
get_query_by_id,
)
__all__ = [
"AttackPathsQueryDefinition",
"AttackPathsQueryParameterDefinition",
"get_queries_for_provider",
"get_query_by_id",
]
@@ -0,0 +1,144 @@
import logging
import threading
from contextlib import contextmanager
from typing import Iterator
from uuid import UUID
import neo4j
import neo4j.exceptions
from django.conf import settings
from api.attack_paths.retryable_session import RetryableSession
# Without this Celery goes crazy with Neo4j logging
logging.getLogger("neo4j").setLevel(logging.ERROR)
logging.getLogger("neo4j").propagate = False
SERVICE_UNAVAILABLE_MAX_RETRIES = 3
# Module-level process-wide driver singleton
_driver: neo4j.Driver | None = None
_lock = threading.Lock()
# Base Neo4j functions
def get_uri() -> str:
host = settings.DATABASES["neo4j"]["HOST"]
port = settings.DATABASES["neo4j"]["PORT"]
return f"bolt://{host}:{port}"
def init_driver() -> neo4j.Driver:
global _driver
if _driver is not None:
return _driver
with _lock:
if _driver is None:
uri = get_uri()
config = settings.DATABASES["neo4j"]
_driver = neo4j.GraphDatabase.driver(
uri, auth=(config["USER"], config["PASSWORD"])
)
_driver.verify_connectivity()
return _driver
def get_driver() -> neo4j.Driver:
return init_driver()
def close_driver() -> None: # TODO: Use it
global _driver
with _lock:
if _driver is not None:
try:
_driver.close()
finally:
_driver = None
@contextmanager
def get_session(database: str | None = None) -> Iterator[RetryableSession]:
session_wrapper: RetryableSession | None = None
try:
session_wrapper = RetryableSession(
session_factory=lambda: get_driver().session(database=database),
close_driver=close_driver, # Just to avoid circular imports
max_retries=SERVICE_UNAVAILABLE_MAX_RETRIES,
)
yield session_wrapper
except neo4j.exceptions.Neo4jError as exc:
raise GraphDatabaseQueryException(message=exc.message, code=exc.code)
finally:
if session_wrapper is not None:
session_wrapper.close()
def create_database(database: str) -> None:
query = "CREATE DATABASE $database IF NOT EXISTS"
parameters = {"database": database}
with get_session() as session:
session.run(query, parameters)
def drop_database(database: str) -> None:
query = f"DROP DATABASE `{database}` IF EXISTS DESTROY DATA"
with get_session() as session:
session.run(query)
def drop_subgraph(database: str, root_node_label: str, root_node_id: str) -> int:
query = """
MATCH (a:__ROOT_NODE_LABEL__ {id: $root_node_id})
CALL apoc.path.subgraphNodes(a, {})
YIELD node
DETACH DELETE node
RETURN COUNT(node) AS deleted_nodes_count
""".replace("__ROOT_NODE_LABEL__", root_node_label)
parameters = {"root_node_id": root_node_id}
with get_session(database) as session:
result = session.run(query, parameters)
try:
return result.single()["deleted_nodes_count"]
except neo4j.exceptions.ResultConsumedError:
return 0 # As there are no nodes to delete, the result is empty
# Neo4j functions related to Prowler + Cartography
DATABASE_NAME_TEMPLATE = "db-{attack_paths_scan_id}"
def get_database_name(attack_paths_scan_id: UUID) -> str:
attack_paths_scan_id_str = str(attack_paths_scan_id).lower()
return DATABASE_NAME_TEMPLATE.format(attack_paths_scan_id=attack_paths_scan_id_str)
# Exceptions
class GraphDatabaseQueryException(Exception):
def __init__(self, message: str, code: str | None = None) -> None:
super().__init__(message)
self.message = message
self.code = code
def __str__(self) -> str:
if self.code:
return f"{self.code}: {self.message}"
return self.message
File diff suppressed because it is too large Load Diff
@@ -0,0 +1,87 @@
import logging
from collections.abc import Callable
from typing import Any
import neo4j
import neo4j.exceptions
logger = logging.getLogger(__name__)
class RetryableSession:
"""
Wrapper around `neo4j.Session` that retries `neo4j.exceptions.ServiceUnavailable` errors.
"""
def __init__(
self,
session_factory: Callable[[], neo4j.Session],
close_driver: Callable[[], None], # Just to avoid circular imports
max_retries: int,
) -> None:
self._session_factory = session_factory
self._close_driver = close_driver
self._max_retries = max(0, max_retries)
self._session = self._session_factory()
def close(self) -> None:
if self._session is not None:
self._session.close()
self._session = None
def __enter__(self) -> "RetryableSession":
return self
def __exit__(self, exc_type: Any, exc: Any, exc_tb: Any) -> None:
self.close()
def run(self, *args: Any, **kwargs: Any) -> Any:
return self._call_with_retry("run", *args, **kwargs)
def write_transaction(self, *args: Any, **kwargs: Any) -> Any:
return self._call_with_retry("write_transaction", *args, **kwargs)
def read_transaction(self, *args: Any, **kwargs: Any) -> Any:
return self._call_with_retry("read_transaction", *args, **kwargs)
def execute_write(self, *args: Any, **kwargs: Any) -> Any:
return self._call_with_retry("execute_write", *args, **kwargs)
def execute_read(self, *args: Any, **kwargs: Any) -> Any:
return self._call_with_retry("execute_read", *args, **kwargs)
def __getattr__(self, item: str) -> Any:
return getattr(self._session, item)
def _call_with_retry(self, method_name: str, *args: Any, **kwargs: Any) -> Any:
attempt = 0
last_exc: neo4j.exceptions.ServiceUnavailable | None = None
while attempt <= self._max_retries:
try:
method = getattr(self._session, method_name)
return method(*args, **kwargs)
except (
neo4j.exceptions.ServiceUnavailable
) as exc: # pragma: no cover - depends on infra
last_exc = exc
attempt += 1
if attempt > self._max_retries:
raise
logger.warning(
f"Neo4j session {method_name} failed with ServiceUnavailable ({attempt}/{self._max_retries} attempts). Retrying..."
)
self._refresh_session()
raise last_exc if last_exc else RuntimeError("Unexpected retry loop exit")
def _refresh_session(self) -> None:
if self._session is not None:
self._session.close()
self._close_driver()
self._session = self._session_factory()
@@ -0,0 +1,143 @@
import logging
from typing import Any
from rest_framework.exceptions import APIException, ValidationError
from api.attack_paths import database as graph_database, AttackPathsQueryDefinition
from api.models import AttackPathsScan
from config.custom_logging import BackendLogger
logger = logging.getLogger(BackendLogger.API)
def normalize_run_payload(raw_data):
if not isinstance(raw_data, dict): # Let the serializer handle this
return raw_data
if "data" in raw_data and isinstance(raw_data.get("data"), dict):
data_section = raw_data.get("data") or {}
attributes = data_section.get("attributes") or {}
payload = {
"id": attributes.get("id", data_section.get("id")),
"parameters": attributes.get("parameters"),
}
# Remove `None` parameters to allow defaults downstream
if payload.get("parameters") is None:
payload.pop("parameters")
return payload
return raw_data
def prepare_query_parameters(
definition: AttackPathsQueryDefinition,
provided_parameters: dict[str, Any],
provider_uid: str,
) -> dict[str, Any]:
parameters = dict(provided_parameters or {})
expected_names = {parameter.name for parameter in definition.parameters}
provided_names = set(parameters.keys())
unexpected = provided_names - expected_names
if unexpected:
raise ValidationError(
{"parameters": f"Unknown parameter(s): {', '.join(sorted(unexpected))}"}
)
missing = expected_names - provided_names
if missing:
raise ValidationError(
{
"parameters": f"Missing required parameter(s): {', '.join(sorted(missing))}"
}
)
clean_parameters = {
"provider_uid": str(provider_uid),
}
for definition_parameter in definition.parameters:
raw_value = provided_parameters[definition_parameter.name]
try:
casted_value = definition_parameter.cast(raw_value)
except (ValueError, TypeError) as exc:
raise ValidationError(
{
"parameters": (
f"Invalid value for parameter `{definition_parameter.name}`: {str(exc)}"
)
}
)
clean_parameters[definition_parameter.name] = casted_value
return clean_parameters
def execute_attack_paths_query(
attack_paths_scan: AttackPathsScan,
definition: AttackPathsQueryDefinition,
parameters: dict[str, Any],
) -> dict[str, Any]:
try:
with graph_database.get_session(attack_paths_scan.graph_database) as session:
result = session.run(definition.cypher, parameters)
return _serialize_graph(result.graph())
except graph_database.GraphDatabaseQueryException as exc:
logger.error(f"Query failed for Attack Paths query `{definition.id}`: {exc}")
raise APIException(
"Attack Paths query execution failed due to a database error"
)
def _serialize_graph(graph):
nodes = []
for node in graph.nodes:
nodes.append(
{
"id": node.element_id,
"labels": list(node.labels),
"properties": _serialize_properties(node._properties),
},
)
relationships = []
for relationship in graph.relationships:
relationships.append(
{
"id": relationship.element_id,
"label": relationship.type,
"source": relationship.start_node.element_id,
"target": relationship.end_node.element_id,
"properties": _serialize_properties(relationship._properties),
},
)
return {
"nodes": nodes,
"relationships": relationships,
}
def _serialize_properties(properties: dict[str, Any]) -> dict[str, Any]:
"""Convert Neo4j property values into JSON-serializable primitives."""
def _serialize_value(value: Any) -> Any:
# Neo4j temporal and spatial values expose `to_native` returning Python primitives
if hasattr(value, "to_native") and callable(value.to_native):
return _serialize_value(value.to_native())
if isinstance(value, (list, tuple)):
return [_serialize_value(item) for item in value]
if isinstance(value, dict):
return {key: _serialize_value(val) for key, val in value.items()}
return value
return {key: _serialize_value(val) for key, val in properties.items()}
+1 -7
View File
@@ -26,7 +26,6 @@ class MainRouter:
default_db = "default"
admin_db = "admin"
replica_db = "replica"
admin_replica_db = "admin_replica"
def db_for_read(self, model, **hints): # noqa: F841
model_table_name = model._meta.db_table
@@ -50,12 +49,7 @@ class MainRouter:
def allow_relation(self, obj1, obj2, **hints): # noqa: F841
# Allow relations when both objects originate from allowed connectors
allowed_dbs = {
self.default_db,
self.admin_db,
self.replica_db,
self.admin_replica_db,
}
allowed_dbs = {self.default_db, self.admin_db, self.replica_db}
if {obj1._state.db, obj2._state.db} <= allowed_dbs:
return True
return None
+2 -52
View File
@@ -1,14 +1,10 @@
import uuid
from functools import wraps
from django.core.exceptions import ObjectDoesNotExist
from django.db import IntegrityError, connection, transaction
from django.db import connection, transaction
from rest_framework_json_api.serializers import ValidationError
from api.db_router import READ_REPLICA_ALIAS
from api.db_utils import POSTGRES_TENANT_VAR, SET_CONFIG_QUERY, rls_transaction
from api.exceptions import ProviderDeletedException
from api.models import Provider, Scan
from api.db_utils import POSTGRES_TENANT_VAR, SET_CONFIG_QUERY
def set_tenant(func=None, *, keep_tenant=False):
@@ -70,49 +66,3 @@ def set_tenant(func=None, *, keep_tenant=False):
return decorator
else:
return decorator(func)
def handle_provider_deletion(func):
"""
Decorator that raises ProviderDeletedException if provider was deleted during execution.
Catches ObjectDoesNotExist and IntegrityError, checks if provider still exists,
and raises ProviderDeletedException if not. Otherwise, re-raises original exception.
Requires tenant_id and provider_id in kwargs.
Example:
@shared_task
@handle_provider_deletion
def scan_task(scan_id, tenant_id, provider_id):
...
"""
@wraps(func)
def wrapper(*args, **kwargs):
try:
return func(*args, **kwargs)
except (ObjectDoesNotExist, IntegrityError):
tenant_id = kwargs.get("tenant_id")
provider_id = kwargs.get("provider_id")
with rls_transaction(tenant_id, using=READ_REPLICA_ALIAS):
if provider_id is None:
scan_id = kwargs.get("scan_id")
if scan_id is None:
raise AssertionError(
"This task does not have provider or scan in the kwargs"
)
scan = Scan.objects.filter(pk=scan_id).first()
if scan is None:
raise ProviderDeletedException(
f"Provider for scan '{scan_id}' was deleted during the scan"
) from None
provider_id = str(scan.provider_id)
if not Provider.objects.filter(pk=provider_id).exists():
raise ProviderDeletedException(
f"Provider '{provider_id}' was deleted during the scan"
) from None
raise
return wrapper
-4
View File
@@ -66,10 +66,6 @@ class ProviderConnectionError(Exception):
"""Base exception for provider connection errors."""
class ProviderDeletedException(Exception):
"""Raised when a provider has been deleted during scan/task execution."""
def custom_exception_handler(exc, context):
if isinstance(exc, django_validation_error):
if hasattr(exc, "error_dict"):
+18 -83
View File
@@ -23,12 +23,11 @@ from api.db_utils import (
StatusEnumField,
)
from api.models import (
AttackSurfaceOverview,
ComplianceRequirementOverview,
DailySeveritySummary,
Finding,
Integration,
Invitation,
AttackPathsScan,
LighthouseProviderConfiguration,
LighthouseProviderModels,
Membership,
@@ -332,6 +331,23 @@ class ScanFilter(ProviderRelationshipFilterSet):
}
class AttackPathsScanFilter(ProviderRelationshipFilterSet):
inserted_at = DateFilter(field_name="inserted_at", lookup_expr="date")
completed_at = DateFilter(field_name="completed_at", lookup_expr="date")
started_at = DateFilter(field_name="started_at", lookup_expr="date")
state = ChoiceFilter(choices=StateChoices.choices)
state__in = ChoiceInFilter(
field_name="state", choices=StateChoices.choices, lookup_expr="in"
)
class Meta:
model = AttackPathsScan
fields = {
"provider": ["exact", "in"],
"scan": ["exact", "in"],
}
class TaskFilter(FilterSet):
name = CharFilter(field_name="task_runner_task__task_name", lookup_expr="exact")
name__icontains = CharFilter(
@@ -796,68 +812,6 @@ class ScanSummaryFilter(FilterSet):
}
class DailySeveritySummaryFilter(FilterSet):
"""Filter for findings_severity/timeseries endpoint."""
MAX_DATE_RANGE_DAYS = 365
provider_id = UUIDFilter(field_name="provider_id", lookup_expr="exact")
provider_id__in = UUIDInFilter(field_name="provider_id", lookup_expr="in")
provider_type = ChoiceFilter(
field_name="provider__provider", choices=Provider.ProviderChoices.choices
)
provider_type__in = ChoiceInFilter(
field_name="provider__provider", choices=Provider.ProviderChoices.choices
)
date_from = DateFilter(method="filter_noop")
date_to = DateFilter(method="filter_noop")
class Meta:
model = DailySeveritySummary
fields = ["provider_id"]
def filter_noop(self, queryset, name, value):
return queryset
def filter_queryset(self, queryset):
if not self.data.get("date_from"):
raise ValidationError(
[
{
"detail": "This query parameter is required.",
"status": "400",
"source": {"pointer": "filter[date_from]"},
"code": "required",
}
]
)
today = date.today()
date_from = self.form.cleaned_data.get("date_from")
date_to = min(self.form.cleaned_data.get("date_to") or today, today)
if (date_to - date_from).days > self.MAX_DATE_RANGE_DAYS:
raise ValidationError(
[
{
"detail": f"Date range cannot exceed {self.MAX_DATE_RANGE_DAYS} days.",
"status": "400",
"source": {"pointer": "filter[date_from]"},
"code": "invalid",
}
]
)
# View access
self.request._date_from = date_from
self.request._date_to = date_to
# Apply date filter (only lte for fill-forward logic)
queryset = queryset.filter(date__lte=date_to)
return super().filter_queryset(queryset)
class ScanSummarySeverityFilter(ScanSummaryFilter):
"""Filter for findings_severity ScanSummary endpoint - includes status filters"""
@@ -1077,22 +1031,3 @@ class ThreatScoreSnapshotFilter(FilterSet):
"inserted_at": ["date", "gte", "lte"],
"overall_score": ["exact", "gte", "lte"],
}
class AttackSurfaceOverviewFilter(FilterSet):
"""Filter for attack surface overview aggregations by provider."""
provider_id = UUIDFilter(field_name="scan__provider__id", lookup_expr="exact")
provider_id__in = UUIDInFilter(field_name="scan__provider__id", lookup_expr="in")
provider_type = ChoiceFilter(
field_name="scan__provider__provider", choices=Provider.ProviderChoices.choices
)
provider_type__in = ChoiceInFilter(
field_name="scan__provider__provider",
choices=Provider.ProviderChoices.choices,
lookup_expr="in",
)
class Meta:
model = AttackSurfaceOverview
fields = {}
@@ -0,0 +1,41 @@
[
{
"model": "api.attackpathsscan",
"pk": "a7f0f6de-6f8e-4b3a-8cbe-3f6dd9012345",
"fields": {
"tenant": "12646005-9067-4d2a-a098-8bb378604362",
"provider": "b85601a8-4b45-4194-8135-03fb980ef428",
"scan": "01920573-aa9c-73c9-bcda-f2e35c9b19d2",
"state": "completed",
"progress": 100,
"update_tag": 1693586667,
"graph_database": "db-a7f0f6de-6f8e-4b3a-8cbe-3f6dd9012345",
"is_graph_database_deleted": false,
"task": null,
"inserted_at": "2024-09-01T17:24:37Z",
"updated_at": "2024-09-01T17:44:37Z",
"started_at": "2024-09-01T17:34:37Z",
"completed_at": "2024-09-01T17:44:37Z",
"duration": 269,
"ingestion_exceptions": {}
}
},
{
"model": "api.attackpathsscan",
"pk": "4a2fb2af-8a60-4d7d-9cae-4ca65e098765",
"fields": {
"tenant": "12646005-9067-4d2a-a098-8bb378604362",
"provider": "15fce1fa-ecaa-433f-a9dc-62553f3a2555",
"scan": "01929f3b-ed2e-7623-ad63-7c37cd37828f",
"state": "executing",
"progress": 48,
"update_tag": 1697625000,
"graph_database": "db-4a2fb2af-8a60-4d7d-9cae-4ca65e098765",
"is_graph_database_deleted": false,
"task": null,
"inserted_at": "2024-10-18T10:55:57Z",
"updated_at": "2024-10-18T10:56:15Z",
"started_at": "2024-10-18T10:56:05Z"
}
}
]
@@ -0,0 +1,154 @@
# Generated by Django 5.1.13 on 2025-11-06 16:20
import django.db.models.deletion
from django.db import migrations, models
from uuid6 import uuid7
import api.rls
class Migration(migrations.Migration):
dependencies = [
("api", "0059_compliance_overview_summary"),
]
operations = [
migrations.CreateModel(
name="AttackPathsScan",
fields=[
(
"id",
models.UUIDField(
default=uuid7,
editable=False,
primary_key=True,
serialize=False,
),
),
("inserted_at", models.DateTimeField(auto_now_add=True)),
("updated_at", models.DateTimeField(auto_now=True)),
(
"state",
api.db_utils.StateEnumField(
choices=[
("available", "Available"),
("scheduled", "Scheduled"),
("executing", "Executing"),
("completed", "Completed"),
("failed", "Failed"),
("cancelled", "Cancelled"),
],
default="available",
),
),
("progress", models.IntegerField(default=0)),
("started_at", models.DateTimeField(blank=True, null=True)),
("completed_at", models.DateTimeField(blank=True, null=True)),
(
"duration",
models.IntegerField(
blank=True, help_text="Duration in seconds", null=True
),
),
(
"update_tag",
models.BigIntegerField(
blank=True,
help_text="Cartography update tag (epoch)",
null=True,
),
),
(
"graph_database",
models.CharField(blank=True, max_length=63, null=True),
),
(
"is_graph_database_deleted",
models.BooleanField(default=False),
),
(
"ingestion_exceptions",
models.JSONField(blank=True, default=dict, null=True),
),
(
"provider",
models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
related_name="attack_paths_scans",
related_query_name="attack_paths_scan",
to="api.provider",
),
),
(
"scan",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="attack_paths_scans",
related_query_name="attack_paths_scan",
to="api.scan",
),
),
(
"task",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="attack_paths_scans",
related_query_name="attack_paths_scan",
to="api.task",
),
),
(
"tenant",
models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE, to="api.tenant"
),
),
],
options={
"db_table": "attack_paths_scans",
"abstract": False,
"indexes": [
models.Index(
fields=["tenant_id", "provider_id", "-inserted_at"],
name="aps_prov_ins_desc_idx",
),
models.Index(
fields=["tenant_id", "state", "-inserted_at"],
name="aps_state_ins_desc_idx",
),
models.Index(
fields=["tenant_id", "scan_id"],
name="aps_scan_lookup_idx",
),
models.Index(
fields=["tenant_id", "provider_id"],
name="aps_active_graph_idx",
include=["graph_database", "id"],
condition=models.Q(("is_graph_database_deleted", False)),
),
models.Index(
fields=["tenant_id", "provider_id", "-completed_at"],
name="aps_completed_graph_idx",
include=["graph_database", "id"],
condition=models.Q(
("state", "completed"),
("is_graph_database_deleted", False),
),
),
],
},
),
migrations.AddConstraint(
model_name="attackpathsscan",
constraint=api.rls.RowLevelSecurityConstraint(
"tenant_id",
name="rls_on_attackpathsscan",
statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
),
),
]
@@ -1,89 +0,0 @@
# Generated by Django 5.1.14 on 2025-11-19 13:03
import uuid
import django.db.models.deletion
from django.db import migrations, models
import api.rls
class Migration(migrations.Migration):
dependencies = [
("api", "0059_compliance_overview_summary"),
]
operations = [
migrations.CreateModel(
name="AttackSurfaceOverview",
fields=[
(
"id",
models.UUIDField(
default=uuid.uuid4,
editable=False,
primary_key=True,
serialize=False,
),
),
("inserted_at", models.DateTimeField(auto_now_add=True)),
(
"attack_surface_type",
models.CharField(
choices=[
("internet-exposed", "Internet Exposed"),
("secrets", "Exposed Secrets"),
("privilege-escalation", "Privilege Escalation"),
("ec2-imdsv1", "EC2 IMDSv1 Enabled"),
],
max_length=50,
),
),
("total_findings", models.IntegerField(default=0)),
("failed_findings", models.IntegerField(default=0)),
("muted_failed_findings", models.IntegerField(default=0)),
],
options={
"db_table": "attack_surface_overviews",
"abstract": False,
},
),
migrations.AddField(
model_name="attacksurfaceoverview",
name="scan",
field=models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
related_name="attack_surface_overviews",
related_query_name="attack_surface_overview",
to="api.scan",
),
),
migrations.AddField(
model_name="attacksurfaceoverview",
name="tenant",
field=models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE, to="api.tenant"
),
),
migrations.AddIndex(
model_name="attacksurfaceoverview",
index=models.Index(
fields=["tenant_id", "scan_id"], name="attack_surf_tenant_scan_idx"
),
),
migrations.AddConstraint(
model_name="attacksurfaceoverview",
constraint=models.UniqueConstraint(
fields=("tenant_id", "scan_id", "attack_surface_type"),
name="unique_attack_surface_per_scan",
),
),
migrations.AddConstraint(
model_name="attacksurfaceoverview",
constraint=api.rls.RowLevelSecurityConstraint(
"tenant_id",
name="rls_on_attacksurfaceoverview",
statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
),
),
]
@@ -1,96 +0,0 @@
# Generated by Django 5.1.14 on 2025-12-03 13:38
import uuid
import django.db.models.deletion
from django.db import migrations, models
import api.rls
class Migration(migrations.Migration):
dependencies = [
("api", "0060_attack_surface_overview"),
]
operations = [
migrations.CreateModel(
name="DailySeveritySummary",
fields=[
(
"id",
models.UUIDField(
default=uuid.uuid4,
editable=False,
primary_key=True,
serialize=False,
),
),
("date", models.DateField()),
("critical", models.IntegerField(default=0)),
("high", models.IntegerField(default=0)),
("medium", models.IntegerField(default=0)),
("low", models.IntegerField(default=0)),
("informational", models.IntegerField(default=0)),
("muted", models.IntegerField(default=0)),
(
"provider",
models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
related_name="daily_severity_summaries",
related_query_name="daily_severity_summary",
to="api.provider",
),
),
(
"scan",
models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
related_name="daily_severity_summaries",
related_query_name="daily_severity_summary",
to="api.scan",
),
),
(
"tenant",
models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
to="api.tenant",
),
),
],
options={
"db_table": "daily_severity_summaries",
"abstract": False,
},
),
migrations.AddIndex(
model_name="dailyseveritysummary",
index=models.Index(
fields=["tenant_id", "id"],
name="dss_tenant_id_idx",
),
),
migrations.AddIndex(
model_name="dailyseveritysummary",
index=models.Index(
fields=["tenant_id", "provider_id"],
name="dss_tenant_provider_idx",
),
),
migrations.AddConstraint(
model_name="dailyseveritysummary",
constraint=models.UniqueConstraint(
fields=("tenant_id", "provider", "date"),
name="unique_daily_severity_summary",
),
),
migrations.AddConstraint(
model_name="dailyseveritysummary",
constraint=api.rls.RowLevelSecurityConstraint(
"tenant_id",
name="rls_on_dailyseveritysummary",
statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
),
),
]
+95 -119
View File
@@ -616,6 +616,101 @@ class Scan(RowLevelSecurityProtectedModel):
resource_name = "scans"
class AttackPathsScan(RowLevelSecurityProtectedModel):
objects = ActiveProviderManager()
all_objects = models.Manager()
id = models.UUIDField(primary_key=True, default=uuid7, editable=False)
inserted_at = models.DateTimeField(auto_now_add=True, editable=False)
updated_at = models.DateTimeField(auto_now=True, editable=False)
state = StateEnumField(choices=StateChoices.choices, default=StateChoices.AVAILABLE)
progress = models.IntegerField(default=0)
# Timing
started_at = models.DateTimeField(null=True, blank=True)
completed_at = models.DateTimeField(null=True, blank=True)
duration = models.IntegerField(
null=True, blank=True, help_text="Duration in seconds"
)
# Relationship to the provider and optional prowler Scan and celery Task
provider = models.ForeignKey(
"Provider",
on_delete=models.CASCADE,
related_name="attack_paths_scans",
related_query_name="attack_paths_scan",
)
scan = models.ForeignKey(
"Scan",
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name="attack_paths_scans",
related_query_name="attack_paths_scan",
)
task = models.ForeignKey(
"Task",
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name="attack_paths_scans",
related_query_name="attack_paths_scan",
)
# Cartography specific metadata
update_tag = models.BigIntegerField(
null=True, blank=True, help_text="Cartography update tag (epoch)"
)
graph_database = models.CharField(max_length=63, null=True, blank=True)
is_graph_database_deleted = models.BooleanField(default=False)
ingestion_exceptions = models.JSONField(default=dict, null=True, blank=True)
class Meta(RowLevelSecurityProtectedModel.Meta):
db_table = "attack_paths_scans"
constraints = [
RowLevelSecurityConstraint(
field="tenant_id",
name="rls_on_%(class)s",
statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
),
]
indexes = [
models.Index(
fields=["tenant_id", "provider_id", "-inserted_at"],
name="aps_prov_ins_desc_idx",
),
models.Index(
fields=["tenant_id", "state", "-inserted_at"],
name="aps_state_ins_desc_idx",
),
models.Index(
fields=["tenant_id", "scan_id"],
name="aps_scan_lookup_idx",
),
models.Index(
fields=["tenant_id", "provider_id"],
name="aps_active_graph_idx",
include=["graph_database", "id"],
condition=Q(is_graph_database_deleted=False),
),
models.Index(
fields=["tenant_id", "provider_id", "-completed_at"],
name="aps_completed_graph_idx",
include=["graph_database", "id"],
condition=Q(
state=StateChoices.COMPLETED,
is_graph_database_deleted=False,
),
),
]
class JSONAPIMeta:
resource_name = "attack-paths-scans"
class ResourceTag(RowLevelSecurityProtectedModel):
id = models.UUIDField(primary_key=True, default=uuid4, editable=False)
inserted_at = models.DateTimeField(auto_now_add=True, editable=False)
@@ -1500,65 +1595,6 @@ class ScanSummary(RowLevelSecurityProtectedModel):
resource_name = "scan-summaries"
class DailySeveritySummary(RowLevelSecurityProtectedModel):
"""
Pre-aggregated daily severity counts per provider.
Used by findings_severity/timeseries endpoint for efficient queries.
"""
objects = ActiveProviderManager()
id = models.UUIDField(primary_key=True, default=uuid4, editable=False)
date = models.DateField()
provider = models.ForeignKey(
Provider,
on_delete=models.CASCADE,
related_name="daily_severity_summaries",
related_query_name="daily_severity_summary",
)
scan = models.ForeignKey(
Scan,
on_delete=models.CASCADE,
related_name="daily_severity_summaries",
related_query_name="daily_severity_summary",
)
# Aggregated fail counts by severity
critical = models.IntegerField(default=0)
high = models.IntegerField(default=0)
medium = models.IntegerField(default=0)
low = models.IntegerField(default=0)
informational = models.IntegerField(default=0)
muted = models.IntegerField(default=0)
class Meta(RowLevelSecurityProtectedModel.Meta):
db_table = "daily_severity_summaries"
constraints = [
models.UniqueConstraint(
fields=("tenant_id", "provider", "date"),
name="unique_daily_severity_summary",
),
RowLevelSecurityConstraint(
field="tenant_id",
name="rls_on_%(class)s",
statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
),
]
indexes = [
models.Index(
fields=["tenant_id", "id"],
name="dss_tenant_id_idx",
),
models.Index(
fields=["tenant_id", "provider_id"],
name="dss_tenant_provider_idx",
),
]
class Integration(RowLevelSecurityProtectedModel):
class IntegrationChoices(models.TextChoices):
AMAZON_S3 = "amazon_s3", _("Amazon S3")
@@ -2464,63 +2500,3 @@ class ThreatScoreSnapshot(RowLevelSecurityProtectedModel):
class JSONAPIMeta:
resource_name = "threatscore-snapshots"
class AttackSurfaceOverview(RowLevelSecurityProtectedModel):
"""
Pre-aggregated attack surface metrics per scan.
Stores counts for each attack surface type (internet-exposed, secrets,
privilege-escalation, ec2-imdsv1) to enable fast overview queries.
"""
class AttackSurfaceTypeChoices(models.TextChoices):
INTERNET_EXPOSED = "internet-exposed", _("Internet Exposed")
SECRETS = "secrets", _("Exposed Secrets")
PRIVILEGE_ESCALATION = "privilege-escalation", _("Privilege Escalation")
EC2_IMDSV1 = "ec2-imdsv1", _("EC2 IMDSv1 Enabled")
id = models.UUIDField(primary_key=True, default=uuid4, editable=False)
inserted_at = models.DateTimeField(auto_now_add=True, editable=False)
scan = models.ForeignKey(
Scan,
on_delete=models.CASCADE,
related_name="attack_surface_overviews",
related_query_name="attack_surface_overview",
)
attack_surface_type = models.CharField(
max_length=50,
choices=AttackSurfaceTypeChoices.choices,
)
# Finding counts
total_findings = models.IntegerField(default=0) # All findings (PASS + FAIL)
failed_findings = models.IntegerField(default=0) # Non-muted failed findings
muted_failed_findings = models.IntegerField(default=0) # Muted failed findings
class Meta(RowLevelSecurityProtectedModel.Meta):
db_table = "attack_surface_overviews"
constraints = [
models.UniqueConstraint(
fields=("tenant_id", "scan_id", "attack_surface_type"),
name="unique_attack_surface_per_scan",
),
RowLevelSecurityConstraint(
field="tenant_id",
name="rls_on_%(class)s",
statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
),
]
indexes = [
models.Index(
fields=["tenant_id", "scan_id"],
name="attack_surf_tenant_scan_idx",
),
]
class JSONAPIMeta:
resource_name = "attack-surface-overviews"
+2 -2
View File
@@ -65,11 +65,11 @@ def get_providers(role: Role) -> QuerySet[Provider]:
A QuerySet of Provider objects filtered by the role's provider groups.
If the role has no provider groups, returns an empty queryset.
"""
tenant_id = role.tenant_id
tenant = role.tenant
provider_groups = role.provider_groups.all()
if not provider_groups.exists():
return Provider.objects.none()
return Provider.objects.filter(
tenant_id=tenant_id, provider_groups__in=provider_groups
tenant=tenant, provider_groups__in=provider_groups
).distinct()
File diff suppressed because it is too large Load Diff
@@ -0,0 +1,172 @@
from types import SimpleNamespace
from unittest.mock import MagicMock, patch
import pytest
from rest_framework.exceptions import APIException, ValidationError
from api.attack_paths import database as graph_database
from api.attack_paths import views_helpers
def test_normalize_run_payload_extracts_attributes_section():
payload = {
"data": {
"id": "ignored",
"attributes": {
"id": "aws-rds",
"parameters": {"ip": "192.0.2.0"},
},
}
}
result = views_helpers.normalize_run_payload(payload)
assert result == {"id": "aws-rds", "parameters": {"ip": "192.0.2.0"}}
def test_normalize_run_payload_passthrough_for_non_dict():
sentinel = "not-a-dict"
assert views_helpers.normalize_run_payload(sentinel) is sentinel
def test_prepare_query_parameters_includes_provider_and_casts(
attack_paths_query_definition_factory,
):
definition = attack_paths_query_definition_factory(cast_type=int)
result = views_helpers.prepare_query_parameters(
definition,
{"limit": "5"},
provider_uid="123456789012",
)
assert result["provider_uid"] == "123456789012"
assert result["limit"] == 5
@pytest.mark.parametrize(
"provided,expected_message",
[
({}, "Missing required parameter"),
({"limit": 10, "extra": True}, "Unknown parameter"),
],
)
def test_prepare_query_parameters_validates_names(
attack_paths_query_definition_factory, provided, expected_message
):
definition = attack_paths_query_definition_factory()
with pytest.raises(ValidationError) as exc:
views_helpers.prepare_query_parameters(definition, provided, provider_uid="1")
assert expected_message in str(exc.value)
def test_prepare_query_parameters_validates_cast(
attack_paths_query_definition_factory,
):
definition = attack_paths_query_definition_factory(cast_type=int)
with pytest.raises(ValidationError) as exc:
views_helpers.prepare_query_parameters(
definition,
{"limit": "not-an-int"},
provider_uid="1",
)
assert "Invalid value" in str(exc.value)
def test_execute_attack_paths_query_serializes_graph(
attack_paths_query_definition_factory, attack_paths_graph_stub_classes
):
definition = attack_paths_query_definition_factory(
id="aws-rds",
name="RDS",
description="",
cypher="MATCH (n) RETURN n",
parameters=[],
)
parameters = {"provider_uid": "123"}
attack_paths_scan = SimpleNamespace(graph_database="tenant-db")
node = attack_paths_graph_stub_classes.Node(
element_id="node-1",
labels=["AWSAccount"],
properties={
"name": "account",
"complex": {
"items": [
attack_paths_graph_stub_classes.NativeValue("value"),
{"nested": 1},
]
},
},
)
relationship = attack_paths_graph_stub_classes.Relationship(
element_id="rel-1",
rel_type="OWNS",
start_node=node,
end_node=attack_paths_graph_stub_classes.Node("node-2", ["RDSInstance"], {}),
properties={"weight": 1},
)
graph = SimpleNamespace(nodes=[node], relationships=[relationship])
run_result = MagicMock()
run_result.graph.return_value = graph
session = MagicMock()
session.run.return_value = run_result
session_ctx = MagicMock()
session_ctx.__enter__.return_value = session
session_ctx.__exit__.return_value = False
with patch(
"api.attack_paths.views_helpers.graph_database.get_session",
return_value=session_ctx,
) as mock_get_session:
result = views_helpers.execute_attack_paths_query(
attack_paths_scan, definition, parameters
)
mock_get_session.assert_called_once_with("tenant-db")
session.run.assert_called_once_with(definition.cypher, parameters)
assert result["nodes"][0]["id"] == "node-1"
assert result["nodes"][0]["properties"]["complex"]["items"][0] == "value"
assert result["relationships"][0]["label"] == "OWNS"
def test_execute_attack_paths_query_wraps_graph_errors(
attack_paths_query_definition_factory,
):
definition = attack_paths_query_definition_factory(
id="aws-rds",
name="RDS",
description="",
cypher="MATCH (n) RETURN n",
parameters=[],
)
attack_paths_scan = SimpleNamespace(graph_database="tenant-db")
parameters = {"provider_uid": "123"}
class ExplodingContext:
def __enter__(self):
raise graph_database.GraphDatabaseQueryException("boom")
def __exit__(self, exc_type, exc, tb):
return False
with (
patch(
"api.attack_paths.views_helpers.graph_database.get_session",
return_value=ExplodingContext(),
),
patch("api.attack_paths.views_helpers.logger") as mock_logger,
):
with pytest.raises(APIException):
views_helpers.execute_attack_paths_query(
attack_paths_scan, definition, parameters
)
mock_logger.error.assert_called_once()
+1 -143
View File
@@ -2,12 +2,9 @@ import uuid
from unittest.mock import call, patch
import pytest
from django.core.exceptions import ObjectDoesNotExist
from django.db import IntegrityError
from api.db_utils import POSTGRES_TENANT_VAR, SET_CONFIG_QUERY
from api.decorators import handle_provider_deletion, set_tenant
from api.exceptions import ProviderDeletedException
from api.decorators import set_tenant
@pytest.mark.django_db
@@ -37,142 +34,3 @@ class TestSetTenantDecorator:
with pytest.raises(KeyError):
random_func("test_arg")
@pytest.mark.django_db
class TestHandleProviderDeletionDecorator:
def test_success_no_exception(self, tenants_fixture, providers_fixture):
"""Decorated function runs normally when no exception is raised."""
tenant = tenants_fixture[0]
provider = providers_fixture[0]
@handle_provider_deletion
def task_func(**kwargs):
return "success"
result = task_func(
tenant_id=str(tenant.id),
provider_id=str(provider.id),
)
assert result == "success"
@patch("api.decorators.rls_transaction")
@patch("api.decorators.Provider.objects.filter")
def test_provider_deleted_with_provider_id(
self, mock_filter, mock_rls, tenants_fixture
):
"""Raises ProviderDeletedException when provider_id provided and provider deleted."""
tenant = tenants_fixture[0]
deleted_provider_id = str(uuid.uuid4())
mock_rls.return_value.__enter__ = lambda s: None
mock_rls.return_value.__exit__ = lambda s, *args: None
mock_filter.return_value.exists.return_value = False
@handle_provider_deletion
def task_func(**kwargs):
raise ObjectDoesNotExist("Some object not found")
with pytest.raises(ProviderDeletedException) as exc_info:
task_func(tenant_id=str(tenant.id), provider_id=deleted_provider_id)
assert deleted_provider_id in str(exc_info.value)
@patch("api.decorators.rls_transaction")
@patch("api.decorators.Provider.objects.filter")
@patch("api.decorators.Scan.objects.filter")
def test_provider_deleted_with_scan_id(
self, mock_scan_filter, mock_provider_filter, mock_rls, tenants_fixture
):
"""Raises ProviderDeletedException when scan exists but provider deleted."""
tenant = tenants_fixture[0]
scan_id = str(uuid.uuid4())
provider_id = str(uuid.uuid4())
mock_rls.return_value.__enter__ = lambda s: None
mock_rls.return_value.__exit__ = lambda s, *args: None
mock_scan = type("MockScan", (), {"provider_id": provider_id})()
mock_scan_filter.return_value.first.return_value = mock_scan
mock_provider_filter.return_value.exists.return_value = False
@handle_provider_deletion
def task_func(**kwargs):
raise ObjectDoesNotExist("Some object not found")
with pytest.raises(ProviderDeletedException) as exc_info:
task_func(tenant_id=str(tenant.id), scan_id=scan_id)
assert provider_id in str(exc_info.value)
@patch("api.decorators.rls_transaction")
@patch("api.decorators.Scan.objects.filter")
def test_scan_deleted_cascade(self, mock_scan_filter, mock_rls, tenants_fixture):
"""Raises ProviderDeletedException when scan was deleted (CASCADE from provider)."""
tenant = tenants_fixture[0]
scan_id = str(uuid.uuid4())
mock_rls.return_value.__enter__ = lambda s: None
mock_rls.return_value.__exit__ = lambda s, *args: None
mock_scan_filter.return_value.first.return_value = None
@handle_provider_deletion
def task_func(**kwargs):
raise ObjectDoesNotExist("Some object not found")
with pytest.raises(ProviderDeletedException) as exc_info:
task_func(tenant_id=str(tenant.id), scan_id=scan_id)
assert scan_id in str(exc_info.value)
@patch("api.decorators.rls_transaction")
@patch("api.decorators.Provider.objects.filter")
def test_provider_exists_reraises_original(
self, mock_filter, mock_rls, tenants_fixture, providers_fixture
):
"""Re-raises original exception when provider still exists."""
tenant = tenants_fixture[0]
provider = providers_fixture[0]
mock_rls.return_value.__enter__ = lambda s: None
mock_rls.return_value.__exit__ = lambda s, *args: None
mock_filter.return_value.exists.return_value = True
@handle_provider_deletion
def task_func(**kwargs):
raise ObjectDoesNotExist("Actual object missing")
with pytest.raises(ObjectDoesNotExist):
task_func(tenant_id=str(tenant.id), provider_id=str(provider.id))
@patch("api.decorators.rls_transaction")
@patch("api.decorators.Provider.objects.filter")
def test_integrity_error_provider_deleted(
self, mock_filter, mock_rls, tenants_fixture
):
"""Raises ProviderDeletedException on IntegrityError when provider deleted."""
tenant = tenants_fixture[0]
deleted_provider_id = str(uuid.uuid4())
mock_rls.return_value.__enter__ = lambda s: None
mock_rls.return_value.__exit__ = lambda s, *args: None
mock_filter.return_value.exists.return_value = False
@handle_provider_deletion
def task_func(**kwargs):
raise IntegrityError("FK constraint violation")
with pytest.raises(ProviderDeletedException):
task_func(tenant_id=str(tenant.id), provider_id=deleted_provider_id)
def test_missing_provider_and_scan_raises_assertion(self, tenants_fixture):
"""Raises AssertionError when neither provider_id nor scan_id in kwargs."""
@handle_provider_deletion
def task_func(**kwargs):
raise ObjectDoesNotExist("Some object not found")
with pytest.raises(AssertionError) as exc_info:
task_func(tenant_id=str(tenants_fixture[0].id))
assert "provider or scan" in str(exc_info.value)
File diff suppressed because it is too large Load Diff
@@ -40,16 +40,11 @@ class BedrockCredentialsSerializer(serializers.Serializer):
"""
Serializer for AWS Bedrock credentials validation.
Supports two authentication methods:
1. AWS access key + secret key
2. Bedrock API key (bearer token)
In both cases, region is mandatory.
Validates long-term AWS credentials (AKIA) and region format.
"""
access_key_id = serializers.CharField(required=False, allow_blank=False)
secret_access_key = serializers.CharField(required=False, allow_blank=False)
api_key = serializers.CharField(required=False, allow_blank=False)
access_key_id = serializers.CharField()
secret_access_key = serializers.CharField()
region = serializers.CharField()
def validate_access_key_id(self, value: str) -> str:
@@ -70,15 +65,6 @@ class BedrockCredentialsSerializer(serializers.Serializer):
)
return value
def validate_api_key(self, value: str) -> str:
"""
Validate Bedrock API key (bearer token).
"""
pattern = r"^ABSKQmVkcm9ja0FQSUtleS[A-Za-z0-9+/=]{110}$"
if not re.match(pattern, value or ""):
raise serializers.ValidationError("Invalid Bedrock API key format.")
return value
def validate_region(self, value: str) -> str:
"""Validate AWS region format."""
pattern = r"^[a-z]{2}-[a-z]+-\d+$"
@@ -88,50 +74,6 @@ class BedrockCredentialsSerializer(serializers.Serializer):
)
return value
def validate(self, attrs):
"""
Enforce either:
- access_key_id + secret_access_key + region
OR
- api_key + region
"""
access_key_id = attrs.get("access_key_id")
secret_access_key = attrs.get("secret_access_key")
api_key = attrs.get("api_key")
region = attrs.get("region")
errors = {}
if not region:
errors["region"] = ["Region is required."]
using_access_keys = bool(access_key_id or secret_access_key)
using_api_key = api_key is not None and api_key != ""
if using_access_keys and using_api_key:
errors["non_field_errors"] = [
"Provide either access key + secret key OR api key, not both."
]
elif not using_access_keys and not using_api_key:
errors["non_field_errors"] = [
"You must provide either access key + secret key OR api key."
]
elif using_access_keys:
# Both access_key_id and secret_access_key must be present together
if not access_key_id:
errors.setdefault("access_key_id", []).append(
"AWS access key ID is required when using access key authentication."
)
if not secret_access_key:
errors.setdefault("secret_access_key", []).append(
"AWS secret access key is required when using access key authentication."
)
if errors:
raise serializers.ValidationError(errors)
return attrs
def to_internal_value(self, data):
"""Check for unknown fields before DRF filters them out."""
if not isinstance(data, dict):
@@ -169,15 +111,6 @@ class BedrockCredentialsUpdateSerializer(BedrockCredentialsSerializer):
for field in self.fields.values():
field.required = False
def validate(self, attrs):
"""
For updates, this serializer only checks individual fields.
It does NOT enforce the "either access keys OR api key" rule.
That rule is applied later, after merging with existing stored
credentials, in LighthouseProviderConfigUpdateSerializer.
"""
return attrs
class OpenAICompatibleCredentialsSerializer(serializers.Serializer):
"""
@@ -235,51 +168,27 @@ class OpenAICompatibleCredentialsSerializer(serializers.Serializer):
"required": ["api_key"],
},
{
"type": "object",
"title": "AWS Bedrock Credentials",
"oneOf": [
{
"title": "IAM Access Key Pair",
"type": "object",
"description": "Authenticate with AWS access key and secret key. Recommended when you manage IAM users or roles.",
"properties": {
"access_key_id": {
"type": "string",
"description": "AWS access key ID.",
"pattern": "^AKIA[0-9A-Z]{16}$",
},
"secret_access_key": {
"type": "string",
"description": "AWS secret access key.",
"pattern": "^[A-Za-z0-9/+=]{40}$",
},
"region": {
"type": "string",
"description": "AWS region identifier where Bedrock is available. Examples: us-east-1, "
"us-west-2, eu-west-1, ap-northeast-1.",
"pattern": "^[a-z]{2}-[a-z]+-\\d+$",
},
},
"required": ["access_key_id", "secret_access_key", "region"],
"properties": {
"access_key_id": {
"type": "string",
"description": "AWS access key ID.",
"pattern": "^AKIA[0-9A-Z]{16}$",
},
{
"title": "Amazon Bedrock API Key",
"type": "object",
"description": "Authenticate with an Amazon Bedrock API key (bearer token). Region is still required.",
"properties": {
"api_key": {
"type": "string",
"description": "Amazon Bedrock API key (bearer token).",
},
"region": {
"type": "string",
"description": "AWS region identifier where Bedrock is available. Examples: us-east-1, "
"us-west-2, eu-west-1, ap-northeast-1.",
"pattern": "^[a-z]{2}-[a-z]+-\\d+$",
},
},
"required": ["api_key", "region"],
"secret_access_key": {
"type": "string",
"description": "AWS secret access key.",
"pattern": "^[A-Za-z0-9/+=]{40}$",
},
],
"region": {
"type": "string",
"description": "AWS region identifier where Bedrock is available. Examples: us-east-1, "
"us-west-2, eu-west-1, ap-northeast-1.",
"pattern": "^[a-z]{2}-[a-z]+-\\d+$",
},
},
"required": ["access_key_id", "secret_access_key", "region"],
},
{
"type": "object",
+173 -128
View File
@@ -21,6 +21,7 @@ from rest_framework_simplejwt.tokens import RefreshToken
from api.db_router import MainRouter
from api.exceptions import ConflictException
from api.models import (
AttackPathsScan,
Finding,
Integration,
IntegrationProviderRelationship,
@@ -72,42 +73,6 @@ from api.v1.serializer_utils.processors import ProcessorConfigField
from api.v1.serializer_utils.providers import ProviderSecretField
from prowler.lib.mutelist.mutelist import Mutelist
# Base
class BaseModelSerializerV1(serializers.ModelSerializer):
def get_root_meta(self, _resource, _many):
return {"version": "v1"}
class BaseSerializerV1(serializers.Serializer):
def get_root_meta(self, _resource, _many):
return {"version": "v1"}
class BaseWriteSerializer(BaseModelSerializerV1):
def validate(self, data):
if hasattr(self, "initial_data"):
initial_data = set(self.initial_data.keys()) - {"id", "type"}
unknown_keys = initial_data - set(self.fields.keys())
if unknown_keys:
raise ValidationError(f"Invalid fields: {unknown_keys}")
return data
class RLSSerializer(BaseModelSerializerV1):
def create(self, validated_data):
tenant_id = self.context.get("tenant_id")
validated_data["tenant_id"] = tenant_id
return super().create(validated_data)
class StateEnumSerializerField(serializers.ChoiceField):
def __init__(self, **kwargs):
kwargs["choices"] = StateChoices.choices
super().__init__(**kwargs)
# Tokens
@@ -215,7 +180,7 @@ class TokenSocialLoginSerializer(BaseTokenSerializer):
# TODO: Check if we can change the parent class to TokenRefreshSerializer from rest_framework_simplejwt.serializers
class TokenRefreshSerializer(BaseSerializerV1):
class TokenRefreshSerializer(serializers.Serializer):
refresh = serializers.CharField()
# Output token
@@ -249,7 +214,7 @@ class TokenRefreshSerializer(BaseSerializerV1):
raise ValidationError({"refresh": "Invalid or expired token"})
class TokenSwitchTenantSerializer(BaseSerializerV1):
class TokenSwitchTenantSerializer(serializers.Serializer):
tenant_id = serializers.UUIDField(
write_only=True, help_text="The tenant ID for which to request a new token."
)
@@ -273,10 +238,41 @@ class TokenSwitchTenantSerializer(BaseSerializerV1):
return generate_tokens(user, tenant_id)
# Base
class BaseSerializerV1(serializers.ModelSerializer):
def get_root_meta(self, _resource, _many):
return {"version": "v1"}
class BaseWriteSerializer(BaseSerializerV1):
def validate(self, data):
if hasattr(self, "initial_data"):
initial_data = set(self.initial_data.keys()) - {"id", "type"}
unknown_keys = initial_data - set(self.fields.keys())
if unknown_keys:
raise ValidationError(f"Invalid fields: {unknown_keys}")
return data
class RLSSerializer(BaseSerializerV1):
def create(self, validated_data):
tenant_id = self.context.get("tenant_id")
validated_data["tenant_id"] = tenant_id
return super().create(validated_data)
class StateEnumSerializerField(serializers.ChoiceField):
def __init__(self, **kwargs):
kwargs["choices"] = StateChoices.choices
super().__init__(**kwargs)
# Users
class UserSerializer(BaseModelSerializerV1):
class UserSerializer(BaseSerializerV1):
"""
Serializer for the User model.
"""
@@ -407,7 +403,7 @@ class UserUpdateSerializer(BaseWriteSerializer):
return super().update(instance, validated_data)
class RoleResourceIdentifierSerializer(BaseSerializerV1):
class RoleResourceIdentifierSerializer(serializers.Serializer):
resource_type = serializers.CharField(source="type")
id = serializers.UUIDField()
@@ -590,7 +586,7 @@ class TaskSerializer(RLSSerializer, TaskBase):
# Tenants
class TenantSerializer(BaseModelSerializerV1):
class TenantSerializer(BaseSerializerV1):
"""
Serializer for the Tenant model.
"""
@@ -602,7 +598,7 @@ class TenantSerializer(BaseModelSerializerV1):
fields = ["id", "name", "memberships"]
class TenantIncludeSerializer(BaseModelSerializerV1):
class TenantIncludeSerializer(BaseSerializerV1):
class Meta:
model = Tenant
fields = ["id", "name"]
@@ -778,7 +774,7 @@ class ProviderGroupUpdateSerializer(ProviderGroupSerializer):
return super().update(instance, validated_data)
class ProviderResourceIdentifierSerializer(BaseSerializerV1):
class ProviderResourceIdentifierSerializer(serializers.Serializer):
resource_type = serializers.CharField(source="type")
id = serializers.UUIDField()
@@ -1115,7 +1111,7 @@ class ScanTaskSerializer(RLSSerializer):
]
class ScanReportSerializer(BaseSerializerV1):
class ScanReportSerializer(serializers.Serializer):
id = serializers.CharField(source="scan")
class Meta:
@@ -1123,7 +1119,7 @@ class ScanReportSerializer(BaseSerializerV1):
fields = ["id"]
class ScanComplianceReportSerializer(BaseSerializerV1):
class ScanComplianceReportSerializer(serializers.Serializer):
id = serializers.CharField(source="scan")
name = serializers.CharField()
@@ -1132,6 +1128,109 @@ class ScanComplianceReportSerializer(BaseSerializerV1):
fields = ["id", "name"]
class AttackPathsScanSerializer(RLSSerializer):
state = StateEnumSerializerField(read_only=True)
provider_alias = serializers.SerializerMethodField(read_only=True)
provider_type = serializers.SerializerMethodField(read_only=True)
provider_uid = serializers.SerializerMethodField(read_only=True)
class Meta:
model = AttackPathsScan
fields = [
"id",
"state",
"progress",
"provider",
"provider_alias",
"provider_type",
"provider_uid",
"scan",
"task",
"inserted_at",
"started_at",
"completed_at",
"duration",
]
included_serializers = {
"provider": "api.v1.serializers.ProviderIncludeSerializer",
"scan": "api.v1.serializers.ScanIncludeSerializer",
"task": "api.v1.serializers.TaskSerializer",
}
def get_provider_alias(self, obj):
provider = getattr(obj, "provider", None)
return provider.alias if provider else None
def get_provider_type(self, obj):
provider = getattr(obj, "provider", None)
return provider.provider if provider else None
def get_provider_uid(self, obj):
provider = getattr(obj, "provider", None)
return provider.uid if provider else None
class AttackPathsQueryParameterSerializer(serializers.Serializer):
name = serializers.CharField()
label = serializers.CharField()
data_type = serializers.CharField(default="string")
description = serializers.CharField(allow_null=True, required=False)
placeholder = serializers.CharField(allow_null=True, required=False)
class JSONAPIMeta:
resource_name = "attack-paths-query-parameter"
class AttackPathsQuerySerializer(serializers.Serializer):
id = serializers.CharField()
name = serializers.CharField()
description = serializers.CharField()
provider = serializers.CharField()
parameters = AttackPathsQueryParameterSerializer(many=True)
class JSONAPIMeta:
resource_name = "attack-paths-query"
class AttackPathsQueryRunRequestSerializer(serializers.Serializer):
id = serializers.CharField()
parameters = serializers.DictField(
child=serializers.JSONField(), allow_empty=True, required=False
)
class JSONAPIMeta:
resource_name = "attack-paths-query-run-request"
class AttackPathsNodeSerializer(serializers.Serializer):
id = serializers.CharField()
labels = serializers.ListField(child=serializers.CharField())
properties = serializers.DictField(child=serializers.JSONField())
class JSONAPIMeta:
resource_name = "attack-paths-query-result-node"
class AttackPathsRelationshipSerializer(serializers.Serializer):
id = serializers.CharField()
label = serializers.CharField()
source = serializers.CharField()
target = serializers.CharField()
properties = serializers.DictField(child=serializers.JSONField())
class JSONAPIMeta:
resource_name = "attack-paths-query-result-relationship"
class AttackPathsQueryResultSerializer(serializers.Serializer):
nodes = AttackPathsNodeSerializer(many=True)
relationships = AttackPathsRelationshipSerializer(many=True)
class JSONAPIMeta:
resource_name = "attack-paths-query-result"
class ResourceTagSerializer(RLSSerializer):
"""
Serializer for the ResourceTag model
@@ -1272,7 +1371,7 @@ class ResourceIncludeSerializer(RLSSerializer):
return fields
class ResourceMetadataSerializer(BaseSerializerV1):
class ResourceMetadataSerializer(serializers.Serializer):
services = serializers.ListField(child=serializers.CharField(), allow_empty=True)
regions = serializers.ListField(child=serializers.CharField(), allow_empty=True)
types = serializers.ListField(child=serializers.CharField(), allow_empty=True)
@@ -1342,7 +1441,7 @@ class FindingIncludeSerializer(RLSSerializer):
# To be removed when the related endpoint is removed as well
class FindingDynamicFilterSerializer(BaseSerializerV1):
class FindingDynamicFilterSerializer(serializers.Serializer):
services = serializers.ListField(child=serializers.CharField(), allow_empty=True)
regions = serializers.ListField(child=serializers.CharField(), allow_empty=True)
@@ -1350,7 +1449,7 @@ class FindingDynamicFilterSerializer(BaseSerializerV1):
resource_name = "finding-dynamic-filters"
class FindingMetadataSerializer(BaseSerializerV1):
class FindingMetadataSerializer(serializers.Serializer):
services = serializers.ListField(child=serializers.CharField(), allow_empty=True)
regions = serializers.ListField(child=serializers.CharField(), allow_empty=True)
resource_types = serializers.ListField(
@@ -2044,7 +2143,7 @@ class RoleProviderGroupRelationshipSerializer(RLSSerializer, BaseWriteSerializer
# Compliance overview
class ComplianceOverviewSerializer(BaseSerializerV1):
class ComplianceOverviewSerializer(serializers.Serializer):
"""
Serializer for compliance requirement status aggregated by compliance framework.
@@ -2066,7 +2165,7 @@ class ComplianceOverviewSerializer(BaseSerializerV1):
resource_name = "compliance-overviews"
class ComplianceOverviewDetailSerializer(BaseSerializerV1):
class ComplianceOverviewDetailSerializer(serializers.Serializer):
"""
Serializer for detailed compliance requirement information.
@@ -2095,7 +2194,7 @@ class ComplianceOverviewDetailThreatscoreSerializer(ComplianceOverviewDetailSeri
total_findings = serializers.IntegerField()
class ComplianceOverviewAttributesSerializer(BaseSerializerV1):
class ComplianceOverviewAttributesSerializer(serializers.Serializer):
id = serializers.CharField()
compliance_name = serializers.CharField()
framework_description = serializers.CharField()
@@ -2109,7 +2208,7 @@ class ComplianceOverviewAttributesSerializer(BaseSerializerV1):
resource_name = "compliance-requirements-attributes"
class ComplianceOverviewMetadataSerializer(BaseSerializerV1):
class ComplianceOverviewMetadataSerializer(serializers.Serializer):
regions = serializers.ListField(child=serializers.CharField(), allow_empty=True)
class JSONAPIMeta:
@@ -2119,7 +2218,7 @@ class ComplianceOverviewMetadataSerializer(BaseSerializerV1):
# Overviews
class OverviewProviderSerializer(BaseSerializerV1):
class OverviewProviderSerializer(serializers.Serializer):
id = serializers.CharField(source="provider")
findings = serializers.SerializerMethodField(read_only=True)
resources = serializers.SerializerMethodField(read_only=True)
@@ -2127,6 +2226,9 @@ class OverviewProviderSerializer(BaseSerializerV1):
class JSONAPIMeta:
resource_name = "providers-overview"
def get_root_meta(self, _resource, _many):
return {"version": "v1"}
@extend_schema_field(
{
"type": "object",
@@ -2160,15 +2262,18 @@ class OverviewProviderSerializer(BaseSerializerV1):
}
class OverviewProviderCountSerializer(BaseSerializerV1):
class OverviewProviderCountSerializer(serializers.Serializer):
id = serializers.CharField(source="provider")
count = serializers.IntegerField()
class JSONAPIMeta:
resource_name = "providers-count-overview"
def get_root_meta(self, _resource, _many):
return {"version": "v1"}
class OverviewFindingSerializer(BaseSerializerV1):
class OverviewFindingSerializer(serializers.Serializer):
id = serializers.CharField(default="n/a")
new = serializers.IntegerField()
changed = serializers.IntegerField()
@@ -2187,12 +2292,15 @@ class OverviewFindingSerializer(BaseSerializerV1):
class JSONAPIMeta:
resource_name = "findings-overview"
def get_root_meta(self, _resource, _many):
return {"version": "v1"}
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.fields["pass"] = self.fields.pop("_pass")
class OverviewSeveritySerializer(BaseSerializerV1):
class OverviewSeveritySerializer(serializers.Serializer):
id = serializers.CharField(default="n/a")
critical = serializers.IntegerField()
high = serializers.IntegerField()
@@ -2203,24 +2311,11 @@ class OverviewSeveritySerializer(BaseSerializerV1):
class JSONAPIMeta:
resource_name = "findings-severity-overview"
class FindingsSeverityOverTimeSerializer(BaseSerializerV1):
"""Serializer for daily findings severity trend data."""
id = serializers.DateField(source="date")
critical = serializers.IntegerField()
high = serializers.IntegerField()
medium = serializers.IntegerField()
low = serializers.IntegerField()
informational = serializers.IntegerField()
muted = serializers.IntegerField()
scan_ids = serializers.ListField(child=serializers.UUIDField())
class JSONAPIMeta:
resource_name = "findings-severity-over-time"
def get_root_meta(self, _resource, _many):
return {"version": "v1"}
class OverviewServiceSerializer(BaseSerializerV1):
class OverviewServiceSerializer(serializers.Serializer):
id = serializers.CharField(source="service")
total = serializers.IntegerField()
_pass = serializers.IntegerField()
@@ -2234,20 +2329,8 @@ class OverviewServiceSerializer(BaseSerializerV1):
super().__init__(*args, **kwargs)
self.fields["pass"] = self.fields.pop("_pass")
class AttackSurfaceOverviewSerializer(BaseSerializerV1):
"""Serializer for attack surface overview aggregations."""
id = serializers.CharField(source="attack_surface_type")
total_findings = serializers.IntegerField()
failed_findings = serializers.IntegerField()
muted_failed_findings = serializers.IntegerField()
check_ids = serializers.ListField(
child=serializers.CharField(), allow_empty=True, default=list, read_only=True
)
class JSONAPIMeta:
resource_name = "attack-surface-overviews"
def get_root_meta(self, _resource, _many):
return {"version": "v1"}
class OverviewRegionSerializer(serializers.Serializer):
@@ -2277,7 +2360,7 @@ class OverviewRegionSerializer(serializers.Serializer):
# Schedules
class ScheduleDailyCreateSerializer(BaseSerializerV1):
class ScheduleDailyCreateSerializer(serializers.Serializer):
provider_id = serializers.UUIDField(required=True)
class JSONAPIMeta:
@@ -2613,7 +2696,7 @@ class IntegrationUpdateSerializer(BaseWriteIntegrationSerializer):
return representation
class IntegrationJiraDispatchSerializer(BaseSerializerV1):
class IntegrationJiraDispatchSerializer(serializers.Serializer):
"""
Serializer for dispatching findings to JIRA integration.
"""
@@ -2776,14 +2859,14 @@ class ProcessorUpdateSerializer(BaseWriteSerializer):
# SSO
class SamlInitiateSerializer(BaseSerializerV1):
class SamlInitiateSerializer(serializers.Serializer):
email_domain = serializers.CharField()
class JSONAPIMeta:
resource_name = "saml-initiate"
class SamlMetadataSerializer(BaseSerializerV1):
class SamlMetadataSerializer(serializers.Serializer):
class JSONAPIMeta:
resource_name = "saml-meta"
@@ -3315,19 +3398,6 @@ class LighthouseProviderConfigUpdateSerializer(BaseWriteSerializer):
and provider_type
== LighthouseProviderConfiguration.LLMProviderChoices.BEDROCK
):
# For updates, enforce that the authentication method (access keys vs API key)
# is immutable. To switch methods, the UI must delete and recreate the provider.
existing_credentials = (
self.instance.credentials_decoded if self.instance else {}
) or {}
existing_uses_api_key = "api_key" in existing_credentials
existing_uses_access_keys = any(
k in existing_credentials
for k in ("access_key_id", "secret_access_key")
)
# First run field-level validation on the partial payload
try:
BedrockCredentialsUpdateSerializer(data=credentials).is_valid(
raise_exception=True
@@ -3338,31 +3408,6 @@ class LighthouseProviderConfigUpdateSerializer(BaseWriteSerializer):
e.detail[f"credentials/{key}"] = value
del e.detail[key]
raise e
# Then enforce invariants about not changing the auth method
# If the existing config uses an API key, forbid introducing access keys.
if existing_uses_api_key and any(
k in credentials for k in ("access_key_id", "secret_access_key")
):
raise ValidationError(
{
"credentials/non_field_errors": [
"Cannot change Bedrock authentication method from API key "
"to access key via update. Delete and recreate the provider instead."
]
}
)
# If the existing config uses access keys, forbid introducing an API key.
if existing_uses_access_keys and "api_key" in credentials:
raise ValidationError(
{
"credentials/non_field_errors": [
"Cannot change Bedrock authentication method from access key "
"to API key via update. Delete and recreate the provider instead."
]
}
)
elif (
credentials is not None
and provider_type
+4
View File
@@ -4,6 +4,7 @@ from drf_spectacular.views import SpectacularRedocView
from rest_framework_nested import routers
from api.v1.views import (
AttackPathsScanViewSet,
ComplianceOverviewViewSet,
CustomSAMLLoginView,
CustomTokenObtainView,
@@ -53,6 +54,9 @@ router.register(r"tenants", TenantViewSet, basename="tenant")
router.register(r"providers", ProviderViewSet, basename="provider")
router.register(r"provider-groups", ProviderGroupViewSet, basename="providergroup")
router.register(r"scans", ScanViewSet, basename="scan")
router.register(
r"attack-paths-scans", AttackPathsScanViewSet, basename="attack-paths-scans"
)
router.register(r"tasks", TaskViewSet, basename="task")
router.register(r"resources", ResourceViewSet, basename="resource")
router.register(r"findings", FindingViewSet, basename="finding")
File diff suppressed because it is too large Load Diff
+1
View File
@@ -1,6 +1,7 @@
import warnings
from celery import Celery, Task
from config.env import env
# Suppress specific warnings from django-rest-auth: https://github.com/iMerica/dj-rest-auth/issues/684
+5 -7
View File
@@ -36,13 +36,11 @@ DATABASES = {
"HOST": env("POSTGRES_REPLICA_HOST", default=default_db_host),
"PORT": env("POSTGRES_REPLICA_PORT", default=default_db_port),
},
"admin_replica": {
"ENGINE": "psqlextra.backend",
"NAME": env("POSTGRES_REPLICA_DB", default=default_db_name),
"USER": env("POSTGRES_ADMIN_USER", default="prowler"),
"PASSWORD": env("POSTGRES_ADMIN_PASSWORD", default="S3cret"),
"HOST": env("POSTGRES_REPLICA_HOST", default=default_db_host),
"PORT": env("POSTGRES_REPLICA_PORT", default=default_db_port),
"neo4j": {
"HOST": env.str("NEO4J_HOST", "neo4j"),
"PORT": env.str("NEO4J_PORT", "7687"),
"USER": env.str("NEO4J_USER", "neo4j"),
"PASSWORD": env.str("NEO4J_PASSWORD", "neo4j_password"),
},
}
+5 -7
View File
@@ -37,13 +37,11 @@ DATABASES = {
"HOST": env("POSTGRES_REPLICA_HOST", default=default_db_host),
"PORT": env("POSTGRES_REPLICA_PORT", default=default_db_port),
},
"admin_replica": {
"ENGINE": "psqlextra.backend",
"NAME": env("POSTGRES_REPLICA_DB", default=default_db_name),
"USER": env("POSTGRES_ADMIN_USER"),
"PASSWORD": env("POSTGRES_ADMIN_PASSWORD"),
"HOST": env("POSTGRES_REPLICA_HOST", default=default_db_host),
"PORT": env("POSTGRES_REPLICA_PORT", default=default_db_port),
"neo4j": {
"HOST": env.str("NEO4J_HOST"),
"PORT": env.str("NEO4J_PORT"),
"USER": env.str("NEO4J_USER"),
"PASSWORD": env.str("NEO4J_PASSWORD"),
},
}
-2
View File
@@ -19,8 +19,6 @@ PORT = env("DJANGO_PORT", default=8000)
# Server settings
bind = f"{BIND_ADDRESS}:{PORT}"
# TODO: Remove after the category filter is implemented
limit_request_line = 0
workers = env.int("DJANGO_WORKERS", default=multiprocessing.cpu_count() * 2 + 1)
reload = DEBUG
@@ -5,9 +5,6 @@ IGNORED_EXCEPTIONS = [
# Provider is not connected due to credentials errors
"is not connected",
"ProviderConnectionError",
# Provider was deleted during a scan
"ProviderDeletedException",
"violates foreign key constraint",
# Authentication Errors from AWS
"InvalidToken",
"AccessDeniedException",
+105 -17
View File
@@ -1,8 +1,11 @@
import logging
from types import SimpleNamespace
from datetime import datetime, timedelta, timezone
from unittest.mock import MagicMock, patch
import pytest
from allauth.socialaccount.models import SocialLogin
from django.conf import settings
from django.db import connection as django_connection
@@ -11,11 +14,14 @@ from django.urls import reverse
from django_celery_results.models import TaskResult
from rest_framework import status
from rest_framework.test import APIClient
from tasks.jobs.backfill import backfill_resource_scan_summaries
from api.attack_paths import (
AttackPathsQueryDefinition,
AttackPathsQueryParameterDefinition,
)
from api.db_utils import rls_transaction
from api.models import (
AttackSurfaceOverview,
AttackPathsScan,
ComplianceOverview,
ComplianceRequirementOverview,
Finding,
@@ -48,6 +54,7 @@ from api.rls import Tenant
from api.v1.serializers import TokenSerializer
from prowler.lib.check.models import Severity
from prowler.lib.outputs.finding import Status
from tasks.jobs.backfill import backfill_resource_scan_summaries
TODAY = str(datetime.today().date())
API_JSON_CONTENT_TYPE = "application/vnd.api+json"
@@ -160,22 +167,20 @@ def create_test_user_rbac_no_roles(django_db_setup, django_db_blocker, tenants_f
@pytest.fixture(scope="function")
def create_test_user_rbac_limited(django_db_setup, django_db_blocker):
def create_test_user_rbac_limited(django_db_setup, django_db_blocker, tenants_fixture):
with django_db_blocker.unblock():
user = User.objects.create_user(
name="testing_limited",
email="rbac_limited@rbac.com",
password=TEST_PASSWORD,
)
tenant = Tenant.objects.create(
name="Tenant Test",
)
tenant = tenants_fixture[0]
Membership.objects.create(
user=user,
tenant=tenant,
role=Membership.RoleChoices.OWNER,
)
Role.objects.create(
role = Role.objects.create(
name="limited",
tenant_id=tenant.id,
manage_users=False,
@@ -188,7 +193,7 @@ def create_test_user_rbac_limited(django_db_setup, django_db_blocker):
)
UserRoleRelationship.objects.create(
user=user,
role=Role.objects.get(name="limited"),
role=role,
tenant_id=tenant.id,
)
return user
@@ -1471,20 +1476,103 @@ def mute_rules_fixture(tenants_fixture, create_test_user, findings_fixture):
@pytest.fixture
def create_attack_surface_overview():
def _create(tenant, scan, attack_surface_type, total=10, failed=5, muted_failed=2):
return AttackSurfaceOverview.objects.create(
tenant=tenant,
scan=scan,
attack_surface_type=attack_surface_type,
total_findings=total,
failed_findings=failed,
muted_failed_findings=muted_failed,
def create_attack_paths_scan():
"""Factory fixture to create Attack Paths scans for tests."""
def _create(
provider,
*,
scan=None,
state=StateChoices.COMPLETED,
progress=0,
graph_database="tenant-db",
**extra_fields,
):
scan_instance = scan or Scan.objects.create(
name=extra_fields.pop("scan_name", "Attack Paths Supporting Scan"),
provider=provider,
trigger=Scan.TriggerChoices.MANUAL,
state=extra_fields.pop("scan_state", StateChoices.COMPLETED),
tenant_id=provider.tenant_id,
)
payload = {
"tenant_id": provider.tenant_id,
"provider": provider,
"scan": scan_instance,
"state": state,
"progress": progress,
"graph_database": graph_database,
}
payload.update(extra_fields)
return AttackPathsScan.objects.create(**payload)
return _create
@pytest.fixture
def attack_paths_query_definition_factory():
"""Factory fixture for building Attack Paths query definitions."""
def _create(**overrides):
cast_type = overrides.pop("cast_type", str)
parameters = overrides.pop(
"parameters",
[
AttackPathsQueryParameterDefinition(
name="limit",
label="Limit",
cast=cast_type,
)
],
)
definition_payload = {
"id": "aws-test",
"name": "Attack Paths Test Query",
"description": "Synthetic Attack Paths definition for tests.",
"provider": "aws",
"cypher": "RETURN 1",
"parameters": parameters,
}
definition_payload.update(overrides)
return AttackPathsQueryDefinition(**definition_payload)
return _create
@pytest.fixture
def attack_paths_graph_stub_classes():
"""Provide lightweight graph element stubs for Attack Paths serialization tests."""
class AttackPathsNativeValue:
def __init__(self, value):
self._value = value
def to_native(self):
return self._value
class AttackPathsNode:
def __init__(self, element_id, labels, properties):
self.element_id = element_id
self.labels = labels
self._properties = properties
class AttackPathsRelationship:
def __init__(self, element_id, rel_type, start_node, end_node, properties):
self.element_id = element_id
self.type = rel_type
self.start_node = start_node
self.end_node = end_node
self._properties = properties
return SimpleNamespace(
NativeValue=AttackPathsNativeValue,
Node=AttackPathsNode,
Relationship=AttackPathsRelationship,
)
def get_authorization_header(access_token: str) -> dict:
return {"Authorization": f"Bearer {access_token}"}
+7
View File
@@ -7,6 +7,7 @@ from tasks.tasks import perform_scheduled_scan_task
from api.db_utils import rls_transaction
from api.exceptions import ConflictException
from api.models import Provider, Scan, StateChoices
from tasks.jobs.attack_paths import db_utils as attack_paths_db_utils
def schedule_provider_scan(provider_instance: Provider):
@@ -39,6 +40,12 @@ def schedule_provider_scan(provider_instance: Provider):
scheduled_at=datetime.now(timezone.utc),
)
attack_paths_db_utils.create_attack_paths_scan(
tenant_id=tenant_id,
scan_id=str(scheduled_scan.id),
provider_id=provider_id,
)
# Schedule the task
periodic_task_instance = PeriodicTask.objects.create(
interval=schedule,
@@ -0,0 +1,5 @@
from tasks.jobs.attack_paths.scan import run as attack_paths_scan
__all__ = [
"attack_paths_scan",
]
@@ -0,0 +1,237 @@
# Portions of this file are based on code from the Cartography project
# (https://github.com/cartography-cncf/cartography), which is licensed under the Apache 2.0 License.
from typing import Any
import aioboto3
import boto3
import neo4j
from cartography.config import Config as CartographyConfig
from cartography.intel import aws as cartography_aws
from celery.utils.log import get_task_logger
from api.models import (
AttackPathsScan as ProwlerAPIAttackPathsScan,
Provider as ProwlerAPIProvider,
)
from prowler.providers.common.provider import Provider as ProwlerSDKProvider
from tasks.jobs.attack_paths import db_utils, utils
logger = get_task_logger(__name__)
def start_aws_ingestion(
neo4j_session: neo4j.Session,
cartography_config: CartographyConfig,
prowler_api_provider: ProwlerAPIProvider,
prowler_sdk_provider: ProwlerSDKProvider,
attack_paths_scan: ProwlerAPIAttackPathsScan,
) -> dict[str, dict[str, str]]:
"""
Code based on Cartography version 0.122.0, specifically on `cartography.intel.aws.__init__.py`.
For the scan progress updates:
- The caller of this function (`tasks.jobs.attack_paths.scan.run`) has set it to 2.
- When the control returns to the caller, it will be set to 95.
"""
# Initialize variables common to all jobs
common_job_parameters = {
"UPDATE_TAG": cartography_config.update_tag,
"permission_relationships_file": cartography_config.permission_relationships_file,
"aws_guardduty_severity_threshold": cartography_config.aws_guardduty_severity_threshold,
"aws_cloudtrail_management_events_lookback_hours": cartography_config.aws_cloudtrail_management_events_lookback_hours,
"experimental_aws_inspector_batch": cartography_config.experimental_aws_inspector_batch,
}
boto3_session = get_boto3_session(prowler_api_provider, prowler_sdk_provider)
regions: list[str] = list(prowler_sdk_provider._enabled_regions)
requested_syncs = list(cartography_aws.RESOURCE_FUNCTIONS.keys())
sync_args = cartography_aws._build_aws_sync_kwargs(
neo4j_session,
boto3_session,
regions,
prowler_api_provider.uid,
cartography_config.update_tag,
common_job_parameters,
)
# Starting with sync functions
cartography_aws.organizations.sync(
neo4j_session,
{prowler_api_provider.alias: prowler_api_provider.uid},
cartography_config.update_tag,
common_job_parameters,
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 3)
# Adding an extra field
common_job_parameters["AWS_ID"] = prowler_api_provider.uid
cartography_aws._autodiscover_accounts(
neo4j_session,
boto3_session,
prowler_api_provider.uid,
cartography_config.update_tag,
common_job_parameters,
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 4)
failed_syncs = sync_aws_account(
prowler_api_provider, requested_syncs, sync_args, attack_paths_scan
)
if "permission_relationships" in requested_syncs:
cartography_aws.RESOURCE_FUNCTIONS["permission_relationships"](**sync_args)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 88)
if "resourcegroupstaggingapi" in requested_syncs:
cartography_aws.RESOURCE_FUNCTIONS["resourcegroupstaggingapi"](**sync_args)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 89)
cartography_aws.run_scoped_analysis_job(
"aws_ec2_iaminstanceprofile.json",
neo4j_session,
common_job_parameters,
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 90)
cartography_aws.run_analysis_job(
"aws_lambda_ecr.json",
neo4j_session,
common_job_parameters,
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 91)
cartography_aws.merge_module_sync_metadata(
neo4j_session,
group_type="AWSAccount",
group_id=prowler_api_provider.uid,
synced_type="AWSAccount",
update_tag=cartography_config.update_tag,
stat_handler=cartography_aws.stat_handler,
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 92)
# Removing the added extra field
del common_job_parameters["AWS_ID"]
cartography_aws.run_cleanup_job(
"aws_post_ingestion_principals_cleanup.json",
neo4j_session,
common_job_parameters,
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 93)
cartography_aws._perform_aws_analysis(
requested_syncs, neo4j_session, common_job_parameters
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 94)
return failed_syncs
def get_boto3_session(
prowler_api_provider: ProwlerAPIProvider, prowler_sdk_provider: ProwlerSDKProvider
) -> boto3.Session:
boto3_session = prowler_sdk_provider.session.current_session
aws_accounts_from_session = cartography_aws.organizations.get_aws_account_default(
boto3_session
)
if not aws_accounts_from_session:
raise Exception(
"No valid AWS credentials could be found. No AWS accounts can be synced."
)
aws_account_id_from_session = list(aws_accounts_from_session.values())[0]
if prowler_api_provider.uid != aws_account_id_from_session:
raise Exception(
f"Provider {prowler_api_provider.uid} doesn't match AWS account {aws_account_id_from_session}."
)
if boto3_session.region_name is None:
global_region = prowler_sdk_provider.get_global_region()
boto3_session._session.set_config_variable("region", global_region)
return boto3_session
def get_aioboto3_session(boto3_session: boto3.Session) -> aioboto3.Session:
return aioboto3.Session(botocore_session=boto3_session._session)
def sync_aws_account(
prowler_api_provider: ProwlerAPIProvider,
requested_syncs: list[str],
sync_args: dict[str, Any],
attack_paths_scan: ProwlerAPIAttackPathsScan,
) -> dict[str, str]:
current_progress = 4 # `cartography_aws._autodiscover_accounts`
max_progress = (
87 # `cartography_aws.RESOURCE_FUNCTIONS["permission_relationships"]` - 1
)
n_steps = (
len(requested_syncs) - 2
) # Excluding `permission_relationships` and `resourcegroupstaggingapi`
progress_step = (max_progress - current_progress) / n_steps
failed_syncs = {}
for func_name in requested_syncs:
if func_name in cartography_aws.RESOURCE_FUNCTIONS:
logger.info(
f"Syncing function {func_name} for AWS account {prowler_api_provider.uid}"
)
# Updating progress, not really the right place but good enough
current_progress += progress_step
db_utils.update_attack_paths_scan_progress(
attack_paths_scan, int(current_progress)
)
try:
# `ecr:image_layers` uses `aioboto3_session` instead of `boto3_session`
if func_name == "ecr:image_layers":
cartography_aws.RESOURCE_FUNCTIONS[func_name](
neo4j_session=sync_args.get("neo4j_session"),
aioboto3_session=get_aioboto3_session(
sync_args.get("boto3_session")
),
regions=sync_args.get("regions"),
current_aws_account_id=sync_args.get("current_aws_account_id"),
update_tag=sync_args.get("update_tag"),
common_job_parameters=sync_args.get("common_job_parameters"),
)
# Skip permission relationships and tags for now because they rely on data already being in the graph
elif func_name in [
"permission_relationships",
"resourcegroupstaggingapi",
]:
continue
else:
cartography_aws.RESOURCE_FUNCTIONS[func_name](**sync_args)
except Exception as e:
exception_message = utils.stringify_exception(
e, f"Exception for AWS sync function: {func_name}"
)
failed_syncs[func_name] = exception_message
logger.warning(
f"Caught exception syncing function {func_name} from AWS account {prowler_api_provider.uid}. We "
"are continuing on to the next AWS sync function.",
)
continue
else:
raise ValueError(
f'AWS sync function "{func_name}" was specified but does not exist. Did you misspell it?'
)
return failed_syncs
@@ -0,0 +1,158 @@
from datetime import datetime, timezone
from typing import Any
from uuid import UUID
from cartography.config import Config as CartographyConfig
from api.db_utils import rls_transaction
from api.models import (
AttackPathsScan as ProwlerAPIAttackPathsScan,
Provider as ProwlerAPIProvider,
StateChoices,
)
from tasks.jobs.attack_paths.providers import is_provider_available
def create_attack_paths_scan(
tenant_id: str,
scan_id: str,
provider_id: int,
) -> ProwlerAPIAttackPathsScan | None:
with rls_transaction(tenant_id):
prowler_api_provider = ProwlerAPIProvider.objects.get(id=provider_id)
if not is_provider_available(prowler_api_provider.provider):
return None
with rls_transaction(tenant_id):
attack_paths_scan = ProwlerAPIAttackPathsScan.objects.create(
tenant_id=tenant_id,
provider_id=provider_id,
scan_id=scan_id,
state=StateChoices.SCHEDULED,
started_at=datetime.now(tz=timezone.utc),
)
attack_paths_scan.save()
return attack_paths_scan
def retrieve_attack_paths_scan(
tenant_id: str,
scan_id: str,
) -> ProwlerAPIAttackPathsScan | None:
try:
with rls_transaction(tenant_id):
attack_paths_scan = ProwlerAPIAttackPathsScan.objects.get(
scan_id=scan_id,
)
return attack_paths_scan
except ProwlerAPIAttackPathsScan.DoesNotExist:
return None
def starting_attack_paths_scan(
attack_paths_scan: ProwlerAPIAttackPathsScan,
task_id: str,
cartography_config: CartographyConfig,
) -> None:
with rls_transaction(attack_paths_scan.tenant_id):
attack_paths_scan.task_id = task_id
attack_paths_scan.state = StateChoices.EXECUTING
attack_paths_scan.started_at = datetime.now(tz=timezone.utc)
attack_paths_scan.update_tag = cartography_config.update_tag
attack_paths_scan.graph_database = cartography_config.neo4j_database
attack_paths_scan.save(
update_fields=[
"task_id",
"state",
"started_at",
"update_tag",
"graph_database",
]
)
def finish_attack_paths_scan(
attack_paths_scan: ProwlerAPIAttackPathsScan,
state: StateChoices,
ingestion_exceptions: dict[str, Any],
) -> None:
with rls_transaction(attack_paths_scan.tenant_id):
now = datetime.now(tz=timezone.utc)
duration = int((now - attack_paths_scan.started_at).total_seconds())
attack_paths_scan.state = state
attack_paths_scan.progress = 100
attack_paths_scan.completed_at = now
attack_paths_scan.duration = duration
attack_paths_scan.ingestion_exceptions = ingestion_exceptions
attack_paths_scan.save(
update_fields=[
"state",
"progress",
"completed_at",
"duration",
"ingestion_exceptions",
]
)
def update_attack_paths_scan_progress(
attack_paths_scan: ProwlerAPIAttackPathsScan,
progress: int,
) -> None:
with rls_transaction(attack_paths_scan.tenant_id):
attack_paths_scan.progress = progress
attack_paths_scan.save(update_fields=["progress"])
def get_old_attack_paths_scans(
tenant_id: str,
provider_id: str,
attack_paths_scan_id: str,
) -> list[ProwlerAPIAttackPathsScan]:
"""
An `old_attack_paths_scan` is any `completed` Attack Paths scan for the same provider,
with its graph database not deleted, excluding the current Attack Paths scan.
"""
with rls_transaction(tenant_id):
completed_scans_qs = (
ProwlerAPIAttackPathsScan.objects.filter(
provider_id=provider_id,
state=StateChoices.COMPLETED,
is_graph_database_deleted=False,
)
.exclude(id=attack_paths_scan_id)
.all()
)
return list(completed_scans_qs)
def update_old_attack_paths_scan(
old_attack_paths_scan: ProwlerAPIAttackPathsScan,
) -> None:
with rls_transaction(old_attack_paths_scan.tenant_id):
old_attack_paths_scan.is_graph_database_deleted = True
old_attack_paths_scan.save(update_fields=["is_graph_database_deleted"])
def get_provider_graph_database_names(tenant_id: str, provider_id: str) -> list[str]:
"""
Return existing graph database names for a tenant/provider.
Note: For accesing the `AttackPathsScan` we need to use `all_objects` manager because the provider is soft-deleted.
"""
with rls_transaction(tenant_id):
graph_databases_names_qs = ProwlerAPIAttackPathsScan.all_objects.filter(
provider_id=provider_id,
is_graph_database_deleted=False,
).values_list("graph_database", flat=True)
return list(graph_databases_names_qs)
@@ -0,0 +1,23 @@
AVAILABLE_PROVIDERS: list[str] = [
"aws",
]
ROOT_NODE_LABELS: dict[str, str] = {
"aws": "AWSAccount",
}
NODE_UID_FIELDS: dict[str, str] = {
"aws": "arn",
}
def is_provider_available(provider_type: str) -> bool:
return provider_type in AVAILABLE_PROVIDERS
def get_root_node_label(provider_type: str) -> str:
return ROOT_NODE_LABELS.get(provider_type, "UnknownProviderAccount")
def get_node_uid_field(provider_type: str) -> str:
return NODE_UID_FIELDS.get(provider_type, "UnknownProviderUID")
@@ -0,0 +1,205 @@
import neo4j
from cartography.client.core.tx import run_write_query
from cartography.config import Config as CartographyConfig
from celery.utils.log import get_task_logger
from api.db_utils import rls_transaction
from api.models import Provider, ResourceFindingMapping
from config.env import env
from prowler.config import config as ProwlerConfig
from tasks.jobs.attack_paths.providers import get_node_uid_field, get_root_node_label
logger = get_task_logger(__name__)
BATCH_SIZE = env.int("NEO4J_INSERT_BATCH_SIZE", 500)
INDEX_STATEMENTS = [
"CREATE INDEX prowler_finding_id IF NOT EXISTS FOR (n:ProwlerFinding) ON (n.id);",
"CREATE INDEX prowler_finding_provider_uid IF NOT EXISTS FOR (n:ProwlerFinding) ON (n.provider_uid);",
"CREATE INDEX prowler_finding_lastupdated IF NOT EXISTS FOR (n:ProwlerFinding) ON (n.lastupdated);",
"CREATE INDEX prowler_finding_check_id IF NOT EXISTS FOR (n:ProwlerFinding) ON (n.status);",
]
INSERT_STATEMENT_TEMPLATE = """
UNWIND $findings_data AS finding_data
MATCH (account:__ROOT_NODE_LABEL__ {id: $provider_uid})
MATCH (account)-->(resource)
WHERE resource.__NODE_UID_FIELD__ = finding_data.resource_uid
OR resource.id = finding_data.resource_uid
MERGE (finding:ProwlerFinding {id: finding_data.id})
ON CREATE SET
finding.id = finding_data.id,
finding.uid = finding_data.uid,
finding.inserted_at = finding_data.inserted_at,
finding.updated_at = finding_data.updated_at,
finding.first_seen_at = finding_data.first_seen_at,
finding.scan_id = finding_data.scan_id,
finding.delta = finding_data.delta,
finding.status = finding_data.status,
finding.status_extended = finding_data.status_extended,
finding.severity = finding_data.severity,
finding.check_id = finding_data.check_id,
finding.check_title = finding_data.check_title,
finding.muted = finding_data.muted,
finding.muted_reason = finding_data.muted_reason,
finding.provider_uid = $provider_uid,
finding.firstseen = timestamp(),
finding.lastupdated = $last_updated,
finding._module_name = 'cartography:prowler',
finding._module_version = $prowler_version
ON MATCH SET
finding.status = finding_data.status,
finding.status_extended = finding_data.status_extended,
finding.lastupdated = $last_updated
MERGE (resource)-[rel:HAS_FINDING]->(finding)
ON CREATE SET
rel.provider_uid = $provider_uid,
rel.firstseen = timestamp(),
rel.lastupdated = $last_updated,
rel._module_name = 'cartography:prowler',
rel._module_version = $prowler_version
ON MATCH SET
rel.lastupdated = $last_updated
"""
CLEANUP_STATEMENT = """
MATCH (finding:ProwlerFinding {provider_uid: $provider_uid})
WHERE finding.lastupdated < $last_updated
WITH finding LIMIT $batch_size
DETACH DELETE finding
RETURN COUNT(finding) AS deleted_findings_count
"""
def create_indexes(neo4j_session: neo4j.Session) -> None:
"""
Code based on Cartography version 0.122.0, specifically on `cartography.intel.create_indexes.run`.
"""
logger.info("Creating indexes for Prowler node types.")
for statement in INDEX_STATEMENTS:
logger.debug("Executing statement: %s", statement)
run_write_query(neo4j_session, statement)
def analysis(
neo4j_session: neo4j.Session,
prowler_api_provider: Provider,
scan_id: str,
config: CartographyConfig,
) -> None:
findings_data = get_provider_last_scan_findings(prowler_api_provider, scan_id)
load_findings(neo4j_session, findings_data, prowler_api_provider, config)
cleanup_findings(neo4j_session, prowler_api_provider, config)
def get_provider_last_scan_findings(
prowler_api_provider: Provider,
scan_id: str,
) -> list[dict[str, str]]:
with rls_transaction(prowler_api_provider.tenant_id):
resource_finding_qs = ResourceFindingMapping.objects.filter(
finding__scan_id=scan_id,
).values(
"resource__uid",
"finding__id",
"finding__uid",
"finding__inserted_at",
"finding__updated_at",
"finding__first_seen_at",
"finding__scan_id",
"finding__delta",
"finding__status",
"finding__status_extended",
"finding__severity",
"finding__check_id",
"finding__check_metadata__checktitle",
"finding__muted",
"finding__muted_reason",
)
findings = []
for resource_finding in resource_finding_qs:
findings.append(
{
"resource_uid": str(resource_finding["resource__uid"]),
"id": str(resource_finding["finding__id"]),
"uid": resource_finding["finding__uid"],
"inserted_at": resource_finding["finding__inserted_at"],
"updated_at": resource_finding["finding__updated_at"],
"first_seen_at": resource_finding["finding__first_seen_at"],
"scan_id": str(resource_finding["finding__scan_id"]),
"delta": resource_finding["finding__delta"],
"status": resource_finding["finding__status"],
"status_extended": resource_finding["finding__status_extended"],
"severity": resource_finding["finding__severity"],
"check_id": str(resource_finding["finding__check_id"]),
"check_title": resource_finding[
"finding__check_metadata__checktitle"
],
"muted": resource_finding["finding__muted"],
"muted_reason": resource_finding["finding__muted_reason"],
}
)
return findings
def load_findings(
neo4j_session: neo4j.Session,
findings_data: list[dict[str, str]],
prowler_api_provider: Provider,
config: CartographyConfig,
) -> None:
replacements = {
"__ROOT_NODE_LABEL__": get_root_node_label(prowler_api_provider.provider),
"__NODE_UID_FIELD__": get_node_uid_field(prowler_api_provider.provider),
}
query = INSERT_STATEMENT_TEMPLATE
for replace_key, replace_value in replacements.items():
query = query.replace(replace_key, replace_value)
parameters = {
"provider_uid": str(prowler_api_provider.uid),
"last_updated": config.update_tag,
"prowler_version": ProwlerConfig.prowler_version,
}
total_length = len(findings_data)
for i in range(0, total_length, BATCH_SIZE):
parameters["findings_data"] = findings_data[i : i + BATCH_SIZE]
logger.info(
f"Loading findings batch {i // BATCH_SIZE + 1} / {(total_length + BATCH_SIZE - 1) // BATCH_SIZE}"
)
neo4j_session.run(query, parameters)
def cleanup_findings(
neo4j_session: neo4j.Session,
prowler_api_provider: Provider,
config: CartographyConfig,
) -> None:
parameters = {
"provider_uid": str(prowler_api_provider.uid),
"last_updated": config.update_tag,
"batch_size": BATCH_SIZE,
}
batch = 1
deleted_count = 1
while deleted_count > 0:
logger.info(f"Cleaning findings batch {batch}")
result = neo4j_session.run(CLEANUP_STATEMENT, parameters)
deleted_count = result.single().get("deleted_findings_count", 0)
batch += 1
@@ -0,0 +1,183 @@
import logging
import time
import asyncio
from typing import Any, Callable
from cartography.config import Config as CartographyConfig
from cartography.intel import analysis as cartography_analysis
from cartography.intel import create_indexes as cartography_create_indexes
from cartography.intel import ontology as cartography_ontology
from celery.utils.log import get_task_logger
from api.attack_paths import database as graph_database
from api.db_utils import rls_transaction
from api.models import (
Provider as ProwlerAPIProvider,
StateChoices,
)
from api.utils import initialize_prowler_provider
from tasks.jobs.attack_paths import aws, db_utils, prowler, utils
# Without this Celery goes crazy with Cartography logging
logging.getLogger("cartography").setLevel(logging.ERROR)
logging.getLogger("neo4j").propagate = False
logger = get_task_logger(__name__)
CARTOGRAPHY_INGESTION_FUNCTIONS: dict[str, Callable] = {
"aws": aws.start_aws_ingestion,
}
def get_cartography_ingestion_function(provider_type: str) -> Callable | None:
return CARTOGRAPHY_INGESTION_FUNCTIONS.get(provider_type)
def run(tenant_id: str, scan_id: str, task_id: str) -> dict[str, Any]:
"""
Code based on Cartography version 0.122.0, specifically on `cartography.cli.main`, `cartography.cli.CLI.main`,
`cartography.sync.run_with_config` and `cartography.sync.Sync.run`.
"""
ingestion_exceptions = {} # This will hold any exceptions raised during ingestion
# Prowler necessary objects
with rls_transaction(tenant_id):
prowler_api_provider = ProwlerAPIProvider.objects.get(scan__pk=scan_id)
prowler_sdk_provider = initialize_prowler_provider(prowler_api_provider)
# Attack Paths Scan necessary objects
cartography_ingestion_function = get_cartography_ingestion_function(
prowler_api_provider.provider
)
attack_paths_scan = db_utils.retrieve_attack_paths_scan(tenant_id, scan_id)
# Checks before starting the scan
if not cartography_ingestion_function:
ingestion_exceptions = {
"global_error": f"Provider {prowler_api_provider.provider} is not supported for Attack Paths scans"
}
if attack_paths_scan:
db_utils.finish_attack_paths_scan(
attack_paths_scan, StateChoices.COMPLETED, ingestion_exceptions
)
logger.warning(
f"Provider {prowler_api_provider.provider} is not supported for Attack Paths scans"
)
return ingestion_exceptions
else:
if not attack_paths_scan:
logger.warning(
f"No Attack Paths Scan found for scan {scan_id} and tenant {tenant_id}, let's create it then"
)
attack_paths_scan = db_utils.create_attack_paths_scan(
tenant_id, scan_id, prowler_api_provider.id
)
# While creating the Cartography configuration, attributes `neo4j_user` and `neo4j_password` are not really needed in this config object
cartography_config = CartographyConfig(
neo4j_uri=graph_database.get_uri(),
neo4j_database=graph_database.get_database_name(attack_paths_scan.id),
update_tag=int(time.time()),
)
# Starting the Attack Paths scan
db_utils.starting_attack_paths_scan(attack_paths_scan, task_id, cartography_config)
try:
logger.info(
f"Creating Neo4j database {cartography_config.neo4j_database} for tenant {prowler_api_provider.tenant_id}"
)
graph_database.create_database(cartography_config.neo4j_database)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 1)
logger.info(
f"Starting Cartography ({attack_paths_scan.id}) for "
f"{prowler_api_provider.provider.upper()} provider {prowler_api_provider.id}"
)
with graph_database.get_session(
cartography_config.neo4j_database
) as neo4j_session:
# Indexes creation
cartography_create_indexes.run(neo4j_session, cartography_config)
prowler.create_indexes(neo4j_session)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 2)
# The real scan, where iterates over cloud services
ingestion_exceptions = _call_within_event_loop(
cartography_ingestion_function,
neo4j_session,
cartography_config,
prowler_api_provider,
prowler_sdk_provider,
attack_paths_scan,
)
# Post-processing: Just keeping it to be more Cartography compliant
cartography_ontology.run(neo4j_session, cartography_config)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 95)
cartography_analysis.run(neo4j_session, cartography_config)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 96)
# Adding Prowler nodes and relationships
prowler.analysis(
neo4j_session, prowler_api_provider, scan_id, cartography_config
)
logger.info(
f"Completed Cartography ({attack_paths_scan.id}) for "
f"{prowler_api_provider.provider.upper()} provider {prowler_api_provider.id}"
)
# Handling databases changes
old_attack_paths_scans = db_utils.get_old_attack_paths_scans(
prowler_api_provider.tenant_id,
prowler_api_provider.id,
attack_paths_scan.id,
)
for old_attack_paths_scan in old_attack_paths_scans:
graph_database.drop_database(old_attack_paths_scan.graph_database)
db_utils.update_old_attack_paths_scan(old_attack_paths_scan)
db_utils.finish_attack_paths_scan(
attack_paths_scan, StateChoices.COMPLETED, ingestion_exceptions
)
return ingestion_exceptions
except Exception as e:
exception_message = utils.stringify_exception(e, "Cartography failed")
logger.error(exception_message)
ingestion_exceptions["global_cartography_error"] = exception_message
# Handling databases changes
graph_database.drop_database(cartography_config.neo4j_database)
db_utils.finish_attack_paths_scan(
attack_paths_scan, StateChoices.FAILED, ingestion_exceptions
)
raise
def _call_within_event_loop(fn, *args, **kwargs):
"""
Cartography needs a running event loop, so assuming there is none (Celery task or even regular DRF endpoint),
let's create a new one and set it as the current event loop for this thread.
"""
loop = asyncio.new_event_loop()
try:
asyncio.set_event_loop(loop)
return fn(*args, **kwargs)
finally:
try:
loop.run_until_complete(loop.shutdown_asyncgens())
except Exception:
pass
loop.close()
asyncio.set_event_loop(None)
@@ -0,0 +1,10 @@
import traceback
from datetime import datetime, timezone
def stringify_exception(exception: Exception, context: str) -> str:
timestamp = datetime.now(tz=timezone.utc)
exception_traceback = traceback.TracebackException.from_exception(exception)
traceback_string = "".join(exception_traceback.format())
return f"{timestamp} - {context}\n{traceback_string}"
-101
View File
@@ -1,18 +1,14 @@
from collections import defaultdict
from django.db.models import Sum
from api.db_router import READ_REPLICA_ALIAS
from api.db_utils import rls_transaction
from api.models import (
ComplianceOverviewSummary,
ComplianceRequirementOverview,
DailySeveritySummary,
Resource,
ResourceFindingMapping,
ResourceScanSummary,
Scan,
ScanSummary,
StateChoices,
)
@@ -179,100 +175,3 @@ def backfill_compliance_summaries(tenant_id: str, scan_id: str):
)
return {"status": "backfilled", "inserted": len(summary_objects)}
def backfill_daily_severity_summaries(tenant_id: str, days: int = None):
"""
Backfill DailySeveritySummary from completed scans.
Groups by provider+date, keeps latest scan per day.
"""
from datetime import timedelta
from django.utils import timezone
created_count = 0
updated_count = 0
with rls_transaction(tenant_id, using=READ_REPLICA_ALIAS):
scan_filter = {
"tenant_id": tenant_id,
"state": StateChoices.COMPLETED,
"completed_at__isnull": False,
}
if days is not None:
cutoff_date = timezone.now() - timedelta(days=days)
scan_filter["completed_at__gte"] = cutoff_date
completed_scans = (
Scan.objects.filter(**scan_filter)
.order_by("provider_id", "-completed_at")
.values("id", "provider_id", "completed_at")
)
if not completed_scans:
return {"status": "no scans to backfill"}
# Keep only latest scan per provider/day
latest_scans_by_day = {}
for scan in completed_scans:
key = (scan["provider_id"], scan["completed_at"].date())
if key not in latest_scans_by_day:
latest_scans_by_day[key] = scan
# Process each provider/day
for (provider_id, scan_date), scan in latest_scans_by_day.items():
scan_id = scan["id"]
with rls_transaction(tenant_id, using=READ_REPLICA_ALIAS):
severity_totals = (
ScanSummary.objects.filter(
tenant_id=tenant_id,
scan_id=scan_id,
)
.values("severity")
.annotate(total_fail=Sum("fail"), total_muted=Sum("muted"))
)
severity_data = {
"critical": 0,
"high": 0,
"medium": 0,
"low": 0,
"informational": 0,
"muted": 0,
}
for row in severity_totals:
severity = row["severity"]
if severity in severity_data:
severity_data[severity] = row["total_fail"] or 0
severity_data["muted"] += row["total_muted"] or 0
with rls_transaction(tenant_id):
_, created = DailySeveritySummary.objects.update_or_create(
tenant_id=tenant_id,
provider_id=provider_id,
date=scan_date,
defaults={
"scan_id": scan_id,
"critical": severity_data["critical"],
"high": severity_data["high"],
"medium": severity_data["medium"],
"low": severity_data["low"],
"informational": severity_data["informational"],
"muted": severity_data["muted"],
},
)
if created:
created_count += 1
else:
updated_count += 1
return {
"status": "backfilled",
"created": created_count,
"updated": updated_count,
"total_days": len(latest_scans_by_day),
}
+24 -2
View File
@@ -1,9 +1,19 @@
from celery.utils.log import get_task_logger
from django.db import DatabaseError
from api.attack_paths import database as graph_database
from api.db_router import MainRouter
from api.db_utils import batch_delete, rls_transaction
from api.models import Finding, Provider, Resource, Scan, ScanSummary, Tenant
from api.models import (
AttackPathsScan,
Finding,
Provider,
Resource,
Scan,
ScanSummary,
Tenant,
)
from tasks.jobs.attack_paths.db_utils import get_provider_graph_database_names
logger = get_task_logger(__name__)
@@ -23,16 +33,27 @@ def delete_provider(tenant_id: str, pk: str):
Raises:
Provider.DoesNotExist: If no instance with the provided primary key exists.
"""
# Delete the Attack Paths' graph databases related to the provider
graph_database_names = get_provider_graph_database_names(tenant_id, pk)
try:
for graph_database_name in graph_database_names:
graph_database.drop_database(graph_database_name)
except graph_database.GraphDatabaseQueryException as gdb_error:
logger.error(f"Error deleting Provider databases: {gdb_error}")
raise
# Get all provider related data and delete them in batches
with rls_transaction(tenant_id):
instance = Provider.all_objects.get(pk=pk)
deletion_summary = {}
deletion_steps = [
("Scan Summaries", ScanSummary.all_objects.filter(scan__provider=instance)),
("Findings", Finding.all_objects.filter(scan__provider=instance)),
("Resources", Resource.all_objects.filter(provider=instance)),
("Scans", Scan.all_objects.filter(provider=instance)),
("AttackPathsScans", AttackPathsScan.all_objects.filter(provider=instance)),
]
deletion_summary = {}
for step_name, queryset in deletion_steps:
try:
_, step_summary = batch_delete(tenant_id, queryset)
@@ -48,6 +69,7 @@ def delete_provider(tenant_id: str, pk: str):
except DatabaseError as db_error:
logger.error(f"Error deleting Provider: {db_error}")
raise
return deletion_summary
@@ -2,8 +2,6 @@ from typing import Dict
import boto3
import openai
from botocore import UNSIGNED
from botocore.config import Config
from botocore.exceptions import BotoCoreError, ClientError
from celery.utils.log import get_task_logger
@@ -12,39 +10,6 @@ from api.models import LighthouseProviderConfiguration, LighthouseProviderModels
logger = get_task_logger(__name__)
def _extract_error_message(e: Exception) -> str:
"""
Extract a user-friendly error message from various exception types.
This function handles exceptions from different providers (OpenAI, AWS Bedrock)
and extracts the most relevant error message for display to users.
Args:
e: The exception to extract a message from.
Returns:
str: A user-friendly error message.
"""
# For OpenAI SDK errors (>= v1.0)
# OpenAI exceptions have a 'body' attribute with error details
if hasattr(e, "body") and isinstance(e.body, dict):
if "message" in e.body:
return e.body["message"]
# Sometimes nested under 'error' key
if "error" in e.body and isinstance(e.body["error"], dict):
return e.body["error"].get("message", str(e))
# For boto3 ClientError
# Boto3 exceptions have a 'response' attribute with error details
if hasattr(e, "response") and isinstance(e.response, dict):
error_info = e.response.get("Error", {})
if error_info.get("Message"):
return error_info["Message"]
# Fallback to string representation for unknown error types
return str(e)
def _extract_openai_api_key(
provider_cfg: LighthouseProviderConfiguration,
) -> str | None:
@@ -91,39 +56,21 @@ def _extract_bedrock_credentials(
"""
Safely extract AWS Bedrock credentials from a provider configuration.
Supports two authentication methods:
1. AWS access key + secret key + region
2. Bedrock API key (bearer token) + region
Args:
provider_cfg (LighthouseProviderConfiguration): The provider configuration instance
containing the credentials.
Returns:
Dict[str, str] | None: Dictionary with either:
- 'access_key_id', 'secret_access_key', and 'region' for access key auth
- 'api_key' and 'region' for API key (bearer token) auth
Returns None if credentials are invalid or missing.
Dict[str, str] | None: Dictionary with 'access_key_id', 'secret_access_key', and
'region' if present and valid, otherwise None.
"""
creds = provider_cfg.credentials_decoded
if not isinstance(creds, dict):
return None
region = creds.get("region")
if not isinstance(region, str) or not region:
return None
# Check for API key authentication first
api_key = creds.get("api_key")
if isinstance(api_key, str) and api_key:
return {
"api_key": api_key,
"region": region,
}
# Fall back to access key authentication
access_key_id = creds.get("access_key_id")
secret_access_key = creds.get("secret_access_key")
region = creds.get("region")
# Validate all required fields are present and are strings
if (
@@ -131,6 +78,8 @@ def _extract_bedrock_credentials(
or not access_key_id
or not isinstance(secret_access_key, str)
or not secret_access_key
or not isinstance(region, str)
or not region
):
return None
@@ -141,51 +90,6 @@ def _extract_bedrock_credentials(
}
def _create_bedrock_client(
bedrock_creds: Dict[str, str], service_name: str = "bedrock"
):
"""
Create a boto3 Bedrock client with the appropriate authentication method.
Supports two authentication methods:
1. API key (bearer token) - uses unsigned requests with Authorization header
2. AWS access key + secret key - uses standard SigV4 signing
Args:
bedrock_creds: Dictionary with either:
- 'api_key' and 'region' for API key (bearer token) auth
- 'access_key_id', 'secret_access_key', and 'region' for access key auth
service_name: The Bedrock service name. Use 'bedrock' for control plane
operations (list_foundation_models, etc.) or 'bedrock-runtime' for
inference operations.
Returns:
boto3 client configured for the specified Bedrock service.
"""
region = bedrock_creds["region"]
if "api_key" in bedrock_creds:
bearer_token = bedrock_creds["api_key"]
client = boto3.client(
service_name=service_name,
region_name=region,
config=Config(signature_version=UNSIGNED),
)
def inject_bearer_token(request, **kwargs):
request.headers["Authorization"] = f"Bearer {bearer_token}"
client.meta.events.register("before-send.*.*", inject_bearer_token)
return client
return boto3.client(
service_name=service_name,
region_name=region,
aws_access_key_id=bedrock_creds["access_key_id"],
aws_secret_access_key=bedrock_creds["secret_access_key"],
)
def check_lighthouse_provider_connection(provider_config_id: str) -> Dict:
"""
Validate a Lighthouse provider configuration by calling the provider API and
@@ -237,7 +141,12 @@ def check_lighthouse_provider_connection(provider_config_id: str) -> Dict:
}
# Test connection by listing foundation models
bedrock_client = _create_bedrock_client(bedrock_creds)
bedrock_client = boto3.client(
"bedrock",
aws_access_key_id=bedrock_creds["access_key_id"],
aws_secret_access_key=bedrock_creds["secret_access_key"],
region_name=bedrock_creds["region"],
)
_ = bedrock_client.list_foundation_models()
elif (
@@ -270,13 +179,12 @@ def check_lighthouse_provider_connection(provider_config_id: str) -> Dict:
return {"connected": True, "error": None}
except Exception as e:
error_message = _extract_error_message(e)
logger.warning(
"%s connection check failed: %s", provider_cfg.provider_type, error_message
"%s connection check failed: %s", provider_cfg.provider_type, str(e)
)
provider_cfg.is_active = False
provider_cfg.save()
return {"connected": False, "error": error_message}
return {"connected": False, "error": str(e)}
def _fetch_openai_models(api_key: str) -> Dict[str, str]:
@@ -324,219 +232,105 @@ def _fetch_openai_compatible_models(base_url: str, api_key: str) -> Dict[str, st
return available_models
def _get_region_prefix(region: str) -> str:
"""
Determine geographic prefix for AWS region.
Examples: ap-south-1 -> apac, us-east-1 -> us, eu-west-1 -> eu
"""
if region.startswith(("us-", "ca-", "sa-")):
return "us"
elif region.startswith("eu-"):
return "eu"
elif region.startswith("ap-"):
return "apac"
return "global"
def _clean_inference_profile_name(profile_name: str) -> str:
"""
Remove geographic prefix from inference profile name.
AWS includes geographic prefixes in profile names which are redundant
since the profile ID already contains this information.
Examples:
"APAC Anthropic Claude 3.5 Sonnet" -> "Anthropic Claude 3.5 Sonnet"
"GLOBAL Claude Sonnet 4.5" -> "Claude Sonnet 4.5"
"US Anthropic Claude 3 Haiku" -> "Anthropic Claude 3 Haiku"
"""
prefixes = ["APAC ", "GLOBAL ", "US ", "EU ", "APAC-", "GLOBAL-", "US-", "EU-"]
for prefix in prefixes:
if profile_name.upper().startswith(prefix.upper()):
return profile_name[len(prefix) :].strip()
return profile_name
def _supports_text_modality(input_modalities: list, output_modalities: list) -> bool:
"""Check if model supports TEXT for both input and output."""
return "TEXT" in input_modalities and "TEXT" in output_modalities
def _get_foundation_model_modalities(
bedrock_client, model_id: str
) -> tuple[list, list] | None:
"""
Fetch input and output modalities for a foundation model.
Returns:
(input_modalities, output_modalities) or None if fetch fails
"""
try:
model_info = bedrock_client.get_foundation_model(modelIdentifier=model_id)
model_details = model_info.get("modelDetails", {})
input_mods = model_details.get("inputModalities", [])
output_mods = model_details.get("outputModalities", [])
return (input_mods, output_mods)
except (BotoCoreError, ClientError) as e:
logger.debug("Could not fetch model details for %s: %s", model_id, str(e))
return None
def _extract_foundation_model_ids(profile_models: list) -> list[str]:
"""
Extract foundation model IDs from inference profile model ARNs.
Args:
profile_models: List of model references from inference profile
Returns:
List of foundation model IDs extracted from ARNs
"""
model_ids = []
for model_ref in profile_models:
model_arn = model_ref.get("modelArn", "")
if "foundation-model/" in model_arn:
model_id = model_arn.split("foundation-model/")[1]
model_ids.append(model_id)
return model_ids
def _build_inference_profile_map(
bedrock_client, region: str
) -> Dict[str, tuple[str, str]]:
"""
Build map of foundation_model_id -> best inference profile.
Returns:
Dict mapping foundation_model_id to (profile_id, profile_name)
Only includes profiles with TEXT modality support
Prefers region-matched profiles over others
"""
region_prefix = _get_region_prefix(region)
model_to_profile: Dict[str, tuple[str, str]] = {}
try:
response = bedrock_client.list_inference_profiles()
profiles = response.get("inferenceProfileSummaries", [])
for profile in profiles:
profile_id = profile.get("inferenceProfileId")
profile_name = profile.get("inferenceProfileName")
if not profile_id or not profile_name:
continue
profile_models = profile.get("models", [])
if not profile_models:
continue
foundation_model_ids = _extract_foundation_model_ids(profile_models)
if not foundation_model_ids:
continue
modalities = _get_foundation_model_modalities(
bedrock_client, foundation_model_ids[0]
)
if not modalities:
continue
input_mods, output_mods = modalities
if not _supports_text_modality(input_mods, output_mods):
continue
is_preferred = profile_id.startswith(f"{region_prefix}.")
clean_name = _clean_inference_profile_name(profile_name)
for foundation_model_id in foundation_model_ids:
if foundation_model_id not in model_to_profile:
model_to_profile[foundation_model_id] = (profile_id, clean_name)
elif is_preferred and not model_to_profile[foundation_model_id][
0
].startswith(f"{region_prefix}."):
model_to_profile[foundation_model_id] = (profile_id, clean_name)
except (BotoCoreError, ClientError) as e:
logger.info("Could not fetch inference profiles in %s: %s", region, str(e))
return model_to_profile
def _check_on_demand_availability(bedrock_client, model_id: str) -> bool:
"""Check if an ON_DEMAND foundation model is entitled and available."""
try:
availability = bedrock_client.get_foundation_model_availability(
modelId=model_id
)
entitlement = availability.get("entitlementAvailability")
return entitlement == "AVAILABLE"
except (BotoCoreError, ClientError) as e:
logger.debug("Could not check availability for %s: %s", model_id, str(e))
return False
def _fetch_bedrock_models(bedrock_creds: Dict[str, str]) -> Dict[str, str]:
"""
Fetch available models from AWS Bedrock, preferring inference profiles over ON_DEMAND.
Fetch available models from AWS Bedrock with entitlement verification.
Strategy:
1. Build map of foundation_model -> best_inference_profile (with TEXT validation)
2. For each TEXT-capable foundation model:
- Use inference profile ID if available (preferred - better throughput)
- Fallback to foundation model ID if only ON_DEMAND available
3. Verify entitlement for ON_DEMAND models
This function:
1. Lists foundation models with TEXT modality support
2. Lists inference profiles with TEXT modality support
3. Verifies user has entitlement access to each model
Args:
bedrock_creds: Dict with 'region' and auth credentials
bedrock_creds: Dictionary with 'access_key_id', 'secret_access_key', and 'region'.
Returns:
Dict mapping model_id to model_name. IDs can be:
- Inference profile IDs (e.g., "apac.anthropic.claude-3-5-sonnet-20240620-v1:0")
- Foundation model IDs (e.g., "anthropic.claude-3-5-sonnet-20240620-v1:0")
Dict mapping model_id to model_name for all accessible models.
Raises:
BotoCoreError, ClientError: If AWS API calls fail.
"""
bedrock_client = _create_bedrock_client(bedrock_creds)
region = bedrock_creds["region"]
bedrock_client = boto3.client(
"bedrock",
aws_access_key_id=bedrock_creds["access_key_id"],
aws_secret_access_key=bedrock_creds["secret_access_key"],
region_name=bedrock_creds["region"],
)
model_to_profile = _build_inference_profile_map(bedrock_client, region)
models_to_check: Dict[str, str] = {}
# Step 1: Get foundation models with TEXT modality
foundation_response = bedrock_client.list_foundation_models()
model_summaries = foundation_response.get("modelSummaries", [])
models_to_return: Dict[str, str] = {}
on_demand_models: set[str] = set()
for model in model_summaries:
input_mods = model.get("inputModalities", [])
output_mods = model.get("outputModalities", [])
# Check if model supports TEXT input and output modality
input_modalities = model.get("inputModalities", [])
output_modalities = model.get("outputModalities", [])
if not _supports_text_modality(input_mods, output_mods):
if "TEXT" not in input_modalities or "TEXT" not in output_modalities:
continue
model_id = model.get("modelId")
model_name = model.get("modelName")
if not model_id or not model_name:
if not model_id:
continue
if model_id in model_to_profile:
profile_id, profile_name = model_to_profile[model_id]
models_to_return[profile_id] = profile_name
else:
inference_types = model.get("inferenceTypesSupported", [])
if "ON_DEMAND" in inference_types:
models_to_return[model_id] = model_name
on_demand_models.add(model_id)
inference_types = model.get("inferenceTypesSupported", [])
# Only include models with ON_DEMAND inference support
if "ON_DEMAND" in inference_types:
models_to_check[model_id] = model["modelName"]
# Step 2: Get inference profiles
try:
inference_profiles_response = bedrock_client.list_inference_profiles()
inference_profiles = inference_profiles_response.get(
"inferenceProfileSummaries", []
)
for profile in inference_profiles:
# Check if profile supports TEXT modality
input_modalities = profile.get("inputModalities", [])
output_modalities = profile.get("outputModalities", [])
if "TEXT" not in input_modalities or "TEXT" not in output_modalities:
continue
profile_id = profile.get("inferenceProfileId")
if profile_id:
models_to_check[profile_id] = profile["inferenceProfileName"]
except (BotoCoreError, ClientError) as e:
logger.info(
"Could not fetch inference profiles in %s: %s",
bedrock_creds["region"],
str(e),
)
# Step 3: Verify entitlement availability for each model
available_models: Dict[str, str] = {}
for model_id, model_name in models_to_return.items():
if model_id in on_demand_models:
if _check_on_demand_availability(bedrock_client, model_id):
for model_id, model_name in models_to_check.items():
try:
availability = bedrock_client.get_foundation_model_availability(
modelId=model_id
)
entitlement = availability.get("entitlementAvailability")
# Only include models user has access to
if entitlement == "AVAILABLE":
available_models[model_id] = model_name
else:
available_models[model_id] = model_name
else:
logger.debug(
"Skipping model %s - entitlement status: %s", model_id, entitlement
)
except (BotoCoreError, ClientError) as e:
logger.debug(
"Could not check availability for model %s: %s", model_id, str(e)
)
continue
return available_models
@@ -565,6 +359,7 @@ def refresh_lighthouse_provider_models(provider_config_id: str) -> Dict:
provider_cfg = LighthouseProviderConfiguration.objects.get(pk=provider_config_id)
fetched_models: Dict[str, str] = {}
# Fetch models from the appropriate provider
try:
if (
provider_cfg.provider_type
@@ -619,13 +414,12 @@ def refresh_lighthouse_provider_models(provider_config_id: str) -> Dict:
}
except Exception as e:
error_message = _extract_error_message(e)
logger.warning(
"Unexpected error refreshing %s models: %s",
provider_cfg.provider_type,
error_message,
str(e),
)
return {"created": 0, "updated": 0, "deleted": 0, "error": error_message}
return {"created": 0, "updated": 0, "deleted": 0, "error": str(e)}
# Upsert models into the catalog
created = 0
+2 -10
View File
@@ -2238,20 +2238,12 @@ def generate_ens_report(
[
"CUMPLE",
str(passed_requirements),
(
f"{(passed_requirements / total_requirements * 100):.1f}%"
if total_requirements > 0
else "0.0%"
),
f"{(passed_requirements / total_requirements * 100):.1f}%",
],
[
"NO CUMPLE",
str(failed_requirements),
(
f"{(failed_requirements / total_requirements * 100):.1f}%"
if total_requirements > 0
else "0.0%"
),
f"{(failed_requirements / total_requirements * 100):.1f}%",
],
["TOTAL", str(total_requirements), "100%"],
]
+7 -234
View File
@@ -12,7 +12,7 @@ from celery.utils.log import get_task_logger
from config.env import env
from config.settings.celery import CELERY_DEADLOCK_ATTEMPTS
from django.db import IntegrityError, OperationalError
from django.db.models import Case, Count, IntegerField, Prefetch, Q, Sum, When
from django.db.models import Case, Count, IntegerField, Prefetch, Sum, When
from tasks.utils import CustomEncoder
from api.compliance import PROWLER_COMPLIANCE_OVERVIEW_TEMPLATE
@@ -26,10 +26,8 @@ from api.db_utils import (
)
from api.exceptions import ProviderConnectionError
from api.models import (
AttackSurfaceOverview,
ComplianceOverviewSummary,
ComplianceRequirementOverview,
DailySeveritySummary,
Finding,
MuteRule,
Processor,
@@ -45,7 +43,6 @@ from api.models import (
from api.models import StatusChoices as FindingStatus
from api.utils import initialize_prowler_provider, return_prowler_provider
from api.v1.serializers import ScanTaskSerializer
from prowler.lib.check.models import CheckMetadata
from prowler.lib.outputs.finding import Finding as ProwlerFinding
from prowler.lib.scan.scan import Scan as ProwlerScan
@@ -78,44 +75,6 @@ FINDINGS_MICRO_BATCH_SIZE = env.int("DJANGO_FINDINGS_MICRO_BATCH_SIZE", default=
SCAN_DB_BATCH_SIZE = env.int("DJANGO_SCAN_DB_BATCH_SIZE", default=500)
ATTACK_SURFACE_PROVIDER_COMPATIBILITY = {
"internet-exposed": None, # Compatible with all providers
"secrets": None, # Compatible with all providers
"privilege-escalation": ["aws", "kubernetes"],
"ec2-imdsv1": ["aws"],
}
_ATTACK_SURFACE_MAPPING_CACHE: dict[str, dict] = {}
def _get_attack_surface_mapping_from_provider(provider_type: str) -> dict:
global _ATTACK_SURFACE_MAPPING_CACHE
if provider_type in _ATTACK_SURFACE_MAPPING_CACHE:
return _ATTACK_SURFACE_MAPPING_CACHE[provider_type]
attack_surface_check_mappings = {
"internet-exposed": None,
"secrets": None,
"privilege-escalation": {
"iam_policy_allows_privilege_escalation",
"iam_inline_policy_allows_privilege_escalation",
},
"ec2-imdsv1": {
"ec2_instance_imdsv2_enabled"
}, # AWS only - IMDSv1 enabled findings
}
for category_name, check_ids in attack_surface_check_mappings.items():
if check_ids is None:
sdk_check_ids = CheckMetadata.list(
provider=provider_type, category=category_name
)
attack_surface_check_mappings[category_name] = sdk_check_ids
_ATTACK_SURFACE_MAPPING_CACHE[provider_type] = attack_surface_check_mappings
return attack_surface_check_mappings
def _create_finding_delta(
last_status: FindingStatus | None | str, new_status: FindingStatus | None
) -> Finding.DeltaChoices:
@@ -371,7 +330,7 @@ def _create_compliance_summaries(
if summary_objects:
with rls_transaction(tenant_id):
ComplianceOverviewSummary.objects.bulk_create(
summary_objects, batch_size=500, ignore_conflicts=True
summary_objects, batch_size=500
)
@@ -1020,14 +979,11 @@ def _aggregate_findings_by_region(
findings_count_by_compliance = {}
with rls_transaction(tenant_id, using=READ_REPLICA_ALIAS):
# Fetch only PASS/FAIL findings (optimized query reduces data transfer)
# Other statuses are not needed for check_status or ThreatScore calculation
# Fetch findings with resources in a single efficient query
# Use select_related for finding fields and prefetch_related for many-to-many resources
findings = (
Finding.all_objects.filter(
tenant_id=tenant_id,
scan_id=scan_id,
muted=False,
status__in=["PASS", "FAIL"],
tenant_id=tenant_id, scan_id=scan_id, muted=False
)
.only("id", "check_id", "status", "compliance")
.prefetch_related(
@@ -1045,8 +1001,6 @@ def _aggregate_findings_by_region(
)
for finding in findings:
status = finding.status
for resource in finding.small_resources:
region = resource.region
@@ -1054,7 +1008,7 @@ def _aggregate_findings_by_region(
current_status = check_status_by_region.setdefault(region, {})
# Priority: FAIL > any other status
if current_status.get(finding.check_id) != "FAIL":
current_status[finding.check_id] = status
current_status[finding.check_id] = finding.status
# Aggregate ThreatScore compliance counts
if modeled_threatscore_compliance_id in (finding.compliance or {}):
@@ -1069,7 +1023,7 @@ def _aggregate_findings_by_region(
requirement_id, {"total": 0, "pass": 0}
)
requirement_stats["total"] += 1
if status == "PASS":
if finding.status == "PASS":
requirement_stats["pass"] += 1
return check_status_by_region, findings_count_by_compliance
@@ -1237,184 +1191,3 @@ def create_compliance_requirements(tenant_id: str, scan_id: str):
except Exception as e:
logger.error(f"Error creating compliance requirements for scan {scan_id}: {e}")
raise e
def aggregate_attack_surface(tenant_id: str, scan_id: str):
"""
Aggregate findings into attack surface overview records.
Creates one AttackSurfaceOverview record per attack surface type
for the given scan, based on check_id mappings.
Args:
tenant_id: Tenant that owns the scan.
scan_id: Scan UUID whose findings should be aggregated.
"""
with rls_transaction(tenant_id, using=READ_REPLICA_ALIAS):
scan_instance = Scan.all_objects.select_related("provider").get(pk=scan_id)
provider_type = scan_instance.provider.provider
provider_attack_surface_mapping = _get_attack_surface_mapping_from_provider(
provider_type=provider_type
)
# Filter out attack surfaces that are not compatible or have no resolved check IDs
supported_mappings: dict[str, list[str]] = {}
for attack_surface_type, check_ids in provider_attack_surface_mapping.items():
compatible_providers = ATTACK_SURFACE_PROVIDER_COMPATIBILITY.get(
attack_surface_type
)
if (
compatible_providers is not None
and provider_type not in compatible_providers
):
logger.info(
f"Skipping {attack_surface_type} - not supported for {provider_type}"
)
continue
if not check_ids:
logger.info(
f"Skipping {attack_surface_type} - no check IDs resolved for {provider_type}"
)
continue
supported_mappings[attack_surface_type] = list(check_ids)
if not supported_mappings:
logger.info(
f"No attack surface mappings available for scan {scan_id} and provider {provider_type}"
)
logger.info(f"No attack surface overview records created for scan {scan_id}")
return
# Map every check_id to its attack surface, so we can aggregate with a single query
check_id_to_surface: dict[str, str] = {}
for attack_surface_type, check_ids in supported_mappings.items():
for check_id in check_ids:
check_id_to_surface[check_id] = attack_surface_type
aggregated_counts = {
attack_surface_type: {"total": 0, "failed": 0, "muted": 0}
for attack_surface_type in supported_mappings.keys()
}
with rls_transaction(tenant_id, using=READ_REPLICA_ALIAS):
finding_stats = (
Finding.all_objects.filter(
tenant_id=tenant_id,
scan_id=scan_id,
check_id__in=list(check_id_to_surface.keys()),
)
.values("check_id")
.annotate(
total=Count("id"),
failed=Count("id", filter=Q(status="FAIL", muted=False)),
muted=Count("id", filter=Q(status="FAIL", muted=True)),
)
)
for stats in finding_stats:
attack_surface_type = check_id_to_surface.get(stats["check_id"])
if not attack_surface_type:
continue
aggregated_counts[attack_surface_type]["total"] += stats["total"] or 0
aggregated_counts[attack_surface_type]["failed"] += stats["failed"] or 0
aggregated_counts[attack_surface_type]["muted"] += stats["muted"] or 0
overview_objects = []
for attack_surface_type, counts in aggregated_counts.items():
total = counts["total"]
if not total:
continue
overview_objects.append(
AttackSurfaceOverview(
tenant_id=tenant_id,
scan_id=scan_id,
attack_surface_type=attack_surface_type,
total_findings=total,
failed_findings=counts["failed"],
muted_failed_findings=counts["muted"],
)
)
# Bulk create overview records
if overview_objects:
with rls_transaction(tenant_id):
AttackSurfaceOverview.objects.bulk_create(overview_objects, batch_size=500)
logger.info(
f"Created {len(overview_objects)} attack surface overview records for scan {scan_id}"
)
else:
logger.info(f"No attack surface overview records created for scan {scan_id}")
def aggregate_daily_severity(tenant_id: str, scan_id: str):
"""Aggregate scan severity counts into DailySeveritySummary (one record per provider/day)."""
with rls_transaction(tenant_id, using=READ_REPLICA_ALIAS):
scan = Scan.objects.filter(
tenant_id=tenant_id,
id=scan_id,
state=StateChoices.COMPLETED,
).first()
if not scan:
logger.warning(f"Scan {scan_id} not found or not completed")
return {"status": "scan is not completed"}
provider_id = scan.provider_id
scan_date = scan.completed_at.date()
severity_totals = (
ScanSummary.objects.filter(
tenant_id=tenant_id,
scan_id=scan_id,
)
.values("severity")
.annotate(total_fail=Sum("fail"), total_muted=Sum("muted"))
)
severity_data = {
"critical": 0,
"high": 0,
"medium": 0,
"low": 0,
"informational": 0,
"muted": 0,
}
for row in severity_totals:
severity = row["severity"]
if severity in severity_data:
severity_data[severity] = row["total_fail"] or 0
severity_data["muted"] += row["total_muted"] or 0
with rls_transaction(tenant_id):
summary, created = DailySeveritySummary.objects.update_or_create(
tenant_id=tenant_id,
provider_id=provider_id,
date=scan_date,
defaults={
"scan_id": scan_id,
"critical": severity_data["critical"],
"high": severity_data["high"],
"medium": severity_data["medium"],
"low": severity_data["low"],
"informational": severity_data["informational"],
"muted": severity_data["muted"],
},
)
action = "created" if created else "updated"
logger.info(
f"Daily severity summary {action} for provider {provider_id} on {scan_date}"
)
return {
"status": action,
"provider_id": str(provider_id),
"date": str(scan_date),
"severity_data": severity_data,
}
+38 -62
View File
@@ -1,16 +1,28 @@
import os
from datetime import datetime, timedelta, timezone
from pathlib import Path
from shutil import rmtree
from celery import chain, group, shared_task
from celery.utils.log import get_task_logger
from django_celery_beat.models import PeriodicTask
from api.compliance import get_compliance_frameworks
from api.db_router import READ_REPLICA_ALIAS
from api.db_utils import rls_transaction
from api.decorators import set_tenant
from api.models import Finding, Integration, Provider, Scan, ScanSummary, StateChoices
from api.utils import initialize_prowler_provider
from api.v1.serializers import ScanTaskSerializer
from config.celery import RLSTask
from config.django.base import DJANGO_FINDINGS_BATCH_SIZE, DJANGO_TMP_OUTPUT_DIRECTORY
from django_celery_beat.models import PeriodicTask
from prowler.lib.check.compliance_models import Compliance
from prowler.lib.outputs.compliance.generic.generic import GenericCompliance
from prowler.lib.outputs.finding import Finding as FindingOutput
from tasks.jobs.attack_paths import attack_paths_scan
from tasks.jobs.backfill import (
backfill_compliance_summaries,
backfill_daily_severity_summaries,
backfill_resource_scan_summaries,
)
from tasks.jobs.connection import (
@@ -38,25 +50,12 @@ from tasks.jobs.lighthouse_providers import (
from tasks.jobs.muting import mute_historical_findings
from tasks.jobs.report import generate_compliance_reports_job
from tasks.jobs.scan import (
aggregate_attack_surface,
aggregate_daily_severity,
aggregate_findings,
create_compliance_requirements,
perform_prowler_scan,
)
from tasks.utils import batched, get_next_execution_datetime
from api.compliance import get_compliance_frameworks
from api.db_router import READ_REPLICA_ALIAS
from api.db_utils import rls_transaction
from api.decorators import handle_provider_deletion, set_tenant
from api.models import Finding, Integration, Provider, Scan, ScanSummary, StateChoices
from api.utils import initialize_prowler_provider
from api.v1.serializers import ScanTaskSerializer
from prowler.lib.check.compliance_models import Compliance
from prowler.lib.outputs.compliance.generic.generic import GenericCompliance
from prowler.lib.outputs.finding import Finding as FindingOutput
logger = get_task_logger(__name__)
@@ -72,16 +71,10 @@ def _perform_scan_complete_tasks(tenant_id: str, scan_id: str, provider_id: str)
create_compliance_requirements_task.apply_async(
kwargs={"tenant_id": tenant_id, "scan_id": scan_id}
)
aggregate_attack_surface_task.apply_async(
kwargs={"tenant_id": tenant_id, "scan_id": scan_id}
)
chain(
perform_scan_summary_task.si(tenant_id=tenant_id, scan_id=scan_id),
group(
aggregate_daily_severity_task.si(tenant_id=tenant_id, scan_id=scan_id),
generate_outputs_task.si(
scan_id=scan_id, provider_id=provider_id, tenant_id=tenant_id
),
generate_outputs_task.si(
scan_id=scan_id, provider_id=provider_id, tenant_id=tenant_id
),
group(
# Use optimized task that generates both reports with shared queries
@@ -95,6 +88,9 @@ def _perform_scan_complete_tasks(tenant_id: str, scan_id: str, provider_id: str)
),
),
).apply_async()
perform_attack_paths_scan_task.apply_async(
kwargs={"tenant_id": tenant_id, "scan_id": scan_id}
)
@shared_task(base=RLSTask, name="provider-connection-check")
@@ -149,7 +145,6 @@ def delete_provider_task(provider_id: str, tenant_id: str):
@shared_task(base=RLSTask, name="scan-perform", queue="scans")
@handle_provider_deletion
def perform_scan_task(
tenant_id: str, scan_id: str, provider_id: str, checks_to_execute: list[str] = None
):
@@ -182,7 +177,6 @@ def perform_scan_task(
@shared_task(base=RLSTask, bind=True, name="scan-perform-scheduled", queue="scans")
@handle_provider_deletion
def perform_scheduled_scan_task(self, tenant_id: str, provider_id: str):
"""
Task to perform a scheduled Prowler scan on a given provider.
@@ -288,11 +282,29 @@ def perform_scheduled_scan_task(self, tenant_id: str, provider_id: str):
@shared_task(name="scan-summary", queue="overview")
@handle_provider_deletion
def perform_scan_summary_task(tenant_id: str, scan_id: str):
return aggregate_findings(tenant_id=tenant_id, scan_id=scan_id)
# TODO: This task must be queued at the `attack-paths` queue, don't forget to add it to the `docker-entrypoint.sh` file
@shared_task(base=RLSTask, bind=True, name="attack-paths-scan-perform", queue="scans")
def perform_attack_paths_scan_task(self, tenant_id: str, scan_id: str):
"""
Execute an Attack Paths scan for the given provider within the current tenant RLS context.
Args:
self: The task instance (automatically passed when bind=True).
tenant_id (str): The tenant identifier for RLS context.
scan_id (str): The Prowler scan identifier for obtaining the tenant and provider context.
Returns:
Any: The result from `attack_paths_scan`, including any per-scan failure details.
"""
return attack_paths_scan(
tenant_id=tenant_id, scan_id=scan_id, task_id=self.request.id
)
@shared_task(name="tenant-deletion", queue="deletion", autoretry_for=(Exception,))
def delete_tenant_task(tenant_id: str):
return delete_tenant(pk=tenant_id)
@@ -304,7 +316,6 @@ def delete_tenant_task(tenant_id: str):
queue="scan-reports",
)
@set_tenant(keep_tenant=True)
@handle_provider_deletion
def generate_outputs_task(scan_id: str, provider_id: str, tenant_id: str):
"""
Process findings in batches and generate output files in multiple formats.
@@ -500,7 +511,6 @@ def generate_outputs_task(scan_id: str, provider_id: str, tenant_id: str):
@shared_task(name="backfill-scan-resource-summaries", queue="backfill")
@handle_provider_deletion
def backfill_scan_resource_summaries_task(tenant_id: str, scan_id: str):
"""
Tries to backfill the resource scan summaries table for a given scan.
@@ -513,7 +523,6 @@ def backfill_scan_resource_summaries_task(tenant_id: str, scan_id: str):
@shared_task(name="backfill-compliance-summaries", queue="backfill")
@handle_provider_deletion
def backfill_compliance_summaries_task(tenant_id: str, scan_id: str):
"""
Tries to backfill compliance overview summaries for a completed scan.
@@ -528,14 +537,7 @@ def backfill_compliance_summaries_task(tenant_id: str, scan_id: str):
return backfill_compliance_summaries(tenant_id=tenant_id, scan_id=scan_id)
@shared_task(name="backfill-daily-severity-summaries", queue="backfill")
def backfill_daily_severity_summaries_task(tenant_id: str, days: int = None):
"""Backfill DailySeveritySummary from historical scans. Use days param to limit scope."""
return backfill_daily_severity_summaries(tenant_id=tenant_id, days=days)
@shared_task(base=RLSTask, name="scan-compliance-overviews", queue="compliance")
@handle_provider_deletion
def create_compliance_requirements_task(tenant_id: str, scan_id: str):
"""
Creates detailed compliance requirement records for a scan.
@@ -551,29 +553,6 @@ def create_compliance_requirements_task(tenant_id: str, scan_id: str):
return create_compliance_requirements(tenant_id=tenant_id, scan_id=scan_id)
@shared_task(name="scan-attack-surface-overviews", queue="overview")
@handle_provider_deletion
def aggregate_attack_surface_task(tenant_id: str, scan_id: str):
"""
Creates attack surface overview records for a scan.
This task processes findings and aggregates them into attack surface categories
(internet-exposed, secrets, privilege-escalation, ec2-imdsv1) for quick overview queries.
Args:
tenant_id (str): The tenant ID for which to create records.
scan_id (str): The ID of the scan for which to create records.
"""
return aggregate_attack_surface(tenant_id=tenant_id, scan_id=scan_id)
@shared_task(name="scan-daily-severity", queue="overview")
@handle_provider_deletion
def aggregate_daily_severity_task(tenant_id: str, scan_id: str):
"""Aggregate scan severity into DailySeveritySummary for findings_severity/timeseries endpoint."""
return aggregate_daily_severity(tenant_id=tenant_id, scan_id=scan_id)
@shared_task(base=RLSTask, name="lighthouse-connection-check")
@set_tenant
def check_lighthouse_connection_task(lighthouse_config_id: str, tenant_id: str = None):
@@ -612,7 +591,6 @@ def refresh_lighthouse_provider_models_task(
@shared_task(name="integration-check")
@handle_provider_deletion
def check_integrations_task(tenant_id: str, provider_id: str, scan_id: str = None):
"""
Check and execute all configured integrations for a provider.
@@ -677,7 +655,6 @@ def check_integrations_task(tenant_id: str, provider_id: str, scan_id: str = Non
name="integration-s3",
queue="integrations",
)
@handle_provider_deletion
def s3_integration_task(
tenant_id: str,
provider_id: str,
@@ -737,7 +714,6 @@ def jira_integration_task(
name="scan-compliance-reports",
queue="scan-reports",
)
@handle_provider_deletion
def generate_compliance_reports_task(tenant_id: str, scan_id: str, provider_id: str):
"""
Optimized task to generate ThreatScore, ENS, and NIS2 reports with shared queries.
@@ -0,0 +1,416 @@
from contextlib import nullcontext
from types import SimpleNamespace
from unittest.mock import MagicMock, call, patch
import pytest
from api.models import (
AttackPathsScan,
Finding,
Provider,
Resource,
ResourceFindingMapping,
Scan,
StateChoices,
StatusChoices,
)
from prowler.lib.check.models import Severity
from tasks.jobs.attack_paths import prowler as prowler_module
from tasks.jobs.attack_paths.scan import run as attack_paths_run
@pytest.mark.django_db
class TestAttackPathsRun:
def test_run_success_flow(self, tenants_fixture, providers_fixture, scans_fixture):
tenant = tenants_fixture[0]
provider = providers_fixture[0]
provider.provider = Provider.ProviderChoices.AWS
provider.save()
scan = scans_fixture[0]
scan.provider = provider
scan.save()
attack_paths_scan = AttackPathsScan.objects.create(
tenant_id=tenant.id,
provider=provider,
scan=scan,
state=StateChoices.SCHEDULED,
)
mock_session = MagicMock()
session_ctx = MagicMock()
session_ctx.__enter__.return_value = mock_session
session_ctx.__exit__.return_value = False
ingestion_result = {"organizations": "warning"}
ingestion_fn = MagicMock(return_value=ingestion_result)
with (
patch(
"tasks.jobs.attack_paths.scan.rls_transaction",
new=lambda *args, **kwargs: nullcontext(),
),
patch(
"tasks.jobs.attack_paths.scan.initialize_prowler_provider",
return_value=MagicMock(_enabled_regions=["us-east-1"]),
),
patch(
"tasks.jobs.attack_paths.scan.graph_database.get_uri",
return_value="bolt://neo4j",
),
patch(
"tasks.jobs.attack_paths.scan.graph_database.get_database_name",
return_value="db-scan-id",
) as mock_get_db_name,
patch(
"tasks.jobs.attack_paths.scan.graph_database.create_database"
) as mock_create_db,
patch(
"tasks.jobs.attack_paths.scan.graph_database.get_session",
return_value=session_ctx,
) as mock_get_session,
patch(
"tasks.jobs.attack_paths.scan.cartography_create_indexes.run"
) as mock_cartography_indexes,
patch(
"tasks.jobs.attack_paths.scan.cartography_analysis.run"
) as mock_cartography_analysis,
patch(
"tasks.jobs.attack_paths.scan.cartography_ontology.run"
) as mock_cartography_ontology,
patch(
"tasks.jobs.attack_paths.scan.prowler.create_indexes"
) as mock_prowler_indexes,
patch(
"tasks.jobs.attack_paths.scan.prowler.analysis"
) as mock_prowler_analysis,
patch(
"tasks.jobs.attack_paths.scan.db_utils.retrieve_attack_paths_scan",
return_value=attack_paths_scan,
) as mock_retrieve_scan,
patch(
"tasks.jobs.attack_paths.scan.db_utils.starting_attack_paths_scan"
) as mock_starting,
patch(
"tasks.jobs.attack_paths.scan.db_utils.update_attack_paths_scan_progress"
) as mock_update_progress,
patch(
"tasks.jobs.attack_paths.scan.db_utils.finish_attack_paths_scan"
) as mock_finish,
patch(
"tasks.jobs.attack_paths.scan.get_cartography_ingestion_function",
return_value=ingestion_fn,
) as mock_get_ingestion,
patch(
"tasks.jobs.attack_paths.scan._call_within_event_loop",
side_effect=lambda fn, *a, **kw: fn(*a, **kw),
) as mock_event_loop,
):
result = attack_paths_run(str(tenant.id), str(scan.id), "task-123")
assert result == ingestion_result
mock_retrieve_scan.assert_called_once_with(str(tenant.id), str(scan.id))
mock_starting.assert_called_once()
config = mock_starting.call_args[0][2]
assert config.neo4j_database == "db-scan-id"
mock_create_db.assert_called_once_with("db-scan-id")
mock_get_session.assert_called_once_with("db-scan-id")
mock_cartography_indexes.assert_called_once_with(mock_session, config)
mock_prowler_indexes.assert_called_once_with(mock_session)
mock_cartography_analysis.assert_called_once_with(mock_session, config)
mock_cartography_ontology.assert_called_once_with(mock_session, config)
mock_prowler_analysis.assert_called_once_with(
mock_session,
provider,
str(scan.id),
config,
)
assert mock_get_ingestion.call_args_list == [
call(provider.provider),
call(provider.provider),
]
mock_event_loop.assert_called_once()
mock_update_progress.assert_any_call(attack_paths_scan, 1)
mock_update_progress.assert_any_call(attack_paths_scan, 2)
mock_update_progress.assert_any_call(attack_paths_scan, 95)
mock_finish.assert_called_once_with(
attack_paths_scan, StateChoices.COMPLETED, ingestion_result
)
mock_get_db_name.assert_called_once_with(attack_paths_scan.id)
def test_run_failure_marks_scan_failed(
self, tenants_fixture, providers_fixture, scans_fixture
):
tenant = tenants_fixture[0]
provider = providers_fixture[0]
provider.provider = Provider.ProviderChoices.AWS
provider.save()
scan = scans_fixture[0]
scan.provider = provider
scan.save()
attack_paths_scan = AttackPathsScan.objects.create(
tenant_id=tenant.id,
provider=provider,
scan=scan,
state=StateChoices.SCHEDULED,
)
mock_session = MagicMock()
session_ctx = MagicMock()
session_ctx.__enter__.return_value = mock_session
session_ctx.__exit__.return_value = False
ingestion_fn = MagicMock(side_effect=RuntimeError("ingestion boom"))
with (
patch(
"tasks.jobs.attack_paths.scan.rls_transaction",
new=lambda *args, **kwargs: nullcontext(),
),
patch(
"tasks.jobs.attack_paths.scan.initialize_prowler_provider",
return_value=MagicMock(_enabled_regions=["us-east-1"]),
),
patch("tasks.jobs.attack_paths.scan.graph_database.get_uri"),
patch(
"tasks.jobs.attack_paths.scan.graph_database.get_database_name",
return_value="db-scan-id",
),
patch("tasks.jobs.attack_paths.scan.graph_database.create_database"),
patch(
"tasks.jobs.attack_paths.scan.graph_database.get_session",
return_value=session_ctx,
),
patch("tasks.jobs.attack_paths.scan.cartography_create_indexes.run"),
patch("tasks.jobs.attack_paths.scan.cartography_analysis.run"),
patch("tasks.jobs.attack_paths.scan.prowler.create_indexes"),
patch("tasks.jobs.attack_paths.scan.prowler.analysis"),
patch(
"tasks.jobs.attack_paths.scan.db_utils.retrieve_attack_paths_scan",
return_value=attack_paths_scan,
),
patch("tasks.jobs.attack_paths.scan.db_utils.starting_attack_paths_scan"),
patch(
"tasks.jobs.attack_paths.scan.db_utils.update_attack_paths_scan_progress"
),
patch(
"tasks.jobs.attack_paths.scan.db_utils.finish_attack_paths_scan"
) as mock_finish,
patch(
"tasks.jobs.attack_paths.scan.get_cartography_ingestion_function",
return_value=ingestion_fn,
),
patch(
"tasks.jobs.attack_paths.scan._call_within_event_loop",
side_effect=lambda fn, *a, **kw: fn(*a, **kw),
),
patch(
"tasks.jobs.attack_paths.scan.utils.stringify_exception",
return_value="Cartography failed: ingestion boom",
),
):
with pytest.raises(RuntimeError, match="ingestion boom"):
attack_paths_run(str(tenant.id), str(scan.id), "task-456")
failure_args = mock_finish.call_args[0]
assert failure_args[0] is attack_paths_scan
assert failure_args[1] == StateChoices.FAILED
assert failure_args[2] == {
"global_cartography_error": "Cartography failed: ingestion boom"
}
def test_run_returns_early_for_unsupported_provider(self, tenants_fixture):
tenant = tenants_fixture[0]
provider = Provider.objects.create(
provider=Provider.ProviderChoices.GCP,
uid="gcp-account",
alias="gcp",
tenant_id=tenant.id,
)
scan = Scan.objects.create(
name="GCP Scan",
provider=provider,
trigger=Scan.TriggerChoices.MANUAL,
state=StateChoices.AVAILABLE,
tenant_id=tenant.id,
)
with (
patch(
"tasks.jobs.attack_paths.scan.rls_transaction",
new=lambda *args, **kwargs: nullcontext(),
),
patch(
"tasks.jobs.attack_paths.scan.initialize_prowler_provider",
return_value=MagicMock(),
),
patch(
"tasks.jobs.attack_paths.scan.get_cartography_ingestion_function",
return_value=None,
) as mock_get_ingestion,
patch(
"tasks.jobs.attack_paths.scan.db_utils.retrieve_attack_paths_scan"
) as mock_retrieve,
):
result = attack_paths_run(str(tenant.id), str(scan.id), "task-789")
assert result == {}
mock_get_ingestion.assert_called_once_with(provider.provider)
mock_retrieve.assert_not_called()
@pytest.mark.django_db
class TestAttackPathsProwlerHelpers:
def test_create_indexes_executes_all_statements(self):
mock_session = MagicMock()
with patch("tasks.jobs.attack_paths.prowler.run_write_query") as mock_run_write:
prowler_module.create_indexes(mock_session)
assert mock_run_write.call_count == len(prowler_module.INDEX_STATEMENTS)
mock_run_write.assert_has_calls(
[call(mock_session, stmt) for stmt in prowler_module.INDEX_STATEMENTS]
)
def test_load_findings_batches_requests(self, providers_fixture):
provider = providers_fixture[0]
provider.provider = Provider.ProviderChoices.AWS
provider.save()
findings = [
{"id": "1", "resource_uid": "r-1"},
{"id": "2", "resource_uid": "r-2"},
]
config = SimpleNamespace(update_tag=12345)
mock_session = MagicMock()
with (
patch.object(prowler_module, "BATCH_SIZE", 1),
patch(
"tasks.jobs.attack_paths.prowler.get_root_node_label",
return_value="AWSAccount",
),
patch(
"tasks.jobs.attack_paths.prowler.get_node_uid_field",
return_value="arn",
),
):
prowler_module.load_findings(mock_session, findings, provider, config)
assert mock_session.run.call_count == 2
for call_args in mock_session.run.call_args_list:
params = call_args.args[1]
assert params["provider_uid"] == str(provider.uid)
assert params["last_updated"] == config.update_tag
assert "findings_data" in params
def test_cleanup_findings_runs_batches(self, providers_fixture):
provider = providers_fixture[0]
config = SimpleNamespace(update_tag=1024)
mock_session = MagicMock()
first_batch = MagicMock()
first_batch.single.return_value = {"deleted_findings_count": 3}
second_batch = MagicMock()
second_batch.single.return_value = {"deleted_findings_count": 0}
mock_session.run.side_effect = [first_batch, second_batch]
prowler_module.cleanup_findings(mock_session, provider, config)
assert mock_session.run.call_count == 2
params = mock_session.run.call_args.args[1]
assert params["provider_uid"] == str(provider.uid)
assert params["last_updated"] == config.update_tag
def test_get_provider_last_scan_findings_returns_latest_scan_data(
self,
tenants_fixture,
providers_fixture,
):
tenant = tenants_fixture[0]
provider = providers_fixture[0]
provider.provider = Provider.ProviderChoices.AWS
provider.save()
resource = Resource.objects.create(
tenant_id=tenant.id,
provider=provider,
uid="resource-uid",
name="Resource",
region="us-east-1",
service="ec2",
type="instance",
)
older_scan = Scan.objects.create(
name="Older",
provider=provider,
trigger=Scan.TriggerChoices.MANUAL,
state=StateChoices.COMPLETED,
tenant_id=tenant.id,
)
old_finding = Finding.objects.create(
tenant_id=tenant.id,
uid="older-finding",
scan=older_scan,
delta=Finding.DeltaChoices.NEW,
status=StatusChoices.PASS,
status_extended="ok",
severity=Severity.low,
impact=Severity.low,
impact_extended="",
raw_result={},
check_id="check-old",
check_metadata={"checktitle": "Old"},
first_seen_at=older_scan.inserted_at,
)
ResourceFindingMapping.objects.create(
tenant_id=tenant.id,
resource=resource,
finding=old_finding,
)
latest_scan = Scan.objects.create(
name="Latest",
provider=provider,
trigger=Scan.TriggerChoices.MANUAL,
state=StateChoices.COMPLETED,
tenant_id=tenant.id,
)
finding = Finding.objects.create(
tenant_id=tenant.id,
uid="finding-uid",
scan=latest_scan,
delta=Finding.DeltaChoices.NEW,
status=StatusChoices.FAIL,
status_extended="failed",
severity=Severity.high,
impact=Severity.high,
impact_extended="",
raw_result={},
check_id="check-1",
check_metadata={"checktitle": "Check title"},
first_seen_at=latest_scan.inserted_at,
)
ResourceFindingMapping.objects.create(
tenant_id=tenant.id,
resource=resource,
finding=finding,
)
latest_scan.refresh_from_db()
with patch(
"tasks.jobs.attack_paths.prowler.rls_transaction",
new=lambda *args, **kwargs: nullcontext(),
):
findings_data = prowler_module.get_provider_last_scan_findings(
provider,
str(latest_scan.id),
)
assert len(findings_data) == 1
finding_dict = findings_data[0]
assert finding_dict["id"] == str(finding.id)
assert finding_dict["resource_uid"] == resource.uid
assert finding_dict["check_title"] == "Check title"
assert finding_dict["scan_id"] == str(latest_scan.id)
+98 -30
View File
@@ -1,27 +1,60 @@
from unittest.mock import call, patch
import pytest
from django.core.exceptions import ObjectDoesNotExist
from tasks.jobs.deletion import delete_provider, delete_tenant
from api.models import Provider, Tenant
from tasks.jobs.deletion import delete_provider, delete_tenant
@pytest.mark.django_db
class TestDeleteProvider:
def test_delete_provider_success(self, providers_fixture):
instance = providers_fixture[0]
tenant_id = str(instance.tenant_id)
result = delete_provider(tenant_id, instance.id)
with patch(
"tasks.jobs.deletion.get_provider_graph_database_names"
) as mock_get_provider_graph_database_names, patch(
"tasks.jobs.deletion.graph_database.drop_database"
) as mock_drop_database:
graph_db_names = ["graph-db-1", "graph-db-2"]
mock_get_provider_graph_database_names.return_value = graph_db_names
assert result
with pytest.raises(ObjectDoesNotExist):
Provider.objects.get(pk=instance.id)
instance = providers_fixture[0]
tenant_id = str(instance.tenant_id)
result = delete_provider(tenant_id, instance.id)
assert result
with pytest.raises(ObjectDoesNotExist):
Provider.objects.get(pk=instance.id)
mock_get_provider_graph_database_names.assert_called_once_with(
tenant_id, instance.id
)
mock_drop_database.assert_has_calls(
[call(graph_db_name) for graph_db_name in graph_db_names]
)
def test_delete_provider_does_not_exist(self, tenants_fixture):
tenant_id = str(tenants_fixture[0].id)
non_existent_pk = "babf6796-cfcc-4fd3-9dcf-88d012247645"
with patch(
"tasks.jobs.deletion.get_provider_graph_database_names"
) as mock_get_provider_graph_database_names, patch(
"tasks.jobs.deletion.graph_database.drop_database"
) as mock_drop_database:
graph_db_names = ["graph-db-1"]
mock_get_provider_graph_database_names.return_value = graph_db_names
with pytest.raises(ObjectDoesNotExist):
delete_provider(tenant_id, non_existent_pk)
tenant_id = str(tenants_fixture[0].id)
non_existent_pk = "babf6796-cfcc-4fd3-9dcf-88d012247645"
with pytest.raises(ObjectDoesNotExist):
delete_provider(tenant_id, non_existent_pk)
mock_get_provider_graph_database_names.assert_called_once_with(
tenant_id, non_existent_pk
)
mock_drop_database.assert_has_calls(
[call(graph_db_name) for graph_db_name in graph_db_names]
)
@pytest.mark.django_db
@@ -30,33 +63,68 @@ class TestDeleteTenant:
"""
Test successful deletion of a tenant and its related data.
"""
tenant = tenants_fixture[0]
providers = Provider.objects.filter(tenant_id=tenant.id)
with patch(
"tasks.jobs.deletion.get_provider_graph_database_names"
) as mock_get_provider_graph_database_names, patch(
"tasks.jobs.deletion.graph_database.drop_database"
) as mock_drop_database:
tenant = tenants_fixture[0]
providers = list(Provider.objects.filter(tenant_id=tenant.id))
# Ensure the tenant and related providers exist before deletion
assert Tenant.objects.filter(id=tenant.id).exists()
assert providers.exists()
graph_db_names_per_provider = [
[f"graph-db-{provider.id}"] for provider in providers
]
mock_get_provider_graph_database_names.side_effect = (
graph_db_names_per_provider
)
# Call the function and validate the result
deletion_summary = delete_tenant(tenant.id)
# Ensure the tenant and related providers exist before deletion
assert Tenant.objects.filter(id=tenant.id).exists()
assert providers
assert deletion_summary is not None
assert not Tenant.objects.filter(id=tenant.id).exists()
assert not Provider.objects.filter(tenant_id=tenant.id).exists()
# Call the function and validate the result
deletion_summary = delete_tenant(tenant.id)
assert deletion_summary is not None
assert not Tenant.objects.filter(id=tenant.id).exists()
assert not Provider.objects.filter(tenant_id=tenant.id).exists()
expected_calls = [
call(provider.tenant_id, provider.id) for provider in providers
]
mock_get_provider_graph_database_names.assert_has_calls(
expected_calls, any_order=True
)
assert mock_get_provider_graph_database_names.call_count == len(
expected_calls
)
expected_drop_calls = [
call(graph_db_name[0]) for graph_db_name in graph_db_names_per_provider
]
mock_drop_database.assert_has_calls(expected_drop_calls, any_order=True)
assert mock_drop_database.call_count == len(expected_drop_calls)
def test_delete_tenant_with_no_providers(self, tenants_fixture):
"""
Test deletion of a tenant with no related providers.
"""
tenant = tenants_fixture[1] # Assume this tenant has no providers
providers = Provider.objects.filter(tenant_id=tenant.id)
with patch(
"tasks.jobs.deletion.get_provider_graph_database_names"
) as mock_get_provider_graph_database_names, patch(
"tasks.jobs.deletion.graph_database.drop_database"
) as mock_drop_database:
tenant = tenants_fixture[1] # Assume this tenant has no providers
providers = Provider.objects.filter(tenant_id=tenant.id)
# Ensure the tenant exists but has no related providers
assert Tenant.objects.filter(id=tenant.id).exists()
assert not providers.exists()
# Ensure the tenant exists but has no related providers
assert Tenant.objects.filter(id=tenant.id).exists()
assert not providers.exists()
# Call the function and validate the result
deletion_summary = delete_tenant(tenant.id)
# Call the function and validate the result
deletion_summary = delete_tenant(tenant.id)
assert deletion_summary == {} # No providers, so empty summary
assert not Tenant.objects.filter(id=tenant.id).exists()
assert deletion_summary == {} # No providers, so empty summary
assert not Tenant.objects.filter(id=tenant.id).exists()
mock_get_provider_graph_database_names.assert_not_called()
mock_drop_database.assert_not_called()
+1 -286
View File
@@ -9,17 +9,14 @@ from unittest.mock import MagicMock, patch
import pytest
from tasks.jobs.scan import (
_ATTACK_SURFACE_MAPPING_CACHE,
_aggregate_findings_by_region,
_copy_compliance_requirement_rows,
_create_compliance_summaries,
_create_finding_delta,
_get_attack_surface_mapping_from_provider,
_normalized_compliance_key,
_persist_compliance_requirement_rows,
_process_finding_micro_batch,
_store_resources,
aggregate_attack_surface,
aggregate_findings,
create_compliance_requirements,
perform_prowler_scan,
@@ -3341,10 +3338,7 @@ class TestAggregateFindingsByRegion:
# Verify filter was called with muted=False
mock_findings_filter.assert_called_once_with(
tenant_id=tenant_id,
scan_id=scan_id,
muted=False,
status__in=["PASS", "FAIL"],
tenant_id=tenant_id, scan_id=scan_id, muted=False
)
@patch("tasks.jobs.scan.Finding.all_objects.filter")
@@ -3477,282 +3471,3 @@ class TestAggregateFindingsByRegion:
assert check_status_by_region == {}
assert findings_count_by_compliance == {}
@pytest.mark.django_db
class TestAggregateAttackSurface:
"""Test aggregate_attack_surface function and related caching."""
def setup_method(self):
"""Clear cache before each test."""
_ATTACK_SURFACE_MAPPING_CACHE.clear()
def teardown_method(self):
"""Clear cache after each test."""
_ATTACK_SURFACE_MAPPING_CACHE.clear()
@patch("tasks.jobs.scan.CheckMetadata.list")
def test_get_attack_surface_mapping_caches_result(self, mock_check_metadata_list):
"""Test that _get_attack_surface_mapping_from_provider caches results."""
mock_check_metadata_list.return_value = {"check_internet_exposed_1"}
# First call should hit CheckMetadata.list
result1 = _get_attack_surface_mapping_from_provider("aws")
assert mock_check_metadata_list.call_count == 2 # internet-exposed, secrets
# Second call should use cache
result2 = _get_attack_surface_mapping_from_provider("aws")
assert mock_check_metadata_list.call_count == 2 # No additional calls
assert result1 is result2
assert "aws" in _ATTACK_SURFACE_MAPPING_CACHE
@patch("tasks.jobs.scan.CheckMetadata.list")
def test_get_attack_surface_mapping_different_providers(
self, mock_check_metadata_list
):
"""Test caching works independently for different providers."""
mock_check_metadata_list.return_value = {"check_1"}
_get_attack_surface_mapping_from_provider("aws")
aws_call_count = mock_check_metadata_list.call_count
_get_attack_surface_mapping_from_provider("gcp")
gcp_call_count = mock_check_metadata_list.call_count
# Both providers should have made calls
assert gcp_call_count > aws_call_count
assert "aws" in _ATTACK_SURFACE_MAPPING_CACHE
assert "gcp" in _ATTACK_SURFACE_MAPPING_CACHE
@patch("tasks.jobs.scan.CheckMetadata.list")
def test_get_attack_surface_mapping_returns_hardcoded_checks(
self, mock_check_metadata_list
):
"""Test that hardcoded check IDs are returned for privilege-escalation and ec2-imdsv1."""
mock_check_metadata_list.return_value = set()
result = _get_attack_surface_mapping_from_provider("aws")
# Hardcoded checks should be present
assert (
"iam_policy_allows_privilege_escalation" in result["privilege-escalation"]
)
assert (
"iam_inline_policy_allows_privilege_escalation"
in result["privilege-escalation"]
)
assert "ec2_instance_imdsv2_enabled" in result["ec2-imdsv1"]
@patch("tasks.jobs.scan.AttackSurfaceOverview.objects.bulk_create")
@patch("tasks.jobs.scan.Finding.all_objects.filter")
@patch("tasks.jobs.scan._get_attack_surface_mapping_from_provider")
@patch("tasks.jobs.scan.rls_transaction")
def test_aggregate_attack_surface_creates_overview_records(
self,
mock_rls_transaction,
mock_get_mapping,
mock_findings_filter,
mock_bulk_create,
tenants_fixture,
scans_fixture,
):
"""Test that aggregate_attack_surface creates AttackSurfaceOverview records."""
tenant = tenants_fixture[0]
scan = scans_fixture[0]
scan.provider.provider = "aws"
scan.provider.save()
mock_get_mapping.return_value = {
"internet-exposed": {"check_internet_1", "check_internet_2"},
"secrets": {"check_secrets_1"},
"privilege-escalation": {"check_privesc_1"},
"ec2-imdsv1": {"check_imdsv1_1"},
}
# Mock findings aggregation
mock_queryset = MagicMock()
mock_queryset.values.return_value = mock_queryset
mock_queryset.annotate.return_value = [
{"check_id": "check_internet_1", "total": 10, "failed": 3, "muted": 1},
{"check_id": "check_secrets_1", "total": 5, "failed": 2, "muted": 0},
]
ctx = MagicMock()
ctx.__enter__.return_value = None
ctx.__exit__.return_value = False
mock_rls_transaction.return_value = ctx
mock_findings_filter.return_value = mock_queryset
aggregate_attack_surface(str(tenant.id), str(scan.id))
mock_bulk_create.assert_called_once()
args, kwargs = mock_bulk_create.call_args
objects = args[0]
# Should create records for internet-exposed and secrets (the ones with findings)
assert len(objects) == 2
assert kwargs["batch_size"] == 500
@patch("tasks.jobs.scan.AttackSurfaceOverview.objects.bulk_create")
@patch("tasks.jobs.scan.Finding.all_objects.filter")
@patch("tasks.jobs.scan._get_attack_surface_mapping_from_provider")
@patch("tasks.jobs.scan.rls_transaction")
def test_aggregate_attack_surface_skips_unsupported_provider(
self,
mock_rls_transaction,
mock_get_mapping,
mock_findings_filter,
mock_bulk_create,
tenants_fixture,
scans_fixture,
):
"""Test that ec2-imdsv1 is skipped for non-AWS providers."""
tenant = tenants_fixture[0]
scan = scans_fixture[0]
scan.provider.provider = "gcp"
scan.provider.uid = "gcp-test-project-id"
scan.provider.save()
mock_get_mapping.return_value = {
"internet-exposed": {"check_internet_1"},
"secrets": {"check_secrets_1"},
"privilege-escalation": set(), # Not supported for GCP
"ec2-imdsv1": {"check_imdsv1_1"}, # Should be skipped for GCP
}
mock_queryset = MagicMock()
mock_queryset.values.return_value = mock_queryset
mock_queryset.annotate.return_value = [
{"check_id": "check_internet_1", "total": 5, "failed": 1, "muted": 0},
]
ctx = MagicMock()
ctx.__enter__.return_value = None
ctx.__exit__.return_value = False
mock_rls_transaction.return_value = ctx
mock_findings_filter.return_value = mock_queryset
aggregate_attack_surface(str(tenant.id), str(scan.id))
# ec2-imdsv1 check_ids should not be in the filter
filter_call = mock_findings_filter.call_args
check_ids_in_filter = filter_call[1]["check_id__in"]
assert "check_imdsv1_1" not in check_ids_in_filter
@patch("tasks.jobs.scan.AttackSurfaceOverview.objects.bulk_create")
@patch("tasks.jobs.scan.Finding.all_objects.filter")
@patch("tasks.jobs.scan._get_attack_surface_mapping_from_provider")
@patch("tasks.jobs.scan.rls_transaction")
def test_aggregate_attack_surface_no_findings(
self,
mock_rls_transaction,
mock_get_mapping,
mock_findings_filter,
mock_bulk_create,
tenants_fixture,
scans_fixture,
):
"""Test that no records are created when there are no findings."""
tenant = tenants_fixture[0]
scan = scans_fixture[0]
mock_get_mapping.return_value = {
"internet-exposed": {"check_1"},
"secrets": {"check_2"},
"privilege-escalation": set(),
"ec2-imdsv1": set(),
}
mock_queryset = MagicMock()
mock_queryset.values.return_value = mock_queryset
mock_queryset.annotate.return_value = [] # No findings
ctx = MagicMock()
ctx.__enter__.return_value = None
ctx.__exit__.return_value = False
mock_rls_transaction.return_value = ctx
mock_findings_filter.return_value = mock_queryset
aggregate_attack_surface(str(tenant.id), str(scan.id))
mock_bulk_create.assert_not_called()
@patch("tasks.jobs.scan.AttackSurfaceOverview.objects.bulk_create")
@patch("tasks.jobs.scan.Finding.all_objects.filter")
@patch("tasks.jobs.scan._get_attack_surface_mapping_from_provider")
@patch("tasks.jobs.scan.rls_transaction")
def test_aggregate_attack_surface_aggregates_counts_correctly(
self,
mock_rls_transaction,
mock_get_mapping,
mock_findings_filter,
mock_bulk_create,
tenants_fixture,
scans_fixture,
):
"""Test that counts from multiple check_ids are aggregated per attack surface type."""
tenant = tenants_fixture[0]
scan = scans_fixture[0]
scan.provider.provider = "aws"
scan.provider.save()
mock_get_mapping.return_value = {
"internet-exposed": {"check_internet_1", "check_internet_2"},
"secrets": set(),
"privilege-escalation": set(),
"ec2-imdsv1": set(),
}
mock_queryset = MagicMock()
mock_queryset.values.return_value = mock_queryset
mock_queryset.annotate.return_value = [
{"check_id": "check_internet_1", "total": 10, "failed": 3, "muted": 1},
{"check_id": "check_internet_2", "total": 5, "failed": 2, "muted": 0},
]
ctx = MagicMock()
ctx.__enter__.return_value = None
ctx.__exit__.return_value = False
mock_rls_transaction.return_value = ctx
mock_findings_filter.return_value = mock_queryset
aggregate_attack_surface(str(tenant.id), str(scan.id))
args, kwargs = mock_bulk_create.call_args
objects = args[0]
assert len(objects) == 1
overview = objects[0]
assert overview.attack_surface_type == "internet-exposed"
assert overview.total_findings == 15 # 10 + 5
assert overview.failed_findings == 5 # 3 + 2
assert overview.muted_failed_findings == 1 # 1 + 0
@patch("tasks.jobs.scan.Scan.all_objects.select_related")
@patch("tasks.jobs.scan.rls_transaction")
def test_aggregate_attack_surface_uses_select_related(
self, mock_rls_transaction, mock_select_related, tenants_fixture, scans_fixture
):
"""Test that select_related is used to avoid N+1 query."""
tenant = tenants_fixture[0]
scan = scans_fixture[0]
mock_scan = MagicMock()
mock_scan.provider.provider = "aws"
mock_select_related.return_value.get.return_value = mock_scan
ctx = MagicMock()
ctx.__enter__.return_value = None
ctx.__exit__.return_value = False
mock_rls_transaction.return_value = ctx
with patch(
"tasks.jobs.scan._get_attack_surface_mapping_from_provider"
) as mock_map:
mock_map.return_value = {}
aggregate_attack_surface(str(tenant.id), str(scan.id))
mock_select_related.assert_called_once_with("provider")
+77 -251
View File
@@ -1,220 +1,28 @@
import uuid
from contextlib import contextmanager
from unittest.mock import MagicMock, patch
import openai
import pytest
from botocore.exceptions import ClientError
from tasks.jobs.lighthouse_providers import (
_create_bedrock_client,
_extract_bedrock_credentials,
)
from tasks.tasks import (
_perform_scan_complete_tasks,
check_integrations_task,
check_lighthouse_provider_connection_task,
generate_outputs_task,
refresh_lighthouse_provider_models_task,
s3_integration_task,
security_hub_integration_task,
)
from api.models import (
Integration,
LighthouseProviderConfiguration,
LighthouseProviderModels,
)
@pytest.mark.django_db
class TestExtractBedrockCredentials:
"""Unit tests for _extract_bedrock_credentials helper function."""
def test_extract_access_key_credentials(self, tenants_fixture):
"""Test extraction of access key + secret key credentials."""
provider_cfg = LighthouseProviderConfiguration(
tenant_id=tenants_fixture[0].id,
provider_type=LighthouseProviderConfiguration.LLMProviderChoices.BEDROCK,
is_active=True,
)
provider_cfg.credentials_decoded = {
"access_key_id": "AKIAIOSFODNN7EXAMPLE",
"secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
"region": "us-east-1",
}
provider_cfg.save()
result = _extract_bedrock_credentials(provider_cfg)
assert result is not None
assert result["access_key_id"] == "AKIAIOSFODNN7EXAMPLE"
assert result["secret_access_key"] == "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
assert result["region"] == "us-east-1"
assert "api_key" not in result
def test_extract_api_key_credentials(self, tenants_fixture):
"""Test extraction of API key (bearer token) credentials."""
valid_api_key = "ABSKQmVkcm9ja0FQSUtleS" + ("A" * 110)
provider_cfg = LighthouseProviderConfiguration(
tenant_id=tenants_fixture[0].id,
provider_type=LighthouseProviderConfiguration.LLMProviderChoices.BEDROCK,
is_active=True,
)
provider_cfg.credentials_decoded = {
"api_key": valid_api_key,
"region": "us-west-2",
}
provider_cfg.save()
result = _extract_bedrock_credentials(provider_cfg)
assert result is not None
assert result["api_key"] == valid_api_key
assert result["region"] == "us-west-2"
assert "access_key_id" not in result
assert "secret_access_key" not in result
def test_api_key_takes_precedence_over_access_keys(self, tenants_fixture):
"""Test that API key is preferred when both auth methods are present."""
valid_api_key = "ABSKQmVkcm9ja0FQSUtleS" + ("B" * 110)
provider_cfg = LighthouseProviderConfiguration(
tenant_id=tenants_fixture[0].id,
provider_type=LighthouseProviderConfiguration.LLMProviderChoices.BEDROCK,
is_active=True,
)
provider_cfg.credentials_decoded = {
"api_key": valid_api_key,
"access_key_id": "AKIAIOSFODNN7EXAMPLE",
"secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
"region": "eu-west-1",
}
provider_cfg.save()
result = _extract_bedrock_credentials(provider_cfg)
assert result is not None
assert result["api_key"] == valid_api_key
assert result["region"] == "eu-west-1"
assert "access_key_id" not in result
def test_missing_region_returns_none(self, tenants_fixture):
"""Test that missing region returns None."""
provider_cfg = LighthouseProviderConfiguration(
tenant_id=tenants_fixture[0].id,
provider_type=LighthouseProviderConfiguration.LLMProviderChoices.BEDROCK,
is_active=True,
)
provider_cfg.credentials_decoded = {
"api_key": "ABSKQmVkcm9ja0FQSUtleS" + ("A" * 110),
}
provider_cfg.save()
result = _extract_bedrock_credentials(provider_cfg)
assert result is None
def test_empty_credentials_returns_none(self, tenants_fixture):
"""Test that empty credentials dict returns None (region only is not enough)."""
provider_cfg = LighthouseProviderConfiguration(
tenant_id=tenants_fixture[0].id,
provider_type=LighthouseProviderConfiguration.LLMProviderChoices.BEDROCK,
is_active=True,
)
# Only region, no auth credentials - should return None
provider_cfg.credentials_decoded = {
"region": "us-east-1",
}
provider_cfg.save()
result = _extract_bedrock_credentials(provider_cfg)
assert result is None
def test_non_dict_credentials_returns_none(self, tenants_fixture):
"""Test that non-dict credentials returns None."""
provider_cfg = LighthouseProviderConfiguration(
tenant_id=tenants_fixture[0].id,
provider_type=LighthouseProviderConfiguration.LLMProviderChoices.BEDROCK,
is_active=True,
)
# Store valid credentials first to pass model validation
provider_cfg.credentials_decoded = {
"api_key": "ABSKQmVkcm9ja0FQSUtleS" + ("A" * 110),
"region": "us-east-1",
}
provider_cfg.save()
# Mock the credentials_decoded property to return a non-dict value
# This simulates corrupted/invalid stored data
with patch.object(
type(provider_cfg),
"credentials_decoded",
new_callable=lambda: property(lambda self: "invalid"),
):
result = _extract_bedrock_credentials(provider_cfg)
assert result is None
class TestCreateBedrockClient:
"""Unit tests for _create_bedrock_client helper function."""
@patch("tasks.jobs.lighthouse_providers.boto3.client")
def test_create_client_with_access_keys(self, mock_boto_client):
"""Test creating client with access key authentication."""
mock_client = MagicMock()
mock_boto_client.return_value = mock_client
creds = {
"access_key_id": "AKIAIOSFODNN7EXAMPLE",
"secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
"region": "us-east-1",
}
result = _create_bedrock_client(creds)
assert result == mock_client
mock_boto_client.assert_called_once_with(
service_name="bedrock",
region_name="us-east-1",
aws_access_key_id="AKIAIOSFODNN7EXAMPLE",
aws_secret_access_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
)
@patch("tasks.jobs.lighthouse_providers.Config")
@patch("tasks.jobs.lighthouse_providers.boto3.client")
def test_create_client_with_api_key(self, mock_boto_client, mock_config):
"""Test creating client with API key authentication."""
mock_client = MagicMock()
mock_events = MagicMock()
mock_client.meta.events = mock_events
mock_boto_client.return_value = mock_client
mock_config_instance = MagicMock()
mock_config.return_value = mock_config_instance
valid_api_key = "ABSKQmVkcm9ja0FQSUtleS" + ("A" * 110)
creds = {
"api_key": valid_api_key,
"region": "us-west-2",
}
result = _create_bedrock_client(creds)
assert result == mock_client
mock_boto_client.assert_called_once_with(
service_name="bedrock",
region_name="us-west-2",
config=mock_config_instance,
)
mock_events.register.assert_called_once()
call_args = mock_events.register.call_args
assert call_args[0][0] == "before-send.*.*"
# Verify handler injects bearer token
handler_fn = call_args[0][1]
mock_request = MagicMock()
mock_request.headers = {}
handler_fn(mock_request)
assert mock_request.headers["Authorization"] == f"Bearer {valid_api_key}"
from tasks.tasks import (
_perform_scan_complete_tasks,
check_integrations_task,
check_lighthouse_provider_connection_task,
generate_outputs_task,
perform_attack_paths_scan_task,
refresh_lighthouse_provider_models_task,
s3_integration_task,
security_hub_integration_task,
)
# TODO Move this to outputs/reports jobs
@@ -725,7 +533,7 @@ class TestGenerateOutputs:
class TestScanCompleteTasks:
@patch("tasks.tasks.aggregate_attack_surface_task.apply_async")
@patch("tasks.tasks.perform_attack_paths_scan_task.apply_async")
@patch("tasks.tasks.create_compliance_requirements_task.apply_async")
@patch("tasks.tasks.perform_scan_summary_task.si")
@patch("tasks.tasks.generate_outputs_task.si")
@@ -738,7 +546,7 @@ class TestScanCompleteTasks:
mock_outputs_task,
mock_scan_summary_task,
mock_compliance_requirements_task,
mock_attack_surface_task,
mock_attack_paths_task,
):
"""Test that scan complete tasks are properly orchestrated with optimized reports."""
_perform_scan_complete_tasks("tenant-id", "scan-id", "provider-id")
@@ -748,11 +556,6 @@ class TestScanCompleteTasks:
kwargs={"tenant_id": "tenant-id", "scan_id": "scan-id"},
)
# Verify attack surface task is called
mock_attack_surface_task.assert_called_once_with(
kwargs={"tenant_id": "tenant-id", "scan_id": "scan-id"},
)
# Verify scan summary task is called
mock_scan_summary_task.assert_called_once_with(
scan_id="scan-id",
@@ -780,6 +583,68 @@ class TestScanCompleteTasks:
scan_id="scan-id",
)
mock_attack_paths_task.assert_called_once_with(
kwargs={"tenant_id": "tenant-id", "scan_id": "scan-id"}
)
class TestAttackPathsTasks:
@staticmethod
@contextmanager
def _override_task_request(task, **attrs):
request = task.request
sentinel = object()
previous = {key: getattr(request, key, sentinel) for key in attrs}
for key, value in attrs.items():
setattr(request, key, value)
try:
yield
finally:
for key, prev in previous.items():
if prev is sentinel:
if hasattr(request, key):
delattr(request, key)
else:
setattr(request, key, prev)
def test_perform_attack_paths_scan_task_calls_runner(self):
with (
patch("tasks.tasks.attack_paths_scan") as mock_attack_paths_scan,
self._override_task_request(
perform_attack_paths_scan_task, id="celery-task-id"
),
):
mock_attack_paths_scan.return_value = {"status": "ok"}
result = perform_attack_paths_scan_task.run(
tenant_id="tenant-id", scan_id="scan-id"
)
mock_attack_paths_scan.assert_called_once_with(
tenant_id="tenant-id", scan_id="scan-id", task_id="celery-task-id"
)
assert result == {"status": "ok"}
def test_perform_attack_paths_scan_task_propagates_exception(self):
with (
patch(
"tasks.tasks.attack_paths_scan",
side_effect=RuntimeError("Exception to propagate"),
) as mock_attack_paths_scan,
self._override_task_request(
perform_attack_paths_scan_task, id="celery-task-error"
),
):
with pytest.raises(RuntimeError, match="Exception to propagate"):
perform_attack_paths_scan_task.run(
tenant_id="tenant-id", scan_id="scan-id"
)
mock_attack_paths_scan.assert_called_once_with(
tenant_id="tenant-id", scan_id="scan-id", task_id="celery-task-error"
)
@pytest.mark.django_db
class TestCheckIntegrationsTask:
@@ -1348,16 +1213,6 @@ class TestCheckLighthouseProviderConnectionTask:
None,
{"connected": True, "error": None},
),
# Bedrock API key authentication
(
LighthouseProviderConfiguration.LLMProviderChoices.BEDROCK,
{
"api_key": "ABSKQmVkcm9ja0FQSUtleS" + ("A" * 110),
"region": "us-east-1",
},
None,
{"connected": True, "error": None},
),
],
)
def test_check_connection_success_all_providers(
@@ -1426,24 +1281,6 @@ class TestCheckLighthouseProviderConnectionTask:
"list_foundation_models",
),
),
# Bedrock API key authentication failure
(
LighthouseProviderConfiguration.LLMProviderChoices.BEDROCK,
{
"api_key": "ABSKQmVkcm9ja0FQSUtleS" + ("X" * 110),
"region": "us-east-1",
},
None,
ClientError(
{
"Error": {
"Code": "UnrecognizedClientException",
"Message": "Invalid API key",
}
},
"list_foundation_models",
),
),
],
)
def test_check_connection_api_failure(
@@ -1568,17 +1405,6 @@ class TestRefreshLighthouseProviderModelsTask:
{"openai.gpt-oss-120b-1:0": "gpt-oss-120b"},
1,
),
# Bedrock API key authentication
(
LighthouseProviderConfiguration.LLMProviderChoices.BEDROCK,
{
"api_key": "ABSKQmVkcm9ja0FQSUtleS" + ("A" * 110),
"region": "us-east-1",
},
None,
{"anthropic.claude-v3": "Claude 3"},
1,
),
],
)
def test_refresh_models_create_new(
Binary file not shown.

Before

Width:  |  Height:  |  Size: 90 KiB

@@ -1,24 +0,0 @@
import warnings
from dashboard.common_methods import get_section_containers_cis
warnings.filterwarnings("ignore")
def get_table(data):
aux = data[
[
"REQUIREMENTS_ID",
"REQUIREMENTS_DESCRIPTION",
"REQUIREMENTS_ATTRIBUTES_SECTION",
"CHECKID",
"STATUS",
"REGION",
"ACCOUNTID",
"RESOURCEID",
]
].copy()
return get_section_containers_cis(
aux, "REQUIREMENTS_ID", "REQUIREMENTS_ATTRIBUTES_SECTION"
)
-1
View File
@@ -61,7 +61,6 @@ def create_layout_overview(
html.Div(className="flex", id="gcp_card", n_clicks=0),
html.Div(className="flex", id="k8s_card", n_clicks=0),
html.Div(className="flex", id="m365_card", n_clicks=0),
html.Div(className="flex", id="alibabacloud_card", n_clicks=0),
],
className=f"grid gap-x-4 mb-[30px] sm:grid-cols-2 lg:grid-cols-{amount_providers}",
),
+2 -8
View File
@@ -78,8 +78,6 @@ def load_csv_files(csv_files):
result = result.replace("_KUBERNETES", " - KUBERNETES")
if "M65" in result:
result = result.replace("_M65", " - M65")
if "ALIBABACLOUD" in result:
result = result.replace("_ALIBABACLOUD", " - ALIBABACLOUD")
results.append(result)
unique_results = set(results)
@@ -127,7 +125,7 @@ if data is None:
)
else:
data["ASSESSMENTDATE"] = pd.to_datetime(data["ASSESSMENTDATE"], format="mixed")
data["ASSESSMENTDATE"] = pd.to_datetime(data["ASSESSMENTDATE"])
data["ASSESSMENT_TIME"] = data["ASSESSMENTDATE"].dt.strftime("%Y-%m-%d %H:%M:%S")
data_values = data["ASSESSMENT_TIME"].unique()
@@ -280,13 +278,9 @@ def display_data(
data["REQUIREMENTS_ATTRIBUTES_PROFILE"] = data[
"REQUIREMENTS_ATTRIBUTES_PROFILE"
].apply(lambda x: x.split(" - ")[0])
# Rename the column LOCATION to REGION for Alibaba Cloud
if "alibabacloud" in analytics_input:
data = data.rename(columns={"LOCATION": "REGION"})
# Filter the chosen level of the CIS
if is_level_1:
data = data[data["REQUIREMENTS_ATTRIBUTES_PROFILE"].str.contains("Level 1")]
data = data[data["REQUIREMENTS_ATTRIBUTES_PROFILE"] == "Level 1"]
# Rename the column PROJECTID to ACCOUNTID for GCP
if data.columns.str.contains("PROJECTID").any():
+1 -52
View File
@@ -79,9 +79,6 @@ ks8_provider_logo = html.Img(
m365_provider_logo = html.Img(
src="assets/images/providers/m365_provider.png", alt="m365 provider"
)
alibabacloud_provider_logo = html.Img(
src="assets/images/providers/alibabacloud_provider.png", alt="alibabacloud provider"
)
def load_csv_files(csv_files):
@@ -256,8 +253,6 @@ else:
accounts.append(account + " - AWS")
if "kubernetes" in list(data[data["ACCOUNT_UID"] == account]["PROVIDER"]):
accounts.append(account + " - K8S")
if "alibabacloud" in list(data[data["ACCOUNT_UID"] == account]["PROVIDER"]):
accounts.append(account + " - ALIBABACLOUD")
account_dropdown = create_account_dropdown(accounts)
@@ -303,8 +298,6 @@ else:
services.append(service + " - GCP")
if "m365" in list(data[data["SERVICE_NAME"] == service]["PROVIDER"]):
services.append(service + " - M365")
if "alibabacloud" in list(data[data["SERVICE_NAME"] == service]["PROVIDER"]):
services.append(service + " - ALIBABACLOUD")
services = ["All"] + services
services = [
@@ -527,7 +520,6 @@ else:
Output("gcp_card", "children"),
Output("k8s_card", "children"),
Output("m365_card", "children"),
Output("alibabacloud_card", "children"),
Output("subscribe_card", "children"),
Output("info-file-over", "title"),
Output("severity-filter", "value"),
@@ -545,7 +537,6 @@ else:
Output("gcp_card", "n_clicks"),
Output("k8s_card", "n_clicks"),
Output("m365_card", "n_clicks"),
Output("alibabacloud_card", "n_clicks"),
],
Input("cloud-account-filter", "value"),
Input("region-filter", "value"),
@@ -569,7 +560,6 @@ else:
Input("sort_button_region", "n_clicks"),
Input("sort_button_service", "n_clicks"),
Input("sort_button_account", "n_clicks"),
Input("alibabacloud_card", "n_clicks"),
)
def filter_data(
cloud_account_values,
@@ -594,7 +584,6 @@ def filter_data(
sort_button_region,
sort_button_service,
sort_button_account,
alibabacloud_clicks,
):
# Use n_clicks for vulture
n_clicks_csv = n_clicks_csv
@@ -610,7 +599,6 @@ def filter_data(
gcp_clicks = 0
k8s_clicks = 0
m365_clicks = 0
alibabacloud_clicks = 0
if azure_clicks > 0:
filtered_data = data.copy()
if azure_clicks % 2 != 0 and "azure" in list(data["PROVIDER"]):
@@ -619,7 +607,6 @@ def filter_data(
gcp_clicks = 0
k8s_clicks = 0
m365_clicks = 0
alibabacloud_clicks = 0
if gcp_clicks > 0:
filtered_data = data.copy()
if gcp_clicks % 2 != 0 and "gcp" in list(data["PROVIDER"]):
@@ -628,7 +615,6 @@ def filter_data(
azure_clicks = 0
k8s_clicks = 0
m365_clicks = 0
alibabacloud_clicks = 0
if k8s_clicks > 0:
filtered_data = data.copy()
if k8s_clicks % 2 != 0 and "kubernetes" in list(data["PROVIDER"]):
@@ -637,7 +623,6 @@ def filter_data(
azure_clicks = 0
gcp_clicks = 0
m365_clicks = 0
alibabacloud_clicks = 0
if m365_clicks > 0:
filtered_data = data.copy()
if m365_clicks % 2 != 0 and "m365" in list(data["PROVIDER"]):
@@ -646,16 +631,7 @@ def filter_data(
azure_clicks = 0
gcp_clicks = 0
k8s_clicks = 0
alibabacloud_clicks = 0
if alibabacloud_clicks > 0:
filtered_data = data.copy()
if alibabacloud_clicks % 2 != 0 and "alibabacloud" in list(data["PROVIDER"]):
filtered_data = filtered_data[filtered_data["PROVIDER"] == "alibabacloud"]
aws_clicks = 0
azure_clicks = 0
gcp_clicks = 0
k8s_clicks = 0
m365_clicks = 0
# For all the data, we will add to the status column the value 'MUTED (FAIL)' and 'MUTED (PASS)' depending on the value of the column 'STATUS' and 'MUTED'
if "MUTED" in filtered_data.columns:
filtered_data["STATUS"] = filtered_data.apply(
@@ -747,8 +723,6 @@ def filter_data(
all_account_ids.append(account)
if "kubernetes" in list(data[data["ACCOUNT_UID"] == account]["PROVIDER"]):
all_account_ids.append(account)
if "alibabacloud" in list(data[data["ACCOUNT_UID"] == account]["PROVIDER"]):
all_account_ids.append(account)
all_account_names = []
if "ACCOUNT_NAME" in filtered_data.columns:
@@ -771,10 +745,6 @@ def filter_data(
cloud_accounts_options.append(item + " - AWS")
if "kubernetes" in list(data[data["ACCOUNT_UID"] == item]["PROVIDER"]):
cloud_accounts_options.append(item + " - K8S")
if "alibabacloud" in list(
data[data["ACCOUNT_UID"] == item]["PROVIDER"]
):
cloud_accounts_options.append(item + " - ALIBABACLOUD")
if "ACCOUNT_NAME" in filtered_data.columns:
if "azure" in list(data[data["ACCOUNT_NAME"] == item]["PROVIDER"]):
cloud_accounts_options.append(item + " - AZURE")
@@ -903,10 +873,6 @@ def filter_data(
filtered_data[filtered_data["SERVICE_NAME"] == item]["PROVIDER"]
):
service_filter_options.append(item + " - M365")
if "alibabacloud" in list(
filtered_data[filtered_data["SERVICE_NAME"] == item]["PROVIDER"]
):
service_filter_options.append(item + " - ALIBABACLOUD")
# Filter Service
if service_values == ["All"]:
@@ -1358,12 +1324,6 @@ def filter_data(
filtered_data.loc[
filtered_data["ACCOUNT_UID"] == account, "ACCOUNT_UID"
] = (account + " - M365")
if "alibabacloud" in list(
data[data["ACCOUNT_UID"] == account]["PROVIDER"]
):
filtered_data.loc[
filtered_data["ACCOUNT_UID"] == account, "ACCOUNT_UID"
] = (account + " - ALIBABACLOUD")
table_collapsible = []
for item in filtered_data.to_dict("records"):
@@ -1450,13 +1410,6 @@ def filter_data(
else:
m365_card = None
if "alibabacloud" in list(data["PROVIDER"].unique()):
alibabacloud_card = create_provider_card(
"alibabacloud", alibabacloud_provider_logo, "Accounts", full_filtered_data
)
else:
alibabacloud_card = None
# Subscribe to Prowler Cloud card
subscribe_card = [
html.Div(
@@ -1501,7 +1454,6 @@ def filter_data(
gcp_card,
k8s_card,
m365_card,
alibabacloud_card,
subscribe_card,
list_files,
severity_values,
@@ -1517,7 +1469,6 @@ def filter_data(
gcp_clicks,
k8s_clicks,
m365_clicks,
alibabacloud_clicks,
)
else:
return (
@@ -1536,7 +1487,6 @@ def filter_data(
gcp_card,
k8s_card,
m365_card,
alibabacloud_card,
subscribe_card,
list_files,
severity_values,
@@ -1554,7 +1504,6 @@ def filter_data(
gcp_clicks,
k8s_clicks,
m365_clicks,
alibabacloud_clicks,
)
+46 -1
View File
@@ -1,6 +1,7 @@
services:
api-dev:
hostname: "prowler-api"
# image: prowler-api-dev
build:
context: ./api
dockerfile: Dockerfile
@@ -24,6 +25,8 @@ services:
condition: service_healthy
valkey:
condition: service_healthy
neo4j:
condition: service_healthy
entrypoint:
- "/home/prowler/docker-entrypoint.sh"
- "dev"
@@ -78,7 +81,41 @@ services:
timeout: 5s
retries: 3
neo4j:
image: graphstack/dozerdb:5.26.3.0
hostname: "neo4j"
volumes:
- ./_data/neo4j:/data
environment:
# We can't add our .env file because some of our current variables are not compatible with Neo4j env vars
# Auth
- NEO4J_AUTH=${NEO4J_USER}/${NEO4J_PASSWORD}
# Memory limits
- NEO4J_dbms_max__databases=${NEO4J_DBMS_MAX__DATABASES:-1000000}
- NEO4J_server_memory_pagecache_size=${NEO4J_SERVER_MEMORY_PAGECACHE_SIZE:-1G}
- NEO4J_server_memory_heap_initial__size=${NEO4J_SERVER_MEMORY_HEAP_INITIAL__SIZE:-1G}
- NEO4J_server_memory_heap_max__size=${NEO4J_SERVER_MEMORY_HEAP_MAX__SIZE:-1G}
# APOC
- apoc.export.file.enabled=${NEO4J_POC_EXPORT_FILE_ENABLED:-true}
- apoc.import.file.enabled=${NEO4J_APOC_IMPORT_FILE_ENABLED:-true}
- apoc.import.file.use_neo4j_config=${NEO4J_APOC_IMPORT_FILE_USE_NEO4J_CONFIG:-true}
- "NEO4J_PLUGINS=${NEO4J_PLUGINS:-[\"apoc\"]}"
- "NEO4J_dbms_security_procedures_allowlist=${NEO4J_DBMS_SECURITY_PROCEDURES_ALLOWLIST:-apoc.*}"
- "NEO4J_dbms_security_procedures_unrestricted=${NEO4J_DBMS_SECURITY_PROCEDURES_UNRESTRICTED:-apoc.*}"
# Networking
- "dbms.connector.bolt.listen_address=${NEO4J_DBMS_CONNECTOR_BOLT_LISTEN_ADDRESS:-0.0.0.0:7687}"
# 7474 is the UI port
ports:
- 7474:7474
- ${NEO4J_PORT:-7687}:7687
healthcheck:
test: ["CMD", "wget", "--no-verbose", "http://localhost:7474"]
interval: 10s
timeout: 10s
retries: 10
worker-dev:
# image: prowler-api-dev
build:
context: ./api
dockerfile: Dockerfile
@@ -89,17 +126,23 @@ services:
- path: .env
required: false
volumes:
- "outputs:/tmp/prowler_api_output"
- ./api/src/backend:/home/prowler/backend
- ./api/pyproject.toml:/home/prowler/pyproject.toml
- ./api/docker-entrypoint.sh:/home/prowler/docker-entrypoint.sh
- outputs:/tmp/prowler_api_output
depends_on:
valkey:
condition: service_healthy
postgres:
condition: service_healthy
neo4j:
condition: service_healthy
entrypoint:
- "/home/prowler/docker-entrypoint.sh"
- "worker"
worker-beat:
# image: prowler-api-dev
build:
context: ./api
dockerfile: Dockerfile
@@ -114,6 +157,8 @@ services:
condition: service_healthy
postgres:
condition: service_healthy
neo4j:
condition: service_healthy
entrypoint:
- "../docker-entrypoint.sh"
- "beat"
+31
View File
@@ -63,6 +63,37 @@ services:
timeout: 5s
retries: 3
neo4j:
image: graphstack/dozerdb:5.26.3.0
hostname: "neo4j"
volumes:
- ./_data/neo4j:/data
environment:
# We can't add our .env file because some of our current variables are not compatible with Neo4j env vars
# Auth
- NEO4J_AUTH=${NEO4J_USER}/${NEO4J_PASSWORD}
# Memory limits
- NEO4J_dbms_max__databases=${NEO4J_DBMS_MAX__DATABASES:-1000000}
- NEO4J_server_memory_pagecache_size=${NEO4J_SERVER_MEMORY_PAGECACHE_SIZE:-1G}
- NEO4J_server_memory_heap_initial__size=${NEO4J_SERVER_MEMORY_HEAP_INITIAL__SIZE:-1G}
- NEO4J_server_memory_heap_max__size=${NEO4J_SERVER_MEMORY_HEAP_MAX__SIZE:-1G}
# APOC
- apoc.export.file.enabled=${NEO4J_POC_EXPORT_FILE_ENABLED:-true}
- apoc.import.file.enabled=${NEO4J_APOC_IMPORT_FILE_ENABLED:-true}
- apoc.import.file.use_neo4j_config=${NEO4J_APOC_IMPORT_FILE_USE_NEO4J_CONFIG:-true}
- "NEO4J_PLUGINS=${NEO4J_PLUGINS:-[\"apoc\"]}"
- "NEO4J_dbms_security_procedures_allowlist=${NEO4J_DBMS_SECURITY_PROCEDURES_ALLOWLIST:-apoc.*}"
- "NEO4J_dbms_security_procedures_unrestricted=${NEO4J_DBMS_SECURITY_PROCEDURES_UNRESTRICTED:-apoc.*}"
# Networking
- "dbms.connector.bolt.listen_address=${NEO4J_DBMS_CONNECTOR_BOLT_LISTEN_ADDRESS:-0.0.0.0:7687}"
ports:
- ${NEO4J_PORT:-7687}:7687
healthcheck:
test: ["CMD", "wget", "--no-verbose", "http://localhost:7474"]
interval: 10s
timeout: 10s
retries: 10
worker:
image: prowlercloud/prowler-api:${PROWLER_API_VERSION:-stable}
env_file:
-7
View File
@@ -198,13 +198,6 @@
"user-guide/providers/gcp/retry-configuration"
]
},
{
"group": "Alibaba Cloud",
"pages": [
"user-guide/providers/alibabacloud/getting-started-alibabacloud",
"user-guide/providers/alibabacloud/authentication"
]
},
{
"group": "Kubernetes",
"pages": [
@@ -1,112 +0,0 @@
---
title: 'Alibaba Cloud Authentication in Prowler'
---
Prowler requires Alibaba Cloud credentials to perform security checks. Authentication is supported via multiple methods, prioritized as follows:
1. **Credentials URI**
2. **OIDC Role Authentication**
3. **ECS RAM Role**
4. **RAM Role Assumption**
5. **STS Temporary Credentials**
6. **Permanent Access Keys**
7. **Default Credential Chain**
## Authentication Methods
### Credentials URI (Recommended for Centralized Services)
If `--credentials-uri` is provided (or `ALIBABA_CLOUD_CREDENTIALS_URI` environment variable), Prowler will retrieve credentials from the specified external URI endpoint. The URI must return credentials in the standard JSON format.
```bash
export ALIBABA_CLOUD_CREDENTIALS_URI="http://localhost:8080/credentials"
prowler alibabacloud
```
### OIDC Role Authentication (Recommended for ACK/Kubernetes)
If OIDC environment variables are set, Prowler will use OIDC authentication to assume the specified role. This is the most secure method for containerized applications running in ACK (Alibaba Container Service for Kubernetes) with RRSA enabled.
Required environment variables:
- `ALIBABA_CLOUD_ROLE_ARN`
- `ALIBABA_CLOUD_OIDC_PROVIDER_ARN`
- `ALIBABA_CLOUD_OIDC_TOKEN_FILE`
```bash
export ALIBABA_CLOUD_ROLE_ARN="acs:ram::123456789012:role/YourRole"
export ALIBABA_CLOUD_OIDC_PROVIDER_ARN="acs:ram::123456789012:oidc-provider/ack-rrsa-provider"
export ALIBABA_CLOUD_OIDC_TOKEN_FILE="/var/run/secrets/tokens/oidc-token"
prowler alibabacloud
```
### ECS RAM Role (Recommended for ECS Instances)
When running on an ECS instance with an attached RAM role, Prowler can obtain credentials from the ECS instance metadata service.
```bash
# Using CLI argument
prowler alibabacloud --ecs-ram-role RoleName
# Or using environment variable
export ALIBABA_CLOUD_ECS_METADATA="RoleName"
prowler alibabacloud
```
### RAM Role Assumption (Recommended for Cross-Account)
For cross-account access, use RAM role assumption. You must provide the initial credentials (access keys) and the target role ARN.
```bash
export ALIBABA_CLOUD_ACCESS_KEY_ID="your-access-key-id"
export ALIBABA_CLOUD_ACCESS_KEY_SECRET="your-access-key-secret"
export ALIBABA_CLOUD_ROLE_ARN="acs:ram::123456789012:role/ProwlerAuditRole"
prowler alibabacloud
```
### STS Temporary Credentials
If you already have temporary STS credentials, you can provide them via environment variables.
```bash
export ALIBABA_CLOUD_ACCESS_KEY_ID="your-sts-access-key-id"
export ALIBABA_CLOUD_ACCESS_KEY_SECRET="your-sts-access-key-secret"
export ALIBABA_CLOUD_SECURITY_TOKEN="your-sts-security-token"
prowler alibabacloud
```
### Permanent Access Keys
You can use standard permanent access keys via environment variables.
```bash
export ALIBABA_CLOUD_ACCESS_KEY_ID="your-access-key-id"
export ALIBABA_CLOUD_ACCESS_KEY_SECRET="your-access-key-secret"
prowler alibabacloud
```
## Required Permissions
The credentials used by Prowler should have the minimum required permissions to audit the resources. At a minimum, the following permissions are recommended:
- `ram:GetUser`
- `ram:ListUsers`
- `ram:GetPasswordPolicy`
- `ram:GetAccountSummary`
- `ram:ListVirtualMFADevices`
- `ram:ListGroups`
- `ram:ListPolicies`
- `ram:ListAccessKeys`
- `ram:GetLoginProfile`
- `ram:ListPoliciesForUser`
- `ram:ListGroupsForUser`
- `actiontrail:DescribeTrails`
- `oss:GetBucketLogging`
- `oss:GetBucketAcl`
- `rds:DescribeDBInstances`
- `rds:DescribeDBInstanceAttribute`
- `ecs:DescribeInstances`
- `vpc:DescribeVpcs`
- `sls:ListProject`
- `sls:ListAlerts`
- `sls:ListLogStores`
- `sls:GetLogStore`
@@ -1,132 +0,0 @@
---
title: 'Getting Started With Alibaba Cloud on Prowler'
---
## Prowler CLI
### Configure Alibaba Cloud Credentials
Prowler requires Alibaba Cloud credentials to perform security checks. Authentication is available through the following methods (in order of priority):
1. **Credentials URI** (Recommended for centralized credential services)
2. **OIDC Role Authentication** (Recommended for ACK/Kubernetes)
3. **ECS RAM Role** (Recommended for ECS instances)
4. **RAM Role Assumption** (Recommended for cross-account access)
5. **STS Temporary Credentials**
6. **Permanent Access Keys**
7. **Default Credential Chain**
<Warning>
Prowler does not accept credentials through command-line arguments. Provide credentials through environment variables or the Alibaba Cloud credential chain.
</Warning>
#### Option 1: Environment Variables (Permanent Credentials)
```bash
export ALIBABA_CLOUD_ACCESS_KEY_ID="your-access-key-id"
export ALIBABA_CLOUD_ACCESS_KEY_SECRET="your-access-key-secret"
prowler alibabacloud
```
#### Option 2: Environment Variables (STS Temporary Credentials)
```bash
export ALIBABA_CLOUD_ACCESS_KEY_ID="your-sts-access-key-id"
export ALIBABA_CLOUD_ACCESS_KEY_SECRET="your-sts-access-key-secret"
export ALIBABA_CLOUD_SECURITY_TOKEN="your-sts-security-token"
prowler alibabacloud
```
#### Option 3: RAM Role Assumption (Environment Variables)
```bash
export ALIBABA_CLOUD_ACCESS_KEY_ID="your-access-key-id"
export ALIBABA_CLOUD_ACCESS_KEY_SECRET="your-access-key-secret"
export ALIBABA_CLOUD_ROLE_ARN="acs:ram::123456789012:role/ProwlerAuditRole"
export ALIBABA_CLOUD_ROLE_SESSION_NAME="ProwlerAssessmentSession" # Optional
prowler alibabacloud
```
#### Option 4: RAM Role Assumption (CLI + Environment Variables)
```bash
# Set credentials via environment variables
export ALIBABA_CLOUD_ACCESS_KEY_ID="your-access-key-id"
export ALIBABA_CLOUD_ACCESS_KEY_SECRET="your-access-key-secret"
# Specify role via CLI argument
prowler alibabacloud --role-arn acs:ram::123456789012:role/ProwlerAuditRole --role-session-name ProwlerAssessmentSession
```
#### Option 5: ECS Instance Metadata (ECS RAM Role)
```bash
# When running on an ECS instance with an attached RAM role
prowler alibabacloud --ecs-ram-role RoleName
# Or using environment variable
export ALIBABA_CLOUD_ECS_METADATA="RoleName"
prowler alibabacloud
```
#### Option 6: OIDC Role Authentication (for ACK/Kubernetes)
```bash
# For applications running in ACK (Alibaba Container Service for Kubernetes) with RRSA enabled
export ALIBABA_CLOUD_ROLE_ARN="acs:ram::123456789012:role/YourRole"
export ALIBABA_CLOUD_OIDC_PROVIDER_ARN="acs:ram::123456789012:oidc-provider/ack-rrsa-provider"
export ALIBABA_CLOUD_OIDC_TOKEN_FILE="/var/run/secrets/tokens/oidc-token"
export ALIBABA_CLOUD_ROLE_SESSION_NAME="ProwlerOIDCSession" # Optional
prowler alibabacloud
# Or using CLI argument
prowler alibabacloud --oidc-role-arn acs:ram::123456789012:role/YourRole
```
#### Option 7: Credentials URI (External Credential Service)
```bash
# Retrieve credentials from an external URI endpoint
export ALIBABA_CLOUD_CREDENTIALS_URI="http://localhost:8080/credentials"
prowler alibabacloud
# Or using CLI argument
prowler alibabacloud --credentials-uri http://localhost:8080/credentials
```
#### Option 8: Default Credential Chain
The SDK automatically checks credentials in the following order:
1. Environment variables (`ALIBABA_CLOUD_*` or `ALIYUN_*`)
2. OIDC authentication (if OIDC environment variables are set)
3. Configuration file (`~/.aliyun/config.json`)
4. ECS instance metadata (if running on ECS)
5. Credentials URI (if `ALIBABA_CLOUD_CREDENTIALS_URI` is set)
```bash
prowler alibabacloud
```
### Specify Regions
To run checks only in specific regions:
```bash
prowler alibabacloud --regions cn-hangzhou cn-shanghai
```
### Run Specific Checks
To run specific checks:
```bash
prowler alibabacloud --checks ram_no_root_access_key ram_user_mfa_enabled_console_access
```
### Run Compliance Framework
To run a specific compliance framework:
```bash
prowler alibabacloud --compliance cis_2.0_alibabacloud
```
@@ -4,41 +4,7 @@ title: 'Getting Started with Oracle Cloud Infrastructure (OCI)'
Prowler supports security scanning of Oracle Cloud Infrastructure (OCI) environments. This guide will help you get started with using Prowler to audit your OCI tenancy.
## Prowler Cloud
The following steps apply to Prowler Cloud and the self-hosted Prowler App.
### Step 1: Collect OCI Identifiers
1. Sign in to the [OCI Console](https://cloud.oracle.com/) and open **Tenancy Details** to copy the Tenancy OCID.
2. Go to **Identity & Security** → **Users**, select the principal that owns the API key, and copy the **User OCID**.
3. Generate or locate the API key fingerprint and private key for that user. Follow the [Config File Authentication steps](/user-guide/providers/oci/authentication#config-file-authentication-manual-api-key-setup) to create or rotate the key pair and copy the fingerprint.
4. Note the **Region** identifier to scan (for example, `us-ashburn-1`).
### Step 2: Access Prowler Cloud or Prowler App
1. Navigate to [Prowler Cloud](https://cloud.prowler.com/) or launch [Prowler App](/user-guide/tutorials/prowler-app).
2. Go to **Configuration** → **Cloud Providers** and click **Add Cloud Provider**.
![Add OCI Cloud Provider](./images/oci-add-cloud-provider.png)
3. Select **Oracle Cloud** and enter the **Tenancy OCID** and an optional alias, then choose **Next**.
![Add OCI Cloud Tenancy](./images/oci-add-tenancy.png)
### Step 3: Add OCI API Key Credentials
Prowler App connects to OCI with API key credentials. Provide:
- **User OCID** for the API key owner
- **Fingerprint** of the API key
- **Region** (for example, `us-ashburn-1`)
- **Private Key Content** (paste the full PEM value)
- **Passphrase (Optional)** if the private key is encrypted
Select **Next**, then **Launch Scan** to validate the connection and start the first OCI scan. The private key content is encoded for secure transmission.
![Add OCI API Key Credentials](./images/oci-add-api-key-credentials.png)
---
## Prowler CLI
### Prerequisites
## Prerequisites
Before you begin, ensure you have:
@@ -56,13 +22,13 @@ Before you begin, ensure you have:
3. **OCI Account Access** with appropriate permissions to read resources in your tenancy.
### Authentication
## Authentication
Prowler supports multiple authentication methods for OCI. For detailed authentication setup, see the [OCI Authentication Guide](./authentication).
Prowler supports multiple authentication methods for OCI. For detailed authentication setup, see the [OCI Authentication Guide](./authentication.mdx).
**Note:** OCI Session Authentication and Config File Authentication both use the same `~/.oci/config` file. The difference is how the config file is generated - automatically via browser (session auth) or manually with API keys.
#### Quick Start: OCI Session Authentication (Recommended)
### Quick Start: OCI Session Authentication (Recommended)
The easiest and most secure method is using OCI session authentication, which automatically generates your config file via browser login.
@@ -105,13 +71,13 @@ The easiest and most secure method is using OCI session authentication, which au
prowler oci
```
#### Alternative: Manual API Key Setup
### Alternative: Manual API Key Setup
If you prefer to manually generate API keys instead of using browser-based session authentication, see the detailed instructions in the [Authentication Guide](./authentication#config-file-authentication-manual-api-key-setup).
If you prefer to manually generate API keys instead of using browser-based session authentication, see the detailed instructions in the [Authentication Guide](./authentication.mdx#config-file-authentication-manual-api-key-setup).
**Note:** Both methods use the same `~/.oci/config` file - the difference is that manual setup uses static API keys while session authentication uses temporary session tokens.
##### Using a Specific Profile
#### Using a Specific Profile
If you have multiple profiles in your OCI config:
@@ -119,13 +85,13 @@ If you have multiple profiles in your OCI config:
prowler oci --profile production
```
##### Using a Custom Config File
#### Using a Custom Config File
```bash
prowler oci --config-file /path/to/custom/config
```
#### Instance Principal Authentication
### 2. Instance Principal Authentication
**IMPORTANT:** This authentication method **only works when Prowler is running inside an OCI compute instance**. If you're running Prowler from your local machine, use [OCI Session Authentication](#quick-start-oci-session-authentication-recommended) instead.
@@ -144,39 +110,39 @@ prowler oci --use-instance-principal
Allow dynamic-group prowler-instances to read all-resources in tenancy
```
### Basic Usage
## Basic Usage
#### Scan Entire Tenancy
### Scan Entire Tenancy
```bash
prowler oci
```
#### Scan Specific Region
### Scan Specific Region
```bash
prowler oci --region us-phoenix-1
```
#### Scan Specific Compartments
### Scan Specific Compartments
```bash
prowler oci --compartment-id ocid1.compartment.oc1..example1 ocid1.compartment.oc1..example2
```
#### Run Specific Checks
### Run Specific Checks
```bash
prowler oci --check identity_password_policy_minimum_length_14
```
#### Run Specific Services
### Run Specific Services
```bash
prowler oci --service identity network
```
#### Compliance Frameworks
### Compliance Frameworks
Run CIS OCI Foundations Benchmark v3.0:
@@ -184,11 +150,11 @@ Run CIS OCI Foundations Benchmark v3.0:
prowler oci --compliance cis_3.0_oci
```
### Required Permissions
## Required Permissions
Prowler requires **read-only** permissions to audit your OCI tenancy. Below are the minimum required permissions:
#### Tenancy-Level Policy
### Tenancy-Level Policy
Create a group `prowler-users` and add your user to it, then create this policy:
@@ -201,7 +167,7 @@ Allow group prowler-users to read cloud-guard-problems in tenancy
Allow group prowler-users to read cloud-guard-targets in tenancy
```
#### Service-Specific Permissions
### Service-Specific Permissions
For more granular control, you can grant specific permissions:
@@ -251,33 +217,33 @@ Allow group prowler-users to inspect ons-subscriptions in tenancy
Allow group prowler-users to inspect rules in tenancy
```
### Output Formats
## Output Formats
Prowler supports multiple output formats for OCI:
#### JSON
### JSON
```bash
prowler oci --output-formats json
```
#### CSV
### CSV
```bash
prowler oci --output-formats csv
```
#### HTML
### HTML
```bash
prowler oci --output-formats html
```
#### Multiple Formats
### Multiple Formats
```bash
prowler oci --output-formats json csv html
```
### Common Scenarios
## Common Scenarios
#### Security Assessment
### Security Assessment
Full security assessment with CIS compliance:
@@ -288,7 +254,7 @@ prowler oci \
--output-directory ./oci-assessment-$(date +%Y%m%d)
```
#### Continuous Monitoring
### Continuous Monitoring
Run specific security-critical checks:
@@ -300,7 +266,7 @@ prowler oci \
--output-formats json
```
#### Compartment-Specific Audit
### Compartment-Specific Audit
Audit a specific project compartment:
@@ -311,9 +277,9 @@ prowler oci \
--region us-ashburn-1
```
### Troubleshooting
## Troubleshooting
#### Authentication Issues
### Authentication Issues
**Error: "Could not find a valid config file"**
- Ensure `~/.oci/config` exists and is properly formatted
@@ -325,23 +291,23 @@ prowler oci \
- Ensure the public key is uploaded to your OCI user account
- Check that the private key file is accessible
#### Permission Issues
### Permission Issues
**Error: "Authorization failed or requested resource not found"**
- Verify your user has the required policies (see [Required Permissions](#required-permissions))
- Check that policies apply to the correct compartments
- Ensure policies are not restricted by conditions that exclude your user
#### Region Issues
### Region Issues
**Error: "Invalid region"**
- Check available regions: `prowler oci --list-regions`
- Verify your tenancy is subscribed to the region
- Use the region identifier (e.g., `us-ashburn-1`), not the display name
### Advanced Usage
## Advanced Usage
#### Using Mutelist
### Using Mutelist
Create a mutelist file to suppress specific findings:
@@ -363,7 +329,7 @@ Run with mutelist:
prowler oci --mutelist-file oci-mutelist.yaml
```
#### Custom Checks Metadata
### Custom Checks Metadata
Override check metadata:
@@ -380,7 +346,7 @@ Run with custom metadata:
prowler oci --custom-checks-metadata-file custom-metadata.yaml
```
#### Filtering by Status
### Filtering by Status
Only show failed checks:
@@ -388,7 +354,7 @@ Only show failed checks:
prowler oci --status FAIL
```
#### Filtering by Severity
### Filtering by Severity
Only show critical and high severity findings:
@@ -396,13 +362,13 @@ Only show critical and high severity findings:
prowler oci --severity critical high
```
### Next Steps
## Next Steps
- Learn about [Compliance Frameworks](/user-guide/cli/tutorials/compliance) in Prowler
- Review [Prowler Output Formats](/user-guide/cli/tutorials/reporting)
- Explore [Integrations](/user-guide/cli/tutorials/integrations) with SIEM and ticketing systems
### Additional Resources
## Additional Resources
- [OCI Documentation](https://docs.oracle.com/en-us/iaas/Content/home.htm)
- [CIS OCI Foundations Benchmark](https://www.cisecurity.org/benchmark/oracle_cloud)
Binary file not shown.

Before

Width:  |  Height:  |  Size: 472 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 367 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 288 KiB

@@ -16,16 +16,6 @@ Lighthouse AI supports the following LLM providers:
- **Amazon Bedrock**: Offers AWS-hosted access to Claude, Llama, Titan, and other models
- **OpenAI Compatible**: Supports custom endpoints like OpenRouter, Ollama, or any OpenAI-compatible service
## Model Requirements
For Lighthouse AI to work properly, models **must** support all of the following capabilities:
- **Text input**: Ability to receive text prompts.
- **Text output**: Ability to generate text responses.
- **Tool calling**: Ability to invoke tools and functions.
If any of these capabilities are missing, the model will not be compatible with Lighthouse AI.
## How Default Providers Work
All three providers can be configured for a tenant, but only one can be set as the default provider. The first configured provider automatically becomes the default.
@@ -49,94 +39,63 @@ To connect a provider:
3. Select a default model for that provider
4. Click **Connect** to save
<Tabs>
<Tab title="OpenAI">
### Required Information
### OpenAI
- **API Key**: OpenAI API key (starts with `sk-` or `sk-proj-`). API keys can be created from the [OpenAI platform](https://platform.openai.com/api-keys).
#### Required Information
### Before Connecting
- **API Key**: OpenAI API key (starts with `sk-` or `sk-proj-`)
- Ensure the OpenAI account has sufficient credits.
- Verify that the `gpt-5` model (recommended for Lighthouse AI) is not blocked in the OpenAI organization settings.
</Tab>
<Note>
To generate an OpenAI API key, visit https://platform.openai.com/api-keys
</Note>
<Tab title="Amazon Bedrock">
Prowler connects to Amazon Bedrock using either [Amazon Bedrock API keys](https://docs.aws.amazon.com/bedrock/latest/userguide/getting-started-api-keys.html) or IAM credentials.
### Amazon Bedrock
<Note>
Amazon Bedrock models depend on AWS region and account entitlements. Lighthouse AI displays only accessible models that support tool calling and text input/output.
</Note>
#### Required Information
### Amazon Bedrock Long-Term API Key
- **AWS Access Key ID**: AWS access key ID
- **AWS Secret Access Key**: AWS secret access key
- **AWS Region**: Region where Bedrock is available (e.g., `us-east-1`, `us-west-2`)
<VersionBadge version="5.15.0" />
#### Required Permissions
<Warning>
Amazon Bedrock Long-Term API keys are recommended only for exploration purposes. For production environments, use AWS IAM Access Keys with properly scoped permissions.
</Warning>
The AWS user must have the `AmazonBedrockLimitedAccess` managed policy attached:
Amazon Bedrock API keys provide simpler authentication with automatically assigned permissions.
```text
arn:aws:iam::aws:policy/AmazonBedrockLimitedAccess
```
#### Required Information
<Note>
Currently, only AWS access key and secret key authentication is supported. Amazon Bedrock API key support will be available soon.
</Note>
- **Bedrock Long-Term API Key**: The API key generated from Amazon Bedrock.
- **AWS Region**: Region where Bedrock is available.
<Note>
Available models depend on AWS region and account entitlements. Lighthouse AI displays only accessible models.
</Note>
<Note>
Amazon Bedrock Long-Term API keys are automatically assigned the necessary permissions (`AmazonBedrockLimitedAccess` policy).
### OpenAI Compatible
Learn more: [Getting Started with Amazon Bedrock API Keys](https://docs.aws.amazon.com/bedrock/latest/userguide/getting-started-api-keys.html)
</Note>
Use this option to connect to any LLM provider exposing OpenAI compatible API endpoint (OpenRouter, Ollama, etc.).
### AWS IAM Access Keys
#### Required Information
Standard AWS IAM credentials can be used as an alternative authentication method.
- **API Key**: API key from the compatible service
- **Base URL**: API endpoint URL including the API version (e.g., `https://openrouter.ai/api/v1`)
#### Required Information
#### Example: OpenRouter
- **AWS Access Key ID**: The access key ID for the IAM user.
- **AWS Secret Access Key**: The secret access key for the IAM user.
- **AWS Region**: Region where Bedrock is available.
#### Required Permissions
The AWS IAM user must have the `AmazonBedrockLimitedAccess` managed policy attached:
```text
arn:aws:iam::aws:policy/AmazonBedrockLimitedAccess
```
<Note>
Access to all Amazon Bedrock foundation models is enabled by default. When you select a model or invoke it for the first time (using Prowler or otherwise), you agree to Amazon's EULA. More info: [Amazon Bedrock Model Access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html)
</Note>
</Tab>
<Tab title="OpenAI Compatible">
Use this option to connect to any LLM provider exposing an OpenAI compatible API endpoint (OpenRouter, Ollama, etc.).
### Required Information
- **API Key**: API key from the compatible service.
- **Base URL**: API endpoint URL including the API version (e.g., `https://openrouter.ai/api/v1`).
### Example: OpenRouter
1. Create an account at [OpenRouter](https://openrouter.ai/)
2. [Generate an API key](https://openrouter.ai/docs/guides/overview/auth/provisioning-api-keys) from the OpenRouter dashboard
3. Configure in Lighthouse AI:
- **API Key**: OpenRouter API key
- **Base URL**: `https://openrouter.ai/api/v1`
</Tab>
</Tabs>
1. Create an account at [OpenRouter](https://openrouter.ai/)
2. [Generate an API key](https://openrouter.ai/docs/guides/overview/auth/provisioning-api-keys) from the OpenRouter dashboard
3. Configure in Lighthouse AI:
- **API Key**: OpenRouter API key
- **Base URL**: `https://openrouter.ai/api/v1`
## Changing the Default Provider
To set a different provider as default:
1. Navigate to **Configuration** → **Lighthouse AI**
2. Click **Configure** under the desired provider to set as default
2. Click **Configure** under the provider you want as default
3. Click **Set as Default**
<img src="/images/prowler-app/lighthouse-set-default-provider.png" alt="Set default LLM provider" />
@@ -162,7 +121,7 @@ To remove a configured provider:
For best results with Lighthouse AI, the recommended model is `gpt-5` from OpenAI.
Models from other providers such as Amazon Bedrock and OpenAI Compatible endpoints can be connected and used, but performance is not guaranteed. Ensure that any selected model supports text input, text output, and tool calling capabilities.
Models from other providers such as Amazon Bedrock and OpenAI Compatible endpoints can be connected and used, but performance is not guaranteed.
## Getting Help
@@ -68,8 +68,6 @@ To perform security scans, link a cloud provider account. Prowler supports the f
- **GitHub**
- **Oracle Cloud Infrastructure (OCI)**
Steps to add a provider:
1. Navigate to `Settings > Cloud Providers`.
@@ -95,9 +93,6 @@ For detailed instructions on configuring credentials for each provider, refer to
<Card title="Google Cloud" icon="google" href="/user-guide/providers/gcp/getting-started-gcp">
Configure GCP authentication with Service Account or Application Default Credentials.
</Card>
<Card title="Oracle Cloud Infrastructure" icon="cloud" href="/user-guide/providers/oci/getting-started-oci">
Connect OCI with API key credentials to scan compartments and regions.
</Card>
<Card title="Kubernetes" icon="cloud" href="/user-guide/providers/kubernetes/getting-started-k8s">
Set up Kubernetes authentication using kubeconfig files for cluster access.
</Card>
+2 -14
View File
@@ -2,26 +2,14 @@
All notable changes to the **Prowler MCP Server** are documented in this file.
## [0.2.0] (Prowler UNRELEASED)
### Added
- Remove all Prowler App MCP tools; and add new MCP Server tools for Prowler Findings and Compliance [(#9300)](https://github.com/prowler-cloud/prowler/pull/9300)
---
## [0.1.1] (Prowler v5.14.0)
## [0.1.1] (Prowler 5.14.0)
### Fixed
- Fix documentation MCP Server to return list of dictionaries [(#9205)](https://github.com/prowler-cloud/prowler/pull/9205)
---
## [0.1.0] (Prowler v5.13.0)
## [0.1.0] (Prowler 5.13.0)
### Added
- Initial release of Prowler MCP Server [(#8695)](https://github.com/prowler-cloud/prowler/pull/8695)
- Set appropiate user-agent in requests [(#8724)](https://github.com/prowler-cloud/prowler/pull/8724)
- Basic logger functionality [(#8740)](https://github.com/prowler-cloud/prowler/pull/8740)
+2
View File
@@ -33,6 +33,8 @@ def main():
try:
args = parse_arguments()
print(f"args.transport: {args.transport}")
if args.transport is None:
args.transport = os.getenv("PROWLER_MCP_TRANSPORT_MODE", "stdio")
else:
@@ -1,24 +0,0 @@
"""Pydantic models for Prowler App MCP Server."""
from prowler_mcp_server.prowler_app.models.base import MinimalSerializerMixin
from prowler_mcp_server.prowler_app.models.findings import (
CheckMetadata,
CheckRemediation,
DetailedFinding,
FindingsListResponse,
FindingsOverview,
SimplifiedFinding,
)
__all__ = [
# Base models
"MinimalSerializerMixin",
# Findings models
"CheckMetadata",
"CheckRemediation",
"DetailedFinding",
"FindingsListResponse",
"FindingsOverview",
"SimplifiedFinding",
]
@@ -1,59 +0,0 @@
"""Base models and mixins for Prowler MCP Server models."""
from typing import Any
from pydantic import BaseModel, SerializerFunctionWrapHandler, model_serializer
class MinimalSerializerMixin(BaseModel):
"""Mixin that excludes empty values from serialization.
This mixin optimizes model serialization for LLM consumption by removing noise
and reducing token usage. It excludes:
- None values
- Empty strings
- Empty lists
- Empty dicts
"""
@model_serializer(mode="wrap")
def _serialize(self, handler: SerializerFunctionWrapHandler) -> dict[str, Any]:
"""Serialize model excluding empty values.
Args:
handler: Pydantic serializer function wrapper
Returns:
Dictionary with non-empty values only
"""
data = handler(self)
return {k: v for k, v in data.items() if not self._should_exclude(v)}
def _should_exclude(self, value: Any) -> bool:
"""Determine if a value should be excluded from serialization.
Override this method in subclasses for custom exclusion logic.
Args:
value: Field value
Returns:
True if the value should be excluded, False otherwise
"""
# None values
if value is None:
return True
# Empty strings
if value == "":
return True
# Empty lists
if isinstance(value, list) and not value:
return True
# Empty dicts
if isinstance(value, dict) and not value:
return True
return False
@@ -1,340 +0,0 @@
"""Pydantic models for simplified security findings responses."""
from typing import Literal
from prowler_mcp_server.prowler_app.models.base import MinimalSerializerMixin
from pydantic import BaseModel, ConfigDict, Field
class CheckRemediation(MinimalSerializerMixin, BaseModel):
"""Remediation information for a security check."""
model_config = ConfigDict(frozen=True)
cli: str | None = Field(
default=None,
description="Command-line interface commands for remediation",
)
terraform: str | None = Field(
default=None,
description="Terraform code snippet with best practices for remediation",
)
nativeiac: str | None = Field(
default=None,
description="Native Infrastructure as Code code snippet with best practices for remediation",
)
other: str | None = Field(
default=None,
description="Other remediation code snippet with best practices for remediation, usually used for web interfaces or other tools",
)
recommendation: str | None = Field(
default=None,
description="Text description with general best recommended practices to avoid the issue",
)
class CheckMetadata(MinimalSerializerMixin, BaseModel):
"""Essential metadata for a security check."""
model_config = ConfigDict(frozen=True)
title: str = Field(
description="Human-readable title of the security check",
)
description: str = Field(
description="Detailed description of what the check validates",
)
provider: str = Field(
description="Prowler provider this check belongs to (e.g., 'aws', 'azure', 'gcp')",
)
service: str = Field(
description="Prowler service being checked (e.g., 's3', 'ec2', 'keyvault')",
)
resource_type: str = Field(
description="Type of resource being evaluated (e.g., 'AwsS3Bucket')",
)
risk: str | None = Field(
default=None,
description="Risk description if the check fails",
)
remediation: CheckRemediation | None = Field(
default=None,
description="Remediation guidance including CLI commands and recommendations",
)
additional_urls: list[str] = Field(
default_factory=list,
description="List of additional URLs related to the check",
)
categories: list[str] = Field(
default_factory=list,
description="Categories this check belongs to (e.g., ['encryption', 'logging'])",
)
@classmethod
def from_api_response(cls, data: dict) -> "CheckMetadata":
"""Transform API check_metadata to simplified format."""
remediation_data = data.get("remediation")
remediation = None
if remediation_data:
code = remediation_data.get("code", {})
recommendation = remediation_data.get("recommendation", {})
remediation = CheckRemediation(
cli=code["cli"],
terraform=code["terraform"],
nativeiac=code["nativeiac"],
other=code["other"],
recommendation=recommendation["text"],
)
return cls(
title=data["checktitle"],
description=data["description"],
provider=data["provider"],
risk=data["risk"],
service=data["servicename"],
resource_type=data["resourcetype"],
remediation=remediation,
additional_urls=data["additionalurls"],
categories=data["categories"],
)
class SimplifiedFinding(MinimalSerializerMixin, BaseModel):
"""Simplified security finding with only LLM-relevant information."""
model_config = ConfigDict(frozen=True)
id: str = Field(
description="Unique UUIDv4 identifier for this finding in Prowler database"
)
uid: str = Field(
description="Human-readable unique identifier assigned by Prowler. Format: prowler-{provider}-{check_id}-{account_uid}-{region}-{resource_name}",
)
status: Literal["FAIL", "PASS", "MANUAL"] = Field(
description="Result status: FAIL (security issue found), PASS (no issue), MANUAL (requires manual verification)",
)
severity: Literal["critical", "high", "medium", "low", "informational"] = Field(
description="Severity level of the finding",
)
check_id: str = Field(
description="ID of the security check that generated this finding",
)
status_extended: str = Field(
description="Extended status information providing additional context",
)
delta: Literal["new", "changed"] | None = Field(
default=None,
description="Change status: 'new' (not seen before), 'changed' (modified since last scan), or None (unchanged)",
)
muted: bool | None = Field(
default=None,
description="Whether this finding has been muted/suppressed by the user",
)
muted_reason: str | None = Field(
default=None,
description="Reason provided when muting this finding",
)
@classmethod
def from_api_response(cls, data: dict) -> "SimplifiedFinding":
"""Transform JSON:API finding response to simplified format."""
attributes = data["attributes"]
return cls(
id=data["id"],
uid=attributes["uid"],
status=attributes["status"],
severity=attributes["severity"],
check_id=attributes["check_metadata"]["checkid"],
status_extended=attributes["status_extended"],
delta=attributes["delta"],
muted=attributes["muted"],
muted_reason=attributes["muted_reason"],
)
class DetailedFinding(SimplifiedFinding):
"""Detailed security finding with comprehensive information for deep analysis.
Extends SimplifiedFinding with temporal metadata and relationships to scans and resources.
Use this when you need complete context about a specific finding.
"""
model_config = ConfigDict(frozen=True)
inserted_at: str = Field(
description="ISO 8601 timestamp when this finding was first inserted into the database",
)
updated_at: str = Field(
description="ISO 8601 timestamp when this finding was last updated",
)
first_seen_at: str | None = Field(
default=None,
description="ISO 8601 timestamp when this finding was first detected across all scans",
)
scan_id: str | None = Field(
default=None,
description="UUID of the scan that generated this finding",
)
resource_ids: list[str] = Field(
default_factory=list,
description="List of UUIDs for cloud resources associated with this finding",
)
check_metadata: CheckMetadata = Field(
description="Metadata about the security check that generated this finding",
)
@classmethod
def from_api_response(cls, data: dict) -> "DetailedFinding":
"""Transform JSON:API finding response to detailed format."""
attributes = data["attributes"]
check_metadata = attributes["check_metadata"]
relationships = data.get("relationships", {})
# Parse scan relationship
scan_id = None
scan_data = relationships.get("scan", {}).get("data")
if scan_data:
scan_id = scan_data["id"]
# Parse resources relationship
resource_ids = []
resources_data = relationships.get("resources", {}).get("data", [])
if resources_data:
resource_ids = [r["id"] for r in resources_data]
return cls(
id=data["id"],
uid=attributes["uid"],
status=attributes["status"],
severity=attributes["severity"],
check_id=check_metadata["checkid"],
check_metadata=CheckMetadata.from_api_response(check_metadata),
status_extended=attributes.get("status_extended"),
delta=attributes.get("delta"),
muted=attributes["muted"],
muted_reason=attributes.get("muted_reason"),
inserted_at=attributes["inserted_at"],
updated_at=attributes["updated_at"],
first_seen_at=attributes.get("first_seen_at"),
scan_id=scan_id,
resource_ids=resource_ids,
)
class FindingsListResponse(BaseModel):
"""Simplified response for findings list queries."""
model_config = ConfigDict(frozen=True)
findings: list[SimplifiedFinding] = Field(
description="List of security findings matching the query",
)
total_num_finding: int = Field(
description="Total number of findings matching the query across all pages",
ge=0,
)
total_num_pages: int = Field(
description="Total number of pages available",
ge=0,
)
current_page: int = Field(
description="Current page number (1-indexed)",
ge=1,
)
@classmethod
def from_api_response(cls, response: dict) -> "FindingsListResponse":
"""Transform JSON:API response to simplified format."""
data = response["data"]
meta = response["meta"]
pagination = meta["pagination"]
findings = [SimplifiedFinding.from_api_response(item) for item in data]
return cls(
findings=findings,
total_num_finding=pagination["count"],
total_num_pages=pagination["pages"],
current_page=pagination["page"],
)
class FindingsOverview(BaseModel):
"""Simplified findings overview with aggregate statistics."""
model_config = ConfigDict(frozen=True)
total: int = Field(
description="Total number of findings",
ge=0,
)
fail: int = Field(
description="Total number of failed security checks",
ge=0,
)
passed: int = ( # Using 'passed' instead of 'pass' since 'pass' is a Python keyword
Field(
description="Total number of passed security checks",
ge=0,
)
)
muted: int = Field(
description="Total number of muted findings",
ge=0,
)
new: int = Field(
description="Total number of new findings (not seen in previous scans)",
ge=0,
)
changed: int = Field(
description="Total number of changed findings (modified since last scan)",
ge=0,
)
fail_new: int = Field(
description="Number of new findings with FAIL status",
ge=0,
)
fail_changed: int = Field(
description="Number of changed findings with FAIL status",
ge=0,
)
pass_new: int = Field(
description="Number of new findings with PASS status",
ge=0,
)
pass_changed: int = Field(
description="Number of changed findings with PASS status",
ge=0,
)
muted_new: int = Field(
description="Number of new muted findings",
ge=0,
)
muted_changed: int = Field(
description="Number of changed muted findings",
ge=0,
)
@classmethod
def from_api_response(cls, response: dict) -> "FindingsOverview":
"""Transform JSON:API overview response to simplified format."""
data = response["data"]
attributes = data["attributes"]
return cls(
total=attributes["total"],
fail=attributes["fail"],
passed=attributes["pass"],
muted=attributes["muted"],
new=attributes["new"],
changed=attributes["changed"],
fail_new=attributes["fail_new"],
fail_changed=attributes["fail_changed"],
pass_new=attributes["pass_new"],
pass_changed=attributes["pass_changed"],
muted_new=attributes["muted_new"],
muted_changed=attributes["muted_changed"],
)
@@ -1,8 +0,0 @@
from fastmcp import FastMCP
from prowler_mcp_server.prowler_app.utils.tool_loader import load_all_tools
# Initialize MCP server
app_mcp_server = FastMCP("prowler-app")
# Auto-discover and load all tools from the tools package
load_all_tools(app_mcp_server)
@@ -1,7 +0,0 @@
"""Domain-specific tools for Prowler App MCP Server.
Each module in this package contains a BaseTool subclass that registers
and implements tools for a specific domain (findings, providers, scans, etc.).
Tools are automatically discovered and loaded by the load_all_tools() function.
"""
@@ -1,102 +0,0 @@
import inspect
from abc import ABC
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from fastmcp import FastMCP
from prowler_mcp_server.lib.logger import logger
from prowler_mcp_server.prowler_app.utils.api_client import ProwlerAPIClient
class BaseTool(ABC):
"""Abstract base class for all MCP tools.
This class defines the contract that all domain-specific tools must follow.
It ensures consistency across tool registration and provides common utilities.
Key responsibilities:
- Enforce implementation of register_tools() via ABC
- Provide shared access to API client and logger
- Define common patterns for tool registration
- Support dependency injection for the FastMCP instance
Attributes:
_api_client: Singleton instance of ProwlerAPIClient for API requests
_logger: Logger instance for structured logging
Example:
class FindingsTools(BaseTool):
def register_tools(self, mcp: FastMCP) -> None:
mcp.tool(self.search_security_findings)
mcp.tool(self.get_finding_details)
async def search_security_findings(self, severity: list[str] = Field(...)):
# Implementation with access to self.api_client
response = await self.api_client.get("/api/v1/findings")
return response
"""
def __init__(self):
"""Initialize the tool.
Sets up shared dependencies that all tools can access:
- API client (singleton) for making authenticated requests
- Logger instance for structured logging
"""
self._api_client = ProwlerAPIClient()
self._logger = logger
@property
def api_client(self) -> ProwlerAPIClient:
"""Get the shared API client instance.
Returns:
Singleton instance of ProwlerAPIClient for making API requests
"""
return self._api_client
@property
def logger(self):
"""Get the logger instance.
Returns:
Logger instance for structured logging
"""
return self._logger
def register_tools(self, mcp: "FastMCP") -> None:
"""Automatically register all public async methods as tools with FastMCP.
This method inspects the subclass and automatically registers all public
async methods (not starting with '_') as tools. Subclasses do not need
to override this method.
Args:
mcp: The FastMCP instance to register tools with
"""
# Get all methods from the subclass
registered_count = 0
for name, method in inspect.getmembers(self, predicate=inspect.ismethod):
# Skip private/protected methods
if name.startswith("_"):
continue
# Skip methods inherited from BaseTool
if name in ["register_tools"]:
continue
# Skip property getters
if name in ["api_client", "logger"]:
continue
# Check if the method is a coroutine function (async)
if inspect.iscoroutinefunction(method):
mcp.tool(method)
registered_count += 1
self.logger.debug(f"Auto-registered tool: {name}")
self.logger.info(
f"Auto-registered {registered_count} tools from {self.__class__.__name__}"
)
@@ -1,323 +0,0 @@
"""Security Findings tools for Prowler App MCP Server.
This module provides tools for searching, viewing, and analyzing security findings
across all cloud providers.
"""
from typing import Any, Literal
from prowler_mcp_server.prowler_app.models.findings import (
DetailedFinding,
FindingsListResponse,
FindingsOverview,
)
from prowler_mcp_server.prowler_app.tools.base import BaseTool
from pydantic import Field
class FindingsTools(BaseTool):
"""Tools for security findings operations.
Provides tools for:
- search_security_findings: Fast and lightweight searching across findings
- get_finding_details: Get complete details for a specific finding
- get_findings_overview: Get aggregate statistics and trends across all findings
"""
async def search_security_findings(
self,
severity: list[
Literal["critical", "high", "medium", "low", "informational"]
] = Field(
default=[],
description="Filter by severity levels. Multiple values allowed: critical, high, medium, low, informational. If empty, all severities are returned.",
),
status: list[Literal["FAIL", "PASS", "MANUAL"]] = Field(
default=["FAIL"],
description="Filter by finding status. Multiple values allowed: FAIL (security issue found), PASS (no issue found), MANUAL (requires manual verification). Default: ['FAIL'] - only returns findings with security issues. To get all findings, pass an empty list [].",
),
provider_type: list[str] = Field(
default=[],
description="Filter by cloud provider type. Multiple values allowed. If the parameter is not provided, all providers are returned. For valid values, please refer to Prowler Hub/Prowler Documentation that you can also find in form of tools in this MCP Server.",
),
provider_alias: str | None = Field(
default=None,
description="Filter by specific provider alias/name (partial match supported)",
),
region: list[str] = Field(
default=[],
description="Filter by cloud regions. Multiple values allowed (e.g., us-east-1, eu-west-1). If empty, all regions are returned.",
),
service: list[str] = Field(
default=[],
description="Filter by cloud service. Multiple values allowed (e.g., s3, ec2, iam, keyvault). If empty, all services are returned.",
),
resource_type: list[str] = Field(
default=[],
description="Filter by resource type. Multiple values allowed. If empty, all resource types are returned.",
),
check_id: list[str] = Field(
default=[],
description="Filter by specific security check IDs. Multiple values allowed. If empty, all check IDs are returned.",
),
muted: (
bool | str | None
) = Field( # Wrong `str` hint type due to bad MCP Clients implementation
default=None,
description="Filter by muted status. True for muted findings only, False for active findings only. If not specified, returns both",
),
delta: list[Literal["new", "changed"]] = Field(
default=[],
description="Show only new or changed findings. Multiple values allowed: new (not seen in previous scans), changed (modified since last scan). If empty, all findings are returned.",
),
date_from: str | None = Field(
default=None,
description="Start date for range query in ISO 8601 format (YYYY-MM-DD, e.g., '2025-01-15'). Full date required - partial dates like '2025' or '2025-01' are not accepted. IMPORTANT: Maximum date range is 2 days. If only date_from is provided, date_to is automatically set to 2 days later. If only one boundary is provided, the other will be auto-calculated to maintain the 2-day window.",
),
date_to: str | None = Field(
default=None,
description="End date for range query in ISO 8601 format (YYYY-MM-DD, e.g., '2025-01-15'). Full date required - partial dates are not accepted. If only date_to is provided, date_from is automatically set to 2 days earlier. Can be used alone or with date_from.",
),
search: str | None = Field(
default=None, description="Free-text search term across finding details"
),
page_size: int = Field(
default=50, description="Number of results to return per page"
),
page_number: int = Field(
default=1, description="Page number to retrieve (1-indexed)"
),
) -> dict[str, Any]:
"""Search and filter security findings across all cloud providers with rich filtering capabilities.
IMPORTANT: This tool returns LIGHTWEIGHT findings. Use this for fast searching and filtering across many findings.
For complete details use prowler_app_get_finding_details on specific findings.
Default behavior:
- Returns latest findings from most recent scans (no date parameters needed)
- Filters to FAIL status only (security issues found)
- Returns 50 results per page
Date filtering:
- Without dates: queries findings from the most recent completed scan across all providers (most efficient)
- With dates: queries historical findings (2-day maximum range between date_from and date_to)
Each finding includes:
- Core identification: id (UUID for get_finding_details), uid, check_id
- Security context: status (FAIL/PASS/MANUAL), severity (critical/high/medium/low/informational)
- State tracking: delta (new/changed/unchanged), muted (boolean), muted_reason
- Extended details: status_extended with additional context
Workflow:
1. Use this tool to search and filter findings by severity, status, provider, service, region, etc.
2. Use prowler_app_get_finding_details with the finding 'id' to get complete information about the finding
"""
# Validate page_size parameter
self.api_client.validate_page_size(page_size)
# Determine endpoint based on date parameters
date_range = self.api_client.normalize_date_range(
date_from, date_to, max_days=2
)
if date_range is None:
# No dates provided - use latest findings endpoint
endpoint = "/api/v1/findings/latest"
params = {}
else:
# Dates provided - use historical findings endpoint
endpoint = "/api/v1/findings"
params = {
"filter[inserted_at__gte]": date_range[0],
"filter[inserted_at__lte]": date_range[1],
}
# Build filter parameters
if severity:
params["filter[severity__in]"] = severity
if status:
params["filter[status__in]"] = status
if provider_type:
params["filter[provider_type__in]"] = provider_type
if provider_alias:
params["filter[provider_alias__icontains]"] = provider_alias
if region:
params["filter[region__in]"] = region
if service:
params["filter[service__in]"] = service
if resource_type:
params["filter[resource_type__in]"] = resource_type
if check_id:
params["filter[check_id__in]"] = check_id
if muted is not None:
params["filter[muted]"] = (
muted if isinstance(muted, bool) else muted == "true"
)
if delta:
params["filter[delta__in]"] = delta
if search:
params["filter[search]"] = search
# Pagination
params["page[size]"] = page_size
params["page[number]"] = page_number
# Return only LLM-relevant fields
params["fields[findings]"] = (
"uid,status,severity,check_id,check_metadata,status_extended,delta,muted,muted_reason"
)
params["sort"] = "severity,-inserted_at"
# Convert lists to comma-separated strings
clean_params = self.api_client.build_filter_params(params)
# Get API response and transform to simplified format
api_response = await self.api_client.get(endpoint, params=clean_params)
simplified_response = FindingsListResponse.from_api_response(api_response)
return simplified_response.model_dump()
async def get_finding_details(
self,
finding_id: str = Field(
description="UUID of the finding to retrieve (must be a valid UUID format, e.g., '019ac0d6-90d5-73e9-9acf-c22e256f1bac'). Returns an error if the finding ID is invalid or not found."
),
) -> dict[str, Any]:
"""Retrieve comprehensive details about a specific security finding by its ID.
IMPORTANT: This tool returns COMPLETE finding details.
Use this after finding a specific finding via prowler_app_search_security_findings
This tool provides ALL information that prowler_app_search_security_findings returns PLUS:
1. Check Metadata (information about the check script that generated the finding):
- title: Human-readable phrase used to summarize the check
- description: Detailed explanation of what the check validates and why it is important
- risk: What could happen if this check fails
- remediation: Complete remediation guidance including step-by-step instructions and code snippets with best practices to fix the issue:
* cli: Command-line commands to fix the issue
* terraform: Terraform code snippets with best practices
* nativeiac: Provider native Infrastructure as Code code snippets with best practices to fix the issue
* other: Other remediation code snippets with best practices, usually used for web interfaces or other tools
* recommendation: Text description with general best recommended practices to avoid the issue
- provider: Cloud provider (aws/azure/gcp/etc)
- service: Service name (s3/ec2/keyvault/etc)
- resource_type: Resource type being evaluated
- categories: Security categories this check belongs to
- additional_urls: List of additional URLs related to the check
2. Temporal Metadata:
- inserted_at: When this finding was first inserted into database
- updated_at: When this finding was last updated
- first_seen_at: When this finding was first detected across all scans
3. Relationships:
- scan_id: UUID of the scan that generated this finding
- resource_ids: List of UUIDs for cloud resources associated with this finding
Workflow:
1. Use prowler_app_search_security_findings to browse and filter findings
2. Use this tool with the finding 'id' to get remediation guidance and complete context
"""
params = {
# Return comprehensive fields including temporal metadata
"fields[findings]": "uid,status,severity,check_id,check_metadata,status_extended,delta,muted,muted_reason,inserted_at,updated_at,first_seen_at",
# Include relationships to scan and resources
"include": "scan,resources",
}
# Get API response and transform to detailed format
api_response = await self.api_client.get(
f"/api/v1/findings/{finding_id}", params=params
)
detailed_finding = DetailedFinding.from_api_response(
api_response.get("data", {})
)
return detailed_finding.model_dump()
async def get_findings_overview(
self,
provider_type: list[str] = Field(
default=[],
description="Filter statistics by cloud provider. Multiple values allowed. If empty, all providers are returned. For valid values, please refer to Prowler Hub/Prowler Documentation that you can also find in form of tools in this MCP Server.",
),
) -> dict[str, Any]:
"""Get aggregate statistics and trends about security findings as a markdown report.
This tool provides a HIGH-LEVEL OVERVIEW without retrieving individual findings. Use this when you
need to understand the overall security posture, trends, or remediation progress across all findings.
The markdown report includes:
1. Summary Statistics:
- Total number of findings
- Failed checks (security issues) with percentage
- Passed checks (no issues) with percentage
- Muted findings (user-suppressed) with percentage
2. Delta Analysis (Change Tracking):
- New findings: never seen before in previous scans
* Broken down by: new failures, new passes, new muted
- Changed findings: status changed since last scan
* Broken down by: changed to fail, changed to pass, changed to muted
- Unchanged findings: same status as previous scan
This helps answer questions like:
- "What's my overall security posture?"
- "How many critical security issues do I have?"
- "Are we improving or getting worse over time?"
- "How many new security issues appeared since last scan?"
"""
params = {
# Return only LLM-relevant aggregate statistics
"fields[findings-overview]": "new,changed,fail_new,fail_changed,pass_new,pass_changed,muted_new,muted_changed,total,fail,muted,pass"
}
if provider_type:
params["filter[provider_type__in]"] = provider_type
clean_params = self.api_client.build_filter_params(params)
# Get API response and transform to simplified format
api_response = await self.api_client.get(
"/api/v1/overviews/findings", params=clean_params
)
overview = FindingsOverview.from_api_response(api_response)
# Format as markdown report
total = overview.total
fail = overview.fail
passed = overview.passed
muted = overview.muted
new = overview.new
changed = overview.changed
# Calculate percentages
fail_pct = (fail / total * 100) if total > 0 else 0
passed_pct = (passed / total * 100) if total > 0 else 0
muted_pct = (muted / total * 100) if total > 0 else 0
unchanged = total - new - changed
# Build markdown report
report = f"""# Security Findings Overview
## Summary Statistics
- **Total Findings**: {total:,}
- **Failed Checks**: {fail:,} ({fail_pct:.1f}%)
- **Passed Checks**: {passed:,} ({passed_pct:.1f}%)
- **Muted Findings**: {muted:,} ({muted_pct:.1f}%)
## Delta Analysis
- **New Findings**: {new:,}
- New failures: {overview.fail_new:,}
- New passes: {overview.pass_new:,}
- New muted: {overview.muted_new:,}
- **Changed Findings**: {changed:,}
- Changed to fail: {overview.fail_changed:,}
- Changed to pass: {overview.pass_changed:,}
- Changed to muted: {overview.muted_changed:,}
- **Unchanged**: {unchanged:,}
"""
return {"report": report}
@@ -1,292 +0,0 @@
"""Shared API client utilities for Prowler App tools."""
from datetime import datetime, timedelta
from enum import Enum
from typing import Any, Dict
import httpx
from prowler_mcp_server.lib.logger import logger
from prowler_mcp_server.prowler_app.utils.auth import ProwlerAppAuth
class HTTPMethod(str, Enum):
"""HTTP methods enum."""
GET = "GET"
POST = "POST"
PATCH = "PATCH"
DELETE = "DELETE"
class SingletonMeta(type):
"""Metaclass that implements the Singleton pattern.
This metaclass ensures that only one instance of a class exists.
All calls to the constructor return the same instance.
"""
_instances: Dict[type, Any] = {}
def __call__(cls, *args, **kwargs):
"""Control instance creation to ensure singleton behavior."""
if cls not in cls._instances:
instance = super().__call__(*args, **kwargs)
cls._instances[cls] = instance
return cls._instances[cls]
class ProwlerAPIClient(metaclass=SingletonMeta):
"""Shared API client with smart defaults and helper methods.
This class uses the Singleton pattern via metaclass to ensure only one
instance exists across the application, reducing initialization overhead
and enabling HTTP connection pooling.
"""
def __init__(self) -> None:
"""Initialize the API client (only called once due to singleton pattern)."""
self.auth_manager: ProwlerAppAuth = ProwlerAppAuth()
self.client: httpx.AsyncClient = httpx.AsyncClient(timeout=30.0)
async def _make_request(
self,
method: HTTPMethod,
path: str,
params: dict[str, any] | None = None,
json_data: dict[str, any] | None = None,
) -> dict[str, any]:
"""Make authenticated API request.
Args:
method: HTTP method (GET, POST, PATCH, DELETE)
path: API endpoint path
params: Optional query parameters
json_data: Optional JSON body data
Returns:
API response as dictionary
Raises:
Exception: If API request fails
"""
try:
token: str = await self.auth_manager.get_valid_token()
url: str = f"{self.auth_manager.base_url}{path}"
headers: dict[str, str] = self.auth_manager.get_headers(token)
response: httpx.Response = await self.client.request(
method=method.value,
url=url,
headers=headers,
params=params,
json=json_data,
)
response.raise_for_status()
return response.json()
except httpx.HTTPStatusError as e:
logger.error(f"HTTP error during {method.value} {path}: {e}")
error_detail: str = ""
try:
error_data: dict[str, any] = e.response.json()
error_detail = error_data.get("errors", [{}])[0].get("detail", "")
except Exception:
error_detail = e.response.text
raise Exception(
f"API request failed: {e.response.status_code} - {error_detail}"
)
except Exception as e:
logger.error(f"Error during {method.value} {path}: {e}")
raise
async def get(
self, path: str, params: dict[str, any] | None = None
) -> dict[str, any]:
"""Make GET request.
Args:
path: API endpoint path
params: Optional query parameters
Returns:
API response as dictionary
Raises:
Exception: If API request fails
"""
return await self._make_request(HTTPMethod.GET, path, params=params)
async def post(
self,
path: str,
params: dict[str, any] | None = None,
json_data: dict[str, any] | None = None,
) -> dict[str, any]:
"""Make POST request.
Args:
path: API endpoint path
params: Optional query parameters
json_data: Optional JSON body data
Returns:
API response as dictionary
Raises:
Exception: If API request fails
"""
return await self._make_request(
HTTPMethod.POST, path, params=params, json_data=json_data
)
async def patch(
self,
path: str,
params: dict[str, any] | None = None,
json_data: dict[str, any] | None = None,
) -> dict[str, any]:
"""Make PATCH request.
Args:
path: API endpoint path
params: Optional query parameters
json_data: Optional JSON body data
Returns:
API response as dictionary
Raises:
Exception: If API request fails
"""
return await self._make_request(
HTTPMethod.PATCH, path, params=params, json_data=json_data
)
async def delete(
self, path: str, params: dict[str, any] | None = None
) -> dict[str, any]:
"""Make DELETE request.
Args:
path: API endpoint path
params: Optional query parameters
Returns:
API response as dictionary
Raises:
Exception: If API request fails
"""
return await self._make_request(HTTPMethod.DELETE, path, params=params)
def _validate_date_format(self, date_str: str, param_name: str) -> datetime:
"""Validate date string format.
Args:
date_str: Date string to validate
param_name: Parameter name for error messages
Returns:
Parsed datetime object
Raises:
ValueError: If date format is invalid
"""
try:
return datetime.strptime(date_str, "%Y-%m-%d")
except ValueError:
raise ValueError(
f"Invalid date format for {param_name}. Expected YYYY-MM-DD (e.g., '2025-01-15'), got '{date_str}'. "
f"Full date required - partial dates like '2025' or '2025-01' are not accepted."
)
def validate_page_size(self, page_size: int) -> None:
"""Validate page size parameter.
Args:
page_size: Page size to validate
Raises:
ValueError: If page size is out of valid range (1-1000)
"""
if page_size < 1 or page_size > 1000:
raise ValueError(
f"Invalid page_size: {page_size}. Must be between 1 and 1000 (inclusive)."
)
def normalize_date_range(
self, date_from: str | None, date_to: str | None, max_days: int = 2
) -> tuple[str, str] | None:
"""Normalize and validate date range, auto-completing missing boundary.
The Prowler API has a 2-day limit for historical queries. This helper:
1. Returns None if no dates provided (signals: use latest/default endpoint)
2. Auto-completes missing boundary to maintain 2-day window
3. Validates the range doesn't exceed max_days
Args:
date_from: Start date (YYYY-MM-DD format) or None
date_to: End date (YYYY-MM-DD format) or None
max_days: Maximum allowed days between dates (default: 2)
Returns:
None if no dates provided, otherwise tuple of (date_from, date_to) as strings
Raises:
ValueError: If date range exceeds max_days or date format is invalid
"""
if not date_from and not date_to:
return None
# Parse and validate provided dates
from_date: datetime | None = (
self._validate_date_format(date_from, "date_from") if date_from else None
)
to_date: datetime | None = (
self._validate_date_format(date_to, "date_to") if date_to else None
)
# Auto-complete missing boundary to maintain max_days window
if from_date and not to_date:
to_date = from_date + timedelta(days=max_days - 1)
elif to_date and not from_date:
from_date = to_date - timedelta(days=max_days - 1)
# Validate range doesn't exceed max_days
delta: int = (to_date - from_date).days + 1
if delta > max_days:
raise ValueError(
f"Date range cannot exceed {max_days} days. "
f"Requested range: {from_date.date()} to {to_date.date()} ({delta} days)"
)
return from_date.strftime("%Y-%m-%d"), to_date.strftime("%Y-%m-%d")
def build_filter_params(
self, params: dict[str, any], exclude_none: bool = True
) -> dict[str, any]:
"""Build filter parameters for API, converting types to API-compatible formats.
Args:
params: Dictionary of parameters
exclude_none: If True, exclude None values from result
Returns:
Cleaned parameter dictionary ready for API
"""
result: dict[str, any] = {}
for key, value in params.items():
if value is None and exclude_none:
continue
# Convert boolean values to lowercase strings for API compatibility
if isinstance(value, bool):
result[key] = str(value).lower()
# Convert lists/arrays to comma-separated strings
elif isinstance(value, (list, tuple)):
result[key] = ",".join(str(v) for v in value)
else:
result[key] = value
return result
@@ -0,0 +1,732 @@
{
"endpoints": {
"* /api/v1/providers*": {
"parameters": {
"id": {
"name": "provider_id",
"description": "The UUID of the provider. This UUID is generated by Prowler and it is not related with the UID of the provider (that is the one that is set by the provider).\n\tThe format is UUIDv4: \"4d0e2614-6385-4fa7-bf0b-c2e2f75c6877\""
}
}
},
"GET /api/v1/providers": {
"name": "list_providers",
"description": "List all providers with options for filtering by various criteria.",
"parameters": {
"fields[providers]": {
"name": "fields",
"description": "The tool will return only the specified fields, if not set all are returned (comma-separated, e.g. \"uid,delta,status\")"
},
"filter[alias]": {
"name": "filter_alias",
"description": "Filter by exact alias name"
},
"filter[alias__icontains]": {
"name": "filter_alias_contains",
"description": "Filter by partial alias match"
},
"filter[alias__in]": {
"name": "filter_alias_in",
"description": "Filter by multiple aliases (comma-separated, e.g. \"aws_alias_1,azure_alias_2\"). Useful when searching for multiple providers at once."
},
"filter[connected]": {
"name": "filter_connected",
"description": "Filter by connected status (True for connected, False for connection failed, if not set all both are returned).\n\tIf the connection haven't been attempted yet, the status will be None and does not apply for this filter."
},
"filter[id]": {
"name": "filter_id",
"description": "Filter by exact ID of the provider (UUID)"
},
"filter[id__in]": {
"name": "filter_id_in",
"description": "Filter by multiple IDs of the providers (comma-separated UUIDs, e.g. \"a1b2c3d4-5678-90ab-cdef-1234567890ab,deadbeef-1234-5678-9abc-def012345678,0f1e2d3c-4b5a-6978-8c9d-0e1f2a3b4c5d\"). Useful when searching for multiple providers at once."
},
"filter[inserted_at]": {
"name": "filter_inserted_at",
"description": "Filter by exact date (format: YYYY-MM-DD). This is the date when the provider was inserted into the database."
},
"filter[inserted_at__gte]": {
"name": "filter_inserted_at_gte",
"description": "Filter providers inserted on or after this date (format: YYYY-MM-DD)"
},
"filter[inserted_at__lte]": {
"name": "filter_inserted_at_lte",
"description": "Filter providers inserted on or before this date (format: YYYY-MM-DD)"
},
"filter[provider]": {
"name": "filter_provider",
"description": "Filter by single provider type"
},
"filter[provider__in]": {
"name": "filter_provider_in",
"description": "Filter by multiple provider types (comma-separated, e.g. \"aws,azure,gcp\")"
},
"filter[search]": {
"name": "filter_search",
"description": "A search term accross \"provider\", \"alias\" and \"uid\""
},
"filter[uid]": {
"name": "filter_uid",
"description": "Filter by exact finding UID"
},
"filter[uid__icontains]": {
"name": "filter_uid_contains",
"description": "Filter by partial finding UID match"
},
"filter[uid__in]": {
"name": "filter_uid_in",
"description": "Filter by multiple UIDs (comma-separated UUIDs)"
},
"filter[updated_at]": {
"name": "filter_updated_at",
"description": "Filter by exact date (format: YYYY-MM-DD). This is the date when the provider was updated in the database."
},
"filter[updated_at__gte]": {
"name": "filter_updated_at_gte",
"description": "Filter providers updated on or after this date (format: YYYY-MM-DD)"
},
"filter[updated_at__lte]": {
"name": "filter_updated_at_lte",
"description": "Filter providers updated on or before this date (format: YYYY-MM-DD)"
},
"include": {
"name": "include",
"description": "Include related resources in the response, for now only \"provider_groups\" is supported"
},
"page[number]": {
"name": "page_number",
"description": "Page number to retrieve (default: 1)"
},
"page[size]": {
"name": "page_size",
"description": "Number of results per page (default: 100)"
},
"sort": {
"name": "sort",
"description": "Sort the results by the specified fields. Use '-' prefix for descending order. (e.g. \"-provider,inserted_at\", this first sorts by provider alphabetically and then inside of each category by inserted_at date)"
}
}
},
"POST /api/v1/providers": {
"name": "create_provider",
"description": "Create a new provider in the current Prowler Tenant.\n\tThis is just for creating a new provider, not for adding/configuring credentials. To add credentials to an existing provider, use tool add_provider_secret from Prowler MCP server",
"parameters": {
"alias": {
"description": "Pseudonym name to identify the provider"
},
"provider": {
"description": "Type of provider to create"
},
"uid": {
"description": "UID for the provider. This UID is dependent on the provider type: \n\tAWS: AWS account ID\n\tAzure: Azure subscription ID\n\tGCP: GCP project ID\n\tKubernetes: Kubernetes namespace\n\tM365: M365 domain ID\n\tGitHub: GitHub username or organization name"
}
}
},
"GET /api/v1/providers/{id}": {
"name": "get_provider",
"description": "Get detailed information about a specific provider",
"parameters": {
"fields[providers]": {
"name": "fields",
"description": "The tool will return only the specified fields, if not set all are returned (comma-separated, e.g. \"uid,alias,connection\")."
},
"include": {
"description": "Include related resources in the response, for now only \"provider_groups\" is supported"
}
}
},
"PATCH /api/v1/providers/{id}": {
"name": "update_provider",
"description": "Update the details of a specific provider",
"parameters": {
"alias": {
"description": "Pseudonym name to identify the provider, if not set, the alias will not be updated"
}
}
},
"DELETE /api/v1/providers/{id}": {
"name": "delete_provider",
"description": "Delete a specific provider"
},
"POST /api/v1/providers/{id}/connection": {
"name": "test_provider_connection",
"description": "Test the connection status of a specific provider with the credentials set in the provider secret. Needed to be done before running a scan."
},
"GET /api/v1/providers/secrets": {
"name": "list_provider_secrets",
"description": "List all provider secrets with options for filtering by various criteria",
"parameters": {
"fields[provider-secrets]": {
"name": "fields",
"description": "The tool will return only the specified fields, if not set all are returned (comma-separated, e.g. \"name,secret_type,provider\")"
},
"filter[inserted_at]": {
"name": "filter_inserted_at",
"description": "Filter by exact date when the secret was inserted (format: YYYY-MM-DD)"
},
"filter[name]": {
"name": "filter_name",
"description": "Filter by exact secret name"
},
"filter[name__icontains]": {
"name": "filter_name_contains",
"description": "Filter by partial secret name match"
},
"filter[provider]": {
"name": "filter_provider",
"description": "Filter by prowler provider UUID (UUIDv4)"
},
"filter[search]": {
"name": "filter_search",
"description": "Search term in name attribute"
},
"filter[updated_at]": {
"name": "filter_updated_at",
"description": "Filter by exact update date (format: YYYY-MM-DD)"
},
"page[number]": {
"name": "page_number",
"description": "Page number to retrieve (default: 1)"
},
"page[size]": {
"name": "page_size",
"description": "Number of results per page"
},
"sort": {
"name": "sort",
"description": "Sort the results by the specified fields. You can specify multiple fields separated by commas; the results will be sorted by the first field, then by the second within each group of the first, and so on. Use '-' as a prefix to a field name for descending order (e.g. \"-name,inserted_at\" sorts by name descending, then by inserted_at ascending within each name). If not set, the default sort order will be applied"
}
}
},
"* /api/v1/providers/secrets*": {
"parameters": {
"secret": {
"name": "credentials",
"description": "Provider-specific credentials dictionary. Supported formats:\n - AWS Static: {\"aws_access_key_id\": \"...\", \"aws_secret_access_key\": \"...\", \"aws_session_token\": \"...\"}\n - AWS Assume Role: {\"role_arn\": \"...\", \"external_id\": \"...\", \"session_duration\": 3600, \"role_session_name\": \"...\"}\n - Azure: {\"tenant_id\": \"...\", \"client_id\": \"...\", \"client_secret\": \"...\"}\n - M365: {\"tenant_id\": \"...\", \"client_id\": \"...\", \"client_secret\": \"...\", \"user\": \"...\", \"password\": \"...\"}\n - GCP Static: {\"client_id\": \"...\", \"client_secret\": \"...\", \"refresh_token\": \"...\"}\n - GCP Service Account: {\"service_account_key\": {...}}\n - Kubernetes: {\"kubeconfig_content\": \"...\"}\n - GitHub PAT: {\"personal_access_token\": \"...\"}\n - GitHub OAuth: {\"oauth_app_token\": \"...\"}\n - GitHub App: {\"github_app_id\": 123, \"github_app_key\": \"path/to/key\"}"
},
"secret_type": {
"description": "Type of secret:\n\tstatic: Static credentials\n\trole: Assume role credentials (for now only AWS is supported)\n\tservice_account: Service account credentials (for now only GCP is supported)"
}
}
},
"POST /api/v1/providers/secrets": {
"name": "add_provider_secret",
"description": "Add or update complete credentials for an existing provider",
"parameters": {
"provider_id": {
"description": "The UUID of the provider. This UUID is generated by Prowler and it is not related with the UID of the provider, the format is UUIDv4: \"4d0e2614-6385-4fa7-bf0b-c2e2f75c6877\""
},
"name": {
"name": "secret_name",
"description": "Name for the credential secret. This must be between 3 and 100 characters long"
}
}
},
"GET /api/v1/providers/secrets/{id}": {
"name": "get_provider_secret",
"description": "Get detailed information about a specific provider secret",
"parameters": {
"id": {
"name": "provider_secret_id",
"description": "The UUID of the provider secret"
},
"fields[provider-secrets]": {
"name": "fields",
"description": "The tool will return only the specified fields, if not set all are returned (comma-separated, e.g. \"name,secret_type,provider\")"
}
}
},
"PATCH /api/v1/providers/secrets/{id}": {
"name": "update_provider_secret",
"description": "Update the details of a specific provider secret",
"parameters": {
"id": {
"name": "provider_secret_id",
"description": "The UUID of the provider secret."
},
"name": {
"name": "secret_name",
"description": "Name for the credential secret. This must be between 3 and 100 characters long"
}
}
},
"DELETE /api/v1/providers/secrets/{id}": {
"name": "delete_provider_secret",
"description": "Delete a specific provider secret",
"parameters": {
"id": {
"name": "provider_secret_id",
"description": "The UUID of the provider secret."
}
}
},
"GET /api/v1/findings*": {
"parameters": {
"fields[findings]": {
"name": "fields",
"description": "The tool will return only the specified fields, if not set all are returned (comma-separated, e.g. \"uid,delta,status,status_extended,severity,check_id,scan\")"
},
"filter[check_id]": {
"name": "filter_check_id",
"description": "Filter by exact check ID (e.g. ec2_launch_template_imdsv2_required). To get the list of available checks for a provider, use tool get_checks from Prowler Hub MCP server"
},
"filter[check_id__icontains]": {
"name": "filter_check_id_contains",
"description": "Filter by partial check ID match (e.g. \"iam\" matches all IAM-related checks for all providers)"
},
"filter[check_id__in]": {
"name": "filter_check_id_in",
"description": "Filter by multiple check IDs (comma-separated, e.g. \"ec2_launch_template_imdsv2_required,bedrock_guardrail_prompt_attack_filter_enabled,vpc_endpoint_multi_az_enabled\")"
},
"filter[delta]": {
"name": "filter_delta",
"description": "Filter by finding delta status"
},
"filter[id]": {
"name": "filter_id",
"description": "Filter by exact finding ID (main key in the database, it is a UUIDv7). It is not the same as the finding UID."
},
"filter[id__in]": {
"name": "filter_id_in",
"description": "Filter by multiple finding IDs (comma-separated UUIDs)"
},
"filter[inserted_at]": {
"name": "filter_inserted_at",
"description": "Filter by exact date (format: YYYY-MM-DD)."
},
"filter[inserted_at__date]": {
"name": "filter_inserted_at_date",
"description": "Filter by exact date (format: YYYY-MM-DD). Same as filter_inserted_at parameter."
},
"filter[inserted_at__gte]": {
"name": "filter_inserted_at_gte",
"description": "Filter findings inserted on or after this date (format: YYYY-MM-DD)"
},
"filter[inserted_at__lte]": {
"name": "filter_inserted_at_lte",
"description": "Filter findings inserted on or before this date (format: YYYY-MM-DD)"
},
"filter[muted]": {
"name": "filter_muted",
"description": "Filter by muted status (True for muted, False for non-muted, if not set all both are returned). A muted finding is a finding that has been muted by the user to ignore it."
},
"filter[provider]": {
"name": "filter_provider",
"description": "Filter by exact provider UUID (UUIDv4). This UUID is generated by Prowler and it is not related with the UID of the provider (that is the one that is set by the provider). The format is UUIDv4: \"4d0e2614-6385-4fa7-bf0b-c2e2f75c6877\""
},
"filter[provider__in]": {
"name": "filter_provider_in",
"description": "Filter by multiple provider UUIDs (comma-separated UUIDs, e.g. \"4d0e2614-6385-4fa7-bf0b-c2e2f75c6877,deadbeef-1234-5678-9abc-def012345678,0f1e2d3c-4b5a-6978-8c9d-0e1f2a3b4c5d\"). Useful when searching for multiple providers at once."
},
"filter[provider_alias]": {
"name": "filter_provider_alias",
"description": "Filter by exact provider alias name"
},
"filter[provider_alias__icontains]": {
"name": "filter_provider_alias_contains",
"description": "Filter by partial provider alias match"
},
"filter[provider_alias__in]": {
"name": "filter_provider_alias_in",
"description": "Filter by multiple provider aliases (comma-separated)"
},
"filter[provider_id]": {
"name": "filter_provider_id",
"description": "Filter by exact provider ID (UUID)"
},
"filter[provider_id__in]": {
"name": "filter_provider_id_in",
"description": "Filter by multiple provider IDs (comma-separated UUIDs)"
},
"filter[provider_type]": {
"name": "filter_provider_type",
"description": "Filter by single provider type"
},
"filter[provider_type__in]": {
"name": "filter_provider_type_in",
"description": "Filter by multiple provider types (comma-separated, e.g. \"aws,azure,gcp\"). Allowed values are: aws, azure, gcp, kubernetes, m365, github"
},
"filter[provider_uid]": {
"name": "filter_provider_uid",
"description": "Filter by exact provider UID. This UID is dependent on the provider type: \n\tAWS: AWS account ID\n\tAzure: Azure subscription ID\n\tGCP: GCP project ID\n\tKubernetes: Kubernetes namespace\n\tM365: M365 domain ID\n\tGitHub: GitHub username or organization name"
},
"filter[provider_uid__icontains]": {
"name": "filter_provider_uid_contains",
"description": "Filter by partial provider UID match"
},
"filter[provider_uid__in]": {
"name": "filter_provider_uid_in",
"description": "Filter by multiple provider UIDs (comma-separated UUIDs)"
},
"filter[region]": {
"name": "filter_region",
"description": "Filter by exact region name (e.g. us-east-1, eu-west-1, etc.). To get a list of available regions in a subset of findings, use tool get_findings_metadata from Prowler MCP server"
},
"filter[region__icontains]": {
"name": "filter_region_contains",
"description": "Filter by partial region match (e.g. \"us-\" matches all US regions)"
},
"filter[region__in]": {
"name": "filter_region_in",
"description": "Filter by multiple regions (comma-separated, e.g. \"us-east-1,us-west-2,eu-west-1\")"
},
"filter[resource_name]": {
"name": "filter_resource_name",
"description": "Filter by exact resource name that finding is associated with"
},
"filter[resource_name__icontains]": {
"name": "filter_resource_name_contains",
"description": "Filter by partial resource name match that finding is associated with"
},
"filter[resource_name__in]": {
"name": "filter_resource_name_in",
"description": "Filter by multiple resource names (comma-separated) that finding is associated with"
},
"filter[resource_type]": {
"name": "filter_resource_type",
"description": "Filter by exact resource type that finding is associated with"
},
"filter[resource_type__icontains]": {
"name": "filter_resource_type_contains",
"description": "Filter by partial resource type match that finding is associated with"
},
"filter[resource_type__in]": {
"name": "filter_resource_type_in",
"description": "Filter by multiple resource types (comma-separated) that finding is associated with"
},
"filter[resource_uid]": {
"name": "filter_resource_uid",
"description": "Filter by exact resource UID that finding is associated with"
},
"filter[resource_uid__icontains]": {
"name": "filter_resource_uid_contains",
"description": "Filter by partial resource UID match that finding is associated with"
},
"filter[resource_uid__in]": {
"name": "filter_resource_uid_in",
"description": "Filter by multiple resource UIDss (comma-separated) that finding is associated with"
},
"filter[resources]": {
"name": "filter_resources",
"description": "Filter by multiple resources (comma-separated) that finding is associated with. The accepted vaules are internal Prowler generated resource UUIDs"
},
"filter[scan]": {
"name": "filter_scan",
"description": "Filter by scan UUID"
},
"filter[scan__in]": {
"name": "filter_scan_in",
"description": "Filter by multiple scan UUIDs (comma-separated UUIDs)"
},
"filter[service]": {
"name": "filter_service",
"description": "Filter by exact service name (e.g. s3, rds, ec2, keyvault, etc.). To get the list of available services, use tool list_providers from Prowler Hub MCP server"
},
"filter[service__icontains]": {
"name": "filter_service_contains",
"description": "Filter by partial service name match (e.g. \"storage\" matches all storage-related services)"
},
"filter[service__in]": {
"name": "filter_service_in",
"description": "Filter by multiple service names (comma-separated, e.g. \"s3,ec2,iam\")"
},
"filter[severity]": {
"name": "filter_severity",
"description": "Filter by single severity (critical, high, medium, low, informational)"
},
"filter[severity__in]": {
"name": "filter_severity_in",
"description": "Filter by multiple severities (comma-separated, e.g. \"critical,high\")"
},
"filter[status]": {
"name": "filter_status",
"description": "Filter by single status"
},
"filter[status__in]": {
"name": "filter_status_in",
"description": "Filter by multiple statuses (comma-separated, e.g. \"FAIL,MANUAL\"). Allowed values are: PASS, FAIL, MANUAL"
},
"filter[uid]": {
"name": "filter_uid",
"description": "Filter by exact finding UID assigned by Prowler"
},
"filter[uid__in]": {
"name": "filter_uid_in",
"description": "Filter by multiple finding UIDs (comma-separated UUIDs)"
},
"filter[updated_at]": {
"name": "filter_updated_at",
"description": "Filter by exact update date (format: YYYY-MM-DD)"
},
"filter[updated_at__gte]": {
"name": "filter_updated_at_gte",
"description": "Filter by update date on or after this date (format: YYYY-MM-DD)"
},
"filter[updated_at__lte]": {
"name": "filter_updated_at_lte",
"description": "Filter by update date on or before this date (format: YYYY-MM-DD)"
},
"include": {
"name": "include",
"description": "Include related resources in the response, supported values are: \"resources\" and \"scan\""
},
"page[number]": {
"name": "page_number",
"description": "Page number to retrieve (default: 1)"
},
"page[size]": {
"name": "page_size",
"description": "Number of results per page (default: 100)"
},
"sort": {
"name": "sort",
"description": "Sort the results by the specified fields. You can specify multiple fields separated by commas; the results will be sorted by the first field, then by the second within each group of the first, and so on. Use '-' as a prefix to a field name for descending order (e.g. \"status,-severity\" sorts by status ascending alphabetically and then by severity descending within each status alphabetically)"
}
}
},
"GET /api/v1/findings": {
"name": "list_findings",
"description": "List security findings from Prowler scans with advanced filtering.\n\tAt least one of the variations of the filter[inserted_at] is required. If not provided, defaults to findings from the last day."
},
"GET /api/v1/findings/{id}": {
"name": "get_finding",
"description": "Get detailed information about a specific security finding",
"parameters": {
"id": {
"name": "finding_id",
"description": "The UUID of the finding"
}
}
},
"GET /api/v1/findings/latest": {
"name": "get_latest_findings",
"description": "Retrieve a list of the latest findings from the latest scans for each provider with advanced filtering options"
},
"GET /api/v1/findings/metadata": {
"name": "get_findings_metadata",
"description": "Fetch unique metadata values from a filtered set of findings. This is useful for dynamic filtering",
"parameters": {
"fields[findings-metadata]": {
"name": "metadata_fields",
"description": "Specific metadata fields to return (comma-separated, e.g. 'regions,services,check_ids')"
}
}
},
"GET /api/v1/findings/metadata/latest": {
"name": "get_latest_findings_metadata",
"description": "Fetch unique metadata values from the latest findings across all providers"
},
"* /api/v1/scans*": {
"parameters": {
"id": {
"name": "scan_id",
"description": "The UUID of the scan. The format is UUIDv4: \"4d0e2614-6385-4fa7-bf0b-c2e2f75c6877\""
}
}
},
"GET /api/v1/scans": {
"name": "list_scans",
"description": "List all scans with options for filtering by various criteria.",
"parameters": {
"fields[scans]": {
"name": "fields",
"description": "The tool will return only the specified fields, if not set all are returned (comma-separated, e.g. \"name,state,progress,duration\")"
},
"filter[completed_at]": {
"name": "filter_completed_at",
"description": "Filter by exact completion date (format: YYYY-MM-DD)"
},
"filter[inserted_at]": {
"name": "filter_inserted_at",
"description": "Filter by exact insertion date (format: YYYY-MM-DD)"
},
"filter[name]": {
"name": "filter_name",
"description": "Filter by exact scan name"
},
"filter[name__icontains]": {
"name": "filter_name_contains",
"description": "Filter by partial scan name match"
},
"filter[next_scan_at]": {
"name": "filter_next_scan_at",
"description": "Filter by exact next scan date (format: YYYY-MM-DD)"
},
"filter[next_scan_at__gte]": {
"name": "filter_next_scan_at_gte",
"description": "Filter scans scheduled on or after this date (format: YYYY-MM-DD)"
},
"filter[next_scan_at__lte]": {
"name": "filter_next_scan_at_lte",
"description": "Filter scans scheduled on or before this date (format: YYYY-MM-DD)"
},
"filter[provider]": {
"name": "filter_provider",
"description": "Filter by exact provider UUID (UUIDv4). This UUID is generated by Prowler and it is not related with the UID of the provider (that is the one that is set by the provider). The format is UUIDv4: \"4d0e2614-6385-4fa7-bf0b-c2e2f75c6877\""
},
"filter[provider__in]": {
"name": "filter_provider_in",
"description": "Filter by multiple provider UUIDs (comma-separated UUIDs, e.g. \"4d0e2614-6385-4fa7-bf0b-c2e2f75c6877,deadbeef-1234-5678-9abc-def012345678,0f1e2d3c-4b5a-6978-8c9d-0e1f2a3b4c5d\"). Useful when searching for multiple providers at once."
},
"filter[provider_alias]": {
"name": "filter_provider_alias",
"description": "Filter by exact provider alias name"
},
"filter[provider_alias__icontains]": {
"name": "filter_provider_alias_contains",
"description": "Filter by partial provider alias match"
},
"filter[provider_alias__in]": {
"name": "filter_provider_alias_in",
"description": "Filter by multiple provider aliases (comma-separated)"
},
"filter[provider_type]": {
"name": "filter_provider_type",
"description": "Filter by single provider type (aws, azure, gcp, github, kubernetes, m365)"
},
"filter[provider_type__in]": {
"name": "filter_provider_type_in",
"description": "Filter by multiple provider types (comma-separated, e.g. \"aws,azure,gcp\"). Allowed values are: aws, azure, gcp, kubernetes, m365, github"
},
"filter[provider_uid]": {
"name": "filter_provider_uid",
"description": "Filter by exact provider UID. This UID is dependent on the provider type: \n\tAWS: AWS account ID\n\tAzure: Azure subscription ID\n\tGCP: GCP project ID\n\tKubernetes: Kubernetes namespace\n\tM365: M365 domain ID\n\tGitHub: GitHub username or organization name"
},
"filter[provider_uid__icontains]": {
"name": "filter_provider_uid_contains",
"description": "Filter by partial provider UID match"
},
"filter[provider_uid__in]": {
"name": "filter_provider_uid_in",
"description": "Filter by multiple provider UIDs (comma-separated)"
},
"filter[scheduled_at]": {
"name": "filter_scheduled_at",
"description": "Filter by exact scheduled date (format: YYYY-MM-DD)"
},
"filter[scheduled_at__gte]": {
"name": "filter_scheduled_at_gte",
"description": "Filter scans scheduled on or after this date (format: YYYY-MM-DD)"
},
"filter[scheduled_at__lte]": {
"name": "filter_scheduled_at_lte",
"description": "Filter scans scheduled on or before this date (format: YYYY-MM-DD)"
},
"filter[search]": {
"name": "filter_search",
"description": "Search term across multiple scan attributes including: name (scan name), trigger (Manual/Scheduled), state (Available, Executing, Completed, Failed, etc.), unique_resource_count (number of resources found), progress (scan progress percentage), duration (scan duration), scheduled_at (when scan is scheduled), started_at (when scan started), completed_at (when scan completed), and next_scan_at (next scheduled scan time)"
},
"filter[started_at]": {
"name": "filter_started_at",
"description": "Filter by exact start date (format: YYYY-MM-DD)"
},
"filter[started_at__gte]": {
"name": "filter_started_at_gte",
"description": "Filter scans started on or after this date (format: YYYY-MM-DD)"
},
"filter[started_at__lte]": {
"name": "filter_started_at_lte",
"description": "Filter scans started on or before this date (format: YYYY-MM-DD)"
},
"filter[state]": {
"name": "filter_state",
"description": "Filter by exact scan state"
},
"filter[state__in]": {
"name": "filter_state_in",
"description": "Filter by multiple scan states (comma-separated)"
},
"filter[trigger]": {
"name": "filter_trigger",
"description": "Filter by scan trigger type"
},
"filter[trigger__in]": {
"name": "filter_trigger_in",
"description": "Filter by multiple trigger types (comma-separated)"
},
"include": {
"name": "include",
"description": "Include related resources in the response, supported value is \"provider\""
},
"page[number]": {
"name": "page_number",
"description": "Page number to retrieve (default: 1)"
},
"page[size]": {
"name": "page_size",
"description": "Number of results per page (default: 100)"
},
"sort": {
"name": "sort",
"description": "Sort the results by the specified fields. Use '-' prefix for descending order. (e.g. \"-started_at,name\")"
}
}
},
"POST /api/v1/scans": {
"name": "create_scan",
"description": "Trigger a manual scan for a specific provider",
"parameters": {
"provider_id": {
"name": "provider_id",
"description": "Prowler generated UUID of the provider to scan. The format is UUIDv4: \"4d0e2614-6385-4fa7-bf0b-c2e2f75c6877\""
},
"name": {
"description": "Optional name for the scan"
}
}
},
"GET /api/v1/scans/{id}": {
"name": "get_scan",
"description": "Get detailed information about a specific scan",
"parameters": {
"fields[scans]": {
"name": "fields",
"description": "The tool will return only the specified fields, if not set all are returned (comma-separated, e.g. \"name,state,progress,duration\")"
},
"include": {
"description": "Include related resources in the response, supported value is \"provider\""
}
}
},
"PATCH /api/v1/scans/{id}": {
"name": "update_scan",
"description": "Update the details of a specific scan",
"parameters": {
"name": {
"description": "Name for the scan to be updated"
}
}
},
"GET /api/v1/scans/{id}/compliance/{name}": {
"name": "get_scan_compliance_report",
"description": "Download a specific compliance report (e.g., 'cis_1.4_aws') as a CSV file",
"parameters": {
"name": {
"name": "compliance_name"
},
"fields[scan-reports]": {
"name": "fields",
"description": "The tool will return only the specified fields, if not set all are returned (comma-separated, e.g. \"id,name\")"
}
}
},
"GET /api/v1/scans/{id}/report": {
"name": "get_scan_report",
"description": "Download a ZIP file containing the scan report",
"parameters": {
"fields[scan-reports]": {
"name": "fields",
"description": "Not use this parameter for now"
}
}
},
"POST /api/v1/schedules/daily": {
"name": "schedules_daily_scan",
"parameters": {
"provider_id": {
"name": "provider_id",
"description": "Prowler generated UUID of the provider to scan. The format is UUIDv4: \"4d0e2614-6385-4fa7-bf0b-c2e2f75c6877\""
}
}
}
}
}
@@ -0,0 +1,974 @@
#!/usr/bin/env python3
"""
Generate FastMCP server code from OpenAPI specification.
This script parses an OpenAPI specification file and generates FastMCP tool functions
with proper type hints, parameters, and docstrings.
"""
import json
import os
import re
from datetime import datetime
from pathlib import Path
from typing import Optional
import requests
import yaml
class OpenAPIToMCPGenerator:
def __init__(
self,
spec_file: str,
custom_auth_module: Optional[str] = None,
exclude_patterns: Optional[list[str]] = None,
exclude_operations: Optional[list[str]] = None,
exclude_tags: Optional[list[str]] = None,
include_only_tags: Optional[list[str]] = None,
config_file: Optional[str] = None,
):
"""
Initialize the generator with an OpenAPI spec file.
Args:
spec_file: Path to OpenAPI specification file
custom_auth_module: Module path for custom authentication
exclude_patterns: list of regex patterns to exclude endpoints (matches against path)
exclude_operations: list of operation IDs to exclude
exclude_tags: list of tags to exclude
include_only_tags: If specified, only include endpoints with these tags
config_file: Path to JSON configuration file for custom mappings
"""
self.spec_file = spec_file
self.custom_auth_module = custom_auth_module
self.exclude_patterns = exclude_patterns or []
self.exclude_operations = exclude_operations or []
self.exclude_tags = exclude_tags or []
self.include_only_tags = include_only_tags
self.config_file = config_file
self.config = self._load_config() if config_file else {}
self.spec = self._load_spec()
self.generated_tools = []
self.imports = set()
self.needs_query_array_normalizer = False
self.type_mapping = {
"string": "str",
"integer": "str",
"number": "str",
"boolean": "bool | str",
"array": "list[Any] | str",
"object": "dict[str, Any] | str",
}
def _load_config(self) -> dict:
"""Load configuration from JSON file."""
try:
with open(self.config_file, "r") as f:
return json.load(f)
except FileNotFoundError:
return {}
except json.JSONDecodeError:
return {}
def _load_spec(self) -> dict:
"""Load OpenAPI specification from file."""
with open(self.spec_file, "r") as f:
if self.spec_file.endswith(".yaml") or self.spec_file.endswith(".yml"):
return yaml.safe_load(f)
else:
return json.load(f)
def _get_endpoint_config(self, path: str, method: str) -> dict:
"""Get endpoint configuration from config file with pattern matching and inheritance.
Configuration resolution order (most to least specific):
1. Exact endpoint match (e.g., "GET /api/v1/findings/metadata")
2. Pattern matches, sorted by specificity:
- Patterns without wildcards are more specific
- Longer patterns are more specific
- Example: "GET /api/v1/findings/*" matches all findings endpoints
When multiple configurations match, they are merged with more specific
configurations overriding less specific ones.
"""
if not self.config:
return {}
endpoint_key = f"{method.upper()} {path}"
merged_config = {}
# Get endpoints configuration (now supports both exact and pattern matches)
endpoints = self.config.get("endpoints", {})
# Separate exact matches from patterns
exact_match = None
pattern_matches = []
for config_key, config_value in endpoints.items():
if "*" in config_key or "?" in config_key:
# This is a pattern - convert wildcards to regex
regex_pattern = config_key.replace("*", ".*").replace("?", ".")
if re.match(f"^{regex_pattern}$", endpoint_key):
pattern_matches.append((config_key, config_value))
elif config_key == endpoint_key:
# Exact match
exact_match = (config_key, config_value)
# Also check for patterns in endpoint_patterns for backward compatibility
endpoint_patterns = self.config.get("endpoint_patterns", {})
for pattern, pattern_config in endpoint_patterns.items():
regex_pattern = pattern.replace("*", ".*").replace("?", ".")
if re.match(f"^{regex_pattern}$", endpoint_key):
pattern_matches.append((pattern, pattern_config))
# Sort pattern matches by specificity
# More specific patterns should be applied last to override less specific ones
pattern_matches.sort(
key=lambda x: (
x[0].count("*") + x[0].count("?"), # Fewer wildcards = more specific
-len(
x[0]
), # Longer patterns = more specific (negative for reverse sort)
),
reverse=True,
) # Reverse so least specific comes first
# Apply configurations from least to most specific
# First apply pattern matches (from least to most specific)
for pattern, pattern_config in pattern_matches:
merged_config = self._merge_configs(merged_config, pattern_config)
# Finally apply exact match (most specific)
if exact_match:
merged_config = self._merge_configs(merged_config, exact_match[1])
# Fallback to old endpoint_mappings for backward compatibility
if not merged_config:
endpoint_mappings = self.config.get("endpoint_mappings", {})
if endpoint_key in endpoint_mappings:
merged_config = {"name": endpoint_mappings[endpoint_key]}
return merged_config
def _merge_configs(self, base_config: dict, override_config: dict) -> dict:
"""Merge two configurations, with override_config taking precedence.
Special handling for parameters: merges parameter configurations deeply.
"""
import copy
result = copy.deepcopy(base_config)
for key, value in override_config.items():
if key == "parameters" and key in result:
# Deep merge parameters
if not isinstance(result[key], dict):
result[key] = {}
if isinstance(value, dict):
for param_name, param_config in value.items():
if param_name in result[key] and isinstance(
result[key][param_name], dict
):
# Merge parameter configurations
result[key][param_name] = {
**result[key][param_name],
**param_config,
}
else:
result[key][param_name] = param_config
else:
# For other keys, override completely
result[key] = value
return result
def _sanitize_function_name(self, operation_id: str) -> str:
"""Convert operation ID to valid Python function name."""
# Replace non-alphanumeric characters with underscores
name = re.sub(r"[^a-zA-Z0-9_]", "_", operation_id)
# Ensure it doesn't start with a number
if name and name[0].isdigit():
name = f"op_{name}"
return name.lower()
def _get_python_type(self, schema: dict) -> tuple[str, str]:
"""Convert OpenAPI schema to Python type hint.
Returns:
Tuple of (type_hint, original_type) where original_type is used for casting
"""
if not schema:
return "Any", "any"
# Handle oneOf/anyOf/allOf schemas - these are typically objects
if "oneOf" in schema or "anyOf" in schema or "allOf" in schema:
# These are complex schemas, typically representing different object variants
return "dict[str, Any] | str", "object"
schema_type = schema.get("type", "string")
# Handle enums
if "enum" in schema:
enum_values = schema["enum"]
if all(isinstance(v, str) for v in enum_values):
# Create Literal type for string enums - already strings, no casting needed
self.imports.add("from typing import Literal")
enum_str = ", ".join(f'"{v}"' for v in enum_values)
return f"Literal[{enum_str}]", "string"
else:
return self.type_mapping.get(schema_type, "Any"), schema_type
# Handle arrays
if schema_type == "array":
return "list[Any] | str", "array"
# Handle format specifications
if schema_type == "string":
format_type = schema.get("format", "")
if format_type in ["date", "date-time", "uuid", "email"]:
return "str", "string"
return self.type_mapping.get(schema_type, "Any"), schema_type
def _resolve_ref(self, ref: str) -> dict:
"""Resolve a $ref reference in the OpenAPI spec."""
if not ref.startswith("#/"):
return {}
# Split the reference path
ref_parts = ref[2:].split("/") # Remove '#/' and split
# Navigate through the spec to find the referenced schema
resolved = self.spec
for part in ref_parts:
resolved = resolved.get(part, {})
return resolved
def _extract_parameters(
self, operation: dict, endpoint_config: Optional[dict] = None
) -> list[dict]:
"""Extract and process parameters from an operation."""
parameters = []
for param in operation.get("parameters", []):
# Sanitize parameter name for Python
python_name = (
param.get("name", "")
.replace("[", "_")
.replace("]", "")
.replace(".", "_")
.replace("-", "_")
) # Also replace hyphens
type_hint, original_type = self._get_python_type(param.get("schema", {}))
param_info = {
"name": param.get("name", ""),
"python_name": python_name,
"in": param.get("in", "query"),
"required": param.get("required", False),
"description": param.get("description", ""),
"type": type_hint,
"original_type": original_type,
"original_schema": param.get("schema", {}),
}
# Apply custom parameter configuration from endpoint config
if endpoint_config and "parameters" in endpoint_config:
param_config = endpoint_config["parameters"]
if param_info["name"] in param_config:
custom_param = param_config[param_info["name"]]
if "name" in custom_param:
param_info["python_name"] = custom_param["name"]
if "description" in custom_param:
param_info["description"] = custom_param["description"]
parameters.append(param_info)
# Handle request body if present - extract as individual parameters
if "requestBody" in operation:
body = operation["requestBody"]
content = body.get("content", {})
# Check for different content types
schema = None
if "application/vnd.api+json" in content:
schema = content["application/vnd.api+json"].get("schema", {})
elif "application/json" in content:
schema = content["application/json"].get("schema", {})
if schema:
# Resolve $ref if present
if "$ref" in schema:
schema = self._resolve_ref(schema["$ref"])
# Try to extract individual fields from the schema
body_params = self._extract_body_parameters(
schema, body.get("required", False)
)
# Apply custom parameter config to body parameters
if endpoint_config and "parameters" in endpoint_config:
param_config = endpoint_config["parameters"]
for param in body_params:
if param["name"] in param_config:
custom_param = param_config[param["name"]]
if "name" in custom_param:
param["python_name"] = custom_param["name"]
if "description" in custom_param:
param["description"] = custom_param["description"]
parameters.extend(body_params)
return parameters
def _extract_body_parameters(self, schema: dict, is_required: bool) -> list[dict]:
"""Extract individual parameters from request body schema."""
parameters = []
# Handle JSON:API format with data.attributes structure
if "properties" in schema:
data = schema["properties"].get("data", {})
if "properties" in data:
# Extract attributes
attributes = data["properties"].get("attributes", {})
if "properties" in attributes:
# Get required fields from attributes
required_attrs = attributes.get("required", [])
for prop_name, prop_schema in attributes["properties"].items():
# Skip read-only fields for POST/PUT/PATCH operations
if prop_schema.get("readOnly", False):
continue
python_name = prop_name.replace("-", "_")
# Check if this field is required
is_field_required = prop_name in required_attrs
type_hint, original_type = self._get_python_type(prop_schema)
param_info = {
"name": prop_name, # Keep original name for API
"python_name": python_name,
"in": "body",
"required": is_field_required,
"description": prop_schema.get(
"description",
prop_schema.get("title", f"{prop_name} parameter"),
),
"type": type_hint,
"original_type": original_type,
"original_schema": prop_schema,
"resource_type": (
data["properties"]
.get("type", {})
.get("enum", ["resource"])[0]
if "type" in data["properties"]
else "resource"
),
}
parameters.append(param_info)
# Also check for relationships (like provider_id)
relationships = data["properties"].get("relationships", {})
if "properties" in relationships:
required_rels = relationships.get("required", [])
for rel_name, rel_schema in relationships["properties"].items():
# Extract ID from relationship
python_name = f"{rel_name}_id"
is_rel_required = rel_name in required_rels
param_info = {
"name": f"{rel_name}_id",
"python_name": python_name,
"in": "body",
"required": is_rel_required,
"description": f"ID of the related {rel_name}",
"type": "str",
"original_type": "string",
"original_schema": rel_schema,
}
parameters.append(param_info)
# If no structured params found, fall back to generic body parameter
if not parameters and schema:
parameters.append(
{
"name": "body",
"python_name": "body",
"in": "body",
"required": is_required,
"description": "Request body data",
"type": "dict[str, Any] | str",
"original_type": "object",
"original_schema": schema,
}
)
return parameters
def _generate_docstring(
self,
operation: dict,
parameters: list[dict],
path: str,
method: str,
endpoint_config: Optional[dict] = None,
) -> str:
"""Generate a comprehensive docstring for the tool function."""
lines = []
# Main description - use custom or default
endpoint_config = endpoint_config or {}
# Use custom description if provided, otherwise fall back to OpenAPI
if "description" in endpoint_config:
lines.append(f' """{endpoint_config["description"]}')
else:
summary = operation.get("summary", "")
description = operation.get("description", "")
if summary:
lines.append(f' """{summary}')
else:
lines.append(f' """Execute {method.upper()} {path}')
if "description" not in endpoint_config:
# Only add OpenAPI description if no custom description was provided
description = operation.get("description", "")
if description and description != summary:
lines.append("")
# Clean up description - remove extra whitespace
clean_desc = " ".join(description.split())
lines.append(f" {clean_desc}")
# Add endpoint info
lines.append("")
lines.append(f" Endpoint: {method.upper()} {path}")
# Parameters section
if parameters:
lines.append("")
lines.append(" Args:")
for param in parameters:
# Use custom description if available
param_desc = param["description"] or "Self-explanatory parameter"
# Handle multi-line descriptions properly
required_text = "(required)" if param["required"] else "(optional)"
if "\n" in param_desc:
# Split on actual newlines (not escaped)
desc_lines = param_desc.split("\n")
first_line = desc_lines[0].strip()
lines.append(
f" {param['python_name']} {required_text}: {first_line}"
)
# Add subsequent lines with proper indentation (12 spaces for continuation)
for desc_line in desc_lines[1:]:
desc_line = desc_line.strip()
if desc_line:
lines.append(f" {desc_line}")
else:
# Clean up parameter description for single line
param_desc = " ".join(param_desc.split())
lines.append(
f" {param['python_name']} {required_text}: {param_desc}"
)
# Add enum values if present
if "enum" in param.get("original_schema", {}):
enum_values = param["original_schema"]["enum"]
lines.append(
f" Allowed values: {', '.join(str(v) for v in enum_values)}"
)
# Returns section
lines.append("")
lines.append(" Returns:")
lines.append(" dict containing the API response")
lines.append(' """')
return "\n".join(lines)
def _generate_function_signature(
self, func_name: str, parameters: list[dict]
) -> str:
"""Generate the function signature with proper type hints."""
# Sort parameters: required first, then optional
sorted_params = sorted(
parameters, key=lambda x: (not x["required"], x["python_name"])
)
param_strings = []
for param in sorted_params:
if param["required"]:
param_strings.append(f" {param['python_name']}: {param['type']}")
else:
param_strings.append(
f" {param['python_name']}: Optional[{param['type']}] = None"
)
if param_strings:
params_str = ",\n".join(param_strings)
return f"async def {func_name}(\n{params_str}\n) -> dict[str, Any]:"
else:
return f"async def {func_name}() -> dict[str, Any]:"
def _get_cast_expression(self, param: dict) -> str:
"""Generate type casting expression for a parameter.
Args:
param: Parameter dict with 'python_name' and 'original_type'
Returns:
Expression string that casts the parameter value to the correct type
"""
python_name = param["python_name"]
original_type = param.get("original_type", "string")
if original_type == "boolean":
# Convert string to boolean using simple comparison
return f"({python_name}.lower() in ('true', '1', 'yes', 'on') if isinstance({python_name}, str) else {python_name})"
elif original_type == "array":
if param.get("in") == "query":
self.needs_query_array_normalizer = True
return f"_normalize_query_array({python_name})"
return f"json.loads({python_name}) if isinstance({python_name}, str) else {python_name}"
elif original_type == "object":
return f"json.loads({python_name}) if isinstance({python_name}, str) else {python_name}"
else:
# string or any other type - no casting needed
return python_name
def _generate_function_body(
self, path: str, method: str, parameters: list[dict], operation_id: str
) -> str:
"""Generate the function body for making API calls."""
lines = []
# Add try block
lines.append(" try:")
# Get authentication token if custom auth module is provided
if self.custom_auth_module:
lines.append(" token = await auth_manager.get_valid_token()")
lines.append("")
# Build parameters
query_params = [p for p in parameters if p["in"] == "query"]
path_params = [p for p in parameters if p["in"] == "path"]
body_params = [p for p in parameters if p["in"] == "body"]
# Add json import if needed for object or array type casting
if any(p.get("original_type") in ["object", "array"] for p in parameters):
self.imports.add("import json")
# Build query parameters
if query_params:
lines.append(" params = {}")
for param in query_params:
cast_expr = self._get_cast_expression(param)
if param["required"]:
lines.append(f" params['{param['name']}'] = {cast_expr}")
else:
lines.append(f" if {param['python_name']} is not None:")
lines.append(f" params['{param['name']}'] = {cast_expr}")
lines.append("")
# Build path with path parameters
final_path = path
for param in path_params:
cast_expr = self._get_cast_expression(param)
lines.append(
f" path = '{path}'.replace('{{{param['name']}}}', str({cast_expr}))"
)
final_path = "path"
# Build request body if there are body parameters
if body_params:
# Check if we have individual params or a single body param
if len(body_params) == 1 and body_params[0]["python_name"] == "body":
# Single body parameter - use it directly with casting
cast_expr = self._get_cast_expression(body_params[0])
lines.append(f" request_body = {cast_expr}")
else:
# Get resource type from first body param (they should all have the same)
resource_type = (
body_params[0].get("resource_type", "resource")
if body_params
else "resource"
)
# Build JSON:API structure from individual parameters
lines.append(" # Build request body")
lines.append(" request_body = {")
lines.append(' "data": {')
lines.append(f' "type": "{resource_type}"')
# Separate attributes from relationships
# Note: Check if param was originally from attributes section, not just by name
attribute_params = []
relationship_params = []
for p in body_params:
# If this param came from the attributes section (has resource_type), it's an attribute
# even if its name ends with _id
if "resource_type" in p:
attribute_params.append(p)
elif p["python_name"].endswith("_id") and "resource_type" not in p:
relationship_params.append(p)
else:
attribute_params.append(p)
if attribute_params:
lines.append(",")
lines.append(' "attributes": {}')
lines.append(" }")
lines.append(" }")
if attribute_params:
lines.append("")
lines.append(" # Add attributes")
for param in attribute_params:
cast_expr = self._get_cast_expression(param)
if param["required"]:
lines.append(
f' request_body["data"]["attributes"]["{param["name"]}"] = {cast_expr}'
)
else:
lines.append(
f" if {param['python_name']} is not None:"
)
lines.append(
f' request_body["data"]["attributes"]["{param["name"]}"] = {cast_expr}'
)
if relationship_params:
lines.append("")
lines.append(" # Add relationships")
lines.append(' request_body["data"]["relationships"] = {}')
for param in relationship_params:
rel_name = param["python_name"].replace("_id", "")
cast_expr = self._get_cast_expression(param)
if param["required"]:
lines.append(
f' request_body["data"]["relationships"]["{rel_name}"] = {{'
)
lines.append(' "data": {')
lines.append(f' "type": "{rel_name}s",')
lines.append(f' "id": {cast_expr}')
lines.append(" }")
lines.append(" }")
else:
lines.append(
f" if {param['python_name']} is not None:"
)
lines.append(
f' request_body["data"]["relationships"]["{rel_name}"] = {{'
)
lines.append(' "data": {')
lines.append(f' "type": "{rel_name}s",')
lines.append(f' "id": {cast_expr}')
lines.append(" }")
lines.append(" }")
lines.append("")
# Build the request URL
url_line = (
f'f"{{auth_manager.base_url}}{{{final_path}}}"'
if final_path == "path"
else f'f"{{auth_manager.base_url}}{path}"'
)
lines.append(f" url = {url_line}")
lines.append("")
# Build request parameters
request_params = ["url"]
if self.custom_auth_module:
request_params.append("headers=auth_manager.get_headers(token)")
if query_params:
request_params.append("params=params")
if body_params:
request_params.append("json=request_body")
params_str = ",\n ".join(request_params)
lines.append(f" response = await prowler_app_client.{method}(")
lines.append(f" {params_str}")
lines.append(" )")
lines.append(" response.raise_for_status()")
lines.append("")
# Parse response
lines.append(" data = response.json()")
lines.append("")
lines.append(" return {")
lines.append(' "success": True,')
lines.append(' "data": data.get("data", data),')
lines.append(' "meta": data.get("meta", {}),')
lines.append(" }")
lines.append("")
# Exception handling
lines.append(" except Exception as e:")
lines.append(" return {")
lines.append(' "success": False,')
lines.append(
f' "error": f"Failed to execute {operation_id}: {{str(e)}}"'
)
lines.append(" }")
return "\n".join(lines)
def _should_exclude_endpoint(self, path: str, operation: dict) -> bool:
"""
Determine if an endpoint should be excluded from generation.
Args:
path: The API endpoint path
operation: The operation dictionary from OpenAPI spec
Returns:
True if endpoint should be excluded, False otherwise
"""
# Check if operation is marked as deprecated
if operation.get("deprecated", False):
return True
# Check operation ID exclusion
operation_id = operation.get("operationId", "")
if operation_id in self.exclude_operations:
return True
# Check path pattern exclusion
for pattern in self.exclude_patterns:
if re.search(pattern, path):
return True
# Check tags
tags = operation.get("tags", [])
# If include_only_tags is specified, exclude if no matching tag
if self.include_only_tags:
if not any(tag in self.include_only_tags for tag in tags):
return True
# Check excluded tags
if any(tag in self.exclude_tags for tag in tags):
return True
return False
def generate_tools(self) -> str:
"""Generate all FastMCP tools from the OpenAPI spec."""
output_lines = []
# Generate header
output_lines.append('"""')
output_lines.append("Auto-generated FastMCP server from OpenAPI specification")
output_lines.append(f"Generated on: {datetime.now().isoformat()}")
output_lines.append(
f"Source: {self.spec_file} (version: {self.spec.get('info', {}).get('version', 'unknown')})"
)
output_lines.append('"""')
output_lines.append("")
# Add imports
self.imports.add("from typing import Any, Optional")
self.imports.add("import httpx")
self.imports.add("from fastmcp import FastMCP")
if self.custom_auth_module:
self.imports.add(f"from {self.custom_auth_module} import ProwlerAppAuth")
# Process all paths and operations
paths = self.spec.get("paths", {})
tools_by_tag = {} # Group tools by tag for better organization
excluded_count = 0
for path, path_item in paths.items():
for method in ["get", "post", "put", "patch", "delete"]:
if method in path_item:
operation = path_item[method]
# Check if this endpoint should be excluded
if self._should_exclude_endpoint(path, operation):
excluded_count += 1
continue
operation_id = operation.get("operationId", f"{method}_{path}")
tags = operation.get("tags", ["default"])
# Get endpoint configuration
endpoint_config = self._get_endpoint_config(path, method)
# Use custom function name if provided
if "name" in endpoint_config:
func_name = endpoint_config["name"]
else:
func_name = self._sanitize_function_name(operation_id)
parameters = self._extract_parameters(operation, endpoint_config)
tool_code = []
# Add @app_mcp_server.tool() decorator
tool_code.append("@app_mcp_server.tool()")
# Generate function signature
tool_code.append(
self._generate_function_signature(func_name, parameters)
)
# Generate docstring with custom description if provided
tool_code.append(
self._generate_docstring(
operation, parameters, path, method, endpoint_config
)
)
# Generate function body
tool_code.append(
self._generate_function_body(
path, method, parameters, operation_id
)
)
# Group by tag
for tag in tags:
if tag not in tools_by_tag:
tools_by_tag[tag] = []
tools_by_tag[tag].append("\n".join(tool_code))
# Write imports (consolidate typing imports)
typing_imports = set()
other_imports = []
for imp in sorted(self.imports):
if imp.startswith("from typing import"):
# Extract the imported items
items = imp.replace("from typing import", "").strip()
typing_imports.update([item.strip() for item in items.split(",")])
else:
other_imports.append(imp)
# Add consolidated typing import if needed
if typing_imports:
output_lines.append(
f"from typing import {', '.join(sorted(typing_imports))}"
)
# Add other imports
for imp in other_imports:
output_lines.append(imp)
if self.needs_query_array_normalizer:
output_lines.append("")
output_lines.append("")
output_lines.append("def _normalize_query_array(value):")
output_lines.append(
' """Normalize query array inputs to comma-separated strings."""'
)
output_lines.append(" if isinstance(value, str):")
output_lines.append(" stripped = value.strip()")
output_lines.append(" if not stripped:")
output_lines.append(" return stripped")
output_lines.append(
" if stripped.startswith('[') and stripped.endswith(']'):"
)
output_lines.append(" try:")
output_lines.append(" parsed = json.loads(stripped)")
output_lines.append(" except json.JSONDecodeError:")
output_lines.append(" return stripped")
output_lines.append(" if isinstance(parsed, list):")
output_lines.append(
" return ','.join(str(item) for item in parsed)"
)
output_lines.append(" return stripped")
output_lines.append(" if isinstance(value, (list, tuple, set)):")
output_lines.append(" return ','.join(str(item) for item in value)")
output_lines.append(
" if hasattr(value, '__iter__') and not isinstance(value, dict):"
)
output_lines.append(" try:")
output_lines.append(
" return ','.join(str(item) for item in value)"
)
output_lines.append(" except TypeError:")
output_lines.append(" return str(value)")
output_lines.append(" if isinstance(value, dict):")
output_lines.append(" return ','.join(str(key) for key in value)")
output_lines.append(" return str(value)")
output_lines.append("")
output_lines.append("# Initialize MCP server")
output_lines.append('app_mcp_server = FastMCP("prowler-app")')
output_lines.append("")
if self.custom_auth_module:
output_lines.append("# Initialize authentication manager")
output_lines.append("auth_manager = ProwlerAppAuth()")
output_lines.append("")
output_lines.append("# Initialize HTTP client")
output_lines.append("prowler_app_client = httpx.AsyncClient(")
output_lines.append(" timeout=30.0,")
output_lines.append(")")
output_lines.append("")
# Write tools grouped by tag
for tag, tools in tools_by_tag.items():
output_lines.append("")
output_lines.append("# " + "=" * 76)
output_lines.append(f"# {tag.upper()} ENDPOINTS")
output_lines.append("# " + "=" * 76)
output_lines.append("")
for tool in tools:
output_lines.append("")
output_lines.append(tool)
return "\n".join(output_lines)
def save_to_file(self, output_file: str):
"""Save the generated code to a file."""
generated_code = self.generate_tools()
Path(output_file).write_text(generated_code)
def generate_server_file():
# Get the spec file from the API directly (https://api.prowler.com/api/v1/schema)
api_base_url = os.getenv("PROWLER_API_BASE_URL", "https://api.prowler.com")
spec_file = f"{api_base_url}/api/v1/schema"
# Download the spec yaml file
response = requests.get(spec_file)
response.raise_for_status()
spec_data = response.text
# Save the spec data to a file
with open(str(Path(__file__).parent / "schema.yaml"), "w") as f:
f.write(spec_data)
# Example usage
generator = OpenAPIToMCPGenerator(
spec_file=str(Path(__file__).parent / "schema.yaml"),
custom_auth_module="prowler_mcp_server.prowler_app.utils.auth",
include_only_tags=[
"Provider",
"Scan",
"Schedule",
"Finding",
"Processor",
],
config_file=str(
Path(__file__).parent / "mcp_config.json"
), # Use custom naming config
)
# Generate and save the MCP server
generator.save_to_file(str(Path(__file__).parent.parent / "server.py"))
@@ -1,79 +0,0 @@
"""Utility for auto-discovering and loading MCP tools.
This module provides functionality to automatically discover and register
all BaseTool subclasses from the tools package.
"""
import importlib
import pkgutil
from fastmcp import FastMCP
from prowler_mcp_server.lib.logger import logger
from prowler_mcp_server.prowler_app.tools.base import BaseTool
def load_all_tools(mcp: FastMCP) -> None:
"""Auto-discover and load all BaseTool subclasses from the tools package.
This function:
1. Dynamically imports all Python modules in the tools package
2. Discovers all concrete BaseTool subclasses
3. Instantiates each tool class
4. Registers all tools with the provided FastMCP instance
Args:
mcp: The FastMCP instance to register tools with
TOOLS_PACKAGE: The package path containing tool modules (default: prowler_mcp_server.prowler_app.tools)
Example:
from fastmcp import FastMCP
from prowler_mcp_server.prowler_app.utils.tool_loader import load_all_tools
app = FastMCP("prowler-app")
load_all_tools(app)
"""
TOOLS_PACKAGE = "prowler_mcp_server.prowler_app.tools"
logger.info(f"Auto-discovering tools from package: {TOOLS_PACKAGE}")
# Import the tools package
try:
tools_module = importlib.import_module(TOOLS_PACKAGE)
except ImportError as e:
logger.error(f"Failed to import tools package {TOOLS_PACKAGE}: {e}")
return
# Get the package path
if hasattr(tools_module, "__path__"):
package_path = tools_module.__path__
else:
logger.error(f"Package {TOOLS_PACKAGE} has no __path__ attribute")
return
# Import all modules in the package
for _, module_name, _ in pkgutil.iter_modules(package_path):
try:
full_module_name = f"{TOOLS_PACKAGE}.{module_name}"
importlib.import_module(full_module_name)
logger.debug(f"Imported module: {full_module_name}")
except Exception as e:
logger.error(f"Failed to import module {module_name}: {e}")
# Discover all concrete BaseTool subclasses
concrete_tools = [
tool_class
for tool_class in BaseTool.__subclasses__()
if not getattr(tool_class, "__abstractmethods__", None)
]
logger.info(f"Discovered {len(concrete_tools)} tool classes")
# Instantiate and register each tool
for tool_class in concrete_tools:
try:
tool_instance = tool_class()
tool_instance.register_tools(mcp)
logger.info(f"Loaded and registered: {tool_class.__name__}")
except Exception as e:
logger.error(f"Failed to load tool {tool_class.__name__}: {e}")
logger.info("Tool loading complete")
+12
View File
@@ -1,4 +1,5 @@
import asyncio
import os
from fastmcp import FastMCP
from prowler_mcp_server import __version__
@@ -23,6 +24,17 @@ async def setup_main_server():
# Import Prowler App tools with prowler_app_ prefix
try:
logger.info("Importing Prowler App server...")
if not os.path.exists(
os.path.join(os.path.dirname(__file__), "prowler_app", "server.py")
):
from prowler_mcp_server.prowler_app.utils.server_generator import (
generate_server_file,
)
logger.info("Prowler App server not found, generating...")
generate_server_file()
from prowler_mcp_server.prowler_app.server import app_mcp_server
await prowler_mcp_server.import_server(app_mcp_server, prefix="prowler_app")
+2 -2
View File
@@ -4,8 +4,8 @@ requires = ["setuptools>=61.0", "wheel"]
[project]
dependencies = [
"fastmcp==2.13.1",
"httpx>=0.28.0"
"fastmcp>=2.11.3",
"httpx>=0.27.0",
]
description = "MCP server for Prowler ecosystem"
name = "prowler-mcp"

Some files were not shown because too many files have changed in this diff Show More