Compare commits

...

62 Commits

Author SHA1 Message Date
Andoni A. b050c917c6 perf(attack-paths): optimize getPathEdges with O(1) adjacency maps
Pre-build parent and children maps at the start of traversal for O(1)
lookups instead of O(n) array searches per traversal step. This improves
performance for large attack path graphs.
2026-01-14 17:12:13 +01:00
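The optimized code lives in the UI (TypeScript); a minimal Python sketch of the pattern, assuming edges are `{source, target}` dicts (all names here are illustrative, not the repository's):

```python
from collections import defaultdict

def get_path_edges(edges, start_id):
    """Collect every edge reachable downstream of start_id.

    Building the adjacency maps once up front turns each traversal step
    into an O(1) dict lookup instead of an O(n) scan of the edge list.
    """
    parents = defaultdict(list)   # target id -> incoming edges (for upstream walks)
    children = defaultdict(list)  # source id -> outgoing edges (for downstream walks)
    for edge in edges:
        parents[edge["target"]].append(edge)
        children[edge["source"]].append(edge)

    path_edges, stack, seen = [], [start_id], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for edge in children[node]:
            path_edges.append(edge)
            stack.append(edge["target"])
    return path_edges
```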
Andoni A. 67933d7d2d Merge branch 'attack-paths-demo' into attack-paths-demo-extras 2026-01-14 17:06:33 +01:00
Andoni Alonso 39280c8b9b feat(attack-paths): add Bedrock and AttachRolePolicy privilege escalation queries (#9793) 2026-01-14 17:01:21 +01:00
Andoni Alonso 4bcaf29b32 feat(attack-paths): improve graph path highlighting (#9769) 2026-01-14 16:59:27 +01:00
Josema Camacho e95be697ef Prowler 511 leaving one database per scan (#9795) 2026-01-14 16:19:02 +01:00
Andoni A. 6fa4565ebd fix(attack-paths): connect virtual nodes from principal instead of effective_principal
When a principal can assume a role (effective_principal), the virtual
relationship was being created from effective_principal but the graph
showed the original principal, causing disconnected nodes.

Now the virtual relationship is created from the original principal,
keeping the graph fully connected while still detecting escalation
paths that require role assumption.
2026-01-14 09:18:18 +01:00
Andoni A. e426c29207 fix(attack-paths): remove duplicate path_target causing disconnected nodes
Removed the extra path_target re-match that was causing role nodes to appear
disconnected in the visualization. The target_role is now only connected via
virtual relationships (PASSES_ROLE → target_role → GRANTS_ACCESS), which
provides a cleaner and more accurate attack path visualization.
2026-01-14 09:11:27 +01:00
Andoni A. 1d8d4f9325 refactor(attack-paths): show target roles in PassRole escalation paths
Updated PassRole queries to display which specific role(s) can be passed
in the visualization, instead of just showing a count. The path now shows:

Principal → New Resource → Target Role → Privilege Escalation

This allows users to see exactly which admin roles a principal can pass
to escalate privileges, which is crucial for security analysis.

Queries updated: Lambda, ECS, Glue, Bedrock, CloudFormation
2026-01-14 09:06:11 +01:00
Andoni A. cad44a3510 fix(attack-paths): fix duplicate virtual nodes in priv escalation queries
Virtual nodes were being created for each result row, causing duplicates
in the graph visualization. Fixed by using aggregation pattern:
1. Deduplicate principals FIRST (before matching target roles)
2. Collect target roles per principal
3. Create ONE virtual node per principal with role count

Queries fixed:
- aws-iam-privesc-passrole-lambda
- aws-glue-privesc-passrole-dev-endpoint
- aws-bedrock-privesc-passrole-code-interpreter
- aws-cloudformation-privesc-passrole-create-stack

The virtual node description now shows "N admin role(s) can be passed"
instead of creating N separate nodes.
2026-01-14 08:58:26 +01:00
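The actual fix is in the Cypher queries (the commit describes DISTINCT deduplication followed by collecting roles per principal); a rough Python equivalent of the three steps, with hypothetical field names:

```python
from collections import defaultdict

def build_virtual_nodes(rows):
    """rows: (principal_arn, target_role_arn) pairs from the query results."""
    roles_by_principal = defaultdict(set)
    for principal, target_role in rows:        # steps 1-2: dedupe principals, collect roles
        roles_by_principal[principal].add(target_role)
    return [                                   # step 3: one virtual node per principal
        {
            "id": f"virtual-{principal}",
            "description": f"{len(roles)} admin role(s) can be passed",
        }
        for principal, roles in sorted(roles_by_principal.items())
    ]
```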
Andoni A. ee73e043f9 refactor(attack-paths): apply query improvements to remaining priv escalation queries
Apply the same patterns from PR #9770 to the other privilege escalation
queries that were missing the improvements:

- aws-iam-privesc-create-policy-version
- aws-iam-privesc-attach-role-policy-assume-role
- aws-iam-privesc-passrole-lambda
- aws-iam-privesc-role-chain

Changes applied:
- Add DISTINCT deduplication before creating virtual relationships
- Add re-match paths at the end for proper visualization
- Remove redundant path variables from RETURN statements
- Create unique virtual node IDs per principal->target pair
2026-01-13 16:39:32 +01:00
Andoni A. 815797bc2b fix(attack-paths): hide findings completely in full view
- Change opacity-based hiding to display:none for finding nodes
- Use visibility:hidden for finding edges in full view
- Add isFilteredView to useEffect dependency array
- In filtered view, all nodes/edges remain visible as expected
2026-01-12 17:13:52 +01:00
Andoni A. 9cd249c561 fix(attack-paths): show findings at full opacity in filtered view
When in filtered view, findings are part of the selected path and
should be fully visible, not hidden with reduced opacity.

- Add isFilteredView prop to AttackPathGraph component
- Skip hiding findings when isFilteredView is true
2026-01-12 16:59:47 +01:00
Andoni A. 00fe96a9f7 refactor(attack-paths): redraw graph with filtered data for optimal layout
Reverts the visibility-based approach in favor of the data-changing approach,
which redraws the graph with only the selected path nodes, optimizing
the layout for the filtered view.

- Store fullData separately when entering filtered view
- Compute filtered subgraph with only visible nodes and edges
- Graph redraws with new data, auto-fitting to show selected path
- Restore fullData when exiting filtered view
2026-01-12 16:56:35 +01:00
Andoni A. 7c45ee1dbb refactor(attack-paths): use visibility-based filtering with D3 transitions
Replace data-replacement approach with visibleNodeIds Set to fix
animation issues caused by Next.js server component re-renders.

- Changed useGraphState to compute visibleNodeIds without modifying data
- Graph component now animates opacity changes via D3 transitions
- Keeps DOM structure stable while providing smooth visual transitions
- Findings now properly appear when filtering by a resource node
2026-01-12 16:50:25 +01:00
Andoni A. d19a23f829 feat(attack-paths): highlight selected node and show findings in filtered view
- Add orange glow filter and pulsing animation for selected/filtered nodes
- Pass isFilteredView prop to graph component to show findings when filtering
- Update node styling to show thicker border (4px) and orange glow on selection
- Ensure findings are visible in filtered view instead of being hidden by default
2026-01-12 16:41:09 +01:00
Andoni A. b071fffe57 feat(attack-paths): add filtered view when clicking on graph nodes
When clicking a node in the attack path graph:
- Filters the graph to show only upstream (ancestors) and downstream (descendants) paths
- Includes findings directly connected to the selected node
- Shows a "Back to Full View" button to restore the complete graph
- Displays an indicator showing which node is being filtered

Uses atomic Zustand state updates to ensure proper re-rendering of the D3 graph.
2026-01-12 16:33:30 +01:00
Andoni A. 422c55404b refactor(attack-paths): simplify legend by consolidating resource types
Merge all individual resource type items (AWS Account, EC2 Instance,
S3 Bucket, etc.) into a single "Resource" entry to reduce visual
clutter in the graph legend.
2026-01-12 16:26:20 +01:00
Andoni A. 6c307385b0 fix(attack-paths): preserve aws variable in ECS query WITH clause
Add aws to intermediate WITH clause to fix 'Variable aws not defined'
error when deduplicating principals before matching target roles.
2026-01-12 14:47:12 +01:00
Andoni A. 13964ccb1c fix(attack-paths): merge ECS target roles into single virtual node per principal
Instead of creating one virtual ECS task node per target role, merge all
target roles into a single node per principal. The node description shows
how many admin roles can be passed (e.g., '3 admin role(s) can be passed').

This reduces visual clutter when a principal can pass multiple admin roles.
2026-01-12 13:30:53 +01:00
Andoni A. 64ed526e31 refactor(attack-paths): rename virtual nodes from Malicious to New
The virtual nodes represent potential resources that could be created
for privilege escalation, not actual malicious resources. Renamed for
clarity:
- Malicious Task Definition -> New Task Definition
- Malicious Dev Endpoint -> New Dev Endpoint
- Malicious Code Interpreter -> New Code Interpreter
- Malicious Stack -> New Stack
2026-01-09 15:05:12 +01:00
Andoni A. 2388a053ee fix(attack-paths): highlight single path upstream instead of all paths
Changed upstream traversal to follow only one parent at each level
instead of all parents. This prevents the entire graph from lighting
up when selecting a node that has multiple ancestors with many children.

- Upstream: now uses find() to get first parent only
- Downstream: unchanged, still highlights all descendants
2026-01-08 18:18:44 +01:00
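Illustrative Python mirroring the described behavior (the real change is in the UI's TypeScript, where `find()` returns the first parent):

```python
def upstream_chain(parents, start_id):
    """Follow a single parent per level instead of fanning out to all ancestors.

    parents: hypothetical map of node id -> list of incoming {source, target} edges.
    """
    chain, node, seen = [], start_id, set()
    while node not in seen and parents.get(node):
        seen.add(node)
        edge = parents[node][0]   # first parent only
        chain.append(edge)
        node = edge["source"]     # walk one level up
    return chain
```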
Andoni A. 7bb5354275 feat(attack-paths): improve graph visualization and interactions
- Change edge color from orange to white by default
- Highlight entire path in orange on node hover/selection
- Add Ctrl + scroll to zoom functionality with increased speed
- Update node borders to orange on hover/selection
- Add zoom hint to legend
- Remove hover effect from info button
2026-01-08 17:57:24 +01:00
Andoni A. 03cae9895b wip 2026-01-08 15:43:59 +01:00
Andoni A. e398b654d4 merge all privesc nodes into one, change it to look like a finding 2026-01-07 16:45:40 +01:00
Andoni A. d9e978af29 initial version 2026-01-07 10:50:29 +01:00
Josema Camacho 95d9e9a59f feat(attack-paths): Update Cartography dependency and its usage (#9593) 2025-12-18 15:52:15 +01:00
Josema Camacho 48f19d0f11 fix(attack-paths): neo4j.exceptions import (#9356) 2025-12-01 10:31:18 +01:00
Josema Camacho 345033e58a Fix attack paths demo neo4j connection (#9352)
Add retryable Neo4j session.
2025-11-29 12:55:49 +01:00
Alan Buscaglia 15cb87534c feat(attack-paths): apply Scope Rule pattern for feature-local organization (#9270)
Co-authored-by: Claude <noreply@anthropic.com>
2025-11-28 17:05:35 +01:00
Josema Camacho 5a85db103d feat(attack-paths): Task and endpoints (#9344)
- Added support for Neo4j
- Added Cartography as Attack Paths Scan
- Added Attack Path Scans endpoints for their management and to run queries on those scans
2025-11-28 15:44:15 +01:00
César Arroba 2b86078d06 chore(api): build attack paths demo image (#9349) 2025-11-28 15:33:04 +01:00
lydiavilchez b2abdbeb60 feat(gcp-compute): add check to ensure VMs are not preemptible or spot (#9342)
Co-authored-by: HugoPBrito <hugopbrit@gmail.com>
2025-11-28 12:49:19 +01:00
lydiavilchez dc852b4595 feat(gcp-compute): add automatic restart check for VM instances (#9271) 2025-11-28 12:21:58 +01:00
Hugo Pereira Brito 1250f582a5 fix(check): custom check folder validation (#9335) 2025-11-28 12:19:47 +01:00
Pedro Martín bb43e924ee fix(report): use pagina for ENS in footer (#9345) 2025-11-28 12:04:30 +01:00
Andoni Alonso 0225627a98 fix(docs): fix image paths (#9341) 2025-11-28 11:20:54 +01:00
Alan Buscaglia 3097513525 fix(ui): filter Risk Pipeline chart by selected providers and show zero-data legends (#9340) 2025-11-27 17:39:01 +01:00
Alan Buscaglia 6af9ff4b4b feat(ui): add interactive charts with filter navigation (#9333) 2025-11-27 16:04:55 +01:00
Hugo Pereira Brito 06fa57a949 fix(docs): info warning format (#9339) 2025-11-27 09:57:05 -05:00
mattkeeler dc9e91ac4e fix(m365): Support multiple Exchange mailbox policies (#9241)
Co-authored-by: HugoPBrito <hugopbrit@gmail.com>
2025-11-27 14:10:15 +01:00
Shafkat Rahman 59f8dfe5ae feat(github): add immutable releases check (#9162)
Co-authored-by: Andoni Alonso <14891798+andoniaf@users.noreply.github.com>
2025-11-27 13:40:15 +01:00
Adrián Jesús Peña Rodríguez 7e0c5540bb feat(api): restore compliance overview endpoint (#9330) 2025-11-27 13:31:15 +01:00
Daniel Barranquero 79ec53bfc5 fix(ui): update changelog (#9334) 2025-11-27 13:16:50 +01:00
Daniel Barranquero ed5f6b3af6 feat(ui): add MongoDB Atlas provider support (#9253)
Co-authored-by: Alan Buscaglia <gentlemanprogramming@gmail.com>
2025-11-27 12:37:20 +01:00
Andoni Alonso 6e135abaa0 fix(iac): ignore mutelist in IaC scans (#9331) 2025-11-27 11:08:58 +01:00
Hugo Pereira Brito 65b054f798 feat: enhance m365 documentation (#9287) 2025-11-26 16:17:43 +01:00
Alan Buscaglia 28d5b2bb6c feat(ui): integrate threat map with regions API endpoint (#9324)
Co-authored-by: alejandrobailo <alejandrobailo94@gmail.com>
2025-11-26 16:12:31 +01:00
Prowler Bot c8d9f37e70 feat(aws): Update regions for AWS services (#9294)
Co-authored-by: prowler-bot <179230569+prowler-bot@users.noreply.github.com>
2025-11-26 09:42:40 -05:00
lydiavilchez 9d7b9c3327 feat(gcp): Add VPC Service Controls check for Cloud Storage (#9256)
Co-authored-by: Daniel Barranquero <danielbo2001@gmail.com>
2025-11-26 14:45:27 +01:00
Hugo Pereira Brito 127b8d8e56 fix: typo in pdf report generation (#9322)
Co-authored-by: Pepe Fagoaga <pepe@prowler.com>
2025-11-26 13:58:40 +01:00
Alan Buscaglia 4e9dd46a5e feat(ui): add Risk Pipeline View with Sankey chart to Overview page (#9320)
Co-authored-by: alejandrobailo <alejandrobailo94@gmail.com>
2025-11-26 13:33:58 +01:00
Hugo Pereira Brito 880345bebe fix(sharepoint): false positives on disabled external sharing (#9298) 2025-11-26 12:23:04 +01:00
Andoni Alonso 1259713fd6 docs: remove AMD-only docker images warning (#9315) 2025-11-26 10:26:39 +01:00
Prowler Bot 26088868a2 chore(release): Bump version to v5.15.0 (#9318)
Co-authored-by: prowler-bot <179230569+prowler-bot@users.noreply.github.com>
2025-11-26 10:19:25 +01:00
César Arroba e58574e2a4 chore(github): fix container actions (#9321) 2025-11-26 10:16:26 +01:00
Alan Buscaglia a07e599cfc feat(ui): add service watchlist component with real API integration (#9316)
Co-authored-by: alejandrobailo <alejandrobailo94@gmail.com>
2025-11-25 17:03:24 +01:00
Alejandro Bailo e020b3f74b feat: add watchlist component (#9199)
Co-authored-by: Alan Buscaglia <gentlemanprogramming@gmail.com>
2025-11-25 16:01:38 +01:00
Alan Buscaglia 8e7e376e4f feat(ui): hide new overview route and filter mongo providers (#9314) 2025-11-25 14:22:03 +01:00
Alan Buscaglia a63a3d3f68 fix: add filters for mongo providers and findings (#9311) 2025-11-25 13:19:49 +01:00
Andoni Alonso 10838de636 docs: refactor Lighthouse AI pages (#9310)
Co-authored-by: Chandrapal Badshah <12944530+Chan9390@users.noreply.github.com>
2025-11-25 13:10:29 +01:00
Chandrapal Badshah 5ebf455e04 docs: Lighthouse multi LLM provider support (#9306)
Co-authored-by: Chandrapal Badshah <12944530+Chan9390@users.noreply.github.com>
Co-authored-by: Andoni A. <14891798+andoniaf@users.noreply.github.com>
2025-11-25 13:04:30 +01:00
Daniel Barranquero 0d59441c5f fix(api): add alter to mongodbatlas migration (#9308) 2025-11-25 11:29:07 +01:00
272 changed files with 17990 additions and 2183 deletions
+20 -1
@@ -41,6 +41,26 @@ POSTGRES_DB=prowler_db
# POSTGRES_REPLICA_MAX_ATTEMPTS=3
# POSTGRES_REPLICA_RETRY_BASE_DELAY=0.5
# Neo4j auth
NEO4J_HOST=neo4j
NEO4J_PORT=7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=neo4j_password
# Neo4j settings
NEO4J_DBMS_MAX__DATABASES=1000000
NEO4J_SERVER_MEMORY_PAGECACHE_SIZE=1G
NEO4J_SERVER_MEMORY_HEAP_INITIAL__SIZE=1G
NEO4J_SERVER_MEMORY_HEAP_MAX__SIZE=1G
NEO4J_APOC_EXPORT_FILE_ENABLED=true
NEO4J_APOC_IMPORT_FILE_ENABLED=true
NEO4J_APOC_IMPORT_FILE_USE_NEO4J_CONFIG=true
NEO4J_PLUGINS=["apoc"]
NEO4J_DBMS_SECURITY_PROCEDURES_ALLOWLIST=apoc.*
NEO4J_DBMS_SECURITY_PROCEDURES_UNRESTRICTED=apoc.*
NEO4J_DBMS_CONNECTOR_BOLT_LISTEN_ADDRESS=0.0.0.0:7687
# Neo4j Prowler settings
NEO4J_INSERT_BATCH_SIZE=500
# Celery-Prowler task settings
TASK_RETRY_DELAY_SECONDS=0.1
TASK_RETRY_ATTEMPTS=5
@@ -110,7 +130,6 @@ SENTRY_ENVIRONMENT=local
SENTRY_RELEASE=local
NEXT_PUBLIC_SENTRY_ENVIRONMENT=${SENTRY_ENVIRONMENT}
#### Prowler release version ####
NEXT_PUBLIC_PROWLER_RELEASE_VERSION=v5.12.2
+27 -20
@@ -3,14 +3,20 @@ name: 'API: Container Build and Push'
on:
push:
branches:
- 'master'
- 'attack-paths-demo'
paths:
- 'api/**'
- 'prowler/**'
- '.github/workflows/api-build-lint-push-containers.yml'
- '.github/workflows/api-container-build-push.yml'
release:
types:
- 'published'
workflow_dispatch:
inputs:
release_tag:
description: 'Release tag (e.g., 5.14.0)'
required: true
type: string
permissions:
contents: read
@@ -21,8 +27,8 @@ concurrency:
env:
# Tags
LATEST_TAG: latest
RELEASE_TAG: ${{ github.event.release.tag_name }}
LATEST_TAG: attack-paths-demo
RELEASE_TAG: ${{ github.event.release.tag_name || inputs.release_tag }}
STABLE_TAG: stable
WORKING_DIRECTORY: ./api
@@ -72,20 +78,8 @@ jobs:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Build and push API container for ${{ matrix.arch }}
if: github.event_name == 'push' || github.event_name == 'release'
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0
with:
context: ${{ env.WORKING_DIRECTORY }}
push: true
platforms: ${{ matrix.platform }}
tags: |
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}-${{ matrix.arch }}
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
- name: Notify container push started
if: github.event_name == 'release'
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
@@ -98,8 +92,21 @@ jobs:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
- name: Build and push API container for ${{ matrix.arch }}
id: container-push
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0
with:
context: ${{ env.WORKING_DIRECTORY }}
push: true
platforms: ${{ matrix.platform }}
tags: |
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}-${{ matrix.arch }}
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
- name: Notify container push completed
if: github.event_name == 'release' && always()
if: (github.event_name == 'release' || github.event_name == 'workflow_dispatch') && always()
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
@@ -116,7 +123,7 @@ jobs:
# Create and push multi-architecture manifest
create-manifest:
needs: [setup, container-build-push]
if: github.event_name == 'push' || github.event_name == 'release'
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
steps:
@@ -139,7 +146,7 @@ jobs:
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}-arm64
- name: Create and push manifests for release event
if: github.event_name == 'release'
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
run: |
docker buildx imagetools create \
-t ${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ env.RELEASE_TAG }} \
+28 -21
@@ -10,6 +10,12 @@ on:
release:
types:
- 'published'
workflow_dispatch:
inputs:
release_tag:
description: 'Release tag (e.g., 5.14.0)'
required: true
type: string
permissions:
contents: read
@@ -21,7 +27,7 @@ concurrency:
env:
# Tags
LATEST_TAG: latest
RELEASE_TAG: ${{ github.event.release.tag_name }}
RELEASE_TAG: ${{ github.event.release.tag_name || inputs.release_tag }}
STABLE_TAG: stable
WORKING_DIRECTORY: ./mcp_server
@@ -43,7 +49,7 @@ jobs:
container-build-push:
needs: setup
runs-on: ${{ matrix.runner }}
strategy:
matrix:
include:
@@ -70,8 +76,23 @@ jobs:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Notify container push started
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: MCP
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
- name: Build and push MCP container for ${{ matrix.arch }}
if: github.event_name == 'push' || github.event_name == 'release'
id: container-push
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0
with:
context: ${{ env.WORKING_DIRECTORY }}
@@ -90,22 +111,8 @@ jobs:
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
- name: Notify container push started
if: github.event_name == 'release'
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
COMPONENT: MCP
RELEASE_TAG: ${{ env.RELEASE_TAG }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
with:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
- name: Notify container push completed
if: github.event_name == 'release' && always()
if: (github.event_name == 'release' || github.event_name == 'workflow_dispatch') && always()
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
@@ -122,7 +129,7 @@ jobs:
# Create and push multi-architecture manifest
create-manifest:
needs: [setup, container-build-push]
if: github.event_name == 'push' || github.event_name == 'release'
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
steps:
@@ -145,14 +152,14 @@ jobs:
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}-arm64
- name: Create and push manifests for release event
if: github.event_name == 'release'
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
run: |
docker buildx imagetools create \
-t ${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ env.RELEASE_TAG }} \
-t ${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ env.STABLE_TAG }} \
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}-amd64 \
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}-arm64
- name: Install regctl
if: always()
uses: regclient/actions/regctl-installer@main
+24 -17
@@ -16,6 +16,12 @@ on:
release:
types:
- 'published'
workflow_dispatch:
inputs:
release_tag:
description: 'Release tag (e.g., 5.14.0)'
required: true
type: string
permissions:
contents: read
@@ -141,21 +147,8 @@ jobs:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Build and push SDK container for ${{ matrix.arch }}
if: github.event_name == 'push' || github.event_name == 'release'
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0
with:
context: .
file: ${{ env.DOCKERFILE_PATH }}
push: true
platforms: ${{ matrix.platform }}
tags: |
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ env.LATEST_TAG }}-${{ matrix.arch }}
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
- name: Notify container push started
if: github.event_name == 'release'
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
@@ -168,8 +161,22 @@ jobs:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
- name: Build and push SDK container for ${{ matrix.arch }}
id: container-push
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0
with:
context: .
file: ${{ env.DOCKERFILE_PATH }}
push: true
platforms: ${{ matrix.platform }}
tags: |
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ env.LATEST_TAG }}-${{ matrix.arch }}
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
- name: Notify container push completed
if: github.event_name == 'release' && always()
if: (github.event_name == 'release' || github.event_name == 'workflow_dispatch') && always()
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
@@ -186,7 +193,7 @@ jobs:
# Create and push multi-architecture manifest
create-manifest:
needs: [container-build-push]
if: github.event_name == 'push' || github.event_name == 'release'
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
steps:
@@ -219,7 +226,7 @@ jobs:
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.container-build-push.outputs.latest_tag }}-arm64
- name: Create and push manifests for release event
if: github.event_name == 'release'
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
run: |
docker buildx imagetools create \
-t ${{ secrets.DOCKER_HUB_REPOSITORY }}/${{ env.IMAGE_NAME }}:${{ needs.container-build-push.outputs.prowler_version }} \
+28 -35
@@ -10,6 +10,12 @@ on:
release:
types:
- 'published'
workflow_dispatch:
inputs:
release_tag:
description: 'Release tag (e.g., 5.14.0)'
required: true
type: string
permissions:
contents: read
@@ -21,7 +27,7 @@ concurrency:
env:
# Tags
LATEST_TAG: latest
RELEASE_TAG: ${{ github.event.release.tag_name }}
RELEASE_TAG: ${{ github.event.release.tag_name || inputs.release_tag }}
STABLE_TAG: stable
WORKING_DIRECTORY: ./ui
@@ -46,7 +52,7 @@ jobs:
container-build-push:
needs: setup
runs-on: ${{ matrix.runner }}
strategy:
matrix:
include:
@@ -74,37 +80,8 @@ jobs:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Build and push UI container for ${{ matrix.arch }}
if: github.event_name == 'push'
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0
with:
context: ${{ env.WORKING_DIRECTORY }}
build-args: |
NEXT_PUBLIC_PROWLER_RELEASE_VERSION=${{ needs.setup.outputs.short-sha }}
NEXT_PUBLIC_API_BASE_URL=${{ env.NEXT_PUBLIC_API_BASE_URL }}
push: true
platforms: ${{ matrix.platform }}
tags: |
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}-${{ matrix.arch }}
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
- name: Build and push UI container for ${{ matrix.arch }}
if: github.event_name == 'push' || github.event_name == 'release'
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0
with:
context: ${{ env.WORKING_DIRECTORY }}
build-args: |
NEXT_PUBLIC_PROWLER_RELEASE_VERSION=${{ github.event_name == 'release' && format('v{0}', env.RELEASE_TAG) || needs.setup.outputs.short_sha }}
NEXT_PUBLIC_API_BASE_URL=${{ env.NEXT_PUBLIC_API_BASE_URL }}
push: true
tags: |
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Notify container push started
if: github.event_name == 'release'
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
@@ -117,8 +94,24 @@ jobs:
slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
payload-file-path: "./.github/scripts/slack-messages/container-release-started.json"
- name: Build and push UI container for ${{ matrix.arch }}
id: container-push
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0
with:
context: ${{ env.WORKING_DIRECTORY }}
build-args: |
NEXT_PUBLIC_PROWLER_RELEASE_VERSION=${{ (github.event_name == 'release' || github.event_name == 'workflow_dispatch') && format('v{0}', env.RELEASE_TAG) || needs.setup.outputs.short-sha }}
NEXT_PUBLIC_API_BASE_URL=${{ env.NEXT_PUBLIC_API_BASE_URL }}
push: true
platforms: ${{ matrix.platform }}
tags: |
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}-${{ matrix.arch }}
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
- name: Notify container push completed
if: github.event_name == 'release' && always()
if: (github.event_name == 'release' || github.event_name == 'workflow_dispatch') && always()
uses: ./.github/actions/slack-notification
env:
SLACK_CHANNEL_ID: ${{ secrets.SLACK_PLATFORM_DEPLOYMENTS }}
@@ -135,7 +128,7 @@ jobs:
# Create and push multi-architecture manifest
create-manifest:
needs: [setup, container-build-push]
if: github.event_name == 'push' || github.event_name == 'release'
if: github.event_name == 'push' || github.event_name == 'release' || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
steps:
@@ -158,7 +151,7 @@ jobs:
${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ needs.setup.outputs.short-sha }}-arm64
- name: Create and push manifests for release event
if: github.event_name == 'release'
if: github.event_name == 'release' || github.event_name == 'workflow_dispatch'
run: |
docker buildx imagetools create \
-t ${{ env.PROWLERCLOUD_DOCKERHUB_REPOSITORY }}/${{ env.PROWLERCLOUD_DOCKERHUB_IMAGE }}:${{ env.RELEASE_TAG }} \
+18 -1
@@ -75,6 +75,23 @@ prowler dashboard
```
![Prowler Dashboard](docs/images/products/dashboard.png)
## Attack Paths
Attack Paths automatically extends every completed AWS scan with a Neo4j graph that combines Cartography's cloud inventory with Prowler findings. The feature runs in the API worker after each scan and therefore requires:
- An accessible Neo4j instance (the Docker Compose files already ship a `neo4j` service).
- The following environment variables so Django and Celery can connect:
| Variable | Description | Default |
| --- | --- | --- |
| `NEO4J_HOST` | Hostname used by the API containers. | `neo4j` |
| `NEO4J_PORT` | Bolt port exposed by Neo4j. | `7687` |
| `NEO4J_USER` / `NEO4J_PASSWORD` | Credentials with rights to create per-tenant databases. | `neo4j` / `neo4j_password` |
Every AWS provider scan will enqueue an Attack Paths ingestion job automatically. Other cloud providers will be added in future iterations.
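For reference, a minimal sketch of the Bolt connection the API builds from these variables (simplified from `api/attack_paths/database.py` in this branch, which reads them via Django settings):

```python
import os

import neo4j

uri = f"bolt://{os.getenv('NEO4J_HOST', 'neo4j')}:{os.getenv('NEO4J_PORT', '7687')}"
driver = neo4j.GraphDatabase.driver(
    uri,
    auth=(os.getenv("NEO4J_USER", "neo4j"), os.getenv("NEO4J_PASSWORD", "neo4j_password")),
)
driver.verify_connectivity()  # fails fast if Neo4j is unreachable
```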
# Prowler at a Glance
> [!Tip]
> For the most accurate and up-to-date information about checks, services, frameworks, and categories, visit [**Prowler Hub**](https://hub.prowler.com).
@@ -90,7 +107,7 @@ prowler dashboard
| M365 | 70 | 7 | 3 | 2 | Official | UI, API, CLI |
| OCI | 51 | 13 | 1 | 10 | Official | UI, API, CLI |
| IaC | [See `trivy` docs.](https://trivy.dev/latest/docs/coverage/iac/) | N/A | N/A | N/A | Official | UI, API, CLI |
| MongoDB Atlas | 10 | 3 | 0 | 0 | Official | CLI, API |
| MongoDB Atlas | 10 | 3 | 0 | 0 | Official | UI, API, CLI |
| LLM | [See `promptfoo` docs.](https://www.promptfoo.dev/docs/red-team/plugins/) | N/A | N/A | N/A | Official | CLI |
| NHN | 6 | 2 | 1 | 0 | Unofficial | CLI |
+18
@@ -2,6 +2,24 @@
All notable changes to the **Prowler API** are documented in this file.
## [1.16.0] (Unreleased)
### Added
- Attack Paths backend support [(#9344)](https://github.com/prowler-cloud/prowler/pull/9344)
### Changed
- Restore the compliance overview endpoint's mandatory filters [(#9330)](https://github.com/prowler-cloud/prowler/pull/9330)
---
## [1.15.1] (Prowler v5.14.1)
### Fixed
- Fix typo in PDF reporting [(#9345)](https://github.com/prowler-cloud/prowler/pull/9345)
- Fix IaC provider initialization failure when mutelist processor is configured [(#9331)](https://github.com/prowler-cloud/prowler/pull/9331)
---
## [1.15.0] (Prowler v5.14.0)
### Added
+991 -45
File diff suppressed because it is too large.
+4 -2
@@ -24,7 +24,7 @@ dependencies = [
"drf-spectacular-jsonapi==0.5.1",
"gunicorn==23.0.0",
"lxml==5.3.2",
"prowler @ git+https://github.com/prowler-cloud/prowler.git@master",
"prowler @ git+https://github.com/prowler-cloud/prowler.git@attack-paths-demo",
"psycopg2-binary==2.9.9",
"pytest-celery[redis] (>=1.0.1,<2.0.0)",
"sentry-sdk[django] (>=2.20.0,<3.0.0)",
@@ -35,7 +35,9 @@ dependencies = [
"markdown (>=3.9,<4.0)",
"drf-simple-apikey (==2.2.1)",
"matplotlib (>=3.10.6,<4.0.0)",
"reportlab (>=4.4.4,<5.0.0)"
"reportlab (>=4.4.4,<5.0.0)",
"neo4j (<6.0.0)",
"cartography @ git+https://github.com/prowler-cloud/cartography@master",
]
description = "Prowler's API (Django/DRF)"
license = "Apache-2.0"
+7 -1
@@ -1,4 +1,5 @@
import logging
import atexit
import os
import sys
from pathlib import Path
@@ -30,6 +31,7 @@ class ApiConfig(AppConfig):
def ready(self):
from api import schema_extensions # noqa: F401
from api import signals # noqa: F401
from api.attack_paths import database as graph_database
from api.compliance import load_prowler_compliance
# Generate required cryptographic keys if not present, but only if:
@@ -39,6 +41,10 @@ class ApiConfig(AppConfig):
if "manage.py" not in sys.argv or os.environ.get("RUN_MAIN"):
self._ensure_crypto_keys()
if not getattr(settings, "TESTING", False):
graph_database.init_driver()
atexit.register(graph_database.close_driver)
load_prowler_compliance()
def _ensure_crypto_keys(self):
@@ -54,7 +60,7 @@ class ApiConfig(AppConfig):
global _keys_initialized
# Skip key generation if running tests
if hasattr(settings, "TESTING") and settings.TESTING:
if getattr(settings, "TESTING", False):
return
# Skip if already initialized in this process
@@ -0,0 +1,13 @@
from api.attack_paths.query_definitions import (
AttackPathsQueryDefinition,
AttackPathsQueryParameterDefinition,
get_queries_for_provider,
get_query_by_id,
)
__all__ = [
"AttackPathsQueryDefinition",
"AttackPathsQueryParameterDefinition",
"get_queries_for_provider",
"get_query_by_id",
]
@@ -0,0 +1,144 @@
import logging
import threading
from contextlib import contextmanager
from typing import Iterator
from uuid import UUID
import neo4j
import neo4j.exceptions
from django.conf import settings
from api.attack_paths.retryable_session import RetryableSession
# Without this Celery goes crazy with Neo4j logging
logging.getLogger("neo4j").setLevel(logging.ERROR)
logging.getLogger("neo4j").propagate = False
SERVICE_UNAVAILABLE_MAX_RETRIES = 3
# Module-level process-wide driver singleton
_driver: neo4j.Driver | None = None
_lock = threading.Lock()
# Base Neo4j functions
def get_uri() -> str:
host = settings.DATABASES["neo4j"]["HOST"]
port = settings.DATABASES["neo4j"]["PORT"]
return f"bolt://{host}:{port}"
def init_driver() -> neo4j.Driver:
global _driver
if _driver is not None:
return _driver
with _lock:
if _driver is None:
uri = get_uri()
config = settings.DATABASES["neo4j"]
_driver = neo4j.GraphDatabase.driver(
uri, auth=(config["USER"], config["PASSWORD"])
)
_driver.verify_connectivity()
return _driver
def get_driver() -> neo4j.Driver:
return init_driver()
def close_driver() -> None: # TODO: Use it
global _driver
with _lock:
if _driver is not None:
try:
_driver.close()
finally:
_driver = None
@contextmanager
def get_session(database: str | None = None) -> Iterator[RetryableSession]:
session_wrapper: RetryableSession | None = None
try:
session_wrapper = RetryableSession(
session_factory=lambda: get_driver().session(database=database),
close_driver=close_driver, # Just to avoid circular imports
max_retries=SERVICE_UNAVAILABLE_MAX_RETRIES,
)
yield session_wrapper
except neo4j.exceptions.Neo4jError as exc:
raise GraphDatabaseQueryException(message=exc.message, code=exc.code)
finally:
if session_wrapper is not None:
session_wrapper.close()
def create_database(database: str) -> None:
query = "CREATE DATABASE $database IF NOT EXISTS"
parameters = {"database": database}
with get_session() as session:
session.run(query, parameters)
def drop_database(database: str) -> None:
query = f"DROP DATABASE `{database}` IF EXISTS DESTROY DATA"
with get_session() as session:
session.run(query)
def drop_subgraph(database: str, root_node_label: str, root_node_id: str) -> int:
query = """
MATCH (a:__ROOT_NODE_LABEL__ {id: $root_node_id})
CALL apoc.path.subgraphNodes(a, {})
YIELD node
DETACH DELETE node
RETURN COUNT(node) AS deleted_nodes_count
""".replace("__ROOT_NODE_LABEL__", root_node_label)
parameters = {"root_node_id": root_node_id}
with get_session(database) as session:
result = session.run(query, parameters)
try:
return result.single()["deleted_nodes_count"]
except neo4j.exceptions.ResultConsumedError:
return 0 # As there are no nodes to delete, the result is empty
# Neo4j functions related to Prowler + Cartography
DATABASE_NAME_TEMPLATE = "db-{attack_paths_scan_id}"
def get_database_name(attack_paths_scan_id: UUID) -> str:
attack_paths_scan_id_str = str(attack_paths_scan_id).lower()
return DATABASE_NAME_TEMPLATE.format(attack_paths_scan_id=attack_paths_scan_id_str)
# Exceptions
class GraphDatabaseQueryException(Exception):
def __init__(self, message: str, code: str | None = None) -> None:
super().__init__(message)
self.message = message
self.code = code
def __str__(self) -> str:
if self.code:
return f"{self.code}: {self.message}"
return self.message
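A hypothetical usage sketch of the module above: create the per-scan database, then query it through the retryable session wrapper:

```python
from uuid import uuid4

from api.attack_paths import database as graph_database

scan_id = uuid4()                                    # normally an AttackPathsScan id
db_name = graph_database.get_database_name(scan_id)  # "db-<uuid>"
graph_database.create_database(db_name)

with graph_database.get_session(db_name) as session:  # RetryableSession under the hood
    result = session.run("MATCH (n) RETURN count(n) AS total")
    print(result.single()["total"])
```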
File diff suppressed because it is too large.
@@ -0,0 +1,87 @@
import logging
from collections.abc import Callable
from typing import Any
import neo4j
import neo4j.exceptions
logger = logging.getLogger(__name__)
class RetryableSession:
"""
Wrapper around `neo4j.Session` that retries `neo4j.exceptions.ServiceUnavailable` errors.
"""
def __init__(
self,
session_factory: Callable[[], neo4j.Session],
close_driver: Callable[[], None], # Just to avoid circular imports
max_retries: int,
) -> None:
self._session_factory = session_factory
self._close_driver = close_driver
self._max_retries = max(0, max_retries)
self._session = self._session_factory()
def close(self) -> None:
if self._session is not None:
self._session.close()
self._session = None
def __enter__(self) -> "RetryableSession":
return self
def __exit__(self, exc_type: Any, exc: Any, exc_tb: Any) -> None:
self.close()
def run(self, *args: Any, **kwargs: Any) -> Any:
return self._call_with_retry("run", *args, **kwargs)
def write_transaction(self, *args: Any, **kwargs: Any) -> Any:
return self._call_with_retry("write_transaction", *args, **kwargs)
def read_transaction(self, *args: Any, **kwargs: Any) -> Any:
return self._call_with_retry("read_transaction", *args, **kwargs)
def execute_write(self, *args: Any, **kwargs: Any) -> Any:
return self._call_with_retry("execute_write", *args, **kwargs)
def execute_read(self, *args: Any, **kwargs: Any) -> Any:
return self._call_with_retry("execute_read", *args, **kwargs)
def __getattr__(self, item: str) -> Any:
return getattr(self._session, item)
def _call_with_retry(self, method_name: str, *args: Any, **kwargs: Any) -> Any:
attempt = 0
last_exc: neo4j.exceptions.ServiceUnavailable | None = None
while attempt <= self._max_retries:
try:
method = getattr(self._session, method_name)
return method(*args, **kwargs)
except (
neo4j.exceptions.ServiceUnavailable
) as exc: # pragma: no cover - depends on infra
last_exc = exc
attempt += 1
if attempt > self._max_retries:
raise
logger.warning(
f"Neo4j session {method_name} failed with ServiceUnavailable ({attempt}/{self._max_retries} attempts). Retrying..."
)
self._refresh_session()
raise last_exc if last_exc else RuntimeError("Unexpected retry loop exit")
def _refresh_session(self) -> None:
if self._session is not None:
self._session.close()
self._close_driver()
self._session = self._session_factory()
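A sketch of standalone use, assuming a plain driver (in the API, `close_driver` is a callback that resets the module-level driver singleton so the next session gets a fresh connection):

```python
import neo4j

driver = neo4j.GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

session = RetryableSession(
    session_factory=lambda: driver.session(),
    close_driver=lambda: None,  # no-op here; the API resets its driver singleton instead
    max_retries=3,
)
try:
    records = session.run("RETURN 1 AS ok").data()
finally:
    session.close()
```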
@@ -0,0 +1,143 @@
import logging
from typing import Any
from rest_framework.exceptions import APIException, ValidationError
from api.attack_paths import database as graph_database, AttackPathsQueryDefinition
from api.models import AttackPathsScan
from config.custom_logging import BackendLogger
logger = logging.getLogger(BackendLogger.API)
def normalize_run_payload(raw_data):
if not isinstance(raw_data, dict): # Let the serializer handle this
return raw_data
if "data" in raw_data and isinstance(raw_data.get("data"), dict):
data_section = raw_data.get("data") or {}
attributes = data_section.get("attributes") or {}
payload = {
"id": attributes.get("id", data_section.get("id")),
"parameters": attributes.get("parameters"),
}
# Remove `None` parameters to allow defaults downstream
if payload.get("parameters") is None:
payload.pop("parameters")
return payload
return raw_data
def prepare_query_parameters(
definition: AttackPathsQueryDefinition,
provided_parameters: dict[str, Any],
provider_uid: str,
) -> dict[str, Any]:
parameters = dict(provided_parameters or {})
expected_names = {parameter.name for parameter in definition.parameters}
provided_names = set(parameters.keys())
unexpected = provided_names - expected_names
if unexpected:
raise ValidationError(
{"parameters": f"Unknown parameter(s): {', '.join(sorted(unexpected))}"}
)
missing = expected_names - provided_names
if missing:
raise ValidationError(
{
"parameters": f"Missing required parameter(s): {', '.join(sorted(missing))}"
}
)
clean_parameters = {
"provider_uid": str(provider_uid),
}
for definition_parameter in definition.parameters:
raw_value = provided_parameters[definition_parameter.name]
try:
casted_value = definition_parameter.cast(raw_value)
except (ValueError, TypeError) as exc:
raise ValidationError(
{
"parameters": (
f"Invalid value for parameter `{definition_parameter.name}`: {str(exc)}"
)
}
)
clean_parameters[definition_parameter.name] = casted_value
return clean_parameters
def execute_attack_paths_query(
attack_paths_scan: AttackPathsScan,
definition: AttackPathsQueryDefinition,
parameters: dict[str, Any],
) -> dict[str, Any]:
try:
with graph_database.get_session(attack_paths_scan.graph_database) as session:
result = session.run(definition.cypher, parameters)
return _serialize_graph(result.graph())
except graph_database.GraphDatabaseQueryException as exc:
logger.error(f"Query failed for Attack Paths query `{definition.id}`: {exc}")
raise APIException(
"Attack Paths query execution failed due to a database error"
)
def _serialize_graph(graph):
nodes = []
for node in graph.nodes:
nodes.append(
{
"id": node.element_id,
"labels": list(node.labels),
"properties": _serialize_properties(node._properties),
},
)
relationships = []
for relationship in graph.relationships:
relationships.append(
{
"id": relationship.element_id,
"label": relationship.type,
"source": relationship.start_node.element_id,
"target": relationship.end_node.element_id,
"properties": _serialize_properties(relationship._properties),
},
)
return {
"nodes": nodes,
"relationships": relationships,
}
def _serialize_properties(properties: dict[str, Any]) -> dict[str, Any]:
"""Convert Neo4j property values into JSON-serializable primitives."""
def _serialize_value(value: Any) -> Any:
# Neo4j temporal and spatial values expose `to_native` returning Python primitives
if hasattr(value, "to_native") and callable(value.to_native):
return _serialize_value(value.to_native())
if isinstance(value, (list, tuple)):
return [_serialize_value(item) for item in value]
if isinstance(value, dict):
return {key: _serialize_value(val) for key, val in value.items()}
return value
return {key: _serialize_value(val) for key, val in properties.items()}
+18 -8
@@ -27,6 +27,7 @@ from api.models import (
Finding,
Integration,
Invitation,
AttackPathsScan,
LighthouseProviderConfiguration,
LighthouseProviderModels,
Membership,
@@ -330,6 +331,23 @@ class ScanFilter(ProviderRelationshipFilterSet):
}
class AttackPathsScanFilter(ProviderRelationshipFilterSet):
inserted_at = DateFilter(field_name="inserted_at", lookup_expr="date")
completed_at = DateFilter(field_name="completed_at", lookup_expr="date")
started_at = DateFilter(field_name="started_at", lookup_expr="date")
state = ChoiceFilter(choices=StateChoices.choices)
state__in = ChoiceInFilter(
field_name="state", choices=StateChoices.choices, lookup_expr="in"
)
class Meta:
model = AttackPathsScan
fields = {
"provider": ["exact", "in"],
"scan": ["exact", "in"],
}
class TaskFilter(FilterSet):
name = CharFilter(field_name="task_runner_task__task_name", lookup_expr="exact")
name__icontains = CharFilter(
@@ -761,14 +779,6 @@ class RoleFilter(FilterSet):
class ComplianceOverviewFilter(FilterSet):
inserted_at = DateFilter(field_name="inserted_at", lookup_expr="date")
scan_id = UUIDFilter(field_name="scan_id")
provider_id = UUIDFilter(field_name="scan__provider__id", lookup_expr="exact")
provider_id__in = UUIDInFilter(field_name="scan__provider__id", lookup_expr="in")
provider_type = ChoiceFilter(
field_name="scan__provider__provider", choices=Provider.ProviderChoices.choices
)
provider_type__in = ChoiceInFilter(
field_name="scan__provider__provider", choices=Provider.ProviderChoices.choices
)
region = CharFilter(field_name="region")
class Meta:
@@ -0,0 +1,41 @@
[
{
"model": "api.attackpathsscan",
"pk": "a7f0f6de-6f8e-4b3a-8cbe-3f6dd9012345",
"fields": {
"tenant": "12646005-9067-4d2a-a098-8bb378604362",
"provider": "b85601a8-4b45-4194-8135-03fb980ef428",
"scan": "01920573-aa9c-73c9-bcda-f2e35c9b19d2",
"state": "completed",
"progress": 100,
"update_tag": 1693586667,
"graph_database": "db-a7f0f6de-6f8e-4b3a-8cbe-3f6dd9012345",
"is_graph_database_deleted": false,
"task": null,
"inserted_at": "2024-09-01T17:24:37Z",
"updated_at": "2024-09-01T17:44:37Z",
"started_at": "2024-09-01T17:34:37Z",
"completed_at": "2024-09-01T17:44:37Z",
"duration": 269,
"ingestion_exceptions": {}
}
},
{
"model": "api.attackpathsscan",
"pk": "4a2fb2af-8a60-4d7d-9cae-4ca65e098765",
"fields": {
"tenant": "12646005-9067-4d2a-a098-8bb378604362",
"provider": "15fce1fa-ecaa-433f-a9dc-62553f3a2555",
"scan": "01929f3b-ed2e-7623-ad63-7c37cd37828f",
"state": "executing",
"progress": 48,
"update_tag": 1697625000,
"graph_database": "db-4a2fb2af-8a60-4d7d-9cae-4ca65e098765",
"is_graph_database_deleted": false,
"task": null,
"inserted_at": "2024-10-18T10:55:57Z",
"updated_at": "2024-10-18T10:56:15Z",
"started_at": "2024-10-18T10:56:05Z"
}
}
]
@@ -22,7 +22,7 @@ class Migration(migrations.Migration):
("kubernetes", "Kubernetes"),
("m365", "M365"),
("github", "GitHub"),
("oci", "Oracle Cloud Infrastructure"),
("oraclecloud", "Oracle Cloud Infrastructure"),
("iac", "IaC"),
],
default="aws",
@@ -29,4 +29,8 @@ class Migration(migrations.Migration):
default="aws",
),
),
migrations.RunSQL(
"ALTER TYPE provider ADD VALUE IF NOT EXISTS 'mongodbatlas';",
reverse_sql=migrations.RunSQL.noop,
),
]
@@ -0,0 +1,154 @@
# Generated by Django 5.1.13 on 2025-11-06 16:20
import django.db.models.deletion
from django.db import migrations, models
from uuid6 import uuid7
import api.db_utils
import api.rls
class Migration(migrations.Migration):
dependencies = [
("api", "0059_compliance_overview_summary"),
]
operations = [
migrations.CreateModel(
name="AttackPathsScan",
fields=[
(
"id",
models.UUIDField(
default=uuid7,
editable=False,
primary_key=True,
serialize=False,
),
),
("inserted_at", models.DateTimeField(auto_now_add=True)),
("updated_at", models.DateTimeField(auto_now=True)),
(
"state",
api.db_utils.StateEnumField(
choices=[
("available", "Available"),
("scheduled", "Scheduled"),
("executing", "Executing"),
("completed", "Completed"),
("failed", "Failed"),
("cancelled", "Cancelled"),
],
default="available",
),
),
("progress", models.IntegerField(default=0)),
("started_at", models.DateTimeField(blank=True, null=True)),
("completed_at", models.DateTimeField(blank=True, null=True)),
(
"duration",
models.IntegerField(
blank=True, help_text="Duration in seconds", null=True
),
),
(
"update_tag",
models.BigIntegerField(
blank=True,
help_text="Cartography update tag (epoch)",
null=True,
),
),
(
"graph_database",
models.CharField(blank=True, max_length=63, null=True),
),
(
"is_graph_database_deleted",
models.BooleanField(default=False),
),
(
"ingestion_exceptions",
models.JSONField(blank=True, default=dict, null=True),
),
(
"provider",
models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
related_name="attack_paths_scans",
related_query_name="attack_paths_scan",
to="api.provider",
),
),
(
"scan",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="attack_paths_scans",
related_query_name="attack_paths_scan",
to="api.scan",
),
),
(
"task",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="attack_paths_scans",
related_query_name="attack_paths_scan",
to="api.task",
),
),
(
"tenant",
models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE, to="api.tenant"
),
),
],
options={
"db_table": "attack_paths_scans",
"abstract": False,
"indexes": [
models.Index(
fields=["tenant_id", "provider_id", "-inserted_at"],
name="aps_prov_ins_desc_idx",
),
models.Index(
fields=["tenant_id", "state", "-inserted_at"],
name="aps_state_ins_desc_idx",
),
models.Index(
fields=["tenant_id", "scan_id"],
name="aps_scan_lookup_idx",
),
models.Index(
fields=["tenant_id", "provider_id"],
name="aps_active_graph_idx",
include=["graph_database", "id"],
condition=models.Q(("is_graph_database_deleted", False)),
),
models.Index(
fields=["tenant_id", "provider_id", "-completed_at"],
name="aps_completed_graph_idx",
include=["graph_database", "id"],
condition=models.Q(
("state", "completed"),
("is_graph_database_deleted", False),
),
),
],
},
),
migrations.AddConstraint(
model_name="attackpathsscan",
constraint=api.rls.RowLevelSecurityConstraint(
"tenant_id",
name="rls_on_attackpathsscan",
statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
),
),
]
+95
@@ -616,6 +616,101 @@ class Scan(RowLevelSecurityProtectedModel):
resource_name = "scans"
class AttackPathsScan(RowLevelSecurityProtectedModel):
objects = ActiveProviderManager()
all_objects = models.Manager()
id = models.UUIDField(primary_key=True, default=uuid7, editable=False)
inserted_at = models.DateTimeField(auto_now_add=True, editable=False)
updated_at = models.DateTimeField(auto_now=True, editable=False)
state = StateEnumField(choices=StateChoices.choices, default=StateChoices.AVAILABLE)
progress = models.IntegerField(default=0)
# Timing
started_at = models.DateTimeField(null=True, blank=True)
completed_at = models.DateTimeField(null=True, blank=True)
duration = models.IntegerField(
null=True, blank=True, help_text="Duration in seconds"
)
# Relationship to the provider and optional prowler Scan and celery Task
provider = models.ForeignKey(
"Provider",
on_delete=models.CASCADE,
related_name="attack_paths_scans",
related_query_name="attack_paths_scan",
)
scan = models.ForeignKey(
"Scan",
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name="attack_paths_scans",
related_query_name="attack_paths_scan",
)
task = models.ForeignKey(
"Task",
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name="attack_paths_scans",
related_query_name="attack_paths_scan",
)
# Cartography specific metadata
update_tag = models.BigIntegerField(
null=True, blank=True, help_text="Cartography update tag (epoch)"
)
graph_database = models.CharField(max_length=63, null=True, blank=True)
is_graph_database_deleted = models.BooleanField(default=False)
ingestion_exceptions = models.JSONField(default=dict, null=True, blank=True)
class Meta(RowLevelSecurityProtectedModel.Meta):
db_table = "attack_paths_scans"
constraints = [
RowLevelSecurityConstraint(
field="tenant_id",
name="rls_on_%(class)s",
statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
),
]
indexes = [
models.Index(
fields=["tenant_id", "provider_id", "-inserted_at"],
name="aps_prov_ins_desc_idx",
),
models.Index(
fields=["tenant_id", "state", "-inserted_at"],
name="aps_state_ins_desc_idx",
),
models.Index(
fields=["tenant_id", "scan_id"],
name="aps_scan_lookup_idx",
),
models.Index(
fields=["tenant_id", "provider_id"],
name="aps_active_graph_idx",
include=["graph_database", "id"],
condition=Q(is_graph_database_deleted=False),
),
models.Index(
fields=["tenant_id", "provider_id", "-completed_at"],
name="aps_completed_graph_idx",
include=["graph_database", "id"],
condition=Q(
state=StateChoices.COMPLETED,
is_graph_database_deleted=False,
),
),
]
class JSONAPIMeta:
resource_name = "attack-paths-scans"
class ResourceTag(RowLevelSecurityProtectedModel):
id = models.UUIDField(primary_key=True, default=uuid4, editable=False)
inserted_at = models.DateTimeField(auto_now_add=True, editable=False)
File diff suppressed because it is too large.
@@ -0,0 +1,172 @@
from types import SimpleNamespace
from unittest.mock import MagicMock, patch
import pytest
from rest_framework.exceptions import APIException, ValidationError
from api.attack_paths import database as graph_database
from api.attack_paths import views_helpers
def test_normalize_run_payload_extracts_attributes_section():
payload = {
"data": {
"id": "ignored",
"attributes": {
"id": "aws-rds",
"parameters": {"ip": "192.0.2.0"},
},
}
}
result = views_helpers.normalize_run_payload(payload)
assert result == {"id": "aws-rds", "parameters": {"ip": "192.0.2.0"}}
def test_normalize_run_payload_passthrough_for_non_dict():
sentinel = "not-a-dict"
assert views_helpers.normalize_run_payload(sentinel) is sentinel
def test_prepare_query_parameters_includes_provider_and_casts(
attack_paths_query_definition_factory,
):
definition = attack_paths_query_definition_factory(cast_type=int)
result = views_helpers.prepare_query_parameters(
definition,
{"limit": "5"},
provider_uid="123456789012",
)
assert result["provider_uid"] == "123456789012"
assert result["limit"] == 5
@pytest.mark.parametrize(
"provided,expected_message",
[
({}, "Missing required parameter"),
({"limit": 10, "extra": True}, "Unknown parameter"),
],
)
def test_prepare_query_parameters_validates_names(
attack_paths_query_definition_factory, provided, expected_message
):
definition = attack_paths_query_definition_factory()
with pytest.raises(ValidationError) as exc:
views_helpers.prepare_query_parameters(definition, provided, provider_uid="1")
assert expected_message in str(exc.value)
def test_prepare_query_parameters_validates_cast(
attack_paths_query_definition_factory,
):
definition = attack_paths_query_definition_factory(cast_type=int)
with pytest.raises(ValidationError) as exc:
views_helpers.prepare_query_parameters(
definition,
{"limit": "not-an-int"},
provider_uid="1",
)
assert "Invalid value" in str(exc.value)
def test_execute_attack_paths_query_serializes_graph(
attack_paths_query_definition_factory, attack_paths_graph_stub_classes
):
definition = attack_paths_query_definition_factory(
id="aws-rds",
name="RDS",
description="",
cypher="MATCH (n) RETURN n",
parameters=[],
)
parameters = {"provider_uid": "123"}
attack_paths_scan = SimpleNamespace(graph_database="tenant-db")
node = attack_paths_graph_stub_classes.Node(
element_id="node-1",
labels=["AWSAccount"],
properties={
"name": "account",
"complex": {
"items": [
attack_paths_graph_stub_classes.NativeValue("value"),
{"nested": 1},
]
},
},
)
relationship = attack_paths_graph_stub_classes.Relationship(
element_id="rel-1",
rel_type="OWNS",
start_node=node,
end_node=attack_paths_graph_stub_classes.Node("node-2", ["RDSInstance"], {}),
properties={"weight": 1},
)
graph = SimpleNamespace(nodes=[node], relationships=[relationship])
run_result = MagicMock()
run_result.graph.return_value = graph
session = MagicMock()
session.run.return_value = run_result
session_ctx = MagicMock()
session_ctx.__enter__.return_value = session
session_ctx.__exit__.return_value = False
with patch(
"api.attack_paths.views_helpers.graph_database.get_session",
return_value=session_ctx,
) as mock_get_session:
result = views_helpers.execute_attack_paths_query(
attack_paths_scan, definition, parameters
)
mock_get_session.assert_called_once_with("tenant-db")
session.run.assert_called_once_with(definition.cypher, parameters)
assert result["nodes"][0]["id"] == "node-1"
assert result["nodes"][0]["properties"]["complex"]["items"][0] == "value"
assert result["relationships"][0]["label"] == "OWNS"
def test_execute_attack_paths_query_wraps_graph_errors(
attack_paths_query_definition_factory,
):
definition = attack_paths_query_definition_factory(
id="aws-rds",
name="RDS",
description="",
cypher="MATCH (n) RETURN n",
parameters=[],
)
attack_paths_scan = SimpleNamespace(graph_database="tenant-db")
parameters = {"provider_uid": "123"}
class ExplodingContext:
def __enter__(self):
raise graph_database.GraphDatabaseQueryException("boom")
def __exit__(self, exc_type, exc, tb):
return False
with (
patch(
"api.attack_paths.views_helpers.graph_database.get_session",
return_value=ExplodingContext(),
),
patch("api.attack_paths.views_helpers.logger") as mock_logger,
):
with pytest.raises(APIException):
views_helpers.execute_attack_paths_query(
attack_paths_scan, definition, parameters
)
mock_logger.error.assert_called_once()
+68

@@ -21,6 +21,7 @@ from prowler.providers.aws.lib.security_hub.security_hub import SecurityHubConne
from prowler.providers.azure.azure_provider import AzureProvider
from prowler.providers.gcp.gcp_provider import GcpProvider
from prowler.providers.github.github_provider import GithubProvider
from prowler.providers.iac.iac_provider import IacProvider
from prowler.providers.kubernetes.kubernetes_provider import KubernetesProvider
from prowler.providers.m365.m365_provider import M365Provider
from prowler.providers.mongodbatlas.mongodbatlas_provider import MongodbatlasProvider
@@ -114,6 +115,7 @@ class TestReturnProwlerProvider:
(Provider.ProviderChoices.GITHUB.value, GithubProvider),
(Provider.ProviderChoices.MONGODBATLAS.value, MongodbatlasProvider),
(Provider.ProviderChoices.ORACLECLOUD.value, OraclecloudProvider),
(Provider.ProviderChoices.IAC.value, IacProvider),
],
)
def test_return_prowler_provider(self, provider_type, expected_provider):
@@ -254,6 +256,72 @@ class TestGetProwlerProviderKwargs:
expected_result = {**secret_dict, "mutelist_content": {"key": "value"}}
assert result == expected_result
def test_get_prowler_provider_kwargs_iac_provider(self):
"""Test that IaC provider gets correct kwargs with repository URL."""
provider_uid = "https://github.com/org/repo"
secret_dict = {"access_token": "test_token"}
secret_mock = MagicMock()
secret_mock.secret = secret_dict
provider = MagicMock()
provider.provider = Provider.ProviderChoices.IAC.value
provider.secret = secret_mock
provider.uid = provider_uid
result = get_prowler_provider_kwargs(provider)
expected_result = {
"scan_repository_url": provider_uid,
"oauth_app_token": "test_token",
}
assert result == expected_result
def test_get_prowler_provider_kwargs_iac_provider_without_token(self):
"""Test that IaC provider works without access token for public repos."""
provider_uid = "https://github.com/org/public-repo"
secret_dict = {}
secret_mock = MagicMock()
secret_mock.secret = secret_dict
provider = MagicMock()
provider.provider = Provider.ProviderChoices.IAC.value
provider.secret = secret_mock
provider.uid = provider_uid
result = get_prowler_provider_kwargs(provider)
expected_result = {"scan_repository_url": provider_uid}
assert result == expected_result
def test_get_prowler_provider_kwargs_iac_provider_ignores_mutelist(self):
"""Test that IaC provider does NOT receive mutelist_content.
IaC provider uses Trivy's built-in mutelist logic, so it should not
receive mutelist_content even when a mutelist processor is configured.
"""
provider_uid = "https://github.com/org/repo"
secret_dict = {"access_token": "test_token"}
secret_mock = MagicMock()
secret_mock.secret = secret_dict
mutelist_processor = MagicMock()
mutelist_processor.configuration = {"Mutelist": {"key": "value"}}
provider = MagicMock()
provider.provider = Provider.ProviderChoices.IAC.value
provider.secret = secret_mock
provider.uid = provider_uid
result = get_prowler_provider_kwargs(provider, mutelist_processor)
# IaC provider should NOT have mutelist_content
assert "mutelist_content" not in result
expected_result = {
"scan_repository_url": provider_uid,
"oauth_app_token": "test_token",
}
assert result == expected_result
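These three tests pin down the IaC branch of get_prowler_provider_kwargs: the repository URL comes from the provider UID, the token is optional, and mutelist content is never forwarded. As a sketch (hypothetical helper name; assumed shape of the real branch):

def iac_provider_kwargs(provider):
    kwargs = {"scan_repository_url": provider.uid}
    token = (provider.secret.secret or {}).get("access_token")
    if token:
        kwargs["oauth_app_token"] = token
    # mutelist_content is intentionally omitted: the IaC provider relies
    # on Trivy's built-in mutelist logic.
    return kwargs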
def test_get_prowler_provider_kwargs_unsupported_provider(self):
# Setup
provider_uid = "provider_uid"
+424 -153
@@ -32,11 +32,13 @@ from django_celery_results.models import TaskResult
from rest_framework import status
from rest_framework.response import Response
from api.attack_paths import (
AttackPathsQueryDefinition,
AttackPathsQueryParameterDefinition,
)
from api.compliance import get_compliance_frameworks
from api.db_router import MainRouter
from api.models import (
ComplianceOverviewSummary,
ComplianceRequirementOverview,
Finding,
Integration,
Invitation,
@@ -57,7 +59,6 @@ from api.models import (
Scan,
ScanSummary,
StateChoices,
StatusChoices,
Task,
TenantAPIKey,
ThreatScoreSnapshot,
@@ -3525,6 +3526,420 @@ class TestTaskViewSet:
assert response.status_code == status.HTTP_400_BAD_REQUEST
@pytest.mark.django_db
class TestAttackPathsScanViewSet:
@staticmethod
def _run_payload(query_id="aws-rds", parameters=None):
return {
"data": {
"type": "attack-paths-query-run-request",
"attributes": {
"id": query_id,
"parameters": parameters or {},
},
}
}
def test_attack_paths_scans_list_returns_latest_entry_per_provider(
self,
authenticated_client,
providers_fixture,
scans_fixture,
create_attack_paths_scan,
):
provider = providers_fixture[0]
other_provider = providers_fixture[1]
older_scan = create_attack_paths_scan(
provider,
scan=scans_fixture[0],
state=StateChoices.AVAILABLE,
progress=10,
)
latest_scan = create_attack_paths_scan(
provider,
scan=scans_fixture[0],
state=StateChoices.COMPLETED,
progress=95,
)
other_provider_scan = create_attack_paths_scan(
other_provider,
scan=scans_fixture[2],
state=StateChoices.FAILED,
progress=50,
)
response = authenticated_client.get(reverse("attack-paths-scans-list"))
assert response.status_code == status.HTTP_200_OK
data = response.json()["data"]
ids = {item["id"] for item in data}
assert ids == {str(latest_scan.id), str(other_provider_scan.id)}
assert str(older_scan.id) not in ids
provider_entry = next(
item
for item in data
if item["relationships"]["provider"]["data"]["id"] == str(provider.id)
)
first_attributes = provider_entry["attributes"]
assert first_attributes["provider_alias"] == provider.alias
assert first_attributes["provider_type"] == provider.provider
assert first_attributes["provider_uid"] == provider.uid
def test_attack_paths_scans_list_respects_provider_group_visibility(
self,
authenticated_client_no_permissions_rbac,
providers_fixture,
create_attack_paths_scan,
):
client = authenticated_client_no_permissions_rbac
limited_user = client.user
membership = Membership.objects.filter(user=limited_user).first()
tenant = membership.tenant
allowed_provider = providers_fixture[0]
denied_provider = providers_fixture[1]
allowed_scan = create_attack_paths_scan(allowed_provider)
create_attack_paths_scan(denied_provider)
provider_group = ProviderGroup.objects.create(
name="limited-group",
tenant_id=tenant.id,
)
ProviderGroupMembership.objects.create(
tenant_id=tenant.id,
provider_group=provider_group,
provider=allowed_provider,
)
limited_role = limited_user.roles.first()
RoleProviderGroupRelationship.objects.create(
tenant_id=tenant.id,
role=limited_role,
provider_group=provider_group,
)
response = client.get(reverse("attack-paths-scans-list"))
assert response.status_code == status.HTTP_200_OK
data = response.json()["data"]
assert len(data) == 1
assert data[0]["id"] == str(allowed_scan.id)
def test_attack_paths_scan_retrieve(
self,
authenticated_client,
providers_fixture,
scans_fixture,
create_attack_paths_scan,
):
provider = providers_fixture[0]
attack_paths_scan = create_attack_paths_scan(
provider,
scan=scans_fixture[0],
state=StateChoices.COMPLETED,
progress=80,
)
response = authenticated_client.get(
reverse("attack-paths-scans-detail", kwargs={"pk": attack_paths_scan.id})
)
assert response.status_code == status.HTTP_200_OK
data = response.json()["data"]
assert data["id"] == str(attack_paths_scan.id)
assert data["relationships"]["provider"]["data"]["id"] == str(provider.id)
assert data["attributes"]["state"] == StateChoices.COMPLETED
def test_attack_paths_scan_retrieve_not_found_for_foreign_tenant(
self, authenticated_client, create_attack_paths_scan
):
other_tenant = Tenant.objects.create(name="Foreign AttackPaths Tenant")
foreign_provider = Provider.objects.create(
provider="aws",
uid="333333333333",
alias="foreign",
tenant_id=other_tenant.id,
)
foreign_scan = create_attack_paths_scan(foreign_provider)
response = authenticated_client.get(
reverse("attack-paths-scans-detail", kwargs={"pk": foreign_scan.id})
)
assert response.status_code == status.HTTP_404_NOT_FOUND
def test_attack_paths_queries_returns_catalog(
self,
authenticated_client,
providers_fixture,
scans_fixture,
create_attack_paths_scan,
):
provider = providers_fixture[0]
attack_paths_scan = create_attack_paths_scan(
provider,
scan=scans_fixture[0],
)
definitions = [
AttackPathsQueryDefinition(
id="aws-rds",
name="RDS inventory",
description="List account RDS assets",
provider=provider.provider,
cypher="MATCH (n) RETURN n",
parameters=[
AttackPathsQueryParameterDefinition(name="ip", label="IP address")
],
)
]
with patch(
"api.v1.views.get_queries_for_provider", return_value=definitions
) as mock_get_queries:
response = authenticated_client.get(
reverse(
"attack-paths-scans-queries", kwargs={"pk": attack_paths_scan.id}
)
)
assert response.status_code == status.HTTP_200_OK
mock_get_queries.assert_called_once_with(provider.provider)
payload = response.json()["data"]
assert len(payload) == 1
assert payload[0]["id"] == "aws-rds"
assert payload[0]["attributes"]["name"] == "RDS inventory"
assert payload[0]["attributes"]["parameters"][0]["name"] == "ip"
def test_attack_paths_queries_returns_404_when_catalog_missing(
self,
authenticated_client,
providers_fixture,
scans_fixture,
create_attack_paths_scan,
):
provider = providers_fixture[0]
attack_paths_scan = create_attack_paths_scan(provider, scan=scans_fixture[0])
with patch("api.v1.views.get_queries_for_provider", return_value=[]):
response = authenticated_client.get(
reverse(
"attack-paths-scans-queries", kwargs={"pk": attack_paths_scan.id}
)
)
assert response.status_code == status.HTTP_404_NOT_FOUND
assert "No queries found" in str(response.json())
def test_run_attack_paths_query_returns_graph(
self,
authenticated_client,
providers_fixture,
scans_fixture,
create_attack_paths_scan,
):
provider = providers_fixture[0]
attack_paths_scan = create_attack_paths_scan(
provider,
scan=scans_fixture[0],
graph_database="tenant-db",
)
query_definition = AttackPathsQueryDefinition(
id="aws-rds",
name="RDS inventory",
description="List account RDS assets",
provider=provider.provider,
cypher="MATCH (n) RETURN n",
parameters=[],
)
prepared_parameters = {"provider_uid": provider.uid}
graph_payload = {
"nodes": [
{
"id": "node-1",
"labels": ["AWSAccount"],
"properties": {"name": "root"},
}
],
"relationships": [
{
"id": "rel-1",
"label": "OWNS",
"source": "node-1",
"target": "node-2",
"properties": {},
}
],
}
with (
patch(
"api.v1.views.get_query_by_id", return_value=query_definition
) as mock_get_query,
patch(
"api.v1.views.attack_paths_views_helpers.prepare_query_parameters",
return_value=prepared_parameters,
) as mock_prepare,
patch(
"api.v1.views.attack_paths_views_helpers.execute_attack_paths_query",
return_value=graph_payload,
) as mock_execute,
):
response = authenticated_client.post(
reverse(
"attack-paths-scans-queries-run",
kwargs={"pk": attack_paths_scan.id},
),
data=self._run_payload("aws-rds"),
content_type=API_JSON_CONTENT_TYPE,
)
assert response.status_code == status.HTTP_200_OK
mock_get_query.assert_called_once_with("aws-rds")
mock_prepare.assert_called_once_with(
query_definition,
{},
attack_paths_scan.provider.uid,
)
mock_execute.assert_called_once_with(
attack_paths_scan,
query_definition,
prepared_parameters,
)
result = response.json()["data"]
attributes = result["attributes"]
assert attributes["nodes"] == graph_payload["nodes"]
assert attributes["relationships"] == graph_payload["relationships"]
def test_run_attack_paths_query_requires_completed_scan(
self,
authenticated_client,
providers_fixture,
scans_fixture,
create_attack_paths_scan,
):
provider = providers_fixture[0]
attack_paths_scan = create_attack_paths_scan(
provider,
scan=scans_fixture[0],
state=StateChoices.EXECUTING,
)
response = authenticated_client.post(
reverse(
"attack-paths-scans-queries-run", kwargs={"pk": attack_paths_scan.id}
),
data=self._run_payload(),
content_type=API_JSON_CONTENT_TYPE,
)
assert response.status_code == status.HTTP_400_BAD_REQUEST
assert "must be completed" in response.json()["errors"][0]["detail"]
def test_run_attack_paths_query_requires_graph_database(
self,
authenticated_client,
providers_fixture,
scans_fixture,
create_attack_paths_scan,
):
provider = providers_fixture[0]
attack_paths_scan = create_attack_paths_scan(
provider,
scan=scans_fixture[0],
graph_database=None,
)
response = authenticated_client.post(
reverse(
"attack-paths-scans-queries-run", kwargs={"pk": attack_paths_scan.id}
),
data=self._run_payload(),
content_type=API_JSON_CONTENT_TYPE,
)
assert response.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
assert "does not reference a graph database" in str(response.json())
def test_run_attack_paths_query_unknown_query(
self,
authenticated_client,
providers_fixture,
scans_fixture,
create_attack_paths_scan,
):
provider = providers_fixture[0]
attack_paths_scan = create_attack_paths_scan(
provider,
scan=scans_fixture[0],
)
with patch("api.v1.views.get_query_by_id", return_value=None):
response = authenticated_client.post(
reverse(
"attack-paths-scans-queries-run",
kwargs={"pk": attack_paths_scan.id},
),
data=self._run_payload("unknown-query"),
content_type=API_JSON_CONTENT_TYPE,
)
assert response.status_code == status.HTTP_400_BAD_REQUEST
assert "Unknown Attack Paths query" in response.json()["errors"][0]["detail"]
def test_run_attack_paths_query_returns_404_when_no_nodes_found(
self,
authenticated_client,
providers_fixture,
scans_fixture,
create_attack_paths_scan,
):
provider = providers_fixture[0]
attack_paths_scan = create_attack_paths_scan(
provider,
scan=scans_fixture[0],
)
query_definition = AttackPathsQueryDefinition(
id="aws-empty",
name="empty",
description="",
provider=provider.provider,
cypher="MATCH (n) RETURN n",
)
with (
patch("api.v1.views.get_query_by_id", return_value=query_definition),
patch(
"api.v1.views.attack_paths_views_helpers.prepare_query_parameters",
return_value={"provider_uid": provider.uid},
),
patch(
"api.v1.views.attack_paths_views_helpers.execute_attack_paths_query",
return_value={"nodes": [], "relationships": []},
),
):
response = authenticated_client.post(
reverse(
"attack-paths-scans-queries-run",
kwargs={"pk": attack_paths_scan.id},
),
data=self._run_payload("aws-empty"),
content_type=API_JSON_CONTENT_TYPE,
)
assert response.status_code == status.HTTP_404_NOT_FOUND
payload = response.json()
if "data" in payload:
attributes = payload["data"].get("attributes", {})
assert attributes.get("nodes") == []
assert attributes.get("relationships") == []
else:
assert "errors" in payload
@pytest.mark.django_db
class TestResourceViewSet:
def test_resources_list_none(self, authenticated_client):
@@ -5315,9 +5730,11 @@ class TestUserRoleRelationshipViewSet:
def test_create_relationship_already_exists(
self, authenticated_client, roles_fixture, create_test_user
):
# Only add Role One (which has manage_account=True) to ensure
# the second request has permission to add roles
data = {
"data": [
{"type": "roles", "id": str(role.id)} for role in roles_fixture[:2]
{"type": "roles", "id": str(roles_fixture[0].id)},
]
}
authenticated_client.post(
@@ -5820,44 +6237,16 @@ class TestProviderGroupMembershipViewSet:
@pytest.mark.django_db
class TestComplianceOverviewViewSet:
@pytest.fixture(autouse=True)
def mock_backfill_task(self):
with patch("api.v1.views.backfill_compliance_summaries_task.delay") as mock:
yield mock
def test_compliance_overview_list_none(
self,
authenticated_client,
tenants_fixture,
providers_fixture,
mock_backfill_task,
):
tenant = tenants_fixture[0]
provider = providers_fixture[0]
scan = Scan.objects.create(
name="empty-compliance-scan",
provider=provider,
trigger=Scan.TriggerChoices.MANUAL,
state=StateChoices.COMPLETED,
tenant=tenant,
)
def test_compliance_overview_list_none(self, authenticated_client):
response = authenticated_client.get(
reverse("complianceoverview-list"),
{"filter[scan_id]": str(scan.id)},
{"filter[scan_id]": "8d20ac7d-4cbc-435e-85f4-359be37af821"},
)
assert response.status_code == status.HTTP_200_OK
assert len(response.json()["data"]) == 0
mock_backfill_task.assert_called_once()
_, kwargs = mock_backfill_task.call_args
assert kwargs["scan_id"] == str(scan.id)
assert str(kwargs["tenant_id"]) == str(tenant.id)
def test_compliance_overview_list(
self,
authenticated_client,
compliance_requirements_overviews_fixture,
mock_backfill_task,
self, authenticated_client, compliance_requirements_overviews_fixture
):
# List compliance overviews with existing data
requirement_overview1 = compliance_requirements_overviews_fixture[0]
@@ -5887,112 +6276,6 @@ class TestComplianceOverviewViewSet:
assert "requirements_failed" in attributes
assert "requirements_manual" in attributes
assert "total_requirements" in attributes
mock_backfill_task.assert_called_once()
_, kwargs = mock_backfill_task.call_args
assert kwargs["scan_id"] == scan_id
def test_compliance_overview_list_uses_preaggregated_summaries(
self,
authenticated_client,
tenants_fixture,
providers_fixture,
mock_backfill_task,
):
tenant = tenants_fixture[0]
provider = providers_fixture[0]
scan = Scan.objects.create(
name="preaggregated-scan",
provider=provider,
trigger=Scan.TriggerChoices.MANUAL,
state=StateChoices.COMPLETED,
tenant=tenant,
)
ComplianceRequirementOverview.objects.create(
tenant=tenant,
scan=scan,
compliance_id="cis_1.4_aws",
framework="CIS-1.4-AWS",
version="1.4",
description="CIS AWS Foundations Benchmark v1.4.0",
region="eu-west-1",
requirement_id="framework-metadata",
requirement_status=StatusChoices.PASS,
passed_checks=1,
failed_checks=0,
total_checks=1,
)
ComplianceOverviewSummary.objects.create(
tenant=tenant,
scan=scan,
compliance_id="cis_1.4_aws",
requirements_passed=5,
requirements_failed=1,
requirements_manual=2,
total_requirements=8,
)
response = authenticated_client.get(
reverse("complianceoverview-list"),
{"filter[scan_id]": str(scan.id)},
)
assert response.status_code == status.HTTP_200_OK
data = response.json()["data"]
assert len(data) == 1
overview = data[0]
assert overview["id"] == "cis_1.4_aws"
assert overview["attributes"]["requirements_passed"] == 5
assert overview["attributes"]["requirements_failed"] == 1
assert overview["attributes"]["requirements_manual"] == 2
assert overview["attributes"]["total_requirements"] == 8
assert "framework" in overview["attributes"]
assert "version" in overview["attributes"]
mock_backfill_task.assert_not_called()
def test_compliance_overview_region_filter_skips_backfill(
self,
authenticated_client,
compliance_requirements_overviews_fixture,
mock_backfill_task,
):
requirement_overview = compliance_requirements_overviews_fixture[0]
scan_id = str(requirement_overview.scan.id)
response = authenticated_client.get(
reverse("complianceoverview-list"),
{
"filter[scan_id]": scan_id,
"filter[region]": requirement_overview.region,
},
)
assert response.status_code == status.HTTP_200_OK
assert len(response.json()["data"]) >= 1
mock_backfill_task.assert_not_called()
def test_compliance_overview_list_without_scan_id(
self, authenticated_client, compliance_requirements_overviews_fixture
):
# Ensure the endpoint works without passing a scan filter
response = authenticated_client.get(reverse("complianceoverview-list"))
assert response.status_code == status.HTTP_200_OK
data = response.json()["data"]
assert len(data) == 3
# Validate payload structure
first_item = data[0]
assert "id" in first_item
assert "attributes" in first_item
attributes = first_item["attributes"]
assert "framework" in attributes
assert "version" in attributes
assert "requirements_passed" in attributes
assert "requirements_failed" in attributes
assert "requirements_manual" in attributes
assert "total_requirements" in attributes
def test_compliance_overview_metadata(
self, authenticated_client, compliance_requirements_overviews_fixture
@@ -6146,11 +6429,6 @@ class TestComplianceOverviewViewSet:
requirement_overview1 = compliance_requirements_overviews_fixture[0]
scan_id = str(requirement_overview1.scan.id)
# Remove existing compliance data so the view falls back to task checks
scan = requirement_overview1.scan
ComplianceOverviewSummary.objects.filter(scan=scan).delete()
ComplianceRequirementOverview.objects.filter(scan=scan).delete()
# Mock a running task
with patch.object(
ComplianceOverviewViewSet, "get_task_response_if_running"
@@ -6178,11 +6456,6 @@ class TestComplianceOverviewViewSet:
requirement_overview1 = compliance_requirements_overviews_fixture[0]
scan_id = str(requirement_overview1.scan.id)
# Remove existing compliance data so the view falls back to task checks
scan = requirement_overview1.scan
ComplianceOverviewSummary.objects.filter(scan=scan).delete()
ComplianceRequirementOverview.objects.filter(scan=scan).delete()
# Mock a failed task
with patch.object(
ComplianceOverviewViewSet, "get_task_response_if_running"
@@ -6206,8 +6479,6 @@ class TestComplianceOverviewViewSet:
("framework", "framework", 1),
("version", "version", 1),
("region", "region", 1),
("region__in", "region", 1),
("region.in", "region", 1),
],
)
def test_compliance_overview_filters(
+2 -1
@@ -158,7 +158,8 @@ def get_prowler_provider_kwargs(
if mutelist_processor:
mutelist_content = mutelist_processor.configuration.get("Mutelist", {})
if mutelist_content:
# IaC provider doesn't support mutelist (uses Trivy's built-in logic)
if mutelist_content and provider.provider != Provider.ProviderChoices.IAC.value:
prowler_provider_kwargs["mutelist_content"] = mutelist_content
return prowler_provider_kwargs
+104
@@ -21,6 +21,7 @@ from rest_framework_simplejwt.tokens import RefreshToken
from api.db_router import MainRouter
from api.exceptions import ConflictException
from api.models import (
AttackPathsScan,
Finding,
Integration,
IntegrationProviderRelationship,
@@ -1127,6 +1128,109 @@ class ScanComplianceReportSerializer(serializers.Serializer):
fields = ["id", "name"]
class AttackPathsScanSerializer(RLSSerializer):
state = StateEnumSerializerField(read_only=True)
provider_alias = serializers.SerializerMethodField(read_only=True)
provider_type = serializers.SerializerMethodField(read_only=True)
provider_uid = serializers.SerializerMethodField(read_only=True)
class Meta:
model = AttackPathsScan
fields = [
"id",
"state",
"progress",
"provider",
"provider_alias",
"provider_type",
"provider_uid",
"scan",
"task",
"inserted_at",
"started_at",
"completed_at",
"duration",
]
included_serializers = {
"provider": "api.v1.serializers.ProviderIncludeSerializer",
"scan": "api.v1.serializers.ScanIncludeSerializer",
"task": "api.v1.serializers.TaskSerializer",
}
def get_provider_alias(self, obj):
provider = getattr(obj, "provider", None)
return provider.alias if provider else None
def get_provider_type(self, obj):
provider = getattr(obj, "provider", None)
return provider.provider if provider else None
def get_provider_uid(self, obj):
provider = getattr(obj, "provider", None)
return provider.uid if provider else None
class AttackPathsQueryParameterSerializer(serializers.Serializer):
name = serializers.CharField()
label = serializers.CharField()
data_type = serializers.CharField(default="string")
description = serializers.CharField(allow_null=True, required=False)
placeholder = serializers.CharField(allow_null=True, required=False)
class JSONAPIMeta:
resource_name = "attack-paths-query-parameter"
class AttackPathsQuerySerializer(serializers.Serializer):
id = serializers.CharField()
name = serializers.CharField()
description = serializers.CharField()
provider = serializers.CharField()
parameters = AttackPathsQueryParameterSerializer(many=True)
class JSONAPIMeta:
resource_name = "attack-paths-query"
class AttackPathsQueryRunRequestSerializer(serializers.Serializer):
id = serializers.CharField()
parameters = serializers.DictField(
child=serializers.JSONField(), allow_empty=True, required=False
)
class JSONAPIMeta:
resource_name = "attack-paths-query-run-request"
class AttackPathsNodeSerializer(serializers.Serializer):
id = serializers.CharField()
labels = serializers.ListField(child=serializers.CharField())
properties = serializers.DictField(child=serializers.JSONField())
class JSONAPIMeta:
resource_name = "attack-paths-query-result-node"
class AttackPathsRelationshipSerializer(serializers.Serializer):
id = serializers.CharField()
label = serializers.CharField()
source = serializers.CharField()
target = serializers.CharField()
properties = serializers.DictField(child=serializers.JSONField())
class JSONAPIMeta:
resource_name = "attack-paths-query-result-relationship"
class AttackPathsQueryResultSerializer(serializers.Serializer):
nodes = AttackPathsNodeSerializer(many=True)
relationships = AttackPathsRelationshipSerializer(many=True)
class JSONAPIMeta:
resource_name = "attack-paths-query-result"
class ResourceTagSerializer(RLSSerializer):
"""
Serializer for the ResourceTag model
+4
@@ -4,6 +4,7 @@ from drf_spectacular.views import SpectacularRedocView
from rest_framework_nested import routers
from api.v1.views import (
AttackPathsScanViewSet,
ComplianceOverviewViewSet,
CustomSAMLLoginView,
CustomTokenObtainView,
@@ -53,6 +54,9 @@ router.register(r"tenants", TenantViewSet, basename="tenant")
router.register(r"providers", ProviderViewSet, basename="provider")
router.register(r"provider-groups", ProviderGroupViewSet, basename="providergroup")
router.register(r"scans", ScanViewSet, basename="scan")
router.register(
r"attack-paths-scans", AttackPathsScanViewSet, basename="attack-paths-scans"
)
router.register(r"tasks", TaskViewSet, basename="task")
router.register(r"resources", ResourceViewSet, basename="resource")
router.register(r"findings", FindingViewSet, basename="finding")
+305 -284
@@ -3,6 +3,7 @@ import glob
import json
import logging
import os
from collections import defaultdict
from copy import deepcopy
from datetime import datetime, timedelta, timezone
@@ -10,6 +11,7 @@ from decimal import ROUND_HALF_UP, Decimal, InvalidOperation
from urllib.parse import urljoin
import sentry_sdk
from allauth.socialaccount.models import SocialAccount, SocialApp
from allauth.socialaccount.providers.github.views import GitHubOAuth2Adapter
from allauth.socialaccount.providers.google.views import GoogleOAuth2Adapter
@@ -41,8 +43,9 @@ from django.db.models import (
Sum,
Value,
When,
Window,
)
from django.db.models.functions import Coalesce
from django.db.models.functions import Coalesce, RowNumber
from django.http import HttpResponse, QueryDict
from django.shortcuts import redirect
from django.urls import reverse
@@ -72,23 +75,12 @@ from rest_framework.generics import GenericAPIView, get_object_or_404
from rest_framework.permissions import SAFE_METHODS
from rest_framework_json_api.views import RelationshipView, Response
from rest_framework_simplejwt.exceptions import InvalidToken, TokenError
from tasks.beat import schedule_provider_scan
from tasks.jobs.export import get_s3_client
from tasks.tasks import (
backfill_compliance_summaries_task,
backfill_scan_resource_summaries_task,
check_integration_connection_task,
check_lighthouse_connection_task,
check_lighthouse_provider_connection_task,
check_provider_connection_task,
delete_provider_task,
delete_tenant_task,
jira_integration_task,
mute_historical_findings_task,
perform_scan_task,
refresh_lighthouse_provider_models_task,
)
from api.attack_paths import (
get_queries_for_provider,
get_query_by_id,
views_helpers as attack_paths_views_helpers,
)
from api.base_views import BaseRLSViewSet, BaseTenantViewset, BaseUserViewset
from api.compliance import (
PROWLER_COMPLIANCE_OVERVIEW_TEMPLATE,
@@ -106,6 +98,7 @@ from api.filters import (
InvitationFilter,
LatestFindingFilter,
LatestResourceFilter,
AttackPathsScanFilter,
LighthouseProviderConfigFilter,
LighthouseProviderModelsFilter,
MembershipFilter,
@@ -126,11 +119,11 @@ from api.filters import (
UserFilter,
)
from api.models import (
ComplianceOverviewSummary,
ComplianceRequirementOverview,
Finding,
Integration,
Invitation,
AttackPathsScan,
LighthouseConfiguration,
LighthouseProviderConfiguration,
LighthouseProviderModels,
@@ -172,6 +165,10 @@ from api.utils import (
from api.uuid_utils import datetime_to_uuid7, uuid7_start
from api.v1.mixins import DisablePaginationMixin, PaginateByPkMixin, TaskManagementMixin
from api.v1.serializers import (
AttackPathsQueryRunRequestSerializer,
AttackPathsQuerySerializer,
AttackPathsQueryResultSerializer,
AttackPathsScanSerializer,
ComplianceOverviewAttributesSerializer,
ComplianceOverviewDetailSerializer,
ComplianceOverviewDetailThreatscoreSerializer,
@@ -249,6 +246,22 @@ from api.v1.serializers import (
UserSerializer,
UserUpdateSerializer,
)
from tasks.beat import schedule_provider_scan
from tasks.jobs.attack_paths import db_utils as attack_paths_db_utils
from tasks.jobs.export import get_s3_client
from tasks.tasks import (
backfill_scan_resource_summaries_task,
check_integration_connection_task,
check_lighthouse_connection_task,
check_lighthouse_provider_connection_task,
check_provider_connection_task,
delete_provider_task,
delete_tenant_task,
jira_integration_task,
mute_historical_findings_task,
perform_scan_task,
refresh_lighthouse_provider_models_task,
)
logger = logging.getLogger(BackendLogger.API)
@@ -392,6 +405,10 @@ class SchemaView(SpectacularAPIView):
"name": "Scan",
"description": "Endpoints for triggering manual scans and viewing scan results.",
},
{
"name": "Attack Paths",
"description": "Endpoints for Attack Paths scan status and executing Attack Paths queries.",
},
{
"name": "Schedule",
"description": "Endpoints for managing scan schedules, allowing configuration of automated "
@@ -2142,6 +2159,12 @@ class ScanViewSet(BaseRLSViewSet):
},
)
attack_paths_db_utils.create_attack_paths_scan(
tenant_id=self.request.tenant_id,
scan_id=str(scan.id),
provider_id=str(scan.provider_id),
)
prowler_task = Task.objects.get(id=task.id)
scan.task_id = task.id
scan.save(update_fields=["task_id"])
@@ -2222,6 +2245,187 @@ class TaskViewSet(BaseRLSViewSet):
)
@extend_schema_view(
list=extend_schema(
tags=["Attack Paths"],
summary="List Attack Paths scans",
description="Retrieve Attack Paths scans for the tenant with support for filtering, ordering, and pagination.",
),
retrieve=extend_schema(
tags=["Attack Paths"],
summary="Retrieve Attack Paths scan details",
description="Fetch full details for a specific Attack Paths scan.",
),
attack_paths_queries=extend_schema(
tags=["Attack Paths"],
summary="List attack paths queries",
description="Retrieve the catalog of Attack Paths queries available for this Attack Paths scan.",
responses={
200: OpenApiResponse(AttackPathsQuerySerializer(many=True)),
404: OpenApiResponse(
description="No queries found for the selected provider"
),
},
),
run_attack_paths_query=extend_schema(
tags=["Attack Paths"],
summary="Execute an Attack Paths query",
description="Execute the selected Attack Paths query against the Attack Paths graph and return the resulting subgraph.",
request=AttackPathsQueryRunRequestSerializer,
responses={
200: OpenApiResponse(AttackPathsQueryResultSerializer),
400: OpenApiResponse(
description="Bad request (e.g., Unknown Attack Paths query for the selected provider)"
),
404: OpenApiResponse(
description="No attack paths found for the given query and parameters"
),
500: OpenApiResponse(
description="Attack Paths query execution failed due to a database error"
),
},
),
)
class AttackPathsScanViewSet(BaseRLSViewSet):
queryset = AttackPathsScan.objects.all()
serializer_class = AttackPathsScanSerializer
http_method_names = ["get", "post"]
filterset_class = AttackPathsScanFilter
ordering = ["-inserted_at"]
ordering_fields = [
"inserted_at",
"started_at",
]
# RBAC required permissions
required_permissions = [Permissions.MANAGE_SCANS]
def set_required_permissions(self):
if self.request.method in SAFE_METHODS:
self.required_permissions = []
else:
self.required_permissions = [Permissions.MANAGE_SCANS]
def get_serializer_class(self):
if self.action == "run_attack_paths_query":
return AttackPathsQueryRunRequestSerializer
return super().get_serializer_class()
def get_queryset(self):
user_roles = get_role(self.request.user)
base_queryset = AttackPathsScan.objects.filter(tenant_id=self.request.tenant_id)
if user_roles.unlimited_visibility:
queryset = base_queryset
else:
queryset = base_queryset.filter(provider__in=get_providers(user_roles))
return queryset.select_related("provider", "scan", "task")
def list(self, request, *args, **kwargs):
queryset = self.filter_queryset(self.get_queryset())
latest_per_provider = queryset.annotate(
latest_scan_rank=Window(
expression=RowNumber(),
partition_by=[F("provider_id")],
order_by=[F("inserted_at").desc()],
)
).filter(latest_scan_rank=1)
page = self.paginate_queryset(latest_per_provider)
if page is not None:
serializer = self.get_serializer(page, many=True)
return self.get_paginated_response(serializer.data)
serializer = self.get_serializer(latest_per_provider, many=True)
return Response(serializer.data)
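The Window/RowNumber annotation above is the standard "latest row per group" pattern; note that filtering on a window annotation (latest_scan_rank=1) is only supported from Django 4.2 onwards. As a standalone sketch of the pattern (hypothetical helper name):

from django.db.models import F, Window
from django.db.models.functions import RowNumber

def latest_per(queryset, group_field, order_field="inserted_at"):
    # Rank rows within each group by recency, then keep only rank 1.
    return queryset.annotate(
        _rank=Window(
            expression=RowNumber(),
            partition_by=[F(group_field)],
            order_by=[F(order_field).desc()],
        )
    ).filter(_rank=1)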
@extend_schema(exclude=True)
def create(self, request, *args, **kwargs):
raise MethodNotAllowed(method="POST")
@extend_schema(exclude=True)
def destroy(self, request, *args, **kwargs):
raise MethodNotAllowed(method="DELETE")
@action(
detail=True,
methods=["get"],
url_path="queries",
url_name="queries",
)
def attack_paths_queries(self, request, pk=None):
attack_paths_scan = self.get_object()
queries = get_queries_for_provider(attack_paths_scan.provider.provider)
if not queries:
return Response(
{"detail": "No queries found for the selected provider"},
status=status.HTTP_404_NOT_FOUND,
)
serializer = AttackPathsQuerySerializer(queries, many=True)
return Response(serializer.data, status=status.HTTP_200_OK)
@action(
detail=True,
methods=["post"],
url_path="queries/run",
url_name="queries-run",
)
def run_attack_paths_query(self, request, pk=None):
attack_paths_scan = self.get_object()
if attack_paths_scan.state != StateChoices.COMPLETED:
raise ValidationError(
{
"detail": "The Attack Paths scan must be completed before running Attack Paths queries"
}
)
if not attack_paths_scan.graph_database:
logger.error(
f"The Attack Paths Scan {attack_paths_scan.id} does not reference a graph database"
)
return Response(
{"detail": "The Attack Paths scan does not reference a graph database"},
status=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
payload = attack_paths_views_helpers.normalize_run_payload(request.data)
serializer = AttackPathsQueryRunRequestSerializer(data=payload)
serializer.is_valid(raise_exception=True)
query_definition = get_query_by_id(serializer.validated_data["id"])
if (
query_definition is None
or query_definition.provider != attack_paths_scan.provider.provider
):
raise ValidationError(
{"id": "Unknown Attack Paths query for the selected provider"}
)
parameters = attack_paths_views_helpers.prepare_query_parameters(
query_definition,
serializer.validated_data.get("parameters", {}),
attack_paths_scan.provider.uid,
)
graph = attack_paths_views_helpers.execute_attack_paths_query(
attack_paths_scan, query_definition, parameters
)
status_code = status.HTTP_200_OK
if not graph.get("nodes"):
status_code = status.HTTP_404_NOT_FOUND
response_serializer = AttackPathsQueryResultSerializer(graph)
return Response(response_serializer.data, status=status_code)
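For reference, a request against this action mirrors the _run_payload helper used in the tests; a sketch with Django's test client (the /api/v1 prefix and identifier values are illustrative assumptions):

client.post(
    f"/api/v1/attack-paths-scans/{attack_paths_scan_id}/queries/run",
    data={
        "data": {
            "type": "attack-paths-query-run-request",
            "attributes": {"id": "aws-rds", "parameters": {"ip": "192.0.2.0"}},
        }
    },
    content_type="application/vnd.api+json",
)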
@extend_schema_view(
list=extend_schema(
tags=["Resource"],
@@ -3359,50 +3563,15 @@ class RoleProviderGroupRelationshipView(RelationshipView, BaseRLSViewSet):
@extend_schema_view(
list=extend_schema(
tags=["Compliance Overview"],
summary="List compliance overviews",
description=(
"Retrieve an overview of all compliance frameworks. "
"If scan_id is provided, returns compliance data for that specific scan. "
"If scan_id is omitted, returns compliance data aggregated from the latest completed scan of each provider."
),
summary="List compliance overviews for a scan",
description="Retrieve an overview of all the compliance in a given scan.",
parameters=[
OpenApiParameter(
name="filter[scan_id]",
required=False,
required=True,
type=OpenApiTypes.UUID,
location=OpenApiParameter.QUERY,
description=(
"Optional scan ID. If provided, returns compliance for that scan. "
"If omitted, returns compliance for the latest completed scan per provider."
),
),
OpenApiParameter(
name="filter[provider_id]",
required=False,
type=OpenApiTypes.UUID,
location=OpenApiParameter.QUERY,
description="Filter by specific provider ID.",
),
OpenApiParameter(
name="filter[provider_id__in]",
required=False,
type={"type": "array", "items": {"type": "string", "format": "uuid"}},
location=OpenApiParameter.QUERY,
description="Filter by multiple provider IDs (comma-separated).",
),
OpenApiParameter(
name="filter[provider_type]",
required=False,
type=OpenApiTypes.STR,
location=OpenApiParameter.QUERY,
description="Filter by provider type (e.g., aws, azure, gcp).",
),
OpenApiParameter(
name="filter[provider_type__in]",
required=False,
type={"type": "array", "items": {"type": "string"}},
location=OpenApiParameter.QUERY,
description="Filter by multiple provider types (comma-separated).",
description="Related scan ID.",
),
],
responses={
@@ -3559,114 +3728,6 @@ class ComplianceOverviewViewSet(BaseRLSViewSet, TaskManagementMixin):
def retrieve(self, request, *args, **kwargs):
raise MethodNotAllowed(method="GET")
def _compliance_summaries_queryset(self, scan_id):
"""Return pre-aggregated summaries constrained by RBAC visibility."""
role = get_role(self.request.user)
unlimited_visibility = getattr(
role, Permissions.UNLIMITED_VISIBILITY.value, False
)
summaries = ComplianceOverviewSummary.objects.filter(
tenant_id=self.request.tenant_id,
scan_id=scan_id,
)
if not unlimited_visibility:
providers = Provider.all_objects.filter(
provider_groups__in=role.provider_groups.all()
).distinct()
summaries = summaries.filter(scan__provider__in=providers)
return summaries
def _get_compliance_template(self, *, provider=None, scan_id=None):
"""Return the compliance template for the given provider or scan."""
if provider is None and scan_id is not None:
scan = Scan.all_objects.select_related("provider").get(pk=scan_id)
provider = scan.provider
if not provider:
return {}
return PROWLER_COMPLIANCE_OVERVIEW_TEMPLATE.get(provider.provider, {})
def _aggregate_compliance_overview(self, queryset, template_metadata=None):
"""
Aggregate requirement rows into compliance overview dictionaries.
Args:
queryset: ComplianceRequirementOverview queryset already filtered.
template_metadata: Optional dict mapping compliance_id -> metadata.
"""
template_metadata = template_metadata or {}
requirement_status_subquery = queryset.values(
"compliance_id", "requirement_id"
).annotate(
fail_count=Count("id", filter=Q(requirement_status="FAIL")),
pass_count=Count("id", filter=Q(requirement_status="PASS")),
total_count=Count("id"),
)
compliance_data = {}
fallback_metadata = {
item["compliance_id"]: {
"framework": item["framework"],
"version": item["version"],
}
for item in queryset.values(
"compliance_id", "framework", "version"
).distinct()
}
for item in requirement_status_subquery:
compliance_id = item["compliance_id"]
if item["fail_count"] > 0:
req_status = "FAIL"
elif item["pass_count"] == item["total_count"]:
req_status = "PASS"
else:
req_status = "MANUAL"
compliance_status = compliance_data.setdefault(
compliance_id,
{
"total_requirements": 0,
"requirements_passed": 0,
"requirements_failed": 0,
"requirements_manual": 0,
},
)
compliance_status["total_requirements"] += 1
if req_status == "PASS":
compliance_status["requirements_passed"] += 1
elif req_status == "FAIL":
compliance_status["requirements_failed"] += 1
else:
compliance_status["requirements_manual"] += 1
response_data = []
for compliance_id, data in compliance_data.items():
template = template_metadata.get(compliance_id, {})
fallback = fallback_metadata.get(compliance_id, {})
response_data.append(
{
"id": compliance_id,
"compliance_id": compliance_id,
"framework": template.get("framework")
or fallback.get("framework", ""),
"version": template.get("version") or fallback.get("version", ""),
"requirements_passed": data["requirements_passed"],
"requirements_failed": data["requirements_failed"],
"requirements_manual": data["requirements_manual"],
"total_requirements": data["total_requirements"],
}
)
serializer = self.get_serializer(response_data, many=True)
return serializer.data
def _task_response_if_running(self, scan_id):
"""Check for an in-progress task only when no compliance data exists."""
try:
@@ -3681,135 +3742,95 @@ class ComplianceOverviewViewSet(BaseRLSViewSet, TaskManagementMixin):
status=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
def _list_with_region_filter(self, scan_id, region_filter):
"""
Fall back to a detailed ComplianceRequirementOverview query when a region filter is applied.
This uses the original aggregation logic across filtered regions.
"""
regions = region_filter.split(",") if "," in region_filter else [region_filter]
queryset = self.filter_queryset(self.get_queryset()).filter(
scan_id=scan_id,
region__in=regions,
)
data = self._aggregate_compliance_overview(queryset)
if data:
return Response(data)
task_response = self._task_response_if_running(scan_id)
if task_response:
return task_response
return Response(data)
def _list_without_region_aggregation(self, scan_id):
"""
Fallback aggregation used when compliance summaries don't exist yet.
Aggregates ComplianceRequirementOverview data across ALL regions.
"""
queryset = self.filter_queryset(self.get_queryset()).filter(scan_id=scan_id)
compliance_template = self._get_compliance_template(scan_id=scan_id)
data = self._aggregate_compliance_overview(
queryset, template_metadata=compliance_template
)
if data:
return Response(data)
task_response = self._task_response_if_running(scan_id)
if task_response:
return task_response
return Response(data)
def list(self, request, *args, **kwargs):
scan_id = request.query_params.get("filter[scan_id]")
tenant_id = self.request.tenant_id
if scan_id:
# Specific scan requested - use optimized summaries with region support
region_filter = request.query_params.get(
"filter[region]"
) or request.query_params.get("filter[region__in]")
if region_filter:
# Fall back to detailed query with region filtering
return self._list_with_region_filter(scan_id, region_filter)
summaries = list(self._compliance_summaries_queryset(scan_id))
if not summaries:
# Trigger async backfill for next time
backfill_compliance_summaries_task.delay(
tenant_id=self.request.tenant_id, scan_id=scan_id
)
# Use fallback aggregation for this request
return self._list_without_region_aggregation(scan_id)
# Get compliance template for provider to enrich with framework/version
compliance_template = self._get_compliance_template(scan_id=scan_id)
# Convert to response format with framework/version enrichment
response_data = []
for summary in summaries:
compliance_metadata = compliance_template.get(summary.compliance_id, {})
response_data.append(
if not scan_id:
raise ValidationError(
[
{
"id": summary.compliance_id,
"compliance_id": summary.compliance_id,
"framework": compliance_metadata.get("framework", ""),
"version": compliance_metadata.get("version", ""),
"requirements_passed": summary.requirements_passed,
"requirements_failed": summary.requirements_failed,
"requirements_manual": summary.requirements_manual,
"total_requirements": summary.total_requirements,
"detail": "This query parameter is required.",
"status": 400,
"source": {"pointer": "filter[scan_id]"},
"code": "required",
}
)
]
)
try:
if task := self.get_task_response_if_running(
task_name="scan-compliance-overviews",
task_kwargs={"tenant_id": self.request.tenant_id, "scan_id": scan_id},
raise_on_not_found=False,
):
return task
except TaskFailedException:
return Response(
{"detail": "Task failed to generate compliance overview data."},
status=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
queryset = self.filter_queryset(self.filter_queryset(self.get_queryset()))
serializer = self.get_serializer(response_data, many=True)
return Response(serializer.data)
else:
# No scan_id provided - use latest scans per provider
# First, check if provider filters are present
provider_id = request.query_params.get("filter[provider_id]")
provider_id__in = request.query_params.get("filter[provider_id__in]")
provider_type = request.query_params.get("filter[provider_type]")
provider_type__in = request.query_params.get("filter[provider_type__in]")
requirement_status_subquery = queryset.values(
"compliance_id", "requirement_id"
).annotate(
fail_count=Count("id", filter=Q(requirement_status="FAIL")),
pass_count=Count("id", filter=Q(requirement_status="PASS")),
total_count=Count("id"),
)
scan_filters = {"tenant_id": tenant_id, "state": StateChoices.COMPLETED}
compliance_data = {}
framework_info = {}
# Apply provider ID filters
if provider_id:
scan_filters["provider_id"] = provider_id
elif provider_id__in:
# Convert comma-separated string to list
provider_ids = [pid.strip() for pid in provider_id__in.split(",")]
scan_filters["provider_id__in"] = provider_ids
for item in queryset.values("compliance_id", "framework", "version").distinct():
framework_info[item["compliance_id"]] = {
"framework": item["framework"],
"version": item["version"],
}
# Apply provider type filters
if provider_type:
scan_filters["provider__provider"] = provider_type
elif provider_type__in:
# Convert comma-separated string to list
provider_types = [pt.strip() for pt in provider_type__in.split(",")]
scan_filters["provider__provider__in"] = provider_types
for item in requirement_status_subquery:
compliance_id = item["compliance_id"]
latest_scan_ids = (
Scan.all_objects.filter(**scan_filters)
.order_by("provider_id", "-inserted_at")
.distinct("provider_id")
.values_list("id", flat=True)
if item["fail_count"] > 0:
req_status = "FAIL"
elif item["pass_count"] == item["total_count"]:
req_status = "PASS"
else:
req_status = "MANUAL"
if compliance_id not in compliance_data:
compliance_data[compliance_id] = {
"total_requirements": 0,
"requirements_passed": 0,
"requirements_failed": 0,
"requirements_manual": 0,
}
compliance_data[compliance_id]["total_requirements"] += 1
if req_status == "PASS":
compliance_data[compliance_id]["requirements_passed"] += 1
elif req_status == "FAIL":
compliance_data[compliance_id]["requirements_failed"] += 1
else:
compliance_data[compliance_id]["requirements_manual"] += 1
response_data = []
for compliance_id, data in compliance_data.items():
framework = framework_info.get(compliance_id, {})
response_data.append(
{
"id": compliance_id,
"compliance_id": compliance_id,
"framework": framework.get("framework", ""),
"version": framework.get("version", ""),
"requirements_passed": data["requirements_passed"],
"requirements_failed": data["requirements_failed"],
"requirements_manual": data["requirements_manual"],
"total_requirements": data["total_requirements"],
}
)
base_queryset = self.get_queryset()
queryset = self.filter_queryset(
base_queryset.filter(scan_id__in=latest_scan_ids)
)
# Aggregate compliance data across latest scans
compliance_template = self._get_compliance_template()
data = self._aggregate_compliance_overview(
queryset, template_metadata=compliance_template
)
return Response(data)
serializer = self.get_serializer(response_data, many=True)
return Response(serializer.data)
@action(detail=False, methods=["get"], url_name="metadata")
def metadata(self, request):
@@ -5328,7 +5349,7 @@ class TenantApiKeyViewSet(BaseRLSViewSet):
@extend_schema(exclude=True)
def destroy(self, request, *args, **kwargs):
raise MethodNotAllowed(method="DESTROY")
raise MethodNotAllowed(method="DELETE")
@action(detail=True, methods=["delete"])
def revoke(self, request, *args, **kwargs):
+1
@@ -1,6 +1,7 @@
import warnings
from celery import Celery, Task
from config.env import env
# Suppress specific warnings from django-rest-auth: https://github.com/iMerica/dj-rest-auth/issues/684
+6
@@ -36,6 +36,12 @@ DATABASES = {
"HOST": env("POSTGRES_REPLICA_HOST", default=default_db_host),
"PORT": env("POSTGRES_REPLICA_PORT", default=default_db_port),
},
"neo4j": {
"HOST": env.str("NEO4J_HOST", "neo4j"),
"PORT": env.str("NEO4J_PORT", "7687"),
"USER": env.str("NEO4J_USER", "neo4j"),
"PASSWORD": env.str("NEO4J_PASSWORD", "neo4j_password"),
},
}
DATABASES["default"] = DATABASES["prowler_user"]
@@ -37,6 +37,12 @@ DATABASES = {
"HOST": env("POSTGRES_REPLICA_HOST", default=default_db_host),
"PORT": env("POSTGRES_REPLICA_PORT", default=default_db_port),
},
"neo4j": {
"HOST": env.str("NEO4J_HOST"),
"PORT": env.str("NEO4J_PORT"),
"USER": env.str("NEO4J_USER"),
"PASSWORD": env.str("NEO4J_PASSWORD"),
},
}
DATABASES["default"] = DATABASES["prowler_user"]
+111 -7
@@ -1,8 +1,11 @@
import logging
from types import SimpleNamespace
from datetime import datetime, timedelta, timezone
from unittest.mock import MagicMock, patch
import pytest
from allauth.socialaccount.models import SocialLogin
from django.conf import settings
from django.db import connection as django_connection
@@ -11,10 +14,14 @@ from django.urls import reverse
from django_celery_results.models import TaskResult
from rest_framework import status
from rest_framework.test import APIClient
from tasks.jobs.backfill import backfill_resource_scan_summaries
from api.attack_paths import (
AttackPathsQueryDefinition,
AttackPathsQueryParameterDefinition,
)
from api.db_utils import rls_transaction
from api.models import (
AttackPathsScan,
ComplianceOverview,
ComplianceRequirementOverview,
Finding,
@@ -47,6 +54,7 @@ from api.rls import Tenant
from api.v1.serializers import TokenSerializer
from prowler.lib.check.models import Severity
from prowler.lib.outputs.finding import Status
from tasks.jobs.backfill import backfill_resource_scan_summaries
TODAY = str(datetime.today().date())
API_JSON_CONTENT_TYPE = "application/vnd.api+json"
@@ -159,22 +167,20 @@ def create_test_user_rbac_no_roles(django_db_setup, django_db_blocker, tenants_f
@pytest.fixture(scope="function")
def create_test_user_rbac_limited(django_db_setup, django_db_blocker):
def create_test_user_rbac_limited(django_db_setup, django_db_blocker, tenants_fixture):
with django_db_blocker.unblock():
user = User.objects.create_user(
name="testing_limited",
email="rbac_limited@rbac.com",
password=TEST_PASSWORD,
)
tenant = Tenant.objects.create(
name="Tenant Test",
)
tenant = tenants_fixture[0]
Membership.objects.create(
user=user,
tenant=tenant,
role=Membership.RoleChoices.OWNER,
)
Role.objects.create(
role = Role.objects.create(
name="limited",
tenant_id=tenant.id,
manage_users=False,
@@ -187,7 +193,7 @@ def create_test_user_rbac_limited(django_db_setup, django_db_blocker):
)
UserRoleRelationship.objects.create(
user=user,
role=Role.objects.get(name="limited"),
role=role,
tenant_id=tenant.id,
)
return user
@@ -1469,6 +1475,104 @@ def mute_rules_fixture(tenants_fixture, create_test_user, findings_fixture):
return mute_rule1, mute_rule2
@pytest.fixture
def create_attack_paths_scan():
"""Factory fixture to create Attack Paths scans for tests."""
def _create(
provider,
*,
scan=None,
state=StateChoices.COMPLETED,
progress=0,
graph_database="tenant-db",
**extra_fields,
):
scan_instance = scan or Scan.objects.create(
name=extra_fields.pop("scan_name", "Attack Paths Supporting Scan"),
provider=provider,
trigger=Scan.TriggerChoices.MANUAL,
state=extra_fields.pop("scan_state", StateChoices.COMPLETED),
tenant_id=provider.tenant_id,
)
payload = {
"tenant_id": provider.tenant_id,
"provider": provider,
"scan": scan_instance,
"state": state,
"progress": progress,
"graph_database": graph_database,
}
payload.update(extra_fields)
return AttackPathsScan.objects.create(**payload)
return _create
@pytest.fixture
def attack_paths_query_definition_factory():
"""Factory fixture for building Attack Paths query definitions."""
def _create(**overrides):
cast_type = overrides.pop("cast_type", str)
parameters = overrides.pop(
"parameters",
[
AttackPathsQueryParameterDefinition(
name="limit",
label="Limit",
cast=cast_type,
)
],
)
definition_payload = {
"id": "aws-test",
"name": "Attack Paths Test Query",
"description": "Synthetic Attack Paths definition for tests.",
"provider": "aws",
"cypher": "RETURN 1",
"parameters": parameters,
}
definition_payload.update(overrides)
return AttackPathsQueryDefinition(**definition_payload)
return _create
@pytest.fixture
def attack_paths_graph_stub_classes():
"""Provide lightweight graph element stubs for Attack Paths serialization tests."""
class AttackPathsNativeValue:
def __init__(self, value):
self._value = value
def to_native(self):
return self._value
class AttackPathsNode:
def __init__(self, element_id, labels, properties):
self.element_id = element_id
self.labels = labels
self._properties = properties
class AttackPathsRelationship:
def __init__(self, element_id, rel_type, start_node, end_node, properties):
self.element_id = element_id
self.type = rel_type
self.start_node = start_node
self.end_node = end_node
self._properties = properties
return SimpleNamespace(
NativeValue=AttackPathsNativeValue,
Node=AttackPathsNode,
Relationship=AttackPathsRelationship,
)
def get_authorization_header(access_token: str) -> dict:
return {"Authorization": f"Bearer {access_token}"}
+7
@@ -7,6 +7,7 @@ from tasks.tasks import perform_scheduled_scan_task
from api.db_utils import rls_transaction
from api.exceptions import ConflictException
from api.models import Provider, Scan, StateChoices
from tasks.jobs.attack_paths import db_utils as attack_paths_db_utils
def schedule_provider_scan(provider_instance: Provider):
@@ -39,6 +40,12 @@ def schedule_provider_scan(provider_instance: Provider):
scheduled_at=datetime.now(timezone.utc),
)
attack_paths_db_utils.create_attack_paths_scan(
tenant_id=tenant_id,
scan_id=str(scheduled_scan.id),
provider_id=provider_id,
)
# Schedule the task
periodic_task_instance = PeriodicTask.objects.create(
interval=schedule,
@@ -0,0 +1,5 @@
from tasks.jobs.attack_paths.scan import run as attack_paths_scan
__all__ = [
"attack_paths_scan",
]
@@ -0,0 +1,237 @@
# Portions of this file are based on code from the Cartography project
# (https://github.com/cartography-cncf/cartography), which is licensed under the Apache 2.0 License.
from typing import Any
import aioboto3
import boto3
import neo4j
from cartography.config import Config as CartographyConfig
from cartography.intel import aws as cartography_aws
from celery.utils.log import get_task_logger
from api.models import (
AttackPathsScan as ProwlerAPIAttackPathsScan,
Provider as ProwlerAPIProvider,
)
from prowler.providers.common.provider import Provider as ProwlerSDKProvider
from tasks.jobs.attack_paths import db_utils, utils
logger = get_task_logger(__name__)
def start_aws_ingestion(
neo4j_session: neo4j.Session,
cartography_config: CartographyConfig,
prowler_api_provider: ProwlerAPIProvider,
prowler_sdk_provider: ProwlerSDKProvider,
attack_paths_scan: ProwlerAPIAttackPathsScan,
) -> dict[str, dict[str, str]]:
"""
Code based on Cartography version 0.122.0, specifically on `cartography.intel.aws.__init__.py`.
For the scan progress updates:
- The caller of this function (`tasks.jobs.attack_paths.scan.run`) has set it to 2.
- When the control returns to the caller, it will be set to 95.
"""
# Initialize variables common to all jobs
common_job_parameters = {
"UPDATE_TAG": cartography_config.update_tag,
"permission_relationships_file": cartography_config.permission_relationships_file,
"aws_guardduty_severity_threshold": cartography_config.aws_guardduty_severity_threshold,
"aws_cloudtrail_management_events_lookback_hours": cartography_config.aws_cloudtrail_management_events_lookback_hours,
"experimental_aws_inspector_batch": cartography_config.experimental_aws_inspector_batch,
}
boto3_session = get_boto3_session(prowler_api_provider, prowler_sdk_provider)
regions: list[str] = list(prowler_sdk_provider._enabled_regions)
requested_syncs = list(cartography_aws.RESOURCE_FUNCTIONS.keys())
sync_args = cartography_aws._build_aws_sync_kwargs(
neo4j_session,
boto3_session,
regions,
prowler_api_provider.uid,
cartography_config.update_tag,
common_job_parameters,
)
# Starting with sync functions
cartography_aws.organizations.sync(
neo4j_session,
{prowler_api_provider.alias: prowler_api_provider.uid},
cartography_config.update_tag,
common_job_parameters,
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 3)
# Adding an extra field
common_job_parameters["AWS_ID"] = prowler_api_provider.uid
cartography_aws._autodiscover_accounts(
neo4j_session,
boto3_session,
prowler_api_provider.uid,
cartography_config.update_tag,
common_job_parameters,
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 4)
failed_syncs = sync_aws_account(
prowler_api_provider, requested_syncs, sync_args, attack_paths_scan
)
if "permission_relationships" in requested_syncs:
cartography_aws.RESOURCE_FUNCTIONS["permission_relationships"](**sync_args)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 88)
if "resourcegroupstaggingapi" in requested_syncs:
cartography_aws.RESOURCE_FUNCTIONS["resourcegroupstaggingapi"](**sync_args)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 89)
cartography_aws.run_scoped_analysis_job(
"aws_ec2_iaminstanceprofile.json",
neo4j_session,
common_job_parameters,
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 90)
cartography_aws.run_analysis_job(
"aws_lambda_ecr.json",
neo4j_session,
common_job_parameters,
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 91)
cartography_aws.merge_module_sync_metadata(
neo4j_session,
group_type="AWSAccount",
group_id=prowler_api_provider.uid,
synced_type="AWSAccount",
update_tag=cartography_config.update_tag,
stat_handler=cartography_aws.stat_handler,
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 92)
# Removing the added extra field
del common_job_parameters["AWS_ID"]
cartography_aws.run_cleanup_job(
"aws_post_ingestion_principals_cleanup.json",
neo4j_session,
common_job_parameters,
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 93)
cartography_aws._perform_aws_analysis(
requested_syncs, neo4j_session, common_job_parameters
)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 94)
return failed_syncs
def get_boto3_session(
prowler_api_provider: ProwlerAPIProvider, prowler_sdk_provider: ProwlerSDKProvider
) -> boto3.Session:
boto3_session = prowler_sdk_provider.session.current_session
aws_accounts_from_session = cartography_aws.organizations.get_aws_account_default(
boto3_session
)
if not aws_accounts_from_session:
raise Exception(
"No valid AWS credentials could be found. No AWS accounts can be synced."
)
aws_account_id_from_session = list(aws_accounts_from_session.values())[0]
if prowler_api_provider.uid != aws_account_id_from_session:
raise Exception(
f"Provider {prowler_api_provider.uid} doesn't match AWS account {aws_account_id_from_session}."
)
if boto3_session.region_name is None:
global_region = prowler_sdk_provider.get_global_region()
boto3_session._session.set_config_variable("region", global_region)
return boto3_session
def get_aioboto3_session(boto3_session: boto3.Session) -> aioboto3.Session:
return aioboto3.Session(botocore_session=boto3_session._session)
def sync_aws_account(
prowler_api_provider: ProwlerAPIProvider,
requested_syncs: list[str],
sync_args: dict[str, Any],
attack_paths_scan: ProwlerAPIAttackPathsScan,
) -> dict[str, str]:
current_progress = 4 # `cartography_aws._autodiscover_accounts`
max_progress = 87  # one below the 88 reported after `permission_relationships`
n_steps = (
len(requested_syncs) - 2
) # Excluding `permission_relationships` and `resourcegroupstaggingapi`
progress_step = (max_progress - current_progress) / n_steps
failed_syncs = {}
for func_name in requested_syncs:
if func_name in cartography_aws.RESOURCE_FUNCTIONS:
logger.info(
f"Syncing function {func_name} for AWS account {prowler_api_provider.uid}"
)
# Update progress here; approximate, but good enough per sync function.
# Clamp to `max_progress`: the two deferred sync functions also pass through
# this loop and would otherwise push the value past 87.
current_progress += progress_step
db_utils.update_attack_paths_scan_progress(
attack_paths_scan, min(int(current_progress), max_progress)
)
try:
# `ecr:image_layers` uses `aioboto3_session` instead of `boto3_session`
if func_name == "ecr:image_layers":
cartography_aws.RESOURCE_FUNCTIONS[func_name](
neo4j_session=sync_args.get("neo4j_session"),
aioboto3_session=get_aioboto3_session(
sync_args.get("boto3_session")
),
regions=sync_args.get("regions"),
current_aws_account_id=sync_args.get("current_aws_account_id"),
update_tag=sync_args.get("update_tag"),
common_job_parameters=sync_args.get("common_job_parameters"),
)
# Skip permission relationships and tags for now because they rely on data already being in the graph
elif func_name in [
"permission_relationships",
"resourcegroupstaggingapi",
]:
continue
else:
cartography_aws.RESOURCE_FUNCTIONS[func_name](**sync_args)
except Exception as e:
exception_message = utils.stringify_exception(
e, f"Exception for AWS sync function: {func_name}"
)
failed_syncs[func_name] = exception_message
logger.warning(
f"Caught exception syncing function {func_name} from AWS account {prowler_api_provider.uid}. We "
"are continuing on to the next AWS sync function.",
)
continue
else:
raise ValueError(
f'AWS sync function "{func_name}" was specified but does not exist. Did you misspell it?'
)
return failed_syncs
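For intuition, here is a minimal sketch of the progress interpolation above; the 70-function count is hypothetical, since the real list comes from Cartography's `RESOURCE_FUNCTIONS` registry:

```python
# Hypothetical sync list; the real names come from RESOURCE_FUNCTIONS.
requested_syncs = [f"sync_{i}" for i in range(70)]

current_progress = 4   # value reported after `_autodiscover_accounts`
max_progress = 87      # one below the 88 reported after `permission_relationships`
n_steps = len(requested_syncs) - 2  # two sync functions are deferred
progress_step = (max_progress - current_progress) / n_steps

for _ in requested_syncs:
    current_progress += progress_step
    reported = min(int(current_progress), max_progress)

print(reported)  # 87: progress ends exactly at the upper bound
```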
@@ -0,0 +1,158 @@
from datetime import datetime, timezone
from typing import Any
from uuid import UUID
from cartography.config import Config as CartographyConfig
from api.db_utils import rls_transaction
from api.models import (
AttackPathsScan as ProwlerAPIAttackPathsScan,
Provider as ProwlerAPIProvider,
StateChoices,
)
from tasks.jobs.attack_paths.providers import is_provider_available
def create_attack_paths_scan(
tenant_id: str,
scan_id: str,
provider_id: str,
) -> ProwlerAPIAttackPathsScan | None:
with rls_transaction(tenant_id):
prowler_api_provider = ProwlerAPIProvider.objects.get(id=provider_id)
if not is_provider_available(prowler_api_provider.provider):
return None
with rls_transaction(tenant_id):
attack_paths_scan = ProwlerAPIAttackPathsScan.objects.create(
tenant_id=tenant_id,
provider_id=provider_id,
scan_id=scan_id,
state=StateChoices.SCHEDULED,
started_at=datetime.now(tz=timezone.utc),
)
return attack_paths_scan
def retrieve_attack_paths_scan(
tenant_id: str,
scan_id: str,
) -> ProwlerAPIAttackPathsScan | None:
try:
with rls_transaction(tenant_id):
attack_paths_scan = ProwlerAPIAttackPathsScan.objects.get(
scan_id=scan_id,
)
return attack_paths_scan
except ProwlerAPIAttackPathsScan.DoesNotExist:
return None
def starting_attack_paths_scan(
attack_paths_scan: ProwlerAPIAttackPathsScan,
task_id: str,
cartography_config: CartographyConfig,
) -> None:
with rls_transaction(attack_paths_scan.tenant_id):
attack_paths_scan.task_id = task_id
attack_paths_scan.state = StateChoices.EXECUTING
attack_paths_scan.started_at = datetime.now(tz=timezone.utc)
attack_paths_scan.update_tag = cartography_config.update_tag
attack_paths_scan.graph_database = cartography_config.neo4j_database
attack_paths_scan.save(
update_fields=[
"task_id",
"state",
"started_at",
"update_tag",
"graph_database",
]
)
def finish_attack_paths_scan(
attack_paths_scan: ProwlerAPIAttackPathsScan,
state: StateChoices,
ingestion_exceptions: dict[str, Any],
) -> None:
with rls_transaction(attack_paths_scan.tenant_id):
now = datetime.now(tz=timezone.utc)
duration = int((now - attack_paths_scan.started_at).total_seconds())
attack_paths_scan.state = state
attack_paths_scan.progress = 100
attack_paths_scan.completed_at = now
attack_paths_scan.duration = duration
attack_paths_scan.ingestion_exceptions = ingestion_exceptions
attack_paths_scan.save(
update_fields=[
"state",
"progress",
"completed_at",
"duration",
"ingestion_exceptions",
]
)
def update_attack_paths_scan_progress(
attack_paths_scan: ProwlerAPIAttackPathsScan,
progress: int,
) -> None:
with rls_transaction(attack_paths_scan.tenant_id):
attack_paths_scan.progress = progress
attack_paths_scan.save(update_fields=["progress"])
def get_old_attack_paths_scans(
tenant_id: str,
provider_id: str,
attack_paths_scan_id: str,
) -> list[ProwlerAPIAttackPathsScan]:
"""
An `old_attack_paths_scan` is any `completed` Attack Paths scan for the same provider,
with its graph database not deleted, excluding the current Attack Paths scan.
"""
with rls_transaction(tenant_id):
completed_scans_qs = (
ProwlerAPIAttackPathsScan.objects.filter(
provider_id=provider_id,
state=StateChoices.COMPLETED,
is_graph_database_deleted=False,
)
.exclude(id=attack_paths_scan_id)
.all()
)
return list(completed_scans_qs)
def update_old_attack_paths_scan(
old_attack_paths_scan: ProwlerAPIAttackPathsScan,
) -> None:
with rls_transaction(old_attack_paths_scan.tenant_id):
old_attack_paths_scan.is_graph_database_deleted = True
old_attack_paths_scan.save(update_fields=["is_graph_database_deleted"])
def get_provider_graph_database_names(tenant_id: str, provider_id: str) -> list[str]:
"""
Return existing graph database names for a tenant/provider.
Note: the `all_objects` manager is required to access `AttackPathsScan` records because the provider is soft-deleted.
"""
with rls_transaction(tenant_id):
graph_databases_names_qs = ProwlerAPIAttackPathsScan.all_objects.filter(
provider_id=provider_id,
is_graph_database_deleted=False,
).values_list("graph_database", flat=True)
return list(graph_databases_names_qs)
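Taken together, these helpers drive the `AttackPathsScan` lifecycle. A simplified sketch of the intended call order, assuming `tenant_id`, `scan_id`, `provider_id`, `task_id`, and a `CartographyConfig` named `config` come from the surrounding task:

```python
# Sketch only, not a real task body.
scan = create_attack_paths_scan(tenant_id, scan_id, provider_id)
if scan is not None:  # None means the provider type is not supported
    starting_attack_paths_scan(scan, task_id, config)  # SCHEDULED -> EXECUTING
    update_attack_paths_scan_progress(scan, 50)        # periodic updates
    finish_attack_paths_scan(scan, StateChoices.COMPLETED, ingestion_exceptions={})
```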
@@ -0,0 +1,23 @@
AVAILABLE_PROVIDERS: list[str] = [
"aws",
]
ROOT_NODE_LABELS: dict[str, str] = {
"aws": "AWSAccount",
}
NODE_UID_FIELDS: dict[str, str] = {
"aws": "arn",
}
def is_provider_available(provider_type: str) -> bool:
return provider_type in AVAILABLE_PROVIDERS
def get_root_node_label(provider_type: str) -> str:
return ROOT_NODE_LABELS.get(provider_type, "UnknownProviderAccount")
def get_node_uid_field(provider_type: str) -> str:
return NODE_UID_FIELDS.get(provider_type, "UnknownProviderUID")
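A quick usage sketch of these lookups; the `gcp` calls simply exercise the fallback values:

```python
assert is_provider_available("aws")
assert not is_provider_available("gcp")

assert get_root_node_label("aws") == "AWSAccount"
assert get_node_uid_field("aws") == "arn"

# Unknown provider types fall back to sentinel values instead of raising.
assert get_root_node_label("gcp") == "UnknownProviderAccount"
assert get_node_uid_field("gcp") == "UnknownProviderUID"
```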
@@ -0,0 +1,205 @@
import neo4j
from cartography.client.core.tx import run_write_query
from cartography.config import Config as CartographyConfig
from celery.utils.log import get_task_logger
from api.db_utils import rls_transaction
from api.models import Provider, ResourceFindingMapping
from config.env import env
from prowler.config import config as ProwlerConfig
from tasks.jobs.attack_paths.providers import get_node_uid_field, get_root_node_label
logger = get_task_logger(__name__)
BATCH_SIZE = env.int("NEO4J_INSERT_BATCH_SIZE", 500)
INDEX_STATEMENTS = [
"CREATE INDEX prowler_finding_id IF NOT EXISTS FOR (n:ProwlerFinding) ON (n.id);",
"CREATE INDEX prowler_finding_provider_uid IF NOT EXISTS FOR (n:ProwlerFinding) ON (n.provider_uid);",
"CREATE INDEX prowler_finding_lastupdated IF NOT EXISTS FOR (n:ProwlerFinding) ON (n.lastupdated);",
"CREATE INDEX prowler_finding_check_id IF NOT EXISTS FOR (n:ProwlerFinding) ON (n.status);",
]
INSERT_STATEMENT_TEMPLATE = """
UNWIND $findings_data AS finding_data
MATCH (account:__ROOT_NODE_LABEL__ {id: $provider_uid})
MATCH (account)-->(resource)
WHERE resource.__NODE_UID_FIELD__ = finding_data.resource_uid
OR resource.id = finding_data.resource_uid
MERGE (finding:ProwlerFinding {id: finding_data.id})
ON CREATE SET
finding.id = finding_data.id,
finding.uid = finding_data.uid,
finding.inserted_at = finding_data.inserted_at,
finding.updated_at = finding_data.updated_at,
finding.first_seen_at = finding_data.first_seen_at,
finding.scan_id = finding_data.scan_id,
finding.delta = finding_data.delta,
finding.status = finding_data.status,
finding.status_extended = finding_data.status_extended,
finding.severity = finding_data.severity,
finding.check_id = finding_data.check_id,
finding.check_title = finding_data.check_title,
finding.muted = finding_data.muted,
finding.muted_reason = finding_data.muted_reason,
finding.provider_uid = $provider_uid,
finding.firstseen = timestamp(),
finding.lastupdated = $last_updated,
finding._module_name = 'cartography:prowler',
finding._module_version = $prowler_version
ON MATCH SET
finding.status = finding_data.status,
finding.status_extended = finding_data.status_extended,
finding.lastupdated = $last_updated
MERGE (resource)-[rel:HAS_FINDING]->(finding)
ON CREATE SET
rel.provider_uid = $provider_uid,
rel.firstseen = timestamp(),
rel.lastupdated = $last_updated,
rel._module_name = 'cartography:prowler',
rel._module_version = $prowler_version
ON MATCH SET
rel.lastupdated = $last_updated
"""
CLEANUP_STATEMENT = """
MATCH (finding:ProwlerFinding {provider_uid: $provider_uid})
WHERE finding.lastupdated < $last_updated
WITH finding LIMIT $batch_size
DETACH DELETE finding
RETURN COUNT(finding) AS deleted_findings_count
"""
def create_indexes(neo4j_session: neo4j.Session) -> None:
"""
Code based on Cartography version 0.122.0, specifically on `cartography.intel.create_indexes.run`.
"""
logger.info("Creating indexes for Prowler node types.")
for statement in INDEX_STATEMENTS:
logger.debug("Executing statement: %s", statement)
run_write_query(neo4j_session, statement)
def analysis(
neo4j_session: neo4j.Session,
prowler_api_provider: Provider,
scan_id: str,
config: CartographyConfig,
) -> None:
findings_data = get_provider_last_scan_findings(prowler_api_provider, scan_id)
load_findings(neo4j_session, findings_data, prowler_api_provider, config)
cleanup_findings(neo4j_session, prowler_api_provider, config)
def get_provider_last_scan_findings(
prowler_api_provider: Provider,
scan_id: str,
) -> list[dict[str, str]]:
with rls_transaction(prowler_api_provider.tenant_id):
resource_finding_qs = ResourceFindingMapping.objects.filter(
finding__scan_id=scan_id,
).values(
"resource__uid",
"finding__id",
"finding__uid",
"finding__inserted_at",
"finding__updated_at",
"finding__first_seen_at",
"finding__scan_id",
"finding__delta",
"finding__status",
"finding__status_extended",
"finding__severity",
"finding__check_id",
"finding__check_metadata__checktitle",
"finding__muted",
"finding__muted_reason",
)
findings = []
for resource_finding in resource_finding_qs:
findings.append(
{
"resource_uid": str(resource_finding["resource__uid"]),
"id": str(resource_finding["finding__id"]),
"uid": resource_finding["finding__uid"],
"inserted_at": resource_finding["finding__inserted_at"],
"updated_at": resource_finding["finding__updated_at"],
"first_seen_at": resource_finding["finding__first_seen_at"],
"scan_id": str(resource_finding["finding__scan_id"]),
"delta": resource_finding["finding__delta"],
"status": resource_finding["finding__status"],
"status_extended": resource_finding["finding__status_extended"],
"severity": resource_finding["finding__severity"],
"check_id": str(resource_finding["finding__check_id"]),
"check_title": resource_finding[
"finding__check_metadata__checktitle"
],
"muted": resource_finding["finding__muted"],
"muted_reason": resource_finding["finding__muted_reason"],
}
)
return findings
def load_findings(
neo4j_session: neo4j.Session,
findings_data: list[dict[str, str]],
prowler_api_provider: Provider,
config: CartographyConfig,
) -> None:
replacements = {
"__ROOT_NODE_LABEL__": get_root_node_label(prowler_api_provider.provider),
"__NODE_UID_FIELD__": get_node_uid_field(prowler_api_provider.provider),
}
query = INSERT_STATEMENT_TEMPLATE
for replace_key, replace_value in replacements.items():
query = query.replace(replace_key, replace_value)
parameters = {
"provider_uid": str(prowler_api_provider.uid),
"last_updated": config.update_tag,
"prowler_version": ProwlerConfig.prowler_version,
}
total_length = len(findings_data)
for i in range(0, total_length, BATCH_SIZE):
parameters["findings_data"] = findings_data[i : i + BATCH_SIZE]
logger.info(
f"Loading findings batch {i // BATCH_SIZE + 1} / {(total_length + BATCH_SIZE - 1) // BATCH_SIZE}"
)
neo4j_session.run(query, parameters)
def cleanup_findings(
neo4j_session: neo4j.Session,
prowler_api_provider: Provider,
config: CartographyConfig,
) -> None:
parameters = {
"provider_uid": str(prowler_api_provider.uid),
"last_updated": config.update_tag,
"batch_size": BATCH_SIZE,
}
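# Delete stale findings in bounded batches: `LIMIT $batch_size` in
# CLEANUP_STATEMENT keeps each transaction small, and the loop below
# stops once a batch deletes zero rows.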
batch = 1
deleted_count = 1
while deleted_count > 0:
logger.info(f"Cleaning findings batch {batch}")
result = neo4j_session.run(CLEANUP_STATEMENT, parameters)
deleted_count = result.single().get("deleted_findings_count", 0)
batch += 1
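For reference, here is roughly what the placeholder substitution in `load_findings` produces for an AWS provider. Cypher cannot bind node labels or property names as query parameters, which is why plain string replacement is used (the template below is abridged):

```python
template = (
    "MATCH (account:__ROOT_NODE_LABEL__ {id: $provider_uid}) "
    "MATCH (account)-->(resource) "
    "WHERE resource.__NODE_UID_FIELD__ = finding_data.resource_uid"
)

# Labels and property names cannot be Cypher parameters, hence replace().
replacements = {"__ROOT_NODE_LABEL__": "AWSAccount", "__NODE_UID_FIELD__": "arn"}
query = template
for key, value in replacements.items():
    query = query.replace(key, value)

print(query)
# MATCH (account:AWSAccount {id: $provider_uid}) MATCH (account)-->(resource)
# WHERE resource.arn = finding_data.resource_uid
```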
@@ -0,0 +1,183 @@
import logging
import time
import asyncio
from typing import Any, Callable
from cartography.config import Config as CartographyConfig
from cartography.intel import analysis as cartography_analysis
from cartography.intel import create_indexes as cartography_create_indexes
from cartography.intel import ontology as cartography_ontology
from celery.utils.log import get_task_logger
from api.attack_paths import database as graph_database
from api.db_utils import rls_transaction
from api.models import (
Provider as ProwlerAPIProvider,
StateChoices,
)
from api.utils import initialize_prowler_provider
from tasks.jobs.attack_paths import aws, db_utils, prowler, utils
# Without this, Celery's output gets flooded by Cartography's logging
logging.getLogger("cartography").setLevel(logging.ERROR)
logging.getLogger("neo4j").propagate = False
logger = get_task_logger(__name__)
CARTOGRAPHY_INGESTION_FUNCTIONS: dict[str, Callable] = {
"aws": aws.start_aws_ingestion,
}
def get_cartography_ingestion_function(provider_type: str) -> Callable | None:
return CARTOGRAPHY_INGESTION_FUNCTIONS.get(provider_type)
def run(tenant_id: str, scan_id: str, task_id: str) -> dict[str, Any]:
"""
Code based on Cartography version 0.122.0, specifically on `cartography.cli.main`, `cartography.cli.CLI.main`,
`cartography.sync.run_with_config` and `cartography.sync.Sync.run`.
"""
ingestion_exceptions = {} # This will hold any exceptions raised during ingestion
# Prowler necessary objects
with rls_transaction(tenant_id):
prowler_api_provider = ProwlerAPIProvider.objects.get(scan__pk=scan_id)
prowler_sdk_provider = initialize_prowler_provider(prowler_api_provider)
# Attack Paths Scan necessary objects
cartography_ingestion_function = get_cartography_ingestion_function(
prowler_api_provider.provider
)
attack_paths_scan = db_utils.retrieve_attack_paths_scan(tenant_id, scan_id)
# Checks before starting the scan
if not cartography_ingestion_function:
ingestion_exceptions = {
"global_error": f"Provider {prowler_api_provider.provider} is not supported for Attack Paths scans"
}
if attack_paths_scan:
db_utils.finish_attack_paths_scan(
attack_paths_scan, StateChoices.COMPLETED, ingestion_exceptions
)
logger.warning(
f"Provider {prowler_api_provider.provider} is not supported for Attack Paths scans"
)
return ingestion_exceptions
else:
if not attack_paths_scan:
logger.warning(
f"No Attack Paths Scan found for scan {scan_id} and tenant {tenant_id}, let's create it then"
)
attack_paths_scan = db_utils.create_attack_paths_scan(
tenant_id, scan_id, prowler_api_provider.id
)
# The `neo4j_user` and `neo4j_password` attributes are not needed in this Cartography configuration object
cartography_config = CartographyConfig(
neo4j_uri=graph_database.get_uri(),
neo4j_database=graph_database.get_database_name(attack_paths_scan.id),
update_tag=int(time.time()),
)
# Starting the Attack Paths scan
db_utils.starting_attack_paths_scan(attack_paths_scan, task_id, cartography_config)
try:
logger.info(
f"Creating Neo4j database {cartography_config.neo4j_database} for tenant {prowler_api_provider.tenant_id}"
)
graph_database.create_database(cartography_config.neo4j_database)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 1)
logger.info(
f"Starting Cartography ({attack_paths_scan.id}) for "
f"{prowler_api_provider.provider.upper()} provider {prowler_api_provider.id}"
)
with graph_database.get_session(
cartography_config.neo4j_database
) as neo4j_session:
# Indexes creation
cartography_create_indexes.run(neo4j_session, cartography_config)
prowler.create_indexes(neo4j_session)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 2)
# The real scan, which iterates over the cloud services
ingestion_exceptions = _call_within_event_loop(
cartography_ingestion_function,
neo4j_session,
cartography_config,
prowler_api_provider,
prowler_sdk_provider,
attack_paths_scan,
)
# Post-processing: kept to stay aligned with Cartography's standard sync flow
cartography_ontology.run(neo4j_session, cartography_config)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 95)
cartography_analysis.run(neo4j_session, cartography_config)
db_utils.update_attack_paths_scan_progress(attack_paths_scan, 96)
# Adding Prowler nodes and relationships
prowler.analysis(
neo4j_session, prowler_api_provider, scan_id, cartography_config
)
logger.info(
f"Completed Cartography ({attack_paths_scan.id}) for "
f"{prowler_api_provider.provider.upper()} provider {prowler_api_provider.id}"
)
# Handling databases changes
old_attack_paths_scans = db_utils.get_old_attack_paths_scans(
prowler_api_provider.tenant_id,
prowler_api_provider.id,
attack_paths_scan.id,
)
for old_attack_paths_scan in old_attack_paths_scans:
graph_database.drop_database(old_attack_paths_scan.graph_database)
db_utils.update_old_attack_paths_scan(old_attack_paths_scan)
db_utils.finish_attack_paths_scan(
attack_paths_scan, StateChoices.COMPLETED, ingestion_exceptions
)
return ingestion_exceptions
except Exception as e:
exception_message = utils.stringify_exception(e, "Cartography failed")
logger.error(exception_message)
ingestion_exceptions["global_cartography_error"] = exception_message
# Handling databases changes
graph_database.drop_database(cartography_config.neo4j_database)
db_utils.finish_attack_paths_scan(
attack_paths_scan, StateChoices.FAILED, ingestion_exceptions
)
raise
def _call_within_event_loop(fn, *args, **kwargs):
"""
Cartography needs a running event loop. Since there is normally none (Celery task or even a regular DRF endpoint),
create a new one and set it as the current event loop for this thread.
"""
loop = asyncio.new_event_loop()
try:
asyncio.set_event_loop(loop)
return fn(*args, **kwargs)
finally:
try:
loop.run_until_complete(loop.shutdown_asyncgens())
except Exception:
pass
loop.close()
asyncio.set_event_loop(None)
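A minimal usage sketch, assuming a synchronous callable that expects a current event loop to be set for the thread:

```python
import asyncio

def ingestion_stub() -> str:
    # Stand-in for a Cartography sync that calls asyncio.get_event_loop().
    loop = asyncio.get_event_loop()
    assert not loop.is_running()
    return "synced"

assert _call_within_event_loop(ingestion_stub) == "synced"
```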
@@ -0,0 +1,10 @@
import traceback
from datetime import datetime, timezone
def stringify_exception(exception: Exception, context: str) -> str:
timestamp = datetime.now(tz=timezone.utc)
exception_traceback = traceback.TracebackException.from_exception(exception)
traceback_string = "".join(exception_traceback.format())
return f"{timestamp} - {context}\n{traceback_string}"
@@ -1,9 +1,19 @@
from celery.utils.log import get_task_logger
from django.db import DatabaseError
from api.attack_paths import database as graph_database
from api.db_router import MainRouter
from api.db_utils import batch_delete, rls_transaction
from api.models import Finding, Provider, Resource, Scan, ScanSummary, Tenant
from api.models import (
AttackPathsScan,
Finding,
Provider,
Resource,
Scan,
ScanSummary,
Tenant,
)
from tasks.jobs.attack_paths.db_utils import get_provider_graph_database_names
logger = get_task_logger(__name__)
@@ -23,16 +33,27 @@ def delete_provider(tenant_id: str, pk: str):
Raises:
Provider.DoesNotExist: If no instance with the provided primary key exists.
"""
# Delete the Attack Paths' graph databases related to the provider
graph_database_names = get_provider_graph_database_names(tenant_id, pk)
try:
for graph_database_name in graph_database_names:
graph_database.drop_database(graph_database_name)
except graph_database.GraphDatabaseQueryException as gdb_error:
logger.error(f"Error deleting Provider databases: {gdb_error}")
raise
# Get all provider related data and delete them in batches
with rls_transaction(tenant_id):
instance = Provider.all_objects.get(pk=pk)
deletion_summary = {}
deletion_steps = [
("Scan Summaries", ScanSummary.all_objects.filter(scan__provider=instance)),
("Findings", Finding.all_objects.filter(scan__provider=instance)),
("Resources", Resource.all_objects.filter(provider=instance)),
("Scans", Scan.all_objects.filter(provider=instance)),
("AttackPathsScans", AttackPathsScan.all_objects.filter(provider=instance)),
]
deletion_summary = {}
for step_name, queryset in deletion_steps:
try:
_, step_summary = batch_delete(tenant_id, queryset)
@@ -48,6 +69,7 @@ def delete_provider(tenant_id: str, pk: str):
except DatabaseError as db_error:
logger.error(f"Error deleting Provider: {db_error}")
raise
return deletion_summary
@@ -1,6 +1,7 @@
import io
import os
from collections import defaultdict
from functools import partial
from pathlib import Path
from shutil import rmtree
@@ -772,7 +773,9 @@ def _create_section_score_chart(
return buffer
def _add_pdf_footer(canvas_obj: canvas.Canvas, doc: SimpleDocTemplate) -> None:
def _add_pdf_footer(
canvas_obj: canvas.Canvas, doc: SimpleDocTemplate, compliance_name: str
) -> None:
"""
Add footer with page number and branding to each page of the PDF.
@@ -782,7 +785,9 @@ def _add_pdf_footer(canvas_obj: canvas.Canvas, doc: SimpleDocTemplate) -> None:
"""
canvas_obj.saveState()
width, height = doc.pagesize
page_num_text = f"Página {doc.page}"
page_num_text = (
f"{'Página' if 'ens' in compliance_name.lower() else 'Page'} {doc.page}"
)
canvas_obj.setFont("PlusJakartaSans", 9)
canvas_obj.setFillColorRGB(0.4, 0.4, 0.4)
canvas_obj.drawString(30, 20, page_num_text)
@@ -1595,7 +1600,11 @@ def generate_threatscore_report(
elements.append(PageBreak())
# Build the PDF
doc.build(elements, onFirstPage=_add_pdf_footer, onLaterPages=_add_pdf_footer)
doc.build(
elements,
onFirstPage=partial(_add_pdf_footer, compliance_name=compliance_name),
onLaterPages=partial(_add_pdf_footer, compliance_name=compliance_name),
)
except Exception as e:
tb_lineno = e.__traceback__.tb_lineno if e.__traceback__ else "unknown"
logger.info(f"Error building the document, line {tb_lineno} -- {e}")
@@ -2818,7 +2827,11 @@ def generate_ens_report(
# Build the PDF
logger.info("Building PDF...")
doc.build(elements, onFirstPage=_add_pdf_footer, onLaterPages=_add_pdf_footer)
doc.build(
elements,
onFirstPage=partial(_add_pdf_footer, compliance_name=compliance_name),
onLaterPages=partial(_add_pdf_footer, compliance_name=compliance_name),
)
except Exception as e:
tb_lineno = e.__traceback__.tb_lineno if e.__traceback__ else "unknown"
logger.error(f"Error building ENS report, line {tb_lineno} -- {e}")
@@ -3365,7 +3378,11 @@ def generate_nis2_report(
# Build the PDF
logger.info("Building NIS2 PDF...")
doc.build(elements, onFirstPage=_add_pdf_footer, onLaterPages=_add_pdf_footer)
doc.build(
elements,
onFirstPage=partial(_add_pdf_footer, compliance_name=compliance_name),
onLaterPages=partial(_add_pdf_footer, compliance_name=compliance_name),
)
logger.info(f"NIS2 report successfully generated at {output_path}")
except Exception as e:
@@ -1,13 +1,26 @@
import os
from datetime import datetime, timedelta, timezone
from pathlib import Path
from shutil import rmtree
from celery import chain, group, shared_task
from celery.utils.log import get_task_logger
from django_celery_beat.models import PeriodicTask
from api.compliance import get_compliance_frameworks
from api.db_router import READ_REPLICA_ALIAS
from api.db_utils import rls_transaction
from api.decorators import set_tenant
from api.models import Finding, Integration, Provider, Scan, ScanSummary, StateChoices
from api.utils import initialize_prowler_provider
from api.v1.serializers import ScanTaskSerializer
from config.celery import RLSTask
from config.django.base import DJANGO_FINDINGS_BATCH_SIZE, DJANGO_TMP_OUTPUT_DIRECTORY
from django_celery_beat.models import PeriodicTask
from prowler.lib.check.compliance_models import Compliance
from prowler.lib.outputs.compliance.generic.generic import GenericCompliance
from prowler.lib.outputs.finding import Finding as FindingOutput
from tasks.jobs.attack_paths import attack_paths_scan
from tasks.jobs.backfill import (
backfill_compliance_summaries,
backfill_resource_scan_summaries,
@@ -43,17 +56,6 @@ from tasks.jobs.scan import (
)
from tasks.utils import batched, get_next_execution_datetime
from api.compliance import get_compliance_frameworks
from api.db_router import READ_REPLICA_ALIAS
from api.db_utils import rls_transaction
from api.decorators import set_tenant
from api.models import Finding, Integration, Provider, Scan, ScanSummary, StateChoices
from api.utils import initialize_prowler_provider
from api.v1.serializers import ScanTaskSerializer
from prowler.lib.check.compliance_models import Compliance
from prowler.lib.outputs.compliance.generic.generic import GenericCompliance
from prowler.lib.outputs.finding import Finding as FindingOutput
logger = get_task_logger(__name__)
@@ -86,6 +88,9 @@ def _perform_scan_complete_tasks(tenant_id: str, scan_id: str, provider_id: str)
),
),
).apply_async()
perform_attack_paths_scan_task.apply_async(
kwargs={"tenant_id": tenant_id, "scan_id": scan_id}
)
@shared_task(base=RLSTask, name="provider-connection-check")
@@ -281,6 +286,25 @@ def perform_scan_summary_task(tenant_id: str, scan_id: str):
return aggregate_findings(tenant_id=tenant_id, scan_id=scan_id)
# TODO: This task must be queued on the `attack-paths` queue; remember to add it to the `docker-entrypoint.sh` file
@shared_task(base=RLSTask, bind=True, name="attack-paths-scan-perform", queue="scans")
def perform_attack_paths_scan_task(self, tenant_id: str, scan_id: str):
"""
Execute an Attack Paths scan for the given provider within the current tenant RLS context.
Args:
self: The task instance (automatically passed when bind=True).
tenant_id (str): The tenant identifier for RLS context.
scan_id (str): The Prowler scan identifier for obtaining the tenant and provider context.
Returns:
Any: The result from `attack_paths_scan`, including any per-scan failure details.
"""
return attack_paths_scan(
tenant_id=tenant_id, scan_id=scan_id, task_id=self.request.id
)
@shared_task(name="tenant-deletion", queue="deletion", autoretry_for=(Exception,))
def delete_tenant_task(tenant_id: str):
return delete_tenant(pk=tenant_id)
@@ -0,0 +1,416 @@
from contextlib import nullcontext
from types import SimpleNamespace
from unittest.mock import MagicMock, call, patch
import pytest
from api.models import (
AttackPathsScan,
Finding,
Provider,
Resource,
ResourceFindingMapping,
Scan,
StateChoices,
StatusChoices,
)
from prowler.lib.check.models import Severity
from tasks.jobs.attack_paths import prowler as prowler_module
from tasks.jobs.attack_paths.scan import run as attack_paths_run
@pytest.mark.django_db
class TestAttackPathsRun:
def test_run_success_flow(self, tenants_fixture, providers_fixture, scans_fixture):
tenant = tenants_fixture[0]
provider = providers_fixture[0]
provider.provider = Provider.ProviderChoices.AWS
provider.save()
scan = scans_fixture[0]
scan.provider = provider
scan.save()
attack_paths_scan = AttackPathsScan.objects.create(
tenant_id=tenant.id,
provider=provider,
scan=scan,
state=StateChoices.SCHEDULED,
)
mock_session = MagicMock()
session_ctx = MagicMock()
session_ctx.__enter__.return_value = mock_session
session_ctx.__exit__.return_value = False
ingestion_result = {"organizations": "warning"}
ingestion_fn = MagicMock(return_value=ingestion_result)
with (
patch(
"tasks.jobs.attack_paths.scan.rls_transaction",
new=lambda *args, **kwargs: nullcontext(),
),
patch(
"tasks.jobs.attack_paths.scan.initialize_prowler_provider",
return_value=MagicMock(_enabled_regions=["us-east-1"]),
),
patch(
"tasks.jobs.attack_paths.scan.graph_database.get_uri",
return_value="bolt://neo4j",
),
patch(
"tasks.jobs.attack_paths.scan.graph_database.get_database_name",
return_value="db-scan-id",
) as mock_get_db_name,
patch(
"tasks.jobs.attack_paths.scan.graph_database.create_database"
) as mock_create_db,
patch(
"tasks.jobs.attack_paths.scan.graph_database.get_session",
return_value=session_ctx,
) as mock_get_session,
patch(
"tasks.jobs.attack_paths.scan.cartography_create_indexes.run"
) as mock_cartography_indexes,
patch(
"tasks.jobs.attack_paths.scan.cartography_analysis.run"
) as mock_cartography_analysis,
patch(
"tasks.jobs.attack_paths.scan.cartography_ontology.run"
) as mock_cartography_ontology,
patch(
"tasks.jobs.attack_paths.scan.prowler.create_indexes"
) as mock_prowler_indexes,
patch(
"tasks.jobs.attack_paths.scan.prowler.analysis"
) as mock_prowler_analysis,
patch(
"tasks.jobs.attack_paths.scan.db_utils.retrieve_attack_paths_scan",
return_value=attack_paths_scan,
) as mock_retrieve_scan,
patch(
"tasks.jobs.attack_paths.scan.db_utils.starting_attack_paths_scan"
) as mock_starting,
patch(
"tasks.jobs.attack_paths.scan.db_utils.update_attack_paths_scan_progress"
) as mock_update_progress,
patch(
"tasks.jobs.attack_paths.scan.db_utils.finish_attack_paths_scan"
) as mock_finish,
patch(
"tasks.jobs.attack_paths.scan.get_cartography_ingestion_function",
return_value=ingestion_fn,
) as mock_get_ingestion,
patch(
"tasks.jobs.attack_paths.scan._call_within_event_loop",
side_effect=lambda fn, *a, **kw: fn(*a, **kw),
) as mock_event_loop,
):
result = attack_paths_run(str(tenant.id), str(scan.id), "task-123")
assert result == ingestion_result
mock_retrieve_scan.assert_called_once_with(str(tenant.id), str(scan.id))
mock_starting.assert_called_once()
config = mock_starting.call_args[0][2]
assert config.neo4j_database == "db-scan-id"
mock_create_db.assert_called_once_with("db-scan-id")
mock_get_session.assert_called_once_with("db-scan-id")
mock_cartography_indexes.assert_called_once_with(mock_session, config)
mock_prowler_indexes.assert_called_once_with(mock_session)
mock_cartography_analysis.assert_called_once_with(mock_session, config)
mock_cartography_ontology.assert_called_once_with(mock_session, config)
mock_prowler_analysis.assert_called_once_with(
mock_session,
provider,
str(scan.id),
config,
)
assert mock_get_ingestion.call_args_list == [
call(provider.provider),
call(provider.provider),
]
mock_event_loop.assert_called_once()
mock_update_progress.assert_any_call(attack_paths_scan, 1)
mock_update_progress.assert_any_call(attack_paths_scan, 2)
mock_update_progress.assert_any_call(attack_paths_scan, 95)
mock_finish.assert_called_once_with(
attack_paths_scan, StateChoices.COMPLETED, ingestion_result
)
mock_get_db_name.assert_called_once_with(attack_paths_scan.id)
def test_run_failure_marks_scan_failed(
self, tenants_fixture, providers_fixture, scans_fixture
):
tenant = tenants_fixture[0]
provider = providers_fixture[0]
provider.provider = Provider.ProviderChoices.AWS
provider.save()
scan = scans_fixture[0]
scan.provider = provider
scan.save()
attack_paths_scan = AttackPathsScan.objects.create(
tenant_id=tenant.id,
provider=provider,
scan=scan,
state=StateChoices.SCHEDULED,
)
mock_session = MagicMock()
session_ctx = MagicMock()
session_ctx.__enter__.return_value = mock_session
session_ctx.__exit__.return_value = False
ingestion_fn = MagicMock(side_effect=RuntimeError("ingestion boom"))
with (
patch(
"tasks.jobs.attack_paths.scan.rls_transaction",
new=lambda *args, **kwargs: nullcontext(),
),
patch(
"tasks.jobs.attack_paths.scan.initialize_prowler_provider",
return_value=MagicMock(_enabled_regions=["us-east-1"]),
),
patch("tasks.jobs.attack_paths.scan.graph_database.get_uri"),
patch(
"tasks.jobs.attack_paths.scan.graph_database.get_database_name",
return_value="db-scan-id",
),
patch("tasks.jobs.attack_paths.scan.graph_database.create_database"),
patch(
"tasks.jobs.attack_paths.scan.graph_database.get_session",
return_value=session_ctx,
),
patch("tasks.jobs.attack_paths.scan.cartography_create_indexes.run"),
patch("tasks.jobs.attack_paths.scan.cartography_analysis.run"),
patch("tasks.jobs.attack_paths.scan.prowler.create_indexes"),
patch("tasks.jobs.attack_paths.scan.prowler.analysis"),
patch(
"tasks.jobs.attack_paths.scan.db_utils.retrieve_attack_paths_scan",
return_value=attack_paths_scan,
),
patch("tasks.jobs.attack_paths.scan.db_utils.starting_attack_paths_scan"),
patch(
"tasks.jobs.attack_paths.scan.db_utils.update_attack_paths_scan_progress"
),
patch(
"tasks.jobs.attack_paths.scan.db_utils.finish_attack_paths_scan"
) as mock_finish,
patch(
"tasks.jobs.attack_paths.scan.get_cartography_ingestion_function",
return_value=ingestion_fn,
),
patch(
"tasks.jobs.attack_paths.scan._call_within_event_loop",
side_effect=lambda fn, *a, **kw: fn(*a, **kw),
),
patch(
"tasks.jobs.attack_paths.scan.utils.stringify_exception",
return_value="Cartography failed: ingestion boom",
),
):
with pytest.raises(RuntimeError, match="ingestion boom"):
attack_paths_run(str(tenant.id), str(scan.id), "task-456")
failure_args = mock_finish.call_args[0]
assert failure_args[0] is attack_paths_scan
assert failure_args[1] == StateChoices.FAILED
assert failure_args[2] == {
"global_cartography_error": "Cartography failed: ingestion boom"
}
def test_run_returns_early_for_unsupported_provider(self, tenants_fixture):
tenant = tenants_fixture[0]
provider = Provider.objects.create(
provider=Provider.ProviderChoices.GCP,
uid="gcp-account",
alias="gcp",
tenant_id=tenant.id,
)
scan = Scan.objects.create(
name="GCP Scan",
provider=provider,
trigger=Scan.TriggerChoices.MANUAL,
state=StateChoices.AVAILABLE,
tenant_id=tenant.id,
)
with (
patch(
"tasks.jobs.attack_paths.scan.rls_transaction",
new=lambda *args, **kwargs: nullcontext(),
),
patch(
"tasks.jobs.attack_paths.scan.initialize_prowler_provider",
return_value=MagicMock(),
),
patch(
"tasks.jobs.attack_paths.scan.get_cartography_ingestion_function",
return_value=None,
) as mock_get_ingestion,
patch(
"tasks.jobs.attack_paths.scan.db_utils.retrieve_attack_paths_scan"
) as mock_retrieve,
):
result = attack_paths_run(str(tenant.id), str(scan.id), "task-789")
assert result == {}
mock_get_ingestion.assert_called_once_with(provider.provider)
mock_retrieve.assert_not_called()
@pytest.mark.django_db
class TestAttackPathsProwlerHelpers:
def test_create_indexes_executes_all_statements(self):
mock_session = MagicMock()
with patch("tasks.jobs.attack_paths.prowler.run_write_query") as mock_run_write:
prowler_module.create_indexes(mock_session)
assert mock_run_write.call_count == len(prowler_module.INDEX_STATEMENTS)
mock_run_write.assert_has_calls(
[call(mock_session, stmt) for stmt in prowler_module.INDEX_STATEMENTS]
)
def test_load_findings_batches_requests(self, providers_fixture):
provider = providers_fixture[0]
provider.provider = Provider.ProviderChoices.AWS
provider.save()
findings = [
{"id": "1", "resource_uid": "r-1"},
{"id": "2", "resource_uid": "r-2"},
]
config = SimpleNamespace(update_tag=12345)
mock_session = MagicMock()
with (
patch.object(prowler_module, "BATCH_SIZE", 1),
patch(
"tasks.jobs.attack_paths.prowler.get_root_node_label",
return_value="AWSAccount",
),
patch(
"tasks.jobs.attack_paths.prowler.get_node_uid_field",
return_value="arn",
),
):
prowler_module.load_findings(mock_session, findings, provider, config)
assert mock_session.run.call_count == 2
for call_args in mock_session.run.call_args_list:
params = call_args.args[1]
assert params["provider_uid"] == str(provider.uid)
assert params["last_updated"] == config.update_tag
assert "findings_data" in params
def test_cleanup_findings_runs_batches(self, providers_fixture):
provider = providers_fixture[0]
config = SimpleNamespace(update_tag=1024)
mock_session = MagicMock()
first_batch = MagicMock()
first_batch.single.return_value = {"deleted_findings_count": 3}
second_batch = MagicMock()
second_batch.single.return_value = {"deleted_findings_count": 0}
mock_session.run.side_effect = [first_batch, second_batch]
prowler_module.cleanup_findings(mock_session, provider, config)
assert mock_session.run.call_count == 2
params = mock_session.run.call_args.args[1]
assert params["provider_uid"] == str(provider.uid)
assert params["last_updated"] == config.update_tag
def test_get_provider_last_scan_findings_returns_latest_scan_data(
self,
tenants_fixture,
providers_fixture,
):
tenant = tenants_fixture[0]
provider = providers_fixture[0]
provider.provider = Provider.ProviderChoices.AWS
provider.save()
resource = Resource.objects.create(
tenant_id=tenant.id,
provider=provider,
uid="resource-uid",
name="Resource",
region="us-east-1",
service="ec2",
type="instance",
)
older_scan = Scan.objects.create(
name="Older",
provider=provider,
trigger=Scan.TriggerChoices.MANUAL,
state=StateChoices.COMPLETED,
tenant_id=tenant.id,
)
old_finding = Finding.objects.create(
tenant_id=tenant.id,
uid="older-finding",
scan=older_scan,
delta=Finding.DeltaChoices.NEW,
status=StatusChoices.PASS,
status_extended="ok",
severity=Severity.low,
impact=Severity.low,
impact_extended="",
raw_result={},
check_id="check-old",
check_metadata={"checktitle": "Old"},
first_seen_at=older_scan.inserted_at,
)
ResourceFindingMapping.objects.create(
tenant_id=tenant.id,
resource=resource,
finding=old_finding,
)
latest_scan = Scan.objects.create(
name="Latest",
provider=provider,
trigger=Scan.TriggerChoices.MANUAL,
state=StateChoices.COMPLETED,
tenant_id=tenant.id,
)
finding = Finding.objects.create(
tenant_id=tenant.id,
uid="finding-uid",
scan=latest_scan,
delta=Finding.DeltaChoices.NEW,
status=StatusChoices.FAIL,
status_extended="failed",
severity=Severity.high,
impact=Severity.high,
impact_extended="",
raw_result={},
check_id="check-1",
check_metadata={"checktitle": "Check title"},
first_seen_at=latest_scan.inserted_at,
)
ResourceFindingMapping.objects.create(
tenant_id=tenant.id,
resource=resource,
finding=finding,
)
latest_scan.refresh_from_db()
with patch(
"tasks.jobs.attack_paths.prowler.rls_transaction",
new=lambda *args, **kwargs: nullcontext(),
):
findings_data = prowler_module.get_provider_last_scan_findings(
provider,
str(latest_scan.id),
)
assert len(findings_data) == 1
finding_dict = findings_data[0]
assert finding_dict["id"] == str(finding.id)
assert finding_dict["resource_uid"] == resource.uid
assert finding_dict["check_title"] == "Check title"
assert finding_dict["scan_id"] == str(latest_scan.id)
@@ -1,27 +1,60 @@
from unittest.mock import call, patch
import pytest
from django.core.exceptions import ObjectDoesNotExist
from tasks.jobs.deletion import delete_provider, delete_tenant
from api.models import Provider, Tenant
from tasks.jobs.deletion import delete_provider, delete_tenant
@pytest.mark.django_db
class TestDeleteProvider:
def test_delete_provider_success(self, providers_fixture):
instance = providers_fixture[0]
tenant_id = str(instance.tenant_id)
result = delete_provider(tenant_id, instance.id)
with patch(
"tasks.jobs.deletion.get_provider_graph_database_names"
) as mock_get_provider_graph_database_names, patch(
"tasks.jobs.deletion.graph_database.drop_database"
) as mock_drop_database:
graph_db_names = ["graph-db-1", "graph-db-2"]
mock_get_provider_graph_database_names.return_value = graph_db_names
assert result
with pytest.raises(ObjectDoesNotExist):
Provider.objects.get(pk=instance.id)
instance = providers_fixture[0]
tenant_id = str(instance.tenant_id)
result = delete_provider(tenant_id, instance.id)
assert result
with pytest.raises(ObjectDoesNotExist):
Provider.objects.get(pk=instance.id)
mock_get_provider_graph_database_names.assert_called_once_with(
tenant_id, instance.id
)
mock_drop_database.assert_has_calls(
[call(graph_db_name) for graph_db_name in graph_db_names]
)
def test_delete_provider_does_not_exist(self, tenants_fixture):
tenant_id = str(tenants_fixture[0].id)
non_existent_pk = "babf6796-cfcc-4fd3-9dcf-88d012247645"
with patch(
"tasks.jobs.deletion.get_provider_graph_database_names"
) as mock_get_provider_graph_database_names, patch(
"tasks.jobs.deletion.graph_database.drop_database"
) as mock_drop_database:
graph_db_names = ["graph-db-1"]
mock_get_provider_graph_database_names.return_value = graph_db_names
with pytest.raises(ObjectDoesNotExist):
delete_provider(tenant_id, non_existent_pk)
tenant_id = str(tenants_fixture[0].id)
non_existent_pk = "babf6796-cfcc-4fd3-9dcf-88d012247645"
with pytest.raises(ObjectDoesNotExist):
delete_provider(tenant_id, non_existent_pk)
mock_get_provider_graph_database_names.assert_called_once_with(
tenant_id, non_existent_pk
)
mock_drop_database.assert_has_calls(
[call(graph_db_name) for graph_db_name in graph_db_names]
)
@pytest.mark.django_db
@@ -30,33 +63,68 @@ class TestDeleteTenant:
"""
Test successful deletion of a tenant and its related data.
"""
tenant = tenants_fixture[0]
providers = Provider.objects.filter(tenant_id=tenant.id)
with patch(
"tasks.jobs.deletion.get_provider_graph_database_names"
) as mock_get_provider_graph_database_names, patch(
"tasks.jobs.deletion.graph_database.drop_database"
) as mock_drop_database:
tenant = tenants_fixture[0]
providers = list(Provider.objects.filter(tenant_id=tenant.id))
# Ensure the tenant and related providers exist before deletion
assert Tenant.objects.filter(id=tenant.id).exists()
assert providers.exists()
graph_db_names_per_provider = [
[f"graph-db-{provider.id}"] for provider in providers
]
mock_get_provider_graph_database_names.side_effect = (
graph_db_names_per_provider
)
# Call the function and validate the result
deletion_summary = delete_tenant(tenant.id)
# Ensure the tenant and related providers exist before deletion
assert Tenant.objects.filter(id=tenant.id).exists()
assert providers
assert deletion_summary is not None
assert not Tenant.objects.filter(id=tenant.id).exists()
assert not Provider.objects.filter(tenant_id=tenant.id).exists()
# Call the function and validate the result
deletion_summary = delete_tenant(tenant.id)
assert deletion_summary is not None
assert not Tenant.objects.filter(id=tenant.id).exists()
assert not Provider.objects.filter(tenant_id=tenant.id).exists()
expected_calls = [
call(provider.tenant_id, provider.id) for provider in providers
]
mock_get_provider_graph_database_names.assert_has_calls(
expected_calls, any_order=True
)
assert mock_get_provider_graph_database_names.call_count == len(
expected_calls
)
expected_drop_calls = [
call(graph_db_name[0]) for graph_db_name in graph_db_names_per_provider
]
mock_drop_database.assert_has_calls(expected_drop_calls, any_order=True)
assert mock_drop_database.call_count == len(expected_drop_calls)
def test_delete_tenant_with_no_providers(self, tenants_fixture):
"""
Test deletion of a tenant with no related providers.
"""
tenant = tenants_fixture[1] # Assume this tenant has no providers
providers = Provider.objects.filter(tenant_id=tenant.id)
with patch(
"tasks.jobs.deletion.get_provider_graph_database_names"
) as mock_get_provider_graph_database_names, patch(
"tasks.jobs.deletion.graph_database.drop_database"
) as mock_drop_database:
tenant = tenants_fixture[1] # Assume this tenant has no providers
providers = Provider.objects.filter(tenant_id=tenant.id)
# Ensure the tenant exists but has no related providers
assert Tenant.objects.filter(id=tenant.id).exists()
assert not providers.exists()
# Ensure the tenant exists but has no related providers
assert Tenant.objects.filter(id=tenant.id).exists()
assert not providers.exists()
# Call the function and validate the result
deletion_summary = delete_tenant(tenant.id)
# Call the function and validate the result
deletion_summary = delete_tenant(tenant.id)
assert deletion_summary == {} # No providers, so empty summary
assert not Tenant.objects.filter(id=tenant.id).exists()
assert deletion_summary == {} # No providers, so empty summary
assert not Tenant.objects.filter(id=tenant.id).exists()
mock_get_provider_graph_database_names.assert_not_called()
mock_drop_database.assert_not_called()
@@ -1,24 +1,28 @@
import uuid
from contextlib import contextmanager
from unittest.mock import MagicMock, patch
import openai
import pytest
from botocore.exceptions import ClientError
from tasks.tasks import (
_perform_scan_complete_tasks,
check_integrations_task,
check_lighthouse_provider_connection_task,
generate_outputs_task,
refresh_lighthouse_provider_models_task,
s3_integration_task,
security_hub_integration_task,
)
from api.models import (
Integration,
LighthouseProviderConfiguration,
LighthouseProviderModels,
)
from tasks.tasks import (
_perform_scan_complete_tasks,
check_integrations_task,
check_lighthouse_provider_connection_task,
generate_outputs_task,
perform_attack_paths_scan_task,
refresh_lighthouse_provider_models_task,
s3_integration_task,
security_hub_integration_task,
)
# TODO Move this to outputs/reports jobs
@@ -529,6 +533,7 @@ class TestGenerateOutputs:
class TestScanCompleteTasks:
@patch("tasks.tasks.perform_attack_paths_scan_task.apply_async")
@patch("tasks.tasks.create_compliance_requirements_task.apply_async")
@patch("tasks.tasks.perform_scan_summary_task.si")
@patch("tasks.tasks.generate_outputs_task.si")
@@ -541,6 +546,7 @@ class TestScanCompleteTasks:
mock_outputs_task,
mock_scan_summary_task,
mock_compliance_requirements_task,
mock_attack_paths_task,
):
"""Test that scan complete tasks are properly orchestrated with optimized reports."""
_perform_scan_complete_tasks("tenant-id", "scan-id", "provider-id")
@@ -577,6 +583,68 @@ class TestScanCompleteTasks:
scan_id="scan-id",
)
mock_attack_paths_task.assert_called_once_with(
kwargs={"tenant_id": "tenant-id", "scan_id": "scan-id"}
)
class TestAttackPathsTasks:
@staticmethod
@contextmanager
def _override_task_request(task, **attrs):
request = task.request
sentinel = object()
previous = {key: getattr(request, key, sentinel) for key in attrs}
for key, value in attrs.items():
setattr(request, key, value)
try:
yield
finally:
for key, prev in previous.items():
if prev is sentinel:
if hasattr(request, key):
delattr(request, key)
else:
setattr(request, key, prev)
def test_perform_attack_paths_scan_task_calls_runner(self):
with (
patch("tasks.tasks.attack_paths_scan") as mock_attack_paths_scan,
self._override_task_request(
perform_attack_paths_scan_task, id="celery-task-id"
),
):
mock_attack_paths_scan.return_value = {"status": "ok"}
result = perform_attack_paths_scan_task.run(
tenant_id="tenant-id", scan_id="scan-id"
)
mock_attack_paths_scan.assert_called_once_with(
tenant_id="tenant-id", scan_id="scan-id", task_id="celery-task-id"
)
assert result == {"status": "ok"}
def test_perform_attack_paths_scan_task_propagates_exception(self):
with (
patch(
"tasks.tasks.attack_paths_scan",
side_effect=RuntimeError("Exception to propagate"),
) as mock_attack_paths_scan,
self._override_task_request(
perform_attack_paths_scan_task, id="celery-task-error"
),
):
with pytest.raises(RuntimeError, match="Exception to propagate"):
perform_attack_paths_scan_task.run(
tenant_id="tenant-id", scan_id="scan-id"
)
mock_attack_paths_scan.assert_called_once_with(
tenant_id="tenant-id", scan_id="scan-id", task_id="celery-task-error"
)
@pytest.mark.django_db
class TestCheckIntegrationsTask:
@@ -1,6 +1,7 @@
services:
api-dev:
hostname: "prowler-api"
# image: prowler-api-dev
build:
context: ./api
dockerfile: Dockerfile
@@ -24,6 +25,8 @@ services:
condition: service_healthy
valkey:
condition: service_healthy
neo4j:
condition: service_healthy
entrypoint:
- "/home/prowler/docker-entrypoint.sh"
- "dev"
@@ -78,7 +81,41 @@ services:
timeout: 5s
retries: 3
neo4j:
image: graphstack/dozerdb:5.26.3.0
hostname: "neo4j"
volumes:
- ./_data/neo4j:/data
environment:
# We can't add our .env file because some of our current variables are not compatible with Neo4j env vars
# Auth
- NEO4J_AUTH=${NEO4J_USER}/${NEO4J_PASSWORD}
# Memory limits
- NEO4J_dbms_max__databases=${NEO4J_DBMS_MAX__DATABASES:-1000000}
- NEO4J_server_memory_pagecache_size=${NEO4J_SERVER_MEMORY_PAGECACHE_SIZE:-1G}
- NEO4J_server_memory_heap_initial__size=${NEO4J_SERVER_MEMORY_HEAP_INITIAL__SIZE:-1G}
- NEO4J_server_memory_heap_max__size=${NEO4J_SERVER_MEMORY_HEAP_MAX__SIZE:-1G}
# APOC
- apoc.export.file.enabled=${NEO4J_APOC_EXPORT_FILE_ENABLED:-true}
- apoc.import.file.enabled=${NEO4J_APOC_IMPORT_FILE_ENABLED:-true}
- apoc.import.file.use_neo4j_config=${NEO4J_APOC_IMPORT_FILE_USE_NEO4J_CONFIG:-true}
- "NEO4J_PLUGINS=${NEO4J_PLUGINS:-[\"apoc\"]}"
- "NEO4J_dbms_security_procedures_allowlist=${NEO4J_DBMS_SECURITY_PROCEDURES_ALLOWLIST:-apoc.*}"
- "NEO4J_dbms_security_procedures_unrestricted=${NEO4J_DBMS_SECURITY_PROCEDURES_UNRESTRICTED:-apoc.*}"
# Networking
- "dbms.connector.bolt.listen_address=${NEO4J_DBMS_CONNECTOR_BOLT_LISTEN_ADDRESS:-0.0.0.0:7687}"
# 7474 is the UI port
ports:
- 7474:7474
- ${NEO4J_PORT:-7687}:7687
healthcheck:
test: ["CMD", "wget", "--no-verbose", "http://localhost:7474"]
interval: 10s
timeout: 10s
retries: 10
worker-dev:
# image: prowler-api-dev
build:
context: ./api
dockerfile: Dockerfile
@@ -89,17 +126,23 @@ services:
- path: .env
required: false
volumes:
- "outputs:/tmp/prowler_api_output"
- ./api/src/backend:/home/prowler/backend
- ./api/pyproject.toml:/home/prowler/pyproject.toml
- ./api/docker-entrypoint.sh:/home/prowler/docker-entrypoint.sh
- outputs:/tmp/prowler_api_output
depends_on:
valkey:
condition: service_healthy
postgres:
condition: service_healthy
neo4j:
condition: service_healthy
entrypoint:
- "/home/prowler/docker-entrypoint.sh"
- "worker"
worker-beat:
# image: prowler-api-dev
build:
context: ./api
dockerfile: Dockerfile
@@ -114,6 +157,8 @@ services:
condition: service_healthy
postgres:
condition: service_healthy
neo4j:
condition: service_healthy
entrypoint:
- "../docker-entrypoint.sh"
- "beat"
@@ -63,6 +63,37 @@ services:
timeout: 5s
retries: 3
neo4j:
image: graphstack/dozerdb:5.26.3.0
hostname: "neo4j"
volumes:
- ./_data/neo4j:/data
environment:
# We can't add our .env file because some of our current variables are not compatible with Neo4j env vars
# Auth
- NEO4J_AUTH=${NEO4J_USER}/${NEO4J_PASSWORD}
# Memory limits
- NEO4J_dbms_max__databases=${NEO4J_DBMS_MAX__DATABASES:-1000000}
- NEO4J_server_memory_pagecache_size=${NEO4J_SERVER_MEMORY_PAGECACHE_SIZE:-1G}
- NEO4J_server_memory_heap_initial__size=${NEO4J_SERVER_MEMORY_HEAP_INITIAL__SIZE:-1G}
- NEO4J_server_memory_heap_max__size=${NEO4J_SERVER_MEMORY_HEAP_MAX__SIZE:-1G}
# APOC
- apoc.export.file.enabled=${NEO4J_APOC_EXPORT_FILE_ENABLED:-true}
- apoc.import.file.enabled=${NEO4J_APOC_IMPORT_FILE_ENABLED:-true}
- apoc.import.file.use_neo4j_config=${NEO4J_APOC_IMPORT_FILE_USE_NEO4J_CONFIG:-true}
- "NEO4J_PLUGINS=${NEO4J_PLUGINS:-[\"apoc\"]}"
- "NEO4J_dbms_security_procedures_allowlist=${NEO4J_DBMS_SECURITY_PROCEDURES_ALLOWLIST:-apoc.*}"
- "NEO4J_dbms_security_procedures_unrestricted=${NEO4J_DBMS_SECURITY_PROCEDURES_UNRESTRICTED:-apoc.*}"
# Networking
- "dbms.connector.bolt.listen_address=${NEO4J_DBMS_CONNECTOR_BOLT_LISTEN_ADDRESS:-0.0.0.0:7687}"
ports:
- ${NEO4J_PORT:-7687}:7687
healthcheck:
test: ["CMD", "wget", "--no-verbose", "http://localhost:7474"]
interval: 10s
timeout: 10s
retries: 10
worker:
image: prowlercloud/prowler-api:${PROWLER_API_VERSION:-stable}
env_file:
@@ -52,7 +52,7 @@
{
"group": "Prowler Lighthouse AI",
"pages": [
"user-guide/tutorials/prowler-app-lighthouse"
"getting-started/products/prowler-lighthouse-ai"
]
},
{
@@ -109,7 +109,13 @@
"user-guide/tutorials/prowler-app-jira-integration"
]
},
"user-guide/tutorials/prowler-app-lighthouse",
{
"group": "Lighthouse AI",
"pages": [
"user-guide/tutorials/prowler-app-lighthouse",
"user-guide/tutorials/prowler-app-lighthouse-multi-llm"
]
},
"user-guide/tutorials/prowler-cloud-public-ips",
{
"group": "Tutorials",
@@ -4,12 +4,12 @@ title: "Installation"
### Installation
Prowler App supports multiple installation methods based on your environment.
Prowler App offers flexible installation methods tailored to various environments.
Refer to the [Prowler App Tutorial](/user-guide/tutorials/prowler-app) for detailed usage instructions.
<Warning>
Prowler configuration is based in `.env` files. Every version of Prowler can have differences on that file, so, please, use the file that corresponds with that version or repository branch or tag.
Prowler configuration is based on `.env` files. Each version of Prowler can have differences in that file, so please use the file that corresponds to that version, repository branch, or tag.
</Warning>
<Tabs>
@@ -26,8 +26,6 @@ Refer to the [Prowler App Tutorial](/user-guide/tutorials/prowler-app) for detai
curl -sLO "https://raw.githubusercontent.com/prowler-cloud/prowler/refs/tags/${VERSION}/.env"
docker compose up -d
```
> Containers are built for `linux/amd64`. If your workstation's architecture is different, please set `DOCKER_DEFAULT_PLATFORM=linux/amd64` in your environment or use the `--platform linux/amd64` flag in the docker command.
</Tab>
<Tab title="GitHub">
_Requirements_:
@@ -106,11 +104,13 @@ Refer to the [Prowler App Tutorial](/user-guide/tutorials/prowler-app) for detai
</Tab>
</Tabs>
### Update Prowler App
### Updating Prowler App
Upgrade the Prowler App installation using one of two options:
#### Option 1: Update Environment File
#### Option 1: Updating the Environment File
To update the environment file:
Edit the `.env` file and change version values:
@@ -119,7 +119,7 @@ PROWLER_UI_VERSION="5.9.0"
PROWLER_API_VERSION="5.9.0"
```
#### Option 2: Use Docker Compose Pull
#### Option 2: Using Docker Compose Pull
```bash
docker compose pull --policy always
@@ -133,7 +133,7 @@ The `--policy always` flag ensures that Docker pulls the latest images even if t
Everything is preserved, nothing will be deleted after the update.
</Note>
### Troubleshooting
### Troubleshooting Installation Issues
If containers don't start, check logs for errors:
@@ -145,16 +145,16 @@ docker compose logs
docker images | grep prowler
```
If you encounter issues, you can rollback to the previous version by changing the `.env` file back to your previous version and running:
If issues are encountered, rollback to the previous version by changing the `.env` file back to the previous version and running:
```bash
docker compose pull
docker compose up -d
```
### Container versions
### Container Versions
The available versions of Prowler CLI are the following:
The available versions of Prowler App are the following:
- `latest`: in sync with `master` branch (please note that it is not a stable version)
- `v4-latest`: in sync with `v4` branch (please note that it is not a stable version)
@@ -4,7 +4,7 @@ title: 'Installation'
## Installation
Prowler is available as a project in [PyPI](https://pypi.org/project/prowler/). Install it as a Python package with `Python >= 3.9, <= 3.12`:
To install Prowler as a Python package, use `Python >= 3.9, <= 3.12`. Prowler is available on [PyPI](https://pypi.org/project/prowler/):
<Tabs>
<Tab title="pipx">
@@ -41,7 +41,7 @@ Prowler is available as a project in [PyPI](https://pypi.org/project/prowler/).
prowler -v
```
Upgrade Prowler to the latest version:
To upgrade Prowler to the latest version:
``` bash
pip install --upgrade prowler
@@ -54,8 +54,6 @@ Prowler is available as a project in [PyPI](https://pypi.org/project/prowler/).
* In the command below, change `-v` to your local directory path in order to access the reports.
* AWS, GCP, Azure and/or Kubernetes credentials
> Containers are built for `linux/amd64`. If your workstation's architecture is different, please set `DOCKER_DEFAULT_PLATFORM=linux/amd64` in your environment or use the `--platform linux/amd64` flag in the docker command.
_Commands_:
``` bash
@@ -75,7 +73,7 @@ Prowler is available as a project in [PyPI](https://pypi.org/project/prowler/).
_Commands_:
```
```bash
git clone https://github.com/prowler-cloud/prowler
cd prowler
poetry install
@@ -94,7 +92,7 @@ Prowler is available as a project in [PyPI](https://pypi.org/project/prowler/).
_Commands_:
```
```bash
python3 -m pip install --user pipx
python3 -m pipx ensurepath
pipx install prowler
@@ -104,7 +102,7 @@ Prowler is available as a project in [PyPI](https://pypi.org/project/prowler/).
<Tab title="Ubuntu">
_Requirements_:
* `Ubuntu 23.04` or above, if you are using an older version of Ubuntu check [pipx installation](https://docs.prowler.com/projects/prowler-open-source/en/latest/#__tabbed_1_1) and ensure you have `Python >= 3.9, <= 3.12`.
* `Ubuntu 23.04` or above. For older Ubuntu versions, check [pipx installation](https://docs.prowler.com/projects/prowler-open-source/en/latest/#__tabbed_1_1) and ensure `Python >= 3.9, <= 3.12` is installed.
* `Python >= 3.9, <= 3.12`
* AWS, GCP, Azure and/or Kubernetes credentials
@@ -121,7 +119,7 @@ Prowler is available as a project in [PyPI](https://pypi.org/project/prowler/).
<Tab title="Brew">
_Requirements_:
* `Brew` installed in your Mac or Linux
* `Brew` installed on Mac or Linux
* AWS, GCP, Azure and/or Kubernetes credentials
_Commands_:
@@ -171,7 +169,8 @@ Prowler is available as a project in [PyPI](https://pypi.org/project/prowler/).
```
</Tab>
</Tabs>
## Container versions
## Container Versions
The available versions of Prowler CLI are the following:
@@ -0,0 +1,180 @@
---
title: 'Overview'
---
import { VersionBadge } from "/snippets/version-badge.mdx"
<VersionBadge version="5.8.0" />
Prowler Lighthouse AI is a Cloud Security Analyst chatbot that helps you understand, prioritize, and remediate security findings in your cloud environments. It's designed to provide security expertise for teams without dedicated resources, acting as your 24/7 virtual cloud security analyst.
<img src="/images/prowler-app/lighthouse-intro.png" alt="Prowler Lighthouse" />
<Card title="Set Up Lighthouse AI" icon="rocket" href="/user-guide/tutorials/prowler-app-lighthouse#set-up">
Learn how to configure Lighthouse AI with your preferred LLM provider
</Card>
## Capabilities
Prowler Lighthouse AI is designed to be your AI security team member, with capabilities including:
### Natural Language Querying
Ask questions in plain English about your security findings. Examples:
- "What are my highest risk findings?"
- "Show me all S3 buckets with public access."
- "What security issues were found in my production accounts?"
<img src="/images/prowler-app/lighthouse-feature1.png" alt="Natural language querying" />
### Detailed Remediation Guidance
Get tailored step-by-step instructions for fixing security issues:
- Clear explanations of the problem and its impact
- Commands or console steps to implement fixes
- Alternative approaches with different solutions
<img src="/images/prowler-app/lighthouse-feature2.png" alt="Detailed Remediation" />
### Enhanced Context and Analysis
Lighthouse AI can provide additional context to help you understand the findings:
- Explain security concepts related to findings in simple terms
- Provide risk assessments based on your environment and context
- Connect related findings to show broader security patterns
<img src="/images/prowler-app/lighthouse-config.png" alt="Business Context" />
<img src="/images/prowler-app/lighthouse-feature3.png" alt="Contextual Responses" />
## Important Notes
Prowler Lighthouse AI is powerful, but there are limitations:
- **Continuous improvement**: Please report any issues, as the feature may make mistakes or encounter errors, despite extensive testing.
- **Access limitations**: Lighthouse AI can only access data the logged-in user can view. If you can't see certain information, Lighthouse AI can't see it either.
- **NextJS session dependence**: If your Prowler application session expires or you log out, Lighthouse AI will return errors. Refresh and log back in to continue.
- **Response quality**: The response quality depends on the selected LLM provider and model. Choose models with strong tool-calling capabilities for best results. We recommend the `gpt-5` model from OpenAI.
### Getting Help
If you encounter issues with Prowler Lighthouse AI or have suggestions for improvements, please [reach out through our Slack channel](https://goto.prowler.com/slack).
### What Data Is Shared to LLM Providers?
The following API endpoints are accessible to Prowler Lighthouse AI. Data from these endpoints may be shared with the configured LLM provider, depending on the scope of the user's query:
#### Accessible API Endpoints
**User Management:**
- List all users - `/api/v1/users`
- Retrieve the current user's information - `/api/v1/users/me`
**Provider Management:**
- List all providers - `/api/v1/providers`
- Retrieve data from a provider - `/api/v1/providers/{id}`
**Scan Management:**
- List all scans - `/api/v1/scans`
- Retrieve data from a specific scan - `/api/v1/scans/{id}`
**Resource Management:**
- List all resources - `/api/v1/resources`
- Retrieve data for a resource - `/api/v1/resources/{id}`
**Findings Management:**
- List all findings - `/api/v1/findings`
- Retrieve data from a specific finding - `/api/v1/findings/{id}`
- Retrieve metadata values from findings - `/api/v1/findings/metadata`
**Overview Data:**
- Get aggregated findings data - `/api/v1/overviews/findings`
- Get findings data by severity - `/api/v1/overviews/findings_severity`
- Get aggregated provider data - `/api/v1/overviews/providers`
- Get findings data by service - `/api/v1/overviews/services`
**Compliance Management:**
- List compliance overviews (optionally filter by scan) - `/api/v1/compliance-overviews`
- Retrieve data from a specific compliance overview - `/api/v1/compliance-overviews/{id}`
#### Excluded API Endpoints
Not all Prowler API endpoints are integrated with Lighthouse AI. The remaining endpoints are intentionally excluded for the following reasons:
- LLM providers shouldn't have access to sensitive data (such as provider secrets and other sensitive configuration)
- User queries don't need responses from those endpoints (for example: tasks, tenant details, downloading zip reports)
**Excluded Endpoints:**
**User Management:**
- List specific users information - `/api/v1/users/{id}`
- List user memberships - `/api/v1/users/{user_pk}/memberships`
- Retrieve membership data from the user - `/api/v1/users/{user_pk}/memberships/{id}`
**Tenant Management:**
- List all tenants - `/api/v1/tenants`
- Retrieve data from a tenant - `/api/v1/tenants/{id}`
- List tenant memberships - `/api/v1/tenants/{tenant_pk}/memberships`
- List all invitations - `/api/v1/tenants/invitations`
- Retrieve data from tenant invitation - `/api/v1/tenants/invitations/{id}`
**Security and Configuration:**
- List all secrets - `/api/v1/providers/secrets`
- Retrieve data from a secret - `/api/v1/providers/secrets/{id}`
- List all provider groups - `/api/v1/provider-groups`
- Retrieve data from a provider group - `/api/v1/provider-groups/{id}`
**Reports and Tasks:**
- Download zip report - `/api/v1/scans/{v1}/report`
- List all tasks - `/api/v1/tasks`
- Retrieve data from a specific task - `/api/v1/tasks/{id}`
**Lighthouse AI Configuration:**
- List LLM providers - `/api/v1/lighthouse/providers`
- Retrieve LLM provider - `/api/v1/lighthouse/providers/{id}`
- List available models - `/api/v1/lighthouse/models`
- Retrieve tenant configuration - `/api/v1/lighthouse/configuration`
<Note>
Agents can only call GET endpoints; they have no access to other HTTP methods.
</Note>
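For illustration, this read-only access amounts to plain GET requests against the accessible endpoints above. The host and token in this sketch are placeholders, not values from this documentation:
```bash
# Hypothetical example of the only kind of request agents can issue:
# a read-only GET against an accessible endpoint
curl -s "https://<your-prowler-api-host>/api/v1/findings" \
  -H "Authorization: Bearer $PROWLER_API_TOKEN"
```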
## FAQs
**1. Which LLM providers are supported?**
Lighthouse AI supports three providers:
- **OpenAI** - GPT models (GPT-5, GPT-4o, etc.)
- **Amazon Bedrock** - Claude, Llama, Titan, and other models via AWS
- **OpenAI Compatible** - Custom endpoints like OpenRouter, Ollama, or any OpenAI-compatible service
For detailed configuration instructions, see [Using Multiple LLM Providers with Lighthouse](/user-guide/tutorials/prowler-app-lighthouse-multi-llm).
**2. Why a multi-agent supervisor model?**
Context windows are limited. While demo data fits inside the context window, querying real-world data often exceeds it. A multi-agent architecture is used so that each agent fetches its own slice of data and returns only the minimum required data to the supervisor. This spreads context window usage across agents.
**3. Is my security data shared with LLM providers?**
Minimal data is shared to generate useful responses. Agents can access security findings and remediation details when needed. Provider secrets are protected by design and cannot be read. The LLM provider credentials configured with Lighthouse AI are only accessible to our NextJS server and are never sent to the LLM providers. Resource metadata (names, tags, account/project IDs, etc.) may be shared with the configured LLM provider based on query requirements.
**4. Can Lighthouse AI change my cloud environment?**
No. The agent doesn't have the tools to make changes, even if the configured cloud provider API keys contain permissions to modify resources.
[8 binary image files changed (5 updated, 3 added) — image diffs not shown]
@@ -33,7 +33,7 @@ The supported providers right now are:
| [Github](/user-guide/providers/github/getting-started-github) | Official | UI, API, CLI |
| [Oracle Cloud](/user-guide/providers/oci/getting-started-oci) | Official | UI, API, CLI |
| [Infra as Code](/user-guide/providers/iac/getting-started-iac) | Official | UI, API, CLI |
| [MongoDB Atlas](/user-guide/providers/mongodbatlas/getting-started-mongodbatlas) | Official | CLI, API |
| [MongoDB Atlas](/user-guide/providers/mongodbatlas/getting-started-mongodbatlas) | Official | UI, API, CLI |
| [LLM](/user-guide/providers/llm/getting-started-llm) | Official | CLI |
| **NHN** | Unofficial | CLI |
@@ -49,8 +49,9 @@ This method grants permanent access and is the recommended setup for production
![External ID](/images/providers/prowler-cloud-external-id.png)
![Stack Data](/images/providers/fill-stack-data.png)
!!! info
An **External ID** is required when assuming the *ProwlerScan* role to comply with AWS [confused deputy prevention](https://docs.aws.amazon.com/IAM/latest/UserGuide/confused-deputy.html).
<Info>
An **External ID** is required when assuming the *ProwlerScan* role to prevent the [confused deputy problem](https://docs.aws.amazon.com/IAM/latest/UserGuide/confused-deputy.html).
</Info>
6. Acknowledge the IAM resource creation warning and proceed
@@ -37,7 +37,7 @@ title: 'Getting Started With AWS on Prowler'
6. Choose the preferred authentication method (next step)
![Select auth method](/images/providers/select-auth-method.png)
![Select auth method](./img/select-auth-method.png)
### Step 3: Set Up AWS Authentication
@@ -76,7 +76,7 @@ For Google Cloud, first enter your `GCP Project ID` and then select the authenti
7. Click "Next", then "Launch Scan"
![Launch Scan GCP](/images/providers/launch-scan.png)
![Launch Scan GCP](./img/launch-scan.png)
---
@@ -2,25 +2,38 @@
title: "Microsoft 365 Authentication in Prowler"
---
Prowler for Microsoft 365 supports multiple authentication types. Authentication methods vary between Prowler App and Prowler CLI:
Prowler for Microsoft 365 supports multiple authentication types across Prowler Cloud and Prowler CLI.
**Prowler App:**
## Navigation
- [Common Setup](#common-setup)
- [Prowler Cloud Authentication](#prowler-cloud-authentication)
- [Prowler CLI Authentication](#prowler-cli-authentication)
- [Supported PowerShell Versions](#supported-powershell-versions)
- [Required PowerShell Modules](#required-powershell-modules)
- [**Application Certificate Authentication**](#certificate-based-authentication) (**Recommended**)
- [**Application Client Secret Authentication**](#client-secret-authentication)
## Common Setup
### Authentication Methods Overview
Prowler Cloud uses app-only authentication. Prowler CLI supports the same app-only options and two delegated flows.
**Prowler Cloud:**
- [**Application Certificate Authentication**](#application-certificate-authentication-recommended) (**Recommended**)
- [**Application Client Secret Authentication**](#application-client-secret-authentication)
**Prowler CLI:**
- [**Application Certificate Authentication**](#certificate-based-authentication) (**Recommended**)
- [**Application Client Secret Authentication**](#client-secret-authentication)
- [**Application Certificate Authentication**](#application-certificate-authentication-recommended) (**Recommended**)
- [**Application Client Secret Authentication**](#application-client-secret-authentication)
- [**Azure CLI Authentication**](#azure-cli-authentication)
- [**Interactive Browser Authentication**](#interactive-browser-authentication)
## Required Permissions
### Required Permissions
To run the full Prowler provider, including PowerShell checks, two types of permission scopes must be set in **Microsoft Entra ID**.
### Application Permissions for App-Only Authentication
#### Application Permissions for App-Only Authentication
When using service principal authentication, add these **Application Permissions**:
@@ -44,6 +57,7 @@ When using service principal authentication, add these **Application Permissions
These permissions enable application-based authentication methods (client secret and certificate). Using certificate-based authentication is the recommended way to run the full M365 provider, including PowerShell checks.
</Note>
### Browser Authentication Permissions
When using browser authentication, permissions are delegated to the user, so the user, rather than the application, must hold the appropriate permissions.
@@ -52,37 +66,38 @@ When using browser authentication, permissions are delegated to the user, so the
Browser and Azure CLI authentication methods limit scanning capabilities to checks that operate through Microsoft Graph API. Checks requiring PowerShell modules will not execute, as they need application-level permissions that cannot be delegated through browser authentication.
</Warning>
### Step-by-Step Permission Assignment
#### Create Application Registration
1. Access **Microsoft Entra ID**
1. Access **Microsoft Entra ID**.
![Overview of Microsoft Entra ID](/images/providers/microsoft-entra-id.png)
2. Navigate to "Applications" > "App registrations"
2. Navigate to "Applications" > "App registrations".
![App Registration nav](/images/providers/app-registration-menu.png)
3. Click "+ New registration", complete the form, and click "Register"
3. Click "+ New registration", complete the form, and click "Register".
![New Registration](/images/providers/new-registration.png)
4. Go to "Certificates & secrets" > "Client secrets" > "+ New client secret"
4. Go to "Certificates & secrets" > "Client secrets" > "+ New client secret".
![Certificate & Secrets nav](/images/providers/certificates-and-secrets.png)
5. Fill in the required fields and click "Add", then copy the generated value (this will be `AZURE_CLIENT_SECRET`)
5. Fill in the required fields and click "Add", then copy the generated value (this will be `AZURE_CLIENT_SECRET`).
![New Client Secret](/images/providers/new-client-secret.png)
#### Grant Microsoft Graph API Permissions
1. Go to App Registration > Select your Prowler App > click on "API permissions"
1. Open **API permissions** for the Prowler application registration.
![API Permission Page](/images/providers/api-permissions-page.png)
2. Click "+ Add a permission" > "Microsoft Graph" > "Application permissions"
2. Click "+ Add a permission" > "Microsoft Graph" > "Application permissions".
![Add API Permission](/images/providers/add-app-api-permission.png)
@@ -97,38 +112,39 @@ Browser and Azure CLI authentication methods limit scanning capabilities to chec
![Application Permissions](/images/providers/app-permissions.png)
4. Click "Add permissions", then click "Grant admin consent for `<your-tenant-name>`"
4. Click "Add permissions", then click "Grant admin consent for `<your-tenant-name>`".
<a id="grant-powershell-module-permissions-for-app-only-authentication"></a>
#### Grant PowerShell Module Permissions
1. **Add Exchange API:**
- Search and select "Office 365 Exchange Online" API in **APIs my organization uses**
- Search and select "Office 365 Exchange Online" API in **APIs my organization uses**.
![Office 365 Exchange Online API](/images/providers/search-exchange-api.png)
- Select "Exchange.ManageAsApp" permission and click "Add permissions"
- Select "Exchange.ManageAsApp" permission and click "Add permissions".
![Exchange.ManageAsApp Permission](/images/providers/exchange-permission.png)
- Assign `Global Reader` role to the app: Go to `Roles and administrators` > click `here` for directory level assignment
- Assign `Global Reader` role to the app: Go to `Roles and administrators` > click `here` for directory level assignment.
![Roles and administrators](/images/providers/here.png)
- Search for `Global Reader` and assign it to your application
- Search for `Global Reader` and assign it to the application.
![Global Reader Role](/images/providers/global-reader-role.png)
2. **Add Teams API:**
- Search and select "Skype and Teams Tenant Admin API" in **APIs my organization uses**
- Search and select "Skype and Teams Tenant Admin API" in **APIs my organization uses**.
![Skype and Teams Tenant Admin API](/images/providers/search-skype-teams-tenant-admin-api.png)
- Select "application_access" permission and click "Add permissions"
- Select "application_access" permission and click "Add permissions".
![application_access Permission](/images/providers/teams-permission.png)
3. Click "Grant admin consent for `<your-tenant-name>`" to grant admin consent
3. Click "Grant admin consent for `<your-tenant-name>`" to grant admin consent.
![Grant Admin Consent](/images/providers/grant-external-api-permissions.png)
@@ -136,11 +152,13 @@ Final permissions should look like this:
![Final Permissions](/images/providers/final-permissions.png)
Use the same application registration for both Prowler Cloud and Prowler CLI while switching authentication methods as needed.
<a id="client-secret-authentication"></a>
<a id="certificate-based-authentication"></a>
## Application Certificate Authentication (Recommended)
_Available for both Prowler App and Prowler CLI_
_Available for both Prowler Cloud and Prowler CLI_
**Authentication flag for CLI:** `--certificate-auth`
@@ -173,11 +191,11 @@ Guard `prowlerm365.key` and `prowlerm365.pfx`. Only upload the `.cer` file to th
</Warning>
If your organization uses a certificate authority, you can replace step 2 with a CSR workflow and import the signed certificate instead.
If an internal certificate authority is preferred, replace step 2 with a CSR workflow and import the signed certificate instead.
### Upload the Certificate to Microsoft Entra ID
1. Open **Microsoft Entra ID** > **App registrations** > your application.
1. Open **Microsoft Entra ID** > **App registrations** > the Prowler application.
2. Go to **Certificates & secrets** > **Certificates**.
3. Select **Upload certificate** and choose `prowlerm365.cer`.
4. Confirm the certificate appears with the expected expiration date.
@@ -189,45 +207,37 @@ base64 -i prowlerm365.pfx -o prowlerm365.pfx.b64
cat prowlerm365.pfx.b64 | tr -d '\n'
```
Copy the resulting single-line Base64 string (or the contents of `prowlerm365.pfx.b64`)—you will use it in the next step.
Copy the resulting single-line Base64 string (or the contents of `prowlerm365.pfx.b64`) for the next step.
### Provide the Certificate to Prowler
You can supply the private certificate to Prowler in two ways:
- **Prowler Cloud:** Paste the Base64-encoded PFX in the `certificate_content` field when configuring the Microsoft 365 provider in Prowler Cloud.
- **Prowler CLI:** Export credential variables or pass the local file path when running Prowler.
- **Environment variables (recommended for headless execution)**
```console
export AZURE_CLIENT_ID="00000000-0000-0000-0000-000000000000"
export AZURE_TENANT_ID="11111111-1111-1111-1111-111111111111"
export M365_CERTIFICATE_CONTENT="$(base64 < prowlerm365.pfx | tr -d '\n')"
```
```console
export AZURE_CLIENT_ID="00000000-0000-0000-0000-000000000000"
export AZURE_TENANT_ID="11111111-1111-1111-1111-111111111111"
export M365_CERTIFICATE_CONTENT="$(base64 < prowlerm365.pfx | tr -d '\n')"
```
Store the PFX securely and reference it when running the CLI:
The `M365_CERTIFICATE_CONTENT` variable must contain a single-line Base64 string. Remove any line breaks or spaces before exporting.
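A quick sanity check, assuming a POSIX shell, is to confirm the exported value contains no newline:
```console
# Should print 0: the Base64 value must be one unbroken line
printf '%s' "$M365_CERTIFICATE_CONTENT" | wc -l
```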
```console
python3 prowler-cli.py m365 --certificate-auth --certificate-path /secure/path/prowlerm365.pfx
```
- **Local file path**
Store the PFX securely and reference it when you run the CLI:
```console
python3 prowler-cli.py m365 --certificate-auth --certificate-path /secure/path/prowlerm365.pfx
```
The CLI still needs `AZURE_CLIENT_ID` and `AZURE_TENANT_ID` in the environment when you use `--certificate-path`.
For the **Prowler App**, paste the Base64-encoded PFX in the `certificate_content` field when you configure the provider secrets. The platform persists the encrypted certificate and supplies it during scans.
The CLI still needs `AZURE_CLIENT_ID` and `AZURE_TENANT_ID` in the environment when `--certificate-path` is used.
<Note>
Do not mix certificate authentication with a client secret. Provide either a certificate **or** a secret to the application registration and Prowler configuration.
</Note>
<a id="client-secret-authentication"></a>
<a id="service-principal-authentication"></a>
<a id="service-principal-authentication-recommended"></a>
## Application Client Secret Authentication
_Available for both Prowler App and Prowler CLI_
_Available for both Prowler Cloud and Prowler CLI_
**Authentication flag for CLI:** `--sp-env-auth`
@@ -239,35 +249,59 @@ export AZURE_CLIENT_SECRET="XXXXXXXXX"
export AZURE_TENANT_ID="XXXXXXXXX"
```
If these variables are not set or exported, execution using `--sp-env-auth` will fail.
Refer to the [Step-by-Step Permission Assignment](#step-by-step-permission-assignment) section below for setup instructions.
If the external API permissions described in the mentioned section above are not added only checks that work through MS Graph will be executed. This means that the full provider will not be executed.
This workflow is helpful for initial validation or temporary access. Plan to transition to certificate-based authentication to remove long-lived secrets and keep full provider coverage in unattended environments.
If these variables are not set or exported, execution using `--sp-env-auth` will fail. This workflow is helpful for initial validation or temporary access. Plan to transition to certificate-based authentication to remove long-lived secrets and keep full provider coverage in unattended environments.
<Note>
To scan every M365 check, ensure the required permissions are added to the application registration. Refer to the [PowerShell Module Permissions](#grant-powershell-module-permissions-for-app-only-authentication) section for more information.
</Note>
### Run Prowler with Certificate Authentication
If the external API permissions described above are not added, only checks that work through Microsoft Graph will run, and the full provider will not execute.
After the variables or path are in place, run the Microsoft 365 provider as usual:
## Prowler Cloud Authentication
Use the shared permissions and credentials above, then complete the Microsoft 365 provider form in Prowler Cloud. The platform persists the encrypted credentials and supplies them during scans.
### Application Certificate Authentication (Recommended)
1. Select **Application Certificate Authentication**.
2. Enter the **tenant ID** and **application (client) ID**.
3. Paste the Base64-encoded certificate content.
This method keeps all Microsoft 365 checks available, including PowerShell-based checks.
### Application Client Secret Authentication
1. Select **Application Client Secret Authentication**.
2. Enter the **tenant ID** and **application (client) ID**.
3. Enter the **client secret**.
## Prowler CLI Authentication
### Certificate Authentication
**Authentication flag for CLI:** `--certificate-auth`
After credentials are exported, launch the Microsoft 365 provider with certificate authentication:
```console
python3 prowler-cli.py m365 --certificate-auth --init-modules --log-level ERROR
```
The command above initializes PowerShell modules if needed. You can combine other standard flags (for example, `--region M365USGovernment` or custom outputs) with `--certificate-auth`.
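As a sketch, a combined invocation might look like the following; the `--output-directory` flag and path are illustrative, not a requirement of certificate authentication:
```console
# Certificate auth for a GovCloud tenant with a custom output location
python3 prowler-cli.py m365 --certificate-auth --init-modules \
  --region M365USGovernment --output-directory /tmp/prowler-m365
```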
Prowler prints the certificate thumbprint during execution so the correct credential can be verified.
Prowler prints the certificate thumbprint during execution so you can confirm the correct credential is in use.
### Client Secret Authentication
**Authentication flag for CLI:** `--sp-env-auth`
After exporting the secret-based variables, run:
```console
python3 prowler-cli.py m365 --sp-env-auth --init-modules --log-level ERROR
```
<a id="azure-cli-authentication"></a>
## Azure CLI Authentication
_Available only for Prowler CLI_
### Azure CLI Authentication
**Authentication flag for CLI:** `--az-cli-auth`
@@ -279,7 +313,7 @@ az login --tenant <TENANT_ID>
az account set --tenant <TENANT_ID>
```
If you prefer to reuse the same service principal that powers certificate-based authentication, authenticate it through Azure CLI instead of exporting environment variables. Azure CLI expects the certificate in PEM format; convert the PFX produced earlier and sign in:
If reusing the same service principal that powers certificate-based authentication, authenticate it through Azure CLI instead of exporting environment variables. Azure CLI expects the certificate in PEM format; convert the PFX produced earlier and sign in:
```console
openssl pkcs12 -in prowlerm365.pfx -out prowlerm365.pem -nodes
@@ -297,11 +331,9 @@ python3 prowler-cli.py m365 --az-cli-auth
The Azure CLI identity must hold the same Microsoft Graph and external API permissions required for the full provider. Signing in with a user account limits the scan to delegated Microsoft Graph endpoints and skips PowerShell-based checks. Use a service principal with the necessary application permissions to keep complete coverage.
## Interactive Browser Authentication
### Interactive Browser Authentication
_Available only for Prowler CLI_
**Authentication flag:** `--browser-auth`
**Authentication flag for CLI:** `--browser-auth`
Authenticate against Azure using the default browser to start the scan. The `--tenant-id` flag is also required.
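For example, with a placeholder tenant ID:
```console
# Interactive browser sign-in; --tenant-id is required with --browser-auth
prowler m365 --browser-auth --tenant-id "11111111-1111-1111-1111-111111111111"
```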
@@ -8,73 +8,81 @@ title: 'Getting Started With Microsoft 365 on Prowler'
Government cloud accounts or tenants (Microsoft 365 Government) are currently unsupported, but we expect to add support for them in the near future.
</Note>
## Prerequisites
Configure authentication for Microsoft 365 by following the [Microsoft 365 Authentication](/user-guide/providers/microsoft365/authentication) guide. This includes:
Set up authentication for Microsoft 365 with the [Microsoft 365 Authentication](/user-guide/providers/microsoft365/authentication) guide before starting either path:
- Registering an application in Microsoft Entra ID
- Granting all required Microsoft Graph and external API permissions
- Generating the application certificate (recommended) or client secret
- Setting up PowerShell module permissions (for full security coverage)
- Register an application in Microsoft Entra ID
- Grant the Microsoft Graph and external API permissions listed for the provider
- Generate an application certificate (recommended) or client secret
- Prepare PowerShell module permissions to enable every check
## Prowler App
<CardGroup cols={2}>
<Card title="Prowler Cloud" icon="cloud" href="#prowler-cloud">
Onboard Microsoft 365 using Prowler Cloud
</Card>
<Card title="Prowler CLI" icon="terminal" href="#prowler-cli">
Onboard Microsoft 365 using Prowler CLI
</Card>
</CardGroup>
### Step 1: Obtain Domain ID
## Prowler Cloud
1. Go to the Entra ID portal, then search for "Domain" or go to Identity > Settings > Domain Names
### Step 1: Locate the Domain ID
1. Open the Entra ID portal, then search for "Domain" or go to Identity > Settings > Domain Names.
![Search Domain Names](/images/providers/search-domain-names.png)
![Custom Domain Names](/images/providers/custom-domain-names.png)
2. Select the domain to use as unique identifier for the Microsoft 365 account in Prowler App
2. Select the domain that acts as the unique identifier for the Microsoft 365 account in Prowler Cloud.
### Step 2: Access Prowler App
### Step 2: Open Prowler Cloud
1. Go to [Prowler Cloud](https://cloud.prowler.com/) or launch [Prowler App](/user-guide/tutorials/prowler-app)
2. Navigate to "Configuration" > "Cloud Providers"
1. Go to [Prowler Cloud](https://cloud.prowler.com/) or launch [Prowler App](/user-guide/tutorials/prowler-app).
2. Navigate to "Configuration" > "Cloud Providers".
![Cloud Providers Page](/images/prowler-app/cloud-providers-page.png)
3. Click on "Add Cloud Provider"
3. Click "Add Cloud Provider".
![Add a Cloud Provider](/images/prowler-app/add-cloud-provider.png)
4. Select "Microsoft 365"
4. Select "Microsoft 365".
![Select Microsoft 365](/images/providers/select-m365-prowler-cloud.png)
5. Add the Domain ID and an optional alias, then click "Next"
5. Add the Domain ID and an optional alias, then click "Next".
![Add Domain ID](/images/providers/add-domain-id.png)
### Step 3: Select Authentication Method and Provide Credentials
### Step 3: Choose and Provide Authentication
Prowler App now separates Microsoft 365 authentication into two app-only options. After adding the Domain ID (primary tenant domain), choose the method that matches your setup:
After the Domain ID is in place, select the app-only authentication option that matches the Microsoft Entra ID setup:
<img src="/images/providers/m365-auth-selection-form.png" alt="M365 authentication method selection" width="700" />
#### Application Certificate Authentication (Recommended)
1. Enter your **tenant ID**: This is the unique identifier for your Microsoft Entra ID directory.
2. Enter your **application (client) ID**: This is the unique identifier assigned to your app registration in Microsoft Entra ID.
3. Upload your **certificate file content**: This is the Base64 encoded certificate content used to authenticate your application.
1. Enter the **tenant ID**, the unique identifier for the Microsoft Entra ID directory.
2. Enter the **application (client) ID**, the identifier for the Entra application registration.
3. Upload the **certificate file content** (Base64-encoded PFX).
<img src="/images/providers/certificate-form.png" alt="M365 certificate authentication form" width="700" />
Use this method whenever possible to avoid managing client secrets and to unlock every Microsoft 365 check, including those that require PowerShell modules.
For detailed instructions on how to setup Application Certificate Authentication, see the [Authentication](/user-guide/providers/microsoft365/authentication#application-certificate-authentication-recommended) page.
Use this method to avoid managing secrets and to unlock all Microsoft 365 checks, including the PowerShell-based ones. Full setup steps are in the [Authentication guide](/user-guide/providers/microsoft365/authentication#application-certificate-authentication-recommended).
#### Application Client Secret Authentication
1. Enter your **tenant ID**: This is the unique identifier for your Microsoft Entra ID directory.
2. Enter your **application (client) ID**: This is the unique identifier assigned to your app registration in Microsoft Entra ID.
3. Enter your **client secret**: This is the secret key used to authenticate your application.
1. Enter the **tenant ID**.
2. Enter the **application (client) ID**.
3. Enter the **client secret**.
<img src="/images/providers/secret-form.png" alt="M365 client secret authentication form" width="700" />
For detailed instructions on how to setup Application Client Secret Authentication, see the [Authentication](/user-guide/providers/microsoft365/authentication#application-client-secret-authentication) page.
For the complete setup workflow, follow the [Authentication guide](/user-guide/providers/microsoft365/authentication#application-client-secret-authentication).
### Step 4: Launch the Scan
@@ -90,30 +98,30 @@ For detailed instructions on how to setup Application Client Secret Authenticati
## Prowler CLI
Use Prowler CLI to scan Microsoft 365 environments.
### Step 1: Confirm PowerShell Coverage
### PowerShell Requirements
PowerShell 7.4+ keeps the full Microsoft 365 coverage. Installation options are listed in the [Authentication guide](/user-guide/providers/microsoft365/authentication#supported-powershell-versions).
PowerShell 7.4+ is required for comprehensive Microsoft 365 security coverage. Installation instructions are available in the [Authentication guide](/user-guide/providers/microsoft365/authentication#supported-powershell-versions).
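To confirm the requirement is met, check the installed version from any shell (assuming `pwsh` is on the PATH):
```console
# Should report 7.4 or later
pwsh -NoProfile -Command '$PSVersionTable.PSVersion'
```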
### Step 2: Select an Authentication Method
### Authentication Options
Select an authentication method from the [Microsoft 365 Authentication](/user-guide/providers/microsoft365/authentication) guide:
Choose the matching flag from the [Microsoft 365 Authentication](/user-guide/providers/microsoft365/authentication) guide:
- **Application Certificate Authentication** (recommended): `--certificate-auth`
- **Application Client Secret Authentication**: `--sp-env-auth`
- **Azure CLI Authentication**: `--az-cli-auth`
- **Interactive Browser Authentication**: `--browser-auth`
### Basic Usage
### Step 3: Run the First Scan
After configuring authentication, run a basic scan:
Run a baseline scan after credentials are configured:
```console
prowler m365 --sp-env-auth
```
For comprehensive scans including PowerShell checks:
### Step 4: Enable Full Coverage
Include PowerShell module initialization to run every check:
```console
prowler m365 --sp-env-auth --init-modules
@@ -0,0 +1,128 @@
---
title: 'Using Multiple LLM Providers with Lighthouse'
---
import { VersionBadge } from "/snippets/version-badge.mdx"
<VersionBadge version="5.14.0" />
Prowler Lighthouse AI supports multiple Large Language Model (LLM) providers, offering flexibility to choose the provider that best fits infrastructure, compliance requirements, and cost considerations. This guide explains how to configure and use different LLM providers with Lighthouse AI.
## Supported Providers
Lighthouse AI supports the following LLM providers:
- **OpenAI**: Provides access to GPT models (GPT-4o, GPT-4, etc.)
- **Amazon Bedrock**: Offers AWS-hosted access to Claude, Llama, Titan, and other models
- **OpenAI Compatible**: Supports custom endpoints like OpenRouter, Ollama, or any OpenAI-compatible service
## How Default Providers Work
All three providers can be configured for a tenant, but only one can be set as the default provider. The first configured provider automatically becomes the default.
When visiting Lighthouse AI chat, the default provider's default model loads automatically. Users can switch to any available LLM model (including those from non-default providers) using the dropdown in chat.
<img src="/images/prowler-app/lighthouse-switch-models.png" alt="Switch models in Lighthouse AI chat interface" />
## Configuring Providers
Navigate to **Configuration** → **Lighthouse AI** to see all three provider options with a **Connect** button under each.
<img src="/images/prowler-app/lighthouse-configuration.png" alt="Prowler Lighthouse Configuration" />
### Connecting a Provider
To connect a provider:
1. Click **Connect** under the desired provider
2. Enter the required credentials
3. Select a default model for that provider
4. Click **Connect** to save
### OpenAI
#### Required Information
- **API Key**: OpenAI API key (starts with `sk-` or `sk-proj-`)
<Note>
To generate an OpenAI API key, visit https://platform.openai.com/api-keys
</Note>
### Amazon Bedrock
#### Required Information
- **AWS Access Key ID**: AWS access key ID
- **AWS Secret Access Key**: AWS secret access key
- **AWS Region**: Region where Bedrock is available (e.g., `us-east-1`, `us-west-2`)
#### Required Permissions
The AWS user must have the `AmazonBedrockLimitedAccess` managed policy attached:
```text
arn:aws:iam::aws:policy/AmazonBedrockLimitedAccess
```
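If the user is managed with the AWS CLI, attaching the policy might look like the following sketch; the user name is a placeholder:
```bash
# Attach the Bedrock managed policy to the IAM user used by Lighthouse AI
aws iam attach-user-policy \
  --user-name <lighthouse-ai-user> \
  --policy-arn arn:aws:iam::aws:policy/AmazonBedrockLimitedAccess
```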
<Note>
Currently, only AWS access key and secret key authentication is supported. Amazon Bedrock API key support will be available soon.
</Note>
<Note>
Available models depend on AWS region and account entitlements. Lighthouse AI displays only accessible models.
</Note>
### OpenAI Compatible
Use this option to connect to any LLM provider that exposes an OpenAI-compatible API endpoint (OpenRouter, Ollama, etc.).
#### Required Information
- **API Key**: API key from the compatible service
- **Base URL**: API endpoint URL including the API version (e.g., `https://openrouter.ai/api/v1`)
#### Example: OpenRouter
1. Create an account at [OpenRouter](https://openrouter.ai/)
2. [Generate an API key](https://openrouter.ai/docs/guides/overview/auth/provisioning-api-keys) from the OpenRouter dashboard
3. Configure in Lighthouse AI:
- **API Key**: OpenRouter API key
- **Base URL**: `https://openrouter.ai/api/v1`
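As an optional check, an OpenAI-compatible endpoint can usually be probed with a model-listing request; the exact path may vary by service (OpenRouter shown here):
```bash
# List models to confirm the endpoint and key work
curl -s https://openrouter.ai/api/v1/models \
  -H "Authorization: Bearer $OPENROUTER_API_KEY"
```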
## Changing the Default Provider
To set a different provider as default:
1. Navigate to **Configuration** → **Lighthouse AI**
2. Click **Configure** under the provider you want as default
3. Click **Set as Default**
<img src="/images/prowler-app/lighthouse-set-default-provider.png" alt="Set default LLM provider" />
## Updating Provider Credentials
To update credentials for a connected provider:
1. Navigate to **Configuration** → **Lighthouse AI**
2. Click **Configure** under the provider
3. Enter the new credentials
4. Click **Update**
## Deleting a Provider
To remove a configured provider:
1. Navigate to **Configuration** → **Lighthouse AI**
2. Click **Configure** under the provider
3. Click **Delete**
## Model Recommendations
For best results with Lighthouse AI, the recommended model is `gpt-5` from OpenAI.
Models from other providers such as Amazon Bedrock and OpenAI Compatible endpoints can be connected and used, but performance is not guaranteed.
## Getting Help
For issues or suggestions, [reach out through our Slack channel](https://goto.prowler.com/slack).
@@ -1,26 +1,20 @@
---
title: 'Prowler Lighthouse AI'
title: 'How It Works'
---
import { VersionBadge } from "/snippets/version-badge.mdx"
<VersionBadge version="5.8.0" />
Prowler Lighthouse AI is a Cloud Security Analyst chatbot that helps you understand, prioritize, and remediate security findings in your cloud environments. It's designed to provide security expertise for teams without dedicated resources, acting as your 24/7 virtual cloud security analyst.
<img src="/images/prowler-app/lighthouse-intro.png" alt="Prowler Lighthouse" />
## How It Works
Prowler Lighthouse AI uses OpenAI's language models and integrates with your Prowler security findings data.
Prowler Lighthouse AI integrates Large Language Models (LLMs) with Prowler security findings data.
Here's what's happening behind the scenes:
- The system uses a multi-agent architecture built with [LanggraphJS](https://github.com/langchain-ai/langgraphjs) for the LLM logic and [Vercel AI SDK UI](https://sdk.vercel.ai/docs/ai-sdk-ui/overview) for the frontend chatbot.
- It uses a ["supervisor" architecture](https://langchain-ai.lang.chat/langgraphjs/tutorials/multi_agent/agent_supervisor/) that interacts with different agents for specialized tasks. For example, `findings_agent` can analyze detected security findings, while `overview_agent` provides a summary of connected cloud accounts.
- The system connects to OpenAI models to understand, fetch the right data, and respond to the user's query.
- The system connects to the configured LLM provider to understand the user's query, fetches the right data, and responds to the query.
<Note>
Lighthouse AI is tested against `gpt-4o` and `gpt-4o-mini` OpenAI models.
Lighthouse AI supports multiple LLM providers including OpenAI, Amazon Bedrock, and OpenAI-compatible services. For configuration details, see [Using Multiple LLM Providers with Lighthouse](/user-guide/tutorials/prowler-app-lighthouse-multi-llm).
</Note>
- The supervisor agent is the main contact point. It is what users interact with directly from the chat interface. It coordinates with other agents to answer users' questions comprehensively.
@@ -30,16 +24,22 @@ Lighthouse AI is tested against `gpt-4o` and `gpt-4o-mini` OpenAI models.
All agents can only read relevant security data. They cannot modify your data or access sensitive information like configured secrets or tenant details.
</Note>
## Set up
Getting started with Prowler Lighthouse AI is easy:
1. Go to the configuration page in your Prowler dashboard.
2. Enter your OpenAI API key.
3. Select your preferred model. The recommended one for best results is `gpt-4o`.
4. (Optional) Add business context to improve response quality and prioritization.
1. Navigate to **Configuration** → **Lighthouse AI**
2. Click **Connect** under the desired provider (OpenAI, Amazon Bedrock, or OpenAI Compatible)
3. Enter the required credentials
4. Select a default model
5. Click **Connect** to save
<img src="/images/prowler-app/lighthouse-config.png" alt="Lighthouse AI Configuration" />
<Note>
For detailed configuration instructions for each provider, see [Using Multiple LLM Providers with Lighthouse](/user-guide/tutorials/prowler-app-lighthouse-multi-llm).
</Note>
<img src="/images/prowler-app/lighthouse-configuration.png" alt="Lighthouse AI Configuration" />
### Adding Business Context
@@ -51,163 +51,3 @@ The optional business context field lets you provide additional information to h
- Current security initiatives or focus areas
Better context leads to more relevant responses and prioritization that aligns with your needs.
## Capabilities
Prowler Lighthouse AI is designed to be your AI security team member, with capabilities including:
### Natural Language Querying
Ask questions in plain English about your security findings. Examples:
- "What are my highest risk findings?"
- "Show me all S3 buckets with public access."
- "What security issues were found in my production accounts?"
<img src="/images/prowler-app/lighthouse-feature1.png" alt="Natural language querying" />
### Detailed Remediation Guidance
Get tailored step-by-step instructions for fixing security issues:
- Clear explanations of the problem and its impact
- Commands or console steps to implement fixes
- Alternative approaches with different solutions
<img src="/images/prowler-app/lighthouse-feature2.png" alt="Detailed Remediation" />
### Enhanced Context and Analysis
Lighthouse AI can provide additional context to help you understand the findings:
- Explain security concepts related to findings in simple terms
- Provide risk assessments based on your environment and context
- Connect related findings to show broader security patterns
<img src="/images/prowler-app/lighthouse-config.png" alt="Business Context" />
<img src="/images/prowler-app/lighthouse-feature3.png" alt="Contextual Responses" />
## Important Notes
Prowler Lighthouse AI is powerful, but there are limitations:
- **Continuous improvement**: Please report any issues, as the feature may make mistakes or encounter errors, despite extensive testing.
- **Access limitations**: Lighthouse AI can only access data the logged-in user can view. If you can't see certain information, Lighthouse AI can't see it either.
- **NextJS session dependence**: If your Prowler application session expires or logs out, Lighthouse AI will error out. Refresh and log back in to continue.
- **Response quality**: The response quality depends on the selected OpenAI model. For best results, use gpt-4o.
### Getting Help
If you encounter issues with Prowler Lighthouse AI or have suggestions for improvements, please [reach out through our Slack channel](https://goto.prowler.com/slack).
### What Data Is Shared to OpenAI?
The following API endpoints are accessible to Prowler Lighthouse AI. Data from the following API endpoints could be shared with OpenAI depending on the scope of user's query:
#### Accessible API Endpoints
**User Management:**
- List all users - `/api/v1/users`
- Retrieve the current user's information - `/api/v1/users/me`
**Provider Management:**
- List all providers - `/api/v1/providers`
- Retrieve data from a provider - `/api/v1/providers/{id}`
**Scan Management:**
- List all scans - `/api/v1/scans`
- Retrieve data from a specific scan - `/api/v1/scans/{id}`
**Resource Management:**
- List all resources - `/api/v1/resources`
- Retrieve data for a resource - `/api/v1/resources/{id}`
**Findings Management:**
- List all findings - `/api/v1/findings`
- Retrieve data from a specific finding - `/api/v1/findings/{id}`
- Retrieve metadata values from findings - `/api/v1/findings/metadata`
**Overview Data:**
- Get aggregated findings data - `/api/v1/overviews/findings`
- Get findings data by severity - `/api/v1/overviews/findings_severity`
- Get aggregated provider data - `/api/v1/overviews/providers`
- Get findings data by service - `/api/v1/overviews/services`
**Compliance Management:**
- List compliance overviews (optionally filter by scan) - `/api/v1/compliance-overviews`
- Retrieve data from a specific compliance overview - `/api/v1/compliance-overviews/{id}`
#### Excluded API Endpoints
Not all Prowler API endpoints are integrated with Lighthouse AI. They are intentionally excluded for the following reasons:
- OpenAI/other LLM providers shouldn't have access to sensitive data (like fetching provider secrets and other sensitive config)
- Users queries don't need responses from those API endpoints (ex: tasks, tenant details, downloading zip file, etc.)
**Excluded Endpoints:**
**User Management:**
- List specific users information - `/api/v1/users/{id}`
- List user memberships - `/api/v1/users/{user_pk}/memberships`
- Retrieve membership data from the user - `/api/v1/users/{user_pk}/memberships/{id}`
**Tenant Management:**
- List all tenants - `/api/v1/tenants`
- Retrieve data from a tenant - `/api/v1/tenants/{id}`
- List tenant memberships - `/api/v1/tenants/{tenant_pk}/memberships`
- List all invitations - `/api/v1/tenants/invitations`
- Retrieve data from tenant invitation - `/api/v1/tenants/invitations/{id}`
**Security and Configuration:**
- List all secrets - `/api/v1/providers/secrets`
- Retrieve data from a secret - `/api/v1/providers/secrets/{id}`
- List all provider groups - `/api/v1/provider-groups`
- Retrieve data from a provider group - `/api/v1/provider-groups/{id}`
**Reports and Tasks:**
- Download zip report - `/api/v1/scans/{v1}/report`
- List all tasks - `/api/v1/tasks`
- Retrieve data from a specific task - `/api/v1/tasks/{id}`
**Lighthouse AI Configuration:**
- List OpenAI configuration - `/api/v1/lighthouse-config`
- Retrieve OpenAI key and configuration - `/api/v1/lighthouse-config/{id}`
<Note>
Agents only have access to hit GET endpoints. They don't have access to other HTTP methods.
</Note>
## FAQs
**1. Why only OpenAI models?**
During feature development, we evaluated other LLM models.
- **Claude AI** - Claude models have [tier-based ratelimits](https://docs.anthropic.com/en/api/rate-limits#requirements-to-advance-tier). For Lighthouse AI to answer slightly complex questions, there are a handful of API calls to the LLM provider within few seconds. With Claude's tiering system, users must purchase $400 credits or convert their subscription to monthly invoicing after talking to their sales team. This pricing may not suit all Prowler users.
- **Gemini Models** - Gemini lacks a solid tool calling feature like OpenAI. It calls functions recursively until exceeding limits. Gemini-2.5-Pro-Experimental is better than previous models regarding tool calling and responding, but it's still experimental.
- **Deepseek V3** - Doesn't support system prompt messages.
**2. Why a multi-agent supervisor model?**
Context windows are limited. While demo data fits inside the context window, querying real-world data often exceeds it. A multi-agent architecture is used so different agents fetch different sizes of data and respond with the minimum required data to the supervisor. This spreads the context window usage across agents.
**3. Is my security data shared with OpenAI?**
Minimal data is shared to generate useful responses. Agents can access security findings and remediation details when needed. Provider secrets are protected by design and cannot be read. The OpenAI key configured with Lighthouse AI is only accessible to our NextJS server and is never sent to LLMs. Resource metadata (names, tags, account/project IDs, etc) may be shared with OpenAI based on your query requirements.
**4. Can the Lighthouse AI change my cloud environment?**
No. The agent doesn't have the tools to make the changes, even if the configured cloud provider API keys contain permissions to modify resources.
@@ -1,4 +1,4 @@
# This file is automatically @generated by Poetry 2.2.1 and should not be changed by hand.
# This file is automatically @generated by Poetry 2.1.4 and should not be changed by hand.
[[package]]
name = "about-time"
@@ -938,34 +938,34 @@ files = [
[[package]]
name = "boto3"
version = "1.39.15"
version = "1.40.61"
description = "The AWS SDK for Python"
optional = false
python-versions = ">=3.9"
groups = ["main", "dev"]
files = [
{file = "boto3-1.39.15-py3-none-any.whl", hash = "sha256:38fc54576b925af0075636752de9974e172c8a2cf7133400e3e09b150d20fb6a"},
{file = "boto3-1.39.15.tar.gz", hash = "sha256:b4483625f0d8c35045254dee46cd3c851bbc0450814f20b9b25bee1b5c0d8409"},
{file = "boto3-1.40.61-py3-none-any.whl", hash = "sha256:6b9c57b2a922b5d8c17766e29ed792586a818098efe84def27c8f582b33f898c"},
{file = "boto3-1.40.61.tar.gz", hash = "sha256:d6c56277251adf6c2bdd25249feae625abe4966831676689ff23b4694dea5b12"},
]
[package.dependencies]
botocore = ">=1.39.15,<1.40.0"
botocore = ">=1.40.61,<1.41.0"
jmespath = ">=0.7.1,<2.0.0"
s3transfer = ">=0.13.0,<0.14.0"
s3transfer = ">=0.14.0,<0.15.0"
[package.extras]
crt = ["botocore[crt] (>=1.21.0,<2.0a0)"]
[[package]]
name = "botocore"
version = "1.39.15"
version = "1.40.61"
description = "Low-level, data-driven core of boto 3."
optional = false
python-versions = ">=3.9"
groups = ["main", "dev"]
files = [
{file = "botocore-1.39.15-py3-none-any.whl", hash = "sha256:eb9cfe918ebfbfb8654e1b153b29f0c129d586d2c0d7fb4032731d49baf04cff"},
{file = "botocore-1.39.15.tar.gz", hash = "sha256:2aa29a717f14f8c7ca058c2e297aaed0aa10ecea24b91514eee802814d1b7600"},
{file = "botocore-1.40.61-py3-none-any.whl", hash = "sha256:17ebae412692fd4824f99cde0f08d50126dc97954008e5ba2b522eb049238aa7"},
{file = "botocore-1.40.61.tar.gz", hash = "sha256:a2487ad69b090f9cccd64cf07c7021cd80ee9c0655ad974f87045b02f3ef52cd"},
]
[package.dependencies]
@@ -977,7 +977,7 @@ urllib3 = [
]
[package.extras]
crt = ["awscrt (==0.23.8)"]
crt = ["awscrt (==0.27.6)"]
[[package]]
name = "cachetools"
@@ -2366,6 +2366,8 @@ python-versions = "*"
groups = ["dev"]
files = [
{file = "jsonpath-ng-1.7.0.tar.gz", hash = "sha256:f6f5f7fd4e5ff79c785f1573b394043b39849fb2bb47bcead935d12b00beab3c"},
{file = "jsonpath_ng-1.7.0-py2-none-any.whl", hash = "sha256:898c93fc173f0c336784a3fa63d7434297544b7198124a68f9a3ef9597b0ae6e"},
{file = "jsonpath_ng-1.7.0-py3-none-any.whl", hash = "sha256:f3d7f9e848cba1b6da28c55b1c26ff915dc9e0b1ba7e752a53d6da8d5cbd00b6"},
]
[package.dependencies]
@@ -4884,6 +4886,7 @@ files = [
{file = "ruamel.yaml.clib-0.2.12-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:f66efbc1caa63c088dead1c4170d148eabc9b80d95fb75b6c92ac0aad2437d76"},
{file = "ruamel.yaml.clib-0.2.12-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:22353049ba4181685023b25b5b51a574bce33e7f51c759371a7422dcae5402a6"},
{file = "ruamel.yaml.clib-0.2.12-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:932205970b9f9991b34f55136be327501903f7c66830e9760a8ffb15b07f05cd"},
{file = "ruamel.yaml.clib-0.2.12-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:a52d48f4e7bf9005e8f0a89209bf9a73f7190ddf0489eee5eb51377385f59f2a"},
{file = "ruamel.yaml.clib-0.2.12-cp310-cp310-win32.whl", hash = "sha256:3eac5a91891ceb88138c113f9db04f3cebdae277f5d44eaa3651a4f573e6a5da"},
{file = "ruamel.yaml.clib-0.2.12-cp310-cp310-win_amd64.whl", hash = "sha256:ab007f2f5a87bd08ab1499bdf96f3d5c6ad4dcfa364884cb4549aa0154b13a28"},
{file = "ruamel.yaml.clib-0.2.12-cp311-cp311-macosx_13_0_arm64.whl", hash = "sha256:4a6679521a58256a90b0d89e03992c15144c5f3858f40d7c18886023d7943db6"},
@@ -4892,6 +4895,7 @@ files = [
{file = "ruamel.yaml.clib-0.2.12-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:811ea1594b8a0fb466172c384267a4e5e367298af6b228931f273b111f17ef52"},
{file = "ruamel.yaml.clib-0.2.12-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:cf12567a7b565cbf65d438dec6cfbe2917d3c1bdddfce84a9930b7d35ea59642"},
{file = "ruamel.yaml.clib-0.2.12-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:7dd5adc8b930b12c8fc5b99e2d535a09889941aa0d0bd06f4749e9a9397c71d2"},
{file = "ruamel.yaml.clib-0.2.12-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:1492a6051dab8d912fc2adeef0e8c72216b24d57bd896ea607cb90bb0c4981d3"},
{file = "ruamel.yaml.clib-0.2.12-cp311-cp311-win32.whl", hash = "sha256:bd0a08f0bab19093c54e18a14a10b4322e1eacc5217056f3c063bd2f59853ce4"},
{file = "ruamel.yaml.clib-0.2.12-cp311-cp311-win_amd64.whl", hash = "sha256:a274fb2cb086c7a3dea4322ec27f4cb5cc4b6298adb583ab0e211a4682f241eb"},
{file = "ruamel.yaml.clib-0.2.12-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:20b0f8dc160ba83b6dcc0e256846e1a02d044e13f7ea74a3d1d56ede4e48c632"},
@@ -4900,6 +4904,7 @@ files = [
{file = "ruamel.yaml.clib-0.2.12-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:749c16fcc4a2b09f28843cda5a193e0283e47454b63ec4b81eaa2242f50e4ccd"},
{file = "ruamel.yaml.clib-0.2.12-cp312-cp312-musllinux_1_1_i686.whl", hash = "sha256:bf165fef1f223beae7333275156ab2022cffe255dcc51c27f066b4370da81e31"},
{file = "ruamel.yaml.clib-0.2.12-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:32621c177bbf782ca5a18ba4d7af0f1082a3f6e517ac2a18b3974d4edf349680"},
{file = "ruamel.yaml.clib-0.2.12-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:b82a7c94a498853aa0b272fd5bc67f29008da798d4f93a2f9f289feb8426a58d"},
{file = "ruamel.yaml.clib-0.2.12-cp312-cp312-win32.whl", hash = "sha256:e8c4ebfcfd57177b572e2040777b8abc537cdef58a2120e830124946aa9b42c5"},
{file = "ruamel.yaml.clib-0.2.12-cp312-cp312-win_amd64.whl", hash = "sha256:0467c5965282c62203273b838ae77c0d29d7638c8a4e3a1c8bdd3602c10904e4"},
{file = "ruamel.yaml.clib-0.2.12-cp313-cp313-macosx_14_0_arm64.whl", hash = "sha256:4c8c5d82f50bb53986a5e02d1b3092b03622c02c2eb78e29bec33fd9593bae1a"},
@@ -4908,6 +4913,7 @@ files = [
{file = "ruamel.yaml.clib-0.2.12-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:96777d473c05ee3e5e3c3e999f5d23c6f4ec5b0c38c098b3a5229085f74236c6"},
{file = "ruamel.yaml.clib-0.2.12-cp313-cp313-musllinux_1_1_i686.whl", hash = "sha256:3bc2a80e6420ca8b7d3590791e2dfc709c88ab9152c00eeb511c9875ce5778bf"},
{file = "ruamel.yaml.clib-0.2.12-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:e188d2699864c11c36cdfdada94d781fd5d6b0071cd9c427bceb08ad3d7c70e1"},
{file = "ruamel.yaml.clib-0.2.12-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:4f6f3eac23941b32afccc23081e1f50612bdbe4e982012ef4f5797986828cd01"},
{file = "ruamel.yaml.clib-0.2.12-cp313-cp313-win32.whl", hash = "sha256:6442cb36270b3afb1b4951f060eccca1ce49f3d087ca1ca4563a6eb479cb3de6"},
{file = "ruamel.yaml.clib-0.2.12-cp313-cp313-win_amd64.whl", hash = "sha256:e5b8daf27af0b90da7bb903a876477a9e6d7270be6146906b276605997c7e9a3"},
{file = "ruamel.yaml.clib-0.2.12-cp39-cp39-macosx_12_0_arm64.whl", hash = "sha256:fc4b630cd3fa2cf7fce38afa91d7cfe844a9f75d7f0f36393fa98815e911d987"},
@@ -4916,6 +4922,7 @@ files = [
{file = "ruamel.yaml.clib-0.2.12-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:e2f1c3765db32be59d18ab3953f43ab62a761327aafc1594a2a1fbe038b8b8a7"},
{file = "ruamel.yaml.clib-0.2.12-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:d85252669dc32f98ebcd5d36768f5d4faeaeaa2d655ac0473be490ecdae3c285"},
{file = "ruamel.yaml.clib-0.2.12-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:e143ada795c341b56de9418c58d028989093ee611aa27ffb9b7f609c00d813ed"},
{file = "ruamel.yaml.clib-0.2.12-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:2c59aa6170b990d8d2719323e628aaf36f3bfbc1c26279c0eeeb24d05d2d11c7"},
{file = "ruamel.yaml.clib-0.2.12-cp39-cp39-win32.whl", hash = "sha256:beffaed67936fbbeffd10966a4eb53c402fafd3d6833770516bf7314bc6ffa12"},
{file = "ruamel.yaml.clib-0.2.12-cp39-cp39-win_amd64.whl", hash = "sha256:040ae85536960525ea62868b642bdb0c2cc6021c9f9d507810c0c604e66f5a7b"},
{file = "ruamel.yaml.clib-0.2.12.tar.gz", hash = "sha256:6c8fbb13ec503f99a91901ab46e0b07ae7941cd527393187039aec586fdfd36f"},
@@ -4923,14 +4930,14 @@ files = [
[[package]]
name = "s3transfer"
version = "0.13.1"
version = "0.14.0"
description = "An Amazon S3 Transfer Manager"
optional = false
python-versions = ">=3.9"
groups = ["main", "dev"]
files = [
{file = "s3transfer-0.13.1-py3-none-any.whl", hash = "sha256:a981aa7429be23fe6dfc13e80e4020057cbab622b08c0315288758d67cabc724"},
{file = "s3transfer-0.13.1.tar.gz", hash = "sha256:c3fdba22ba1bd367922f27ec8032d6a1cf5f10c934fb5d68cf60fd5a23d936cf"},
{file = "s3transfer-0.14.0-py3-none-any.whl", hash = "sha256:ea3b790c7077558ed1f02a3072fb3cb992bbbd253392f4b6e9e8976941c7d456"},
{file = "s3transfer-0.14.0.tar.gz", hash = "sha256:eff12264e7c8b4985074ccce27a3b38a485bb7f7422cc8046fee9be4983e4125"},
]
[package.dependencies]
@@ -5075,18 +5082,18 @@ files = [
[[package]]
name = "slack-sdk"
version = "3.34.0"
version = "3.39.0"
description = "The Slack API Platform SDK for Python"
optional = false
python-versions = ">=3.6"
python-versions = ">=3.7"
groups = ["main"]
files = [
{file = "slack_sdk-3.34.0-py2.py3-none-any.whl", hash = "sha256:c61f57f310d85be83466db5a98ab6ae3bb2e5587437b54fa0daa8fae6a0feffa"},
{file = "slack_sdk-3.34.0.tar.gz", hash = "sha256:ff61db7012160eed742285ea91f11c72b7a38a6500a7f6c5335662b4bc6b853d"},
{file = "slack_sdk-3.39.0-py2.py3-none-any.whl", hash = "sha256:b1556b2f5b8b12b94e5ea3f56c4f2c7f04462e4e1013d325c5764ff118044fa8"},
{file = "slack_sdk-3.39.0.tar.gz", hash = "sha256:6a56be10dc155c436ff658c6b776e1c082e29eae6a771fccf8b0a235822bbcb1"},
]
[package.extras]
optional = ["SQLAlchemy (>=1.4,<3)", "aiodns (>1.0)", "aiohttp (>=3.7.3,<4)", "boto3 (<=2)", "websocket-client (>=1,<2)", "websockets (>=9.1,<15)"]
optional = ["SQLAlchemy (>=1.4,<3)", "aiodns (>1.0)", "aiohttp (>=3.7.3,<4)", "boto3 (<=2)", "websocket-client (>=1,<2)", "websockets (>=9.1,<16)"]
[[package]]
name = "sniffio"
@@ -5688,4 +5695,4 @@ type = ["pytest-mypy"]
[metadata]
lock-version = "2.1"
python-versions = ">3.9.1,<3.13"
content-hash = "a367e65bc43c0a16495a3d0f6eab8b356cc49b509e329b61c6641cd87f374ff4"
content-hash = "82015f7b4b08e419ac5d28eab1a2d4b563b1980c84679e020ed3d42d3b4e9b85"
@@ -2,6 +2,25 @@
All notable changes to the **Prowler SDK** are documented in this file.
## [v5.15.0] (Prowler UNRELEASED)
### Added
- `cloudstorage_uses_vpc_service_controls` check for GCP provider [(#9256)](https://github.com/prowler-cloud/prowler/pull/9256)
- `repository_immutable_releases_enabled` check for GitHub provider [(#9162)](https://github.com/prowler-cloud/prowler/pull/9162)
- `compute_instance_preemptible_vm_disabled` check for GCP provider [(#9342)](https://github.com/prowler-cloud/prowler/pull/9342)
- `compute_instance_automatic_restart_enabled` check for GCP provider [(#9271)](https://github.com/prowler-cloud/prowler/pull/9271)
---
## [v5.14.1] (Prowler UNRELEASED)
### Fixed
- `sharepoint_external_sharing_managed` check to handle external sharing disabled at organization level [(#9298)](https://github.com/prowler-cloud/prowler/pull/9298)
- Custom check folder metadata validation [(#9335)](https://github.com/prowler-cloud/prowler/pull/9335)
- Support multiple Exchange mailbox policies in M365 `exchange_mailbox_policy_additional_storage_restricted` check [(#9241)](https://github.com/prowler-cloud/prowler/pull/9241)
---
## [v5.14.0] (Prowler v5.14.0)
### Added
@@ -24,6 +24,7 @@ from prowler.lib.check.check import (
list_checks_json,
list_fixers,
list_services,
load_custom_checks_metadata,
parse_checks_from_file,
parse_checks_from_folder,
print_categories,
@@ -185,6 +186,11 @@ def prowler():
logger.debug("Loading checks metadata from .metadata.json files")
bulk_checks_metadata = CheckMetadata.get_bulk(provider)
# Load custom checks metadata before validation
if checks_folder:
custom_folder_metadata = load_custom_checks_metadata(checks_folder)
bulk_checks_metadata.update(custom_folder_metadata)
if args.list_categories:
print_categories(list_categories(bulk_checks_metadata))
sys.exit()
@@ -38,7 +38,7 @@ class _MutableTimestamp:
timestamp = _MutableTimestamp(datetime.today())
timestamp_utc = _MutableTimestamp(datetime.now(timezone.utc))
prowler_version = "5.14.0"
prowler_version = "5.15.0"
html_logo_url = "https://github.com/prowler-cloud/prowler/"
square_logo_img = "https://raw.githubusercontent.com/prowler-cloud/prowler/dc7d2d5aeb92fdf12e8604f42ef6472cd3e8e889/docs/img/prowler-logo-black.png"
aws_logo = "https://user-images.githubusercontent.com/38561120/235953920-3e3fba08-0795-41dc-b480-9bea57db9f2e.png"
@@ -14,7 +14,7 @@ from colorama import Fore, Style
import prowler
from prowler.config.config import orange_color
from prowler.lib.check.custom_checks_metadata import update_check_metadata
from prowler.lib.check.models import Check
from prowler.lib.check.models import Check, load_check_metadata
from prowler.lib.check.utils import recover_checks_from_provider
from prowler.lib.logger import logger
from prowler.lib.outputs.outputs import report
@@ -110,6 +110,48 @@ def parse_checks_from_folder(provider, input_folder: str) -> set:
sys.exit(1)
def load_custom_checks_metadata(input_folder: str) -> dict:
"""
Load check metadata from a custom checks folder without copying the checks.
This is used to validate check names before the provider is initialized.
Args:
input_folder (str): Path to the folder containing custom checks.
Returns:
dict: A dictionary with CheckID as key and CheckMetadata as value.
"""
custom_checks_metadata = {}
try:
if not os.path.isdir(input_folder):
return custom_checks_metadata
with os.scandir(input_folder) as checks:
for check in checks:
if check.is_dir():
check_name = check.name
metadata_file = os.path.join(
input_folder, check_name, f"{check_name}.metadata.json"
)
if os.path.isfile(metadata_file):
try:
check_metadata = load_check_metadata(metadata_file)
custom_checks_metadata[check_metadata.CheckID] = (
check_metadata
)
except Exception as error:
logger.warning(
f"Could not load metadata from {metadata_file}: {error}"
)
return custom_checks_metadata
except Exception as error:
logger.error(
f"{error.__class__.__name__}[{error.__traceback__.tb_lineno}] -- {error}"
)
return custom_checks_metadata
# Load checks from custom folder
def remove_custom_checks_module(input_folder: str, provider: str):
# Check if input folder is a S3 URI
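For context, a minimal usage sketch of the new helper (the `my_checks` folder and its contents are illustrative; the import path matches the one added to `__main__.py` above):

from prowler.lib.check.check import load_custom_checks_metadata

# Expected layout (illustrative): my_checks/<check_id>/<check_id>.metadata.json
# Only the .metadata.json files are read; the check code is neither imported
# nor copied, and a missing folder simply yields an empty dict.
custom_metadata = load_custom_checks_metadata("my_checks")
for check_id, metadata in custom_metadata.items():
    print(check_id, metadata.Severity)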
@@ -761,11 +761,15 @@
"ap-south-1",
"ap-southeast-1",
"ap-southeast-2",
"ap-southeast-5",
"ca-central-1",
"eu-central-1",
"eu-south-1",
"eu-south-2",
"eu-west-1",
"eu-west-2",
"eu-west-3",
"il-central-1",
"sa-east-1",
"us-east-1",
"us-east-2",
@@ -1379,6 +1383,7 @@
"bedrock": {
"regions": {
"aws": [
"af-south-1",
"ap-east-2",
"ap-northeast-1",
"ap-northeast-2",
@@ -1392,6 +1397,7 @@
"ap-southeast-5",
"ap-southeast-7",
"ca-central-1",
"ca-west-1",
"eu-central-1",
"eu-central-2",
"eu-north-1",
@@ -1402,6 +1408,8 @@
"eu-west-3",
"il-central-1",
"me-central-1",
"me-south-1",
"mx-central-1",
"sa-east-1",
"us-east-1",
"us-east-2",
@@ -1418,6 +1426,7 @@
"bedrock-agent": {
"regions": {
"aws": [
"af-south-1",
"ap-east-2",
"ap-northeast-1",
"ap-northeast-2",
@@ -1431,6 +1440,7 @@
"ap-southeast-5",
"ap-southeast-7",
"ca-central-1",
"ca-west-1",
"eu-central-1",
"eu-central-2",
"eu-north-1",
@@ -1441,6 +1451,8 @@
"eu-west-3",
"il-central-1",
"me-central-1",
"me-south-1",
"mx-central-1",
"sa-east-1",
"us-east-1",
"us-east-2",
@@ -3595,6 +3607,7 @@
"ap-southeast-3",
"ap-southeast-4",
"ap-southeast-5",
"ap-southeast-6",
"ap-southeast-7",
"ca-central-1",
"ca-west-1",
@@ -5551,6 +5564,7 @@
"ap-southeast-3",
"ap-southeast-4",
"ap-southeast-5",
"ap-southeast-6",
"ap-southeast-7",
"ca-central-1",
"ca-west-1",
@@ -8459,6 +8473,7 @@
"ap-southeast-3",
"ap-southeast-4",
"ap-southeast-5",
"ap-southeast-6",
"ap-southeast-7",
"ca-central-1",
"ca-west-1",
@@ -10696,7 +10711,9 @@
"ap-southeast-1",
"ap-southeast-2",
"ap-southeast-3",
"ap-southeast-5",
"ca-central-1",
"ca-west-1",
"eu-central-1",
"eu-central-2",
"eu-north-1",
@@ -10732,7 +10749,9 @@
"ap-southeast-1",
"ap-southeast-2",
"ap-southeast-3",
"ap-southeast-5",
"ca-central-1",
"ca-west-1",
"eu-central-1",
"eu-central-2",
"eu-north-1",
@@ -11239,6 +11258,7 @@
"aws": [
"af-south-1",
"ap-east-1",
"ap-east-2",
"ap-northeast-1",
"ap-northeast-2",
"ap-northeast-3",
@@ -0,0 +1,6 @@
from prowler.providers.common.provider import Provider
from prowler.providers.gcp.services.accesscontextmanager.accesscontextmanager_service import (
AccessContextManager,
)
accesscontextmanager_client = AccessContextManager(Provider.get_global_provider())
@@ -0,0 +1,101 @@
from pydantic.v1 import BaseModel
import prowler.providers.gcp.config as config
from prowler.lib.logger import logger
from prowler.providers.gcp.gcp_provider import GcpProvider
from prowler.providers.gcp.lib.service.service import GCPService
from prowler.providers.gcp.services.cloudresourcemanager.cloudresourcemanager_client import (
cloudresourcemanager_client,
)
class AccessContextManager(GCPService):
def __init__(self, provider: GcpProvider):
super().__init__("accesscontextmanager", provider, api_version="v1")
self.service_perimeters = []
self._get_service_perimeters()
def _get_service_perimeters(self):
for org in cloudresourcemanager_client.organizations:
try:
access_policies = []
try:
request = self.client.accessPolicies().list(
parent=f"organizations/{org.id}"
)
while request is not None:
response = request.execute(
num_retries=config.DEFAULT_RETRY_ATTEMPTS
)
access_policies.extend(response.get("accessPolicies", []))
request = self.client.accessPolicies().list_next(
previous_request=request, previous_response=response
)
except Exception as error:
logger.error(
f"{self.region} -- {error.__class__.__name__}[{error.__traceback__.tb_lineno}]: {error}"
)
continue
for policy in access_policies:
try:
request = (
self.client.accessPolicies()
.servicePerimeters()
.list(parent=policy["name"])
)
while request is not None:
response = request.execute(
num_retries=config.DEFAULT_RETRY_ATTEMPTS
)
for perimeter in response.get("servicePerimeters", []):
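# "status" holds a perimeter's enforced configuration, while "spec"
# holds its dry-run configuration; fall back to the dry-run spec when
# nothing is enforced yet.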
status = perimeter.get("status", {})
spec = perimeter.get("spec", {})
perimeter_config = status if status else spec
resources = perimeter_config.get("resources", [])
restricted_services = perimeter_config.get(
"restrictedServices", []
)
self.service_perimeters.append(
ServicePerimeter(
name=perimeter["name"],
title=perimeter.get("title", ""),
perimeter_type=perimeter.get(
"perimeterType", ""
),
resources=resources,
restricted_services=restricted_services,
policy_name=policy["name"],
)
)
request = (
self.client.accessPolicies()
.servicePerimeters()
.list_next(
previous_request=request, previous_response=response
)
)
except Exception as error:
logger.error(
f"{self.region} -- {error.__class__.__name__}[{error.__traceback__.tb_lineno}]: {error}"
)
except Exception as error:
logger.error(
f"{self.region} -- {error.__class__.__name__}[{error.__traceback__.tb_lineno}]: {error}"
)
class ServicePerimeter(BaseModel):
name: str
title: str
perimeter_type: str
resources: list[str]
restricted_services: list[str]
policy_name: str
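A minimal sketch of how checks consume this service via the module-level singleton defined in accesscontextmanager_client.py above (the filtering shown is illustrative):

from prowler.providers.gcp.services.accesscontextmanager.accesscontextmanager_client import (
    accesscontextmanager_client,
)

# Print every perimeter that restricts Cloud Storage together with the
# projects it protects (resources are "projects/<number>" strings).
for perimeter in accesscontextmanager_client.service_perimeters:
    if "storage.googleapis.com" in perimeter.restricted_services:
        print(perimeter.title, perimeter.resources)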
@@ -19,6 +19,14 @@ class CloudResourceManager(GCPService):
def _get_iam_policy(self):
for project_id in self.project_ids:
try:
# Get project details to obtain project number
project_details = (
self.client.projects()
.get(projectId=project_id)
.execute(num_retries=DEFAULT_RETRY_ATTEMPTS)
)
project_number = project_details.get("projectNumber", "")
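# The project number is needed because VPC Service Controls perimeters
# reference protected projects as "projects/<number>", not by project ID
# (see the cloudstorage_uses_vpc_service_controls check below).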
policy = (
self.client.projects()
.getIamPolicy(resource=project_id)
@@ -41,6 +49,7 @@ class CloudResourceManager(GCPService):
self.cloud_resource_manager_projects.append(
Project(
id=project_id,
number=project_number,
audit_logging=audit_logging,
audit_configs=audit_configs,
)
@@ -96,6 +105,7 @@ class Binding(BaseModel):
class Project(BaseModel):
id: str
number: str = ""
audit_logging: bool
audit_configs: list[AuditConfig] = []
@@ -0,0 +1,36 @@
{
"Provider": "gcp",
"CheckID": "cloudstorage_uses_vpc_service_controls",
"CheckTitle": "Cloud Storage services are protected by VPC Service Controls",
"CheckType": [],
"ServiceName": "cloudstorage",
"SubServiceName": "",
"ResourceIdTemplate": "",
"Severity": "medium",
"ResourceType": "cloudresourcemanager.googleapis.com/Project",
"Description": "**GCP Projects** are evaluated to ensure they have **VPC Service Controls** enabled for Cloud Storage. VPC Service Controls establish security boundaries by restricting access to Cloud Storage resources to specific networks and trusted clients, preventing unauthorized data access and exfiltration.",
"Risk": "Projects without VPC Service Controls protection for Cloud Storage may be vulnerable to unauthorized data access and exfiltration, even with proper IAM policies in place. VPC Service Controls provide an additional layer of network-level security that restricts API access based on the context of the request.",
"RelatedUrl": "",
"AdditionalURLs": [
"https://www.trendmicro.com/cloudoneconformity/knowledge-base/gcp/CloudStorage/use-vpc-service-controls.html",
"https://cloud.google.com/vpc-service-controls/docs/create-service-perimeters"
],
"Remediation": {
"Code": {
"CLI": "",
"NativeIaC": "",
"Other": "1) Open Google Cloud Console → Security → VPC Service Controls\n2) Create a new service perimeter or select an existing one\n3) Add the relevant GCP projects to the perimeter's protected resources\n4) Add 'storage.googleapis.com' to the list of restricted services\n5) Configure appropriate ingress and egress rules\n6) Save the perimeter configuration",
"Terraform": ""
},
"Recommendation": {
"Text": "Enable VPC Service Controls for all Cloud Storage buckets by adding their projects to a service perimeter with storage.googleapis.com as a restricted service. This prevents data exfiltration and ensures API calls are only allowed from authorized networks.",
"Url": "https://hub.prowler.com/check/cloudstorage_uses_vpc_service_controls"
}
},
"Categories": [
"internet-exposed"
],
"DependsOn": [],
"RelatedTo": [],
"Notes": ""
}
@@ -0,0 +1,53 @@
from prowler.lib.check.models import Check, Check_Report_GCP
from prowler.providers.gcp.services.accesscontextmanager.accesscontextmanager_client import (
accesscontextmanager_client,
)
from prowler.providers.gcp.services.cloudresourcemanager.cloudresourcemanager_client import (
cloudresourcemanager_client,
)
class cloudstorage_uses_vpc_service_controls(Check):
"""
Ensure Cloud Storage is protected by VPC Service Controls at the project level.
Reports PASS if a project is in a VPC Service Controls perimeter
with storage.googleapis.com as a restricted service, otherwise FAIL.
"""
def execute(self) -> list[Check_Report_GCP]:
findings = []
protected_projects = {}
for perimeter in accesscontextmanager_client.service_perimeters:
if any(
service == "storage.googleapis.com"
for service in perimeter.restricted_services
):
for resource in perimeter.resources:
protected_projects[resource] = perimeter.title
for project in cloudresourcemanager_client.cloud_resource_manager_projects:
report = Check_Report_GCP(
metadata=self.metadata(),
resource=cloudresourcemanager_client.projects[project.id],
project_id=project.id,
location=cloudresourcemanager_client.region,
resource_name=(
cloudresourcemanager_client.projects[project.id].name
if cloudresourcemanager_client.projects[project.id].name
else "GCP Project"
),
)
report.status = "FAIL"
report.status_extended = f"Project {project.id} does not have VPC Service Controls enabled for Cloud Storage."
# GCP stores resources by project number, not project ID
project_resource_id = f"projects/{project.number}"
if project_resource_id in protected_projects:
report.status = "PASS"
report.status_extended = f"Project {project.id} has VPC Service Controls enabled for Cloud Storage in perimeter {protected_projects[project_resource_id]}."
findings.append(report)
return findings
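A worked example of the project-number keying (illustrative values; the number mirrors the mocked projectNumber used in this PR's fixtures):

# Illustrative values: perimeters list resources by project *number*.
protected_projects = {"projects/123456789012": "Example Perimeter"}
project_number = "123456789012"  # Project.number, not Project.id
assert f"projects/{project_number}" in protected_projects  # -> PASS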
@@ -0,0 +1,36 @@
{
"Provider": "gcp",
"CheckID": "compute_instance_automatic_restart_enabled",
"CheckTitle": "Compute Engine VM instances have Automatic Restart enabled",
"CheckType": [],
"ServiceName": "compute",
"SubServiceName": "",
"ResourceIdTemplate": "",
"Severity": "medium",
"ResourceType": "compute.googleapis.com/Instance",
"Description": "**Google Compute Engine virtual machine instances** are evaluated to ensure that **Automatic Restart** is enabled. This feature allows the Google Cloud Compute Engine service to automatically restart VM instances when they are terminated due to non-user-initiated reasons such as maintenance events, hardware failures, or software failures.",
"Risk": "VM instances without Automatic Restart enabled will not recover automatically from host maintenance events or unexpected failures, potentially leading to prolonged service downtime and requiring manual intervention to restore services.",
"RelatedUrl": "",
"AdditionalURLs": [
"https://www.trendmicro.com/cloudoneconformity/knowledge-base/gcp/ComputeEngine/enable-automatic-restart.html",
"https://cloud.google.com/compute/docs/instances/setting-instance-scheduling-options"
],
"Remediation": {
"Code": {
"CLI": "gcloud compute instances update <INSTANCE_NAME> --restart-on-failure --zone=<ZONE>",
"NativeIaC": "",
"Other": "1) Open Google Cloud Console → Compute Engine → VM instances\n2) Click on the instance name to view details\n3) Click 'Edit' at the top of the page\n4) Under 'Availability policies', set 'Automatic restart' to 'On (recommended)'\n5) Click 'Save' at the bottom of the page",
"Terraform": "```hcl\n# Example: enable Automatic Restart for a Compute Engine VM instance\nresource \"google_compute_instance\" \"example\" {\n name = var.instance_name\n machine_type = var.machine_type\n zone = var.zone\n\n scheduling {\n automatic_restart = true\n on_host_maintenance = \"MIGRATE\"\n }\n}\n```"
},
"Recommendation": {
"Text": "Enable the Automatic Restart feature for Compute Engine VM instances to enhance system reliability by automatically recovering from crashes or system-initiated terminations. This setting does not interfere with user-initiated shutdowns or stops.",
"Url": "https://hub.prowler.com/check/compute_instance_automatic_restart_enabled"
}
},
"Categories": [
"resilience"
],
"DependsOn": [],
"RelatedTo": [],
"Notes": "VM instances missing the 'scheduling.automaticRestart' field are treated as having Automatic Restart enabled (defaults to true). Preemptible instances and instances with provisioning model set to SPOT are automatically marked as PASS, as they cannot have Automatic Restart enabled by design."
}
@@ -0,0 +1,35 @@
from prowler.lib.check.models import Check, Check_Report_GCP
from prowler.providers.gcp.services.compute.compute_client import compute_client
class compute_instance_automatic_restart_enabled(Check):
"""
Ensure Compute Engine VM instances have Automatic Restart enabled.
Reports PASS if a VM instance has automatic restart enabled, otherwise FAIL.
"""
def execute(self) -> list[Check_Report_GCP]:
findings = []
for instance in compute_client.instances:
report = Check_Report_GCP(metadata=self.metadata(), resource=instance)
# Preemptible and Spot VMs cannot have automatic restart enabled
if instance.preemptible or instance.provisioning_model == "SPOT":
report.status = "FAIL"
report.status_extended = (
f"VM Instance {instance.name} is a Preemptible or Spot instance, "
"which cannot have Automatic Restart enabled by design."
)
elif instance.automatic_restart:
report.status = "PASS"
report.status_extended = (
f"VM Instance {instance.name} has Automatic Restart enabled."
)
else:
report.status = "FAIL"
report.status_extended = f"VM Instance {instance.name} does not have Automatic Restart enabled."
findings.append(report)
return findings
@@ -0,0 +1,37 @@
{
"Provider": "gcp",
"CheckID": "compute_instance_preemptible_vm_disabled",
"CheckTitle": "VM instance is not configured as preemptible or Spot VM",
"CheckType": [],
"ServiceName": "compute",
"SubServiceName": "",
"ResourceIdTemplate": "",
"Severity": "medium",
"ResourceType": "compute.googleapis.com/Instance",
"Description": "This check verifies that VM instances are not configured as **preemptible** or **Spot VMs**.\n\nBoth preemptible and Spot VMs can be terminated by Google at any time when resources are needed elsewhere, making them unsuitable for production and business-critical workloads. Spot VMs are the newer version of preemptible VMs and are Google's recommended approach for interruptible workloads.",
"Risk": "Preemptible and Spot VMs may be **terminated at any time** by Google Cloud, causing:\n\n- **Service disruptions** for production workloads\n- **Data loss** if workloads are not fault-tolerant\n- **Availability issues** for business-critical applications\n\nThey are designed for batch jobs and fault-tolerant workloads only.",
"RelatedUrl": "",
"AdditionalURLs": [
"https://cloud.google.com/compute/docs/instances/preemptible",
"https://cloud.google.com/compute/docs/instances/spot",
"https://www.trendmicro.com/cloudoneconformity/knowledge-base/gcp/ComputeEngine/disable-preemptibility.html"
],
"Remediation": {
"Code": {
"CLI": "",
"NativeIaC": "",
"Other": "1. Go to Compute Engine console\n2. Select the preemptible or Spot VM instance\n3. Create a machine image from the instance\n4. Create a new instance from the machine image\n5. During creation, set **VM provisioning model** to **Standard** (not Spot)\n6. Delete the original preemptible or Spot VM instance",
"Terraform": "```hcl\nresource \"google_compute_instance\" \"example_resource\" {\n name = \"example-instance\"\n machine_type = \"e2-medium\"\n zone = \"us-central1-a\"\n\n scheduling {\n # Use standard provisioning model for production workloads (not Spot)\n provisioning_model = \"STANDARD\"\n # Also ensure preemptible is false (legacy field)\n preemptible = false\n }\n}\n```"
},
"Recommendation": {
"Text": "Use standard provisioning model for production and business-critical VM instances. Preemptible and Spot VMs should only be used for fault-tolerant, batch processing, or non-critical workloads that can handle interruptions.",
"Url": "https://hub.prowler.com/checks/compute_instance_preemptible_vm_disabled"
}
},
"Categories": [
"resilience"
],
"DependsOn": [],
"RelatedTo": [],
"Notes": ""
}
@@ -0,0 +1,32 @@
from prowler.lib.check.models import Check, Check_Report_GCP
from prowler.providers.gcp.services.compute.compute_client import compute_client
class compute_instance_preemptible_vm_disabled(Check):
"""
Ensure GCP Compute Engine VM instances are not preemptible or Spot VMs.
- PASS: VM instance is not preemptible (preemptible=False) and not Spot
(provisioningModel != "SPOT").
- FAIL: VM instance is preemptible (preemptible=True) or Spot
(provisioningModel="SPOT").
"""
def execute(self) -> list[Check_Report_GCP]:
findings = []
for instance in compute_client.instances:
report = Check_Report_GCP(metadata=self.metadata(), resource=instance)
report.status = "PASS"
report.status_extended = (
f"VM Instance {instance.name} is not preemptible or Spot VM."
)
if instance.preemptible or instance.provisioning_model == "SPOT":
report.status = "FAIL"
vm_type = "preemptible" if instance.preemptible else "Spot VM"
report.status_extended = (
f"VM Instance {instance.name} is configured as {vm_type}."
)
findings.append(report)
return findings
@@ -133,7 +133,16 @@ class Compute(GCPService):
)
for disk in instance.get("disks", [])
],
automatic_restart=instance.get("scheduling", {}).get(
"automaticRestart", False
),
provisioning_model=instance.get("scheduling", {}).get(
"provisioningModel", "STANDARD"
),
project_id=project_id,
preemptible=instance.get("scheduling", {}).get(
"preemptible", False
),
)
)
@@ -365,6 +374,9 @@ class Instance(BaseModel):
service_accounts: list
ip_forward: bool
disks_encryption: list
automatic_restart: bool = False
preemptible: bool = False
provisioning_model: str = "STANDARD"
class Network(BaseModel):
@@ -0,0 +1,33 @@
{
"Provider": "github",
"CheckID": "repository_immutable_releases_enabled",
"CheckTitle": "Repository has immutable releases enabled",
"CheckType": [],
"ServiceName": "repository",
"SubServiceName": "",
"ResourceIdTemplate": "github:user-id:repository/repository-name",
"Severity": "high",
"ResourceType": "GitHubRepository",
"Description": "Immutable releases prevent modification or replacement of published artifacts after publication. When enabled, release assets and binaries become tamper-proof, ensuring artifact integrity throughout the software supply chain.",
"Risk": "If immutable releases are disabled, release assets can be tampered with after publication, allowing attackers to substitute malicious binaries and undermining supply chain integrity.",
"RelatedUrl": "https://docs.github.com/en/repositories/releasing-projects-on-github/managing-releases-in-a-repository#preventing-changes-to-releases",
"Remediation": {
"Code": {
"CLI": "",
"NativeIaC": "",
"Other": "",
"Terraform": ""
},
"Recommendation": {
"Text": "Enable immutable releases in the repository settings so release artifacts cannot be altered once published.",
"Url": "https://docs.github.com/en/code-security/supply-chain-security/understanding-your-software-supply-chain/immutable-releases"
}
},
"Categories": [
"software-supply-chain"
],
"DependsOn": [],
"RelatedTo": [],
"Notes": ""
}
@@ -0,0 +1,41 @@
from typing import List
from prowler.lib.check.models import Check, CheckReportGithub
from prowler.providers.github.services.repository.repository_client import (
repository_client,
)
class repository_immutable_releases_enabled(Check):
"""Ensure immutable releases are enabled for GitHub repositories.
Immutable releases prevent post-publication tampering of binaries and release assets.
"""
def execute(self) -> List[CheckReportGithub]:
"""Run the immutable releases verification for each discovered repository.
Returns:
List[CheckReportGithub]: Collection of check reports describing the immutable releases status.
"""
findings: List[CheckReportGithub] = []
for repo in repository_client.repositories.values():
if repo.immutable_releases_enabled is None:
continue
report = CheckReportGithub(metadata=self.metadata(), resource=repo)
if repo.immutable_releases_enabled:
report.status = "PASS"
report.status_extended = (
f"Repository {repo.name} has immutable releases enabled."
)
else:
report.status = "FAIL"
report.status_extended = (
f"Repository {repo.name} does not have immutable releases enabled."
)
findings.append(report)
return findings
@@ -341,6 +341,9 @@ class Repository(GithubService):
name=repo.name,
owner=repo.owner.login,
full_name=repo.full_name,
immutable_releases_enabled=self._get_repository_immutable_releases_status(
repo
),
default_branch=Branch(
name=default_branch,
protected=branch_protection,
@@ -370,6 +373,54 @@ class Repository(GithubService):
f"{repo.full_name}: {error.__class__.__name__}[{error.__traceback__.tb_lineno}]: {error}"
)
def _get_repository_immutable_releases_status(self, repo) -> Optional[bool]:
"""Retrieve the immutable releases status for the provided repository.
The API returns a response in the format:
{
"enabled": true,
"enforced_by_owner": false
}
Args:
repo: The PyGithub repository instance to query.
Returns:
Optional[bool]: True when immutable releases are enabled, False when they are disabled, and None when the status cannot be determined.
"""
try:
_, response = repo._requester.requestJsonAndCheck( # type: ignore[attr-defined]
"GET",
f"/repos/{repo.full_name}/immutable-releases",
headers={
"Accept": "application/vnd.github+json",
"X-GitHub-Api-Version": "2022-11-28",
},
)
if isinstance(response, dict) and "enabled" in response:
return response.get("enabled")
return None
except github.GithubException as error:
status_code = getattr(error, "status", None)
if status_code == 404:
logger.info(
f"{repo.full_name}: immutable releases endpoint not available for this repository."
)
return None
if status_code == 403:
logger.warning(
f"{repo.full_name}: insufficient permissions to query immutable releases endpoint."
)
return None
self._handle_github_api_error(
error, "fetching immutable releases status", repo.full_name
)
except Exception as error:
logger.error(
f"{repo.full_name}: {error.__class__.__name__}[{error.__traceback__.tb_lineno}]: {error}"
)
return None
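For reference, the same endpoint can be exercised outside PyGithub; a minimal sketch using the requests library (the function name and token handling are assumptions, while the URL and headers mirror the code above):

import requests

def fetch_immutable_releases_status(full_name: str, token: str):
    # Returns True/False from the "enabled" field, or None when the
    # endpoint is unavailable (404) or forbidden (403).
    response = requests.get(
        f"https://api.github.com/repos/{full_name}/immutable-releases",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "X-GitHub-Api-Version": "2022-11-28",
        },
        timeout=10,
    )
    if response.status_code == 200:
        return response.json().get("enabled")
    return None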
class Branch(BaseModel):
"""Model for Github Branch"""
@@ -396,6 +447,7 @@ class Repo(BaseModel):
name: str
owner: str
full_name: str
immutable_releases_enabled: Optional[bool] = None
default_branch: Branch
private: bool
archived: bool
@@ -13,32 +13,28 @@ class exchange_mailbox_policy_additional_storage_restricted(Check):
def execute(self) -> List[CheckReportM365]:
"""Run the check to validate Exchange mailbox policy restrictions.
Iterates through the mailbox policy configuration to determine if additional storage
providers are restricted and generates a report based on the policy status.
Iterates through all mailbox policies to determine if additional storage
providers are restricted and generates reports for each policy.
Returns:
List[CheckReportM365]: A list of reports with the restriction status for the mailbox policy.
List[CheckReportM365]: A list of reports with the restriction status for each mailbox policy.
"""
findings = []
mailbox_policy = exchange_client.mailbox_policy
if mailbox_policy:
report = CheckReportM365(
metadata=self.metadata(),
resource=mailbox_policy,
resource_name="Exchange Mailbox Policy",
resource_id=mailbox_policy.id,
)
report.status = "FAIL"
report.status_extended = (
"Exchange mailbox policy allows additional storage providers."
)
if not mailbox_policy.additional_storage_enabled:
report.status = "PASS"
report.status_extended = (
"Exchange mailbox policy restricts additional storage providers."
for mailbox_policy in exchange_client.mailbox_policies:
if mailbox_policy:
report = CheckReportM365(
metadata=self.metadata(),
resource=mailbox_policy,
resource_name=f"Exchange Mailbox Policy - {mailbox_policy.id}",
resource_id=mailbox_policy.id,
)
report.status = "FAIL"
report.status_extended = f"Exchange mailbox policy '{mailbox_policy.id}' allows additional storage providers."
findings.append(report)
if not mailbox_policy.additional_storage_enabled:
report.status = "PASS"
report.status_extended = f"Exchange mailbox policy '{mailbox_policy.id}' restricts additional storage providers."
findings.append(report)
return findings
@@ -16,7 +16,7 @@ class Exchange(M365Service):
self.external_mail_config = []
self.transport_rules = []
self.transport_config = None
self.mailbox_policy = None
self.mailbox_policies = []
self.role_assignment_policies = []
self.mailbox_audit_properties = []
@@ -27,7 +27,7 @@ class Exchange(M365Service):
self.external_mail_config = self._get_external_mail_config()
self.transport_rules = self._get_transport_rules()
self.transport_config = self._get_transport_config()
self.mailbox_policy = self._get_mailbox_policy()
self.mailbox_policies = self._get_mailbox_policy()
self.role_assignment_policies = self._get_role_assignment_policies()
self.mailbox_audit_properties = self._get_mailbox_audit_properties()
self.powershell.close()
@@ -164,21 +164,27 @@ class Exchange(M365Service):
def _get_mailbox_policy(self):
logger.info("Microsoft365 - Getting mailbox policy configuration...")
mailboxes_policy = None
mailbox_policies = []
try:
mailbox_policy = self.powershell.get_mailbox_policy()
if mailbox_policy:
mailboxes_policy = MailboxPolicy(
id=mailbox_policy.get("Id", ""),
additional_storage_enabled=mailbox_policy.get(
"AdditionalStorageProvidersAvailable", True
),
)
policies_data = self.powershell.get_mailbox_policy()
if policies_data:
if isinstance(policies_data, dict):
policies_data = [policies_data]
for policy in policies_data:
if policy:
mailbox_policies.append(
MailboxPolicy(
id=policy.get("Id", ""),
additional_storage_enabled=policy.get(
"AdditionalStorageProvidersAvailable", True
),
)
)
except Exception as error:
logger.error(
f"{error.__class__.__name__}[{error.__traceback__.tb_lineno}]: {error}"
)
return mailboxes_policy
return mailbox_policies
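The dict-to-list normalization above exists because deserialized PowerShell output is a single dict when only one policy is returned; a tiny sketch with illustrative data:

# Illustrative payload, not real cmdlet output.
policies_data = {"Id": "OwaMailboxPolicy-Default", "AdditionalStorageProvidersAvailable": True}
if isinstance(policies_data, dict):  # single policy -> wrap into a list
    policies_data = [policies_data]
for policy in policies_data:
    print(policy["Id"], policy["AdditionalStorageProvidersAvailable"])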
def _get_role_assignment_policies(self):
logger.info("Microsoft365 - Getting role assignment policies...")
@@ -11,12 +11,9 @@ class sharepoint_external_sharing_managed(Check):
Check if Microsoft 365 SharePoint external sharing is managed through domain whitelists/blacklists.
This check verifies that SharePoint external sharing settings are configured to restrict document sharing
to external domains by enforcing domain-based restrictions. This means that the setting
'sharingDomainRestrictionMode' must be set to either "AllowList" or "BlockList". If it is not, then
external sharing is not managed via domain restrictions, increasing the risk of unauthorized access.
Note: This check only evaluates the domain restriction mode and does not enforce the optional check
of verifying that the allowed/blocked domain list is not empty.
to external domains by enforcing domain-based restrictions. When external sharing is enabled, the setting
'sharingDomainRestrictionMode' must be set to either "AllowList" or "BlockList" with a corresponding
domain list. If external sharing is disabled at the organization level, the check passes.
"""
def execute(self) -> List[CheckReportM365]:
@@ -40,7 +37,12 @@ class sharepoint_external_sharing_managed(Check):
)
report.status = "FAIL"
report.status_extended = "SharePoint external sharing is not managed through domain restrictions."
if settings.sharingDomainRestrictionMode in ["allowList", "blockList"]:
if settings.sharingCapability == "Disabled":
report.status = "PASS"
report.status_extended = (
"External sharing is disabled at organization level."
)
elif settings.sharingDomainRestrictionMode in ["allowList", "blockList"]:
report.status_extended = f"SharePoint external sharing is managed through domain restrictions with mode '{settings.sharingDomainRestrictionMode}' but the list is empty."
if (
settings.sharingDomainRestrictionMode == "allowList"
@@ -40,8 +40,8 @@ dependencies = [
"azure-mgmt-loganalytics==12.0.0",
"azure-monitor-query==2.0.0",
"azure-storage-blob==12.24.1",
"boto3==1.39.15",
"botocore==1.39.15",
"boto3==1.40.61",
"botocore==1.40.61",
"colorama==0.4.6",
"cryptography==44.0.1",
"dash==3.1.1",
@@ -64,7 +64,7 @@ dependencies = [
"pytz==2025.1",
"schema==0.7.5",
"shodan==1.31.0",
"slack-sdk==3.34.0",
"slack-sdk==3.39.0",
"tabulate==0.9.0",
"tzlocal==5.3.1",
"py-iam-expand==0.1.0",
@@ -77,7 +77,7 @@ maintainers = [{name = "Prowler Engineering", email = "engineering@prowler.com"}
name = "prowler"
readme = "README.md"
requires-python = ">3.9.1,<3.13"
version = "5.14.0"
version = "5.15.0"
[project.scripts]
prowler = "prowler.__main__:prowler"
@@ -18,6 +18,7 @@ from prowler.lib.check.check import (
list_categories,
list_checks_json,
list_services,
load_custom_checks_metadata,
parse_checks_from_file,
parse_checks_from_folder,
remove_custom_checks_module,
@@ -483,6 +484,49 @@ class TestCheck:
)
remove_custom_checks_module(check_folder, provider)
def test_load_custom_checks_metadata(self, tmp_path):
"""Test loading check metadata from a custom checks folder."""
check_name = "custom_test_check"
check_folder = tmp_path / check_name
check_folder.mkdir()
metadata = {
"Provider": "aws",
"CheckID": check_name,
"CheckTitle": "Test Custom Check",
"CheckType": [],
"ServiceName": "custom",
"SubServiceName": "",
"ResourceIdTemplate": "arn:aws:custom:::resource",
"Severity": "low",
"ResourceType": "AwsCustomResource",
"Description": "A test custom check",
"Risk": "Test risk",
"RelatedUrl": "https://example.com",
"Remediation": {
"Code": {"CLI": "", "NativeIaC": "", "Other": "", "Terraform": ""},
"Recommendation": {"Text": "", "Url": ""},
},
"Categories": [],
"DependsOn": [],
"RelatedTo": [],
"Notes": "",
}
metadata_file = check_folder / f"{check_name}.metadata.json"
metadata_file.write_text(json.dumps(metadata))
result = load_custom_checks_metadata(str(tmp_path))
assert check_name in result
assert result[check_name].CheckID == check_name
assert result[check_name].Provider == "aws"
assert result[check_name].Severity == "low"
def test_load_custom_checks_metadata_nonexistent_path(self):
"""Test that nonexistent paths return empty dict."""
result = load_custom_checks_metadata("/nonexistent/path/to/checks")
assert result == {}
def test_exclude_checks_to_run(self):
test_cases = [
{
@@ -56,6 +56,7 @@ def mock_api_client(GCPService, service, api_version, _):
mock_api_policies_calls(client)
mock_api_sink_calls(client)
mock_api_services_calls(client)
mock_api_access_policies_calls(client)
return client
@@ -117,8 +118,9 @@ def mock_api_projects_calls(client: MagicMock):
"etag": "BwWWja0YfJA=",
"version": 3,
}
# Used by compute client
# Used by compute client and cloudresourcemanager
client.projects().get().execute.return_value = {
"projectNumber": "123456789012",
"commonInstanceMetadata": {
"items": [
{
@@ -134,7 +136,7 @@ def mock_api_projects_calls(client: MagicMock):
"value": "TRUE",
},
]
}
},
}
client.projects().list_next.return_value = None
# Used by dataproc client
@@ -757,6 +759,11 @@ def mock_api_instances_calls(client: MagicMock, service: str):
"diskType": "disk_type",
}
],
"scheduling": {
"automaticRestart": False,
"preemptible": False,
"provisioningModel": "STANDARD",
},
},
{
"name": "instance2",
@@ -783,6 +790,11 @@ def mock_api_instances_calls(client: MagicMock, service: str):
"diskType": "disk_type",
}
],
"scheduling": {
"automaticRestart": False,
"preemptible": False,
"provisioningModel": "STANDARD",
},
},
]
}
@@ -1100,3 +1112,75 @@ def mock_api_services_calls(client: MagicMock):
]
}
client.services().list_next.return_value = None
def mock_api_access_policies_calls(client: MagicMock):
# Mock access policies list based on parent organization
def mock_list_access_policies(parent):
return_value = MagicMock()
# Only return policies for the first organization (123456789)
if parent == "organizations/123456789":
return_value.execute.return_value = {
"accessPolicies": [
{
"name": "accessPolicies/123456",
"title": "Test Access Policy 1",
},
{
"name": "accessPolicies/789012",
"title": "Test Access Policy 2",
},
]
}
elif parent == "organizations/987654321":
# No policies for the second organization
return_value.execute.return_value = {"accessPolicies": []}
else:
return_value.execute.return_value = {"accessPolicies": []}
return return_value
client.accessPolicies().list = mock_list_access_policies
client.accessPolicies().list_next.return_value = None
# Mock service perimeters list based on parent access policy
def mock_list_service_perimeters(parent):
return_value = MagicMock()
if parent == "accessPolicies/123456":
return_value.execute.return_value = {
"servicePerimeters": [
{
"name": "accessPolicies/123456/servicePerimeters/perimeter1",
"title": "Test Perimeter 1",
"perimeterType": "PERIMETER_TYPE_REGULAR",
"status": {
"resources": [
f"projects/{GCP_PROJECT_ID}",
],
"restrictedServices": [
"storage.googleapis.com",
"bigquery.googleapis.com",
],
},
},
{
"name": "accessPolicies/123456/servicePerimeters/perimeter2",
"title": "Test Perimeter 2",
"perimeterType": "PERIMETER_TYPE_BRIDGE",
"spec": {
"resources": [],
"restrictedServices": [
"compute.googleapis.com",
],
},
},
]
}
elif parent == "accessPolicies/789012":
# No perimeters for the second policy
return_value.execute.return_value = {"servicePerimeters": []}
else:
return_value.execute.return_value = {"servicePerimeters": []}
return return_value
client.accessPolicies().servicePerimeters().list = mock_list_service_perimeters
client.accessPolicies().servicePerimeters().list_next.return_value = None
@@ -0,0 +1,196 @@
from unittest.mock import MagicMock, patch
from tests.providers.gcp.gcp_fixtures import (
GCP_PROJECT_ID,
mock_api_client,
mock_is_api_active,
set_mocked_gcp_provider,
)
class TestAccessContextManagerService:
def test_service(self):
# Mock cloudresourcemanager_client before importing accesscontextmanager
mock_crm_client = MagicMock()
mock_crm_client.organizations = [
MagicMock(id="123456789", name="Organization 1"),
]
with (
patch(
"prowler.providers.gcp.lib.service.service.GCPService.__is_api_active__",
new=mock_is_api_active,
),
patch(
"prowler.providers.gcp.lib.service.service.GCPService.__generate_client__",
new=mock_api_client,
),
patch(
"prowler.providers.common.provider.Provider.get_global_provider",
return_value=set_mocked_gcp_provider(),
),
patch(
"prowler.providers.gcp.services.accesscontextmanager.accesscontextmanager_service.cloudresourcemanager_client",
new=mock_crm_client,
),
):
from prowler.providers.gcp.services.accesscontextmanager.accesscontextmanager_service import (
AccessContextManager,
)
accesscontextmanager_client = AccessContextManager(
set_mocked_gcp_provider(project_ids=[GCP_PROJECT_ID])
)
assert accesscontextmanager_client.service == "accesscontextmanager"
assert accesscontextmanager_client.project_ids == [GCP_PROJECT_ID]
# Should have 2 service perimeters from the first access policy
assert len(accesscontextmanager_client.service_perimeters) == 2
# First service perimeter
assert (
accesscontextmanager_client.service_perimeters[0].name
== "accessPolicies/123456/servicePerimeters/perimeter1"
)
assert (
accesscontextmanager_client.service_perimeters[0].title
== "Test Perimeter 1"
)
assert (
accesscontextmanager_client.service_perimeters[0].perimeter_type
== "PERIMETER_TYPE_REGULAR"
)
assert accesscontextmanager_client.service_perimeters[0].resources == [
f"projects/{GCP_PROJECT_ID}"
]
assert accesscontextmanager_client.service_perimeters[
0
].restricted_services == [
"storage.googleapis.com",
"bigquery.googleapis.com",
]
assert (
accesscontextmanager_client.service_perimeters[0].policy_name
== "accessPolicies/123456"
)
# Second service perimeter
assert (
accesscontextmanager_client.service_perimeters[1].name
== "accessPolicies/123456/servicePerimeters/perimeter2"
)
assert (
accesscontextmanager_client.service_perimeters[1].title
== "Test Perimeter 2"
)
assert (
accesscontextmanager_client.service_perimeters[1].perimeter_type
== "PERIMETER_TYPE_BRIDGE"
)
assert accesscontextmanager_client.service_perimeters[1].resources == []
assert accesscontextmanager_client.service_perimeters[
1
].restricted_services == [
"compute.googleapis.com",
]
assert (
accesscontextmanager_client.service_perimeters[1].policy_name
== "accessPolicies/123456"
)
def test_get_service_perimeters_access_policies_error(self):
"""Test error handling when listing access policies fails."""
mock_crm_client = MagicMock()
mock_crm_client.organizations = [
MagicMock(id="123456789", name="Organization 1"),
]
mock_client = MagicMock()
def mock_list_access_policies_error(parent):
return_value = MagicMock()
return_value.execute.side_effect = Exception("Access denied")
return return_value
mock_client.accessPolicies().list = mock_list_access_policies_error
with (
patch(
"prowler.providers.gcp.lib.service.service.GCPService.__is_api_active__",
new=mock_is_api_active,
),
patch(
"prowler.providers.gcp.lib.service.service.GCPService.__generate_client__",
return_value=mock_client,
),
patch(
"prowler.providers.common.provider.Provider.get_global_provider",
return_value=set_mocked_gcp_provider(),
),
patch(
"prowler.providers.gcp.services.accesscontextmanager.accesscontextmanager_service.cloudresourcemanager_client",
new=mock_crm_client,
),
):
from prowler.providers.gcp.services.accesscontextmanager.accesscontextmanager_service import (
AccessContextManager,
)
accesscontextmanager_client = AccessContextManager(
set_mocked_gcp_provider(project_ids=[GCP_PROJECT_ID])
)
assert len(accesscontextmanager_client.service_perimeters) == 0
def test_get_service_perimeters_list_perimeters_error(self):
"""Test error handling when listing service perimeters fails."""
mock_crm_client = MagicMock()
mock_crm_client.organizations = [
MagicMock(id="123456789", name="Organization 1"),
]
mock_client = MagicMock()
def mock_list_access_policies(parent):
return_value = MagicMock()
return_value.execute.return_value = {
"accessPolicies": [{"name": "accessPolicies/123456"}]
}
return return_value
def mock_list_perimeters_error(parent):
return_value = MagicMock()
return_value.execute.side_effect = Exception("Permission denied")
return return_value
mock_client.accessPolicies().list = mock_list_access_policies
mock_client.accessPolicies().list_next.return_value = None
mock_client.accessPolicies().servicePerimeters().list = (
mock_list_perimeters_error
)
with (
patch(
"prowler.providers.gcp.lib.service.service.GCPService.__is_api_active__",
new=mock_is_api_active,
),
patch(
"prowler.providers.gcp.lib.service.service.GCPService.__generate_client__",
return_value=mock_client,
),
patch(
"prowler.providers.common.provider.Provider.get_global_provider",
return_value=set_mocked_gcp_provider(),
),
patch(
"prowler.providers.gcp.services.accesscontextmanager.accesscontextmanager_service.cloudresourcemanager_client",
new=mock_crm_client,
),
):
from prowler.providers.gcp.services.accesscontextmanager.accesscontextmanager_service import (
AccessContextManager,
)
accesscontextmanager_client = AccessContextManager(
set_mocked_gcp_provider(project_ids=[GCP_PROJECT_ID])
)
assert len(accesscontextmanager_client.service_perimeters) == 0

Some files were not shown because too many files have changed in this diff