Files
prowler/skills/prowler-attack-paths-query/SKILL.md

16 KiB

name, description, license, metadata, allowed-tools
name description license metadata allowed-tools
prowler-attack-paths-query Creates Prowler Attack Paths openCypher queries for graph analysis (compatible with Neo4j and Neptune). Trigger: When creating or updating Attack Paths queries that detect privilege escalation paths, network exposure, or security misconfigurations in cloud environments. Apache-2.0
author version scope auto_invoke
prowler-cloud 1.0
root
api
Creating Attack Paths queries
Updating existing Attack Paths queries
Adding privilege escalation detection queries
Read, Edit, Write, Glob, Grep, Bash, WebFetch, Task

Overview

Attack Paths queries are openCypher queries that analyze cloud infrastructure graphs (ingested via Cartography) to detect security risks like privilege escalation paths, network exposure, and misconfigurations.

Queries are written in openCypher Version 9 to ensure compatibility with both Neo4j and Amazon Neptune.


Input Sources

Queries can be created from:

  1. pathfinding.cloud ID (e.g., ECS-001, GLUE-001)

    Fetching a single path by ID — The aggregated paths.json is too large for WebFetch (content gets truncated). Use Bash with curl and a JSON parser instead:

    Prefer jq (concise), fall back to python3 (guaranteed in this Python project):

    # With jq
    curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
      | jq '.[] | select(.id == "ecs-002")'
    
    # With python3 (fallback)
    curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
      | python3 -c "import json,sys; print(json.dumps(next((p for p in json.load(sys.stdin) if p['id']=='ecs-002'), None), indent=2))"
    
  2. Listing Available Attack Paths

    • Use Bash to list available paths from the JSON index:
    # List all path IDs and names (jq)
    curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
      | jq -r '.[] | "\(.id): \(.name)"'
    
    # List all path IDs and names (python3 fallback)
    curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
      | python3 -c "import json,sys; [print(f\"{p['id']}: {p['name']}\") for p in json.load(sys.stdin)]"
    
    # List paths filtered by service prefix
    curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
      | jq -r '.[] | select(.id | startswith("ecs")) | "\(.id): \(.name)"'
    
  3. Natural Language Description

    • User describes the Attack Paths in plain language
    • Agent maps to appropriate openCypher patterns

Query Structure

File Location

api/src/backend/api/attack_paths/queries/{provider}.py

Example: api/src/backend/api/attack_paths/queries/aws.py

Query Definition Pattern

from api.attack_paths.queries.types import (
    AttackPathsQueryDefinition,
    AttackPathsQueryParameterDefinition,
)
from tasks.jobs.attack_paths.config import PROWLER_FINDING_LABEL

# {REFERENCE_ID} (e.g., EC2-001, GLUE-001)
AWS_{QUERY_NAME} = AttackPathsQueryDefinition(
    id="aws-{kebab-case-name}",
    name="Privilege Escalation: {permission1} + {permission2}",
    description="{Detailed description of the Attack Paths}.",
    provider="aws",
    cypher=f"""
        // Find principals with {permission1}
        MATCH path_principal = (aws:AWSAccount {{id: $provider_uid}})--(principal:AWSPrincipal)--(policy:AWSPolicy)--(stmt:AWSPolicyStatement)
        WHERE stmt.effect = 'Allow'
            AND any(action IN stmt.action WHERE
                toLower(action) = '{permission1_lowercase}'
                OR toLower(action) = '{service}:*'
                OR action = '*'
            )

        // Find {permission2}
        MATCH (principal)--(policy2:AWSPolicy)--(stmt2:AWSPolicyStatement)
        WHERE stmt2.effect = 'Allow'
            AND any(action IN stmt2.action WHERE
                toLower(action) = '{permission2_lowercase}'
                OR toLower(action) = '{service2}:*'
                OR action = '*'
            )

        // Find target resources
        MATCH path_target = (aws)--(target_role:AWSRole)-[:TRUSTS_AWS_PRINCIPAL]->(:AWSPrincipal {{arn: '{service}.amazonaws.com'}})
        WHERE any(resource IN stmt.resource WHERE
            resource = '*'
            OR target_role.arn CONTAINS resource
            OR resource CONTAINS target_role.name
        )

        UNWIND nodes(path_principal) + nodes(path_target) as n
        OPTIONAL MATCH (n)-[pfr]-(pf:{PROWLER_FINDING_LABEL} {{status: 'FAIL', provider_uid: $provider_uid}})

        RETURN path_principal, path_target,
            collect(DISTINCT pf) as dpf, collect(DISTINCT pfr) as dpfr
    """,
    parameters=[],
)

Register in Query List

Add to the {PROVIDER}_QUERIES list at the bottom of the file:

AWS_QUERIES: list[AttackPathsQueryDefinition] = [
    # ... existing queries ...
    AWS_{NEW_QUERY_NAME},  # Add here
]

Step-by-Step Creation Process

1. Read the Queries Module

FIRST, read all files in the queries module to understand the structure:

api/src/backend/api/attack_paths/queries/
├── __init__.py      # Module exports
├── types.py         # AttackPathsQueryDefinition, AttackPathsQueryParameterDefinition
├── registry.py      # Query registry logic
└── {provider}.py    # Provider-specific queries (e.g., aws.py)

Read these files to learn:

  • Type definitions and available fields
  • How queries are registered
  • Current query patterns, style, and naming conventions

2. Determine Schema Source

Check the Cartography dependency in api/pyproject.toml:

grep cartography api/pyproject.toml

Parse the dependency to determine the schema source:

If git-based dependency (e.g., cartography @ git+https://github.com/prowler-cloud/cartography@0.126.1):

  • Extract the repository (e.g., prowler-cloud/cartography)
  • Extract the version/tag (e.g., 0.126.1)
  • Fetch schema from that repository at that tag

If PyPI dependency (e.g., cartography = "^0.126.0" or cartography>=0.126.0):

  • Extract the version (e.g., 0.126.0)
  • Use the official cartography-cncf repository

Schema URL patterns (ALWAYS use the specific version tag, not master/main):

# Official Cartography (cartography-cncf)
https://raw.githubusercontent.com/cartography-cncf/cartography/refs/tags/{version}/docs/root/modules/{provider}/schema.md

# Prowler fork (prowler-cloud)
https://raw.githubusercontent.com/prowler-cloud/cartography/refs/tags/{version}/docs/root/modules/{provider}/schema.md

Examples:

# For prowler-cloud/cartography@0.126.1 (git), fetch AWS schema:
https://raw.githubusercontent.com/prowler-cloud/cartography/refs/tags/0.126.1/docs/root/modules/aws/schema.md

# For cartography = "^0.126.0" (PyPI), fetch AWS schema:
https://raw.githubusercontent.com/cartography-cncf/cartography/refs/tags/0.126.0/docs/root/modules/aws/schema.md

IMPORTANT: Always match the schema version to the dependency version in pyproject.toml. Using master/main may reference node labels or properties that don't exist in the deployed version.

Additional Prowler Labels: The Attack Paths sync task adds extra labels:

  • ProwlerFinding - Prowler finding nodes with status, provider_uid properties
  • ProviderResource - Generic resource marker
  • {Provider}Resource - Provider-specific marker (e.g., AWSResource)

These are defined in api/src/backend/tasks/jobs/attack_paths/config.py.

3. Consult the Schema for Available Data

Use the Cartography schema to discover:

  • What node labels exist for the target resources
  • What properties are available on those nodes
  • What relationships connect the nodes

This informs query design by showing what data is actually available to query.

4. Create Query Definition

Use the standard pattern (see above) with:

  • id: Auto-generated as {provider}-{kebab-case-description}
  • name: Human-readable, e.g., "Privilege Escalation: {perm1} + {perm2}"
  • description: Explain the attack vector and impact
  • provider: Provider identifier (aws, azure, gcp, kubernetes, github)
  • cypher: The openCypher query with proper escaping
  • parameters: Optional list of user-provided parameters (use parameters=[] if none needed)

5. Add Query to Provider List

Add the constant to the {PROVIDER}_QUERIES list.


Query Naming Conventions

Query ID

{provider}-{category}-{description}

Examples:

  • aws-ec2-privesc-passrole-iam
  • aws-iam-privesc-attach-role-policy-assume-role
  • aws-rds-unencrypted-storage

Query Constant Name

{PROVIDER}_{CATEGORY}_{DESCRIPTION}

Examples:

  • AWS_EC2_PRIVESC_PASSROLE_IAM
  • AWS_IAM_PRIVESC_ATTACH_ROLE_POLICY_ASSUME_ROLE
  • AWS_RDS_UNENCRYPTED_STORAGE

Query Categories

Category Description Example
Basic Resource List resources with properties RDS instances, S3 buckets
Network Exposure Internet-exposed resources EC2 with public IPs
Privilege Escalation IAM privilege escalation paths PassRole + RunInstances
Data Access Access to sensitive data EC2 with S3 access

Common openCypher Patterns

Match Account and Principal

MATCH path_principal = (aws:AWSAccount {id: $provider_uid})--(principal:AWSPrincipal)--(policy:AWSPolicy)--(stmt:AWSPolicyStatement)

Check IAM Action Permissions

WHERE stmt.effect = 'Allow'
    AND any(action IN stmt.action WHERE
        toLower(action) = 'iam:passrole'
        OR toLower(action) = 'iam:*'
        OR action = '*'
    )

Find Roles Trusting a Service

MATCH path_target = (aws)--(target_role:AWSRole)-[:TRUSTS_AWS_PRINCIPAL]->(:AWSPrincipal {arn: 'ec2.amazonaws.com'})

Check Resource Scope

WHERE any(resource IN stmt.resource WHERE
    resource = '*'
    OR target_role.arn CONTAINS resource
    OR resource CONTAINS target_role.name
)

Include Prowler Findings

UNWIND nodes(path_principal) + nodes(path_target) as n
OPTIONAL MATCH (n)-[pfr]-(pf:{PROWLER_FINDING_LABEL} {status: 'FAIL', provider_uid: $provider_uid})

RETURN path_principal, path_target,
    collect(DISTINCT pf) as dpf, collect(DISTINCT pfr) as dpfr

Common Node Labels by Provider

AWS

Label Description
AWSAccount AWS account root
AWSPrincipal IAM principal (user, role, service)
AWSRole IAM role
AWSUser IAM user
AWSPolicy IAM policy
AWSPolicyStatement Policy statement
EC2Instance EC2 instance
EC2SecurityGroup Security group
S3Bucket S3 bucket
RDSInstance RDS database instance
LoadBalancer Classic ELB
LoadBalancerV2 ALB/NLB
LaunchTemplate EC2 launch template

Common Relationships

Relationship Description
TRUSTS_AWS_PRINCIPAL Role trust relationship
STS_ASSUMEROLE_ALLOW Can assume role
POLICY Has policy attached
STATEMENT Policy has statement

Parameters

For queries requiring user input, define parameters:

parameters=[
    AttackPathsQueryParameterDefinition(
        name="ip",
        label="IP address",
        description="Public IP address, e.g. 192.0.2.0.",
        placeholder="192.0.2.0",
    ),
    AttackPathsQueryParameterDefinition(
        name="tag_key",
        label="Tag key",
        description="Tag key to filter resources.",
        placeholder="Environment",
    ),
],

Best Practices

  1. Always filter by provider_uid: Use {id: $provider_uid} on account nodes and {provider_uid: $provider_uid} on ProwlerFinding nodes

  2. Use consistent naming: Follow existing patterns in the file

  3. Include Prowler findings: Always add the OPTIONAL MATCH for ProwlerFinding nodes

  4. Return distinct findings: Use collect(DISTINCT pf) to avoid duplicates

  5. Comment the query purpose: Add inline comments explaining each MATCH clause

  6. Validate schema first: Ensure all node labels and properties exist in Cartography schema


openCypher Compatibility

Queries must be written in openCypher Version 9 to ensure compatibility with both Neo4j and Amazon Neptune.

Why Version 9? Amazon Neptune implements openCypher Version 9. By targeting this specification, queries work on both Neo4j and Neptune without modification.

Avoid These (Not in openCypher spec)

Feature Reason
APOC procedures (apoc.*) Neo4j-specific plugin, not available in Neptune
Virtual nodes (apoc.create.vNode) APOC-specific
Virtual relationships (apoc.create.vRelationship) APOC-specific
Neptune extensions Not available in Neo4j
reduce() function Use UNWIND + aggregation instead
FOREACH clause Use WITH + UNWIND + SET instead
Regex match operator (=~) Not supported in Neptune

CALL Subqueries

Supported with limitations:

  • Use WITH clause to import variables: CALL { WITH var ... }
  • Updates inside CALL subqueries are NOT supported
  • Emitted variables cannot overlap with variables before the CALL

Reference

pathfinding.cloud (Attack Path Definitions)

Cartography Schema

  • URL pattern: https://raw.githubusercontent.com/{org}/cartography/refs/tags/{version}/docs/root/modules/{provider}/schema.md
  • Always use the version from api/pyproject.toml, not master/main

openCypher Specification


Learning from the Queries Module

IMPORTANT: Before creating a new query, ALWAYS read the entire queries module:

api/src/backend/api/attack_paths/queries/
├── __init__.py      # Module exports
├── types.py         # Type definitions
├── registry.py      # Registry logic
└── {provider}.py    # Provider queries (aws.py, etc.)

Use the existing queries to learn:

  • Query structure and formatting
  • Variable naming conventions
  • How to include Prowler findings
  • Comment style

Compatibility Warning: Some existing queries use Neo4j-specific features (e.g., apoc.create.vNode, apoc.create.vRelationship, regex =~) that are NOT compatible with Amazon Neptune. Use these queries to learn general patterns (structure, naming, Prowler findings integration, comment style) but DO NOT copy APOC procedures or other Neo4j-specific syntax into new queries. New queries must be pure openCypher Version 9. Refer to the openCypher Compatibility section for the full list of features to avoid.

DO NOT use generic templates. Match the exact style of existing compatible queries in the file.