mirror of
https://github.com/prowler-cloud/prowler.git
synced 2026-03-22 03:08:23 +00:00
chore(skills): improve attack-paths-query skill accuracy and schema emphasis
@@ -1,13 +1,14 @@
|
||||
---
|
||||
name: prowler-attack-paths-query
|
||||
description: >
|
||||
Creates Prowler Attack Paths openCypher queries for graph analysis (compatible with Neo4j and Neptune).
|
||||
Trigger: When creating or updating Attack Paths queries that detect privilege escalation paths,
|
||||
network exposure, or security misconfigurations in cloud environments.
|
||||
Creates Prowler Attack Paths openCypher queries using the Cartography schema as the source of truth
|
||||
for node labels, properties, and relationships. Also covers Prowler-specific additions (Internet node,
|
||||
ProwlerFinding, internal isolation labels) and $provider_uid scoping for predefined queries.
|
||||
Trigger: When creating or updating Attack Paths queries.
|
||||
license: Apache-2.0
|
||||
metadata:
|
||||
author: prowler-cloud
|
||||
version: "1.1"
|
||||
version: "2.0"
|
||||
scope: [root, api]
|
||||
auto_invoke:
|
||||
- "Creating Attack Paths queries"
|
||||
@@ -20,7 +21,24 @@ allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, Task
|
||||
|
||||
Attack Paths queries are openCypher queries that analyze cloud infrastructure graphs (ingested via Cartography) to detect security risks like privilege escalation paths, network exposure, and misconfigurations.
|
||||
|
||||
Queries are written in **openCypher Version 9** to ensure compatibility with both Neo4j and Amazon Neptune.
|
||||
Queries are written in **openCypher Version 9** for compatibility with both Neo4j and Amazon Neptune.
|
||||
|
||||
---
|
||||
|
||||
## Two query audiences
|
||||
|
||||
This skill covers two types of queries with different isolation mechanisms:
|
||||
|
||||
| | Predefined queries | Custom queries |
|---|---|---|
| **Where they live** | `api/src/backend/api/attack_paths/queries/{provider}.py` | User/LLM-supplied via the custom query API endpoint |
| **Provider isolation** | `AWSAccount {id: $provider_uid}` anchor + path connectivity | Automatic `_Provider_{uuid}` label injection via `cypher_rewriter.py` |
| **What to write** | Chain every MATCH from the `aws` variable | Plain Cypher, no isolation boilerplate needed |
| **Internal labels** | Never use (`_ProviderResource`, `_Tenant_*`, `_Provider_*`) | Never use (injected automatically by the system) |
|
||||
|
||||
**For predefined queries**: every node must be reachable from the `AWSAccount` root via graph traversal. This is the isolation boundary.
|
||||
|
||||
**For custom queries**: write natural Cypher without isolation concerns. The query runner injects a `_Provider_{uuid}` label into every node pattern before execution, and a post-query filter catches edge cases.
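To make the label-injection idea concrete, here is a minimal sketch of what such a rewrite could look like. It is not the code in `cypher_rewriter.py`: the function name `inject_provider_label`, the regular expression, and the exact label format are assumptions made for illustration. It only handles node patterns that already carry a label, and it would also rewrite anything that looks like a node pattern inside a string literal.

```python
import re

# Hypothetical sketch, NOT the real cypher_rewriter.py implementation.
# Matches the start of a node pattern with an optional variable and at least
# one label, e.g. "(r:AWSRole" or "(:AWSPrincipal".
_NODE_WITH_LABEL = re.compile(r"\((\w*)((?::\w+)+)")


def inject_provider_label(query: str, provider_uuid: str) -> str:
    """Append an assumed `_Provider_<uuid>` label to every labelled node pattern."""
    label = f":_Provider_{provider_uuid}"
    return _NODE_WITH_LABEL.sub(
        lambda m: f"({m.group(1)}{m.group(2)}{label}", query
    )


rewritten = inject_provider_label(
    "MATCH (r:AWSRole)--(p:AWSPolicy) RETURN r", "abc123"
)
print(rewritten)
```

Function calls such as `any(action IN ...)` are left alone because the opening parenthesis is not followed by a `:label`. A real rewriter would also need to cover unlabelled node patterns, which this sketch deliberately skips.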
|
||||
|
||||
---
|
||||
|
||||
@@ -29,67 +47,44 @@ Queries are written in **openCypher Version 9** to ensure compatibility with bot
|
||||
Queries can be created from:
|
||||
|
||||
1. **pathfinding.cloud ID** (e.g., `ECS-001`, `GLUE-001`)
|
||||
- The JSON index contains: `id`, `name`, `description`, `services`, `permissions`, `exploitationSteps`, `prerequisites`, etc.
|
||||
- Reference: https://github.com/DataDog/pathfinding.cloud
|
||||
|
||||
**Fetching a single path by ID** - The aggregated `paths.json` is too large for WebFetch
|
||||
(content gets truncated). Use Bash with `curl` and a JSON parser instead:
|
||||
|
||||
Prefer `jq` (concise), fall back to `python3` (guaranteed in this Python project):
|
||||
- The aggregated `paths.json` is too large for WebFetch. Use Bash:
|
||||
|
||||
```bash
|
||||
# With jq
|
||||
# Fetch a single path by ID
|
||||
curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
|
||||
| jq '.[] | select(.id == "ecs-002")'
|
||||
|
||||
# With python3 (fallback)
|
||||
curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
|
||||
| python3 -c "import json,sys; print(json.dumps(next((p for p in json.load(sys.stdin) if p['id']=='ecs-002'), None), indent=2))"
|
||||
```
|
||||
|
||||
2. **Listing Available Attack Paths**
|
||||
- Use Bash to list available paths from the JSON index:
|
||||
|
||||
```bash
|
||||
# List all path IDs and names (jq)
|
||||
# List all path IDs and names
|
||||
curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
|
||||
| jq -r '.[] | "\(.id): \(.name)"'
|
||||
|
||||
# List all path IDs and names (python3 fallback)
|
||||
curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
|
||||
| python3 -c "import json,sys; [print(f\"{p['id']}: {p['name']}\") for p in json.load(sys.stdin)]"
|
||||
|
||||
# List paths filtered by service prefix
|
||||
# Filter by service prefix
|
||||
curl -s https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json \
|
||||
| jq -r '.[] | select(.id | startswith("ecs")) | "\(.id): \(.name)"'
|
||||
```
|
||||
|
||||
3. **Natural Language Description**
|
||||
- User describes the Attack Paths in plain language
|
||||
- Agent maps to appropriate openCypher patterns
|
||||
If `jq` is not available, use `python3 -c "import json,sys; ..."` as a fallback.
|
||||
|
||||
2. **Natural language description** from the user
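For the `python3` fallback mentioned under the first source, the lookup and listing logic can be sketched as below. The sample records are made up so the snippet runs offline; only the `id` and `name` field names come from the JSON index description above. The helper names `find_path` and `list_paths` are hypothetical.

```python
import json

# Made-up sample standing in for the downloaded paths.json content.
SAMPLE_JSON = """
[
  {"id": "ecs-001", "name": "Example ECS path"},
  {"id": "ecs-002", "name": "Another ECS path"},
  {"id": "glue-001", "name": "Example Glue path"}
]
"""


def find_path(paths, path_id):
    """Return the first entry whose id matches, ignoring case, or None."""
    wanted = path_id.lower()
    return next((p for p in paths if p["id"].lower() == wanted), None)


def list_paths(paths, service_prefix=""):
    """Return "id: name" strings, optionally filtered by an id prefix."""
    prefix = service_prefix.lower()
    return [
        f"{p['id']}: {p['name']}"
        for p in paths
        if p["id"].lower().startswith(prefix)
    ]


paths = json.loads(SAMPLE_JSON)
print(find_path(paths, "ECS-002"))
print(list_paths(paths, "ecs"))
```

Comparing IDs case-insensitively is a design choice of this sketch: the reference IDs are written in upper case (`ECS-001`) while the `jq` filters use lower case (`"ecs-002"`).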
|
||||
|
||||
---
|
||||
|
||||
## Query Structure
|
||||
|
||||
### File Location
|
||||
### Provider scoping parameter
|
||||
|
||||
```
|
||||
api/src/backend/api/attack_paths/queries/{provider}.py
|
||||
```
|
||||
One parameter is injected automatically by the query runner:
|
||||
|
||||
Example: `api/src/backend/api/attack_paths/queries/aws.py`
|
||||
| Parameter       | Property it matches | Used on      | Purpose                          |
| --------------- | ------------------- | ------------ | -------------------------------- |
| `$provider_uid` | `id`                | `AWSAccount` | Scopes to a specific AWS account |
|
||||
|
||||
### Query parameters for provider scoping
|
||||
All other nodes are isolated by path connectivity from the `AWSAccount` anchor.
|
||||
|
||||
Two parameters exist. Both are injected automatically by the query runner.
|
||||
### Imports
|
||||
|
||||
| Parameter | Property it matches | Used on | Purpose |
|
||||
| --------------- | ------------------- | -------------- | ------------------------------------ |
|
||||
| `$provider_uid` | `id` | `AWSAccount` | Scopes to a specific AWS account |
|
||||
| `$provider_id` | `_provider_id` | Any other node | Scopes nodes to the provider context |
|
||||
|
||||
### Privilege Escalation Query Pattern
|
||||
All query files start with these imports:
|
||||
|
||||
```python
|
||||
from api.attack_paths.queries.types import (
|
||||
@@ -97,47 +92,57 @@ from api.attack_paths.queries.types import (
|
||||
AttackPathsQueryDefinition,
|
||||
AttackPathsQueryParameterDefinition,
|
||||
)
|
||||
from tasks.jobs.attack_paths.config import PROWLER_FINDING_LABEL
|
||||
```
|
||||
|
||||
# {REFERENCE_ID} (e.g., EC2-001, GLUE-001)
|
||||
The `PROWLER_FINDING_LABEL` constant (value: `"ProwlerFinding"`) is used via f-string interpolation in all queries. Never hardcode the label string.
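Because the Cypher text lives inside a Python f-string, literal Cypher braces have to be doubled while the label constant is interpolated with single braces. The snippet below is a small illustration of that escaping. `PROWLER_FINDING_LABEL` is redefined locally as a stand-in for the imported constant, and the query itself is a shortened example rather than one from `aws.py`.

```python
# Stand-in for the constant imported from tasks.jobs.attack_paths.config.
PROWLER_FINDING_LABEL = "ProwlerFinding"

# {{ and }} produce literal braces in the Cypher text;
# {PROWLER_FINDING_LABEL} is replaced by the label value.
cypher = f"""
    MATCH (aws:AWSAccount {{id: $provider_uid}})--(n)
    OPTIONAL MATCH (n)-[pfr]-(pf:{PROWLER_FINDING_LABEL} {{status: 'FAIL'}})
    RETURN n, collect(DISTINCT pf) as dpf
"""

print(cypher)
```

The `$provider_uid` placeholder is untouched by the f-string, since `$` has no special meaning there; it stays a query parameter for the query runner to fill in.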
|
||||
|
||||
### Privilege escalation sub-patterns
|
||||
|
||||
There are four distinct privilege escalation patterns. Choose based on the attack type:
|
||||
|
||||
| Sub-pattern | Target | `path_target` shape | Example |
|---|---|---|---|
| Self-escalation | Principal's own policies | `(aws)--(target_policy:AWSPolicy)--(principal)` | IAM-001 |
| Lateral to user | Other IAM users | `(aws)--(target_user:AWSUser)` | IAM-002 |
| Assume-role lateral | Assumable roles | `(aws)--(target_role:AWSRole)<-[:STS_ASSUMEROLE_ALLOW]-(principal)` | IAM-014 |
| PassRole + service | Service-trusting roles | `(aws)--(target_role:AWSRole)-[:TRUSTS_AWS_PRINCIPAL]->(...)` | EC2-001 |
|
||||
|
||||
#### Self-escalation (e.g., IAM-001)
|
||||
|
||||
The principal modifies resources attached to itself. `path_target` loops back to `principal`:
|
||||
|
||||
```python
|
||||
AWS_{QUERY_NAME} = AttackPathsQueryDefinition(
|
||||
id="aws-{kebab-case-name}",
|
||||
name="{Human-friendly label} ({REFERENCE_ID})",
|
||||
short_description="{Brief explanation of the attack, no technical permissions.}",
|
||||
short_description="{Brief explanation, no technical permissions.}",
|
||||
description="{Detailed description of the attack vector and impact.}",
|
||||
attribution=AttackPathsQueryAttribution(
|
||||
text="pathfinding.cloud - {REFERENCE_ID} - {permission1} + {permission2}",
|
||||
text="pathfinding.cloud - {REFERENCE_ID} - {permission}",
|
||||
link="https://pathfinding.cloud/paths/{reference_id_lowercase}",
|
||||
),
|
||||
provider="aws",
|
||||
cypher=f"""
|
||||
// Find principals with {permission1}
|
||||
// Find principals with {permission}
|
||||
MATCH path_principal = (aws:AWSAccount {{id: $provider_uid}})--(principal:AWSPrincipal)--(policy:AWSPolicy)--(stmt:AWSPolicyStatement)
|
||||
WHERE stmt.effect = 'Allow'
|
||||
AND any(action IN stmt.action WHERE
|
||||
toLower(action) = '{permission1_lowercase}'
|
||||
toLower(action) = '{permission_lowercase}'
|
||||
OR toLower(action) = '{service}:*'
|
||||
OR action = '*'
|
||||
)
|
||||
|
||||
// Find {permission2}
|
||||
MATCH (principal)--(policy2:AWSPolicy)--(stmt2:AWSPolicyStatement)
|
||||
WHERE stmt2.effect = 'Allow'
|
||||
AND any(action IN stmt2.action WHERE
|
||||
toLower(action) = '{permission2_lowercase}'
|
||||
OR toLower(action) = '{service2}:*'
|
||||
OR action = '*'
|
||||
// Find target resources attached to the same principal
|
||||
MATCH path_target = (aws)--(target_policy:AWSPolicy)--(principal)
|
||||
WHERE target_policy.arn CONTAINS $provider_uid
|
||||
AND any(resource IN stmt.resource WHERE
|
||||
resource = '*'
|
||||
OR target_policy.arn CONTAINS resource
|
||||
)
|
||||
|
||||
// Find target resources (MUST chain from `aws` for provider isolation)
|
||||
MATCH path_target = (aws)--(target_role:AWSRole)-[:TRUSTS_AWS_PRINCIPAL]->(:AWSPrincipal {{arn: '{service}.amazonaws.com'}})
|
||||
WHERE any(resource IN stmt.resource WHERE
|
||||
resource = '*'
|
||||
OR target_role.arn CONTAINS resource
|
||||
OR resource CONTAINS target_role.name
|
||||
)
|
||||
|
||||
UNWIND nodes(path_principal) + nodes(path_target) as n
|
||||
OPTIONAL MATCH (n)-[pfr]-(pf:ProwlerFinding {{status: 'FAIL', provider_uid: $provider_uid}})
|
||||
OPTIONAL MATCH (n)-[pfr]-(pf:{PROWLER_FINDING_LABEL} {{status: 'FAIL'}})
|
||||
|
||||
RETURN path_principal, path_target,
|
||||
collect(DISTINCT pf) as dpf, collect(DISTINCT pfr) as dpfr
|
||||
@@ -146,7 +151,29 @@ AWS_{QUERY_NAME} = AttackPathsQueryDefinition(
|
||||
)
|
||||
```
|
||||
|
||||
### Network Exposure Query Pattern
|
||||
#### Other sub-pattern `path_target` shapes
|
||||
|
||||
The other 3 sub-patterns share the same `path_principal`, UNWIND, and RETURN as self-escalation. Only the `path_target` MATCH differs:
|
||||
|
||||
```cypher
// Lateral to user (e.g., IAM-002) - targets other IAM users
MATCH path_target = (aws)--(target_user:AWSUser)
WHERE any(resource IN stmt.resource WHERE resource = '*' OR target_user.arn CONTAINS resource OR resource CONTAINS target_user.name)

// Assume-role lateral (e.g., IAM-014) - targets roles the principal can assume
MATCH path_target = (aws)--(target_role:AWSRole)<-[:STS_ASSUMEROLE_ALLOW]-(principal)
WHERE any(resource IN stmt.resource WHERE resource = '*' OR target_role.arn CONTAINS resource OR resource CONTAINS target_role.name)

// PassRole + service (e.g., EC2-001) - targets roles trusting a service
MATCH path_target = (aws)--(target_role:AWSRole)-[:TRUSTS_AWS_PRINCIPAL]->(:AWSPrincipal {arn: '{service}.amazonaws.com'})
WHERE any(resource IN stmt.resource WHERE resource = '*' OR target_role.arn CONTAINS resource OR resource CONTAINS target_role.name)
```
|
||||
|
||||
**Multi-permission**: PassRole queries require a second permission. Add `MATCH (principal)--(policy2:AWSPolicy)--(stmt2:AWSPolicyStatement)` with its own WHERE before `path_target`, then check BOTH `stmt.resource` AND `stmt2.resource` against the target. See IAM-015 or EC2-001 in `aws.py` for examples.
|
||||
|
||||
### Network exposure pattern
|
||||
|
||||
The Internet node is reached via `CAN_ACCESS` through the already-scoped resource, not via a standalone lookup:
|
||||
|
||||
```python
|
||||
AWS_{QUERY_NAME} = AttackPathsQueryDefinition(
|
||||
@@ -156,18 +183,15 @@ AWS_{QUERY_NAME} = AttackPathsQueryDefinition(
|
||||
description="{Detailed description.}",
|
||||
provider="aws",
|
||||
cypher=f"""
|
||||
// Match the Internet sentinel node
|
||||
OPTIONAL MATCH (internet:Internet {{_provider_id: $provider_id}})
|
||||
|
||||
// Match exposed resources (MUST chain from `aws`)
|
||||
MATCH path = (aws:AWSAccount {{id: $provider_uid}})--(resource:EC2Instance)
|
||||
WHERE resource.exposed_internet = true
|
||||
|
||||
// Link Internet to resource
|
||||
OPTIONAL MATCH (internet)-[can_access:CAN_ACCESS]->(resource)
|
||||
// Internet node reached via path connectivity through the resource
|
||||
OPTIONAL MATCH (internet:Internet)-[can_access:CAN_ACCESS]->(resource)
|
||||
|
||||
UNWIND nodes(path) as n
|
||||
OPTIONAL MATCH (n)-[pfr]-(pf:ProwlerFinding {{status: 'FAIL', provider_uid: $provider_uid}})
|
||||
OPTIONAL MATCH (n)-[pfr]-(pf:{PROWLER_FINDING_LABEL} {{status: 'FAIL'}})
|
||||
|
||||
RETURN path, collect(DISTINCT pf) as dpf, collect(DISTINCT pfr) as dpfr,
|
||||
internet, can_access
|
||||
@@ -176,7 +200,7 @@ AWS_{QUERY_NAME} = AttackPathsQueryDefinition(
|
||||
)
|
||||
```
|
||||
|
||||
### Register in Query List
|
||||
### Register in query list
|
||||
|
||||
Add to the `{PROVIDER}_QUERIES` list at the bottom of the file:
|
||||
|
||||
@@ -189,11 +213,11 @@ AWS_QUERIES: list[AttackPathsQueryDefinition] = [
|
||||
|
||||
---
|
||||
|
||||
## Step-by-Step Creation Process
|
||||
## Step-by-step creation process
|
||||
|
||||
### 1. Read the Queries Module
|
||||
### 1. Read the queries module
|
||||
|
||||
**FIRST**, read all files in the queries module to understand the structure:
|
||||
**FIRST**, read all files in the queries module to understand the structure, type definitions, registration, and existing style:
|
||||
|
||||
```
|
||||
api/src/backend/api/attack_paths/queries/
|
||||
@@ -203,94 +227,50 @@ api/src/backend/api/attack_paths/queries/
|
||||
└── {provider}.py # Provider-specific queries (e.g., aws.py)
|
||||
```
|
||||
|
||||
Read these files to learn:
|
||||
**DO NOT** use generic templates. Match the exact style of existing queries in the file.
|
||||
|
||||
- Type definitions and available fields
|
||||
- How queries are registered
|
||||
- Current query patterns, style, and naming conventions
|
||||
### 2. Fetch and consult the Cartography schema
|
||||
|
||||
### 2. Determine Schema Source
|
||||
**This is the most important step.** Every node label, property, and relationship in the query must exist in the Cartography schema for the pinned version. Do not guess or rely on memory.
|
||||
|
||||
Check the Cartography dependency in `api/pyproject.toml`:
|
||||
Check `api/pyproject.toml` for the Cartography dependency, then fetch the schema:
|
||||
|
||||
```bash
|
||||
grep cartography api/pyproject.toml
|
||||
```
|
||||
|
||||
Parse the dependency to determine the schema source:
|
||||
|
||||
**If git-based dependency** (e.g., `cartography @ git+https://github.com/prowler-cloud/cartography@0.126.1`):
|
||||
|
||||
- Extract the repository (e.g., `prowler-cloud/cartography`)
|
||||
- Extract the version/tag (e.g., `0.126.1`)
|
||||
- Fetch schema from that repository at that tag
|
||||
|
||||
**If PyPI dependency** (e.g., `cartography = "^0.126.0"` or `cartography>=0.126.0`):
|
||||
|
||||
- Extract the version (e.g., `0.126.0`)
|
||||
- Use the official `cartography-cncf` repository
|
||||
|
||||
**Schema URL patterns** (ALWAYS use the specific version tag, not master/main):
|
||||
Build the schema URL (ALWAYS use the specific tag, not master/main):
|
||||
|
||||
```
|
||||
# Official Cartography (cartography-cncf)
|
||||
https://raw.githubusercontent.com/cartography-cncf/cartography/refs/tags/{version}/docs/root/modules/{provider}/schema.md
|
||||
# Git dependency (prowler-cloud/cartography@0.126.1):
|
||||
https://raw.githubusercontent.com/prowler-cloud/cartography/refs/tags/0.126.1/docs/root/modules/{provider}/schema.md
|
||||
|
||||
# Prowler fork (prowler-cloud)
|
||||
https://raw.githubusercontent.com/prowler-cloud/cartography/refs/tags/{version}/docs/root/modules/{provider}/schema.md
|
||||
# PyPI dependency (cartography = "^0.126.0"):
|
||||
https://raw.githubusercontent.com/cartography-cncf/cartography/refs/tags/0.126.0/docs/root/modules/{provider}/schema.md
|
||||
```
|
||||
|
||||
**Examples**:
|
||||
Read the schema to discover available node labels, properties, and relationships for the target resources. Internal labels (`_ProviderResource`, `_AWSResource`, `_Tenant_*`, `_Provider_*`) exist for isolation but should never appear in queries.
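The mapping from a dependency line to a schema URL can be sketched as follows. The helper name `cartography_schema_url` and its regular expressions are assumptions for illustration. For a PyPI range such as `^0.126.0` it simply takes the version written in the specifier, which may differ from the version actually installed.

```python
import re

# Hypothetical helper: derives the schema URL from a pyproject.toml dependency
# line, following the two URL patterns described above.
SCHEMA_URL = (
    "https://raw.githubusercontent.com/{org}/cartography/"
    "refs/tags/{version}/docs/root/modules/{provider}/schema.md"
)


def cartography_schema_url(dependency: str, provider: str) -> str:
    # Git dependency: take the organisation and tag from the repository URL.
    git = re.search(
        r"github\.com/([^/]+)/cartography(?:\.git)?@([\w.\-]+)", dependency
    )
    if git:
        org, version = git.group(1), git.group(2)
    else:
        # PyPI dependency: use the official repository and the written version.
        found = re.search(r"\d+(?:\.\d+)+", dependency)
        if found is None:
            raise ValueError(f"no version found in {dependency!r}")
        org, version = "cartography-cncf", found.group(0)
    return SCHEMA_URL.format(org=org, version=version, provider=provider)


print(cartography_schema_url(
    "cartography @ git+https://github.com/prowler-cloud/cartography@0.126.1", "aws"
))
print(cartography_schema_url('cartography = "^0.126.0"', "aws"))
```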
|
||||
|
||||
```bash
|
||||
# For prowler-cloud/cartography@0.126.1 (git), fetch AWS schema:
|
||||
https://raw.githubusercontent.com/prowler-cloud/cartography/refs/tags/0.126.1/docs/root/modules/aws/schema.md
|
||||
|
||||
# For cartography = "^0.126.0" (PyPI), fetch AWS schema:
|
||||
https://raw.githubusercontent.com/cartography-cncf/cartography/refs/tags/0.126.0/docs/root/modules/aws/schema.md
|
||||
```
|
||||
|
||||
**IMPORTANT**: Always match the schema version to the dependency version in `pyproject.toml`. Using master/main may reference node labels or properties that don't exist in the deployed version.
|
||||
|
||||
**Additional Prowler Labels**: The Attack Paths sync task adds labels that queries can reference:
|
||||
|
||||
- `ProwlerFinding` - Prowler finding nodes with `status`, `provider_uid` properties
|
||||
- `Internet` - Internet sentinel node with `_provider_id` property (used in network exposure queries)
|
||||
|
||||
Other internal labels (`_ProviderResource`, `_AWSResource`, `_Tenant_*`, `_Provider_*`) exist for isolation but should never be used in queries.
|
||||
|
||||
These are defined in `api/src/backend/tasks/jobs/attack_paths/config.py`.
|
||||
|
||||
### 3. Consult the Schema for Available Data
|
||||
|
||||
Use the Cartography schema to discover:
|
||||
|
||||
- What node labels exist for the target resources
|
||||
- What properties are available on those nodes
|
||||
- What relationships connect the nodes
|
||||
|
||||
This informs query design by showing what data is actually available to query.
|
||||
|
||||
### 4. Create Query Definition
|
||||
### 4. Create query definition
|
||||
|
||||
Use the appropriate pattern (privilege escalation or network exposure) with:
|
||||
|
||||
- **id**: Auto-generated as `{provider}-{kebab-case-description}`
|
||||
- **name**: Short, human-friendly label. No raw IAM permissions. For sourced queries (e.g., pathfinding.cloud), append the reference ID in parentheses: `"EC2 Instance Launch with Privileged Role (EC2-001)"`. If the name already has parentheses, prepend the ID inside them: `"ECS Service Creation with Privileged Role (ECS-003 - Existing Cluster)"`.
|
||||
- **short_description**: Brief explanation of the attack, no technical permissions. E.g., "Launch EC2 instances with privileged IAM roles to gain their permissions via IMDS."
|
||||
- **description**: Full technical explanation of the attack vector and impact. Plain text only, no HTML or technical permissions here.
|
||||
- **id**: `{provider}-{kebab-case-description}`
|
||||
- **name**: Short, human-friendly label. For sourced queries, append the reference ID: `"EC2 Instance Launch with Privileged Role (EC2-001)"`.
|
||||
- **short_description**: Brief explanation, no technical permissions.
|
||||
- **description**: Full technical explanation. Plain text only.
|
||||
- **provider**: Provider identifier (aws, azure, gcp, kubernetes, github)
|
||||
- **cypher**: The openCypher query with proper escaping
|
||||
- **parameters**: Optional list of user-provided parameters (use `parameters=[]` if none needed)
|
||||
- **attribution**: Optional `AttackPathsQueryAttribution(text, link)` for sourced queries. The `text` includes the source, reference ID, and technical permissions (e.g., `"pathfinding.cloud - EC2-001 - iam:PassRole + ec2:RunInstances"`). The `link` is the URL with a lowercase ID (e.g., `"https://pathfinding.cloud/paths/ec2-001"`). Omit (defaults to `None`) for non-sourced queries.
|
||||
- **parameters**: Optional list of user-provided parameters (`parameters=[]` if none)
|
||||
- **attribution**: Optional `AttackPathsQueryAttribution(text, link)` for sourced queries. The `text` includes source, reference ID, and permissions. The `link` uses a lowercase ID. Omit for non-sourced queries.
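As an illustration of the attribution convention, the sketch below assembles the `text` and `link` strings from a reference ID and a list of permissions. `build_attribution` is a hypothetical helper that returns plain strings; it does not construct the real `AttackPathsQueryAttribution` object.

```python
def build_attribution(reference_id, permissions):
    """Return (text, link) following the pathfinding.cloud convention above."""
    # Text: source, upper-case reference ID, then permissions joined by " + ".
    text = f"pathfinding.cloud - {reference_id.upper()} - {' + '.join(permissions)}"
    # Link: the same ID in lower case.
    link = f"https://pathfinding.cloud/paths/{reference_id.lower()}"
    return text, link


text, link = build_attribution("EC2-001", ["iam:PassRole", "ec2:RunInstances"])
print(text)
print(link)
```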
|
||||
|
||||
### 5. Add Query to Provider List
|
||||
### 5. Add query to provider list
|
||||
|
||||
Add the constant to the `{PROVIDER}_QUERIES` list.
|
||||
|
||||
---
|
||||
|
||||
## Query Naming Conventions
|
||||
## Query naming conventions
|
||||
|
||||
### Query ID
|
||||
|
||||
@@ -298,27 +278,19 @@ Add the constant to the `{PROVIDER}_QUERIES` list.
|
||||
{provider}-{category}-{description}
|
||||
```
|
||||
|
||||
Examples:
|
||||
Examples: `aws-ec2-privesc-passrole-iam`, `aws-ec2-instances-internet-exposed`
|
||||
|
||||
- `aws-ec2-privesc-passrole-iam`
|
||||
- `aws-iam-privesc-attach-role-policy-assume-role`
|
||||
- `aws-ec2-instances-internet-exposed`
|
||||
|
||||
### Query Constant Name
|
||||
### Query constant name
|
||||
|
||||
```
|
||||
{PROVIDER}_{CATEGORY}_{DESCRIPTION}
|
||||
```
|
||||
|
||||
Examples:
|
||||
|
||||
- `AWS_EC2_PRIVESC_PASSROLE_IAM`
|
||||
- `AWS_IAM_PRIVESC_ATTACH_ROLE_POLICY_ASSUME_ROLE`
|
||||
- `AWS_EC2_INSTANCES_INTERNET_EXPOSED`
|
||||
Examples: `AWS_EC2_PRIVESC_PASSROLE_IAM`, `AWS_EC2_INSTANCES_INTERNET_EXPOSED`
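Judging from the paired examples, the constant name is the query ID in upper case with hyphens replaced by underscores. A small sketch of that relationship is shown below. The helper names are hypothetical, and the rule is inferred from the examples rather than stated by the module.

```python
def constant_name(query_id: str) -> str:
    """Derive the constant name from a kebab-case query ID."""
    return query_id.replace("-", "_").upper()


def query_id(constant: str) -> str:
    """Derive the kebab-case query ID from a constant name."""
    return constant.replace("_", "-").lower()


print(constant_name("aws-ec2-privesc-passrole-iam"))
print(query_id("AWS_EC2_INSTANCES_INTERNET_EXPOSED"))
```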
|
||||
|
||||
---
|
||||
|
||||
## Query Categories
|
||||
## Query categories
|
||||
|
||||
| Category | Description | Example |
|
||||
| -------------------- | ------------------------------ | ------------------------- |
|
||||
@@ -329,15 +301,15 @@ Examples:
|
||||
|
||||
---
|
||||
|
||||
## Common openCypher Patterns
|
||||
## Common openCypher patterns
|
||||
|
||||
### Match Account and Principal
|
||||
### Match account and principal
|
||||
|
||||
```cypher
|
||||
MATCH path_principal = (aws:AWSAccount {id: $provider_uid})--(principal:AWSPrincipal)--(policy:AWSPolicy)--(stmt:AWSPolicyStatement)
|
||||
```
|
||||
|
||||
### Check IAM Action Permissions
|
||||
### Check IAM action permissions
|
||||
|
||||
```cypher
|
||||
WHERE stmt.effect = 'Allow'
|
||||
@@ -348,13 +320,21 @@ WHERE stmt.effect = 'Allow'
|
||||
)
|
||||
```
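The action check used throughout the privilege escalation queries (exact action, service wildcard, or `*`, all under an `Allow` effect) can be modelled in plain Python to reason about which statements it accepts. This is a sketch of the predicate's semantics, not code from the repository, and `statement_allows` is a hypothetical name.

```python
def statement_allows(effect, actions, permission):
    """Mirror the Cypher predicate: Allow + exact action, service:*, or *."""
    wanted = permission.lower()
    service_wildcard = wanted.split(":", 1)[0] + ":*"
    return effect == "Allow" and any(
        action.lower() == wanted
        or action.lower() == service_wildcard
        or action == "*"
        for action in actions
    )


print(statement_allows("Allow", ["IAM:PassRole"], "iam:PassRole"))
print(statement_allows("Allow", ["iam:Pass*"], "iam:PassRole"))
```

The second call returns `False`: like the Cypher pattern it mirrors, this predicate does not expand partial wildcards such as `iam:Pass*`.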
|
||||
|
||||
### Find Roles Trusting a Service
|
||||
### Find roles trusting a service
|
||||
|
||||
```cypher
|
||||
MATCH path_target = (aws)--(target_role:AWSRole)-[:TRUSTS_AWS_PRINCIPAL]->(:AWSPrincipal {arn: 'ec2.amazonaws.com'})
|
||||
```
|
||||
|
||||
### Check Resource Scope
|
||||
### Find roles the principal can assume
|
||||
|
||||
Note the arrow direction - in this pattern `STS_ASSUMEROLE_ALLOW` points from the principal to the role it can assume:
|
||||
|
||||
```cypher
|
||||
MATCH path_target = (aws)--(target_role:AWSRole)<-[:STS_ASSUMEROLE_ALLOW]-(principal)
|
||||
```
|
||||
|
||||
### Check resource scope
|
||||
|
||||
```cypher
|
||||
WHERE any(resource IN stmt.resource WHERE
|
||||
@@ -364,26 +344,16 @@ WHERE any(resource IN stmt.resource WHERE
|
||||
)
|
||||
```
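The resource-scope check from the privilege escalation patterns (`resource = '*'`, target ARN `CONTAINS` the resource, or the resource `CONTAINS` the target name) can likewise be modelled in Python. The function name `resource_covers` and the sample ARNs are made up for this sketch.

```python
def resource_covers(resources, target_arn, target_name):
    """Mirror the Cypher predicate; `in` is a case-sensitive substring test,
    matching the behaviour of CONTAINS."""
    return any(
        resource == "*"
        or resource in target_arn
        or target_name in resource
        for resource in resources
    )


arn = "arn:aws:iam::123456789012:role/admin"
print(resource_covers(["*"], arn, "admin"))
print(resource_covers(["arn:aws:iam::123456789012:role/admin-*"], arn, "admin"))
print(resource_covers(["arn:aws:iam::123456789012:role/*"], arn, "admin"))
```

The last call returns `False`: a wildcarded ARN such as `role/*` is neither a substring of the target ARN nor does it contain the target name, so this substring-based check treats it as out of scope.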
|
||||
|
||||
### Match Internet Sentinel Node
|
||||
### Internet node via path connectivity
|
||||
|
||||
Used in network exposure queries. The Internet node is a real graph node, scoped by `_provider_id`:
|
||||
The Internet node is reached through `CAN_ACCESS` relationships to already-scoped resources. No standalone lookup needed:
|
||||
|
||||
```cypher
|
||||
OPTIONAL MATCH (internet:Internet {_provider_id: $provider_id})
|
||||
```
|
||||
|
||||
### Link Internet to Exposed Resource
|
||||
|
||||
The `CAN_ACCESS` relationship is a real graph relationship linking the Internet node to exposed resources:
|
||||
|
||||
```cypher
|
||||
OPTIONAL MATCH (internet)-[can_access:CAN_ACCESS]->(resource)
|
||||
OPTIONAL MATCH (internet:Internet)-[can_access:CAN_ACCESS]->(resource)
|
||||
```
|
||||
|
||||
### Multi-label OR (match multiple resource types)
|
||||
|
||||
When a query needs to match different resource types in the same position, use label checks in WHERE:
|
||||
|
||||
```cypher
|
||||
MATCH path = (aws:AWSAccount {id: $provider_uid})-[r]-(x)-[q]-(y)
|
||||
WHERE (x:EC2PrivateIp AND x.public_ip = $ip)
|
||||
@@ -392,11 +362,11 @@ WHERE (x:EC2PrivateIp AND x.public_ip = $ip)
|
||||
OR (x:ElasticIPAddress AND x.public_ip = $ip)
|
||||
```
|
||||
|
||||
### Include Prowler Findings
|
||||
### Include Prowler findings
|
||||
|
||||
```cypher
|
||||
UNWIND nodes(path_principal) + nodes(path_target) as n
|
||||
OPTIONAL MATCH (n)-[pfr]-(pf:ProwlerFinding {status: 'FAIL', provider_uid: $provider_uid})
|
||||
OPTIONAL MATCH (n)-[pfr]-(pf:{PROWLER_FINDING_LABEL} {{status: 'FAIL'}})
|
||||
|
||||
RETURN path_principal, path_target,
|
||||
collect(DISTINCT pf) as dpf, collect(DISTINCT pfr) as dpfr
|
||||
@@ -411,154 +381,84 @@ RETURN path, collect(DISTINCT pf) as dpf, collect(DISTINCT pfr) as dpfr,
|
||||
|
||||
---
|
||||
|
||||
## Common Node Labels by Provider
|
||||
## Prowler-specific labels and relationships
|
||||
|
||||
### AWS
|
||||
These are added by the sync task, not part of the Cartography schema. For all other node labels, properties, and relationships, **always consult the Cartography schema** (see step 2 of the step-by-step creation process).
|
||||
|
||||
| Label | Description |
|
||||
| --------------------- | --------------------------------------- |
|
||||
| `AWSAccount` | AWS account root |
|
||||
| `AWSPrincipal` | IAM principal (user, role, service) |
|
||||
| `AWSRole` | IAM role |
|
||||
| `AWSUser` | IAM user |
|
||||
| `AWSPolicy` | IAM policy |
|
||||
| `AWSPolicyStatement` | Policy statement |
|
||||
| `AWSTag` | Resource tag (key/value) |
|
||||
| `EC2Instance` | EC2 instance |
|
||||
| `EC2SecurityGroup` | Security group |
|
||||
| `EC2PrivateIp` | EC2 private IP (has `public_ip`) |
|
||||
| `IpPermissionInbound` | Inbound security group rule |
|
||||
| `IpRange` | IP range (e.g., `0.0.0.0/0`) |
|
||||
| `NetworkInterface` | ENI (has `public_ip`) |
|
||||
| `ElasticIPAddress` | Elastic IP (has `public_ip`) |
|
||||
| `S3Bucket` | S3 bucket |
|
||||
| `RDSInstance` | RDS database instance |
|
||||
| `LoadBalancer` | Classic ELB |
|
||||
| `LoadBalancerV2` | ALB/NLB |
|
||||
| `ELBListener` | Classic ELB listener |
|
||||
| `ELBV2Listener` | ALB/NLB listener |
|
||||
| `LaunchTemplate` | EC2 launch template |
|
||||
| `Internet` | Internet sentinel node (`_provider_id`) |
|
||||
|
||||
### Common Relationships
|
||||
|
||||
| Relationship | Description |
|
||||
| ---------------------- | ---------------------------------- |
|
||||
| `TRUSTS_AWS_PRINCIPAL` | Role trust relationship |
|
||||
| `STS_ASSUMEROLE_ALLOW` | Can assume role |
|
||||
| `CAN_ACCESS` | Internet-to-resource exposure link |
|
||||
| `POLICY` | Has policy attached |
|
||||
| `STATEMENT` | Policy has statement |
|
||||
| Label/Relationship     | Description                                        |
| ---------------------- | -------------------------------------------------- |
| `ProwlerFinding`       | Finding node (`status`, `severity`, `check_id`)    |
| `Internet`             | Internet sentinel node                             |
| `CAN_ACCESS`           | Internet-to-resource exposure (relationship)       |
| `HAS_FINDING`          | Resource-to-finding link (relationship)            |
| `TRUSTS_AWS_PRINCIPAL` | Role trust relationship                            |
| `STS_ASSUMEROLE_ALLOW` | Can assume role (direction: principal -> role)     |
|
||||
|
||||
---
|
||||
|
||||
## Parameters
|
||||
|
||||
For queries requiring user input, define parameters:
|
||||
For queries requiring user input:
|
||||
|
||||
```python
|
||||
parameters=[
|
||||
AttackPathsQueryParameterDefinition(
|
||||
name="ip",
|
||||
label="IP address",
|
||||
# data_type defaults to "string", cast defaults to str.
|
||||
# For non-string params, set both: data_type="integer", cast=int
|
||||
description="Public IP address, e.g. 192.0.2.0.",
|
||||
placeholder="192.0.2.0",
|
||||
),
|
||||
AttackPathsQueryParameterDefinition(
|
||||
name="tag_key",
|
||||
label="Tag key",
|
||||
description="Tag key to filter resources.",
|
||||
placeholder="Environment",
|
||||
),
|
||||
],
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Best Practices
|
||||
## Best practices
|
||||
|
||||
1. **Always scope by provider**: Use `{id: $provider_uid}` on `AWSAccount` nodes. Use `{_provider_id: $provider_id}` on any other node that needs provider scoping (e.g., `Internet`).
|
||||
|
||||
2. **Use consistent naming**: Follow existing patterns in the file
|
||||
|
||||
3. **Include Prowler findings**: Always add the OPTIONAL MATCH for ProwlerFinding nodes
|
||||
|
||||
4. **Return distinct findings**: Use `collect(DISTINCT pf)` to avoid duplicates
|
||||
|
||||
5. **Comment the query purpose**: Add inline comments explaining each MATCH clause
|
||||
|
||||
6. **Validate schema first**: Ensure all node labels and properties exist in Cartography schema
|
||||
|
||||
7. **Chain all MATCHes from the root account node**: Every `MATCH` clause must connect to the `aws` variable (or another variable already bound to the account's subgraph). The tenant database contains data from multiple providers — an unanchored `MATCH` would return nodes from all providers, breaking provider isolation.
|
||||
1. **Chain all MATCHes from the root account node**: Every `MATCH` clause must connect to the `aws` variable (or another variable already bound to the account's subgraph). An unanchored `MATCH` would return nodes from all providers.
|
||||
|
||||
```cypher
|
||||
// WRONG: matches ALL AWSRoles across all providers in the tenant DB
|
||||
// WRONG: matches ALL AWSRoles across all providers
|
||||
MATCH (role:AWSRole) WHERE role.name = 'admin'
|
||||
|
||||
// CORRECT: scoped to the specific account's subgraph
|
||||
MATCH (aws)--(role:AWSRole) WHERE role.name = 'admin'
|
||||
```
|
||||
|
||||
The `Internet` node is an exception: it uses `OPTIONAL MATCH` with `_provider_id` for scoping instead of chaining from `aws`.
|
||||
**Exception**: A second-permission MATCH like `MATCH (principal)--(policy2:AWSPolicy)--(stmt2:AWSPolicyStatement)` is safe because `principal` is already bound to the account's subgraph by the first MATCH. It does not need to chain from `aws` again.
|
||||
|
||||
2. **Include Prowler findings**: Always add `OPTIONAL MATCH (n)-[pfr]-(pf:{PROWLER_FINDING_LABEL} {{status: 'FAIL'}})` with `collect(DISTINCT pf)`.
|
||||
|
||||
3. **Comment the query purpose**: Add inline comments explaining each MATCH clause.
|
||||
|
||||
4. **Never use internal labels in queries**: `_ProviderResource`, `_AWSResource`, `_Tenant_*`, `_Provider_*` are for system isolation. They should never appear in predefined or custom query text.
|
||||
|
||||
5. **Internet node uses path connectivity**: Reach it via `OPTIONAL MATCH (internet:Internet)-[can_access:CAN_ACCESS]->(resource)` where `resource` is already scoped by the account anchor. No standalone lookup.
|
||||
|
||||
---
|
||||
|
||||
## openCypher Compatibility
|
||||
## openCypher compatibility
|
||||
|
||||
Queries must be written in **openCypher Version 9** to ensure compatibility with both Neo4j and Amazon Neptune.
|
||||
Queries must be written in **openCypher Version 9** for compatibility with both Neo4j and Amazon Neptune.
|
||||
|
||||
> **Why Version 9?** Amazon Neptune implements openCypher Version 9. By targeting this specification, queries work on both Neo4j and Neptune without modification.
|
||||
### Avoid these (not in openCypher spec)
|
||||
|
||||
### Avoid These (Not in openCypher spec)
|
||||
|
||||
| Feature | Reason | Use instead |
|
||||
| -------------------------- | ----------------------------------------------- | ------------------------------------------------------ |
|
||||
| APOC procedures (`apoc.*`) | Neo4j-specific plugin, not available in Neptune | Real nodes and relationships in the graph |
|
||||
| Neptune extensions | Not available in Neo4j | Standard openCypher |
|
||||
| `reduce()` function | Not in openCypher spec | `UNWIND` + `collect()` |
|
||||
| `FOREACH` clause | Not in openCypher spec | `WITH` + `UNWIND` + `SET` |
|
||||
| Regex operator (`=~`) | Not supported in Neptune | `toLower()` + exact match, or `CONTAINS`/`STARTS WITH` |
|
||||
| `CALL () { UNION }` | Complex, hard to maintain | Multi-label OR in WHERE (see patterns section) |
|
||||
| Feature                    | Use instead                                            |
| -------------------------- | ------------------------------------------------------ |
| APOC procedures (`apoc.*`) | Real nodes and relationships in the graph              |
| Neptune extensions         | Standard openCypher                                    |
| `reduce()` function        | `UNWIND` + `collect()`                                 |
| `FOREACH` clause           | `WITH` + `UNWIND` + `SET`                              |
| Regex operator (`=~`)      | `toLower()` + exact match, or `CONTAINS`/`STARTS WITH`. One legacy query uses `=~` - do not add new usages |
| `CALL () { UNION }`        | Multi-label OR in WHERE (see patterns section)         |
|
||||
|
||||
---
|
||||
|
||||
## Reference
|
||||
|
||||
### pathfinding.cloud (Attack Path Definitions)
|
||||
|
||||
- **Repository**: https://github.com/DataDog/pathfinding.cloud
|
||||
- **All paths JSON**: `https://raw.githubusercontent.com/DataDog/pathfinding.cloud/main/docs/paths.json`
|
||||
- Always use Bash with `curl | jq` to fetch paths (WebFetch truncates the large JSON)
|
||||
|
||||
### Cartography Schema
|
||||
|
||||
- **URL pattern**: `https://raw.githubusercontent.com/{org}/cartography/refs/tags/{version}/docs/root/modules/{provider}/schema.md`
|
||||
- Always use the version from `api/pyproject.toml`, not master/main
|
||||
|
||||
### openCypher Specification
|
||||
|
||||
- **Neptune openCypher compliance** (what Neptune supports): https://docs.aws.amazon.com/neptune/latest/userguide/feature-opencypher-compliance.html
|
||||
- **openCypher project** (spec, grammar, TCK): https://github.com/opencypher/openCypher
|
||||
|
||||
---
|
||||
|
||||
## Learning from the Queries Module
|
||||
|
||||
**IMPORTANT**: Before creating a new query, ALWAYS read the entire queries module:
|
||||
|
||||
```
|
||||
api/src/backend/api/attack_paths/queries/
|
||||
├── __init__.py # Module exports
|
||||
├── types.py # Type definitions
|
||||
├── registry.py # Registry logic
|
||||
└── {provider}.py # Provider queries (aws.py, etc.)
|
||||
```
|
||||
|
||||
Use the existing queries to learn:
|
||||
|
||||
- Query structure and formatting
|
||||
- Variable naming conventions
|
||||
- How to include Prowler findings
|
||||
- Comment style
|
||||
|
||||
**DO NOT** use generic templates. Match the exact style of existing queries in the file.
|
||||
- **pathfinding.cloud**: https://github.com/DataDog/pathfinding.cloud (use `curl | jq`, not WebFetch)
|
||||
- **Cartography schema**: `https://raw.githubusercontent.com/{org}/cartography/refs/tags/{version}/docs/root/modules/{provider}/schema.md`
|
||||
- **Neptune openCypher compliance**: https://docs.aws.amazon.com/neptune/latest/userguide/feature-opencypher-compliance.html
|
||||
- **openCypher spec**: https://github.com/opencypher/openCypher
|
||||
|
||||