docs(lighthouse): update lighthouse architecture docs (#9576)

Co-authored-by: Chandrapal Badshah <12944530+Chan9390@users.noreply.github.com>
Co-authored-by: Rubén De la Torre Vico <ruben@prowler.com>
Co-authored-by: Andoni Alonso <14891798+andoniaf@users.noreply.github.com>
2026-02-09 15:10:36 +00:00 · 2026-01-12 17:18:58 +08:00
parent 05466cff22
commit 6c01151d78
12 changed files with 448 additions and 244 deletions
--- a/docs/getting-started/products/prowler-lighthouse-ai.mdx
+++ b/docs/getting-started/products/prowler-lighthouse-ai.mdx
@@ -59,6 +59,14 @@ Prowler Lighthouse AI is powerful, but there are limitations:
 - **NextJS session dependence**: If your Prowler application session expires or logs out, Lighthouse AI will error out. Refresh and log back in to continue.
 - **Response quality**: The response quality depends on the selected LLM provider and model. Choose models with strong tool-calling capabilities for best results. We recommend `gpt-5` model from OpenAI.

+## Extending Lighthouse AI
+
+Lighthouse AI retrieves data through Prowler MCP. To add new capabilities, extend the Prowler MCP Server with additional tools; Lighthouse AI discovers them automatically.
+
+For development details, see:
+- [Lighthouse AI Architecture](/developer-guide/lighthouse-architecture) - Internal architecture and extension points
+- [Extending the MCP Server](/developer-guide/mcp-server) - Adding new tools to Prowler MCP
+
 ### Getting Help

 If you encounter issues with Prowler Lighthouse AI or have suggestions for improvements, please [reach out through our Slack channel](https://goto.prowler.com/slack).
@@ -67,94 +75,6 @@ If you encounter issues with Prowler Lighthouse AI or have suggestions for impro

 The following API endpoints are accessible to Prowler Lighthouse AI. Data from the following API endpoints could be shared with LLM provider depending on the scope of user's query:

-#### Accessible API Endpoints
-
-**User Management:**
-
- List all users - `/api/v1/users`
- Retrieve the current user's information - `/api/v1/users/me`
-
-**Provider Management:**
-
- List all providers - `/api/v1/providers`
- Retrieve data from a provider - `/api/v1/providers/{id}`
-
-**Scan Management:**
-
- List all scans - `/api/v1/scans`
- Retrieve data from a specific scan - `/api/v1/scans/{id}`
-
-**Resource Management:**
-
- List all resources - `/api/v1/resources`
- Retrieve data for a resource - `/api/v1/resources/{id}`
-
-**Findings Management:**
-
- List all findings - `/api/v1/findings`
- Retrieve data from a specific finding - `/api/v1/findings/{id}`
- Retrieve metadata values from findings - `/api/v1/findings/metadata`
-
-**Overview Data:**
-
- Get aggregated findings data - `/api/v1/overviews/findings`
- Get findings data by severity - `/api/v1/overviews/findings_severity`
- Get aggregated provider data - `/api/v1/overviews/providers`
- Get findings data by service - `/api/v1/overviews/services`
-
-**Compliance Management:**
-
- List compliance overviews (optionally filter by scan) - `/api/v1/compliance-overviews`
- Retrieve data from a specific compliance overview - `/api/v1/compliance-overviews/{id}`
-
-#### Excluded API Endpoints
-
-Not all Prowler API endpoints are integrated with Lighthouse AI. They are intentionally excluded for the following reasons:
-
- OpenAI/other LLM providers shouldn't have access to sensitive data (like fetching provider secrets and other sensitive config)
- Users queries don't need responses from those API endpoints (ex: tasks, tenant details, downloading zip file, etc.)
-
-**Excluded Endpoints:**
-
-**User Management:**
-
- List specific users information - `/api/v1/users/{id}`
- List user memberships - `/api/v1/users/{user_pk}/memberships`
- Retrieve membership data from the user - `/api/v1/users/{user_pk}/memberships/{id}`
-
-**Tenant Management:**
-
- List all tenants - `/api/v1/tenants`
- Retrieve data from a tenant - `/api/v1/tenants/{id}`
- List tenant memberships - `/api/v1/tenants/{tenant_pk}/memberships`
- List all invitations - `/api/v1/tenants/invitations`
- Retrieve data from tenant invitation - `/api/v1/tenants/invitations/{id}`
-
-**Security and Configuration:**
-
- List all secrets - `/api/v1/providers/secrets`
- Retrieve data from a secret - `/api/v1/providers/secrets/{id}`
- List all provider groups - `/api/v1/provider-groups`
- Retrieve data from a provider group - `/api/v1/provider-groups/{id}`
-
-**Reports and Tasks:**
-
- Download zip report - `/api/v1/scans/{v1}/report`
- List all tasks - `/api/v1/tasks`
- Retrieve data from a specific task - `/api/v1/tasks/{id}`
-
-**Lighthouse AI Configuration:**
-
- List LLM providers - `/api/v1/lighthouse/providers`
- Retrieve LLM provider - `/api/v1/lighthouse/providers/{id}`
- List available models - `/api/v1/lighthouse/models`
- Retrieve tenant configuration - `/api/v1/lighthouse/configuration`
-
-<Note>
-Agents only have access to hit GET endpoints. They don't have access to other HTTP methods.
-
-</Note>
-
 ## FAQs

 **1. Which LLM providers are supported?**
@@ -167,13 +87,21 @@ Lighthouse AI supports three providers:

 For detailed configuration instructions, see [Using Multiple LLM Providers with Lighthouse](/user-guide/tutorials/prowler-app-lighthouse-multi-llm).

-**2. Why a multi-agent supervisor model?**
+**2. Why don't some models appear in Lighthouse AI?**

-Context windows are limited. While demo data fits inside the context window, querying real-world data often exceeds it. A multi-agent architecture is used so different agents fetch different sizes of data and respond with the minimum required data to the supervisor. This spreads the context window usage across agents.
+LLM providers offer different types of models, and not every model can be integrated with Lighthouse AI (for example, text-to-speech, vision, embedding, and computer-use models).
+
+Lighthouse AI requires models that support:
+
+- Text input
+- Text output
+- Tool calling
+
+Lighthouse AI [automatically filters](https://github.com/prowler-cloud/prowler/blob/master/api/src/backend/tasks/jobs/lighthouse_providers.py#L341-L353) out models that do not support these capabilities, so some provider models may not appear in the Lighthouse AI model list.

 **3. Is my security data shared with LLM providers?**

-Minimal data is shared to generate useful responses. Agents can access security findings and remediation details when needed. Provider secrets are protected by design and cannot be read. The LLM provider credentials configured with Lighthouse AI are only accessible to our NextJS server and are never sent to the LLM providers. Resource metadata (names, tags, account/project IDs, etc) may be shared with the configured LLM provider based on query requirements.
+Minimal data is shared to generate useful responses. The agent can access security findings and remediation details when needed. Provider secrets are protected by design and cannot be read. The LLM provider credentials configured with Lighthouse AI are only accessible to the Next.js server and are never sent to the LLM providers. Resource metadata (names, tags, account/project IDs, etc.) may be shared with the configured LLM provider based on query requirements.

 **4. Can the Lighthouse AI change my cloud environment?**