Files
prowler/api/docs/partitions.md
2024-11-25 13:12:54 +01:00

2.8 KiB

Partitions

Overview

Partitions are used to split the data in a table into smaller chunks, allowing for more efficient querying and storage.

The Prowler API uses partitions to store findings. The partitions are created based on the UUIDv7 id field.

You can use the Prowler API without ever creating additional partitions. This documentation is only relevant if you want to manage partitions to gain additional query performance.

Required Postgres Configuration

There are 3 configuration options that need to be set in the postgres.conf file to get the most performance out of the partitioning:

  • enable_partition_pruning = on (default is on)
  • enable_partitionwise_join = on (default is off)
  • enable_partitionwise_aggregate = on (default is off)

For more information on these options, see the Postgres documentation.

Partitioning Strategy

The partitioning strategy is defined in the api.partitions module. The strategy is responsible for creating and deleting partitions based on the provided configuration.

Managing Partitions

The application will run without any extra work on your part. If you want to add or delete partitions, you can use the following commands:

To manage the partitions, run python manage.py pgpartition --using admin

This command will generate a list of partitions to create and delete based on the provided configuration.

By default, the command will prompt you to accept the changes before applying them.

Finding:
   + 2024_nov
      name: 2024_nov
      from_values: 0192e505-9000-72c8-a47c-cce719d8fb93
      to_values: 01937f84-5418-7eb8-b2a6-e3be749e839d
      size_unit: months
      size_value: 1
   + 2024_dec
      name: 2024_dec
      from_values: 01937f84-5800-7b55-879c-9cdb46f023f6
      to_values: 01941f29-7818-7f9f-b4be-20b05bb2f574
      size_unit: months
      size_value: 1

0 partitions will be deleted
2 partitions will be created

If you choose to apply the partitions, tables will be generated with the following format: <table_name>_<year>_<month>.

For more info on the partitioning manager, see https://github.com/SectorLabs/django-postgres-extra

Changing the Partitioning Parameters

There are 4 environment variables that can be used to change the partitioning parameters:

  • DJANGO_MANAGE_DB_PARTITIONS: Allow Django to manage database partitons. By default is set to False.
  • FINDINGS_TABLE_PARTITION_MONTHS: Set the months for each partition. Setting the partition monts to 1 will create partitions with a size of 1 natural month.
  • FINDINGS_TABLE_PARTITION_COUNT: Set the number of partitions to create
  • FINDINGS_TABLE_PARTITION_MAX_AGE_MONTHS: Set the number of months to keep partitions before deleting them. Setting this to None will keep partitions indefinitely.