Graylog: Manage retention
How to configure log retention
Retention determines how many messages are kept in the OpenSearch database. You can configure it by message count, maximum age, or total size.
For example, you might choose to keep messages from the past 365 days, retain up to 200 million messages, or reserve a total of 400 GB of storage space.
Understanding indices
Before defining your retention policy, it helps to understand how the indices used by Graylog and OpenSearch work. Think of indices as physical containers. Graylog "opens" a container (an index) and places incoming messages inside it. When the quota assigned to that container is reached, the container is closed and stored on a shelf, and a new container is opened for subsequent messages.
You can set this quota using different criteria:
- A number of messages: "Keep 20 million messages per container, then start a new one."
- A time limitation: "Use a container for 10 days, then switch to a new one."
- A size limitation: "Store 20 GB per container, then move on to a new one."
You also define the maximum number of containers that can be kept on the shelf. When that number is exceeded, the oldest containers are automatically deleted. For instance, if you set a maximum of 20 containers and have 22 on the shelf, the 2 oldest containers will be removed.
In this analogy, the containers are the indices, the shelf is OpenSearch, and the maximum number of containers is the maximum number of indices.
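The container analogy can be sketched as a small simulation. This is only an illustration, not Graylog's actual implementation: the function name `simulate_rotation` is hypothetical, rotation is done by message count, and the sketch assumes that the currently open index counts toward the maximum number of indices.

```python
from collections import deque


def simulate_rotation(total_messages, messages_per_index, max_indices):
    """Return the message counts of the indices left after retention.

    The last entry is the currently open index (assumed to count
    toward max_indices).
    """
    indices = deque([0])  # start with one open, empty index
    for _ in range(total_messages):
        if indices[-1] >= messages_per_index:
            # Quota reached: close the current index and open a new one.
            indices.append(0)
            if len(indices) > max_indices:
                # Too many indices: delete the oldest one.
                indices.popleft()
        indices[-1] += 1
    return list(indices)


# 95 messages, 10 messages per index, at most 5 indices kept.
print(simulate_rotation(95, 10, 5))  # [10, 10, 10, 10, 5]
```

In this run, ten indices are created in total, but only the five most recent survive, so the 50 oldest messages are gone.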
Choosing a rotation strategy
Graylog provides three rotation strategies:
- "Index time" sets how long each index receives messages before a new index is started, for example, 14 days per index.
- "Index message count" sets the maximum number of messages per index, for example, 20 million messages per index.
- "Index size" limits the maximum size of an index, for example, 40 GB per index.
You can select one of these strategies based on your specific requirements. For instance, selecting "Index time" helps ensure that you always have logs from the past X days.
Be careful to estimate your disk storage needs accurately.
For example, if you store 1 GB of logs per day and decide to keep logs for the past 365 days, you will need 365 GB of disk space. Remember that extra operating space must also be reserved (see below).
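The estimate above is simple arithmetic, sketched here in Python. The function name `required_disk_gb` is hypothetical, and the default reserve of 15 GB is the free space recommended later in this guide. The sketch assumes a constant daily volume and ignores any index overhead, so treat the result as a lower bound rather than an exact figure.

```python
def required_disk_gb(daily_gb, retention_days, reserve_gb=15.0):
    """Estimate disk space: stored logs plus a free-space reserve."""
    return daily_gb * retention_days + reserve_gb


# 1 GB of logs per day, kept for 365 days, plus the 15 GB reserve.
print(required_disk_gb(1, 365))  # 380.0
```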
Define the retention parameters
By default, Graylog limits the number of indices to 20. You can adjust this value to suit your needs. For example, if you want to store logs from the past 365 days with 20 indices, divide 365 days by 20 indices: the result is 18.25, which you round up to 19 days per index so that the full 365 days are covered.
You can perform similar calculations for the other strategies:
- For the "Index message count" strategy: if you want to keep 200 million messages with a maximum of 20 indices, then 200 million messages divided by 20 indices gives 10 million messages per index.
- For the "Index size" strategy: if you want to maintain 400 GB of logs with a maximum of 10 indices, then 400 GB divided by 10 indices gives 40 GB per index.
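The three calculations above follow the same pattern: divide the total you want to keep by the maximum number of indices, and round up. A minimal sketch, where `per_index_quota` is a hypothetical helper name:

```python
def per_index_quota(total, max_indices):
    """Divide a retention target across indices, rounding up."""
    # Integer ceiling division, so the full target is always covered.
    return (total + max_indices - 1) // max_indices


print(per_index_quota(365, 20))          # days per index: 19
print(per_index_quota(200_000_000, 20))  # messages per index: 10000000
print(per_index_quota(400, 10))          # GB per index: 40
```

Rounding up means you may keep slightly more than the target (for example, 20 indices of 19 days cover 380 days), which should be accounted for when sizing the disk.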
We recommend always keeping at least 15 GB of free disk space for logs, Graylog's journal and MongoDB data.
If free disk space runs out, OpenSearch will block its operations and you might need to upgrade to a larger instance.
Configure the retention policy
To configure the retention policy, open the Graylog interface. Under "System", select "Indices", then click the "Edit" button of the "Default index set".
In the example below, the configuration sets a maximum of 27 indices, with each index retaining 14 days of logs. This setup retains logs for approximately a year (378 days).
We do not recommend keeping more than 14 days of messages per index.
Retention configuration to keep logs for a year
When choosing "Index time" as the rotation strategy, you need to define the duration using the ISO 8601 duration format.
For example, "P7D" means 7 days, "P14D" means 14 days and so on.
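As a rough sketch, the values for the "Index time" strategy can be derived from a retention target in days. The function name and the dictionary keys below are hypothetical and are not Graylog field names. The default of 14 days per index follows the recommendation above.

```python
def index_time_settings(retention_days, days_per_index=14):
    """Derive an ISO 8601 rotation period and a maximum index count."""
    # Ceiling division: enough indices to cover the whole retention target.
    max_indices = -(-retention_days // days_per_index)
    return {
        "rotation_period": f"P{days_per_index}D",  # e.g. "P14D" = 14 days
        "max_indices": max_indices,
        "total_days": max_indices * days_per_index,
    }


print(index_time_settings(365))
# {'rotation_period': 'P14D', 'max_indices': 27, 'total_days': 378}
```

For a one-year target, this gives the same 27 indices of 14 days (378 days) as the example configuration.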
Going further
If you want to learn more about indices, we strongly encourage you to read the official documentation.
Handle errors about OpenSearch's read-only indices
Occasionally, OpenSearch may switch to read-only mode and you might encounter errors such as:
- "Flood stage disk watermark exceeded, all indices on this node will be marked read-only"
- "FORBIDDEN/12/index read-only / allow delete (api)"
These errors occur as part of OpenSearch's protection mechanism when disk space is critically low. When available disk space drops below 7 GB, OpenSearch sets indices to read-only as a precautionary measure to prevent data corruption.
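The two thresholds mentioned in this guide (15 GB of recommended free space and the 7 GB read-only limit) can be combined into a simple check. This is only a sketch: the function name and the returned labels are hypothetical, and how you obtain the free disk space of your instance is left out.

```python
def classify_free_space(free_gb, recommended_gb=15.0, read_only_gb=7.0):
    """Classify free disk space against the thresholds from this guide."""
    if free_gb < read_only_gb:
        # OpenSearch is expected to set indices to read-only.
        return "read-only"
    if free_gb < recommended_gb:
        # Still writable, but below the recommended free space.
        return "low"
    return "ok"


print(classify_free_space(20))  # ok
print(classify_free_space(10))  # low
print(classify_free_space(5))   # read-only
```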
If you encounter these errors, you have two options:
- Reconfigure your retention policy to keep fewer logs. After adjusting the policy, delete the oldest index to free up disk space and allow OpenSearch to switch back to read-write mode. Please note that deleting an index means that all data in that index will be lost.
- Upgrade your instance to one with a larger disk. With a single click in your Stackhero dashboard, the instance will restart with additional disk space and OpenSearch will automatically return to read-write mode.