Valkey: Prometheus
Prometheus metrics
👋 Welcome to the Stackhero documentation!
Stackhero offers a ready-to-use Valkey cloud solution that provides a host of benefits, including:
- Redis Commander web UI included.
- Unlimited message size and transfers.
- Effortless updates with just a click.
- Optimal performance and robust security powered by a private and dedicated VM.
Save time and simplify your life: it only takes 5 minutes to try Stackhero's Valkey cloud hosting solution!
Stackhero provides the capability to retrieve metrics in Prometheus format for each of your services. These metrics use the valkey_ prefix when returned to Prometheus, making them easy to identify and integrate with your monitoring tools.
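As a small illustration of the prefix, the sketch below parses text in the Prometheus exposition format and keeps only the metrics whose name starts with valkey_, stripping the prefix so the keys match the names used on this page. The function name `parse_valkey_metrics` and the sample payload are hypothetical and are not taken from Stackhero. The parser assumes simple `name value` lines without timestamps.

```python
from typing import Dict

# Hypothetical sample payload, for illustration only (not real Stackhero output).
SAMPLE = """\
# HELP valkey_connected_clients (hypothetical help text)
# TYPE valkey_connected_clients gauge
valkey_connected_clients 12
valkey_used_memory 1048576
process_cpu_seconds_total 3.5
"""


def parse_valkey_metrics(exposition_text: str, prefix: str = "valkey_") -> Dict[str, float]:
    """Return {metric name without prefix: value} for metrics starting with `prefix`.

    Assumes each sample line is `name value` or `name{labels} value`,
    with no trailing timestamp.
    """
    metrics: Dict[str, float] = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        name_part, _, value_part = line.rpartition(" ")
        # Drop any label set such as {role="master"} from the name.
        name = name_part.split("{", 1)[0]
        if not name.startswith(prefix):
            continue
        try:
            metrics[name[len(prefix):]] = float(value_part)
        except ValueError:
            # Non-numeric values are ignored in this sketch.
            continue
    return metrics


if __name__ == "__main__":
    print(parse_valkey_metrics(SAMPLE))
```

With the prefix stripped, the resulting keys can be looked up directly in the list of metrics on this page.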
Below is a detailed overview of each Stackhero for Valkey metric available.
- shutdown_in_milliseconds: The maximum remaining time in milliseconds for replicas to catch up with replication before the shutdown sequence is completed. This field is only present during the shutdown process.
- connected_clients: The number of client connections (excluding connections from replicas).
- cluster_connections: An approximation of the number of sockets used by the cluster bus.
- maxclients: The value of the maxclients configuration directive. It represents the upper limit for the sum of connected_clients, connected_slaves, and cluster_connections.
- client_recent_max_input_buffer: The largest input buffer size among the currently connected clients.
- client_recent_max_output_buffer: The largest output buffer size among the currently connected clients.
- blocked_clients: The number of clients waiting on a blocking call such as BLPOP, BRPOP, BRPOPLPUSH, BLMOVE, BZPOPMIN, or BZPOPMAX.
- tracking_clients: The number of clients that are currently tracked (CLIENT TRACKING).
- clients_in_timeout_table: The number of clients in the timeout table.
- used_memory: The total amount of memory (in bytes) allocated by Valkey using its chosen allocator (be it standard libc, jemalloc, or an alternative such as tcmalloc).
- used_memory_rss: The number of bytes allocated by Valkey as seen by the operating system (also known as the resident set size).
- used_memory_peak: The peak memory consumed by Valkey.
- used_memory_peak_perc: The percentage of used_memory relative to used_memory_peak.
- used_memory_overhead: The total overhead in bytes allocated by the server for managing its internal data structures.
- used_memory_startup: The initial amount of memory (in bytes) consumed by Valkey at startup.
- used_memory_dataset: The size in bytes of the dataset (calculated by subtracting used_memory_overhead from used_memory).
- used_memory_dataset_perc: The percentage of used_memory_dataset relative to the net memory usage (used_memory minus used_memory_startup).
- total_system_memory: The total amount of memory available on the Valkey host.
- used_memory_lua: The number of bytes used by the Lua engine.
- used_memory_scripts: The number of bytes occupied by cached Lua scripts.
- maxmemory: The value of the maxmemory configuration directive.
- maxmemory_policy: The value of the maxmemory-policy configuration directive.
- mem_fragmentation_ratio: The ratio between used_memory_rss and used_memory. Note that this ratio includes not just fragmentation but also other process overheads (see the allocator_* metrics) along with overheads for code, shared libraries, the stack, etc.
- mem_fragmentation_bytes: The difference in bytes between used_memory_rss and used_memory. When this value is low (only a few megabytes), a high ratio (for example, 1.5 or above) does not necessarily indicate a problem.
- allocator_frag_ratio: The ratio between allocator_active and allocator_allocated. This is a measure of true (external) fragmentation (unlike mem_fragmentation_ratio).
- allocator_frag_bytes: The difference in bytes between allocator_active and allocator_allocated. Refer to the note for mem_fragmentation_bytes.
- allocator_rss_ratio: The ratio between allocator_resident and allocator_active. This metric often indicates pages that the allocator can soon release back to the OS.
- allocator_rss_bytes: The difference in bytes between allocator_resident and allocator_active.
- rss_overhead_ratio: The ratio between used_memory_rss (the process RSS) and allocator_resident. This includes RSS overheads that are not related to the allocator or heap.
- rss_overhead_bytes: The difference in bytes between used_memory_rss (the process RSS) and allocator_resident.
- allocator_allocated: The total bytes allocated by the allocator, including internal fragmentation. This value is normally the same as used_memory.
- allocator_active: The total bytes in the allocator's active pages, including external fragmentation.
- allocator_resident: The total resident bytes (RSS) in the allocator, including pages that can be released back to the OS (by MEMORY PURGE or inactivity).
- mem_not_counted_for_evict: The used memory not counted for key eviction. This predominantly includes transient replica and AOF buffers.
- mem_clients_slaves: The memory used by replica clients. Since replica buffers share memory with the replication backlog, this field might show 0 when replicas do not trigger an increase in memory usage.
- mem_clients_normal: The memory used by normal clients.
- mem_cluster_links: The memory used by connections to peers on the cluster bus when cluster mode is active.
- mem_aof_buffer: The transient memory used for AOF and AOF rewrite buffers.
- mem_replication_backlog: The memory used by the replication backlog.
- mem_total_replication_buffers: The total memory consumed for replication buffers.
- mem_allocator: The memory allocator selected at compile time.
- active_defrag_running: When active defragmentation is enabled, this metric indicates whether defragmentation is currently active and the CPU percentage it intends to use.
- lazyfree_pending_objects: The number of objects waiting to be freed lazily (due to operations such as UNLINK or asynchronous FLUSHDB/FLUSHALL).
- lazyfreed_objects: The number of objects that have been freed lazily.
- loading: A flag indicating if a dump file is currently being loaded.
- async_loading: Indicates if the replication dataset is being loaded asynchronously while serving old data. This occurs when repl-diskless-load is enabled and set to swapdb.
- current_cow_peak: The peak size in bytes of copy-on-write memory during a child fork operation.
- current_cow_size: The size in bytes of copy-on-write memory during a child fork operation.
- current_cow_size_age: The age in seconds of the current_cow_size value.
- current_fork_perc: The percentage progress of the current fork process. For AOF and RDB forks, it represents the percentage of current_save_keys_processed out of current_save_keys_total.
- current_save_keys_processed: The number of keys processed in the current save operation.
- current_save_keys_total: The total number of keys at the start of the current save operation.
- rdb_bgsave_in_progress: A flag indicating that an RDB save is in progress.
- rdb_last_save_time: The epoch timestamp of the last successful RDB save.
- rdb_last_bgsave_status: The status of the last RDB save operation.
- rdb_last_bgsave_time_sec: The duration in seconds of the last RDB save operation.
- rdb_current_bgsave_time_sec: The duration in seconds of an ongoing RDB save operation, if any.
- rdb_last_cow_size: The size in bytes of copy-on-write memory during the last RDB save operation.
- rdb_last_load_keys_expired: The number of volatile keys deleted during the last RDB load.
- rdb_last_load_keys_loaded: The number of keys loaded during the last RDB load.
- aof_enabled: A flag indicating that AOF logging is activated.
- aof_rewrite_in_progress: A flag showing that an AOF rewrite operation is in progress.
- aof_rewrite_scheduled: A flag indicating that an AOF rewrite operation will be scheduled once an ongoing RDB save is complete.
- aof_last_rewrite_time_sec: The duration, in seconds, of the last AOF rewrite operation.
- aof_current_rewrite_time_sec: The duration, in seconds, of an ongoing AOF rewrite operation, if any.
- aof_last_bgrewrite_status: The status of the last AOF rewrite operation.
- aof_last_write_status: The status of the last write to the AOF.
- aof_last_cow_size: The size in bytes of copy-on-write memory during the last AOF rewrite operation.
- module_fork_in_progress: A flag indicating that a module fork is in progress.
- module_fork_last_cow_size: The size in bytes of copy-on-write memory during the last module fork operation.
- aof_current_size: The current size of the AOF file.
- aof_base_size: The AOF file size at the time of the last startup or rewrite.
- aof_pending_rewrite: A flag indicating that an AOF rewrite operation will be scheduled once the current RDB save completes.
- aof_buffer_length: The size of the AOF buffer.
- aof_pending_bio_fsync: The number of fsync jobs pending in the background I/O queue.
- aof_delayed_fsync: The counter for delayed fsync operations.
- loading_start_time: The epoch timestamp marking the start of the load operation.
- loading_total_bytes: The total size of the file being loaded.
- loading_rdb_used_mem: The memory usage of the server that generated the RDB file at the time of its creation.
- loading_loaded_bytes: The number of bytes that have already been loaded.
- loading_loaded_perc: The percentage of the file that has been loaded.
- loading_eta_seconds: The estimated time in seconds remaining for the load to complete.
- instantaneous_ops_per_sec: The number of commands processed per second.
- instantaneous_input_kbps: The network read rate in KB/sec.
- instantaneous_output_kbps: The network write rate in KB/sec.
- instantaneous_input_repl_kbps: The network read rate in KB/sec for replication purposes.
- instantaneous_output_repl_kbps: The network write rate in KB/sec for replication purposes.
- sync_full: The number of full resynchronisations with replicas.
- sync_partial_ok: The number of accepted partial resynchronisation requests.
- sync_partial_err: The number of denied partial resynchronisation requests.
- expired_stale_perc: The percentage of keys that have probably expired.
- expired_time_cap_reached_count: The number of times active expiry cycles have stopped early.
- expire_cycle_cpu_milliseconds: The cumulative time in milliseconds spent on active expiry cycles.
- evicted_clients: The number of clients evicted due to the maxmemory-clients limit.
- pubsub_channels: The total number of pub/sub channels with active client subscriptions.
- pubsub_patterns: The total number of pub/sub patterns with active client subscriptions.
- pubsubshard_channels: The total number of pub/sub shard channels with active client subscriptions.
- latest_fork_usec: The duration in microseconds of the most recent fork operation.
- migrate_cached_sockets: The number of sockets open for MIGRATE purposes.
- slave_expires_tracked_keys: The number of keys tracked for expiry purposes (applicable only to writable replicas).
- active_defrag_hits: The number of value reallocations successfully performed by the active defragmentation process.
- active_defrag_misses: The number of value reallocations that were aborted by the active defragmentation process.
- active_defrag_key_hits: The number of keys that were actively defragmented.
- active_defrag_key_misses: The number of keys that were skipped during the active defragmentation process.
- tracking_total_keys: The total number of keys being tracked by the server.
- tracking_total_items: The total number of tracked items (this is the sum of the number of clients per key).
- tracking_total_prefixes: The number of tracked prefixes in the server's prefix table (only applicable in broadcast mode).
- role: Returns "master" if the instance is not a replica, or "slave" if it is replicating from a master. Note that a replica may act as a master for another replica (chained replication).
- master_failover_state: The current state of an ongoing failover, if one exists.
- master_replid: The replication ID of the Valkey server.
- master_replid2: The secondary replication ID used for PSYNC after a failover.
- master_repl_offset: The current replication offset of the server.
- second_repl_offset: The offset up to which replication IDs are accepted.
- repl_backlog_active: A flag indicating if the replication backlog is active.
- repl_backlog_size: The total size in bytes of the replication backlog buffer.
- repl_backlog_first_byte_offset: The master offset corresponding to the first byte in the replication backlog buffer.
- repl_backlog_histlen: The size in bytes of data contained in the replication backlog buffer.
- master_host: The host or IP address of the master instance.
- master_port: The TCP port on which the master is listening.
- master_link_status: The status of the link (up or down).
- master_sync_in_progress: Indicates whether the master is currently syncing with a replica.
- slave_read_repl_offset: The replication offset up to which data has been read by the replica.
- slave_repl_offset: The current replication offset of the replica instance.
- slave_priority: The candidate priority of the instance for failover.
- slave_read_only: A flag indicating whether the replica is in read-only mode.
- replica_announced: A flag indicating if the replica has been announced by Sentinel.
- master_sync_total_bytes: The total number of bytes that need to be transferred during synchronisation. This value might be 0 when the size is unknown (for example, when using the repl-diskless-sync configuration directive).
- master_sync_read_bytes: The number of bytes that have already been transferred.
- master_sync_left_bytes: The number of bytes remaining to be transferred before synchronisation is complete (this value may be negative when master_sync_total_bytes is 0).
- master_sync_perc: The percentage of bytes transferred (master_sync_read_bytes) from the total (master_sync_total_bytes), or an approximation that uses loading_rdb_used_mem when master_sync_total_bytes is 0.
- connected_slaves: The number of connected replicas.
- min_slaves_good_slaves: The number of replicas currently considered good for the purpose of replication.
- current_eviction_exceeded_time: The time (in milliseconds) since used_memory last exceeded maxmemory.
- current_active_defrag_time: The time (in milliseconds) since memory fragmentation last exceeded its limit.
- master_last_io_seconds_ago: The number of seconds since the last interaction with the master.
- master_sync_last_io_seconds_ago: The number of seconds since the last transfer I/O during a SYNC operation.
- master_link_down_since_seconds: The number of seconds since the master link went down.
- total_eviction_exceeded_time: The total time (in milliseconds) that used_memory has been greater than maxmemory since server startup.
- rdb_changes_since_last_save: The number of changes recorded since the last dump.
- total_connections_received: The total number of connections accepted since the server started.
- total_commands_processed: The total number of commands processed by the server.
- total_net_input_bytes: The total number of bytes read from the network.
- total_net_output_bytes: The total number of bytes written to the network.
- total_net_repl_input_bytes: The total number of bytes read from the network for replication purposes.
- total_net_repl_output_bytes: The total number of bytes written to the network for replication purposes.
- rejected_connections: The number of connections rejected because the maxclients limit was reached.
- expired_keys: The total number of key expiration events.
- evicted_keys: The number of keys evicted due to the maxmemory limit.
- keyspace_hits: The number of successful lookups of keys in the main dictionary.
- keyspace_misses: The number of failed lookups of keys in the main dictionary.
- used_cpu_sys: The system CPU time (in seconds) consumed by Valkey, summing the usage of all threads (main and background).
- used_cpu_user: The user CPU time (in seconds) consumed by Valkey, summing the usage of all threads.
- used_cpu_sys_children: The system CPU time (in seconds) consumed by background processes.
- used_cpu_user_children: The user CPU time (in seconds) consumed by background processes.
- used_cpu_sys_main_thread: The system CPU time consumed by the main thread of the Valkey server.
- used_cpu_user_main_thread: The user CPU time consumed by the main thread of the Valkey server.
- unexpected_error_replies: The number of unexpected error replies, typically arising during AOF loads or replication errors.
- total_error_replies: The total number of error replies issued. This value includes both errors before command execution (rejected commands) and errors occurring during command execution (failed commands).
- total_reads_processed: The total number of read events processed.
- total_writes_processed: The total number of write events processed.
- io_threaded_reads_processed: The number of read events handled by both the main and I/O threads.
- io_threaded_writes_processed: The number of write events handled by both the main and I/O threads.
- dump_payload_sanitizations: The total number of deep integrity validations performed on dump payloads (as configured in sanitize-dump-payload).
- total_forks: The total number of fork operations since the server started.
- total_active_defrag_time: The total time (in milliseconds) that memory fragmentation has exceeded the set limit.
- aof_rewrites: The number of AOF rewrite operations performed since startup.
- rdb_saves: The number of RDB snapshots performed since startup.
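
Several of the memory metrics above are defined in terms of other metrics. The sketch below recomputes a few of them from raw values, following the definitions given in the list, and applies the note on mem_fragmentation_bytes (a high ratio with only a few megabytes of difference is not necessarily a problem). The function names, the sample values, and the 10 MiB threshold are assumptions chosen for illustration. The values reported by Valkey itself remain the reference.

```python
from typing import Dict


def derive_memory_metrics(m: Dict[str, float]) -> Dict[str, float]:
    """Recompute derived memory values from raw metrics (keys without the valkey_ prefix)."""
    used = m["used_memory"]
    rss = m["used_memory_rss"]
    dataset = used - m["used_memory_overhead"]   # used_memory_dataset
    net = used - m["used_memory_startup"]        # net memory usage
    return {
        "used_memory_dataset": dataset,
        "used_memory_dataset_perc": 100.0 * dataset / net if net > 0 else 0.0,
        "mem_fragmentation_ratio": rss / used if used > 0 else 0.0,
        "mem_fragmentation_bytes": rss - used,
    }


def fragmentation_looks_significant(
    ratio: float,
    frag_bytes: float,
    ratio_threshold: float = 1.5,                  # example ratio from the list above
    bytes_threshold: float = 10 * 1024 * 1024,     # assumed cut-off for "a few megabytes"
) -> bool:
    """Flag fragmentation only when both the ratio and the byte difference are high."""
    return ratio >= ratio_threshold and frag_bytes >= bytes_threshold


if __name__ == "__main__":
    # Hypothetical sample values, in bytes.
    sample = {
        "used_memory": 1_000_000,
        "used_memory_rss": 1_500_000,
        "used_memory_overhead": 200_000,
        "used_memory_startup": 100_000,
    }
    derived = derive_memory_metrics(sample)
    print(derived)
    print(fragmentation_looks_significant(
        derived["mem_fragmentation_ratio"], derived["mem_fragmentation_bytes"]
    ))
```

Checking both the ratio and the byte difference avoids raising an alert on small instances, where a ratio of 1.5 can correspond to a negligible amount of memory.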