For Zabbix version: 6.2 and higher
This template monitors the TiKV server of a TiDB cluster via Zabbix, and works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template TiDB TiKV by HTTP
— collects metrics by HTTP agent from TiKV /metrics endpoint.
This template was tested on:
See Zabbix template operation for basic instructions.
This template works with TiKV server of TiDB cluster. Internal service metrics are collected from TiKV /metrics endpoint. Don't forget to change the macros {$TIKV.URL}, {$TIKV.PORT}. Also, see the Macros section for a list of macros used to set trigger values.
No specific Zabbix configuration is required.
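The raw HTTP agent item in this template fetches the Prometheus exposition text from the TiKV /metrics endpoint, converts it with the PROMETHEUS_TO_JSON preprocessing step, and lets dependent items filter the resulting JSON with JSONPath. The sketch below is a simplified Python stand-in for that pipeline, not Zabbix's actual implementation; the metric name and label filter mirror the "Bytes read" item's JSONPath:

```python
import re

# One Prometheus sample line: name{label="v",...} value
SAMPLE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][\w:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$')

def prometheus_to_json(text):
    """Rough equivalent of Zabbix's PROMETHEUS_TO_JSON preprocessing step
    (simplified: no escaped quotes or commas inside label values)."""
    out = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):  # skip HELP/TYPE/comment lines
            continue
        m = SAMPLE_RE.match(line)
        if not m:
            continue
        labels = {}
        if m.group('labels'):
            for part in m.group('labels').split(','):
                k, _, v = part.partition('=')
                labels[k.strip()] = v.strip().strip('"')
        out.append({'name': m.group('name'), 'labels': labels,
                    'value': float(m.group('value'))})
    return out

def engine_bytes_read(metrics):
    """Emulates the "Bytes read" JSONPath: sum tikv_engine_flow_bytes samples
    where db == "kv" and type is bytes_read or iter_bytes_read."""
    return sum(m['value'] for m in metrics
               if m['name'] == 'tikv_engine_flow_bytes'
               and m['labels'].get('db') == 'kv'
               and m['labels'].get('type') in ('bytes_read', 'iter_bytes_read'))
```

In the real template this filtering happens inside Zabbix item preprocessing; the sketch only illustrates what the dependent items extract from the bulk metrics item.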
Name | Description | Default |
---|---|---|
{$TIKV.COPOCESSOR.ERRORS.MAX.WARN} | Maximum number of coprocessor request errors | 1 |
{$TIKV.PENDING_COMMANDS.MAX.WARN} | Maximum number of pending commands | 1 |
{$TIKV.PENDING_TASKS.MAX.WARN} | Maximum number of tasks currently running by the worker or pending | 1 |
{$TIKV.PORT} | The port of the TiKV server metrics web endpoint | 20180 |
{$TIKV.STORE.ERRORS.MAX.WARN} | Maximum number of failure messages | 1 |
{$TIKV.URL} | TiKV server URL | localhost |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Coprocessor metrics discovery | Discovery of coprocessor metrics. | DEPENDENT | tikv.coprocessor.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
QPS metrics discovery | Discovery of QPS metrics. | DEPENDENT | tikv.qps.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Scheduler metrics discovery | Discovery of scheduler metrics. | DEPENDENT | tikv.scheduler.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Server errors discovery | Discovery of server error metrics. | DEPENDENT | tikv.server_report_failure.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Overrides: Too many unreachable messages trigger |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
TiKV node | TiKV: Store size | The storage size of TiKV instance. | DEPENDENT | tikv.engine_size Preprocessing: - JSONPATH: |
TiKV node | TiKV: Available size | The available capacity of TiKV instance. | DEPENDENT | tikv.store_size.available Preprocessing: - JSONPATH: |
TiKV node | TiKV: Capacity size | The capacity size of TiKV instance. | DEPENDENT | tikv.store_size.capacity Preprocessing: - JSONPATH: |
TiKV node | TiKV: Bytes read | The total bytes of read in TiKV instance. | DEPENDENT | tikv.engine_flow_bytes.read Preprocessing: - JSONPATH: `$[?(@.name == "tikv_engine_flow_bytes" && @.labels.db == "kv" && @.labels.type =~ "bytes_read\|iter_bytes_read")].value.sum()` |
TiKV node | TiKV: Bytes write | The total bytes of write in TiKV instance. | DEPENDENT | tikv.engine_flow_bytes.write Preprocessing: - JSONPATH: |
TiKV node | TiKV: Storage: commands total, rate | Total number of commands received per second. | DEPENDENT | tikv.storage_command.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiKV node | TiKV: CPU util | The CPU usage ratio on TiKV instance. | DEPENDENT | tikv.cpu.util Preprocessing: - JSONPATH: - CHANGE_PER_SECOND - MULTIPLIER: |
TiKV node | TiKV: RSS memory usage | Resident memory size in bytes. | DEPENDENT | tikv.rss_bytes Preprocessing: - JSONPATH: |
TiKV node | TiKV: Regions, count | The number of regions collected in TiKV instance. | DEPENDENT | tikv.region_count Preprocessing: - JSONPATH: |
TiKV node | TiKV: Regions, leader | The number of leaders in TiKV instance. | DEPENDENT | tikv.region_leader Preprocessing: - JSONPATH: |
TiKV node | TiKV: Total query, rate | The total QPS in TiKV instance. | DEPENDENT | tikv.grpc_msg.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiKV node | TiKV: Total query errors, rate | The total number of gRPC message handling failures per second. | DEPENDENT | tikv.grpc_msg_fail.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
TiKV node | TiKV: Coprocessor: Errors, rate | Total number of push down request errors per second. | DEPENDENT | tikv.coprocessor_request_error.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
TiKV node | TiKV: Coprocessor: Requests, rate | Total number of coprocessor requests per second. | DEPENDENT | tikv.coprocessor_request.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiKV node | TiKV: Coprocessor: Scan keys, rate | Total number of scan keys observed per request per second. | DEPENDENT | tikv.coprocessor_scan_keys_sum.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiKV node | TiKV: Coprocessor: RocksDB ops, rate | Total number of RocksDB internal operations from PerfContext per second. | DEPENDENT | tikv.coprocessor_rocksdb_perf.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiKV node | TiKV: Coprocessor: Response size, rate | The total size of coprocessor response per second. | DEPENDENT | tikv.coprocessor_response_bytes.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiKV node | TiKV: Scheduler: Pending commands | The total number of pending commands. The scheduler receives commands from clients and executes them against the MVCC layer storage engine. | DEPENDENT | tikv.scheduler_contex Preprocessing: - JSONPATH: |
TiKV node | TiKV: Scheduler: Busy, rate | The total count of too busy schedulers per second. | DEPENDENT | tikv.scheduler_too_busy.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
TiKV node | TiKV: Scheduler: Commands total, rate | Total number of commands per second. | DEPENDENT | tikv.scheduler_commands.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
TiKV node | TiKV: Scheduler: Low priority commands total, rate | Total count of low priority commands per second. | DEPENDENT | tikv.commands_pri.low.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiKV node | TiKV: Scheduler: Normal priority commands total, rate | Total count of normal priority commands per second. | DEPENDENT | tikv.commands_pri.normal.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiKV node | TiKV: Scheduler: High priority commands total, rate | Total count of high priority commands per second. | DEPENDENT | tikv.commands_pri.high.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiKV node | TiKV: Snapshot: Pending tasks | The number of tasks currently running by the worker or pending. | DEPENDENT | tikv.worker_pending_task Preprocessing: - JSONPATH: |
TiKV node | TiKV: Snapshot: Sending | The total amount of raftstore snapshot traffic. | DEPENDENT | tikv.snapshot.sending Preprocessing: - JSONPATH: |
TiKV node | TiKV: Snapshot: Receiving | The total amount of raftstore snapshot traffic. | DEPENDENT | tikv.snapshot.receiving Preprocessing: - JSONPATH: |
TiKV node | TiKV: Snapshot: Applying | The total amount of raftstore snapshot traffic. | DEPENDENT | tikv.snapshot.applying Preprocessing: - JSONPATH: |
TiKV node | TiKV: Uptime | The runtime of each TiKV instance. | DEPENDENT | tikv.uptime Preprocessing: - JSONPATH: - JAVASCRIPT: |
TiKV node | TiKV: Server: failure messages total, rate | Total number of reporting failure messages per second. | DEPENDENT | tikv.messages.failure.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
TiKV node | TiKV: Query: {#TYPE}, rate | The QPS per command in TiKV instance. | DEPENDENT | tikv.grpc_msg.rate[{#TYPE}] Preprocessing: - JSONPATH: ⛔️ ON_FAIL: CUSTOM_VALUE -> |
TiKV node | TiKV: Coprocessor: {#REQ_TYPE} errors, rate | Total number of push down request errors per second. | DEPENDENT | tikv.coprocessor_request_error.rate[{#REQ_TYPE}] Preprocessing: - JSONPATH: ⛔️ ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
TiKV node | TiKV: Coprocessor: {#REQ_TYPE} requests, rate | Total number of coprocessor requests per second. | DEPENDENT | tikv.coprocessor_request.rate[{#REQ_TYPE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiKV node | TiKV: Coprocessor: {#REQ_TYPE} scan keys, rate | Total number of scan keys observed per request per second. | DEPENDENT | tikv.coprocessor_scan_keys.rate[{#REQ_TYPE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiKV node | TiKV: Coprocessor: {#REQ_TYPE} RocksDB ops, rate | Total number of RocksDB internal operations from PerfContext per second. | DEPENDENT | tikv.coprocessor_rocksdb_perf.rate[{#REQ_TYPE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiKV node | TiKV: Scheduler: commands {#STAGE}, rate | Total number of commands on each stage per second. | DEPENDENT | tikv.scheduler_stage.rate[{#STAGE}] Preprocessing: - JSONPATH: ⛔️ ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
TiKV node | TiKV: Store_id {#STORE_ID}: failure messages "{#TYPE}", rate | Total number of reporting failure messages. The metric has two labels: type and store_id. type represents the failure type, and store_id represents the destination peer store id. | DEPENDENT | tikv.messages.failure.rate[{#STORE_ID},{#TYPE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Zabbix raw items | TiKV: Get instance metrics | Get TiKV instance metrics. | HTTP_AGENT | tikv.get_metrics Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ ON_FAIL: DISCARD_VALUE -> - PROMETHEUS_TO_JSON |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
TiKV: Too many coprocessor request errors | - | min(/TiDB TiKV by HTTP/tikv.coprocessor_request_error.rate,5m)>{$TIKV.COPOCESSOR.ERRORS.MAX.WARN} | WARNING | |
TiKV: Too many pending commands | - | min(/TiDB TiKV by HTTP/tikv.scheduler_contex,5m)>{$TIKV.PENDING_COMMANDS.MAX.WARN} | AVERAGE | |
TiKV: Too many pending tasks | - | min(/TiDB TiKV by HTTP/tikv.worker_pending_task,5m)>{$TIKV.PENDING_TASKS.MAX.WARN} | AVERAGE | |
TiKV: has been restarted | Uptime is less than 10 minutes. | last(/TiDB TiKV by HTTP/tikv.uptime)<10m | INFO | Manual close: YES |
TiKV: Store_id {#STORE_ID}: Too many failure messages "{#TYPE}" | Indicates that the remote TiKV cannot be connected. | min(/TiDB TiKV by HTTP/tikv.messages.failure.rate[{#STORE_ID},{#TYPE}],5m)>{$TIKV.STORE.ERRORS.MAX.WARN} | WARNING | |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
This template monitors the TiDB server of a TiDB cluster via Zabbix, and works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template TiDB by HTTP
— collects metrics by HTTP agent from TiDB /metrics endpoint and from monitoring API.
See https://docs.pingcap.com/tidb/stable/tidb-monitoring-api.
This template was tested on:
See Zabbix template operation for basic instructions.
This template works with TiDB server of TiDB cluster. Internal service metrics are collected from TiDB /metrics endpoint and from monitoring API. See https://docs.pingcap.com/tidb/stable/tidb-monitoring-api. Don't forget to change the macros {$TIDB.URL}, {$TIDB.PORT}. Also, see the Macros section for a list of macros used to set trigger values.
No specific Zabbix configuration is required.
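Many triggers in this template compare a 5-minute aggregate against a macro, e.g. `min(/TiDB by HTTP/tidb.ddl_waiting_jobs,5m)>{$TIDB.DDL.WAITING.MAX.WARN}`. Using `min()` means the trigger fires only if every sample in the window exceeded the threshold, so brief spikes are ignored. A minimal Python sketch of that semantics (illustrative only; Zabbix evaluates trigger expressions server-side):

```python
def min_over_window(samples, threshold):
    """Mimic the Zabbix trigger condition min(/host/key,5m) > threshold:
    fire only if the *smallest* value in the window is above the threshold,
    i.e. every sample in the window breached it."""
    return bool(samples) and min(samples) > threshold

# A single spike does not fire the trigger; a sustained breach does.
spike = [0, 0, 120]       # one bad sample in the window
sustained = [60, 70, 80]  # every sample above the threshold
```

The inverse pattern, `max(...,5m)<threshold`, is used by the "Too few keep alive operations" trigger: it fires only when no sample in the window reached the minimum expected value.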
Name | Description | Default |
---|---|---|
{$TIDB.DDL.WAITING.MAX.WARN} | Maximum number of DDL tasks that are waiting | 5 |
{$TIDB.GC_ACTIONS.ERRORS.MAX.WARN} | Maximum number of GC-related operation failures | 1 |
{$TIDB.HEAP.USAGE.MAX.WARN} | Maximum heap memory used | 10G |
{$TIDB.MONITOR_KEEP_ALIVE.MAX.WARN} | Minimum number of keep alive operations | 10 |
{$TIDB.OPEN.FDS.MAX.WARN} | Maximum percentage of used file descriptors | 90 |
{$TIDB.PORT} | The port of the TiDB server metrics web endpoint | 10080 |
{$TIDB.REGION_ERROR.MAX.WARN} | Maximum number of region related errors | 50 |
{$TIDB.SCHEMA_LEASE_ERRORS.MAX.WARN} | Maximum number of schema lease errors | 0 |
{$TIDB.SCHEMA_LOAD_ERRORS.MAX.WARN} | Maximum number of load schema errors | 1 |
{$TIDB.TIME_JUMP_BACK.MAX.WARN} | Maximum number of times that the operating system rewinds every second | 1 |
{$TIDB.URL} | TiDB server URL | localhost |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
GC action results discovery | Discovery of GC action result metrics. | DEPENDENT | tidb.tikvclient_gc_action.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Overrides: Failed GC-related operations trigger |
KV backoff discovery | Discovery of KV backoff specific metrics. | DEPENDENT | tidb.tikvclient_backoff.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
KV metrics discovery | Discovery of KV specific metrics. | DEPENDENT | tidb.kv_ops.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Lock resolves discovery | Discovery of lock resolve specific metrics. | DEPENDENT | tidb.tikvclient_lock_resolver_action.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
QPS metrics discovery | Discovery of QPS specific metrics. | DEPENDENT | tidb.qps.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Statement metrics discovery | Discovery of statement specific metrics. | DEPENDENT | tidb.statement.discover Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
TiDB node | TiDB: Status | Status of TiDB instance. | DEPENDENT | tidb.status Preprocessing: - JSONPATH: ⛔️ ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
TiDB node | TiDB: Total "error" server query, rate | The number of queries per second on TiDB instance whose command execution failed. | DEPENDENT | tidb.server_query.error.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: Total "ok" server query, rate | The number of queries per second on TiDB instance whose command execution succeeded. | DEPENDENT | tidb.server_query.ok.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: Total server query, rate | The number of queries per second on TiDB instance. | DEPENDENT | tidb.server_query.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: SQL statements, rate | The total number of SQL statements executed per second. | DEPENDENT | tidb.statement_total.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: Failed Query, rate | The number of errors that occur per second when executing SQL statements (such as syntax errors and primary key conflicts). | DEPENDENT | tidb.execute_error.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
TiDB node | TiDB: KV commands, rate | The number of executed KV commands per second. | DEPENDENT | tidb.tikvclient_txn.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: PD TSO commands, rate | The number of TSO commands that TiDB obtains from PD per second. | DEPENDENT | tidb.pd_tso_cmd.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: PD TSO requests, rate | The number of TSO requests that TiDB obtains from PD per second. | DEPENDENT | tidb.pd_tso_request.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: TiClient region errors, rate | The number of region related errors returned by TiKV per second. | DEPENDENT | tidb.tikvclient_region_err.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: Lock resolves, rate | The number of TiDB operations that resolve locks per second. When TiDB's read or write request encounters a lock, it tries to resolve the lock. | DEPENDENT | tidb.tikvclient_lock_resolver_action.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: DDL waiting jobs | The number of DDL tasks that are waiting. | DEPENDENT | tidb.ddl_waiting_jobs Preprocessing: - JSONPATH: |
TiDB node | TiDB: Load schema total, rate | The statistics of the schemas that TiDB obtains from TiKV per second. | DEPENDENT | tidb.domain_load_schema.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: Load schema failed, rate | The total number of failures to reload the latest schema information in TiDB per second. | DEPENDENT | tidb.domain_load_schema.failed.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
TiDB node | TiDB: Schema lease "outdate" errors, rate | The number of schema lease errors per second. "outdate" errors mean that the schema cannot be updated, which is a more serious error and triggers an alert. | DEPENDENT | tidb.session_schema_lease_error.outdate.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
TiDB node | TiDB: Schema lease "change" errors, rate | The number of schema lease errors per second. "change" means that the schema has changed. | DEPENDENT | tidb.session_schema_lease_error.change.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
TiDB node | TiDB: KV backoff, rate | The number of errors returned by TiKV per second. | DEPENDENT | tidb.tikvclient_backoff.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
TiDB node | TiDB: Keep alive, rate | The number of times that the metrics are refreshed on TiDB instance per minute. | DEPENDENT | tidb.monitor_keep_alive.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: - SIMPLE_CHANGE |
TiDB node | TiDB: Server connections | The connection number of current TiDB instance. | DEPENDENT | tidb.tidb_server_connections Preprocessing: - JSONPATH: |
TiDB node | TiDB: Heap memory usage | Number of heap bytes that are in use. | DEPENDENT | tidb.heap_bytes Preprocessing: - JSONPATH: |
TiDB node | TiDB: RSS memory usage | Resident memory size in bytes. | DEPENDENT | tidb.rss_bytes Preprocessing: - JSONPATH: |
TiDB node | TiDB: Goroutine count | The number of Goroutines on TiDB instance. | DEPENDENT | tidb.goroutines Preprocessing: - JSONPATH: |
TiDB node | TiDB: Open file descriptors | Number of open file descriptors. | DEPENDENT | tidb.process_open_fds Preprocessing: - JSONPATH: |
TiDB node | TiDB: Open file descriptors, max | Maximum number of open file descriptors. | DEPENDENT | tidb.process_max_fds Preprocessing: - JSONPATH: |
TiDB node | TiDB: CPU | Total user and system CPU usage ratio. | DEPENDENT | tidb.cpu.util Preprocessing: - JSONPATH: - CHANGE_PER_SECOND - MULTIPLIER: |
TiDB node | TiDB: Uptime | The runtime of each TiDB instance. | DEPENDENT | tidb.uptime Preprocessing: - JSONPATH: - JAVASCRIPT: |
TiDB node | TiDB: Version | Version of the TiDB instance. | DEPENDENT | tidb.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
TiDB node | TiDB: Time jump back, rate | The number of times that the operating system rewinds every second. | DEPENDENT | tidb.monitor_time_jump_back.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: Server critical error, rate | The number of critical errors that occurred in TiDB per second. | DEPENDENT | tidb.tidb_server_critical_error_total.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: Server panic, rate | The number of panics that occurred in TiDB per second. | DEPENDENT | tidb.tidb_server_panic_total.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
TiDB node | TiDB: Server query "OK": {#TYPE}, rate | The number of queries per second on TiDB instance whose command execution succeeded. | DEPENDENT | tidb.server_query.ok.rate[{#TYPE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: Server query "Error": {#TYPE}, rate | The number of queries per second on TiDB instance whose command execution failed. | DEPENDENT | tidb.server_query.error.rate[{#TYPE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: SQL statements: {#TYPE}, rate | The number of SQL statements executed per second. | DEPENDENT | tidb.statement.rate[{#TYPE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: KV Commands: {#TYPE}, rate | The number of executed KV commands per second. | DEPENDENT | tidb.tikvclient_txn.rate[{#TYPE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: Lock resolves: {#TYPE}, rate | The number of TiDB operations that resolve locks per second. When TiDB's read or write request encounters a lock, it tries to resolve the lock. | DEPENDENT | tidb.tikvclient_lock_resolver_action.rate[{#TYPE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: KV backoff: {#TYPE}, rate | The number of errors returned by TiKV per second. | DEPENDENT | tidb.tikvclient_backoff.rate[{#TYPE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB node | TiDB: GC action result: {#TYPE}, rate | The number of results of GC-related operations per second. | DEPENDENT | tidb.tikvclient_gc_action.rate[{#TYPE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Zabbix raw items | TiDB: Get instance metrics | Get TiDB instance metrics. | HTTP_AGENT | tidb.get_metrics Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ ON_FAIL: DISCARD_VALUE -> - PROMETHEUS_TO_JSON |
Zabbix raw items | TiDB: Get instance status | Get TiDB instance status info. | HTTP_AGENT | tidb.get_status Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ ON_FAIL: CUSTOM_VALUE -> {"status": "0"} |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
TiDB: Instance is not responding | - | last(/TiDB by HTTP/tidb.status)=0 | AVERAGE | |
TiDB: Too many region related errors | - | min(/TiDB by HTTP/tidb.tikvclient_region_err.rate,5m)>{$TIDB.REGION_ERROR.MAX.WARN} | AVERAGE | |
TiDB: Too many DDL waiting jobs | - | min(/TiDB by HTTP/tidb.ddl_waiting_jobs,5m)>{$TIDB.DDL.WAITING.MAX.WARN} | WARNING | |
TiDB: Too many schema load errors | - | min(/TiDB by HTTP/tidb.domain_load_schema.failed.rate,5m)>{$TIDB.SCHEMA_LOAD_ERRORS.MAX.WARN} | AVERAGE | |
TiDB: Too many schema lease errors | The latest schema information is not reloaded in TiDB within one lease. | min(/TiDB by HTTP/tidb.session_schema_lease_error.outdate.rate,5m)>{$TIDB.SCHEMA_LEASE_ERRORS.MAX.WARN} | AVERAGE | |
TiDB: Too few keep alive operations | Indicates whether the TiDB process still exists. If the number of times that tidb_monitor_keep_alive_total increases is less than 10 per minute, the TiDB process might already have exited, and an alert is triggered. | max(/TiDB by HTTP/tidb.monitor_keep_alive.rate,5m)<{$TIDB.MONITOR_KEEP_ALIVE.MAX.WARN} | AVERAGE | |
TiDB: Heap memory usage is too high | - | min(/TiDB by HTTP/tidb.heap_bytes,5m)>{$TIDB.HEAP.USAGE.MAX.WARN} | WARNING | |
TiDB: Current number of open files is too high | Heavy file descriptor usage (i.e., near the process's file descriptor limit) indicates a potential file descriptor exhaustion issue. | min(/TiDB by HTTP/tidb.process_open_fds,5m)/last(/TiDB by HTTP/tidb.process_max_fds)*100>{$TIDB.OPEN.FDS.MAX.WARN} | WARNING | |
TiDB: has been restarted | Uptime is less than 10 minutes. | last(/TiDB by HTTP/tidb.uptime)<10m | INFO | Manual close: YES |
TiDB: Version has changed | TiDB version has changed. Ack to close. | last(/TiDB by HTTP/tidb.version,#1)<>last(/TiDB by HTTP/tidb.version,#2) and length(last(/TiDB by HTTP/tidb.version))>0 | INFO | Manual close: YES |
TiDB: Too many time jump backs | - | min(/TiDB by HTTP/tidb.monitor_time_jump_back.rate,5m)>{$TIDB.TIME_JUMP_BACK.MAX.WARN} | WARNING | |
TiDB: There are panicked TiDB threads | When a panic occurs, an alert is triggered. The thread is often recovered; otherwise, TiDB will frequently restart. | last(/TiDB by HTTP/tidb.tidb_server_panic_total.rate)>0 | AVERAGE | |
TiDB: Too many failed GC-related operations | - | min(/TiDB by HTTP/tidb.tikvclient_gc_action.rate[{#TYPE}],5m)>{$TIDB.GC_ACTIONS.ERRORS.MAX.WARN} | WARNING | |
For Zabbix version: 6.2 and higher
This template monitors the PD server of a TiDB cluster via Zabbix, and works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template TiDB PD by HTTP
— collects metrics by HTTP agent from PD /metrics endpoint and from monitoring API.
See https://docs.pingcap.com/tidb/stable/tidb-monitoring-api.
This template was tested on:
See Zabbix template operation for basic instructions.
This template works with PD server of TiDB cluster. Internal service metrics are collected from PD /metrics endpoint and from monitoring API. See https://docs.pingcap.com/tidb/stable/tidb-monitoring-api. Don't forget to change the macros {$PD.URL}, {$PD.PORT}. Also, see the Macros section for a list of macros used to set trigger values.
No specific Zabbix configuration is required.
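The "Current storage usage is too high" trigger in this template divides `storage_size` by `storage_capacity` and compares the resulting percentage with {$PD.STORAGE_USAGE.MAX.WARN} (80 by default). A small Python sketch of that calculation (illustrative only; the values below are made-up sample numbers, not real cluster data):

```python
def storage_usage_pct(storage_size_bytes, storage_capacity_bytes):
    """Percentage of cluster space used, as computed by the
    'TiDB cluster: Current storage usage is too high' trigger expression:
    min(storage_size,5m) / last(storage_capacity) * 100."""
    return storage_size_bytes / storage_capacity_bytes * 100

# Hypothetical example: 850 GiB used out of a 1024 GiB capacity is ~83%,
# which is above the default {$PD.STORAGE_USAGE.MAX.WARN} of 80.
usage = storage_usage_pct(850, 1024)
```

In production the actual expression uses `min()` over 5 minutes of `storage_size` samples, so a momentary dip below the threshold keeps the trigger from firing.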
Name | Description | Default |
---|---|---|
{$PD.MISS_REGION.MAX.WARN} | Maximum number of missed regions | 100 |
{$PD.PORT} | The port of the PD server metrics web endpoint | 2379 |
{$PD.STORAGE_USAGE.MAX.WARN} | Maximum percentage of cluster space used | 80 |
{$PD.URL} | PD server URL | localhost |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Cluster metrics discovery | Discovery of cluster specific metrics. | DEPENDENT | pd.cluster.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
gRPC commands discovery | Discovery of gRPC command specific metrics. | DEPENDENT | pd.grpc_command.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Region discovery | Discovery of region specific metrics. | DEPENDENT | pd.region.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Region labels discovery | Discovery of region label specific metrics. | DEPENDENT | pd.region_labels.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Region status discovery | Discovery of region status specific metrics. | DEPENDENT | pd.region_status.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 1h Overrides: Too many missed regions trigger: miss_peer_region_count - TRIGGER_PROTOTYPE LIKE "Too many missed regions" - DISCOVER; Unresponsive peers trigger: down_peer_region_count - TRIGGER_PROTOTYPE LIKE "There are unresponsive peers" - DISCOVER |
Running scheduler discovery | Discovery of scheduler specific metrics. | DEPENDENT | pd.scheduler.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
PD instance | PD: Status | Status of PD instance. | DEPENDENT | pd.status Preprocessing: - JSONPATH: ⛔️ ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
PD instance | PD: GRPC Commands total, rate | The rate at which gRPC commands are completed. | DEPENDENT | pd.grpc_command.rate Preprocessing: - JSONPATH: ⛔️ ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
PD instance | PD: Version | Version of the PD instance. | DEPENDENT | pd.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
PD instance | PD: Uptime | The runtime of each PD instance. | DEPENDENT | pd.uptime Preprocessing: - JSONPATH: - JAVASCRIPT: |
PD instance | PD: GRPC Commands: {#GRPC_METHOD}, rate | The rate per command type at which gRPC commands are completed. | DEPENDENT | pd.grpc_command.rate[{#GRPC_METHOD}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
TiDB cluster | TiDB cluster: Offline stores | - | DEPENDENT | pd.cluster_status.store_offline[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
TiDB cluster | TiDB cluster: Tombstone stores | The count of tombstone stores. | DEPENDENT | pd.cluster_status.store_tombstone[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
TiDB cluster | TiDB cluster: Down stores | The count of down stores. | DEPENDENT | pd.cluster_status.store_down[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
TiDB cluster | TiDB cluster: Lowspace stores | The count of low space stores. | DEPENDENT | pd.cluster_status.store_low_space[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
TiDB cluster | TiDB cluster: Unhealth stores | The count of unhealthy stores. | DEPENDENT | pd.cluster_status.store_unhealth[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
TiDB cluster | TiDB cluster: Disconnect stores | The count of disconnected stores. | DEPENDENT | pd.cluster_status.store_disconnected[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
TiDB cluster | TiDB cluster: Normal stores | The count of healthy storage instances. | DEPENDENT | pd.cluster_status.store_up[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
TiDB cluster | TiDB cluster: Storage capacity | The total storage capacity for this TiDB cluster. | DEPENDENT | pd.cluster_status.storage_capacity[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
TiDB cluster | TiDB cluster: Storage size | The storage size that is currently used by the TiDB cluster. | DEPENDENT | pd.cluster_status.storage_size[{#SINGLETON}] Preprocessing: - JSONPATH: |
TiDB cluster | TiDB cluster: Number of regions | The total count of cluster Regions. | DEPENDENT | pd.cluster_status.leader_count[{#SINGLETON}] Preprocessing: - JSONPATH: |
TiDB cluster | TiDB cluster: Current peer count | The current count of all cluster peers. | DEPENDENT | pd.cluster_status.region_count[{#SINGLETON}] Preprocessing: - JSONPATH: |
TiDB cluster | TiDB cluster: Regions label: {#TYPE} | The number of Regions in different label levels. | DEPENDENT | pd.region_labels[{#TYPE}] Preprocessing: - JSONPATH: |
TiDB cluster | TiDB cluster: Regions status: {#TYPE} | The health status of Regions, indicated via the count of unusual Regions including pending peers, down peers, extra peers, offline peers, missing peers, learner peers and incorrect namespaces. | DEPENDENT | pd.region_status[{#TYPE}] Preprocessing: - JSONPATH: |
TiDB cluster | TiDB cluster: Scheduler status: {#KIND} | The current running schedulers. | DEPENDENT | pd.scheduler[{#KIND}] Preprocessing: - JSONPATH: ⛔️ ON_FAIL: |
TiDB cluster | PD: Region heartbeat: active, rate | The count of heartbeats with the ok status per second. | DEPENDENT | pd.region_heartbeat.ok.rate[{#STORE_ADDRESS}] Preprocessing: - JSONPATH: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
TiDB cluster | PD: Region heartbeat: error, rate | The count of heartbeats with the error status per second. | DEPENDENT | pd.region_heartbeat.error.rate[{#STORE_ADDRESS}] Preprocessing: - JSONPATH: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
TiDB cluster | PD: Region heartbeat: total, rate | The count of heartbeats reported to PD per instance per second. | DEPENDENT | pd.region_heartbeat.rate[{#STORE_ADDRESS}] Preprocessing: - JSONPATH: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
TiDB cluster | PD: Region schedule push: total, rate | - | DEPENDENT | pd.region_heartbeat.push.err.rate[{#STORE_ADDRESS}] Preprocessing: - JSONPATH: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
Zabbix raw items | PD: Get instance metrics | Get TiDB PD instance metrics. | HTTP_AGENT | pd.get_metrics Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ ON_FAIL: DISCARD_VALUE -> - PROMETHEUS_TO_JSON |
Zabbix raw items | PD: Get instance status | Get TiDB PD instance status info. | HTTP_AGENT | pd.get_status Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ ON_FAIL: CUSTOM_VALUE -> {"status": "0"} |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
PD: Instance is not responding | - |
last(/TiDB PD by HTTP/pd.status)=0 |
AVERAGE | |
PD: Version has changed | PD version has changed. Ack to close. |
last(/TiDB PD by HTTP/pd.version,#1)<>last(/TiDB PD by HTTP/pd.version,#2) and length(last(/TiDB PD by HTTP/pd.version))>0 |
INFO | Manual close: YES |
PD: has been restarted | Uptime is less than 10 minutes. |
last(/TiDB PD by HTTP/pd.uptime)<10m |
INFO | Manual close: YES |
TiDB cluster: There are offline TiKV nodes | PD has not received a TiKV heartbeat for a long time. |
last(/TiDB PD by HTTP/pd.cluster_status.store_down[{#SINGLETON}])>0 |
AVERAGE | |
TiDB cluster: There are low space TiKV nodes | Indicates that there is no sufficient space on the TiKV node. |
last(/TiDB PD by HTTP/pd.cluster_status.store_low_space[{#SINGLETON}])>0 |
AVERAGE | |
TiDB cluster: There are disconnected TiKV nodes | PD does not receive a TiKV heartbeat within 20 seconds. Normally a TiKV heartbeat comes in every 10 seconds. |
last(/TiDB PD by HTTP/pd.cluster_status.store_disconnected[{#SINGLETON}])>0 |
WARNING | |
TiDB cluster: Current storage usage is too high | Over {$PD.STORAGE_USAGE.MAX.WARN}% of the cluster space is occupied. |
min(/TiDB PD by HTTP/pd.cluster_status.storage_size[{#SINGLETON}],5m)/last(/TiDB PD by HTTP/pd.cluster_status.storage_capacity[{#SINGLETON}])*100>{$PD.STORAGE_USAGE.MAX.WARN} |
WARNING | |
TiDB cluster: Too many missed regions | The number of Region replicas is smaller than the value of max-replicas. When a TiKV machine is down and its downtime exceeds max-down-time, it usually leads to missing replicas for some Regions during a period of time. When a TiKV node is made offline, it might result in a small number of Regions with missing replicas. |
min(/TiDB PD by HTTP/pd.region_status[{#TYPE}],5m)>{$PD.MISS_REGION.MAX.WARN} |
WARNING | |
TiDB cluster: There are unresponsive peers | The number of Regions with an unresponsive peer reported by the Raft leader. |
min(/TiDB PD by HTTP/pd.region_status[{#TYPE}],5m)>0 |
WARNING |
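The storage-usage trigger above is a plain ratio check: used size divided by total capacity, compared to {$PD.STORAGE_USAGE.MAX.WARN}. A minimal sketch of the same arithmetic in shell, with made-up byte counts and an assumed threshold of 80:

```shell
# Hypothetical sample values; in the template these come from the
# pd.cluster_status.storage_size / storage_capacity items.
storage_size=1369020825600      # bytes in use (example)
storage_capacity=1610612736000  # total bytes (example)
threshold=80                    # {$PD.STORAGE_USAGE.MAX.WARN}, assumed here

# usage% = size / capacity * 100, as in the trigger expression
usage=$(awk -v s="$storage_size" -v c="$storage_capacity" \
  'BEGIN { printf "%.1f", s / c * 100 }')
echo "storage usage: ${usage}%"

# The trigger fires when usage exceeds the threshold
awk -v u="$usage" -v t="$threshold" 'BEGIN { exit !(u > t) }' \
  && echo "trigger would fire" || echo "within limits"
```

With these sample numbers the usage is 85.0%, so the comparison succeeds, matching the WARNING trigger condition.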
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor Redis server by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template Redis by Zabbix agent 2
— collects metrics by polling zabbix-agent2.
This template was tested on:
See Zabbix template operation for basic instructions.
Setup and configure zabbix-agent2 compiled with the Redis monitoring plugin (ZBXNEXT-5428-4.3).
Test availability: zabbix_get -s redis-master -k redis.ping
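Beyond redis.ping, the other raw keys used by the template can be smoke-tested the same way. A small sketch that prints the zabbix_get invocations (the host name and connection URI are placeholders; adjust to your environment):

```shell
# Placeholder host and connection URI; adjust to your environment.
host="redis-master"
uri="tcp://localhost:6379"

# Print a zabbix_get smoke test for each plugin key used by the template.
for key in redis.ping redis.info redis.config; do
  printf 'zabbix_get -s %s -k %s["%s"]\n' "$host" "$key" "$uri"
done
```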
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$REDIS.CLIENTS.PRC.MAX.WARN} | Maximum percentage of connected clients |
80 |
{$REDIS.CONN.URI} | Connection string in the URI format (password is not used). This parameter overrides the value configured in the "Server" option of the configuration file (if it is set); otherwise, the plugin's default value is used: "tcp://localhost:6379" |
tcp://localhost:6379 |
{$REDIS.LLD.FILTER.DB.MATCHES} | Filter of discoverable databases |
.* |
{$REDIS.LLD.FILTER.DB.NOT_MATCHES} | Filter to exclude discovered databases |
CHANGE_IF_NEEDED |
{$REDIS.LLD.PROCESS_NAME} | Redis server process name for LLD |
redis-server |
{$REDIS.MEM.FRAG_RATIO.MAX.WARN} | Maximum memory fragmentation ratio |
1.5 |
{$REDIS.MEM.PUSED.MAX.WARN} | Maximum percentage of memory used |
90 |
{$REDIS.PROCESS_NAME} | Redis server process name |
redis-server |
{$REDIS.REPL.LAG.MAX.WARN} | Maximum replication lag in seconds |
30s |
{$REDIS.SLOWLOG.COUNT.MAX.WARN} | Maximum number of slowlog entries per second |
1 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
AOF metrics discovery | If AOF is activated, additional metrics will be added |
DEPENDENT | redis.persistence.aof.discovery Preprocessing: - JAVASCRIPT: |
Keyspace discovery | Individual keyspace metrics |
DEPENDENT | redis.keyspace.discovery Preprocessing: - JAVASCRIPT: Filter: AND - {#DB} MATCHES_REGEX - {#DB} NOT_MATCHES_REGEX |
Process metrics discovery | Collect metrics by Zabbix agent if it exists |
ZABBIX_PASSIVE | proc.num["{$REDIS.LLD.PROCESS_NAME}"] Preprocessing: - JAVASCRIPT: |
Replication metrics discovery | If the instance is the master and the slaves are connected, additional metrics are provided |
DEPENDENT | redis.replication.master.discovery Preprocessing: - JAVASCRIPT: |
Slave metrics discovery | If the instance is a replica, additional metrics are provided |
DEPENDENT | redis.replication.slave.discovery Preprocessing: - JAVASCRIPT: |
Version 4+ metrics discovery | Additional metrics for versions 4+ |
DEPENDENT | redis.metrics.v4.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: |
Version 5+ metrics discovery | Additional metrics for versions 5+ |
DEPENDENT | redis.metrics.v5.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Redis | Redis: Ping | ZABBIX_PASSIVE | redis.ping["{$REDIS.CONN.URI}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
|
Redis | Redis: Slowlog entries per second | ZABBIX_PASSIVE | redis.slowlog.count["{$REDIS.CONN.URI}"] Preprocessing: - CHANGE_PER_SECOND |
|
Redis | Redis: CPU sys | System CPU consumed by the Redis server |
DEPENDENT | redis.cpu.sys Preprocessing: - JSONPATH: |
Redis | Redis: CPU sys children | System CPU consumed by the background processes |
DEPENDENT | redis.cpu.sys_children Preprocessing: - JSONPATH: |
Redis | Redis: CPU user | User CPU consumed by the Redis server |
DEPENDENT | redis.cpu.user Preprocessing: - JSONPATH: |
Redis | Redis: CPU user children | User CPU consumed by the background processes |
DEPENDENT | redis.cpu.user_children Preprocessing: - JSONPATH: |
Redis | Redis: Blocked clients | The number of connections waiting on a blocking call |
DEPENDENT | redis.clients.blocked Preprocessing: - JSONPATH: |
Redis | Redis: Max input buffer | The biggest input buffer among current client connections |
DEPENDENT | redis.clients.max_input_buffer Preprocessing: - JAVASCRIPT: |
Redis | Redis: Max output buffer | The biggest output buffer among current client connections |
DEPENDENT | redis.clients.max_output_buffer Preprocessing: - JAVASCRIPT: |
Redis | Redis: Connected clients | The number of connected clients |
DEPENDENT | redis.clients.connected Preprocessing: - JSONPATH: |
Redis | Redis: Cluster enabled | Indicates whether Redis cluster is enabled |
DEPENDENT | redis.cluster.enabled Preprocessing: - JSONPATH: |
Redis | Redis: Memory used | Total number of bytes allocated by Redis using its allocator |
DEPENDENT | redis.memory.used_memory Preprocessing: - JSONPATH: |
Redis | Redis: Memory used Lua | Amount of memory used by the Lua engine |
DEPENDENT | redis.memory.used_memory_lua Preprocessing: - JSONPATH: |
Redis | Redis: Memory used peak | Peak memory consumed by Redis (in bytes) |
DEPENDENT | redis.memory.used_memory_peak Preprocessing: - JSONPATH: |
Redis | Redis: Memory used RSS | Number of bytes that Redis allocated as seen by the operating system |
DEPENDENT | redis.memory.used_memory_rss Preprocessing: - JSONPATH: |
Redis | Redis: Memory fragmentation ratio | This ratio is an indication of memory mapping efficiency: — A value over 1.0 indicates that memory fragmentation is very likely. Consider restarting the Redis server so the operating system can recover fragmented memory, especially with a ratio over 1.5. — A value under 1.0 indicates that Redis likely has insufficient memory available. Consider optimizing memory usage or adding more RAM. Note: If your peak memory usage is much higher than your current memory usage, the memory fragmentation ratio may be unreliable. https://redis.io/topics/memory-optimization |
DEPENDENT | redis.memory.fragmentation_ratio Preprocessing: - JSONPATH: |
Redis | Redis: AOF current rewrite time sec | Duration of the on-going AOF rewrite operation if any |
DEPENDENT | redis.persistence.aof_current_rewrite_time_sec Preprocessing: - JSONPATH: |
Redis | Redis: AOF enabled | Flag indicating AOF logging is activated |
DEPENDENT | redis.persistence.aof_enabled Preprocessing: - JSONPATH: |
Redis | Redis: AOF last bgrewrite status | Status of the last AOF rewrite operation |
DEPENDENT | redis.persistence.aof_last_bgrewrite_status Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
Redis | Redis: AOF last rewrite time sec | Duration of the last AOF rewrite |
DEPENDENT | redis.persistence.aof_last_rewrite_time_sec Preprocessing: - JSONPATH: |
Redis | Redis: AOF last write status | Status of the last write operation to the AOF |
DEPENDENT | redis.persistence.aof_last_write_status Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
Redis | Redis: AOF rewrite in progress | Flag indicating an AOF rewrite operation is on-going |
DEPENDENT | redis.persistence.aof_rewrite_in_progress Preprocessing: - JSONPATH: |
Redis | Redis: AOF rewrite scheduled | Flag indicating an AOF rewrite operation will be scheduled once the on-going RDB save is complete |
DEPENDENT | redis.persistence.aof_rewrite_scheduled Preprocessing: - JSONPATH: |
Redis | Redis: Dump loading | Flag indicating if the load of a dump file is on-going |
DEPENDENT | redis.persistence.loading Preprocessing: - JSONPATH: |
Redis | Redis: RDB bgsave in progress | "1" if bgsave is in progress and "0" otherwise |
DEPENDENT | redis.persistence.rdb_bgsave_in_progress Preprocessing: - JSONPATH: |
Redis | Redis: RDB changes since last save | Number of changes since the last background save |
DEPENDENT | redis.persistence.rdb_changes_since_last_save Preprocessing: - JSONPATH: |
Redis | Redis: RDB current bgsave time sec | Duration of the on-going RDB save operation if any |
DEPENDENT | redis.persistence.rdb_current_bgsave_time_sec Preprocessing: - JSONPATH: |
Redis | Redis: RDB last bgsave status | Status of the last RDB save operation |
DEPENDENT | redis.persistence.rdb_last_bgsave_status Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
Redis | Redis: RDB last bgsave time sec | Duration of the last bg_save operation |
DEPENDENT | redis.persistence.rdb_last_bgsave_time_sec Preprocessing: - JSONPATH: |
Redis | Redis: RDB last save time | Epoch-based timestamp of last successful RDB save |
DEPENDENT | redis.persistence.rdb_last_save_time Preprocessing: - JSONPATH: |
Redis | Redis: Connected slaves | Number of connected slaves |
DEPENDENT | redis.replication.connected_slaves Preprocessing: - JSONPATH: |
Redis | Redis: Replication backlog active | Flag indicating replication backlog is active |
DEPENDENT | redis.replication.repl_backlog_active Preprocessing: - JSONPATH: |
Redis | Redis: Replication backlog first byte offset | The master offset of the replication backlog buffer |
DEPENDENT | redis.replication.repl_backlog_first_byte_offset Preprocessing: - JSONPATH: |
Redis | Redis: Replication backlog history length | Amount of data in the backlog sync buffer |
DEPENDENT | redis.replication.repl_backlog_histlen Preprocessing: - JSONPATH: |
Redis | Redis: Replication backlog size | Total size in bytes of the replication backlog buffer |
DEPENDENT | redis.replication.repl_backlog_size Preprocessing: - JSONPATH: |
Redis | Redis: Replication role | Value is "master" if the instance is replica of no one, or "slave" if the instance is a replica of some master instance. Note that a replica can be master of another replica (chained replication). |
DEPENDENT | redis.replication.role Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Redis | Redis: Master replication offset | Replication offset reported by the master |
DEPENDENT | redis.replication.master_repl_offset Preprocessing: - JSONPATH: |
Redis | Redis: Process id | PID of the server process |
DEPENDENT | redis.server.process_id Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Redis | Redis: Redis mode | The server's mode ("standalone", "sentinel" or "cluster") |
DEPENDENT | redis.server.redis_mode Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Redis | Redis: Redis version | Version of the Redis server |
DEPENDENT | redis.server.redis_version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Redis | Redis: TCP port | TCP/IP listen port |
DEPENDENT | redis.server.tcp_port Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Redis | Redis: Uptime | Number of seconds since Redis server start |
DEPENDENT | redis.server.uptime Preprocessing: - JSONPATH: |
Redis | Redis: Evicted keys | Number of evicted keys due to maxmemory limit |
DEPENDENT | redis.stats.evicted_keys Preprocessing: - JSONPATH: |
Redis | Redis: Expired keys | Total number of key expiration events |
DEPENDENT | redis.stats.expired_keys Preprocessing: - JSONPATH: |
Redis | Redis: Instantaneous input bytes per second | The network's read rate per second in KB/sec |
DEPENDENT | redis.stats.instantaneous_input.rate Preprocessing: - JSONPATH: - MULTIPLIER: |
Redis | Redis: Instantaneous operations per sec | Number of commands processed per second |
DEPENDENT | redis.stats.instantaneous_ops.rate Preprocessing: - JSONPATH: |
Redis | Redis: Instantaneous output bytes per second | The network's write rate per second in KB/sec |
DEPENDENT | redis.stats.instantaneous_output.rate Preprocessing: - JSONPATH: - MULTIPLIER: |
Redis | Redis: Keyspace hits | Number of successful lookups of keys in the main dictionary |
DEPENDENT | redis.stats.keyspace_hits Preprocessing: - JSONPATH: |
Redis | Redis: Keyspace misses | Number of failed lookups of keys in the main dictionary |
DEPENDENT | redis.stats.keyspace_misses Preprocessing: - JSONPATH: |
Redis | Redis: Latest fork usec | Duration of the latest fork operation in microseconds |
DEPENDENT | redis.stats.latest_fork_usec Preprocessing: - JSONPATH: - MULTIPLIER: |
Redis | Redis: Migrate cached sockets | The number of sockets open for MIGRATE purposes |
DEPENDENT | redis.stats.migrate_cached_sockets Preprocessing: - JSONPATH: |
Redis | Redis: Pubsub channels | Global number of pub/sub channels with client subscriptions |
DEPENDENT | redis.stats.pubsub_channels Preprocessing: - JSONPATH: |
Redis | Redis: Pubsub patterns | Global number of pub/sub patterns with client subscriptions |
DEPENDENT | redis.stats.pubsub_patterns Preprocessing: - JSONPATH: |
Redis | Redis: Rejected connections | Number of connections rejected because of maxclients limit |
DEPENDENT | redis.stats.rejected_connections Preprocessing: - JSONPATH: |
Redis | Redis: Sync full | The number of full resyncs with replicas |
DEPENDENT | redis.stats.sync_full Preprocessing: - JSONPATH: |
Redis | Redis: Sync partial err | The number of denied partial resync requests |
DEPENDENT | redis.stats.sync_partial_err Preprocessing: - JSONPATH: |
Redis | Redis: Sync partial ok | The number of accepted partial resync requests |
DEPENDENT | redis.stats.sync_partial_ok Preprocessing: - JSONPATH: |
Redis | Redis: Total commands processed | Total number of commands processed by the server |
DEPENDENT | redis.stats.total_commands_processed Preprocessing: - JSONPATH: |
Redis | Redis: Total connections received | Total number of connections accepted by the server |
DEPENDENT | redis.stats.total_connections_received Preprocessing: - JSONPATH: |
Redis | Redis: Total net input bytes | The total number of bytes read from the network |
DEPENDENT | redis.stats.total_net_input_bytes Preprocessing: - JSONPATH: |
Redis | Redis: Total net output bytes | The total number of bytes written to the network |
DEPENDENT | redis.stats.total_net_output_bytes Preprocessing: - JSONPATH: |
Redis | Redis: Max clients | Max number of connected clients at the same time. Once the limit is reached Redis will close all the new connections sending an error "max number of clients reached". |
DEPENDENT | redis.config.maxclients Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Redis | DB {#DB}: Average TTL | Average TTL |
DEPENDENT | redis.db.avg_ttl["{#DB}"] Preprocessing: - JSONPATH: - MULTIPLIER: |
Redis | DB {#DB}: Expires | Number of keys with an expiration |
DEPENDENT | redis.db.expires["{#DB}"] Preprocessing: - JSONPATH: |
Redis | DB {#DB}: Keys | Total number of keys |
DEPENDENT | redis.db.keys["{#DB}"] Preprocessing: - JSONPATH: |
Redis | Redis: AOF current size{#SINGLETON} | AOF current file size |
DEPENDENT | redis.persistence.aof_current_size[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: AOF base size{#SINGLETON} | AOF file size on latest startup or rewrite |
DEPENDENT | redis.persistence.aof_base_size[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: AOF pending rewrite{#SINGLETON} | Flag indicating an AOF rewrite operation will be scheduled once the on-going RDB save is complete |
DEPENDENT | redis.persistence.aof_pending_rewrite[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: AOF buffer length{#SINGLETON} | Size of the AOF buffer |
DEPENDENT | redis.persistence.aof_buffer_length[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: AOF rewrite buffer length{#SINGLETON} | Size of the AOF rewrite buffer |
DEPENDENT | redis.persistence.aof_rewrite_buffer_length[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: AOF pending background I/O fsync{#SINGLETON} | Number of fsync pending jobs in background I/O queue |
DEPENDENT | redis.persistence.aof_pending_bio_fsync[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: AOF delayed fsync{#SINGLETON} | Delayed fsync counter |
DEPENDENT | redis.persistence.aof_delayed_fsync[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Master host{#SINGLETON} | Host or IP address of the master |
DEPENDENT | redis.replication.master_host[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Redis | Redis: Master port{#SINGLETON} | Master listening TCP port |
DEPENDENT | redis.replication.master_port[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Redis | Redis: Master link status{#SINGLETON} | Status of the link (up/down) |
DEPENDENT | redis.replication.master_link_status[{#SINGLETON}] Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
Redis | Redis: Master last I/O seconds ago{#SINGLETON} | Number of seconds since the last interaction with master |
DEPENDENT | redis.replication.master_last_io_seconds_ago[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Master sync in progress{#SINGLETON} | Indicates that the master is syncing to the replica |
DEPENDENT | redis.replication.master_sync_in_progress[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Slave replication offset{#SINGLETON} | The replication offset of the replica instance |
DEPENDENT | redis.replication.slave_repl_offset[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Slave priority{#SINGLETON} | The priority of the instance as a candidate for failover |
DEPENDENT | redis.replication.slave_priority[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Slave read-only{#SINGLETON} | Flag indicating if the replica is read-only |
DEPENDENT | redis.replication.slave_read_only[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Redis | Redis slave {#SLAVE_IP}:{#SLAVE_PORT}: Replication lag in bytes | Replication lag in bytes |
DEPENDENT | redis.replication.lag_bytes["{#SLAVE_IP}:{#SLAVE_PORT}"] Preprocessing: - JAVASCRIPT: |
Redis | Redis: Number of processes running | - |
ZABBIX_PASSIVE | proc.num["{$REDIS.PROCESS_NAME}{#SINGLETON}"] |
Redis | Redis: Memory usage (rss) | Resident set size memory used by process in bytes. |
ZABBIX_PASSIVE | proc.mem["{$REDIS.PROCESS_NAME}{#SINGLETON}",,,,rss] |
Redis | Redis: Memory usage (vsize) | Virtual memory size used by process in bytes. |
ZABBIX_PASSIVE | proc.mem["{$REDIS.PROCESS_NAME}{#SINGLETON}",,,,vsize] |
Redis | Redis: CPU utilization | Process CPU utilization percentage. |
ZABBIX_PASSIVE | proc.cpu.util["{$REDIS.PROCESS_NAME}{#SINGLETON}"] |
Redis | Redis: Executable path{#SINGLETON} | The path to the server's executable |
DEPENDENT | redis.server.executable[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Redis | Redis: Memory used peak %{#SINGLETON} | The percentage of used_memory_peak out of used_memory |
DEPENDENT | redis.memory.used_memory_peak_perc[{#SINGLETON}] Preprocessing: - JSONPATH: - REGEX: |
Redis | Redis: Memory used overhead{#SINGLETON} | The sum in bytes of all overheads that the server allocated for managing its internal data structures |
DEPENDENT | redis.memory.used_memory_overhead[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory used startup{#SINGLETON} | Initial amount of memory consumed by Redis at startup in bytes |
DEPENDENT | redis.memory.used_memory_startup[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory used dataset{#SINGLETON} | The size in bytes of the dataset |
DEPENDENT | redis.memory.used_memory_dataset[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory used dataset %{#SINGLETON} | The percentage of used_memory_dataset out of the net memory usage (used_memory minus used_memory_startup) |
DEPENDENT | redis.memory.used_memory_dataset_perc[{#SINGLETON}] Preprocessing: - JSONPATH: - REGEX: |
Redis | Redis: Total system memory{#SINGLETON} | The total amount of memory that the Redis host has |
DEPENDENT | redis.memory.total_system_memory[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Max memory{#SINGLETON} | Maximum amount of memory allocated to the Redisdb system |
DEPENDENT | redis.memory.maxmemory[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Max memory policy{#SINGLETON} | The value of the maxmemory-policy configuration directive |
DEPENDENT | redis.memory.maxmemory_policy[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Redis | Redis: Active defrag running{#SINGLETON} | Flag indicating if active defragmentation is active |
DEPENDENT | redis.memory.active_defrag_running[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Lazyfree pending objects{#SINGLETON} | The number of objects waiting to be freed (as a result of calling UNLINK, or FLUSHDB and FLUSHALL with the ASYNC option) |
DEPENDENT | redis.memory.lazyfree_pending_objects[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: RDB last CoW size{#SINGLETON} | The size in bytes of copy-on-write allocations during the last RDB save operation |
DEPENDENT | redis.persistence.rdb_last_cow_size[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: AOF last CoW size{#SINGLETON} | The size in bytes of copy-on-write allocations during the last AOF rewrite operation |
DEPENDENT | redis.persistence.aof_last_cow_size[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Expired stale %{#SINGLETON} | - |
DEPENDENT | redis.stats.expired_stale_perc[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Expired time cap reached count{#SINGLETON} | - |
DEPENDENT | redis.stats.expired_time_cap_reached_count[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Slave expires tracked keys{#SINGLETON} | The number of keys tracked for expiry purposes (applicable only to writable replicas) |
DEPENDENT | redis.stats.slave_expires_tracked_keys[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Active defrag hits{#SINGLETON} | Number of value reallocations performed by the active defragmentation process |
DEPENDENT | redis.stats.active_defrag_hits[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Active defrag misses{#SINGLETON} | Number of aborted value reallocations started by the active defragmentation process |
DEPENDENT | redis.stats.active_defrag_misses[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Active defrag key hits{#SINGLETON} | Number of keys that were actively defragmented |
DEPENDENT | redis.stats.active_defrag_key_hits[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Active defrag key misses{#SINGLETON} | Number of keys that were skipped by the active defragmentation process |
DEPENDENT | redis.stats.active_defrag_key_misses[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Replication second offset{#SINGLETON} | Offset up to which replication IDs are accepted |
DEPENDENT | redis.replication.second_repl_offset[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Allocator active{#SINGLETON} | - |
DEPENDENT | redis.memory.allocator_active[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Allocator allocated{#SINGLETON} | - |
DEPENDENT | redis.memory.allocator_allocated[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Allocator resident{#SINGLETON} | - |
DEPENDENT | redis.memory.allocator_resident[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory used scripts{#SINGLETON} | - |
DEPENDENT | redis.memory.used_memory_scripts[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory number of cached scripts{#SINGLETON} | - |
DEPENDENT | redis.memory.number_of_cached_scripts[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Allocator fragmentation bytes{#SINGLETON} | - |
DEPENDENT | redis.memory.allocator_frag_bytes[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Allocator fragmentation ratio{#SINGLETON} | - |
DEPENDENT | redis.memory.allocator_frag_ratio[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Allocator RSS bytes{#SINGLETON} | - |
DEPENDENT | redis.memory.allocator_rss_bytes[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Allocator RSS ratio{#SINGLETON} | - |
DEPENDENT | redis.memory.allocator_rss_ratio[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory RSS overhead bytes{#SINGLETON} | - |
DEPENDENT | redis.memory.rss_overhead_bytes[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory RSS overhead ratio{#SINGLETON} | - |
DEPENDENT | redis.memory.rss_overhead_ratio[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory fragmentation bytes{#SINGLETON} | - |
DEPENDENT | redis.memory.fragmentation_bytes[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory not counted for evict{#SINGLETON} | - |
DEPENDENT | redis.memory.not_counted_for_evict[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory replication backlog{#SINGLETON} | - |
DEPENDENT | redis.memory.replication_backlog[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory clients normal{#SINGLETON} | - |
DEPENDENT | redis.memory.mem_clients_normal[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory clients slaves{#SINGLETON} | - |
DEPENDENT | redis.memory.mem_clients_slaves[{#SINGLETON}] Preprocessing: - JSONPATH: |
Redis | Redis: Memory AOF buffer{#SINGLETON} | Size of the AOF buffer |
DEPENDENT | redis.memory.mem_aof_buffer[{#SINGLETON}] Preprocessing: - JSONPATH: |
Zabbix raw items | Redis: Get info | ZABBIX_PASSIVE | redis.info["{$REDIS.CONN.URI}"] | |
Zabbix raw items | Redis: Get config | ZABBIX_PASSIVE | redis.config["{$REDIS.CONN.URI}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Redis: Service is down | - |
last(/Redis by Zabbix agent 2/redis.ping["{$REDIS.CONN.URI}"])=0 |
AVERAGE | Manual close: YES |
Redis: Too many entries in the slowlog | - |
min(/Redis by Zabbix agent 2/redis.slowlog.count["{$REDIS.CONN.URI}"],5m)>{$REDIS.SLOWLOG.COUNT.MAX.WARN} |
INFO | |
Redis: Total number of connected clients is too high | When the number of clients reaches the value of the "maxclients" parameter, new connections will be rejected. https://redis.io/topics/clients#maximum-number-of-clients |
min(/Redis by Zabbix agent 2/redis.clients.connected,5m)/last(/Redis by Zabbix agent 2/redis.config.maxclients)*100>{$REDIS.CLIENTS.PRC.MAX.WARN} |
WARNING | |
Redis: Memory fragmentation ratio is too high | This ratio is an indication of memory mapping efficiency: — A value over 1.0 indicates that memory fragmentation is very likely. Consider restarting the Redis server so the operating system can recover fragmented memory, especially with a ratio over 1.5. — A value under 1.0 indicates that Redis likely has insufficient memory available. Consider optimizing memory usage or adding more RAM. Note: If your peak memory usage is much higher than your current memory usage, the memory fragmentation ratio may be unreliable. https://redis.io/topics/memory-optimization |
min(/Redis by Zabbix agent 2/redis.memory.fragmentation_ratio,15m)>{$REDIS.MEM.FRAG_RATIO.MAX.WARN} |
WARNING | |
Redis: Last AOF write operation failed | Detailed information about persistence: https://redis.io/topics/persistence |
last(/Redis by Zabbix agent 2/redis.persistence.aof_last_write_status)=0 |
WARNING | |
Redis: Last RDB save operation failed | Detailed information about persistence: https://redis.io/topics/persistence |
last(/Redis by Zabbix agent 2/redis.persistence.rdb_last_bgsave_status)=0 |
WARNING | |
Redis: Number of slaves has changed | Redis number of slaves has changed. Ack to close. |
last(/Redis by Zabbix agent 2/redis.replication.connected_slaves,#1)<>last(/Redis by Zabbix agent 2/redis.replication.connected_slaves,#2) |
INFO | Manual close: YES |
Redis: Replication role has changed | Redis replication role has changed. Ack to close. |
last(/Redis by Zabbix agent 2/redis.replication.role,#1)<>last(/Redis by Zabbix agent 2/redis.replication.role,#2) and length(last(/Redis by Zabbix agent 2/redis.replication.role))>0 |
WARNING | Manual close: YES |
Redis: Version has changed | Redis version has changed. Ack to close. |
last(/Redis by Zabbix agent 2/redis.server.redis_version,#1)<>last(/Redis by Zabbix agent 2/redis.server.redis_version,#2) and length(last(/Redis by Zabbix agent 2/redis.server.redis_version))>0 |
INFO | Manual close: YES |
Redis: has been restarted | Uptime is less than 10 minutes. |
last(/Redis by Zabbix agent 2/redis.server.uptime)<10m |
INFO | Manual close: YES |
Redis: Connections are rejected | The number of connections has reached the value of "maxclients". https://redis.io/topics/clients |
last(/Redis by Zabbix agent 2/redis.stats.rejected_connections)>0 |
HIGH | |
Redis: Replication lag with master is too high | - |
min(/Redis by Zabbix agent 2/redis.replication.master_last_io_seconds_ago[{#SINGLETON}],5m)>{$REDIS.REPL.LAG.MAX.WARN} |
WARNING | |
Redis: Process is not running | - |
last(/Redis by Zabbix agent 2/proc.num["{$REDIS.PROCESS_NAME}{#SINGLETON}"])=0 |
HIGH | |
Redis: Memory usage is too high | - |
last(/Redis by Zabbix agent 2/redis.memory.used_memory)/min(/Redis by Zabbix agent 2/redis.memory.maxmemory[{#SINGLETON}],5m)*100>{$REDIS.MEM.PUSED.MAX.WARN} |
WARNING | |
Redis: Failed to fetch info data | Zabbix has not received data for items for the last 30 minutes |
nodata(/Redis by Zabbix agent 2/redis.info["{$REDIS.CONN.URI}"],30m)=1 |
WARNING | Manual close: YES Depends on: - Redis: Service is down |
Redis: Configuration has changed | Redis configuration has changed. Ack to close. |
last(/Redis by Zabbix agent 2/redis.config["{$REDIS.CONN.URI}"],#1)<>last(/Redis by Zabbix agent 2/redis.config["{$REDIS.CONN.URI}"],#2) and length(last(/Redis by Zabbix agent 2/redis.config["{$REDIS.CONN.URI}"]))>0 |
INFO | Manual close: YES |
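The fragmentation trigger above compares used_memory_rss against used_memory. A minimal sketch of that ratio in shell, fed with made-up INFO-style values:

```shell
# Made-up values mimicking the memory section of Redis INFO output.
info='used_memory:1048576
used_memory_rss:1887436'

# mem_fragmentation_ratio = used_memory_rss / used_memory
ratio=$(printf '%s\n' "$info" | awk -F: '
  /^used_memory:/     { um  = $2 }
  /^used_memory_rss:/ { rss = $2 }
  END { printf "%.2f", rss / um }')
echo "mem_fragmentation_ratio: $ratio"
# The trigger compares this value against {$REDIS.MEM.FRAG_RATIO.MAX.WARN}
# (default 1.5) over a 15-minute window.
```

With these sample numbers the ratio comes out at 1.80, above the 1.5 default, so the WARNING trigger would fire once the condition holds for 15 minutes.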
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher. The template is developed to monitor DBMS PostgreSQL and its forks.
This template has been tested on:
See Zabbix template operation for basic instructions.
Deploy Zabbix agent 2 with the PostgreSQL plugin. Starting with Zabbix versions 6.0.10 / 6.2.4 / 6.4, PostgreSQL metrics were moved to a loadable plugin, which requires separate package installation or compilation of the plugin from sources.
Create a PostgreSQL user to monitor (<password> at your discretion) and inherit permissions from the default role pg_monitor:
CREATE USER zbx_monitor WITH PASSWORD '<PASSWORD>' INHERIT;
GRANT pg_monitor TO zbx_monitor;
Edit pg_hba.conf to allow connections from Zabbix agent:
# TYPE DATABASE USER ADDRESS METHOD
host all zbx_monitor localhost md5
For more information please read the PostgreSQL documentation https://www.postgresql.org/docs/current/auth-pg-hba-conf.html.
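Once the role and the pg_hba.conf entry are in place, membership in pg_monitor can be verified. The sketch below only prints a possible psql invocation; host and database names are placeholders:

```shell
# Placeholder connection parameters; adjust to your environment.
user="zbx_monitor"
db="postgres"

# Print a check that the monitoring role inherits pg_monitor.
echo "psql -h localhost -U ${user} -d ${db} -c \"SELECT pg_has_role('${user}', 'pg_monitor', 'member');\""
```

Running the printed command should return `t` if the grant from the setup step succeeded.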
Set the {$PG.URI} macro to the system data source name of the PostgreSQL instance, such as <protocol(host:port)>.
Set the user name and password in host macros ({$PG.USER} and {$PG.PASSWORD}) if you want to override parameters from the Zabbix agent configuration file.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$PG.CONFLICTS.MAX.WARN} | - |
0 |
{$PG.CONNTOTALPCT.MAX.WARN} | - |
90 |
{$PG.DATABASE} | - |
postgres |
{$PG.DEADLOCKS.MAX.WARN} | - |
0 |
{$PG.LLD.FILTER.APPLICATION} | - |
(.+) |
{$PG.LLD.FILTER.DBNAME} | - |
(.+) |
{$PG.PASSWORD} | - |
postgres |
{$PG.QUERY_ETIME.MAX.WARN} | Execution time limit for count of slow queries. |
30 |
{$PG.SLOW_QUERIES.MAX.WARN} | Slow queries count threshold for a trigger. |
5 |
{$PG.URI} | - |
tcp://localhost:5432 |
{$PG.USER} | - |
postgres |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Database discovery | - |
ZABBIX_PASSIVE | pgsql.db.discovery["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] Filter: AND - {#DBNAME} MATCHES_REGEX |
Replication Discovery | - |
ZABBIX_PASSIVE | pgsql.replication.process.discovery["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] Filter: AND - {#APPLICATION_NAME} MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
PostgreSQL | PostgreSQL: Custom queries | Execute custom queries from *.sql files (check the Plugins.Postgres.CustomQueriesPath option in the agent configuration) |
ZABBIX_PASSIVE | pgsql.custom.query["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DATABASE}",""] |
PostgreSQL | WAL: Bytes written | WAL write in bytes |
DEPENDENT | pgsql.wal.write Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | WAL: Bytes received | WAL receive in bytes |
DEPENDENT | pgsql.wal.receive Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | WAL: Segments count | Number of WAL segments |
DEPENDENT | pgsql.wal.count Preprocessing: - JSONPATH: |
PostgreSQL | Bgwriter: Buffers allocated | Number of buffers allocated |
DEPENDENT | pgsql.bgwriter.buffers_alloc.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Bgwriter: Buffers written directly by a backend | Number of buffers written directly by a backend |
DEPENDENT | pgsql.bgwriter.buffers_backend.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Bgwriter: Number of bgwriter stopped | Number of times the background writer stopped a cleaning scan because it had written too many buffers |
DEPENDENT | pgsql.bgwriter.maxwritten_clean.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Bgwriter: Times a backend executed its own fsync | Number of times a backend had to execute its own fsync call (normally the background writer handles those even when the backend does its own write) |
DEPENDENT | pgsql.bgwriter.buffers_backend_fsync.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Checkpoint: Buffers background written | Number of buffers written by the background writer |
DEPENDENT | pgsql.bgwriter.buffers_clean.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Checkpoint: Buffers checkpoints written | Number of buffers written during checkpoints |
DEPENDENT | pgsql.bgwriter.buffers_checkpoint.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Checkpoint: By timeout | Number of scheduled checkpoints that have been performed |
DEPENDENT | pgsql.bgwriter.checkpoints_timed.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Checkpoint: Requested | Number of requested checkpoints that have been performed |
DEPENDENT | pgsql.bgwriter.checkpoints_req.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Checkpoint: Checkpoint write time | Total amount of time that has been spent in the portion of checkpoint processing where files are written to disk, in milliseconds |
DEPENDENT | pgsql.bgwriter.checkpoint_write_time.rate Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
PostgreSQL | Checkpoint: Checkpoint sync time | Total amount of time that has been spent in the portion of checkpoint processing where files are synchronized to disk, in milliseconds |
DEPENDENT | pgsql.bgwriter.checkpoint_sync_time.rate Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
PostgreSQL | Archive: Count of archived files | Collect metrics from pg_stat_archiver https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ARCHIVER-VIEW |
DEPENDENT | pgsql.archive.count_archived_files Preprocessing: - JSONPATH: |
PostgreSQL | Archive: Count of attempts to archive files | Collect metrics from pg_stat_archiver https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ARCHIVER-VIEW |
DEPENDENT | pgsql.archive.failed_trying_to_archive Preprocessing: - JSONPATH: |
PostgreSQL | Archive: Count of files in archive_status need to archive | - |
DEPENDENT | pgsql.archive.count_files_to_archive Preprocessing: - JSONPATH: |
PostgreSQL | Archive: Size of files need to archive | Size of files to archive |
DEPENDENT | pgsql.archive.size_files_to_archive Preprocessing: - JSONPATH: |
PostgreSQL | Dbstat: Blocks read time | Time spent reading data file blocks by backends, in milliseconds |
DEPENDENT | pgsql.dbstat.sum.blk_read_time Preprocessing: - JSONPATH: - MULTIPLIER: |
PostgreSQL | Dbstat: Blocks write time | Time spent writing data file blocks by backends, in milliseconds |
DEPENDENT | pgsql.dbstat.sum.blk_write_time Preprocessing: - JSONPATH: - MULTIPLIER: |
PostgreSQL | Dbstat: Checksum failures | Number of data page checksum failures detected (or on a shared object), or NULL if data checksums are not enabled. This metric is available in PostgreSQL 12 and later |
DEPENDENT | pgsql.dbstat.sum.checksum_failures.rate Preprocessing: - JSONPATH: - MATCHES_REGEX: ^\d*$ - CHANGE_PER_SECOND ⛔️ON_FAIL: |
PostgreSQL | Dbstat: Committed transactions | Number of transactions that have been committed |
DEPENDENT | pgsql.dbstat.sum.xact_commit.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Conflicts | Number of queries canceled due to conflicts with recovery. (Conflicts occur only on standby servers; see pg_stat_database_conflicts for details.) |
DEPENDENT | pgsql.dbstat.sum.conflicts.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Deadlocks | Number of deadlocks detected |
DEPENDENT | pgsql.dbstat.sum.deadlocks.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Disk blocks read | Number of disk blocks read |
DEPENDENT | pgsql.dbstat.sum.blks_read.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Hit blocks read | Number of times disk blocks were found already in the buffer cache |
DEPENDENT | pgsql.dbstat.sum.blks_hit.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Number temp bytes | Total amount of data written to temporary files by queries |
DEPENDENT | pgsql.dbstat.sum.temp_bytes.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Number temp files | Number of temporary files created by queries |
DEPENDENT | pgsql.dbstat.sum.temp_files.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Rolled back transactions | Number of transactions that have been rolled back |
DEPENDENT | pgsql.dbstat.sum.xact_rollback.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Rows deleted | Number of rows deleted by queries |
DEPENDENT | pgsql.dbstat.sum.tup_deleted.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Rows fetched | Number of rows fetched by queries |
DEPENDENT | pgsql.dbstat.sum.tup_fetched.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Rows inserted | Number of rows inserted by queries |
DEPENDENT | pgsql.dbstat.sum.tup_inserted.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Rows returned | Number of rows returned by queries |
DEPENDENT | pgsql.dbstat.sum.tup_returned.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Rows updated | Number of rows updated by queries |
DEPENDENT | pgsql.dbstat.sum.tup_updated.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Dbstat: Backends connected | Number of connected backends |
DEPENDENT | pgsql.dbstat.sum.numbackends Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Active | Total number of connections executing a query |
DEPENDENT | pgsql.connections.active Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Fastpath function call | Total number of connections executing a fast-path function |
DEPENDENT | pgsql.connections.fastpath_function_call Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Idle | Total number of connections waiting for a new client command |
DEPENDENT | pgsql.connections.idle Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Idle in transaction | Total number of connections in a transaction state, but not executing a query |
DEPENDENT | pgsql.connections.idle_in_transaction Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Prepared | Total number of prepared transactions https://www.postgresql.org/docs/current/sql-prepare-transaction.html |
DEPENDENT | pgsql.connections.prepared Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Total | Total number of connections |
DEPENDENT | pgsql.connections.total Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Total % | Total number of connections in percentage |
DEPENDENT | pgsql.connections.total_pct Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Waiting | Total number of waiting connections https://www.postgresql.org/docs/current/monitoring-stats.html#WAIT-EVENT-TABLE |
DEPENDENT | pgsql.connections.waiting Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Idle in transaction (aborted) | Total number of connections in a transaction state, but not executing a query and one of the statements in the transaction caused an error. |
DEPENDENT | pgsql.connections.idle_in_transaction_aborted Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Disabled | Total number of disabled connections |
DEPENDENT | pgsql.connections.disabled Preprocessing: - JSONPATH: |
PostgreSQL | PostgreSQL: Age of oldest xid | Age of oldest xid. |
ZABBIX_PASSIVE | pgsql.oldest.xid["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
PostgreSQL | Autovacuum: Count of autovacuum workers | Number of autovacuum workers. |
ZABBIX_PASSIVE | pgsql.autovacuum.count["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
PostgreSQL | PostgreSQL: Cache hit | - |
CALCULATED | pgsql.cache.hit["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] Expression: last(//pgsql.dbstat.sum.blks_hit.rate) * 100 / (last(//pgsql.dbstat.sum.blks_hit.rate) + last(//pgsql.dbstat.sum.blks_read.rate)) |
PostgreSQL | PostgreSQL: Uptime | - |
ZABBIX_PASSIVE | pgsql.uptime["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
PostgreSQL | Replication: Lag in bytes | Replication lag with Master in bytes. |
ZABBIX_PASSIVE | pgsql.replication.lag.b["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
PostgreSQL | Replication: Lag in seconds | Replication lag with Master in seconds. |
ZABBIX_PASSIVE | pgsql.replication.lag.sec["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
PostgreSQL | Replication: Recovery role | Replication role: 1 — recovery is still in progress (standby mode), 0 — master mode. |
ZABBIX_PASSIVE | pgsql.replication.recovery_role["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
PostgreSQL | Replication: Standby count | Number of standby servers |
ZABBIX_PASSIVE | pgsql.replication.count["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
PostgreSQL | Replication: Status | Replication status: 0 — streaming is down, 1 — streaming is up, 2 — master mode |
ZABBIX_PASSIVE | pgsql.replication.status["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
PostgreSQL | PostgreSQL: Ping | - |
ZABBIX_PASSIVE | pgsql.ping["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
PostgreSQL | Application {#APPLICATION_NAME}: Replication flush lag | - |
DEPENDENT | pgsql.replication.process.flush_lag["{#APPLICATION_NAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | Application {#APPLICATION_NAME}: Replication replay lag | - |
DEPENDENT | pgsql.replication.process.replay_lag["{#APPLICATION_NAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | Application {#APPLICATION_NAME}: Replication write lag | - |
DEPENDENT | pgsql.replication.process.write_lag["{#APPLICATION_NAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Database age | Database age |
ZABBIX_PASSIVE | pgsql.db.age["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"] |
PostgreSQL | DB {#DBNAME}: Get bloating tables | Number of bloating tables |
ZABBIX_PASSIVE | pgsql.db.bloating_tables["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"] |
PostgreSQL | DB {#DBNAME}: Database size | Database size |
ZABBIX_PASSIVE | pgsql.db.size["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"] |
PostgreSQL | DB {#DBNAME}: Blocks hit per second | Total number of times disk blocks were found already in the buffer cache, so that a read was not necessary |
DEPENDENT | pgsql.dbstat.blks_hit.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Disk blocks read per second | Total number of disk blocks read in this database |
DEPENDENT | pgsql.dbstat.blks_read.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Detected conflicts per second | Total number of queries canceled due to conflicts with recovery in this database |
DEPENDENT | pgsql.dbstat.conflicts.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Detected deadlocks per second | Total number of detected deadlocks in this database |
DEPENDENT | pgsql.dbstat.deadlocks.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Temp_bytes written per second | Total amount of data written to temporary files by queries in this database |
DEPENDENT | pgsql.dbstat.temp_bytes.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Temp_files created per second | Total number of temporary files created by queries in this database |
DEPENDENT | pgsql.dbstat.temp_files.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Tuples deleted per second | Total number of rows deleted by queries in this database |
DEPENDENT | pgsql.dbstat.tup_deleted.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Tuples fetched per second | Total number of rows fetched by queries in this database |
DEPENDENT | pgsql.dbstat.tup_fetched.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Tuples inserted per second | Total number of rows inserted by queries in this database |
DEPENDENT | pgsql.dbstat.tup_inserted.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Tuples returned per second | Number of rows returned by queries in this database |
DEPENDENT | pgsql.dbstat.tup_returned.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Tuples updated per second | Total number of rows updated by queries in this database |
DEPENDENT | pgsql.dbstat.tup_updated.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Commits per second | Number of transactions in this database that have been committed |
DEPENDENT | pgsql.dbstat.xact_commit.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Rollbacks per second | Total number of transactions in this database that have been rolled back |
DEPENDENT | pgsql.dbstat.xact_rollback.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Backends connected | Number of backends currently connected to this database |
DEPENDENT | pgsql.dbstat.numbackends["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Checksum failures | Number of data page checksum failures detected in this database |
DEPENDENT | pgsql.dbstat.checksum_failures.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - MATCHES_REGEX: ^\d*$ - CHANGE_PER_SECOND ⛔️ON_FAIL: |
PostgreSQL | DB {#DBNAME}: Disk blocks read time | Time spent reading data file blocks by backends, in milliseconds |
DEPENDENT | pgsql.dbstat.blk_read_time.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Disk blocks write time | Time spent writing data file blocks by backends, in milliseconds |
DEPENDENT | pgsql.dbstat.blk_write_time.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Num of access exclusive locks | Number of access exclusive locks for each database |
DEPENDENT | pgsql.locks.accessexclusive["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Num of access share locks | Number of access share locks for each database |
DEPENDENT | pgsql.locks.accessshare["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Num of exclusive locks | Number of exclusive locks for each database |
DEPENDENT | pgsql.locks.exclusive["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Num of row exclusive locks | Number of row exclusive locks for each database |
DEPENDENT | pgsql.locks.rowexclusive["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Num of row share locks | Number of row share locks for each database |
DEPENDENT | pgsql.locks.rowshare["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Num of share row exclusive locks | Number of share row exclusive locks for each database |
DEPENDENT | pgsql.locks.sharerowexclusive["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Num of share update exclusive locks | Number of share update exclusive locks for each database |
DEPENDENT | pgsql.locks.shareupdateexclusive["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Num of share locks | Number of share locks for each database |
DEPENDENT | pgsql.locks.share["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Num of total locks | Number of total locks for each database |
DEPENDENT | pgsql.locks.total["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries max maintenance time | Max maintenance query time |
DEPENDENT | pgsql.queries.mro.time_max["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries max query time | Max query time |
DEPENDENT | pgsql.queries.query.time_max["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries max transaction time | Max transaction query time |
DEPENDENT | pgsql.queries.tx.time_max["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries slow maintenance count | Slow maintenance query count |
DEPENDENT | pgsql.queries.mro.slow_count["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries slow query count | Slow query count |
DEPENDENT | pgsql.queries.query.slow_count["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries slow transaction count | Slow transaction query count |
DEPENDENT | pgsql.queries.tx.slow_count["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries sum maintenance time | Sum maintenance query time |
DEPENDENT | pgsql.queries.mro.time_sum["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries sum query time | Sum query time |
DEPENDENT | pgsql.queries.query.time_sum["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries sum transaction time | Sum transaction query time |
DEPENDENT | pgsql.queries.tx.time_sum["{#DBNAME}"] Preprocessing: - JSONPATH: |
Zabbix raw items | PostgreSQL: Get bgwriter | Collect all metrics from pg_stat_bgwriter https://www.postgresql.org/docs/12/monitoring-stats.html#PG-STAT-BGWRITER-VIEW |
ZABBIX_PASSIVE | pgsql.bgwriter["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
Zabbix raw items | PostgreSQL: Get archive | Collect archive status metrics |
ZABBIX_PASSIVE | pgsql.archive["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
Zabbix raw items | PostgreSQL: Get dbstat | Collect all metrics from pg_stat_database per database https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-DATABASE-VIEW |
ZABBIX_PASSIVE | pgsql.dbstat["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
Zabbix raw items | PostgreSQL: Get dbstat sum | Collect all metrics from pg_stat_database per database https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-DATABASE-VIEW |
ZABBIX_PASSIVE | pgsql.dbstat.sum["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
Zabbix raw items | PostgreSQL: Get connections | Collect all metrics from pg_stat_activity https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW |
ZABBIX_PASSIVE | pgsql.connections["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
Zabbix raw items | PostgreSQL: Get WAL | Collect WAL metrics |
ZABBIX_PASSIVE | pgsql.wal.stat["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
Zabbix raw items | PostgreSQL: Get locks | Collect all metrics from pg_locks per database https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-TABLES |
ZABBIX_PASSIVE | pgsql.locks["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
Zabbix raw items | PostgreSQL: Get replication | Collect metrics from the pg_stat_replication view, which contains information about the WAL sender process, showing statistics about replication to that sender's connected standby server. |
ZABBIX_PASSIVE | pgsql.replication.process["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
Zabbix raw items | PostgreSQL: Get queries | Collect all metrics by query execution time |
ZABBIX_PASSIVE | pgsql.queries["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DATABASE}","{$PG.QUERY_ETIME.MAX.WARN}"] |
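The "PostgreSQL: Cache hit" calculated item above divides the hit-block rate by the total block access rate. A minimal sketch of that expression (a hypothetical helper, guarded against division by zero, which the Zabbix expression itself does not need when traffic is nonzero):

```python
def cache_hit_pct(blks_hit_rate: float, blks_read_rate: float) -> float:
    # Mirrors: last(//pgsql.dbstat.sum.blks_hit.rate) * 100
    #   / (last(//pgsql.dbstat.sum.blks_hit.rate) + last(//pgsql.dbstat.sum.blks_read.rate))
    total = blks_hit_rate + blks_read_rate
    return blks_hit_rate * 100 / total if total else 0.0

print(cache_hit_pct(990, 10))  # 99.0
```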
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Dbstat: Checksum failures detected | Data page checksum failures were detected on that DB instance. https://www.postgresql.org/docs/current/checksums.html |
last(/PostgreSQL by Zabbix agent 2/pgsql.dbstat.sum.checksum_failures.rate)>0 |
AVERAGE | |
Connections sum: Total number of connections is too high | - |
min(/PostgreSQL by Zabbix agent 2/pgsql.connections.total_pct,5m) > {$PG.CONN_TOTAL_PCT.MAX.WARN} |
AVERAGE | |
PostgreSQL: Oldest xid is too big | - |
last(/PostgreSQL by Zabbix agent 2/pgsql.oldest.xid["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"]) > 18000000 |
AVERAGE | |
PostgreSQL: Service has been restarted | - |
last(/PostgreSQL by Zabbix agent 2/pgsql.uptime["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"]) < 600 |
AVERAGE | |
PostgreSQL: Service is down | - |
last(/PostgreSQL by Zabbix agent 2/pgsql.ping["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"])=0 |
HIGH | |
DB {#DBNAME}: Too many recovery conflicts | The primary and standby servers are in many ways loosely connected. Actions on the primary will have an effect on the standby. As a result, there is potential for negative interactions or conflicts between them. https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT |
min(/PostgreSQL by Zabbix agent 2/pgsql.dbstat.conflicts.rate["{#DBNAME}"],5m) > {$PG.CONFLICTS.MAX.WARN:"{#DBNAME}"} |
AVERAGE | |
DB {#DBNAME}: Deadlock occurred | - |
min(/PostgreSQL by Zabbix agent 2/pgsql.dbstat.deadlocks.rate["{#DBNAME}"],5m) > {$PG.DEADLOCKS.MAX.WARN:"{#DBNAME}"} |
HIGH | |
DB {#DBNAME}: Checksum failures detected | Data page checksum failures were detected on that database. https://www.postgresql.org/docs/current/checksums.html |
last(/PostgreSQL by Zabbix agent 2/pgsql.dbstat.checksum_failures.rate["{#DBNAME}"])>0 |
AVERAGE | |
DB {#DBNAME}: Too many slow queries | - |
min(/PostgreSQL by Zabbix agent 2/pgsql.queries.query.slow_count["{#DBNAME}"],5m)>{$PG.SLOW_QUERIES.MAX.WARN:"{#DBNAME}"} |
WARNING |
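Several triggers above use the pattern min(/host/item,5m) > threshold. The semantics are worth noting: the trigger fires only when every sample in the 5-minute window exceeds the threshold, so short spikes are ignored. A hypothetical illustration:

```python
def trigger_fires(samples_5m: list, threshold: float) -> bool:
    """Sketch of min(...,5m) > threshold: fires only if the SMALLEST
    value in the window exceeds the threshold, i.e. the condition
    held for the whole window."""
    return bool(samples_5m) and min(samples_5m) > threshold

# e.g. {$PG.SLOW_QUERIES.MAX.WARN} = 5
print(trigger_fires([6, 7, 9], 5))   # True  (sustained slow queries)
print(trigger_fires([6, 2, 9], 5))   # False (one sample below threshold)
```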
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
Templates to monitor PostgreSQL by Zabbix.
This template was tested on PostgreSQL versions 9.6, 10 and 11 on Linux and Windows.
See Zabbix template operation for basic instructions.
Install Zabbix agent and create a read-only zbx_monitor user with proper access to your PostgreSQL server.
For PostgreSQL version 10 and above:
CREATE USER zbx_monitor WITH PASSWORD '<PASSWORD>' INHERIT;
GRANT pg_monitor TO zbx_monitor;
For PostgreSQL version 9.6 and below:
CREATE USER zbx_monitor WITH PASSWORD '<PASSWORD>';
GRANT SELECT ON pg_stat_database TO zbx_monitor;
-- To collect WAL metrics, the user must have a `superuser` role.
ALTER USER zbx_monitor WITH SUPERUSER;
Copy postgresql/ to the Zabbix agent home directory /var/lib/zabbix/. The postgresql/ directory contains the files needed to obtain metrics from PostgreSQL.
Copy template_db_postgresql.conf to the Zabbix agent configuration directory /etc/zabbix/zabbix_agentd.d/ and restart the Zabbix agent service.
Edit pg_hba.conf to allow connections from Zabbix agent (see https://www.postgresql.org/docs/current/auth-pg-hba-conf.html).
Add rows (for example):
host all zbx_monitor 127.0.0.1/32 trust
host all zbx_monitor 0.0.0.0/0 md5
host all zbx_monitor ::0/0 md5
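PostgreSQL matches pg_hba.conf rules top to bottom and applies the first match, so the local trust rule above wins for 127.0.0.1 while remote hosts fall through to md5. A hypothetical sketch of that first-match behavior (PostgreSQL does this internally; the names here are invented):

```python
import ipaddress
from typing import Optional

# The three example rows above, as (user, network, method) tuples.
RULES = [
    ("zbx_monitor", ipaddress.ip_network("127.0.0.1/32"), "trust"),
    ("zbx_monitor", ipaddress.ip_network("0.0.0.0/0"), "md5"),
    ("zbx_monitor", ipaddress.ip_network("::0/0"), "md5"),
]

def auth_method(user: str, client_ip: str) -> Optional[str]:
    """Return the method of the first matching rule, as pg_hba does."""
    addr = ipaddress.ip_address(client_ip)
    for rule_user, net, method in RULES:
        if user == rule_user and addr.version == net.version and addr in net:
            return method
    return None  # no rule matched: connection rejected

print(auth_method("zbx_monitor", "127.0.0.1"))  # trust
print(auth_method("zbx_monitor", "10.0.0.5"))   # md5
```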
Import the template file to Zabbix and link it to the target host.
Set the {$PG.HOST}, {$PG.PORT}, {$PG.USER}, {$PG.PASSWORD} and {$PG.DB} macro values.
If PostgreSQL is installed from the PGDG repository, add the path to pg_isready to the PATH environment variable for the zabbix user.
Name | Description | Default |
---|---|---|
{$PG.CACHE_HITRATIO.MIN.WARN} | - |
90 |
{$PG.CHECKPOINTS_REQ.MAX.WARN} | - |
5 |
{$PG.CONFLICTS.MAX.WARN} | - |
0 |
{$PG.CONNIDLEIN_TRANS.MAX.WARN} | - |
5 |
{$PG.CONNTOTALPCT.MAX.WARN} | - |
90 |
{$PG.CONN_WAIT.MAX.WARN} | - |
0 |
{$PG.DB} | - |
postgres |
{$PG.DEADLOCKS.MAX.WARN} | - |
0 |
{$PG.FROZENXIDPCTSTOP.MIN.HIGH} | - |
75 |
{$PG.HOST} | - |
127.0.0.1 |
{$PG.LLD.FILTER.DBNAME} | - |
(.*) |
{$PG.LOCKS.MAX.WARN} | - |
100 |
{$PG.PASSWORD} | Please set user's password in this macro. |
`` |
{$PG.PING_TIME.MAX.WARN} | - |
1s |
{$PG.PORT} | - |
5432 |
{$PG.QUERY_ETIME.MAX.WARN} | - |
30 |
{$PG.REPL_LAG.MAX.WARN} | - |
10m |
{$PG.SLOW_QUERIES.MAX.WARN} | - |
5 |
{$PG.TRANS_ACTIVE.MAX.WARN} | - |
30s |
{$PG.TRANS_IDLE.MAX.WARN} | - |
30s |
{$PG.TRANS_WAIT.MAX.WARN} | - |
30s |
{$PG.USER} | - |
zbx_monitor |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Database discovery | - |
ZABBIX_PASSIVE | pgsql.discovery.db["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] Filter: - {#DBNAME} MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
PostgreSQL | Bgwriter: Buffers allocated per second | Number of buffers allocated |
DEPENDENT | pgsql.bgwriter.buffers_alloc.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Bgwriter: Buffers written directly by a backend per second | Number of buffers written directly by a backend |
DEPENDENT | pgsql.bgwriter.buffers_backend.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Bgwriter: Buffers backend fsync per second | Number of times a backend had to execute its own fsync call (normally the background writer handles those even when the backend does its own write) |
DEPENDENT | pgsql.bgwriter.buffers_backend_fsync.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Bgwriter: Buffers written during checkpoints per second | Number of buffers written during checkpoints |
DEPENDENT | pgsql.bgwriter.buffers_checkpoint.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Bgwriter: Buffers written by the background writer per second | Number of buffers written by the background writer |
DEPENDENT | pgsql.bgwriter.buffers_clean.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Bgwriter: Requested checkpoints per second | Number of requested checkpoints that have been performed |
DEPENDENT | pgsql.bgwriter.checkpoints_req.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Bgwriter: Scheduled checkpoints per second | Number of scheduled checkpoints that have been performed |
DEPENDENT | pgsql.bgwriter.checkpoints_timed.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Bgwriter: Checkpoint sync time | Total amount of time that has been spent in the portion of checkpoint processing where files are synchronized to disk |
DEPENDENT | pgsql.bgwriter.checkpoint_sync_time Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
PostgreSQL | Bgwriter: Checkpoint write time | Total amount of time that has been spent in the portion of checkpoint processing where files are written to disk, in milliseconds |
DEPENDENT | pgsql.bgwriter.checkpoint_write_time Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
PostgreSQL | Bgwriter: Max written per second | Number of times the background writer stopped a cleaning scan because it had written too many buffers |
DEPENDENT | pgsql.bgwriter.maxwritten_clean.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | Status: Cache hit ratio % | Cache hit ratio |
ZABBIX_PASSIVE | pgsql.cache.hit["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
PostgreSQL | Status: Config hash | PostgreSQL configuration hash |
ZABBIX_PASSIVE | pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
PostgreSQL | Connections sum: Active | Total number of connections executing a query |
DEPENDENT | pgsql.connections.sum.active Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Idle | Total number of connections waiting for a new client command |
DEPENDENT | pgsql.connections.sum.idle Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Idle in transaction | Total number of connections in a transaction state, but not executing a query |
DEPENDENT | pgsql.connections.sum.idle_in_transaction Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Prepared | Total number of prepared transactions https://www.postgresql.org/docs/current/sql-prepare-transaction.html |
DEPENDENT | pgsql.connections.sum.prepared Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Total | Total number of connections |
DEPENDENT | pgsql.connections.sum.total Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Total % | Total number of connections in percentage |
DEPENDENT | pgsql.connections.sum.total_pct Preprocessing: - JSONPATH: |
PostgreSQL | Connections sum: Waiting | Total number of waiting connections https://www.postgresql.org/docs/current/monitoring-stats.html#WAIT-EVENT-TABLE |
DEPENDENT | pgsql.connections.sum.waiting Preprocessing: - JSONPATH: |
PostgreSQL | Status: Ping time | - |
ZABBIX_PASSIVE | pgsql.ping.time["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] Preprocessing: - REGEX: - MULTIPLIER: |
PostgreSQL | Status: Ping | - |
ZABBIX_PASSIVE | pgsql.ping["{$PG.HOST}","{$PG.PORT}"] Preprocessing: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
PostgreSQL | Replication: standby count | Number of standby servers |
ZABBIX_PASSIVE | pgsql.replication.count["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
PostgreSQL | Replication: lag in seconds | Replication lag with Master in seconds |
ZABBIX_PASSIVE | pgsql.replication.lag.sec["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
PostgreSQL | Replication: recovery role | Replication role: 1 — recovery is still in progress (standby mode), 0 — master mode. |
ZABBIX_PASSIVE | pgsql.replication.recovery_role["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
PostgreSQL | Replication: status | Replication status: 0 — streaming is down, 1 — streaming is up, 2 — master mode |
ZABBIX_PASSIVE | pgsql.replication.status["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
PostgreSQL | Transactions: Max active transaction time | Current max active transaction time |
DEPENDENT | pgsql.transactions.active Preprocessing: - JSONPATH: |
PostgreSQL | Transactions: Max idle transaction time | Current max idle transaction time |
DEPENDENT | pgsql.transactions.idle Preprocessing: - JSONPATH: |
PostgreSQL | Transactions: Max prepared transaction time | Current max prepared transaction time |
DEPENDENT | pgsql.transactions.prepared Preprocessing: - JSONPATH: |
PostgreSQL | Transactions: Max waiting transaction time | Current max waiting transaction time |
DEPENDENT | pgsql.transactions.waiting Preprocessing: - JSONPATH: |
PostgreSQL | Status: Uptime | - |
ZABBIX_PASSIVE | pgsql.uptime["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
PostgreSQL | Status: Version | PostgreSQL version |
ZABBIX_PASSIVE | pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
PostgreSQL | WAL: Segments count | Number of WAL segments |
DEPENDENT | pgsql.wal.count Preprocessing: - JSONPATH: |
PostgreSQL | WAL: Bytes written | WAL write in bytes |
DEPENDENT | pgsql.wal.write Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Database size | Database size |
ZABBIX_PASSIVE | pgsql.db.size["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}","{#DBNAME}"] |
PostgreSQL | DB {#DBNAME}: Blocks hit per second | Total number of times disk blocks were found already in the buffer cache, so that a read was not necessary |
DEPENDENT | pgsql.dbstat.blks_hit.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Disk blocks read per second | Total number of disk blocks read in this database |
DEPENDENT | pgsql.dbstat.blks_read.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Detected conflicts per second | Total number of queries canceled due to conflicts with recovery in this database |
DEPENDENT | pgsql.dbstat.conflicts.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Detected deadlocks per second | Total number of detected deadlocks in this database |
DEPENDENT | pgsql.dbstat.deadlocks.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Temp_bytes written per second | Total amount of data written to temporary files by queries in this database |
DEPENDENT | pgsql.dbstat.temp_bytes.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Temp_files created per second | Total number of temporary files created by queries in this database |
DEPENDENT | pgsql.dbstat.temp_files.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Tuples deleted per second | Total number of rows deleted by queries in this database |
DEPENDENT | pgsql.dbstat.tup_deleted.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Tuples fetched per second | Total number of rows fetched by queries in this database |
DEPENDENT | pgsql.dbstat.tup_fetched.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Tuples inserted per second | Total number of rows inserted by queries in this database |
DEPENDENT | pgsql.dbstat.tup_inserted.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Tuples returned per second | Total number of rows returned by queries in this database |
DEPENDENT | pgsql.dbstat.tup_returned.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Tuples updated per second | Total number of rows updated by queries in this database |
DEPENDENT | pgsql.dbstat.tup_updated.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Commits per second | Number of transactions in this database that have been committed |
DEPENDENT | pgsql.dbstat.xact_commit.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Rollbacks per second | Total number of transactions in this database that have been rolled back |
DEPENDENT | pgsql.dbstat.xact_rollback.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Frozen XID before autovacuum % | Preventing Transaction ID Wraparound Failures https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND |
DEPENDENT | pgsql.frozenxid.prc_before_av["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Frozen XID before stop % | Preventing Transaction ID Wraparound Failures https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND |
DEPENDENT | pgsql.frozenxid.prc_before_stop["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Locks total | Total number of locks in the database |
DEPENDENT | pgsql.locks.total["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries slow maintenance count | Slow maintenance query count |
DEPENDENT | pgsql.queries.mro.slow_count["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries max maintenance time | Max maintenance query time |
DEPENDENT | pgsql.queries.mro.time_max["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries sum maintenance time | Sum maintenance query time |
DEPENDENT | pgsql.queries.mro.time_sum["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries slow query count | Slow query count |
DEPENDENT | pgsql.queries.query.slow_count["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries max query time | Max query time |
DEPENDENT | pgsql.queries.query.time_max["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries sum query time | Sum query time |
DEPENDENT | pgsql.queries.query.time_sum["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries slow transaction count | Slow transaction query count |
DEPENDENT | pgsql.queries.tx.slow_count["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries max transaction time | Max transaction query time |
DEPENDENT | pgsql.queries.tx.time_max["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Queries sum transaction time | Sum transaction query time |
DEPENDENT | pgsql.queries.tx.time_sum["{#DBNAME}"] Preprocessing: - JSONPATH: |
PostgreSQL | DB {#DBNAME}: Index scans per second | Number of index scans in the database |
DEPENDENT | pgsql.scans.idx.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
PostgreSQL | DB {#DBNAME}: Sequential scans per second | Number of sequential scans in the database |
DEPENDENT | pgsql.scans.seq.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Zabbix raw items | PostgreSQL: Get bgwriter | Statistics about the background writer process's activity |
ZABBIX_PASSIVE | pgsql.bgwriter["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Zabbix raw items | PostgreSQL: Get connections sum | Collect all metrics from pg_stat_activity https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW |
ZABBIX_PASSIVE | pgsql.connections.sum["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Zabbix raw items | PostgreSQL: Get dbstat | Collect all metrics from pg_stat_database per database https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-DATABASE-VIEW |
ZABBIX_PASSIVE | pgsql.dbstat["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Zabbix raw items | PostgreSQL: Get locks | Collect all metrics from pg_locks per database https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-TABLES |
ZABBIX_PASSIVE | pgsql.locks["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Zabbix raw items | PostgreSQL: Get queries | Collect all metrics by query execution time |
ZABBIX_PASSIVE | pgsql.queries["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}","{$PG.QUERY_ETIME.MAX.WARN}"] |
Zabbix raw items | PostgreSQL: Get transactions | Collect metrics by transaction execution time |
ZABBIX_PASSIVE | pgsql.transactions["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Zabbix raw items | PostgreSQL: Get WAL | Master item to collect WAL metrics |
ZABBIX_PASSIVE | pgsql.wal.stat["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Zabbix raw items | DB {#DBNAME}: Get frozen XID | - |
ZABBIX_PASSIVE | pgsql.frozenxid["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"] |
Zabbix raw items | DB {#DBNAME}: Get scans | Number of scans done for table/index in the database |
ZABBIX_PASSIVE | pgsql.scans["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"] |
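The connection and transaction items above are derived from the pg_stat_activity view by the "Get connections sum" and "Get transactions" master items. A query of roughly this shape illustrates where the waiting-connections count comes from (a sketch for orientation only; the template's bundled SQL differs in detail):

```sql
-- Sketch: count backends that are currently waiting on some event.
-- wait_event is NULL when a backend is not waiting.
SELECT count(*) AS waiting
FROM pg_stat_activity
WHERE wait_event IS NOT NULL;
```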
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
PostgreSQL: Required checkpoints occur too frequently | Checkpoints are points in the sequence of transactions at which it is guaranteed that the heap and index data files have been updated with all information written before that checkpoint. At checkpoint time, all dirty data pages are flushed to disk and a special checkpoint record is written to the log file. https://www.postgresql.org/docs/current/wal-configuration.html |
last(/PostgreSQL by Zabbix agent/pgsql.bgwriter.checkpoints_req.rate) > {$PG.CHECKPOINTS_REQ.MAX.WARN} |
AVERAGE | |
PostgreSQL: Cache hit ratio too low | - |
max(/PostgreSQL by Zabbix agent/pgsql.cache.hit["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m) < {$PG.CACHE_HITRATIO.MIN.WARN} |
WARNING | |
PostgreSQL: Configuration has changed | - |
last(/PostgreSQL by Zabbix agent/pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#1)<>last(/PostgreSQL by Zabbix agent/pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#2) and length(last(/PostgreSQL by Zabbix agent/pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]))>0 |
INFO | |
PostgreSQL: Total number of connections is too high | - |
min(/PostgreSQL by Zabbix agent/pgsql.connections.sum.total_pct,5m) > {$PG.CONN_TOTAL_PCT.MAX.WARN} |
AVERAGE | |
PostgreSQL: Response too long | - |
min(/PostgreSQL by Zabbix agent/pgsql.ping.time["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m) > {$PG.PING_TIME.MAX.WARN} |
AVERAGE | Depends on: - PostgreSQL: Service is down |
PostgreSQL: Service is down | - |
last(/PostgreSQL by Zabbix agent/pgsql.ping["{$PG.HOST}","{$PG.PORT}"]) = 0 |
HIGH | |
PostgreSQL: Streaming lag with {#MASTER} is too high | - |
min(/PostgreSQL by Zabbix agent/pgsql.replication.lag.sec["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m) > {$PG.REPL_LAG.MAX.WARN} |
AVERAGE | |
PostgreSQL: Replication is down | - |
max(/PostgreSQL by Zabbix agent/pgsql.replication.status["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m)=0 |
AVERAGE | |
PostgreSQL: Service has been restarted | PostgreSQL uptime is less than 10 minutes |
last(/PostgreSQL by Zabbix agent/pgsql.uptime["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]) < 10m |
INFO | |
PostgreSQL: Version has changed | - |
last(/PostgreSQL by Zabbix agent/pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#1)<>last(/PostgreSQL by Zabbix agent/pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#2) and length(last(/PostgreSQL by Zabbix agent/pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]))>0 |
INFO | |
DB {#DBNAME}: Too many recovery conflicts | The primary and standby servers are in many ways loosely connected. Actions on the primary will have an effect on the standby. As a result, there is potential for negative interactions or conflicts between them. https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT |
min(/PostgreSQL by Zabbix agent/pgsql.dbstat.conflicts.rate["{#DBNAME}"],5m) > {$PG.CONFLICTS.MAX.WARN:"{#DBNAME}"} |
AVERAGE | |
DB {#DBNAME}: Deadlock occurred | - |
min(/PostgreSQL by Zabbix agent/pgsql.dbstat.deadlocks.rate["{#DBNAME}"],5m) > {$PG.DEADLOCKS.MAX.WARN:"{#DBNAME}"} |
HIGH | |
DB {#DBNAME}: VACUUM FREEZE is required to prevent wraparound | Preventing Transaction ID Wraparound Failures https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND |
last(/PostgreSQL by Zabbix agent/pgsql.frozenxid.prc_before_stop["{#DBNAME}"])<{$PG.FROZENXID_PCT_STOP.MIN.HIGH:"{#DBNAME}"} |
AVERAGE | |
DB {#DBNAME}: Number of locks is too high | - |
min(/PostgreSQL by Zabbix agent/pgsql.locks.total["{#DBNAME}"],5m)>{$PG.LOCKS.MAX.WARN:"{#DBNAME}"} |
WARNING | |
DB {#DBNAME}: Too many slow queries | - |
min(/PostgreSQL by Zabbix agent/pgsql.queries.query.slow_count["{#DBNAME}"],5m)>{$PG.SLOW_QUERIES.MAX.WARN:"{#DBNAME}"} |
WARNING | |
PostgreSQL: Failed to get items | Zabbix has not received data for items for the last 30 minutes |
nodata(/PostgreSQL by Zabbix agent/pgsql.bgwriter["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],30m) = 1 |
WARNING | Depends on: - PostgreSQL: Service is down |
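The "VACUUM FREEZE is required to prevent wraparound" trigger above fires when the remaining transaction ID headroom drops below a threshold. A query of roughly this shape shows how far each database has progressed toward a forced autovacuum (an illustrative sketch, not the template's exact SQL):

```sql
-- Sketch: per-database XID age relative to autovacuum_freeze_max_age.
SELECT datname,
       age(datfrozenxid) AS xid_age,
       round(100.0 * age(datfrozenxid)
             / current_setting('autovacuum_freeze_max_age')::int, 2)
         AS pct_of_forced_autovacuum_limit
FROM pg_database
ORDER BY xid_age DESC;
```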
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template, or ask for help with it on the Zabbix forums.
For Zabbix version: 6.2 and higher. The template is developed to monitor a single Oracle Database instance via ODBC.
This template was tested on:
See Zabbix template operation for basic instructions.
Create an Oracle DB user for monitoring:
CREATE USER zabbix_mon IDENTIFIED BY <PASSWORD>;
-- Grant access to the zabbix_mon user.
GRANT CONNECT, CREATE SESSION TO zabbix_mon;
GRANT SELECT_CATALOG_ROLE TO zabbix_mon;
GRANT SELECT ON v_$instance TO zabbix_mon;
GRANT SELECT ON v_$database TO zabbix_mon;
GRANT SELECT ON v_$sysmetric TO zabbix_mon;
GRANT SELECT ON v_$system_parameter TO zabbix_mon;
GRANT SELECT ON v_$session TO zabbix_mon;
GRANT SELECT ON v_$recovery_file_dest TO zabbix_mon;
GRANT SELECT ON v_$active_session_history TO zabbix_mon;
GRANT SELECT ON v_$osstat TO zabbix_mon;
GRANT SELECT ON v_$restore_point TO zabbix_mon;
GRANT SELECT ON v_$process TO zabbix_mon;
GRANT SELECT ON v_$datafile TO zabbix_mon;
GRANT SELECT ON v_$pgastat TO zabbix_mon;
GRANT SELECT ON v_$sgastat TO zabbix_mon;
GRANT SELECT ON v_$log TO zabbix_mon;
GRANT SELECT ON v_$archive_dest TO zabbix_mon;
GRANT SELECT ON v_$asm_diskgroup TO zabbix_mon;
GRANT SELECT ON sys.dba_data_files TO zabbix_mon;
GRANT SELECT ON DBA_TABLESPACES TO zabbix_mon;
GRANT SELECT ON DBA_TABLESPACE_USAGE_METRICS TO zabbix_mon;
GRANT SELECT ON DBA_USERS TO zabbix_mon;
Note! Ensure that ODBC connects to Oracle with the session parameter NLS_NUMERIC_CHARACTERS='.,'. This is important for displaying float numbers in Zabbix correctly.
Install the ODBC driver on Zabbix server or Zabbix proxy. See the Oracle documentation for instructions.
Configure Zabbix server or Zabbix proxy to use the Oracle environment:
Edit or add a new file:
/etc/sysconfig/zabbix-server # for server
/etc/sysconfig/zabbix-proxy # for proxy
Then, add:
export ORACLE_HOME=/usr/lib/oracle/19.6/client64
export PATH=$PATH:$ORACLE_HOME/bin
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/usr/lib64:/usr/lib:$ORACLE_HOME/bin
export TNS_ADMIN=$ORACLE_HOME/network/admin
Restart Zabbix server or Zabbix proxy.
Set the username and password in the host macros ({$ORACLE.USER} and {$ORACLE.PASSWORD}).
Set {$ORACLE.DRIVER} and {$ORACLE.SERVICE} in the host macros. {$ORACLE.DRIVER} is the path to the driver location in the OS. The "Service's TCP port state" item uses the {HOST.CONN} and {$ORACLE.PORT} macros to check the availability of the listener.
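With concrete values substituted for the macros, the connection string that the db.odbc items assemble looks roughly like this (the driver path, host, and service name below are illustrative examples only, not values shipped with the template):

```
Driver=/usr/lib/oracle/19.6/client64/lib/libsqora.so.19.1;DBQ=//db.example.com:1521/ORCL;
```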
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$ORACLE.ASM.USED.PCT.MAX.HIGH} | The maximum percentage of used Automatic Storage Management (ASM) disk group for a high trigger expression. |
95 |
{$ORACLE.ASM.USED.PCT.MAX.WARN} | The maximum percentage of used ASM disk group for a warning trigger expression. |
90 |
{$ORACLE.CONCURRENCY.MAX.WARN} | The maximum percentage of sessions concurrency usage for a trigger expression. |
80 |
{$ORACLE.DB.FILE.MAX.WARN} | The maximum percentage of used database files for a trigger expression. |
80 |
{$ORACLE.DBNAME.MATCHES} | This macro is used in database discovery. It can be overridden on host level or its linked template level. |
.* |
{$ORACLE.DBNAME.NOT_MATCHES} | This macro is used in database discovery. It can be overridden on host level or its linked template level. |
PDB\$SEED |
{$ORACLE.DRIVER} | The Oracle driver path. For example: |
<Put path to oracle driver here> |
{$ORACLE.EXPIRE.PASSWORD.MIN.WARN} | The number of warning days before the password expires for a trigger expression. |
7 |
{$ORACLE.PASSWORD} | The Oracle user's password. |
<Put your password here> |
{$ORACLE.PGA.USE.MAX.WARN} | Alert threshold for the maximum percentage of the Program Global Area (PGA) usage for a trigger expression. |
90 |
{$ORACLE.PORT} | Oracle DB TCP port. |
1521 |
{$ORACLE.PROCESSES.MAX.WARN} | Alert threshold for the maximum percentage of active processes for a trigger expression. |
80 |
{$ORACLE.REDO.MIN.WARN} | Alert threshold for the minimum number of REDO logs for a trigger expression. |
3 |
{$ORACLE.SERVICE} | Oracle Service Name. |
<Put oracle service name here> |
{$ORACLE.SESSION.LOCK.MAX.TIME} | The maximum duration of the session lock in seconds to count the session as a prolongedly locked query. |
600 |
{$ORACLE.SESSION.LONG.LOCK.MAX.WARN} | Alert threshold for the maximum number of the prolongedly locked sessions for a trigger expression. |
3 |
{$ORACLE.SESSIONS.LOCK.MAX.WARN} | Alert threshold for the maximum percentage of locked sessions for a trigger expression. |
20 |
{$ORACLE.SESSIONS.MAX.WARN} | Alert threshold for the maximum percentage of active sessions for a trigger expression. |
80 |
{$ORACLE.SHARED.FREE.MIN.WARN} | Alert threshold for the minimum percentage of free shared pool for a trigger expression. |
5 |
{$ORACLE.TABLESPACE.NAME.MATCHES} | This macro is used in tablespace discovery. It can be overridden on host level or its linked template level. |
.* |
{$ORACLE.TABLESPACE.NAME.NOT_MATCHES} | This macro is used in tablespace discovery. It can be overridden on host level or its linked template level. |
CHANGE_IF_NEEDED |
{$ORACLE.TBS.USED.PCT.MAX.HIGH} | High severity alert threshold for the maximum percentage of tablespace usage (used bytes/allocated bytes) for a trigger expression. |
95 |
{$ORACLE.TBS.USED.PCT.MAX.WARN} | Warning severity alert threshold for the maximum percentage of tablespace usage (used bytes/allocated bytes) for a trigger expression. |
90 |
{$ORACLE.TBS.UTIL.PCT.MAX.HIGH} | High severity alert threshold for the maximum percentage of tablespace utilization (allocated bytes/max bytes) for a trigger expression. |
90 |
{$ORACLE.TBS.UTIL.PCT.MAX.WARN} | Warning severity alert threshold for the maximum percentage of tablespace utilization (allocated bytes/max bytes) for a trigger expression. |
80 |
{$ORACLE.USER} | Oracle username. |
<Put your username here> |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Archive log discovery | Destinations of the log archive. |
ODBC | db.odbc.discovery[archivelog,,"Driver={$ORACLE.DRIVER};DBQ=//{HOST.CONN}:{$ORACLE.PORT}/{$ORACLE.SERVICE};"] |
ASM disk groups discovery | The ASM disk groups. |
ODBC | db.odbc.discovery[asm,,"Driver={$ORACLE.DRIVER};DBQ=//{HOST.CONN}:{$ORACLE.PORT}/{$ORACLE.SERVICE};"] |
Database discovery | Scanning databases in the database management system (DBMS). |
ODBC | db.odbc.discovery[dblist,,"Driver={$ORACLE.DRIVER};DBQ=//{HOST.CONN}:{$ORACLE.PORT}/{$ORACLE.SERVICE};"] Filter: AND - {#DBNAME} MATCHES_REGEX {$ORACLE.DBNAME.MATCHES} - {#DBNAME} NOT_MATCHES_REGEX {$ORACLE.DBNAME.NOT_MATCHES} |
PDB discovery | Scanning a pluggable database (PDB) in DBMS. |
ODBC | db.odbc.discovery[pdblist,,"Driver={$ORACLE.DRIVER};DBQ=//{HOST.CONN}:{$ORACLE.PORT}/{$ORACLE.SERVICE};"] Filter: AND - {#DBNAME} MATCHES_REGEX {$ORACLE.DBNAME.MATCHES} - {#DBNAME} NOT_MATCHES_REGEX {$ORACLE.DBNAME.NOT_MATCHES} |
Tablespace discovery | Scanning tablespaces in DBMS. |
ODBC | db.odbc.discovery[tbsname,,"Driver={$ORACLE.DRIVER};DBQ=//{HOST.CONN}:{$ORACLE.PORT}/{$ORACLE.SERVICE};"] Filter: AND - {#TABLESPACE} MATCHES_REGEX {$ORACLE.TABLESPACE.NAME.MATCHES} - {#TABLESPACE} NOT_MATCHES_REGEX {$ORACLE.TABLESPACE.NAME.NOT_MATCHES} |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Oracle | Oracle: Service's TCP port state | It checks the availability of Oracle on the TCP port. |
ZABBIX_PASSIVE | net.tcp.service[tcp,{HOST.CONN},{$ORACLE.PORT}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Oracle | Oracle: Number of LISTENER processes | The number of running LISTENER processes. |
ZABBIX_PASSIVE | proc.num[,,,"tnslsnr LISTENER"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Oracle | Oracle: Version | The Oracle Server version. |
DEPENDENT | oracle.version Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Oracle | Oracle: Uptime | The Oracle instance uptime expressed in seconds. |
DEPENDENT | oracle.uptime Preprocessing: - JSONPATH: |
Oracle | Oracle: Instance status | The status of the instance. |
DEPENDENT | oracle.instance_status Preprocessing: - JSONPATH: |
Oracle | Oracle: Archiver state | The status of automatic archiving. |
DEPENDENT | oracle.archiver_state Preprocessing: - JSONPATH: |
Oracle | Oracle: Instance name | The name of an instance. |
DEPENDENT | oracle.instance_name Preprocessing: - JSONPATH: |
Oracle | Oracle: Instance hostname | The name of the host machine. |
DEPENDENT | oracle.instance_hostname Preprocessing: - JSONPATH: |
Oracle | Oracle: Instance role | It indicates whether the instance is an active instance or an inactive secondary instance. |
DEPENDENT | oracle.instance.role Preprocessing: - JSONPATH: |
Oracle | Oracle: Sessions limit | The user and system sessions. |
DEPENDENT | oracle.session_limit Preprocessing: - JSONPATH: |
Oracle | Oracle: Datafiles limit | The maximum allowable number of datafiles. |
DEPENDENT | oracle.db_files_limit Preprocessing: - JSONPATH: |
Oracle | Oracle: Processes limit | The maximum number of user processes. |
DEPENDENT | oracle.processes_limit Preprocessing: - JSONPATH: |
Oracle | Oracle: Number of processes | - |
DEPENDENT | oracle.processes_count Preprocessing: - JSONPATH: |
Oracle | Oracle: Datafiles count | The current number of datafiles. |
DEPENDENT | oracle.db_files_count Preprocessing: - JSONPATH: |
Oracle | Oracle: Buffer cache hit ratio | The ratio of buffer cache hits ((LogRead - PhyRead)/LogRead). |
DEPENDENT | oracle.buffer_cache_hit_ratio Preprocessing: - JSONPATH: |
Oracle | Oracle: Cursor cache hit ratio | The ratio of cursor cache hits (CursorCacheHit/SoftParse). |
DEPENDENT | oracle.cursor_cache_hit_ratio Preprocessing: - JSONPATH: |
Oracle | Oracle: Library cache hit ratio | The ratio of library cache hits (Hits/Pins). |
DEPENDENT | oracle.library_cache_hit_ratio Preprocessing: - JSONPATH: |
Oracle | Oracle: Shared pool free % | Free memory of a shared pool expressed in %. |
DEPENDENT | oracle.shared_pool_free Preprocessing: - JSONPATH: |
Oracle | Oracle: Physical reads per second | Reads per second. |
DEPENDENT | oracle.physical_reads_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Physical writes per second | Writes per second. |
DEPENDENT | oracle.physical_writes_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Physical reads bytes per second | Read bytes per second. |
DEPENDENT | oracle.physical_read_bytes_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Physical writes bytes per second | Write bytes per second. |
DEPENDENT | oracle.physical_write_bytes_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Enqueue timeouts per second | Enqueue timeouts per second. |
DEPENDENT | oracle.enqueue_timeouts_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: GC CR block received per second | The global cache (GC) and the consistent read (CR) block received per second. |
DEPENDENT | oracle.gc_cr_block_received_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Global cache blocks corrupted | The number of blocks that encountered corruption or checksum failure during the interconnect. |
DEPENDENT | oracle.cache_blocks_corrupt Preprocessing: - JSONPATH: |
Oracle | Oracle: Global cache blocks lost | The number of lost global cache blocks. |
DEPENDENT | oracle.cache_blocks_lost Preprocessing: - JSONPATH: |
Oracle | Oracle: Logons per second | The number of logon attempts. |
DEPENDENT | oracle.logons_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Average active sessions | The average active sessions at a point in time. The number of sessions that are either working or waiting. |
DEPENDENT | oracle.active_sessions Preprocessing: - JSONPATH: |
Oracle | Oracle: Session count | The count of sessions. |
DEPENDENT | oracle.session_count Preprocessing: - JSONPATH: |
Oracle | Oracle: Active user sessions | The number of active user sessions. |
DEPENDENT | oracle.session_active_user Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: Active background sessions | The number of active background sessions. |
DEPENDENT | oracle.session_active_background Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: Inactive user sessions | The number of inactive user sessions. |
DEPENDENT | oracle.session_inactive_user Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: Sessions lock rate | The percentage of locked sessions. Locks are mechanisms that prevent destructive interaction between transactions accessing the same resource — either user objects, such as tables and rows, or system objects not visible to users, such as shared data structures in memory and data dictionary rows. |
DEPENDENT | oracle.session_lock_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Sessions locked over {$ORACLE.SESSION.LOCK.MAX.TIME}s | The count of prolongedly locked sessions. (The maximum session lock duration in seconds can be changed with the {$ORACLE.SESSION.LOCK.MAX.TIME} macro; the default is 600 seconds.) |
DEPENDENT | oracle.session_long_time_locked Preprocessing: - JSONPATH: |
Oracle | Oracle: Sessions concurrency | The percentage of concurrency. Concurrency arises when different transactions request changes to the same resource. A data-modifying transaction temporarily holds an exclusive right to change the data while the remaining transactions wait for access. When access to a resource stays locked for a long time, concurrency grows (like a transaction queue), which often has an extremely negative impact on performance. A high contention value does not indicate the root cause of the problem but is a signal to search for one. |
DEPENDENT | oracle.session_concurrency_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: User '{$ORACLE.USER}' expire password | The number of days before the password of Zabbix account expires. |
DEPENDENT | oracle.user_expire_password Preprocessing: - JSONPATH: |
Oracle | Oracle: Active serial sessions | The number of active serial sessions. |
DEPENDENT | oracle.active_serial_sessions Preprocessing: - JSONPATH: |
Oracle | Oracle: Active parallel sessions | The number of active parallel sessions. |
DEPENDENT | oracle.active_parallel_sessions Preprocessing: - JSONPATH: |
Oracle | Oracle: Long table scans per second | The number of long table scans per second. A table is considered 'long' if the table is not cached and if its high-water mark is greater than five blocks. |
DEPENDENT | oracle.long_table_scans_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: SQL service response time | The Structured Query Language (SQL) service response time expressed in seconds. |
DEPENDENT | oracle.service_response_time Preprocessing: - JSONPATH: - MULTIPLIER: |
Oracle | Oracle: User rollbacks per second | The number of times that users manually issue the ROLLBACK statement or an error occurred during the users' transactions. |
DEPENDENT | oracle.user_rollbacks_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Total sorts per user call | The total sorts per user call. |
DEPENDENT | oracle.sorts_per_user_call Preprocessing: - JSONPATH: |
Oracle | Oracle: Rows per sort | The average number of rows per sort for all types of sorts performed. |
DEPENDENT | oracle.rows_per_sort Preprocessing: - JSONPATH: |
Oracle | Oracle: Disk sort per second | The number of sorts going to disk per second. |
DEPENDENT | oracle.disk_sorts Preprocessing: - JSONPATH: |
Oracle | Oracle: Memory sorts ratio | The percentage of sorts (from ORDER BY clauses or index building) that are done to disk vs in-memory. |
DEPENDENT | oracle.memory_sorts_ratio Preprocessing: - JSONPATH: |
Oracle | Oracle: Database wait time ratio | Wait time - the time that the server process spends waiting for available shared resources to be released by other server processes, such as latches, locks, data buffers, etc. |
DEPENDENT | oracle.database_wait_time_ratio Preprocessing: - JSONPATH: |
Oracle | Oracle: Database CPU time ratio | It is calculated by dividing the total CPU (used by the database) by the Oracle time model statistic DB time. |
DEPENDENT | oracle.database_cpu_time_ratio Preprocessing: - JSONPATH: |
Oracle | Oracle: Temp space used | Used temporary space. |
DEPENDENT | oracle.temp_space_used Preprocessing: - JSONPATH: |
Oracle | Oracle: PGA, Total inuse | It indicates how much of the Program Global Area (PGA) memory is currently consumed by work areas. This number can be used to determine how much memory is consumed by other consumers of the PGA memory (for example, PL/SQL or Java). |
DEPENDENT | oracle.total_pga_used Preprocessing: - JSONPATH: |
Oracle | Oracle: PGA, Aggregate target parameter | The current value of the PGA_AGGREGATE_TARGET initialization parameter. If this parameter is not set, then its value is 0 and automatic management of the PGA memory is disabled. |
DEPENDENT | oracle.pga_target Preprocessing: - JSONPATH: |
Oracle | Oracle: PGA, Total allocated | The current amount of the PGA memory allocated by the instance. The Oracle Database attempts to keep this number below the value of the PGA_AGGREGATE_TARGET initialization parameter. However, it is possible for the PGA allocated to exceed that value by a small percentage and for a short period of time when the work area workload is increasing very rapidly or when PGA_AGGREGATE_TARGET is set to a small value. |
DEPENDENT | oracle.total_pga_allocated Preprocessing: - JSONPATH: |
Oracle | Oracle: PGA, Total freeable | The number of bytes of the PGA memory in all processes that could be freed back to the operating system. |
DEPENDENT | oracle.total_pga_freeable Preprocessing: - JSONPATH: |
Oracle | Oracle: PGA, Global memory bound | The maximum size of work area executed in automatic mode. |
DEPENDENT | oracle.pga_global_bound Preprocessing: - JSONPATH: |
Oracle | Oracle: FRA, Space limit | The maximum amount of disk space (in bytes) that the database can use for the Fast Recovery Area (FRA). |
DEPENDENT | oracle.fra_space_limit Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: FRA, Used space | The amount of disk space (in bytes) used by FRA files created in the current and all the previous FRAs. |
DEPENDENT | oracle.fra_space_used Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: FRA, Space reclaimable | The total amount of disk space (in bytes) that can be created by deleting obsolete, redundant, and other low priority files from the FRA. |
DEPENDENT | oracle.fra_space_reclaimable Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: FRA, Number of files | The number of files in the FRA. |
DEPENDENT | oracle.fra_number_of_files Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 |
Oracle | Oracle: FRA, Usable space in % | - |
DEPENDENT | oracle.fra_usable_pct Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: FRA, Number of restore points | - |
DEPENDENT | oracle.fra_restore_point Preprocessing: - JSONPATH: |
Oracle | Oracle: SGA, java pool | The memory is allocated from the Java pool. |
DEPENDENT | oracle.sga_java_pool Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: SGA, large pool | The memory is allocated from a large pool. |
DEPENDENT | oracle.sga_large_pool Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: SGA, shared pool | The memory is allocated from a shared pool. |
DEPENDENT | oracle.sga_shared_pool Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: SGA, log buffer | The number of bytes allocated for the redo log buffer. |
DEPENDENT | oracle.sga_log_buffer Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: SGA, fixed | The fixed System Global Area (SGA) is an internal housekeeping area. |
DEPENDENT | oracle.sga_fixed Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 |
Oracle | Oracle: SGA, buffer cache | The size of standard block cache. |
DEPENDENT | oracle.sga_buffer_cache Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: Redo logs available to switch | The number of inactive/unused redo logs available for log switching. |
DEPENDENT | oracle.redo_logs_available Preprocessing: - JSONPATH: |
Oracle | Oracle Database '{#DBNAME}': Open status | 1 - 'MOUNTED'; 2 - 'READ WRITE'; 3 - 'READ ONLY'; 4 - 'READ ONLY WITH APPLY' (a physical standby database is open in real-time query mode). |
DEPENDENT | oracle.db_open_mode["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Oracle | Oracle Database '{#DBNAME}': Role | The current role of the database where: 1 - 'SNAPSHOT STANDBY'; 2 - 'LOGICAL STANDBY'; 3 - 'PHYSICAL STANDBY'; 4 - 'PRIMARY'; 5 - 'FAR SYNC'. |
DEPENDENT | oracle.db_role["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 15m |
Oracle | Oracle Database '{#DBNAME}': Log mode | The archive log mode where: 0 - 'NOARCHIVELOG'; 1 - 'ARCHIVELOG'; 2 - 'MANUAL'. |
DEPENDENT | oracle.db_log_mode["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Oracle | Oracle Database '{#DBNAME}': Force logging | It indicates whether the database is in force logging mode: 'YES' or 'NO'. |
DEPENDENT | oracle.db_force_logging["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Oracle | Oracle Database '{#DBNAME}': Open status | 1 - 'MOUNTED'; 2 - 'READ WRITE'; 3 - 'READ ONLY'; 4 - 'READ ONLY WITH APPLY' (a physical standby database is open in real-time query mode). |
DEPENDENT | oracle.pdb_open_mode["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Oracle | Oracle TBS '{#TABLESPACE}': Tablespace allocated, bytes | Currently allocated bytes for the tablespace (sum of the current size of datafiles). |
DEPENDENT | oracle.tbs_alloc_bytes["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle TBS '{#TABLESPACE}': Tablespace MAX size, bytes | The maximum size of the tablespace. |
DEPENDENT | oracle.tbs_max_bytes["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle TBS '{#TABLESPACE}': Tablespace used, bytes | Currently used bytes for the tablespace (current size of datafiles - the free space). |
DEPENDENT | oracle.tbs_used_bytes["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle TBS '{#TABLESPACE}': Tablespace free, bytes | Free bytes of the allocated space. |
DEPENDENT | oracle.tbs_free_bytes["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle TBS '{#TABLESPACE}': Tablespace allocated, percent | Allocated bytes/max bytes*100. |
DEPENDENT | oracle.tbs_used_pct["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle TBS '{#TABLESPACE}': Tablespace usage, percent | Used bytes/allocated bytes*100. |
DEPENDENT | oracle.tbs_used_file_pct["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle TBS '{#TABLESPACE}': Open status | The tablespace status where: 1 - 'ONLINE'; 2 - 'OFFLINE'; 3 - 'READ ONLY'. |
DEPENDENT | oracle.tbs_status["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
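The two percentage items above are simple derivations from the byte counters; a minimal sketch of the arithmetic (function names and sample values are illustrative, not part of the template):

```python
def tbs_allocated_pct(alloc_bytes: int, max_bytes: int) -> float:
    """Tablespace allocated, percent: allocated bytes / max bytes * 100."""
    return alloc_bytes / max_bytes * 100

def tbs_usage_pct(used_bytes: int, alloc_bytes: int) -> float:
    """Tablespace usage, percent: used bytes / allocated bytes * 100."""
    return used_bytes / alloc_bytes * 100

GIB = 1024 ** 3
# Example: datafiles currently total 8 GiB of a 32 GiB autoextend limit, 6 GiB used
print(tbs_allocated_pct(8 * GIB, 32 * GIB))  # 25.0
print(tbs_usage_pct(6 * GIB, 8 * GIB))       # 75.0
```

Note that "allocated percent" is measured against the autoextend maximum, while "usage percent" is measured against the space already allocated; the two triggers below use separate macros for each.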
Oracle | Archivelog '{#DEST_NAME}': Error | It displays the error message. |
DEPENDENT | oracle.archivelog_error["{#DEST_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Oracle | Archivelog '{#DEST_NAME}': Last sequence | It identifies the sequence number of the last archived redo log to be archived. |
DEPENDENT | oracle.archivelog_log_sequence["{#DEST_NAME}"] Preprocessing: - JSONPATH: |
Oracle | Archivelog '{#DEST_NAME}': Status | It identifies the current status of the destination where: 1 - 'VALID'; 2 - 'DEFERRED'; 3 - 'ERROR'; 0 - 'UNKNOWN'. |
DEPENDENT | oracle.archivelog_log_status["{#DEST_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Oracle | ASM '{#DGNAME}': Total size | The total size of the ASM disk group. |
DEPENDENT | oracle.asm_total_size["{#DGNAME}"] Preprocessing: - JSONPATH: |
Oracle | ASM '{#DGNAME}': Free size | The free size of the ASM disk group. |
DEPENDENT | oracle.asm_free_size["{#DGNAME}"] Preprocessing: - JSONPATH: |
Oracle | ASM '{#DGNAME}': Used percent | Usage of the ASM disk group expressed in %. |
DEPENDENT | oracle.asm_used_pct["{#DGNAME}"] Preprocessing: - JSONPATH: |
Zabbix raw items | Oracle: Get instance state | The item gets the state of the current instance. |
ODBC | db.odbc.get[get_instance_state,,"Driver={$ORACLE.DRIVER};DBQ=//{HOST.CONN}:{$ORACLE.PORT}/{$ORACLE.SERVICE};"] Expression: The text is too long. Please see the template. |
Zabbix raw items | Oracle: Get system metrics | The item gets the values of the system metrics. |
ODBC | db.odbc.get[get_system_metrics,,"Driver={$ORACLE.DRIVER};DBQ=//{HOST.CONN}:{$ORACLE.PORT}/{$ORACLE.SERVICE};"] Expression: The text is too long. Please see the template. |
Zabbix raw items | Oracle Database '{#DBNAME}': Get CDB and No-CDB info | It gets the information about the container database (CDB) and non-CDB database on an instance. |
ODBC | db.odbc.get[get_cdb_{#DBNAME}_info,,"Driver={$ORACLE.DRIVER};DBQ=//{HOST.CONN}:{$ORACLE.PORT}/{$ORACLE.SERVICE};"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE Expression: The text is too long. Please see the template. |
Zabbix raw items | Oracle Database '{#DBNAME}': Get PDB info | It gets the information about the PDB database on an instance. |
ODBC | db.odbc.get[get_pdb_{#DBNAME}_info,,"Driver={$ORACLE.DRIVER};DBQ=//{HOST.CONN}:{$ORACLE.PORT}/{$ORACLE.SERVICE};"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE Expression: The text is too long. Please see the template. |
Zabbix raw items | Oracle TBS '{#TABLESPACE}': Get tablespaces stats | It gets the statistics of the tablespace. |
ODBC | db.odbc.get[get_tablespace_{#TABLESPACE}_stats,,"Driver={$ORACLE.DRIVER};DBQ=//{HOST.CONN}:{$ORACLE.PORT}/{$ORACLE.SERVICE};"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE Expression: The text is too long. Please see the template. |
Zabbix raw items | Archivelog '{#DEST_NAME}': Get archive log info | It gets the archivelog statistics. |
ODBC | db.odbc.get[get_archivelog_{#DEST_NAME}_stat,,"Driver={$ORACLE.DRIVER};DBQ=//{HOST.CONN}:{$ORACLE.PORT}/{$ORACLE.SERVICE};"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: Expression: The text is too long. Please see the template. |
Zabbix raw items | ASM '{#DGNAME}': Get ASM stats | It gets the ASM disk group statistics. |
ODBC | db.odbc.get[get_asm_{#DGNAME}_stat,,"Driver={$ORACLE.DRIVER};DBQ=//{HOST.CONN}:{$ORACLE.PORT}/{$ORACLE.SERVICE};"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE Expression: The text is too long. Please see the template. |
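The DEPENDENT items in the table above parse the JSON collected by these master `db.odbc.get` items with JSONPath preprocessing; a rough sketch of that extraction step in Python (the payload shape and metric names here are assumptions for illustration, not the template's actual output):

```python
import json

# Hypothetical excerpt of a db.odbc.get master item value
payload = json.loads("""
[
  {"METRIC": "Buffer Cache Hit Ratio", "VALUE": "99.2"},
  {"METRIC": "Total PGA Allocated",    "VALUE": "268435456"}
]
""")

def extract(rows, metric):
    # Mimics a JSONPath filter such as $[?(@.METRIC=='Buffer Cache Hit Ratio')].VALUE
    for row in rows:
        if row["METRIC"] == metric:
            return float(row["VALUE"])
    raise LookupError(metric)

print(extract(payload, "Buffer Cache Hit Ratio"))  # 99.2
```

Because many dependent items share one master item, the whole metric set is fetched from Oracle in a single query and split server-side, which keeps the load on the database low.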
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Oracle: Port {$ORACLE.PORT} is unavailable | The TCP port of the Oracle Server service is currently unavailable. |
max(/Oracle by ODBC/net.tcp.service[tcp,{HOST.CONN},{$ORACLE.PORT}],#3)=0 and max(/Oracle by ODBC/proc.num[,,,"tnslsnr LISTENER"],#3)>0 |
DISASTER | |
Oracle: LISTENER process is not running | - |
max(/Oracle by ODBC/proc.num[,,,"tnslsnr LISTENER"],#3)=0 |
DISASTER | |
Oracle: Version has changed | The Oracle DB version has changed. Acknowledge to close manually. |
last(/Oracle by ODBC/oracle.version,#1)<>last(/Oracle by ODBC/oracle.version,#2) and length(last(/Oracle by ODBC/oracle.version))>0 |
INFO | Manual close: YES |
Oracle: Host has been restarted | The host uptime is less than 10 minutes. |
last(/Oracle by ODBC/oracle.uptime)<10m |
INFO | Manual close: YES |
Oracle: Failed to fetch info data | Zabbix has not received any data for the items for the last 5 minutes. The database might be unavailable for connecting. |
nodata(/Oracle by ODBC/oracle.uptime,5m)=1 |
WARNING | Depends on: - Oracle: Port {$ORACLE.PORT} is unavailable |
Oracle: Instance name has changed | The Oracle DB instance has changed. Ack to close manually. |
last(/Oracle by ODBC/oracle.instance_name,#1)<>last(/Oracle by ODBC/oracle.instance_name,#2) and length(last(/Oracle by ODBC/oracle.instance_name))>0 |
INFO | Manual close: YES |
Oracle: Instance hostname has changed | Oracle DB Instance hostname has changed. Ack to close. |
last(/Oracle by ODBC/oracle.instance_hostname,#1)<>last(/Oracle by ODBC/oracle.instance_hostname,#2) and length(last(/Oracle by ODBC/oracle.instance_hostname))>0 |
INFO | Manual close: YES |
Oracle: Too many active processes | Active processes are using more than {$ORACLE.PROCESSES.MAX.WARN}% of the available number of processes. |
min(/Oracle by ODBC/oracle.processes_count,5m) * 100 / last(/Oracle by ODBC/oracle.processes_limit) > {$ORACLE.PROCESSES.MAX.WARN} |
WARNING | |
Oracle: Too many database files | The number of datafiles is higher than {$ORACLE.DB.FILE.MAX.WARN}% of the available datafiles limit. |
min(/Oracle by ODBC/oracle.db_files_count,5m) * 100 / last(/Oracle by ODBC/oracle.db_files_limit) > {$ORACLE.DB.FILE.MAX.WARN} |
WARNING | |
Oracle: Shared pool free is too low | The free memory percent of the shared pool has been less than {$ORACLE.SHARED.FREE.MIN.WARN}% for the last 5 minutes. |
max(/Oracle by ODBC/oracle.shared_pool_free,5m)<{$ORACLE.SHARED.FREE.MIN.WARN} |
WARNING | |
Oracle: Too many active sessions | Active sessions are using more than {$ORACLE.SESSIONS.MAX.WARN}% of the available sessions. |
min(/Oracle by ODBC/oracle.session_count,5m) * 100 / last(/Oracle by ODBC/oracle.session_limit) > {$ORACLE.SESSIONS.MAX.WARN} |
WARNING | |
Oracle: Too many locked sessions | The number of locked sessions exceeds {$ORACLE.SESSIONS.LOCK.MAX.WARN}% of the running sessions. |
min(/Oracle by ODBC/oracle.session_lock_rate,5m) > {$ORACLE.SESSIONS.LOCK.MAX.WARN} |
WARNING | |
Oracle: Too many sessions locked | The number of sessions locked for longer than {$ORACLE.SESSION.LOCK.MAX.TIME} seconds is too high. Long-term locks can negatively affect database performance. If they are detected, first find the queries that are heaviest from the database's point of view and then analyze possible resource leaks. |
min(/Oracle by ODBC/oracle.session_long_time_locked,5m) > {$ORACLE.SESSION.LONG.LOCK.MAX.WARN} |
WARNING | |
Oracle: Too high database concurrency | The concurrency rate exceeds {$ORACLE.CONCURRENCY.MAX.WARN}%. A high contention value does not indicate the root cause of the problem, but it is a signal to search for it. In the case of high contention, analyze resource consumption: find the "heaviest" queries made in the database and, possibly, trace sessions. This will help to determine the root cause and possible optimization points, both in the database configuration and in the query logic of the application itself. |
min(/Oracle by ODBC/oracle.session_concurrency_rate,5m) > {$ORACLE.CONCURRENCY.MAX.WARN} |
WARNING | |
Oracle: Zabbix account will expire soon | The password for the Zabbix user in the database will expire soon. |
last(/Oracle by ODBC/oracle.user_expire_password) < {$ORACLE.EXPIRE.PASSWORD.MIN.WARN} |
WARNING | |
Oracle: Total PGA inuse is too high | The total PGA in use is more than {$ORACLE.PGA.USE.MAX.WARN}% of PGA_AGGREGATE_TARGET. |
min(/Oracle by ODBC/oracle.total_pga_used,5m) * 100 / last(/Oracle by ODBC/oracle.pga_target) > {$ORACLE.PGA.USE.MAX.WARN} |
WARNING | |
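The trigger above compares PGA consumption against the aggregate target; its arithmetic can be sketched as follows (the helper name and sample values are illustrative only):

```python
def pga_use_pct(total_pga_used: int, pga_target: int) -> float:
    # Mirrors the trigger: min(oracle.total_pga_used,5m) * 100 / last(oracle.pga_target)
    return total_pga_used * 100 / pga_target

MIB = 1024 ** 2
# 1900 MiB in use against a 2048 MiB PGA_AGGREGATE_TARGET
usage = pga_use_pct(1900 * MIB, 2048 * MIB)
print(round(usage, 1))  # 92.8
# The trigger fires when usage exceeds {$ORACLE.PGA.USE.MAX.WARN} (default 90)
print(usage > 90)  # True
```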
Oracle: Number of REDO logs available for switching is too low | The number of inactive/unused REDOs available for log switching is low (database down risk). |
max(/Oracle by ODBC/oracle.redo_logs_available,5m) < {$ORACLE.REDO.MIN.WARN} |
WARNING | |
Oracle Database '{#DBNAME}': Open status in mount mode | The Oracle DB is in a mounted state. |
last(/Oracle by ODBC/oracle.db_open_mode["{#DBNAME}"])=1 |
WARNING | |
Oracle Database '{#DBNAME}': Open status has changed | The Oracle DB open status has changed. Ack to close manually. |
last(/Oracle by ODBC/oracle.db_open_mode["{#DBNAME}"],#1)<>last(/Oracle by ODBC/oracle.db_open_mode["{#DBNAME}"],#2) |
INFO | Manual close: YES Depends on: - Oracle Database '{#DBNAME}': Open status in mount mode |
Oracle Database '{#DBNAME}': Role has changed | The Oracle DB role has changed. Ack to close manually. |
last(/Oracle by ODBC/oracle.db_role["{#DBNAME}"],#1)<>last(/Oracle by ODBC/oracle.db_role["{#DBNAME}"],#2) |
INFO | Manual close: YES |
Oracle Database '{#DBNAME}': Force logging is deactivated for DB with active Archivelog | Force logging mode is a very important setting for databases in 'ARCHIVELOG' mode. It forces the writing of all transactions to the redo log. |
last(/Oracle by ODBC/oracle.db_force_logging["{#DBNAME}"]) = 0 and last(/Oracle by ODBC/oracle.db_log_mode["{#DBNAME}"]) = 1 |
WARNING | |
Oracle Database '{#DBNAME}': Open status in mount mode | The Oracle DB is in a mounted state. |
last(/Oracle by ODBC/oracle.pdb_open_mode["{#DBNAME}"])=1 |
WARNING | |
Oracle Database '{#DBNAME}': Open status has changed | The Oracle DB open status has changed. Ack to close manually. |
last(/Oracle by ODBC/oracle.pdb_open_mode["{#DBNAME}"],#1)<>last(/Oracle by ODBC/oracle.pdb_open_mode["{#DBNAME}"],#2) |
INFO | Manual close: YES |
Oracle TBS '{#TABLESPACE}': Tablespace utilization is too high | - |
min(/Oracle by ODBC/oracle.tbs_used_pct["{#TABLESPACE}"],5m)>{$ORACLE.TBS.USED.PCT.MAX.WARN} |
WARNING | Depends on: - Oracle TBS '{#TABLESPACE}': Tablespace utilization is too high |
Oracle TBS '{#TABLESPACE}': Tablespace utilization is too high | - |
min(/Oracle by ODBC/oracle.tbs_used_pct["{#TABLESPACE}"],5m)>{$ORACLE.TBS.UTIL.PCT.MAX.HIGH} |
HIGH | |
Oracle TBS '{#TABLESPACE}': Tablespace usage is too high | - |
min(/Oracle by ODBC/oracle.tbs_used_file_pct["{#TABLESPACE}"],5m)>{$ORACLE.TBS.USED.PCT.MAX.WARN} |
WARNING | Depends on: - Oracle TBS '{#TABLESPACE}': Tablespace usage is too high |
Oracle TBS '{#TABLESPACE}': Tablespace usage is too high | - |
min(/Oracle by ODBC/oracle.tbs_used_file_pct["{#TABLESPACE}"],5m)>{$ORACLE.TBS.USED.PCT.MAX.HIGH} |
HIGH | |
Oracle TBS '{#TABLESPACE}': Tablespace is OFFLINE | The tablespace is in the offline state. |
last(/Oracle by ODBC/oracle.tbs_status["{#TABLESPACE}"])=2 |
WARNING | |
Oracle TBS '{#TABLESPACE}': Tablespace status has changed | Oracle tablespace status has changed. Ack to close. |
last(/Oracle by ODBC/oracle.tbs_status["{#TABLESPACE}"],#1)<>last(/Oracle by ODBC/oracle.tbs_status["{#TABLESPACE}"],#2) |
INFO | Manual close: YES Depends on: - Oracle TBS '{#TABLESPACE}': Tablespace is OFFLINE |
Archivelog '{#DEST_NAME}': Log Archive is not valid | The trigger will launch if the archive log destination is not in one of these states: 2 - 'DEFERRED'; 3 - 'VALID'. |
last(/Oracle by ODBC/oracle.archivelog_log_status["{#DEST_NAME}"])<2 |
HIGH | |
ASM '{#DGNAME}': Disk group usage is too high | The usage of the ASM disk group expressed in % exceeds {$ORACLE.ASM.USED.PCT.MAX.WARN}. |
min(/Oracle by ODBC/oracle.asm_used_pct["{#DGNAME}"],5m)>{$ORACLE.ASM.USED.PCT.MAX.WARN} |
WARNING | Depends on: - ASM '{#DGNAME}': Disk group usage is too high |
ASM '{#DGNAME}': Disk group usage is too high | The usage of the ASM disk group expressed in % exceeds {$ORACLE.ASM.USED.PCT.MAX.WARN}. |
min(/Oracle by ODBC/oracle.asm_used_pct["{#DGNAME}"],5m)>{$ORACLE.ASM.USED.PCT.MAX.HIGH} |
HIGH |
Please report any issues with the template at https://support.zabbix.com.
For Zabbix version: 6.2 and higher. The template is developed to monitor a single DBMS Oracle Database instance with Zabbix agent 2.
This template was tested on:
See Zabbix template operation for basic instructions.
Test availability:
zabbix_get -s oracle-host -k oracle.ping["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"]
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$ORACLE.ASM.USED.PCT.MAX.HIGH} | The maximum percentage of used Automatic Storage Management (ASM) disk group for a high trigger expression. |
95 |
{$ORACLE.ASM.USED.PCT.MAX.WARN} | The maximum percentage of used ASM disk group for a warning trigger expression. |
90 |
{$ORACLE.CONCURRENCY.MAX.WARN} | The maximum percentage of sessions concurrency usage for a trigger expression. |
80 |
{$ORACLE.CONNSTRING} | - |
tcp://localhost:1521 |
{$ORACLE.DB.FILE.MAX.WARN} | The maximum percentage of used database files for a trigger expression. |
80 |
{$ORACLE.DBNAME.MATCHES} | This macro is used in discovery of the database. It can be overridden on host level or its linked template level. |
.* |
{$ORACLE.DBNAME.NOT_MATCHES} | This macro is used in discovery of the database. It can be overridden on host level or its linked template level. |
PDB\$SEED |
{$ORACLE.EXPIRE.PASSWORD.MIN.WARN} | The number of warning days before the password expires for a trigger expression. |
7 |
{$ORACLE.PASSWORD} | The Oracle user's password. |
zabbix_password |
{$ORACLE.PGA.USE.MAX.WARN} | The maximum percentage of the Program Global Area (PGA) usage that alerts the threshold for a trigger expression. |
90 |
{$ORACLE.PROCESSES.MAX.WARN} | Alert threshold for the maximum percentage of active processes for a trigger expression. |
80 |
{$ORACLE.REDO.MIN.WARN} | Alert threshold for the minimum number of REDO logs for a trigger expression. |
3 |
{$ORACLE.SERVICE} | Oracle Service Name. |
ORA |
{$ORACLE.SESSION.LOCK.MAX.TIME} | The maximum duration of a session lock, in seconds, to count the session as prolongedly locked. |
600 |
{$ORACLE.SESSION.LONG.LOCK.MAX.WARN} | Alert threshold for the maximum number of the prolongedly locked sessions for a trigger expression. |
3 |
{$ORACLE.SESSIONS.LOCK.MAX.WARN} | Alert threshold for the maximum percentage of locked sessions for a trigger expression. |
20 |
{$ORACLE.SESSIONS.MAX.WARN} | Alert threshold for the maximum percentage of active sessions for a trigger expression. |
80 |
{$ORACLE.SHARED.FREE.MIN.WARN} | Alert threshold for the minimum percentage of free shared pool for a trigger expression. |
5 |
{$ORACLE.TABLESPACE.NAME.MATCHES} | This macro is used in tablespace discovery. It can be overridden on host level or its linked template level. |
.* |
{$ORACLE.TABLESPACE.NAME.NOT_MATCHES} | This macro is used in tablespace discovery. It can be overridden on host level or its linked template level. |
CHANGE_IF_NEEDED |
{$ORACLE.TBS.USED.PCT.MAX.HIGH} | High severity alert threshold for the maximum percentage of tablespace usage (used bytes/allocated bytes) for a trigger expression. |
95 |
{$ORACLE.TBS.USED.PCT.MAX.WARN} | Warning severity alert threshold for the maximum percentage of tablespace usage (used bytes/allocated bytes) for a trigger expression. |
90 |
{$ORACLE.TBS.UTIL.PCT.MAX.HIGH} | High severity alert threshold for the maximum percentage of tablespace utilization (allocated bytes/max bytes) for a trigger expression. |
90 |
{$ORACLE.TBS.UTIL.PCT.MAX.WARN} | Warning severity alert threshold for the maximum percentage of tablespace utilization (allocated bytes/max bytes) for a trigger expression. |
80 |
{$ORACLE.USER} | Oracle username. |
zabbix |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Archive log discovery | Destinations of the log archive. |
ZABBIX_PASSIVE | oracle.archive.discovery["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] |
ASM disk groups discovery | The ASM disk groups. |
ZABBIX_PASSIVE | oracle.diskgroups.discovery["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] |
Database discovery | Scanning databases in the database management system (DBMS). |
ZABBIX_PASSIVE | oracle.db.discovery["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] Filter: AND - {#DBNAME} MATCHES_REGEX - {#DBNAME} NOT_MATCHES_REGEX |
PDB discovery | Scanning a pluggable database (PDB) in DBMS. |
ZABBIX_PASSIVE | oracle.pdb.discovery["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] Filter: AND - {#DBNAME} MATCHES_REGEX - {#DBNAME} NOT_MATCHES_REGEX |
Tablespace discovery | Scanning tablespaces in DBMS. |
ZABBIX_PASSIVE | oracle.ts.discovery["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] Filter: AND - {#TABLESPACE} MATCHES_REGEX - {#TABLESPACE} NOT_MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Oracle | Oracle: Ping | Tests the connection to the Oracle Database. |
ZABBIX_PASSIVE | oracle.ping["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Oracle | Oracle: Version | The Oracle Server version. |
DEPENDENT | oracle.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Oracle | Oracle: Uptime | The Oracle instance uptime expressed in seconds. |
DEPENDENT | oracle.uptime Preprocessing: - JSONPATH: |
Oracle | Oracle: Instance status | The status of the instance. |
DEPENDENT | oracle.instance_status Preprocessing: - JSONPATH: |
Oracle | Oracle: Archiver state | The status of automatic archiving. |
DEPENDENT | oracle.archiver_state Preprocessing: - JSONPATH: |
Oracle | Oracle: Instance name | The name of an instance. |
DEPENDENT | oracle.instance_name Preprocessing: - JSONPATH: |
Oracle | Oracle: Instance hostname | The name of the host machine. |
DEPENDENT | oracle.instance_hostname Preprocessing: - JSONPATH: |
Oracle | Oracle: Instance role | It indicates whether the instance is an active instance or an inactive secondary instance. |
DEPENDENT | oracle.instance.role Preprocessing: - JSONPATH: |
Oracle | Oracle: Buffer cache hit ratio | The ratio of buffer cache hits ((LogRead - PhyRead)/LogRead). |
DEPENDENT | oracle.buffer_cache_hit_ratio Preprocessing: - JSONPATH: |
Oracle | Oracle: Cursor cache hit ratio | The ratio of cursor cache hits (CursorCacheHit/SoftParse). |
DEPENDENT | oracle.cursor_cache_hit_ratio Preprocessing: - JSONPATH: |
Oracle | Oracle: Library cache hit ratio | The ratio of library cache hits (Hits/Pins). |
DEPENDENT | oracle.library_cache_hit_ratio Preprocessing: - JSONPATH: |
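The formulas quoted in the three ratio items above can be sketched directly (the counter values are illustrative, not real Oracle statistics):

```python
def buffer_cache_hit_ratio(log_reads: int, phy_reads: int) -> float:
    # (LogRead - PhyRead) / LogRead, expressed as a percentage
    return (log_reads - phy_reads) / log_reads * 100

def cursor_cache_hit_ratio(cursor_cache_hits: int, soft_parses: int) -> float:
    # CursorCacheHit / SoftParse, expressed as a percentage
    return cursor_cache_hits / soft_parses * 100

def library_cache_hit_ratio(hits: int, pins: int) -> float:
    # Hits / Pins, expressed as a percentage
    return hits / pins * 100

print(round(buffer_cache_hit_ratio(100_000, 1_500), 1))  # 98.5
print(round(cursor_cache_hit_ratio(900, 1_000), 1))      # 90.0
print(round(library_cache_hit_ratio(990, 1_000), 1))     # 99.0
```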
Oracle | Oracle: Shared pool free % | Free memory of a shared pool expressed in %. |
DEPENDENT | oracle.shared_pool_free Preprocessing: - JSONPATH: |
Oracle | Oracle: Physical reads per second | Reads per second. |
DEPENDENT | oracle.physical_reads_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Physical writes per second | Writes per second. |
DEPENDENT | oracle.physical_writes_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Physical reads bytes per second | Read bytes per second. |
DEPENDENT | oracle.physical_read_bytes_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Physical writes bytes per second | Write bytes per second. |
DEPENDENT | oracle.physical_write_bytes_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Enqueue timeouts per second | Enqueue timeouts per second. |
DEPENDENT | oracle.enqueue_timeouts_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: GC CR block received per second | The number of global cache (GC) consistent read (CR) blocks received per second. |
DEPENDENT | oracle.gc_cr_block_received_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Global cache blocks corrupted | The number of blocks that encountered corruption or checksum failure during the interconnect. |
DEPENDENT | oracle.cache_blocks_corrupt Preprocessing: - JSONPATH: |
Oracle | Oracle: Global cache blocks lost | The number of lost global cache blocks. |
DEPENDENT | oracle.cache_blocks_lost Preprocessing: - JSONPATH: |
Oracle | Oracle: Logons per second | The number of logon attempts. |
DEPENDENT | oracle.logons_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Average active sessions | The average active sessions at a point in time. The number of sessions that are either working or waiting. |
DEPENDENT | oracle.active_sessions Preprocessing: - JSONPATH: |
Oracle | Oracle: Active serial sessions | The number of active serial sessions. |
DEPENDENT | oracle.active_serial_sessions Preprocessing: - JSONPATH: |
Oracle | Oracle: Active parallel sessions | The number of active parallel sessions. |
DEPENDENT | oracle.active_parallel_sessions Preprocessing: - JSONPATH: |
Oracle | Oracle: Long table scans per second | The number of long table scans per second. A table is considered 'long' if the table is not cached and if its high-water mark is greater than five blocks. |
DEPENDENT | oracle.long_table_scans_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: SQL service response time | The Structured Query Language (SQL) service response time expressed in seconds. |
DEPENDENT | oracle.service_response_time Preprocessing: - JSONPATH: - MULTIPLIER: |
Oracle | Oracle: User rollbacks per second | The number of times that users manually issue the ROLLBACK statement or an error occurs during a user's transaction. |
DEPENDENT | oracle.user_rollbacks_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Total sorts per user call | The total sorts per user call. |
DEPENDENT | oracle.sorts_per_user_call Preprocessing: - JSONPATH: |
Oracle | Oracle: Rows per sort | The average number of rows per sort for all types of sorts performed. |
DEPENDENT | oracle.rows_per_sort Preprocessing: - JSONPATH: |
Oracle | Oracle: Disk sort per second | The number of sorts going to disk per second. |
DEPENDENT | oracle.disk_sorts Preprocessing: - JSONPATH: |
Oracle | Oracle: Memory sorts ratio | The percentage of sorts (from ORDER BY clauses or index building) that are done to disk vs in-memory. |
DEPENDENT | oracle.memory_sorts_ratio Preprocessing: - JSONPATH: |
Oracle | Oracle: Database wait time ratio | Wait time - the time that the server process spends waiting for available shared resources to be released by other server processes, such as latches, locks, data buffers, etc. |
DEPENDENT | oracle.database_wait_time_ratio Preprocessing: - JSONPATH: |
Oracle | Oracle: Database CPU time ratio | It is calculated by dividing the total CPU (used by the database) by the Oracle time model statistic DB time. |
DEPENDENT | oracle.database_cpu_time_ratio Preprocessing: - JSONPATH: |
Oracle | Oracle: Temp space used | Used temporary space. |
DEPENDENT | oracle.temp_space_used Preprocessing: - JSONPATH: |
Oracle | Oracle: Sessions limit | The user and system sessions. |
DEPENDENT | oracle.session_limit Preprocessing: - JSONPATH: |
Oracle | Oracle: Datafiles limit | The maximum allowable number of datafiles. |
DEPENDENT | oracle.db_files_limit Preprocessing: - JSONPATH: |
Oracle | Oracle: Processes limit | The maximum number of user processes. |
DEPENDENT | oracle.processes_limit Preprocessing: - JSONPATH: |
Oracle | Oracle: Session count | The count of sessions. |
DEPENDENT | oracle.session_count Preprocessing: - JSONPATH: |
Oracle | Oracle: Active user sessions | The number of active user sessions. |
DEPENDENT | oracle.session_active_user Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: Active background sessions | The number of active background sessions. |
DEPENDENT | oracle.session_active_background Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: Inactive user sessions | The number of inactive user sessions. |
DEPENDENT | oracle.session_inactive_user Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Oracle | Oracle: Sessions lock rate | The percentage of locked sessions. Locks are mechanisms that prevent destructive interaction between transactions accessing the same resource: either user objects, such as tables and rows, or system objects not visible to users, such as shared data structures in memory and data dictionary rows. |
DEPENDENT | oracle.session_lock_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: Sessions locked over {$ORACLE.SESSION.LOCK.MAX.TIME}s | The count of prolongedly locked sessions. (You can change the maximum session lock duration, in seconds, with the {$ORACLE.SESSION.LOCK.MAX.TIME} macro; the default is 600 seconds.) |
DEPENDENT | oracle.session_long_time_locked Preprocessing: - JSONPATH: |
Oracle | Oracle: Sessions concurrency | The percentage of concurrency. Concurrency is a DB behavior where different transactions request changes to the same resource. A modifying transaction temporarily blocks the right to change the data, and the remaining transactions wait for access. When access to the resource stays locked for a long time, concurrency grows (like a transaction queue), which often has an extremely negative impact on performance. A high contention value does not indicate the root cause of the problem but is a signal to search for it. |
DEPENDENT | oracle.session_concurrency_rate Preprocessing: - JSONPATH: |
Oracle | Oracle: PGA, Total inuse | It indicates how much of the Program Global Area (PGA) memory is currently consumed by work areas. This number can be used to determine how much memory is consumed by other consumers of the PGA memory (for example, PL/SQL or Java). |
DEPENDENT | oracle.total_pga_used Preprocessing: - JSONPATH: |
Oracle | Oracle: PGA, Aggregate target parameter | The current value of the PGA_AGGREGATE_TARGET initialization parameter. If this parameter is not set, then its value is 0 and automatic management of the PGA memory is disabled. |
DEPENDENT | oracle.pga_target Preprocessing: - JSONPATH: |
Oracle | Oracle: PGA, Total allocated | The current amount of the PGA memory allocated by the instance. The Oracle Database attempts to keep this number below the value of the PGA_AGGREGATE_TARGET initialization parameter. However, it is possible for the PGA allocated to exceed that value by a small percentage and for a short period of time when the work area workload is increasing very rapidly or when PGA_AGGREGATE_TARGET is set to a small value. |
DEPENDENT | oracle.total_pga_allocated Preprocessing: - JSONPATH: |
Oracle | Oracle: PGA, Total freeable | The number of bytes of the PGA memory in all processes that could be freed back to the operating system. |
DEPENDENT | oracle.total_pga_freeable Preprocessing: - JSONPATH: |
Oracle | Oracle: PGA, Global memory bound | The maximum size of work area executed in automatic mode. |
DEPENDENT | oracle.pga_global_bound Preprocessing: - JSONPATH: |
Oracle | Oracle: FRA, Space limit | The maximum amount of disk space (in bytes) that the database can use for the Fast Recovery Area (FRA). |
DEPENDENT | oracle.fra_space_limit Preprocessing: - JSONPATH: |
Oracle | Oracle: FRA, Used space | The amount of disk space (in bytes) used by FRA files created in the current and all the previous FRAs. |
DEPENDENT | oracle.fra_space_used Preprocessing: - JSONPATH: |
Oracle | Oracle: FRA, Space reclaimable | The total amount of disk space (in bytes) that can be created by deleting obsolete, redundant, and other low priority files from the FRA. |
DEPENDENT | oracle.fra_space_reclaimable Preprocessing: - JSONPATH: |
Oracle | Oracle: FRA, Number of files | The number of files in the FRA. |
DEPENDENT | oracle.fra_number_of_files Preprocessing: - JSONPATH: |
Oracle | Oracle: FRA, Usable space in % | DEPENDENT | oracle.fra_usable_pct Preprocessing: - JSONPATH: |
|
Oracle | Oracle: FRA, Number of restore points | DEPENDENT | oracle.fra_restore_point Preprocessing: - JSONPATH: |
|
Oracle | Oracle: SGA, java pool | The memory is allocated from the Java pool. |
DEPENDENT | oracle.sga_java_pool Preprocessing: - JSONPATH: |
Oracle | Oracle: SGA, large pool | The memory is allocated from a large pool. |
DEPENDENT | oracle.sga_large_pool Preprocessing: - JSONPATH: |
Oracle | Oracle: SGA, shared pool | The memory is allocated from a shared pool. |
DEPENDENT | oracle.sga_shared_pool Preprocessing: - JSONPATH: |
Oracle | Oracle: SGA, log buffer | The number of bytes allocated for the redo log buffer. |
DEPENDENT | oracle.sga_log_buffer Preprocessing: - JSONPATH: |
Oracle | Oracle: SGA, fixed | The fixed System Global Area (SGA) is an internal housekeeping area. |
DEPENDENT | oracle.sga_fixed Preprocessing: - JSONPATH: |
Oracle | Oracle: SGA, buffer cache | The size of standard block cache. |
DEPENDENT | oracle.sga_buffer_cache Preprocessing: - JSONPATH: |
Oracle | Oracle: User's expire password | The number of days before the password of Zabbix account expires. |
ZABBIX_PASSIVE | oracle.user.info["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle: Redo logs available to switch | The number of inactive/unused redo logs available for log switching. |
ZABBIX_PASSIVE | oracle.redolog.info["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle: Number of processes | - |
ZABBIX_PASSIVE | oracle.proc.stats["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle: Datafiles count | The current number of datafiles. |
ZABBIX_PASSIVE | oracle.datafiles.stats["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle Database '{#DBNAME}': Open status | 1 - 'MOUNTED'; 2 - 'READ WRITE'; 3 - 'READ ONLY'; 4 - 'READ ONLY WITH APPLY' (a physical standby database is open in real-time query mode). |
DEPENDENT | oracle.dbopenmode["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Oracle | Oracle Database '{#DBNAME}': Role | The current role of the database where: 1 - 'SNAPSHOT STANDBY'; 2 - 'LOGICAL STANDBY'; 3 - 'PHYSICAL STANDBY'; 4 - 'PRIMARY '; 5 - 'FAR SYNC'. |
DEPENDENT | oracle.dbrole["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 15m
Oracle | Oracle Database '{#DBNAME}': Log mode | The archive log mode where: 0 - 'NOARCHIVELOG'; 1 - 'ARCHIVELOG'; 2 - 'MANUAL'. |
DEPENDENT | oracle.dblogmode["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Oracle | Oracle Database '{#DBNAME}': Force logging | It indicates whether the database is under force logging mode 'YES' or 'NO'. |
DEPENDENT | oracle.dbforcelogging["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Oracle | Oracle Database '{#DBNAME}': Open status | 1 - 'MOUNTED'; 2 - 'READ WRITE'; 3 - 'READ ONLY'; 4 - 'READ ONLY WITH APPLY' (a physical standby database is open in real-time query mode). |
DEPENDENT | oracle.pdbopenmode["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Oracle | Oracle TBS '{#TABLESPACE}': Tablespace allocated, bytes | Currently allocated bytes for the tablespace (sum of the current size of datafiles). |
DEPENDENT | oracle.tbsallocbytes["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle TBS '{#TABLESPACE}': Tablespace MAX size, bytes | The maximum size of the tablespace. |
DEPENDENT | oracle.tbsmaxbytes["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle TBS '{#TABLESPACE}': Tablespace used, bytes | Currently used bytes for the tablespace (current size of datafiles - the free space). |
DEPENDENT | oracle.tbsusedbytes["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle TBS '{#TABLESPACE}': Tablespace free, bytes | Free bytes of the allocated space. |
DEPENDENT | oracle.tbsfreebytes["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle TBS '{#TABLESPACE}': Tablespace usage, percent | Used bytes/allocated bytes*100. |
DEPENDENT | oracle.tbsusedfile_pct["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle TBS '{#TABLESPACE}': Tablespace allocated, percent | Allocated bytes/max bytes*100. |
DEPENDENT | oracle.tbsusedpct["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Oracle TBS '{#TABLESPACE}': Open status | The tablespace status where: 1 - 'ONLINE'; 2 - 'OFFLINE'; 3 - 'READ ONLY'. |
DEPENDENT | oracle.tbs_status["{#TABLESPACE}"] Preprocessing: - JSONPATH: |
Oracle | Archivelog '{#DEST_NAME}': Error | It displays the error message. |
DEPENDENT | oracle.archivelogerror["{#DEST_NAME}"] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Oracle | Archivelog '{#DEST_NAME}': Last sequence | It identifies the sequence number of the last archived redo log. |
DEPENDENT | oracle.archiveloglogsequence["{#DEST_NAME}"] Preprocessing: - JSONPATH: |
Oracle | Archivelog '{#DEST_NAME}': Status | It identifies the current status of the destination where: 3 - 'VALID'; 2 - 'DEFERRED'; 1 - 'ERROR'; 0 - 'UNKNOWN'. |
DEPENDENT | oracle.archiveloglogstatus["{#DEST_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h
Oracle | ASM '{#DGNAME}': Total size | The total size of the ASM disk group. |
DEPENDENT | oracle.asmtotalsize["{#DGNAME}"] Preprocessing: - JSONPATH: |
Oracle | ASM '{#DGNAME}': Free size | The free size of the ASM disk group. |
DEPENDENT | oracle.asmfreesize["{#DGNAME}"] Preprocessing: - JSONPATH: |
Oracle | ASM '{#DGNAME}': Used space in % | Usage of the ASM disk group expressed in %. |
DEPENDENT | oracle.asmusedpct["{#DGNAME}"] Preprocessing: - JSONPATH: |
Zabbix raw items | Oracle: Get instance state | The item gets the state of the current instance. |
ZABBIX_PASSIVE | oracle.instance.info["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] |
Zabbix raw items | Oracle: Get system metrics | The item gets the values of the system metrics. |
ZABBIX_PASSIVE | oracle.sys.metrics["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] |
Zabbix raw items | Oracle: Get system parameters | Get a set of system parameter values. |
ZABBIX_PASSIVE | oracle.sys.params["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] |
Zabbix raw items | Oracle: Get sessions stats | Get sessions statistics. {$ORACLE.SESSION.LOCK.MAX.TIME} -- maximum seconds in the current wait condition for counting long time locked sessions. Default: 600 seconds. |
ZABBIX_PASSIVE | oracle.sessions.stats["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}","{$ORACLE.SESSION.LOCK.MAX.TIME}"] |
Zabbix raw items | Oracle: Get PGA stats | Get PGA statistics. |
ZABBIX_PASSIVE | oracle.pga.stats["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] |
Zabbix raw items | Oracle: Get FRA stats | Get FRA statistics. |
ZABBIX_PASSIVE | oracle.fra.stats["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] |
Zabbix raw items | Oracle: Get SGA stats | Get SGA statistics. |
ZABBIX_PASSIVE | oracle.sga.stats["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"] |
Zabbix raw items | Oracle Database '{#DBNAME}': Get CDB and No-CDB info | It gets the information about the container database (CDB) and non-CDB database on an instance. |
ZABBIX_PASSIVE | oracle.cdb.info["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}","{#DBNAME}"] |
Zabbix raw items | Oracle Database '{#DBNAME}': Get PDB info | It gets the information about the PDB database on an instance. |
ZABBIX_PASSIVE | oracle.pdb.info["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}","{#DBNAME}"] |
Zabbix raw items | Oracle TBS '{#TABLESPACE}': Get tablespaces stats | It gets the statistics of the tablespace. |
ZABBIX_PASSIVE | oracle.ts.stats["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}","{#TABLESPACE}","{#CONTENTS}"] |
Zabbix raw items | Archivelog '{#DEST_NAME}': Get archive log info | It gets the archivelog statistics. |
ZABBIX_PASSIVE | oracle.archive.info["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}","{#DEST_NAME}"] |
Zabbix raw items | ASM '{#DGNAME}': Get ASM stats | It gets the ASM disk group statistics. |
ZABBIX_PASSIVE | oracle.diskgroups.stats["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}","{#DGNAME}"] |
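The FRA items above are built from the raw counters returned by "Oracle: Get FRA stats". The usable-space percentage follows the usual V$RECOVERY_FILE_DEST arithmetic: free space plus reclaimable space, as a share of the space limit. A minimal sketch of that calculation (the function and argument names are illustrative, not part of the template):

```python
def fra_usable_pct(space_limit: int, space_used: int, space_reclaimable: int) -> float:
    """Percent of the Fast Recovery Area still usable: unused space plus
    space that can be reclaimed from obsolete/redundant files."""
    if space_limit <= 0:
        return 0.0  # guard against division by zero when the limit is unknown
    return (space_limit - space_used + space_reclaimable) / space_limit * 100.0

# e.g. a 100 GB limit with 60 GB used, 20 GB of it reclaimable -> 60% usable
print(fra_usable_pct(100, 60, 20))
```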
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Oracle: Connection to database is unavailable | Connection to Oracle Database is currently unavailable. |
last(/Oracle by Zabbix agent 2/oracle.ping["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"])=0 |
DISASTER | |
Oracle: Version has changed | The Oracle DB version has changed. Acknowledge (Ack) to close manually. |
last(/Oracle by Zabbix agent 2/oracle.version,#1)<>last(/Oracle by Zabbix agent 2/oracle.version,#2) and length(last(/Oracle by Zabbix agent 2/oracle.version))>0 |
INFO | Manual close: YES |
Oracle: Failed to fetch info data | Zabbix has not received any data for the items for the last 5 minutes. The database might be unavailable for connecting. |
nodata(/Oracle by Zabbix agent 2/oracle.uptime,30m)=1 |
INFO | |
Oracle: Host has been restarted | The host uptime is less than 10 minutes. |
last(/Oracle by Zabbix agent 2/oracle.uptime)<10m |
INFO | Manual close: YES |
Oracle: Instance name has changed | Oracle DB Instance name has changed. Ack to close manually. |
last(/Oracle by Zabbix agent 2/oracle.instance_name,#1)<>last(/Oracle by Zabbix agent 2/oracle.instance_name,#2) and length(last(/Oracle by Zabbix agent 2/oracle.instance_name))>0 |
INFO | Manual close: YES |
Oracle: Instance hostname has changed | Oracle DB Instance hostname has changed. Ack to close. |
last(/Oracle by Zabbix agent 2/oracle.instance_hostname,#1)<>last(/Oracle by Zabbix agent 2/oracle.instance_hostname,#2) and length(last(/Oracle by Zabbix agent 2/oracle.instance_hostname))>0 |
INFO | Manual close: YES |
Oracle: Shared pool free is too low | The free memory percent of the shared pool has been less than {$ORACLE.SHARED.FREE.MIN.WARN}% for the last 5 minutes. |
max(/Oracle by Zabbix agent 2/oracle.shared_pool_free,5m)<{$ORACLE.SHARED.FREE.MIN.WARN} |
WARNING | |
Oracle: Too many active sessions | Active sessions are using more than {$ORACLE.SESSIONS.MAX.WARN}% of the available sessions. |
min(/Oracle by Zabbix agent 2/oracle.session_count,5m) * 100 / last(/Oracle by Zabbix agent 2/oracle.session_limit) > {$ORACLE.SESSIONS.MAX.WARN} |
WARNING | |
Oracle: Too many locked sessions | The number of locked sessions exceeds {$ORACLE.SESSIONS.LOCK.MAX.WARN}% of the running sessions. |
min(/Oracle by Zabbix agent 2/oracle.session_lock_rate,5m) > {$ORACLE.SESSIONS.LOCK.MAX.WARN} |
WARNING | |
Oracle: Too many sessions locked | The number of sessions locked for longer than {$ORACLE.SESSION.LOCK.MAX.TIME} seconds is too high. Long-held locks can degrade database performance; if they are detected, identify the heaviest queries first and then analyze possible resource leaks. |
min(/Oracle by Zabbix agent 2/oracle.session_long_time_locked,5m) > {$ORACLE.SESSION.LONG.LOCK.MAX.WARN} |
WARNING | |
Oracle: Too high database concurrency | The concurrency rate exceeds {$ORACLE.CONCURRENCY.MAX.WARN}%. A high contention value does not identify the root cause of the problem, but it is a signal to look for one. When contention is high, analyze resource consumption and identify the heaviest queries in the database, possibly with session tracing. This helps to find the root cause and the optimization points, both in the database configuration and in the application's query logic. |
min(/Oracle by Zabbix agent 2/oracle.session_concurrency_rate,5m) > {$ORACLE.CONCURRENCY.MAX.WARN} |
WARNING | |
Oracle: Total PGA in use is too high | The total PGA in use is more than {$ORACLE.PGA.USE.MAX.WARN}% of PGA_AGGREGATE_TARGET. |
min(/Oracle by Zabbix agent 2/oracle.total_pga_used,5m) * 100 / last(/Oracle by Zabbix agent 2/oracle.pga_target) > {$ORACLE.PGA.USE.MAX.WARN} |
WARNING | |
Oracle: Zabbix account will expire soon | The password for Zabbix user in the database expires soon. |
last(/Oracle by Zabbix agent 2/oracle.user.info["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"]) < {$ORACLE.EXPIRE.PASSWORD.MIN.WARN} |
WARNING | |
Oracle: Number of REDO logs available for switching is too low | The number of inactive/unused REDOs available for log switching is low (database down risk). |
max(/Oracle by Zabbix agent 2/oracle.redolog.info["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"],5m) < {$ORACLE.REDO.MIN.WARN} |
WARNING | |
Oracle: Too many active processes | Active processes are using more than {$ORACLE.PROCESSES.MAX.WARN}% of the available number of processes. |
min(/Oracle by Zabbix agent 2/oracle.proc.stats["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"],5m) * 100 / last(/Oracle by Zabbix agent 2/oracle.processes_limit) > {$ORACLE.PROCESSES.MAX.WARN} |
WARNING | |
Oracle: Too many database files | The number of datafiles is higher than {$ORACLE.DB.FILE.MAX.WARN}% of the available datafiles limit. |
min(/Oracle by Zabbix agent 2/oracle.datafiles.stats["{$ORACLE.CONNSTRING}","{$ORACLE.USER}","{$ORACLE.PASSWORD}","{$ORACLE.SERVICE}"],5m) * 100 / last(/Oracle by Zabbix agent 2/oracle.db_files_limit) > {$ORACLE.DB.FILE.MAX.WARN} |
WARNING | |
Oracle Database '{#DBNAME}': Open status in mount mode | The Oracle DB is in a mounted state. |
last(/Oracle by Zabbix agent 2/oracle.db_open_mode["{#DBNAME}"])=1 |
WARNING | |
Oracle Database '{#DBNAME}': Open status has changed | The Oracle DB open status has changed. Ack to close manually. |
last(/Oracle by Zabbix agent 2/oracle.db_open_mode["{#DBNAME}"],#1)<>last(/Oracle by Zabbix agent 2/oracle.db_open_mode["{#DBNAME}"],#2) |
INFO | Manual close: YES Depends on: - Oracle Database '{#DBNAME}': Open status in mount mode |
Oracle Database '{#DBNAME}': Role has changed | The Oracle DB role has changed. Ack to close manually. |
last(/Oracle by Zabbix agent 2/oracle.db_role["{#DBNAME}"],#1)<>last(/Oracle by Zabbix agent 2/oracle.db_role["{#DBNAME}"],#2) |
INFO | Manual close: YES |
Oracle Database '{#DBNAME}': Force logging is deactivated for DB with active Archivelog | Force logging mode is an important setting for databases in 'ARCHIVELOG' mode; it forces all transactions to be written to the redo log. |
last(/Oracle by Zabbix agent 2/oracle.db_force_logging["{#DBNAME}"]) = 0 and last(/Oracle by Zabbix agent 2/oracle.db_log_mode["{#DBNAME}"]) = 1 |
WARNING | |
Oracle Database '{#DBNAME}': Open status in mount mode | The Oracle DB is in a mounted state. |
last(/Oracle by Zabbix agent 2/oracle.pdb_open_mode["{#DBNAME}"])=1 |
WARNING | |
Oracle Database '{#DBNAME}': Open status has changed | The Oracle DB open status has changed. Ack to close manually. |
last(/Oracle by Zabbix agent 2/oracle.pdb_open_mode["{#DBNAME}"],#1)<>last(/Oracle by Zabbix agent 2/oracle.pdb_open_mode["{#DBNAME}"],#2) |
INFO | Manual close: YES |
Oracle TBS '{#TABLESPACE}': Tablespace usage is too high | - |
min(/Oracle by Zabbix agent 2/oracle.tbs_used_file_pct["{#TABLESPACE}"],5m)>{$ORACLE.TBS.USED.PCT.MAX.WARN} |
WARNING | Depends on: - Oracle TBS '{#TABLESPACE}': Tablespace usage is too high |
Oracle TBS '{#TABLESPACE}': Tablespace usage is too high | - |
min(/Oracle by Zabbix agent 2/oracle.tbs_used_file_pct["{#TABLESPACE}"],5m)>{$ORACLE.TBS.USED.PCT.MAX.HIGH} |
HIGH | |
Oracle TBS '{#TABLESPACE}': Tablespace utilization is too high | - |
min(/Oracle by Zabbix agent 2/oracle.tbs_used_pct["{#TABLESPACE}"],5m)>{$ORACLE.TBS.USED.PCT.MAX.WARN} |
WARNING | Depends on: - Oracle TBS '{#TABLESPACE}': Tablespace utilization is too high |
Oracle TBS '{#TABLESPACE}': Tablespace utilization is too high | - |
min(/Oracle by Zabbix agent 2/oracle.tbs_used_pct["{#TABLESPACE}"],5m)>{$ORACLE.TBS.UTIL.PCT.MAX.HIGH} |
HIGH | |
Oracle TBS '{#TABLESPACE}': Tablespace is OFFLINE | The tablespace is in the offline state. |
last(/Oracle by Zabbix agent 2/oracle.tbs_status["{#TABLESPACE}"])=2 |
WARNING | |
Oracle TBS '{#TABLESPACE}': Tablespace status has changed | Oracle tablespace status has changed. Ack to close. |
last(/Oracle by Zabbix agent 2/oracle.tbs_status["{#TABLESPACE}"],#1)<>last(/Oracle by Zabbix agent 2/oracle.tbs_status["{#TABLESPACE}"],#2) |
INFO | Manual close: YES Depends on: - Oracle TBS '{#TABLESPACE}': Tablespace is OFFLINE |
Archivelog '{#DEST_NAME}': Log Archive is not valid | The trigger fires if the archive log destination is not in one of these states: 2 - 'DEFERRED'; 3 - 'VALID'. |
last(/Oracle by Zabbix agent 2/oracle.archivelog_log_status["{#DEST_NAME}"])<2 |
HIGH | |
ASM '{#DGNAME}': Disk group usage is too high | The percentage of used space in the ASM disk group exceeds {$ORACLE.ASM.USED.PCT.MAX.WARN}%. |
min(/Oracle by Zabbix agent 2/oracle.asm_used_pct["{#DGNAME}"],5m)>{$ORACLE.ASM.USED.PCT.MAX.WARN} |
WARNING | Depends on: - ASM '{#DGNAME}': Disk group usage is too high |
ASM '{#DGNAME}': Disk group usage is too high | The percentage of used space in the ASM disk group exceeds {$ORACLE.ASM.USED.PCT.MAX.HIGH}%. |
min(/Oracle by Zabbix agent 2/oracle.asm_used_pct["{#DGNAME}"],5m)>{$ORACLE.ASM.USED.PCT.MAX.HIGH} |
HIGH |
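Several of the capacity triggers above share one pattern: the current value of a counter is expressed as a percentage of its configured limit and compared against a macro threshold (sessions vs. the session limit, processes vs. the process limit, datafiles vs. the datafile limit, PGA used vs. the PGA target). A sketch of that check, with illustrative names that are not part of the template:

```python
def limit_usage_pct(current: float, limit: float) -> float:
    """Current value as a percentage of its limit (0 if the limit is unknown)."""
    if limit <= 0:
        return 0.0
    return current * 100.0 / limit

def breaches(current: float, limit: float, max_warn_pct: float) -> bool:
    """True when usage exceeds the threshold, mirroring expressions like
    min(session_count,5m) * 100 / last(session_limit) > {$ORACLE.SESSIONS.MAX.WARN}."""
    return limit_usage_pct(current, limit) > max_warn_pct

# 460 of 500 sessions against a 90% threshold fires the trigger
print(breaches(460, 500, 90))
```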
Please report any issues with the template at https://support.zabbix.com.
For Zabbix version: 6.2 and higher
The template is developed for monitoring DBMS MySQL and its forks.
This template was tested on:
See Zabbix template operation for basic instructions.
Create a MySQL user for monitoring (set <password> at your discretion):
CREATE USER 'zbx_monitor'@'%' IDENTIFIED BY '<password>';
GRANT REPLICATION CLIENT,PROCESS,SHOW DATABASES,SHOW VIEW ON *.* TO 'zbx_monitor'@'%';
For more information, please see MySQL documentation https://dev.mysql.com/doc/refman/8.0/en/grant.html
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$MYSQL.ABORTED_CONN.MAX.WARN} | Number of failed attempts to connect to the MySQL server for trigger expression. |
3 |
{$MYSQL.BUFF_UTIL.MIN.WARN} | The minimum buffer pool utilization in percentage for trigger expression. |
50 |
{$MYSQL.CREATEDTMPDISK_TABLES.MAX.WARN} | The maximum number of created tmp tables on a disk per second for trigger expressions. |
10 |
{$MYSQL.CREATEDTMPFILES.MAX.WARN} | The maximum number of created tmp files on a disk per second for trigger expressions. |
10 |
{$MYSQL.CREATEDTMPTABLES.MAX.WARN} | The maximum number of created tmp tables in memory per second for trigger expressions. |
30 |
{$MYSQL.DSN} | System data source name. |
<Put your DSN here> |
{$MYSQL.INNODB_LOG_FILES} | Number of physical files in the InnoDB redo log, used for calculating innodb_log_file_size. |
2 |
{$MYSQL.PASSWORD} | MySQL user password. |
<Put your password here> |
{$MYSQL.REPL_LAG.MAX.WARN} | The lag of slave from master for trigger expression. |
30m |
{$MYSQL.SLOW_QUERIES.MAX.WARN} | Number of slow queries for trigger expression. |
3 |
{$MYSQL.USER} | MySQL username. |
<Put your username here> |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Database discovery | Scanning databases in DBMS. |
ODBC | db.odbc.discovery[databases,"{$MYSQL.DSN}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: Filter: AND_OR - {#DATABASE} NOT MATCHES_REGEX information_schema |
MariaDB discovery | Additional metrics if MariaDB is used. |
DEPENDENT | mysql.extra_metric.discovery Preprocessing: - JAVASCRIPT: |
Replication discovery | If "show slave status" returns Master_Host, "Replication: *" items are created. |
ODBC | db.odbc.discovery[replication,"{$MYSQL.DSN}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
MySQL | MySQL: Status | - |
ODBC | db.odbc.select[ping,"{$MYSQL.DSN}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: Expression: select "1" |
MySQL | MySQL: Version | - |
ODBC | db.odbc.select[version,"{$MYSQL.DSN}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: Expression: select version() |
MySQL | MySQL: Uptime | The number of seconds that the server has been up. |
DEPENDENT | mysql.uptime Preprocessing: - JSONPATH: |
MySQL | MySQL: Aborted clients per second | Number of connections that were aborted because the client died without closing the connection properly. |
DEPENDENT | mysql.abortedclients.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Aborted connections per second | Number of failed attempts to connect to the MySQL server. |
DEPENDENT | mysql.abortedconnects.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors accept per second | Number of errors that occurred during calls to accept() on the listening port. |
DEPENDENT | mysql.connectionerrorsaccept.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
MySQL | MySQL: Connection errors internal per second | Number of refused connections due to internal server errors, for example, out of memory errors, or failed thread starts. |
DEPENDENT | mysql.connectionerrorsinternal.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
MySQL | MySQL: Connection errors max connections per second | Number of refused connections due to the max_connections limit being reached. |
DEPENDENT | mysql.connectionerrorsmaxconnections.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors peer address per second | Number of errors while searching for the connecting client IP address. |
DEPENDENT | mysql.connectionerrorspeeraddress.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors select per second | Number of errors during calls to select() or poll() on the listening port. The client would not necessarily have been rejected in these cases. |
DEPENDENT | mysql.connectionerrorsselect.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
MySQL | MySQL: Connection errors tcpwrap per second | Number of connections the libwrap library has refused. |
DEPENDENT | mysql.connectionerrorstcpwrap.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
MySQL | MySQL: Connections per second | Number of connection attempts (successful or not) to the MySQL server. |
DEPENDENT | mysql.connections.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
MySQL | MySQL: Max used connections | The maximum number of connections that have been in use simultaneously since the server start. |
DEPENDENT | mysql.maxusedconnections Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
MySQL | MySQL: Threads cached | Number of threads in the thread cache. |
DEPENDENT | mysql.threads_cached Preprocessing: - JSONPATH: |
MySQL | MySQL: Threads connected | Number of currently open connections. |
DEPENDENT | mysql.threads_connected Preprocessing: - JSONPATH: |
MySQL | MySQL: Threads created per second | Number of threads created to handle connections. If Threads_created is big, you may want to increase the thread_cache_size value. The cache miss rate can be calculated as Threads_created/Connections. |
DEPENDENT | mysql.threadscreated.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Threads running | Number of threads which are not sleeping. |
DEPENDENT | mysql.threads_running Preprocessing: - JSONPATH: |
MySQL | MySQL: Buffer pool efficiency | The item shows how effectively the buffer pool is serving reads. |
CALCULATED | mysql.bufferpoolefficiency Expression: last(//mysql.innodb_buffer_pool_reads) / ( last(//mysql.innodb_buffer_pool_read_requests) + ( last(//mysql.innodb_buffer_pool_read_requests) = 0 ) ) * 100 * ( last(//mysql.innodb_buffer_pool_read_requests) > 0 ) |
MySQL | MySQL: Buffer pool utilization | Ratio of used to total pages in the buffer pool. |
CALCULATED | mysql.bufferpoolutilization Expression: ( last(//mysql.innodb_buffer_pool_pages_total) - last(//mysql.innodb_buffer_pool_pages_free) ) / ( last(//mysql.innodb_buffer_pool_pages_total) + ( last(//mysql.innodb_buffer_pool_pages_total) = 0 ) ) * 100 * ( last(//mysql.innodb_buffer_pool_pages_total) > 0 ) |
MySQL | MySQL: Created tmp files on disk per second | How many temporary files mysqld has created. |
DEPENDENT | mysql.createdtmpfiles.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
MySQL | MySQL: Created tmp tables on disk per second | Number of internal on-disk temporary tables created by the server while executing statements. |
DEPENDENT | mysql.createdtmpdisktables.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Created tmp tables on memory per second | Number of internal temporary tables created by the server while executing statements. |
DEPENDENT | mysql.createdtmptables.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
MySQL | MySQL: InnoDB buffer pool pages free | The number of free pages in the InnoDB buffer pool. |
DEPENDENT | mysql.innodbbufferpoolpagesfree Preprocessing: - JSONPATH: |
MySQL | MySQL: InnoDB buffer pool pages total | The total size of the InnoDB buffer pool, in pages. |
DEPENDENT | mysql.innodbbufferpoolpagestotal Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
MySQL | MySQL: InnoDB buffer pool read requests per second | Number of logical read requests per second. |
DEPENDENT | mysql.innodbbufferpoolreadrequests.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
MySQL | MySQL: InnoDB buffer pool reads per second | Number of logical reads per second that InnoDB could not satisfy from the buffer pool, and had to read directly from the disk. |
DEPENDENT | mysql.innodbbufferpoolreads.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: InnoDB row lock time | The total time spent in acquiring row locks for InnoDB tables, in milliseconds. |
DEPENDENT | mysql.innodbrowlocktime Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: 1h
MySQL | MySQL: InnoDB row lock time max | The maximum time to acquire a row lock for InnoDB tables, in milliseconds. |
DEPENDENT | mysql.innodbrowlocktimemax Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
MySQL | MySQL: InnoDB row lock waits | Number of times operations on InnoDB tables had to wait for a row lock. |
DEPENDENT | mysql.innodbrowlock_waits Preprocessing: - JSONPATH: |
MySQL | MySQL: Slow queries per second | Number of queries that have taken more than long_query_time seconds. |
DEPENDENT | mysql.slowqueries.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Bytes received | Number of bytes received from all clients. |
DEPENDENT | mysql.bytesreceived.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Bytes sent | Number of bytes sent to all clients. |
DEPENDENT | mysql.bytessent.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Command Delete per second | The Com_delete counter variable indicates the number of times the delete statement has been executed. |
DEPENDENT | mysql.comdelete.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Command Insert per second | The Com_insert counter variable indicates the number of times the insert statement has been executed. |
DEPENDENT | mysql.cominsert.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Command Select per second | The Com_select counter variable indicates the number of times the select statement has been executed. |
DEPENDENT | mysql.comselect.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Command Update per second | The Com_update counter variable indicates the number of times the update statement has been executed. |
DEPENDENT | mysql.comupdate.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Queries per second | Number of statements executed by the server. This variable includes statements executed within stored programs, unlike the Questions variable. |
DEPENDENT | mysql.queries.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
MySQL | MySQL: Questions per second | Number of statements executed by the server. This includes only statements sent to the server by clients and not statements executed within stored programs, unlike the Queries variable. |
DEPENDENT | mysql.questions.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
MySQL | MySQL: Binlog cache disk use | Number of transactions that used a temporary disk cache because they could not fit in the regular binary log cache, being larger than binlog_cache_size. |
DEPENDENT | mysql.binlogcachediskuse Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h
MySQL | MySQL: Innodb buffer pool wait free | Number of times InnoDB waited for a free page before reading or creating a page. Normally, writes to the InnoDB buffer pool happen in the background. When no clean pages are available, dirty pages are flushed first in order to free some up. This counter tracks waits for that operation to finish. If this value is not small, consider increasing innodb_buffer_pool_size. |
DEPENDENT | mysql.innodbbufferpoolwaitfree Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
MySQL | MySQL: Innodb number open files | Number of open files held by InnoDB. InnoDB only. |
DEPENDENT | mysql.innodbnumopenfiles Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h
MySQL | MySQL: Open table definitions | Number of cached table definitions. |
DEPENDENT | mysql.opentabledefinitions Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
MySQL | MySQL: Open tables | Number of tables that are open. |
DEPENDENT | mysql.opentables Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h
MySQL | MySQL: Innodb log written | Number of bytes written to the InnoDB log. |
DEPENDENT | mysql.innodboslogwritten Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h
MySQL | MySQL: Calculated value of innodb_log_file_size | The innodb_log_file_size value estimated as (innodb_os_log_written - innodb_os_log_written(time shift -1h)) / {$MYSQL.INNODB_LOG_FILES}. innodb_log_file_size is the size in bytes of each InnoDB redo log file in the log group. The combined size can be no more than 512GB. Larger values mean less disk I/O due to less checkpoint flushing activity, but also slower recovery from a crash. |
CALCULATED | mysql.innodblogfilesize Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 6h Expression: (last(//mysql.innodb_os_log_written) - last(//mysql.innodb_os_log_written,#1:now-1h)) / {$MYSQL.INNODB_LOG_FILES}
MySQL | MySQL: Size of database {#DATABASE} | - |
ODBC | db.odbc.select[{#DATABASE}size,"{$MYSQL.DSN}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 1h Expression: The text is too long. Please see the template. |
MySQL | MySQL: Replication Slave SQL Running State {#MASTER_HOST} | This shows the state of the SQL driver threads. |
DEPENDENT | mysql.slavesqlrunningstate["{#MASTERHOST}"] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
MySQL | MySQL: Replication Seconds Behind Master {#MASTER_HOST} | The number of seconds the slave SQL thread is behind processing the master binary log. A high number (or an increasing one) can indicate that the slave is unable to handle events from the master in a timely fashion. |
DEPENDENT | mysql.secondsbehindmaster["{#MASTERHOST}"] Preprocessing: - JSONPATH: - MATCHES_REGEX: \d+ ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Replication Slave IO Running {#MASTER_HOST} | Whether the I/O thread for reading the master's binary log is running. Normally, you want this to be Yes unless you have not yet started a replication or have explicitly stopped it with STOP SLAVE. |
DEPENDENT | mysql.slaveiorunning["{#MASTERHOST}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h
MySQL | MySQL: Replication Slave SQL Running {#MASTER_HOST} | Whether the SQL thread for executing events in the relay log is running. As with the I/O thread, this should normally be Yes. |
DEPENDENT | mysql.slave_sql_running["{#MASTER_HOST}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
MySQL | MySQL: Binlog commits | Total number of transactions committed to the binary log. |
DEPENDENT | mysql.binlog_commits[{#SINGLETON}] Preprocessing: - JSONPATH: |
MySQL | MySQL: Binlog group commits | Total number of group commits done to the binary log. |
DEPENDENT | mysql.binlog_group_commits[{#SINGLETON}] Preprocessing: - JSONPATH: |
MySQL | MySQL: Master GTID wait count | The number of times MASTER_GTID_WAIT was called. |
DEPENDENT | mysql.master_gtid_wait_count[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Master GTID wait time | Total time spent in MASTER_GTID_WAIT. |
DEPENDENT | mysql.master_gtid_wait_time[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Master GTID wait timeouts | Number of timeouts occurring in MASTER_GTID_WAIT. |
DEPENDENT | mysql.master_gtid_wait_timeouts[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
Zabbix raw items | MySQL: Get status variables | The item gets server global status information. |
ODBC | db.odbc.get[get_status_variables,"{$MYSQL.DSN}"] Expression: show global status |
Zabbix raw items | MySQL: InnoDB buffer pool read requests | Number of logical read requests. |
DEPENDENT | mysql.innodb_buffer_pool_read_requests Preprocessing: - JSONPATH: |
Zabbix raw items | MySQL: InnoDB buffer pool reads | Number of logical reads that InnoDB could not satisfy from the buffer pool, and had to read directly from the disk. |
DEPENDENT | mysql.innodb_buffer_pool_reads Preprocessing: - JSONPATH: |
Zabbix raw items | MySQL: Replication Slave status {#MASTER_HOST} | The item gets status information on the essential parameters of the slave threads. |
ODBC | db.odbc.get["{#MASTER_HOST}","{$MYSQL.DSN}"] Expression: show slave status |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
MySQL: Service is down | - |
last(/MySQL by ODBC/db.odbc.select[ping,"{$MYSQL.DSN}"])=0 |
HIGH | |
MySQL: Version has changed | MySQL version has changed. Ack to close. |
last(/MySQL by ODBC/db.odbc.select[version,"{$MYSQL.DSN}"],#1)<>last(/MySQL by ODBC/db.odbc.select[version,"{$MYSQL.DSN}"],#2) and length(last(/MySQL by ODBC/db.odbc.select[version,"{$MYSQL.DSN}"]))>0 |
INFO | Manual close: YES |
MySQL: Service has been restarted | MySQL uptime is less than 10 minutes. |
last(/MySQL by ODBC/mysql.uptime)<10m |
INFO | |
MySQL: Failed to fetch info data | Zabbix has not received data for items for the last 30 minutes. |
nodata(/MySQL by ODBC/mysql.uptime,30m)=1 |
INFO | Depends on: - MySQL: Service is down |
MySQL: Server has aborted connections | The number of failed attempts to connect to the MySQL server is more than {$MYSQL.ABORTED_CONN.MAX.WARN} in the last 5 minutes. |
min(/MySQL by ODBC/mysql.aborted_connects.rate,5m)>{$MYSQL.ABORTED_CONN.MAX.WARN} |
AVERAGE | Depends on: - MySQL: Refused connections |
MySQL: Refused connections | Number of refused connections due to the max_connections limit being reached. |
last(/MySQL by ODBC/mysql.connection_errors_max_connections.rate)>0 |
AVERAGE | |
MySQL: Buffer pool utilization is too low | The buffer pool utilization is less than {$MYSQL.BUFF_UTIL.MIN.WARN}% in the last 5 minutes. This means that there is a lot of unused RAM allocated for the buffer pool, which you can easily reallocate at the moment. |
max(/MySQL by ODBC/mysql.buffer_pool_utilization,5m)<{$MYSQL.BUFF_UTIL.MIN.WARN} |
WARNING | |
MySQL: Number of temporary files created per second is high | Possibly the application using the database is in need of query optimization. |
min(/MySQL by ODBC/mysql.created_tmp_files.rate,5m)>{$MYSQL.CREATED_TMP_FILES.MAX.WARN} |
WARNING | |
MySQL: Number of on-disk temporary tables created per second is high | Possibly the application using the database is in need of query optimization. |
min(/MySQL by ODBC/mysql.created_tmp_disk_tables.rate,5m)>{$MYSQL.CREATED_TMP_DISK_TABLES.MAX.WARN} |
WARNING | |
MySQL: Number of internal temporary tables created per second is high | Possibly the application using the database is in need of query optimization. |
min(/MySQL by ODBC/mysql.created_tmp_tables.rate,5m)>{$MYSQL.CREATED_TMP_TABLES.MAX.WARN} |
WARNING | |
MySQL: Server has slow queries | The number of slow queries is more than {$MYSQL.SLOW_QUERIES.MAX.WARN} in the last 5 minutes. |
min(/MySQL by ODBC/mysql.slow_queries.rate,5m)>{$MYSQL.SLOW_QUERIES.MAX.WARN} |
WARNING | |
MySQL: Replication lag is too high | - |
min(/MySQL by ODBC/mysql.seconds_behind_master["{#MASTER_HOST}"],5m)>{$MYSQL.REPL_LAG.MAX.WARN} |
WARNING | |
MySQL: The slave I/O thread is not running | Whether the I/O thread for reading the master's binary log is running. |
count(/MySQL by ODBC/mysql.slave_io_running["{#MASTER_HOST}"],#1,"eq","No")=1 |
AVERAGE | |
MySQL: The slave I/O thread is not connected to a replication master | - |
count(/MySQL by ODBC/mysql.slave_io_running["{#MASTER_HOST}"],#1,"ne","Yes")=1 |
WARNING | Depends on: - MySQL: The slave I/O thread is not running |
MySQL: The SQL thread is not running | Whether the SQL thread for executing events in the relay log is running. |
count(/MySQL by ODBC/mysql.slave_sql_running["{#MASTER_HOST}"],#1,"eq","No")=1 |
WARNING | Depends on: - MySQL: The slave I/O thread is not running |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template is developed for monitoring DBMS MySQL and its forks.
This template was tested on:
See Zabbix template operation for basic instructions.
Create a MySQL user for monitoring (set <password> at your discretion):
CREATE USER 'zbx_monitor'@'%' IDENTIFIED BY '<password>';
GRANT REPLICATION CLIENT,PROCESS,SHOW DATABASES,SHOW VIEW ON *.* TO 'zbx_monitor'@'%';
For more information, please see MySQL documentation https://dev.mysql.com/doc/refman/8.0/en/grant.html
Set the data source name of the MySQL instance in the {$MYSQL.DSN} macro: either a session name from the Zabbix agent 2 configuration file or a URI. Examples: MySQL1, tcp://localhost:3306, tcp://172.16.0.10, unix:/var/run/mysql.sock. For more information about the MySQL Unix socket file, see the MySQL documentation https://dev.mysql.com/doc/refman/8.0/en/problems-with-mysql-sock.html.
If you set a URI in {$MYSQL.DSN}, define the user name and password in the host macros {$MYSQL.USER} and {$MYSQL.PASSWORD}. If you use a session name, leave {$MYSQL.USER} and {$MYSQL.PASSWORD} empty and set the user name and password in the Plugins.Mysql.<...> section of your Zabbix agent 2 configuration file. For more information about configuring the Zabbix MySQL plugin, see the documentation https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/src/go/plugins/mysql/README.md.
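For the session-name case, a sketch of what the Zabbix agent 2 configuration could look like. The session name MySQL1 and the URI below are illustrative, and the key names follow the Plugins.Mysql.<...> convention referenced above; verify the exact names against your agent version's MySQL plugin documentation:

```ini
# Hypothetical session named "MySQL1" in zabbix_agent2.conf.
# Key names are assumed from the Plugins.Mysql.<...> convention; check
# the plugin README linked above for the authoritative spelling.
Plugins.Mysql.Sessions.MySQL1.Uri=tcp://172.16.0.10:3306
Plugins.Mysql.Sessions.MySQL1.User=zbx_monitor
Plugins.Mysql.Sessions.MySQL1.Password=<password>
```

With {$MYSQL.DSN} set to MySQL1, the {$MYSQL.USER} and {$MYSQL.PASSWORD} macros stay empty.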
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$MYSQL.ABORTED_CONN.MAX.WARN} | Number of failed attempts to connect to the MySQL server for trigger expression. |
3 |
{$MYSQL.BUFF_UTIL.MIN.WARN} | The minimum buffer pool utilization percentage for trigger expression. |
50 |
{$MYSQL.CREATED_TMP_DISK_TABLES.MAX.WARN} | The maximum number of created tmp tables on a disk per second for trigger expressions. |
10 |
{$MYSQL.CREATED_TMP_FILES.MAX.WARN} | The maximum number of created tmp files on a disk per second for trigger expressions. |
10 |
{$MYSQL.CREATED_TMP_TABLES.MAX.WARN} | The maximum number of created tmp tables in memory per second for trigger expressions. |
30 |
{$MYSQL.DSN} | System data source name such as |
<Put your DSN> |
{$MYSQL.INNODB_LOG_FILES} | Number of physical files in the InnoDB redo log for calculating innodb_log_file_size. |
2 |
{$MYSQL.PASSWORD} | MySQL user password. |
`` |
{$MYSQL.REPL_LAG.MAX.WARN} | The lag of slave from master for trigger expression. |
30m |
{$MYSQL.SLOW_QUERIES.MAX.WARN} | The number of slow queries for trigger expression. |
3 |
{$MYSQL.USER} | MySQL user name. |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Database discovery | Scanning databases in DBMS. |
ZABBIX_PASSIVE | mysql.db.discovery["{$MYSQL.DSN}","{$MYSQL.USER}","{$MYSQL.PASSWORD}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND_OR - {#DATABASE} NOT MATCHES_REGEX information_schema |
MariaDB discovery | Additional metrics if MariaDB is used. |
DEPENDENT | mysql.extra_metric.discovery Preprocessing: - JAVASCRIPT: |
Replication discovery | If "show slave status" returns Master_Host, "Replication: *" items are created. |
ZABBIX_PASSIVE | mysql.replication.discovery["{$MYSQL.DSN}","{$MYSQL.USER}","{$MYSQL.PASSWORD}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
MySQL | MySQL: Status | ZABBIX_PASSIVE | mysql.ping["{$MYSQL.DSN}","{$MYSQL.USER}","{$MYSQL.PASSWORD}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
|
MySQL | MySQL: Version | ZABBIX_PASSIVE | mysql.version["{$MYSQL.DSN}","{$MYSQL.USER}","{$MYSQL.PASSWORD}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
|
MySQL | MySQL: Uptime | The number of seconds that the server has been up. |
DEPENDENT | mysql.uptime Preprocessing: - JSONPATH: |
MySQL | MySQL: Aborted clients per second | Number of connections that were aborted because the client died without closing the connection properly. |
DEPENDENT | mysql.aborted_clients.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
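All of the "per second" items here are monotonically growing status counters run through Zabbix's change-per-second preprocessing. A minimal sketch of that computation, with made-up sample values:

```python
def change_per_second(prev_value, prev_ts, cur_value, cur_ts):
    """Rate between two samples of a growing counter, as change-per-second
    preprocessing computes it: value delta divided by elapsed seconds."""
    seconds = cur_ts - prev_ts
    if seconds <= 0:
        raise ValueError("samples must be strictly ordered in time")
    return (cur_value - prev_value) / seconds

# Hypothetical: Aborted_clients went from 120 to 150 over a 60 s interval.
rate = change_per_second(120, 1000, 150, 1060)
print(rate)  # 0.5 aborted clients per second
```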
MySQL | MySQL: Aborted connections per second | Number of failed attempts to connect to the MySQL server. |
DEPENDENT | mysql.aborted_connects.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors accept per second | Number of errors that occurred during calls to accept() on the listening port. |
DEPENDENT | mysql.connection_errors_accept.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors internal per second | Number of refused connections due to internal server errors, for example, out of memory errors, or failed thread starts. |
DEPENDENT | mysql.connection_errors_internal.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors max connections per second | Number of refused connections due to the max_connections limit being reached. |
DEPENDENT | mysql.connection_errors_max_connections.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors peer address per second | Number of errors while searching for the connecting client IP address. |
DEPENDENT | mysql.connection_errors_peer_address.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors select per second | Number of errors during calls to select() or poll() on the listening port. The client would not necessarily have been rejected in these cases. |
DEPENDENT | mysql.connection_errors_select.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors tcpwrap per second | Number of connections the libwrap library has refused. |
DEPENDENT | mysql.connection_errors_tcpwrap.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connections per second | Number of connection attempts (successful or not) to the MySQL server. |
DEPENDENT | mysql.connections.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Max used connections | The maximum number of connections that have been in use simultaneously since the server start. |
DEPENDENT | mysql.max_used_connections Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Threads cached | Number of threads in the thread cache. |
DEPENDENT | mysql.threads_cached Preprocessing: - JSONPATH: |
MySQL | MySQL: Threads connected | Number of currently open connections. |
DEPENDENT | mysql.threads_connected Preprocessing: - JSONPATH: |
MySQL | MySQL: Threads created per second | Number of threads created to handle connections. If Threads_created is big, you may want to increase the thread_cache_size value. The cache miss rate can be calculated as Threads_created/Connections. |
DEPENDENT | mysql.threads_created.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
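As a sketch of the cache miss rate mentioned above (Threads_created divided by Connections), with hypothetical counter values:

```python
def thread_cache_miss_rate(threads_created, connections):
    """Fraction of connections that required creating a new thread
    (Threads_created / Connections); 0.0 means every connection was
    served from the thread cache."""
    if connections == 0:
        return 0.0  # no connections yet, nothing to report
    return threads_created / connections

# Hypothetical counters: 25 threads created over 1000 connections.
print(thread_cache_miss_rate(25, 1000))  # 0.025
```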
MySQL | MySQL: Threads running | Number of threads which are not sleeping. |
DEPENDENT | mysql.threads_running Preprocessing: - JSONPATH: |
MySQL | MySQL: Buffer pool efficiency | The item shows how effectively the buffer pool is serving reads. |
CALCULATED | mysql.buffer_pool_efficiency Expression: last(//mysql.innodb_buffer_pool_reads) / ( last(//mysql.innodb_buffer_pool_read_requests) + ( last(//mysql.innodb_buffer_pool_read_requests) = 0 ) ) * 100 * ( last(//mysql.innodb_buffer_pool_read_requests) > 0 ) |
MySQL | MySQL: Buffer pool utilization | Ratio of used to total pages in the buffer pool. |
CALCULATED | mysql.buffer_pool_utilization Expression: ( last(//mysql.innodb_buffer_pool_pages_total) - last(//mysql.innodb_buffer_pool_pages_free) ) / ( last(//mysql.innodb_buffer_pool_pages_total) + ( last(//mysql.innodb_buffer_pool_pages_total) = 0 ) ) * 100 * ( last(//mysql.innodb_buffer_pool_pages_total) > 0 ) |
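Both calculated expressions above use the same trick to avoid division by zero: adding (denominator = 0) to the denominator turns a zero into a one, and multiplying by (denominator > 0) forces the result to zero when there is no data yet. A Python sketch mirroring the utilization formula:

```python
def buffer_pool_utilization(pages_total, pages_free):
    """Mirror of the calculated expression:
    (total - free) / (total + (total = 0)) * 100 * (total > 0)."""
    denom = pages_total + (1 if pages_total == 0 else 0)  # (total = 0) guard
    gate = 1 if pages_total > 0 else 0                    # (total > 0) gate
    return (pages_total - pages_free) / denom * 100 * gate

# Illustrative values: 8192 pages total, 2048 free -> 75% utilized.
print(buffer_pool_utilization(8192, 2048))  # 75.0
print(buffer_pool_utilization(0, 0))        # 0.0 -> no data yet, not an error
```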
MySQL | MySQL: Created tmp files on disk per second | How many temporary files mysqld has created. |
DEPENDENT | mysql.created_tmp_files.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Created tmp tables on disk per second | Number of internal on-disk temporary tables created by the server while executing statements. |
DEPENDENT | mysql.created_tmp_disk_tables.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Created tmp tables on memory per second | Number of internal temporary tables created by the server while executing statements. |
DEPENDENT | mysql.created_tmp_tables.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: InnoDB buffer pool pages free | The number of free pages in the InnoDB buffer pool. |
DEPENDENT | mysql.innodb_buffer_pool_pages_free Preprocessing: - JSONPATH: |
MySQL | MySQL: InnoDB buffer pool pages total | The total size of the InnoDB buffer pool, in pages. |
DEPENDENT | mysql.innodb_buffer_pool_pages_total Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: InnoDB buffer pool read requests per second | Number of logical read requests per second. |
DEPENDENT | mysql.innodb_buffer_pool_read_requests.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: InnoDB buffer pool reads per second | Number of logical reads per second that InnoDB could not satisfy from the buffer pool, and had to read directly from the disk. |
DEPENDENT | mysql.innodb_buffer_pool_reads.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: InnoDB row lock time | The total time spent in acquiring row locks for InnoDB tables, in milliseconds. |
DEPENDENT | mysql.innodb_row_lock_time Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
MySQL | MySQL: InnoDB row lock time max | The maximum time to acquire a row lock for InnoDB tables, in milliseconds. |
DEPENDENT | mysql.innodb_row_lock_time_max Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: InnoDB row lock waits | Number of times operations on InnoDB tables had to wait for a row lock. |
DEPENDENT | mysql.innodb_row_lock_waits Preprocessing: - JSONPATH: |
MySQL | MySQL: Slow queries per second | Number of queries that have taken more than long_query_time seconds. |
DEPENDENT | mysql.slow_queries.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Bytes received | Number of bytes received from all clients. |
DEPENDENT | mysql.bytes_received.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Bytes sent | Number of bytes sent to all clients. |
DEPENDENT | mysql.bytes_sent.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Command Delete per second | The Com_delete counter variable indicates the number of times the delete statement has been executed. |
DEPENDENT | mysql.com_delete.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Command Insert per second | The Com_insert counter variable indicates the number of times the insert statement has been executed. |
DEPENDENT | mysql.com_insert.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Command Select per second | The Com_select counter variable indicates the number of times the select statement has been executed. |
DEPENDENT | mysql.com_select.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Command Update per second | The Com_update counter variable indicates the number of times the update statement has been executed. |
DEPENDENT | mysql.com_update.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Queries per second | Number of statements executed by the server. This variable includes statements executed within stored programs, unlike the Questions variable. |
DEPENDENT | mysql.queries.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Questions per second | Number of statements executed by the server. This includes only statements sent to the server by clients and not statements executed within stored programs, unlike the Queries variable. |
DEPENDENT | mysql.questions.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Binlog cache disk use | Number of transactions that used a temporary disk cache because they could not fit in the regular binary log cache, being larger than binlog_cache_size. |
DEPENDENT | mysql.binlog_cache_disk_use Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Innodb buffer pool wait free | Number of times InnoDB waited for a free page before reading or creating a page. Normally, writes to the InnoDB buffer pool happen in the background. When no clean pages are available, dirty pages are flushed first in order to free some up. This counts the number of waits for this operation to finish. If this value is not small, look at increasing innodb_buffer_pool_size. |
DEPENDENT | mysql.innodb_buffer_pool_wait_free Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Innodb number open files | Number of open files held by InnoDB. InnoDB only. |
DEPENDENT | mysql.innodb_num_open_files Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Open table definitions | Number of cached table definitions. |
DEPENDENT | mysql.open_table_definitions Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Open tables | Number of tables that are open. |
DEPENDENT | mysql.open_tables Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Innodb log written | Number of bytes written to the InnoDB log. |
DEPENDENT | mysql.innodb_os_log_written Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Calculated value of innodb_log_file_size | The innodb_log_file_size value calculated as (innodb_os_log_written - innodb_os_log_written(time shift -1h)) / {$MYSQL.INNODB_LOG_FILES}. Innodb_log_file_size is the size in bytes of each InnoDB redo log file in the log group. The combined size can be no more than 512GB. Larger values mean less disk I/O due to less flushing checkpoint activity, but also slower recovery from a crash. |
CALCULATED | mysql.innodb_log_file_size Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 6h Expression: (last(//mysql.innodb_os_log_written) - last(//mysql.innodb_os_log_written,#1:now-1h)) / {$MYSQL.INNODB_LOG_FILES} |
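The calculated item above sizes the redo log from one hour of write activity: bytes written in the last hour divided by the number of log files. A Python sketch of the same arithmetic, with illustrative byte counts:

```python
def innodb_log_file_size_estimate(written_now, written_1h_ago, log_files=2):
    """Bytes of redo written in the last hour, spread over the physical
    log files (default 2, matching {$MYSQL.INNODB_LOG_FILES}), mirroring
    (last(written) - last(written, -1h)) / {$MYSQL.INNODB_LOG_FILES}."""
    return (written_now - written_1h_ago) / log_files

# Hypothetical: 3 GiB of redo written in the last hour across 2 log files.
size = innodb_log_file_size_estimate(10 * 2**30, 7 * 2**30)
print(size / 2**20)  # 1536.0 -> each log file sized for ~1.5 GiB
```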
MySQL | MySQL: Size of database {#DATABASE} | - |
ZABBIX_PASSIVE | mysql.db.size["{$MYSQL.DSN}","{$MYSQL.USER}","{$MYSQL.PASSWORD}","{#DATABASE}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Replication Slave SQL Running State {#MASTER_HOST} | This shows the state of the SQL driver threads. |
DEPENDENT | mysql.replication.slave_sql_running_state["{#MASTER_HOST}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Replication Seconds Behind Master {#MASTER_HOST} | Number of seconds that the slave SQL thread is behind processing the master binary log. A high number (or an increasing one) can indicate that the slave is unable to handle events from the master in a timely fashion. |
DEPENDENT | mysql.replication.seconds_behind_master["{#MASTER_HOST}"] Preprocessing: - JSONPATH: - MATCHES_REGEX: \d+ ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Replication Slave IO Running {#MASTER_HOST} | Whether the I/O thread for reading the master's binary log is running. Normally, you want this to be Yes unless you have not yet started a replication or have explicitly stopped it with STOP SLAVE. |
DEPENDENT | mysql.replication.slave_io_running["{#MASTER_HOST}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
MySQL | MySQL: Replication Slave SQL Running {#MASTER_HOST} | Whether the SQL thread for executing events in the relay log is running. As with the I/O thread, this should normally be Yes. |
DEPENDENT | mysql.replication.slave_sql_running["{#MASTER_HOST}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
MySQL | MySQL: Binlog commits | Total number of transactions committed to the binary log. |
DEPENDENT | mysql.binlog_commits[{#SINGLETON}] Preprocessing: - JSONPATH: |
MySQL | MySQL: Binlog group commits | Total number of group commits done to the binary log. |
DEPENDENT | mysql.binlog_group_commits[{#SINGLETON}] Preprocessing: - JSONPATH: |
MySQL | MySQL: Master GTID wait count | The number of times MASTER_GTID_WAIT was called. |
DEPENDENT | mysql.master_gtid_wait_count[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Master GTID wait time | Total time spent in MASTER_GTID_WAIT. |
DEPENDENT | mysql.master_gtid_wait_time[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Master GTID wait timeouts | Number of timeouts occurring in MASTER_GTID_WAIT. |
DEPENDENT | mysql.master_gtid_wait_timeouts[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
Zabbix raw items | MySQL: Get status variables | The item gets server global status information. |
ZABBIX_PASSIVE | mysql.get_status_variables["{$MYSQL.DSN}","{$MYSQL.USER}","{$MYSQL.PASSWORD}"] |
Zabbix raw items | MySQL: InnoDB buffer pool read requests | Number of logical read requests. |
DEPENDENT | mysql.innodb_buffer_pool_read_requests Preprocessing: - JSONPATH: |
Zabbix raw items | MySQL: InnoDB buffer pool reads | Number of logical reads that InnoDB could not satisfy from the buffer pool, and had to read directly from the disk. |
DEPENDENT | mysql.innodb_buffer_pool_reads Preprocessing: - JSONPATH: |
Zabbix raw items | MySQL: Replication Slave status {#MASTER_HOST} | The item gets status information on the essential parameters of the slave threads. |
ZABBIX_PASSIVE | mysql.replication.get_slave_status["{$MYSQL.DSN}","{$MYSQL.USER}","{$MYSQL.PASSWORD}","{#MASTER_HOST}"] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
MySQL: Service is down | - |
last(/MySQL by Zabbix agent 2/mysql.ping["{$MYSQL.DSN}","{$MYSQL.USER}","{$MYSQL.PASSWORD}"])=0 |
HIGH | |
MySQL: Version has changed | MySQL version has changed. Ack to close. |
last(/MySQL by Zabbix agent 2/mysql.version["{$MYSQL.DSN}","{$MYSQL.USER}","{$MYSQL.PASSWORD}"],#1)<>last(/MySQL by Zabbix agent 2/mysql.version["{$MYSQL.DSN}","{$MYSQL.USER}","{$MYSQL.PASSWORD}"],#2) and length(last(/MySQL by Zabbix agent 2/mysql.version["{$MYSQL.DSN}","{$MYSQL.USER}","{$MYSQL.PASSWORD}"]))>0 |
INFO | Manual close: YES |
MySQL: Service has been restarted | MySQL uptime is less than 10 minutes. |
last(/MySQL by Zabbix agent 2/mysql.uptime)<10m |
INFO | |
MySQL: Failed to fetch info data | Zabbix has not received data for items for the last 30 minutes. |
nodata(/MySQL by Zabbix agent 2/mysql.uptime,30m)=1 |
INFO | Depends on: - MySQL: Service is down |
MySQL: Server has aborted connections | The number of failed attempts to connect to the MySQL server is more than {$MYSQL.ABORTED_CONN.MAX.WARN} in the last 5 minutes. |
min(/MySQL by Zabbix agent 2/mysql.aborted_connects.rate,5m)>{$MYSQL.ABORTED_CONN.MAX.WARN} |
AVERAGE | Depends on: - MySQL: Refused connections |
MySQL: Refused connections | Number of refused connections due to the max_connections limit being reached. |
last(/MySQL by Zabbix agent 2/mysql.connection_errors_max_connections.rate)>0 |
AVERAGE | |
MySQL: Buffer pool utilization is too low | The buffer pool utilization is less than {$MYSQL.BUFF_UTIL.MIN.WARN}% in the last 5 minutes. This means that there is a lot of unused RAM allocated for the buffer pool, which you can easily reallocate at the moment. |
max(/MySQL by Zabbix agent 2/mysql.buffer_pool_utilization,5m)<{$MYSQL.BUFF_UTIL.MIN.WARN} |
WARNING | |
MySQL: Number of temporary files created per second is high | Possibly the application using the database is in need of query optimization. |
min(/MySQL by Zabbix agent 2/mysql.created_tmp_files.rate,5m)>{$MYSQL.CREATED_TMP_FILES.MAX.WARN} |
WARNING | |
MySQL: Number of on-disk temporary tables created per second is high | Possibly the application using the database is in need of query optimization. |
min(/MySQL by Zabbix agent 2/mysql.created_tmp_disk_tables.rate,5m)>{$MYSQL.CREATED_TMP_DISK_TABLES.MAX.WARN} |
WARNING | |
MySQL: Number of internal temporary tables created per second is high | Possibly the application using the database is in need of query optimization. |
min(/MySQL by Zabbix agent 2/mysql.created_tmp_tables.rate,5m)>{$MYSQL.CREATED_TMP_TABLES.MAX.WARN} |
WARNING | |
MySQL: Server has slow queries | The number of slow queries is more than {$MYSQL.SLOW_QUERIES.MAX.WARN} in the last 5 minutes. |
min(/MySQL by Zabbix agent 2/mysql.slow_queries.rate,5m)>{$MYSQL.SLOW_QUERIES.MAX.WARN} |
WARNING | |
MySQL: Replication lag is too high | - |
min(/MySQL by Zabbix agent 2/mysql.replication.seconds_behind_master["{#MASTER_HOST}"],5m)>{$MYSQL.REPL_LAG.MAX.WARN} |
WARNING | |
MySQL: The slave I/O thread is not running | Whether the I/O thread for reading the master's binary log is running. |
count(/MySQL by Zabbix agent 2/mysql.replication.slave_io_running["{#MASTER_HOST}"],#1,"eq","No")=1 |
AVERAGE | |
MySQL: The slave I/O thread is not connected to a replication master | - |
count(/MySQL by Zabbix agent 2/mysql.replication.slave_io_running["{#MASTER_HOST}"],#1,"ne","Yes")=1 |
WARNING | Depends on: - MySQL: The slave I/O thread is not running |
MySQL: The SQL thread is not running | Whether the SQL thread for executing events in the relay log is running. |
count(/MySQL by Zabbix agent 2/mysql.replication.slave_sql_running["{#MASTER_HOST}"],#1,"eq","No")=1 |
WARNING | Depends on: - MySQL: The slave I/O thread is not running |
For Zabbix version: 6.2 and higher
The template is developed for monitoring DBMS MySQL and its forks.
This template was tested on:
See Zabbix template operation for basic instructions.
Add the mysql and mysqladmin utilities to the global environment variable PATH. Copy template_db_mysql.conf into the folder with the Zabbix agent configuration (/etc/zabbix/zabbix_agentd.d/ by default). Don't forget to restart the Zabbix agent.
Create a MySQL user for monitoring (set <password> at your discretion):
CREATE USER 'zbx_monitor'@'%' IDENTIFIED BY '<password>';
GRANT REPLICATION CLIENT,PROCESS,SHOW DATABASES,SHOW VIEW ON *.* TO 'zbx_monitor'@'%';
For more information, please see MySQL documentation https://dev.mysql.com/doc/refman/8.0/en/grant.html
Create the file .my.cnf in the home directory of the Zabbix agent for Linux (/var/lib/zabbix by default) or my.cnf in c:\ for Windows. The file must contain the following three lines:
[client]
user='zbx_monitor'
password='<password>'
NOTE: Use systemd to start the Zabbix agent on Linux. For example, on CentOS use "systemctl edit zabbix-agent.service" to set the required user for starting the Zabbix agent.
Add the rule to the SELinux policy (example for CentOS):
# cat <<EOF > zabbix_home.te
module zabbix_home 1.0;
require {
type zabbix_agent_t;
type zabbix_var_lib_t;
type mysqld_etc_t;
type mysqld_port_t;
type mysqld_var_run_t;
class file { open read };
class tcp_socket name_connect;
class sock_file write;
}
#============= zabbix_agent_t ==============
allow zabbix_agent_t zabbix_var_lib_t:file read;
allow zabbix_agent_t zabbix_var_lib_t:file open;
allow zabbix_agent_t mysqld_etc_t:file read;
allow zabbix_agent_t mysqld_port_t:tcp_socket name_connect;
allow zabbix_agent_t mysqld_var_run_t:sock_file write;
EOF
# checkmodule -M -m -o zabbix_home.mod zabbix_home.te
# semodule_package -o zabbix_home.pp -m zabbix_home.mod
# semodule -i zabbix_home.pp
# restorecon -R /var/lib/zabbix
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$MYSQL.ABORTED_CONN.MAX.WARN} | The number of failed attempts to connect to the MySQL server for trigger expression. |
3 |
{$MYSQL.BUFF_UTIL.MIN.WARN} | The minimum buffer pool utilization in percentage for trigger expression. |
50 |
{$MYSQL.CREATED_TMP_DISK_TABLES.MAX.WARN} | The maximum number of created tmp tables on a disk per second for trigger expressions. |
10 |
{$MYSQL.CREATED_TMP_FILES.MAX.WARN} | The maximum number of created tmp files on a disk per second for trigger expressions. |
10 |
{$MYSQL.CREATED_TMP_TABLES.MAX.WARN} | The maximum number of created tmp tables in memory per second for trigger expressions. |
30 |
{$MYSQL.HOST} | Hostname or IP of MySQL host or container. |
127.0.0.1 |
{$MYSQL.INNODB_LOG_FILES} | Number of physical files in the InnoDB redo log for calculating innodb_log_file_size. |
2 |
{$MYSQL.PORT} | MySQL service port. |
3306 |
{$MYSQL.REPL_LAG.MAX.WARN} | The lag of slave from master for trigger expression. |
30m |
{$MYSQL.SLOW_QUERIES.MAX.WARN} | The number of slow queries for trigger expression. |
3 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Database discovery | Scanning databases in DBMS. |
ZABBIX_PASSIVE | mysql.db.discovery["{$MYSQL.HOST}","{$MYSQL.PORT}"] Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND_OR - {#DBNAME} NOT MATCHES_REGEX information_schema |
MariaDB discovery | Additional metrics if MariaDB is used. |
DEPENDENT | mysql.extra_metric.discovery Preprocessing: - JAVASCRIPT: |
Replication discovery | If "show slave status" returns Master_Host, "Replication: *" items are created. |
ZABBIX_PASSIVE | mysql.replication.discovery["{$MYSQL.HOST}","{$MYSQL.PORT}"] Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
MySQL | MySQL: Status | ZABBIX_PASSIVE | mysql.ping["{$MYSQL.HOST}","{$MYSQL.PORT}"] Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
|
MySQL | MySQL: Version | ZABBIX_PASSIVE | mysql.version["{$MYSQL.HOST}","{$MYSQL.PORT}"] Preprocessing: - REGEX: - DISCARD_UNCHANGED_HEARTBEAT: |
|
MySQL | MySQL: Uptime | The number of seconds that the server has been up. |
DEPENDENT | mysql.uptime Preprocessing: - XMLPATH: |
MySQL | MySQL: Aborted clients per second | Number of connections that were aborted because the client died without closing the connection properly. |
DEPENDENT | mysql.aborted_clients.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Aborted connections per second | Number of failed attempts to connect to the MySQL server. |
DEPENDENT | mysql.aborted_connects.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors accept per second | Number of errors that occurred during calls to accept() on the listening port. |
DEPENDENT | mysql.connection_errors_accept.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors internal per second | Number of refused connections due to internal server errors, for example, out of memory errors, or failed thread starts. |
DEPENDENT | mysql.connection_errors_internal.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors max connections per second | Number of refused connections due to the max_connections limit being reached. |
DEPENDENT | mysql.connection_errors_max_connections.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors peer address per second | Number of errors while searching for the connecting client IP address. |
DEPENDENT | mysql.connection_errors_peer_address.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors select per second | Number of errors during calls to select() or poll() on the listening port. The client would not necessarily have been rejected in these cases. |
DEPENDENT | mysql.connection_errors_select.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connection errors tcpwrap per second | Number of connections the libwrap library has refused. |
DEPENDENT | mysql.connection_errors_tcpwrap.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Connections per second | Number of connection attempts (successful or not) to the MySQL server. |
DEPENDENT | mysql.connections.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Max used connections | The maximum number of connections that have been in use simultaneously since the server start. |
DEPENDENT | mysql.max_used_connections Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Threads cached | Number of threads in the thread cache. |
DEPENDENT | mysql.threads_cached Preprocessing: - XMLPATH: |
MySQL | MySQL: Threads connected | Number of currently open connections. |
DEPENDENT | mysql.threads_connected Preprocessing: - XMLPATH: |
MySQL | MySQL: Threads created per second | Number of threads created to handle connections. If Threads_created is big, you may want to increase the thread_cache_size value. The cache miss rate can be calculated as Threads_created/Connections. |
DEPENDENT | mysql.threads_created.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
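The thread cache miss rate mentioned above is a simple ratio of two status counters. A minimal sketch (the counter values below are hypothetical; in practice they come from `SHOW GLOBAL STATUS`):

```python
# Hypothetical samples of the Threads_created and Connections
# status counters from "SHOW GLOBAL STATUS".
threads_created = 120
connections = 4800

# Cache miss rate as described above: Threads_created / Connections.
miss_rate = threads_created / connections
print(f"thread cache miss rate: {miss_rate:.2%}")  # 2.50%
```

A persistently high ratio suggests raising thread_cache_size.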
MySQL | MySQL: Threads running | Number of threads which are not sleeping. |
DEPENDENT | mysql.threads_running Preprocessing: - XMLPATH: |
MySQL | MySQL: Buffer pool efficiency | The item shows how effectively the buffer pool is serving reads. |
CALCULATED | mysql.buffer_pool_efficiency Expression: last(//mysql.innodb_buffer_pool_reads) / ( last(//mysql.innodb_buffer_pool_read_requests) + ( last(//mysql.innodb_buffer_pool_read_requests) = 0 ) ) * 100 * ( last(//mysql.innodb_buffer_pool_read_requests) > 0 ) |
MySQL | MySQL: Buffer pool utilization | Ratio of used to total pages in the buffer pool. |
CALCULATED | mysql.buffer_pool_utilization Expression: ( last(//mysql.innodb_buffer_pool_pages_total) - last(//mysql.innodb_buffer_pool_pages_free) ) / ( last(//mysql.innodb_buffer_pool_pages_total) + ( last(//mysql.innodb_buffer_pool_pages_total) = 0 ) ) * 100 * ( last(//mysql.innodb_buffer_pool_pages_total) > 0 ) |
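Both calculated expressions above use the same division-by-zero guard: the boolean `(denominator = 0)` adds 1 to a zero denominator, and the `(denominator > 0)` multiplier forces the whole result to 0 in that case. A Python sketch of the pattern (the page counts are made-up examples):

```python
def guarded_ratio(numerator: float, denominator: float) -> float:
    """Mimic the Zabbix calculated-item guard: when the denominator is 0,
    the boolean (denominator == 0) bumps it to 1 so the division succeeds,
    and the (denominator > 0) multiplier zeroes out the result."""
    return numerator / (denominator + (denominator == 0)) * 100 * (denominator > 0)

# Buffer pool utilization: (total - free) / total * 100
pages_total, pages_free = 8192, 1024
print(guarded_ratio(pages_total - pages_free, pages_total))  # 87.5
print(guarded_ratio(5, 0))                                   # 0 (guarded)
```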
MySQL | MySQL: Created tmp files on disk per second | How many temporary files mysqld has created. |
DEPENDENT | mysql.created_tmp_files.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Created tmp tables on disk per second | Number of internal on-disk temporary tables created by the server while executing statements. |
DEPENDENT | mysql.created_tmp_disk_tables.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Created tmp tables on memory per second | Number of internal temporary tables created by the server while executing statements. |
DEPENDENT | mysql.created_tmp_tables.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: InnoDB buffer pool pages free | The number of free pages in the InnoDB buffer pool. |
DEPENDENT | mysql.innodb_buffer_pool_pages_free Preprocessing: - XMLPATH: |
MySQL | MySQL: InnoDB buffer pool pages total | The total size of the InnoDB buffer pool, in pages. |
DEPENDENT | mysql.innodb_buffer_pool_pages_total Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: InnoDB buffer pool read requests per second | Number of logical read requests per second. |
DEPENDENT | mysql.innodb_buffer_pool_read_requests.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: InnoDB buffer pool reads per second | Number of logical reads per second that InnoDB could not satisfy from the buffer pool, and had to read directly from the disk. |
DEPENDENT | mysql.innodb_buffer_pool_reads.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: InnoDB row lock time | The total time spent in acquiring row locks for InnoDB tables, in milliseconds. |
DEPENDENT | mysql.innodb_row_lock_time Preprocessing: - XMLPATH: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
MySQL | MySQL: InnoDB row lock time max | The maximum time to acquire a row lock for InnoDB tables, in milliseconds. |
DEPENDENT | mysql.innodb_row_lock_time_max Preprocessing: - XMLPATH: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: InnoDB row lock waits | Number of times operations on InnoDB tables had to wait for a row lock. |
DEPENDENT | mysql.innodb_row_lock_waits Preprocessing: - XMLPATH: |
MySQL | MySQL: Slow queries per second | Number of queries that have taken more than long_query_time seconds. |
DEPENDENT | mysql.slow_queries.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Bytes received | Number of bytes received from all clients. |
DEPENDENT | mysql.bytes_received.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Bytes sent | Number of bytes sent to all clients. |
DEPENDENT | mysql.bytes_sent.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Command Delete per second | The Com_delete counter variable indicates the number of times the delete statement has been executed. |
DEPENDENT | mysql.com_delete.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Command Insert per second | The Com_insert counter variable indicates the number of times the insert statement has been executed. |
DEPENDENT | mysql.com_insert.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Command Select per second | The Com_select counter variable indicates the number of times the select statement has been executed. |
DEPENDENT | mysql.com_select.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Command Update per second | The Com_update counter variable indicates the number of times the update statement has been executed. |
DEPENDENT | mysql.com_update.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Queries per second | Number of statements executed by the server. This variable includes statements executed within stored programs, unlike the Questions variable. |
DEPENDENT | mysql.queries.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Questions per second | Number of statements executed by the server. This includes only statements sent to the server by clients and not statements executed within stored programs, unlike the Queries variable. |
DEPENDENT | mysql.questions.rate Preprocessing: - XMLPATH: - CHANGE_PER_SECOND |
MySQL | MySQL: Binlog cache disk use | Number of transactions that used a temporary disk cache because they could not fit in the regular binary log cache, being larger than binlog_cache_size. |
DEPENDENT | mysql.binlog_cache_disk_use Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Innodb buffer pool wait free | Number of times InnoDB waited for a free page before reading or creating a page. Normally, writes to the InnoDB buffer pool happen in the background. When no clean pages are available, dirty pages are flushed first in order to free some up. This counts the number of waits for this operation to finish. If this value is not small, consider increasing innodb_buffer_pool_size. |
DEPENDENT | mysql.innodb_buffer_pool_wait_free Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Innodb number open files | Number of open files held by InnoDB. InnoDB only. |
DEPENDENT | mysql.innodb_num_open_files Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Open table definitions | Number of cached table definitions. |
DEPENDENT | mysql.open_table_definitions Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Open tables | Number of tables that are open. |
DEPENDENT | mysql.open_tables Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Innodb log written | Number of bytes written to the InnoDB log. |
DEPENDENT | mysql.innodb_os_log_written Preprocessing: - XMLPATH: |
MySQL | MySQL: Calculated value of innodb_log_file_size | Calculated as (innodb_os_log_written - innodb_os_log_written(time shift -1h)) / {$MYSQL.INNODB_LOG_FILES}, the recommended value of innodb_log_file_size. Innodb_log_file_size is the size in bytes of each InnoDB redo log file in the log group. The combined size can be no more than 512GB. Larger values mean less disk I/O due to less flushing checkpoint activity, but also slower recovery from a crash. |
CALCULATED | mysql.innodb_log_file_size Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 6h Expression: (last(//mysql.innodb_os_log_written) - last(//mysql.innodb_os_log_written,#1:now-1h)) / {$MYSQL.INNODB_LOG_FILES} |
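The calculated item above sizes the redo log from one hour of write volume divided across the log files. A sketch with hypothetical counter samples:

```python
# Hypothetical samples of innodb_os_log_written (bytes) taken now and
# one hour ago, plus the number of redo log files in the group
# ({$MYSQL.INNODB_LOG_FILES} in the template).
os_log_written_now = 6_442_450_944
os_log_written_1h_ago = 2_147_483_648
innodb_log_files = 2

# Same arithmetic as the calculated item: bytes written in the last
# hour, split across the files in the log group.
estimated_size = (os_log_written_now - os_log_written_1h_ago) / innodb_log_files
print(estimated_size)  # 2147483648.0 (i.e. 2 GiB per file)
```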
MySQL | MySQL: Size of database {#DBNAME} | - |
ZABBIX_PASSIVE | mysql.dbsize["{$MYSQL.HOST}","{$MYSQL.PORT}","{#DBNAME}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Replication Slave SQL Running State {#MASTER_HOST} | This shows the state of the SQL driver threads. |
DEPENDENT | mysql.slave_sql_running_state["{#MASTER_HOST}"] Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Replication Seconds Behind Master {#MASTER_HOST} | The number of seconds that the slave SQL thread is behind processing the master binary log. A high number (or an increasing one) can indicate that the slave is unable to handle events from the master in a timely fashion. |
DEPENDENT | mysql.seconds_behind_master["{#MASTER_HOST}"] Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: - NOT_MATCHES_REGEX: ⛔️ON_FAIL: |
MySQL | MySQL: Replication Slave IO Running {#MASTER_HOST} | Whether the I/O thread for reading the master's binary log is running. Normally, you want this to be Yes unless you have not yet started replication or have explicitly stopped it with STOP SLAVE. |
DEPENDENT | mysql.slave_io_running["{#MASTER_HOST}"] Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Replication Slave SQL Running {#MASTER_HOST} | Whether the SQL thread for executing events in the relay log is running. As with the I/O thread, this should normally be Yes. |
DEPENDENT | mysql.slave_sql_running["{#MASTER_HOST}"] Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MySQL | MySQL: Binlog commits | Total number of transactions committed to the binary log. |
DEPENDENT | mysql.binlog_commits[{#SINGLETON}] Preprocessing: - XMLPATH: |
MySQL | MySQL: Binlog group commits | Total number of group commits done to the binary log. |
DEPENDENT | mysql.binlog_group_commits[{#SINGLETON}] Preprocessing: - XMLPATH: |
MySQL | MySQL: Master GTID wait count | The number of times MASTER_GTID_WAIT was called. |
DEPENDENT | mysql.master_gtid_wait_count[{#SINGLETON}] Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Master GTID wait time | Total time spent in MASTER_GTID_WAIT. |
DEPENDENT | mysql.master_gtid_wait_time[{#SINGLETON}] Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
MySQL | MySQL: Master GTID wait timeouts | Number of timeouts occurring in MASTER_GTID_WAIT. |
DEPENDENT | mysql.master_gtid_wait_timeouts[{#SINGLETON}] Preprocessing: - XMLPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
Zabbix raw items | MySQL: Get status variables | The item gets server global status information. |
ZABBIX_PASSIVE | mysql.get_status_variables["{$MYSQL.HOST}","{$MYSQL.PORT}"] |
Zabbix raw items | MySQL: InnoDB buffer pool read requests | Number of logical read requests. |
DEPENDENT | mysql.innodb_buffer_pool_read_requests Preprocessing: - XMLPATH: |
Zabbix raw items | MySQL: InnoDB buffer pool reads | Number of logical reads that InnoDB could not satisfy from the buffer pool, and had to read directly from the disk. |
DEPENDENT | mysql.innodb_buffer_pool_reads Preprocessing: - XMLPATH: |
Zabbix raw items | MySQL: Replication Slave status {#MASTER_HOST} | The item gets status information on the essential parameters of the slave threads. |
ZABBIX_PASSIVE | mysql.slave_status["{$MYSQL.HOST}","{$MYSQL.PORT}","{#MASTER_HOST}"] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
MySQL: Service is down | - |
last(/MySQL by Zabbix agent/mysql.ping["{$MYSQL.HOST}","{$MYSQL.PORT}"])=0 |
HIGH | |
MySQL: Version has changed | MySQL version has changed. Ack to close. |
last(/MySQL by Zabbix agent/mysql.version["{$MYSQL.HOST}","{$MYSQL.PORT}"],#1)<>last(/MySQL by Zabbix agent/mysql.version["{$MYSQL.HOST}","{$MYSQL.PORT}"],#2) and length(last(/MySQL by Zabbix agent/mysql.version["{$MYSQL.HOST}","{$MYSQL.PORT}"]))>0 |
INFO | Manual close: YES |
MySQL: Service has been restarted | MySQL uptime is less than 10 minutes. |
last(/MySQL by Zabbix agent/mysql.uptime)<10m |
INFO | |
MySQL: Failed to fetch info data | Zabbix has not received data for items for the last 30 minutes. |
nodata(/MySQL by Zabbix agent/mysql.uptime,30m)=1 |
INFO | Depends on: - MySQL: Service is down |
MySQL: Server has aborted connections | The number of failed attempts to connect to the MySQL server is more than {$MYSQL.ABORTED_CONN.MAX.WARN} in the last 5 minutes. |
min(/MySQL by Zabbix agent/mysql.aborted_connects.rate,5m)>{$MYSQL.ABORTED_CONN.MAX.WARN} |
AVERAGE | Depends on: - MySQL: Refused connections |
MySQL: Refused connections | Number of refused connections due to the max_connections limit being reached. |
last(/MySQL by Zabbix agent/mysql.connection_errors_max_connections.rate)>0 |
AVERAGE | |
MySQL: Buffer pool utilization is too low | The buffer pool utilization is less than {$MYSQL.BUFF_UTIL.MIN.WARN}% in the last 5 minutes. This means that there is a lot of unused RAM allocated for the buffer pool, which you can easily reallocate at the moment. |
max(/MySQL by Zabbix agent/mysql.buffer_pool_utilization,5m)<{$MYSQL.BUFF_UTIL.MIN.WARN} |
WARNING | |
MySQL: Number of temporary files created per second is high | Possibly the application using the database is in need of query optimization. |
min(/MySQL by Zabbix agent/mysql.created_tmp_files.rate,5m)>{$MYSQL.CREATED_TMP_FILES.MAX.WARN} |
WARNING | |
MySQL: Number of on-disk temporary tables created per second is high | Possibly the application using the database is in need of query optimization. |
min(/MySQL by Zabbix agent/mysql.created_tmp_disk_tables.rate,5m)>{$MYSQL.CREATED_TMP_DISK_TABLES.MAX.WARN} |
WARNING | |
MySQL: Number of internal temporary tables created per second is high | Possibly the application using the database is in need of query optimization. |
min(/MySQL by Zabbix agent/mysql.created_tmp_tables.rate,5m)>{$MYSQL.CREATED_TMP_TABLES.MAX.WARN} |
WARNING | |
MySQL: Server has slow queries | The number of slow queries is more than {$MYSQL.SLOW_QUERIES.MAX.WARN} in the last 5 minutes. |
min(/MySQL by Zabbix agent/mysql.slow_queries.rate,5m)>{$MYSQL.SLOW_QUERIES.MAX.WARN} |
WARNING | |
MySQL: Replication lag is too high | - |
min(/MySQL by Zabbix agent/mysql.seconds_behind_master["{#MASTER_HOST}"],5m)>{$MYSQL.REPL_LAG.MAX.WARN} |
WARNING | |
MySQL: The slave I/O thread is not running | Whether the I/O thread for reading the master's binary log is running. |
count(/MySQL by Zabbix agent/mysql.slave_io_running["{#MASTER_HOST}"],#1,"eq","No")=1 |
AVERAGE | |
MySQL: The slave I/O thread is not connected to a replication master | - |
count(/MySQL by Zabbix agent/mysql.slave_io_running["{#MASTER_HOST}"],#1,"ne","Yes")=1 |
WARNING | Depends on: - MySQL: The slave I/O thread is not running |
MySQL: The SQL thread is not running | Whether the SQL thread for executing events in the relay log is running. |
count(/MySQL by Zabbix agent/mysql.slave_sql_running["{#MASTER_HOST}"],#1,"eq","No")=1 |
WARNING | Depends on: - MySQL: The slave I/O thread is not running |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template is developed for monitoring DBMS Microsoft SQL Server via ODBC.
This template was tested on:
See Zabbix template operation for basic instructions.
GRANT SELECT ON OBJECT::msdb.dbo.sysjobs TO zbx_monitor
GRANT SELECT ON OBJECT::msdb.dbo.sysjobservers TO zbx_monitor
GRANT SELECT ON OBJECT::msdb.dbo.sysjobactivity TO zbx_monitor
GRANT EXECUTE ON OBJECT::msdb.dbo.agent_datetime TO zbx_monitor
For more information, see MSSQL documentation:
Create a database user
GRANT Server Permissions
Configure a User to Create and Manage SQL Server Agent Jobs
For a named instance, set the value of the {$MSSQL.INSTANCE} macro as MSSQL$instance name. If MSSQL was installed using the default configuration, do not change the {$MSSQL.INSTANCE} macro value.
The "Service's TCP port state" item uses the {HOST.CONN} and {$MSSQL.PORT} macros to check the availability of the MSSQL instance. If your instance uses a non-default TCP port, set the port in the DSN section of odbc.ini in the line Server = IP or FQDN name, port.
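For illustration, a minimal odbc.ini entry following the Server = address, port convention described above (the DSN name, driver string, and address are placeholders — adjust them to your environment):

```ini
; Example odbc.ini DSN entry (placeholder names and address)
[mssql_dsn]
Driver = ODBC Driver 18 for SQL Server
; For a non-default TCP port, append it after the address:
Server = 192.0.2.10, 1433
```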
No specific Zabbix configuration is required.
Name | Description | Default | |||
---|---|---|---|---|---|
{$MSSQL.AVERAGE_WAIT_TIME.MAX} | The maximum average wait time in ms - for the trigger expression. |
500 |
|||
{$MSSQL.BACKUP_DIFF.CRIT} | The maximum days without a differential backup - for the High trigger expression. |
6d |
|||
{$MSSQL.BACKUP_DIFF.WARN} | The maximum days without a differential backup - for the Warning trigger expression. |
3d |
|||
{$MSSQL.BACKUP_DURATION.WARN} | The maximum job duration - for the Warning trigger expression. |
1h |
|||
{$MSSQL.BACKUP_FULL.CRIT} | The maximum days without a full backup - for the High trigger expression. |
10d |
|||
{$MSSQL.BACKUP_FULL.WARN} | The maximum days without a full backup - for the Warning trigger expression. |
9d |
|||
{$MSSQL.BACKUP_LOG.CRIT} | The maximum days without a log backup - for the High trigger expression. |
8h |
|||
{$MSSQL.BACKUP_LOG.WARN} | The maximum days without a log backup - for the Warning trigger expression. |
4h |
|||
{$MSSQL.BUFFER_CACHE_RATIO.MIN.CRIT} | The minimum % buffer cache hit ratio - for the High trigger expression. |
30 |
|||
{$MSSQL.BUFFER_CACHE_RATIO.MIN.WARN} | The minimum % buffer cache hit ratio - for the Warning trigger expression. |
50 |
|||
{$MSSQL.DBNAME.MATCHES} | This macro is used in database discovery. It can be overridden on a host or linked template level. |
.* |
|||
{$MSSQL.DBNAME.NOT_MATCHES} | This macro is used in database discovery. It can be overridden on a host or linked template level. |
`master|tempdb|model|msdb` |
{$MSSQL.DEADLOCKS.MAX} | The maximum deadlocks per second - for the trigger expression. |
1 |
|||
{$MSSQL.DSN} | System data source name. |
<Put your DSN here> |
|||
{$MSSQL.FREE_LIST_STALLS.MAX} | The maximum free list stalls per second - for the trigger expression. |
2 |
|||
{$MSSQL.INSTANCE} | The instance name for the default instance is SQLServer. For a named instance, set the macro value as MSSQL$instance name. |
SQLServer |
|||
{$MSSQL.JOB.MATCHES} | This macro is used in job discovery. It can be overridden on a host or linked template level. |
.* |
|||
{$MSSQL.JOB.NOT_MATCHES} | This macro is used in job discovery. It can be overridden on a host or linked template level. |
CHANGE_IF_NEEDED |
|||
{$MSSQL.LAZY_WRITES.MAX} | The maximum lazy writes per second - for the trigger expression. |
20 |
|||
{$MSSQL.LOCK_REQUESTS.MAX} | The maximum lock requests per second - for the trigger expression. |
1000 |
|||
{$MSSQL.LOCK_TIMEOUTS.MAX} | The maximum lock timeouts per second - for the trigger expression. |
1 |
|||
{$MSSQL.LOG_FLUSH_WAITS.MAX} | The maximum log flush waits per second - for the trigger expression. |
1 |
|||
{$MSSQL.LOG_FLUSH_WAIT_TIME.MAX} | The maximum log flush wait time in ms - for the trigger expression. |
1 |
|||
{$MSSQL.PAGE_LIFE_EXPECTANCY.MIN} | The minimum page life expectancy - for the trigger expression. |
300 |
|||
{$MSSQL.PAGE_READS.MAX} | The maximum page reads per second - for the trigger expression. |
90 |
|||
{$MSSQL.PAGE_WRITES.MAX} | The maximum page writes per second - for the trigger expression. |
90 |
|||
{$MSSQL.PASSWORD} | MSSQL user password. |
<Put your password here> |
|||
{$MSSQL.PERCENT_COMPILATIONS.MAX} | The maximum percentage of Transact-SQL compilations - for the trigger expression. |
10 |
|||
{$MSSQL.PERCENT_LOG_USED.MAX} | The maximum percentage of log used - for the trigger expression. |
80 |
|||
{$MSSQL.PERCENT_READAHEAD.MAX} | The maximum percentage of pages read/sec in anticipation of use - for the trigger expression. |
20 |
|||
{$MSSQL.PERCENT_RECOMPILATIONS.MAX} | The maximum percentage of Transact-SQL recompilations - for the trigger expression. |
10 |
|||
{$MSSQL.PORT} | MSSQL TCP port. |
1433 |
|||
{$MSSQL.USER} | MSSQL username. |
<Put your username here> |
|||
{$MSSQL.WORKTABLES_FROM_CACHE_RATIO.MIN.CRIT} | The minimum percentage of the worktables from cache ratio - for the High trigger expression. |
90 |
|||
{$MSSQL.WORK_FILES.MAX} | The maximum number of work files created per second - for the trigger expression. |
20 |
|||
{$MSSQL.WORK_TABLES.MAX} | The maximum number of work tables created per second - for the trigger expression. |
20 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Availability groups discovery | Discovery of the existing availability groups. |
ODBC | db.odbc.discovery[availabilitygroups,"{$MSSQL.DSN}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Database discovery | Scanning databases in DBMS. |
ODBC | db.odbc.discovery[dbname,"{$MSSQL.DSN}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#DBNAME} MATCHES_REGEX - {#DBNAME} NOT_MATCHES_REGEX |
Job discovery | Scanning jobs in DBMS. |
ODBC | db.odbc.discovery[jobname,"{$MSSQL.DSN}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#JOBNAME} MATCHES_REGEX - {#JOBNAME} NOT_MATCHES_REGEX |
Local database discovery | Discovery of the local availability databases. |
ODBC | db.odbc.discovery[localdb,"{$MSSQL.DSN}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Mirroring discovery | To see the row for a database other than master or tempdb, you must either be the database owner or have at least ALTER ANY DATABASE or VIEW ANY DATABASE server-level permission or CREATE DATABASE permission in the master database. To see non-NULL values on a mirror database, you must be a member of the sysadmin fixed server role. |
ODBC | db.odbc.discovery[mirrors,"{$MSSQL.DSN}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Non-local database discovery | Discovery of the non-local (not local to the SQL Server instance) availability databases. |
ODBC | db.odbc.discovery[non-localdb,"{$MSSQL.DSN}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Replication discovery | Discovery of the database replicas. |
ODBC | db.odbc.discovery[replicas,"{$MSSQL.DSN}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Group | Name | Description | Type | Key and additional info | ||||||
---|---|---|---|---|---|---|---|---|---|---|
MSSQL | MSSQL: Service's TCP port state | Test the availability of MS SQL Server on a TCP port. |
SIMPLE | net.tcp.service[tcp,{HOST.CONN},{$MSSQL.PORT}] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL: Version | MS SQL Server version. |
DEPENDENT | mssql.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL: Uptime | MS SQL Server uptime in 'N days, hh:mm:ss' format. |
DEPENDENT | mssql.uptime Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Forwarded records per second | Number of records per second fetched through forwarded record pointers. |
DEPENDENT | mssql.forwarded_records_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Full scans per second | Number of unrestricted full scans per second. These can be either base-table or full-index scans. Values greater than 1 or 2 indicate table or index page scans. If that is combined with high CPU, this counter requires further investigation; otherwise, if the full scans are on small tables, it can be ignored. |
DEPENDENT | mssql.full_scans_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Index searches per second | Number of index searches per second. These are used to start a range scan, reposition a range scan, revalidate a scan point, fetch a single index record, and search down the index to locate where to insert a new row. |
DEPENDENT | mssql.index_searches_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Page splits per second | Number of page splits per second that occur as the result of overflowing index pages. |
DEPENDENT | mssql.page_splits_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Work files created per second | Number of work files created per second. For example, work files can be used to store temporary results for hash joins and hash aggregates. |
DEPENDENT | mssql.work_files_created_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Work tables created per second | Number of work tables created per second. For example, work tables can be used to store temporary results for query spool, lob variables, XML variables, and cursors. |
DEPENDENT | mssql.work_tables_created_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Table lock escalations per second | Number of times locks on a table were escalated to the TABLE or HoBT granularity. |
DEPENDENT | mssql.table_lock_escalations.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Worktables from cache ratio | Percentage of work tables created where the initial two pages of the work table were not allocated but were immediately available from the work table cache. |
DEPENDENT | mssql.worktables_from_cache_ratio Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Buffer cache hit ratio | Indicates the percentage of pages found in the buffer cache without having to read from disk. The ratio is the total number of cache hits divided by the total number of cache lookups over the last few thousand page accesses. After a long period of time, the ratio changes very little. Since reading from the cache is much less expensive than reading from the disk, a higher value is preferred for this item. To increase the buffer cache hit ratio, consider increasing the amount of memory available to SQL Server or using the buffer pool extension feature. |
DEPENDENT | mssql.buffer_cache_hit_ratio Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Checkpoint pages per second | Indicates the number of pages flushed to disk per second by a checkpoint or other operation which required all dirty pages to be flushed. |
DEPENDENT | mssql.checkpoint_pages_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Database pages | Indicates the number of pages in the buffer pool with database content. |
DEPENDENT | mssql.database_pages Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Free list stalls per second | Indicates the number of requests per second that had to wait for a free page. |
DEPENDENT | mssql.free_list_stalls_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Lazy writes per second | Indicates the number of buffers written per second by the buffer manager's lazy writer. The lazy writer is a system process that flushes out batches of dirty, aged buffers (buffers that contain changes that must be written back to disk before the buffer can be reused for a different page) and makes them available to user processes. The lazy writer eliminates the need to perform frequent checkpoints in order to create available buffers. |
DEPENDENT | mssql.lazy_writes_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Page life expectancy | Indicates the number of seconds a page will stay in the buffer pool without references. |
DEPENDENT | mssql.page_life_expectancy Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Page lookups per second | Indicates the number of requests per second to find a page in the buffer pool. |
DEPENDENT | mssql.page_lookups_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Page reads per second | Indicates the number of physical database page reads that are issued per second. This statistic displays the total number of physical page reads across all databases. Because physical I/O is expensive, you may be able to minimize the cost, either by using a larger data cache, intelligent indexes, and more efficient queries, or by changing the database design. |
DEPENDENT | mssql.page_reads_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Page writes per second | Indicates the number of physical database page writes that are issued per second. |
DEPENDENT | mssql.page_writes_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Read-ahead pages per second | Indicates the number of pages read per second in anticipation of use. |
DEPENDENT | mssql.readahead_pages_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Target pages | The optimal number of pages in the buffer pool. |
DEPENDENT | mssql.target_pages Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Total data file size | Total size of all data files. |
DEPENDENT | mssql.data_files_size Preprocessing: - JSONPATH: - MULTIPLIER: |
||||||
MSSQL | MSSQL: Total log file size | Total size of all the transaction log files. |
DEPENDENT | mssql.log_files_size Preprocessing: - JSONPATH: - MULTIPLIER: |
||||||
MSSQL | MSSQL: Total log file used size | The cumulative used size of all the log files in the database. |
DEPENDENT | mssql.log_files_used_size Preprocessing: - JSONPATH: - MULTIPLIER: |
||||||
MSSQL | MSSQL: Total transactions per second | Total number of transactions started for all databases per second. |
DEPENDENT | mssql.transactions_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Logins per second | Total number of logins started per second. This does not include pooled connections. Any value over 2 may indicate insufficient connection pooling. |
DEPENDENT | mssql.logins_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Logouts per second | Total number of logout operations started per second. Any value over 2 may indicate insufficient connection pooling. |
DEPENDENT | mssql.logouts_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Number of blocked processes | Number of currently blocked processes. |
DEPENDENT | mssql.processes_blocked Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Number users connected | Number of users connected to MS SQL Server. |
DEPENDENT | mssql.user_connections Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Average latch wait time | Average latch wait time (in milliseconds) for latch requests that had to wait. |
CALCULATED | mssql.average_latch_wait_time Expression: (last(//mssql.average_latch_wait_time_raw) - last(//mssql.average_latch_wait_time_raw,#2)) / (last(//mssql.average_latch_wait_time_base) - last(//mssql.average_latch_wait_time_base,#2) + (last(//mssql.average_latch_wait_time_base) - last(//mssql.average_latch_wait_time_base,#2)=0)) |
||||||
MSSQL | MSSQL: Latch waits per second | The number of latch requests that could not be granted immediately. Latches are lightweight means of holding a very transient server resource, such as an address in memory. |
DEPENDENT | mssql.latch_waits_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Total latch wait time | Total latch wait time (in milliseconds) for latch requests in the last second. This value should stay stable compared to the number of latch waits per second. |
DEPENDENT | mssql.total_latch_wait_time Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Total average wait time | The average wait time, in milliseconds, for each lock request that had to wait. |
CALCULATED | mssql.average_wait_time Expression: (last(//mssql.average_wait_time_raw) - last(//mssql.average_wait_time_raw,#2)) / (last(//mssql.average_wait_time_base) - last(//mssql.average_wait_time_base,#2) + (last(//mssql.average_wait_time_base) - last(//mssql.average_wait_time_base,#2)=0)) |
||||||
MSSQL | MSSQL: Total lock requests per second | Number of new locks and lock conversions per second requested from the lock manager. |
DEPENDENT | mssql.lock_requests_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Total lock requests per second that timed out | Number of timed out lock requests per second, including requests for NOWAIT locks. |
DEPENDENT | mssql.lock_timeouts_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Total lock requests per second that required waiting | Number of lock requests per second that required the caller to wait. |
DEPENDENT | mssql.lock_waits_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Lock wait time | Average of total wait time (in milliseconds) for locks in the last second. |
DEPENDENT | mssql.lock_wait_time Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Total lock requests per second that have deadlocks | Number of lock requests per second that resulted in a deadlock. |
DEPENDENT | mssql.number_deadlocks_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Granted Workspace Memory | Specifies the total amount of memory currently granted to executing processes, such as hash, sort, bulk copy, and index creation operations. |
DEPENDENT | mssql.granted_workspace_memory Preprocessing: - JSONPATH: - MULTIPLIER: |
||||||
MSSQL | MSSQL: Maximum workspace memory | Indicates the maximum amount of memory available for executing processes, such as hash, sort, bulk copy, and index creation operations. |
DEPENDENT | mssql.maximum_workspace_memory Preprocessing: - JSONPATH: - MULTIPLIER: |
||||||
MSSQL | MSSQL: Memory grants outstanding | Specifies the total number of processes that have successfully acquired a workspace memory grant. |
DEPENDENT | mssql.memory_grants_outstanding Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Memory grants pending | Specifies the total number of processes waiting for a workspace memory grant. |
DEPENDENT | mssql.memory_grants_pending Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Target server memory | Indicates the ideal amount of memory the server can consume. |
DEPENDENT | mssql.target_server_memory Preprocessing: - JSONPATH: - MULTIPLIER: |
||||||
MSSQL | MSSQL: Total server memory | Specifies the amount of memory the server has committed using the memory manager. |
DEPENDENT | mssql.total_server_memory Preprocessing: - JSONPATH: - MULTIPLIER: |
||||||
MSSQL | MSSQL: Cache hit ratio | Ratio between cache hits and lookups. |
DEPENDENT | mssql.cache_hit_ratio Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Cache object counts | Number of cache objects in the cache. |
DEPENDENT | mssql.cache_object_counts Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Cache objects in use | Number of cache objects in use. |
DEPENDENT | mssql.cache_objects_in_use Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Cache pages | Number of 8-kilobyte (KB) pages used by cache objects. |
DEPENDENT | mssql.cache_pages Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL: Errors per second (DB offline errors) | Number of errors per second. |
DEPENDENT | mssql.offline_errors_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Errors per second (Info errors) | Number of errors per second. |
DEPENDENT | mssql.info_errors_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Errors per second (Kill connection errors) | Number of errors per second. |
DEPENDENT | mssql.kill_connection_errors_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Errors per second (User errors) | Number of errors per second. |
DEPENDENT | mssql.user_errors_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Total errors per second | Number of errors per second. |
DEPENDENT | mssql.errors_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Auto-param attempts per second | Number of auto-parameterization attempts per second. The total should be the sum of the failed, safe, and unsafe auto-parameterizations. Auto-parameterization occurs when an instance of SQL Server tries to parameterize a Transact-SQL request by replacing some literals with parameters to make reuse of the resulting cached execution plan across multiple similar-looking requests possible. Note that auto-parameterizations are also known as simple parameterizations in the newer versions of SQL Server. This counter does not include forced parameterizations. |
DEPENDENT | mssql.auto_param_attempts_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Batch requests per second | Number of Transact-SQL command batches received per second. This statistic is affected by all constraints (such as I/O, number of users, cache size, complexity of requests, and so on). High batch requests mean good throughput. |
DEPENDENT | mssql.batch_requests_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Percent of Adhoc queries running | The ratio of SQL compilations per second to Batch requests per second in percentage. |
CALCULATED | mssql.percent_of_adhoc_queries Expression: last(//mssql.sql_compilations_sec.rate) * 100 / (last(//mssql.batch_requests_sec.rate) + (last(//mssql.batch_requests_sec.rate)=0)) |
||||||
MSSQL | MSSQL: Percent of Recompiled Transact-SQL Objects | The ratio of SQL re-compilations per second to SQL compilations per second in percentage. |
CALCULATED | mssql.percent_recompilations_to_compilations Expression: last(//mssql.sql_recompilations_sec.rate) * 100 / (last(//mssql.sql_compilations_sec.rate) + (last(//mssql.sql_compilations_sec.rate)=0)) |
||||||
MSSQL | MSSQL: Full scans to Index searches ratio | The ratio of Full scans per second to Index searches per second. The threshold recommendation is strictly for OLTP workloads. |
CALCULATED | mssql.scan_to_search Expression: last(//mssql.full_scans_sec.rate) / (last(//mssql.index_searches_sec.rate) + (last(//mssql.index_searches_sec.rate)=0)) |
||||||
MSSQL | MSSQL: Failed auto-params per second | Number of failed auto-parameterization attempts per second. This number should be small. Note that auto-parameterizations are also known as simple parameterizations in the newer versions of SQL Server. |
DEPENDENT | mssql.failed_auto_params_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Safe auto-params per second | Number of safe auto-parameterization attempts per second. Safe refers to a determination that a cached execution plan can be shared between different similar-looking Transact-SQL statements. SQL Server makes many auto-parameterization attempts, some of which turn out to be safe and others fail. Note that auto-parameterizations are also known as simple parameterizations in the newer versions of SQL Server. This does not include forced parameterizations. |
DEPENDENT | mssql.safe_auto_params_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: SQL compilations per second | Number of SQL compilations per second. Indicates the number of times the compile code path is entered. Includes runs caused by statement-level recompilations in SQL Server. After SQL Server user activity is stable, this value reaches a steady state. |
DEPENDENT | mssql.sql_compilations_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: SQL re-compilations per second | Number of statement recompiles per second. Counts the number of times statement recompiles are triggered. Generally, you want the recompiles to be low. |
DEPENDENT | mssql.sql_recompilations_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Unsafe auto-params per second | Number of unsafe auto-parameterization attempts per second. For example, the query has some characteristics that prevent the cached plan from being shared. These are designated as unsafe. This does not count the number of forced parameterizations. |
DEPENDENT | mssql.unsafe_auto_params_sec.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL: Total transactions number | The number of currently active transactions of all types. |
DEPENDENT | mssql.transactions Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': State | 0 = ONLINE; 1 = RESTORING; 2 = RECOVERING (SQL Server 2008 and later); 3 = RECOVERY_PENDING (SQL Server 2008 and later); 4 = SUSPECT; 5 = EMERGENCY (SQL Server 2008 and later); 6 = OFFLINE (SQL Server 2008 and later); 7 = COPYING (Azure SQL Database Active Geo-Replication); 10 = OFFLINE_SECONDARY (Azure SQL Database Active Geo-Replication). |
DEPENDENT | mssql.db.state["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MSSQL | MSSQL DB '{#DBNAME}': Active transactions | Number of active transactions for the database. |
DEPENDENT | mssql.db.active_transactions["{#DBNAME}"] Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Data file size | Cumulative size of all the data files in the database including any automatic growth. Monitoring this counter is useful, for example, for determining the correct size of tempdb. |
DEPENDENT | mssql.db.data_files_size["{#DBNAME}"] Preprocessing: - JSONPATH: - MULTIPLIER: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Log bytes flushed per second | Total number of log bytes flushed per second. Useful for determining trends and utilization of the transaction log. |
DEPENDENT | mssql.db.log_bytes_flushed_sec.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Log file size | Cumulative size of all the transaction log files in the database. |
DEPENDENT | mssql.db.log_files_size["{#DBNAME}"] Preprocessing: - JSONPATH: - MULTIPLIER: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Log file used size | Cumulative used size of all the log files in the database. |
DEPENDENT | mssql.db.log_files_used_size["{#DBNAME}"] Preprocessing: - JSONPATH: - MULTIPLIER: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Log flushes per second | Number of log flushes per second. |
DEPENDENT | mssql.db.log_flushes_sec.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Log flush waits per second | Number of commits per second waiting for the log flush. |
DEPENDENT | mssql.db.log_flush_waits_sec.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Log flush wait time | Total wait time (in milliseconds) to flush the log. On an AlwaysOn secondary database, this value indicates the wait time for log records to be hardened to disk. |
DEPENDENT | mssql.db.log_flush_wait_time["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Log growths | Total number of times the transaction log for the database has been expanded. |
DEPENDENT | mssql.db.log_growths["{#DBNAME}"] Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Log shrinks | Total number of times the transaction log for the database has been shrunk. |
DEPENDENT | mssql.db.log_shrinks["{#DBNAME}"] Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Log truncations | Number of times the transaction log has been truncated. |
DEPENDENT | mssql.db.log_truncations["{#DBNAME}"] Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Percent log used | Percentage of space in the log that is in use. |
DEPENDENT | mssql.db.percent_log_used["{#DBNAME}"] Preprocessing: - JSONPATH: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Transactions per second | Number of transactions started for the database per second. |
DEPENDENT | mssql.db.transactions_sec.rate["{#DBNAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Last diff backup duration | Duration of the last differential backup. |
DEPENDENT | mssql.backup.diff.duration["{#DBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Last diff backup (time ago) | The amount of time since the last differential backup. |
DEPENDENT | mssql.backup.diff["{#DBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Last full backup duration | Duration of the last full backup. |
DEPENDENT | mssql.backup.full.duration["{#DBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Last full backup (time ago) | The amount of time since the last full backup. |
DEPENDENT | mssql.backup.full["{#DBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Last log backup duration | Duration of the last log backup. |
DEPENDENT | mssql.backup.log.duration["{#DBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
||||||
MSSQL | MSSQL DB '{#DBNAME}': Last log backup (time ago) | The amount of time since the last log backup. |
DEPENDENT | mssql.backup.log["{#DBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}': Primary replica recovery health | Indicates the recovery health of the primary replica: 0 = In progress 1 = Online 2 = Unavailable |
DEPENDENT | mssql.primary_recovery_health["{#GROUP_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}': Primary replica name | Name of the server instance that is hosting the current primary replica. |
DEPENDENT | mssql.primary_replica["{#GROUP_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}': Secondary replica recovery health | Indicates the recovery health of a secondary replica: 0 = In progress 1 = Online 2 = Unavailable |
DEPENDENT | mssql.secondary_recovery_health["{#GROUP_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}': Synchronization health | Reflects a rollup of the synchronization_health of all availability replicas in the availability group: 0: Not healthy. None of the availability replicas have a healthy synchronization. 1: Partially healthy. The synchronization of some, but not all, availability replicas is healthy. 2: Healthy. The synchronization of every availability replica is healthy. |
DEPENDENT | mssql.synchronization_health["{#GROUP_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': State | 0 = Online 1 = Restoring 2 = Recovering 3 = Recovery pending 4 = Suspect 5 = Emergency 6 = Offline |
DEPENDENT | mssql.local_db.state["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': Suspended | Database state: 0 = Resumed 1 = Suspended |
DEPENDENT | mssql.local_db.is_suspended["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': Synchronization health | Reflects the intersection of the synchronization state of a database that is joined to the availability group on the availability replica and the availability mode of the availability replica (synchronous-commit or asynchronous-commit mode): 0 = Not healthy. The synchronization_state of the database is 0 (NOT SYNCHRONIZING). 1 = Partially healthy. A database on a synchronous-commit availability replica is considered partially healthy if synchronization_state is 1 (SYNCHRONIZING). 2 = Healthy. A database on a synchronous-commit availability replica is considered healthy if synchronization_state is 2 (SYNCHRONIZED), and a database on an asynchronous-commit availability replica is considered healthy if synchronization_state is 1 (SYNCHRONIZING). |
DEPENDENT | mssql.local_db.synchronization_health["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}' Non-Local DB '{#REPLICA_NAME}*{#DBNAME}': Log queue size | Amount of log records of the primary database that have not been sent to the secondary databases. |
DEPENDENT | mssql.non-local_db.log_send_queue_size["{#GROUP_NAME}*{#REPLICA_NAME}*{#DBNAME}"] Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}' Non-Local DB '{#REPLICA_NAME}*{#DBNAME}': Redo log queue size | Amount of log records in the log files of the secondary replica that have not yet been redone. |
DEPENDENT | mssql.non-local_db.redo_queue_size["{#GROUP_NAME}*{#REPLICA_NAME}*{#DBNAME}"] Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Connected state | Whether a secondary replica is currently connected to the primary replica: 0 : Disconnected. The response of an availability replica to the DISCONNECTED state depends on its role: On the primary replica, if a secondary replica is disconnected, its secondary databases are marked as NOT SYNCHRONIZED on the primary replica, which waits for the secondary to reconnect; On a secondary replica, upon detecting that it is disconnected, the secondary replica attempts to reconnect to the primary replica. 1 : Connected. Each primary replica tracks the connection state for every secondary replica in the same availability group. Secondary replicas track the connection state of only the primary replica. |
DEPENDENT | mssql.replica.connected_state["{#GROUP_NAME}*{#REPLICA_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Is local | Whether the replica is local: 0 = Indicates a remote secondary replica in an availability group whose primary replica is hosted by the local server instance. This value occurs only on the primary replica location. 1 = Indicates a local replica. On secondary replicas, this is the only available value for the availability group to which the replica belongs. |
DEPENDENT | mssql.replica.is_local["{#GROUP_NAME}*{#REPLICA_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Join state | 0 = Not joined 1 = Joined, standalone instance 2 = Joined, failover cluster instance |
DEPENDENT | mssql.replica.join_state["{#GROUP_NAME}*{#REPLICA_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Operational state | Current operational state of the replica: 0 = Pending failover 1 = Pending 2 = Online 3 = Offline 4 = Failed 5 = Failed, no quorum 6 = Not local |
DEPENDENT | mssql.replica.operational_state["{#GROUP_NAME}*{#REPLICA_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Recovery health | Rollup of the database_state column of the sys.dm_hadr_database_replica_states dynamic management view: 0 : In progress. At least one joined database has a database state other than ONLINE (database_state is not 0). 1 : Online. All the joined databases have a database state of ONLINE (database_state is 0). |
DEPENDENT | mssql.replica.recovery_health["{#GROUP_NAME}*{#REPLICA_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Role | Current Always On availability groups role of a local replica or a connected remote replica: 0 = Resolving 1 = Primary 2 = Secondary |
DEPENDENT | mssql.replica.role["{#GROUP_NAME}*{#REPLICA_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
||||||
MSSQL | MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Sync health | Reflects a rollup of the database synchronization state (synchronization_state) of all joined availability databases (also known as replicas) and the availability mode of the replica (synchronous-commit or asynchronous-commit mode). The rollup will reflect the least healthy accumulated state of the databases on the replica: 0 : Not healthy. At least one joined database is in the NOT SYNCHRONIZING state. 1 : Partially healthy. Some replicas are not in the target synchronization state: synchronous-commit replicas should be synchronized, and asynchronous-commit replicas should be synchronizing. 2 : Healthy. All replicas are in the target synchronization state: synchronous-commit replicas are synchronized, and asynchronous-commit replicas are synchronizing. |
DEPENDENT | mssql.replica.synchronization_health["{#GROUP_NAME}*{#REPLICA_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL Mirroring '{#DBNAME}': Role | Current role that the local database plays in the database mirroring session. 1 = Principal 2 = Mirror |
DEPENDENT | mssql.mirroring.role["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL Mirroring '{#DBNAME}': Role sequence | The number of times that mirroring partners have switched the principal and mirror roles due to a failover or forced service. |
DEPENDENT | mssql.mirroring.role_sequence["{#DBNAME}"] Preprocessing: - JSONPATH: - SIMPLE_CHANGE |
||||||
MSSQL | MSSQL Mirroring '{#DBNAME}': State | State of the mirror database and of the database mirroring session. 0 = Suspended 1 = Disconnected from the other partner 2 = Synchronizing 3 = Pending Failover 4 = Synchronized 5 = The partners are not synchronized. Failover is not possible now. 6 = The partners are synchronized. Failover is potentially possible. For information about the requirements for the failover, see Database Mirroring Operating Modes. |
DEPENDENT | mssql.mirroring.state["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL Mirroring '{#DBNAME}': Witness state | State of the witness in the database mirroring session of the database: 0 = Unknown 1 = Connected 2 = Disconnected |
DEPENDENT | mssql.mirroring.witness_state["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
||||||
MSSQL | MSSQL Mirroring '{#DBNAME}': Safety level | Safety setting for updates on the mirror database: 0 = Unknown state 1 = Off [asynchronous] 2 = Full [synchronous] |
DEPENDENT | mssql.mirroring.safety_level["{#DBNAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
||||||
MSSQL | MSSQL Job '{#JOBNAME}': Last run date-time | The last date-time of the job run. |
DEPENDENT | mssql.job.lastrun_datetime["{#JOBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL Job '{#JOBNAME}': Next run date-time | The next date-time of the job run. |
DEPENDENT | mssql.job.nextrun_datetime["{#JOBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL Job '{#JOBNAME}': Last run status message | The informational message about the last run of the job. |
DEPENDENT | mssql.job.lastrun_status_message["{#JOBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL Job '{#JOBNAME}': Run status | The job status possible values: 0 ⇒ Failed 1 ⇒ Succeeded 2 ⇒ Retry 3 ⇒ Canceled 4 ⇒ Running |
DEPENDENT | mssql.job.run_status["{#JOBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
MSSQL | MSSQL Job '{#JOBNAME}': Run duration | Duration of the last run job. |
DEPENDENT | mssql.job.run_duration["{#JOBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE - DISCARD_UNCHANGED_HEARTBEAT: |
||||||
Zabbix raw items | MSSQL: Get last backup | The item gets information about backup processes. |
ODBC | db.odbc.get[get_last_backup,"{$MSSQL.DSN}"] Expression: The text is too long. Please see the template. |
||||||
Zabbix raw items | MSSQL: Get job status | The item gets sql agent job status. |
ODBC | db.odbc.get[get_job_status,"{$MSSQL.DSN}"] Expression: The text is too long. Please see the template. |
||||||
Zabbix raw items | MSSQL: Get performance counters | The item gets server global status information. |
ODBC | db.odbc.get[get_status_variables,"{$MSSQL.DSN}"] Expression: The text is too long. Please see the template. |
||||||
Zabbix raw items | MSSQL: Average latch wait time raw | Average latch wait time (in milliseconds) for latch requests that had to wait. |
DEPENDENT | mssql.average_latch_wait_time_raw Preprocessing: - JSONPATH: |
||||||
Zabbix raw items | MSSQL: Average latch wait time base | For internal use only. |
DEPENDENT | mssql.average_latch_wait_time_base Preprocessing: - JSONPATH: |
||||||
Zabbix raw items | MSSQL: Total average wait time raw | Average amount of wait time (in milliseconds) for each lock request that resulted in a wait. Information for all locks. |
DEPENDENT | mssql.average_wait_time_raw Preprocessing: - JSONPATH: |
||||||
Zabbix raw items | MSSQL: Total average wait time base | For internal use only. |
DEPENDENT | mssql.average_wait_time_base Preprocessing: - JSONPATH: |
||||||
Zabbix raw items | MSSQL AG '{#GROUP_NAME}': Get replica states | Getting replica states - name, primary and secondary health, synchronization health. |
ODBC | db.odbc.get[{#GROUP_NAME}replica_states,"{$MSSQL.DSN}"] Expression: The text is too long. Please see the template. |
||||||
Zabbix raw items | MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': Get local DB states | Getting the states of the local availability database. |
ODBC | db.odbc.get["{#GROUP_NAME}*{#DBNAME}local_db.states","{$MSSQL.DSN}"] Expression: The text is too long. Please see the template. |
||||||
Zabbix raw items | MSSQL AG '{#GROUP_NAME}' Non-Local DB '{#REPLICA_NAME}*{#DBNAME}': Get non-local DB states | Getting the states of the non-local availability database. |
ODBC | db.odbc.get["{#GROUP_NAME}*{#REPLICA_NAME}*{#DBNAME}non-local_db.states","{$MSSQL.DSN}"] Expression: The text is too long. Please see the template. |
||||||
Zabbix raw items | MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Get the replica state | Getting the database replica states. |
ODBC | db.odbc.get["{#GROUP_NAME}*{#REPLICA_NAME}replica.state","{$MSSQL.DSN}"] Expression: The text is too long. Please see the template. |
||||||
Zabbix raw items | MSSQL Mirroring '{#DBNAME}': Get the mirror state | Getting the mirror state. |
ODBC | db.odbc.get["{#DBNAME}mirroring_state","{$MSSQL.DSN}"] Expression: The text is too long. Please see the template. |
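The calculated items in the table above (average latch wait time, total average wait time, percent of adhoc queries, recompilation ratio, scan-to-search ratio) all share one pattern: the numerator is divided by a delta or rate that can legitimately be zero on an idle server, so the expression adds the boolean `(divisor=0)` to the divisor, turning a would-be division by zero into a division by one that yields 0. A minimal sketch of that guard in Python (the function name and sample counter values are illustrative, not part of the template):

```python
def perf_average_bulk(raw_now, raw_prev, base_now, base_prev):
    """Average-per-operation value from a raw/base counter pair.

    Mirrors the template's calculated-item expressions: the raw counter
    accumulates total wait time, the base counter accumulates the number
    of waits, and adding (delta_base == 0) to the divisor makes an idle
    interval evaluate to 0 instead of raising a division-by-zero error.
    """
    delta_raw = raw_now - raw_prev
    delta_base = base_now - base_prev
    # bool is promoted to 0/1, exactly like the (expr=0) term in Zabbix
    return delta_raw / (delta_base + (delta_base == 0))


# 500 ms of extra wait time over 10 extra waits -> 50 ms per wait
print(perf_average_bulk(1500, 1000, 110, 100))  # 50.0
# idle interval: both deltas are 0, guard prevents ZeroDivisionError
print(perf_average_bulk(1000, 1000, 100, 100))  # 0.0
```

This is how SQL Server's raw/base performance counter pairs (such as "Average Latch Wait Time" and its "_Base" counter) are generally meant to be combined between two samples; without the guard, a Zabbix calculated item would go unsupported whenever the base counter does not move between two polls.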
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
MSSQL: Service is unavailable | The TCP port of the MS SQL Server service is currently unavailable. |
last(/MSSQL by ODBC/net.tcp.service[tcp,{HOST.CONN},{$MSSQL.PORT}])=0 |
DISASTER | |
MSSQL: Version has changed | MSSQL version has changed. Ack to close. |
last(/MSSQL by ODBC/mssql.version,#1)<>last(/MSSQL by ODBC/mssql.version,#2) and length(last(/MSSQL by ODBC/mssql.version))>0 |
INFO | Manual close: YES |
MSSQL: Service has been restarted | Uptime is less than 10 minutes. |
last(/MSSQL by ODBC/mssql.uptime)<10m |
INFO | Manual close: YES |
MSSQL: Failed to fetch info data | Zabbix has not received data for items for the last 30 minutes. |
nodata(/MSSQL by ODBC/mssql.uptime,30m)=1 |
INFO | Depends on: - MSSQL: Service is unavailable |
MSSQL: Too frequently using pointers | Rows with varchar columns can experience expansion when varchar values are updated with a longer string. In the case where the row cannot fit in the existing page, the row migrates and access to the row will traverse a pointer. This only happens on heaps (tables without clustered indexes). Evaluate creating a clustered index for heap tables. In cases where clustered indexes cannot be used, drop non-clustered indexes, build a clustered index to reorg pages and rows, drop the clustered index, then recreate non-clustered indexes. |
last(/MSSQL by ODBC/mssql.forwarded_records_sec.rate) * 100 > 10 * last(/MSSQL by ODBC/mssql.batch_requests_sec.rate) |
WARNING | |
MSSQL: Number of work files created per second is high | Too many work files created per second to store temporary results for hash joins and hash aggregates. |
min(/MSSQL by ODBC/mssql.workfiles_created_sec.rate,5m)>{$MSSQL.WORK_FILES.MAX} |
AVERAGE | |
MSSQL: Number of work tables created per second is high | Too many work tables created per second to store temporary results for query spool, lob variables, XML variables, and cursors. |
min(/MSSQL by ODBC/mssql.worktables_created_sec.rate,5m)>{$MSSQL.WORK_TABLES.MAX} |
AVERAGE | |
MSSQL: Percentage of work tables available from the work table cache is low | A value less than 90% may indicate insufficient memory, since execution plans are being dropped, or on 32-bit systems, may indicate the need for an upgrade to a 64-bit system. |
max(/MSSQL by ODBC/mssql.worktables_from_cache_ratio,5m)<{$MSSQL.WORKTABLES_FROM_CACHE_RATIO.MIN.CRIT} |
HIGH | |
MSSQL: Percentage of the buffer cache efficiency is low | Too low buffer cache hit ratio. |
max(/MSSQL by ODBC/mssql.buffer_cache_hit_ratio,5m)<{$MSSQL.BUFFER_CACHE_RATIO.MIN.CRIT} |
HIGH | |
MSSQL: Percentage of the buffer cache efficiency is low | Low buffer cache hit ratio. |
max(/MSSQL by ODBC/mssql.buffer_cache_hit_ratio,5m)<{$MSSQL.BUFFER_CACHE_RATIO.MIN.WARN} |
WARNING | Depends on: - MSSQL: Percentage of the buffer cache efficiency is low |
MSSQL: Number of rps waiting for a free page is high | Some requests have to wait for a free page. |
min(/MSSQL by ODBC/mssql.free_list_stalls_sec.rate,5m)>{$MSSQL.FREE_LIST_STALLS.MAX} |
WARNING | |
MSSQL: Number of buffers written per second by the lazy writer is high | The number of buffers written per second by the buffer manager's lazy writer exceeds the threshold. |
min(/MSSQL by ODBC/mssql.lazy_writes_sec.rate,5m)>{$MSSQL.LAZY_WRITES.MAX} |
WARNING | |
MSSQL: Page life expectancy is low | Pages are staying in the buffer pool without references for less time than the threshold value. |
max(/MSSQL by ODBC/mssql.page_life_expectancy,15m)<{$MSSQL.PAGE_LIFE_EXPECTANCY.MIN} |
HIGH | |
MSSQL: Number of physical database page reads per second is high | The physical database page reads are issued too frequently. |
min(/MSSQL by ODBC/mssql.page_reads_sec.rate,5m)>{$MSSQL.PAGE_READS.MAX} |
WARNING | |
MSSQL: Number of physical database page writes per second is high | The physical database page writes are issued too frequently. |
min(/MSSQL by ODBC/mssql.page_writes_sec.rate,5m)>{$MSSQL.PAGE_WRITES.MAX} |
WARNING | |
MSSQL: Too many physical reads occurring | If this value makes up even a sizeable minority of the total Page Reads/sec (say, greater than 20% of the total page reads), you may have too many physical reads occurring. |
last(/MSSQL by ODBC/mssql.readahead_pages_sec.rate) > {$MSSQL.PERCENT_READAHEAD.MAX} / 100 * last(/MSSQL by ODBC/mssql.page_reads_sec.rate) |
WARNING | |
MSSQL: Total average wait time for locks is high | An average wait time longer than 500ms may indicate excessive blocking. This value should generally correlate to 'Lock Waits/sec' and move up or down with it accordingly. |
min(/MSSQL by ODBC/mssql.average_wait_time,5m)>{$MSSQL.AVERAGE_WAIT_TIME.MAX} |
WARNING | |
MSSQL: Total number of locks per second is high | Number of new locks and lock conversions per second requested from the lock manager is high. |
min(/MSSQL by ODBC/mssql.lock_requests_sec.rate,5m)>{$MSSQL.LOCK_REQUESTS.MAX} |
WARNING | |
MSSQL: Total lock requests per second that timed out is high | The total number of timed out lock requests per second, including requests for NOWAIT locks, is high. |
min(/MSSQL by ODBC/mssql.lock_timeouts_sec.rate,5m)>{$MSSQL.LOCK_TIMEOUTS.MAX} |
WARNING | |
MSSQL: Some blocking is occurring for 5m | Values greater than zero indicate at least some blocking is occurring, while a value of zero can quickly eliminate blocking as a potential root-cause problem. |
min(/MSSQL by ODBC/mssql.lock_waits_sec.rate,5m)>0 |
AVERAGE | |
MSSQL: Number of deadlocks is high | Too many deadlocks are occurring currently. |
min(/MSSQL by ODBC/mssql.number_deadlocks_sec.rate,5m)>{$MSSQL.DEADLOCKS.MAX} |
AVERAGE | |
MSSQL: Percent of adhoc queries running is high | The lower this value is the better. High values often indicate excessive adhoc querying and should be as low as possible. If excessive adhoc querying is happening, try rewriting the queries as procedures or invoke the queries using sp_executeSQL. When rewriting isn't possible, consider using a plan guide or setting the database to parameterization forced mode. |
min(/MSSQL by ODBC/mssql.percent_of_adhoc_queries,15m) > {$MSSQL.PERCENT_COMPILATIONS.MAX} |
WARNING | |
MSSQL: Percent of times statement recompiles is high | This number should be at or near zero, since recompiles can cause deadlocks and exclusive compile locks. This counter's value should follow in proportion to “Batch Requests/sec” and “SQL Compilations/sec”. |
min(/MSSQL by ODBC/mssql.percent_recompilations_to_compilations,15m) > {$MSSQL.PERCENT_RECOMPILATIONS.MAX} |
WARNING | |
MSSQL: Number of index and table scans exceeds index searches in the last 15m | Index searches are preferable to index and table scans. For OLTP applications, optimize for more index searches and fewer scans (preferably, 1 full scan for every 1000 index searches). Index and table scans are expensive I/O operations. |
min(/MSSQL by ODBC/mssql.scan_to_search,15m) > 0.001 |
WARNING | |
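The scan-to-search trigger above fires when full scans exceed 1 per 1000 index searches over the whole 15-minute window (the `min()` of the ratio must stay above 0.001). A minimal sketch of that logic, with hypothetical counter values:

```python
# Hypothetical sketch of the scan-to-search trigger logic: the trigger
# fires only when EVERY sample in the window exceeds the 1/1000 threshold,
# which is what min(...) > 0.001 expresses in the Zabbix expression.

def scan_to_search_ratio(full_scans: float, index_searches: float) -> float:
    """Ratio of full scans to index searches (0 if no searches yet)."""
    if index_searches == 0:
        return 0.0
    return full_scans / index_searches


def trigger_fires(ratio_samples, threshold=0.001):
    """min(ratio, 15m) > threshold: all samples must exceed the threshold."""
    return min(ratio_samples) > threshold


ratios = [scan_to_search_ratio(s, q) for s, q in [(5, 1000), (8, 2000), (3, 900)]]
print(trigger_fires(ratios))  # every ratio exceeds 1/1000 -> True
```

A single quiet sample in the window (for example, one ratio of 0.0005) keeps the trigger silent, which is why `min()` rather than `last()` is used here.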
MSSQL DB '{#DBNAME}': State is {ITEM.VALUE} | The DB has a non-working state. |
last(/MSSQL by ODBC/mssql.db.state["{#DBNAME}"])>1 |
HIGH | |
MSSQL DB '{#DBNAME}': Number of commits waiting for the log flush is high | Too many commits are waiting for the log flush. |
min(/MSSQL by ODBC/mssql.db.log_flush_waits_sec.rate["{#DBNAME}"],5m)>{$MSSQL.LOG_FLUSH_WAITS.MAX:"{#DBNAME}"} |
WARNING | |
MSSQL DB '{#DBNAME}': Total wait time to flush the log is high | The wait time to flush the log is too long. |
min(/MSSQL by ODBC/mssql.db.log_flush_wait_time["{#DBNAME}"],5m)>{$MSSQL.LOG_FLUSH_WAIT_TIME.MAX:"{#DBNAME}"} |
WARNING | |
MSSQL DB '{#DBNAME}': Percent of log used is high | There's not enough space left in the log. |
min(/MSSQL by ODBC/mssql.db.percent_log_used["{#DBNAME}"],5m)>{$MSSQL.PERCENT_LOG_USED.MAX:"{#DBNAME}"} |
WARNING | |
MSSQL DB '{#DBNAME}': Diff backup is old | The differential backup has not been executed for a long time. |
last(/MSSQL by ODBC/mssql.backup.diff["{#DBNAME}"])>{$MSSQL.BACKUP_DIFF.CRIT:"{#DBNAME}"} |
HIGH | Manual close: YES |
MSSQL DB '{#DBNAME}': Diff backup is old | The differential backup has not been executed for a long time. |
last(/MSSQL by ODBC/mssql.backup.diff["{#DBNAME}"])>{$MSSQL.BACKUP_DIFF.WARN:"{#DBNAME}"} |
WARNING | Manual close: YES Depends on: - MSSQL DB '{#DBNAME}': Diff backup is old |
MSSQL DB '{#DBNAME}': Full backup is old | The full backup has not been executed for a long time. |
last(/MSSQL by ODBC/mssql.backup.full["{#DBNAME}"])>{$MSSQL.BACKUP_FULL.CRIT:"{#DBNAME}"} |
HIGH | Manual close: YES |
MSSQL DB '{#DBNAME}': Full backup is old | The full backup has not been executed for a long time. |
last(/MSSQL by ODBC/mssql.backup.full["{#DBNAME}"])>{$MSSQL.BACKUP_FULL.WARN:"{#DBNAME}"} |
WARNING | Manual close: YES Depends on: - MSSQL DB '{#DBNAME}': Full backup is old |
MSSQL DB '{#DBNAME}': Log backup is old | The log backup has not been executed for a long time. |
last(/MSSQL by ODBC/mssql.backup.log["{#DBNAME}"])>{$MSSQL.BACKUP_LOG.CRIT:"{#DBNAME}"} |
HIGH | Manual close: YES |
MSSQL DB '{#DBNAME}': Log backup is old | The log backup has not been executed for a long time. |
last(/MSSQL by ODBC/mssql.backup.log["{#DBNAME}"])>{$MSSQL.BACKUP_LOG.WARN:"{#DBNAME}"} |
WARNING | Manual close: YES Depends on: - MSSQL DB '{#DBNAME}': Log backup is old |
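Each backup-age check above is a pair of triggers on the same item: a HIGH trigger against the CRIT macro and a WARNING trigger against the WARN macro, with the WARNING one suppressed via "Depends on" when the HIGH one fires. A sketch of that tiered evaluation (threshold values here are hypothetical):

```python
# Sketch of the tiered backup-age logic: the item holds seconds since the
# last backup; the WARNING trigger is suppressed when the HIGH one fires
# (the "Depends on" relation in the table above).

def backup_severity(age_seconds: int, warn: int, crit: int) -> str:
    """Return the severity of the most specific trigger that would fire."""
    if age_seconds > crit:
        return "HIGH"      # CRIT trigger fires; dependent WARNING is suppressed
    if age_seconds > warn:
        return "WARNING"
    return "OK"


DAY = 86400  # hypothetical macro values: WARN = 1 day, CRIT = 2 days
print(backup_severity(3 * DAY, warn=1 * DAY, crit=2 * DAY))        # -> HIGH
print(backup_severity(int(1.5 * DAY), warn=1 * DAY, crit=2 * DAY))  # -> WARNING
```

The dependency matters operationally: without it, a two-day-old backup would raise both a WARNING and a HIGH problem for the same root cause.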
MSSQL AG '{#GROUP_NAME}': Primary replica recovery health in progress | The primary replica is in the synchronization process. |
last(/MSSQL by ODBC/mssql.primary_recovery_health["{#GROUP_NAME}"])=0 |
WARNING | |
MSSQL AG '{#GROUP_NAME}': Secondary replica recovery health in progress | The secondary replica is in the synchronization process. |
last(/MSSQL by ODBC/mssql.secondary_recovery_health["{#GROUP_NAME}"])=0 |
WARNING | |
MSSQL AG '{#GROUP_NAME}': All replicas unhealthy | None of the availability replicas have a healthy synchronization. |
last(/MSSQL by ODBC/mssql.synchronization_health["{#GROUP_NAME}"])=0 |
DISASTER | |
MSSQL AG '{#GROUP_NAME}': Some replicas unhealthy | The synchronization health of some, but not all, availability replicas is healthy. |
last(/MSSQL by ODBC/mssql.synchronization_health["{#GROUP_NAME}"])=1 |
HIGH | |
MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': "{#DBNAME}" is {ITEM.VALUE} | The local availability database has a non-working state. |
last(/MSSQL by ODBC/mssql.local_db.state["{#DBNAME}"])>0 |
WARNING | |
MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': "{#DBNAME}" is Not healthy | The synchronization state of the local availability database is NOT SYNCHRONIZING. |
last(/MSSQL by ODBC/mssql.local_db.synchronization_health["{#DBNAME}"])=0 |
HIGH | |
MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': "{#DBNAME}" is Partially healthy | A database on a synchronous-commit availability replica is considered partially healthy if synchronization state is SYNCHRONIZING. |
last(/MSSQL by ODBC/mssql.local_db.synchronization_health["{#DBNAME}"])=1 |
AVERAGE | |
MSSQL AG '{#GROUP_NAME}' Non-Local DB '*{#REPLICA_NAME}*{#DBNAME}': Log queue size is growing | The log records of the primary database are not sent to the secondary databases. |
last(/MSSQL by ODBC/mssql.non-local_db.log_send_queue_size["{#GROUP_NAME}*{#REPLICA_NAME}*{#DBNAME}"],#1)>last(/MSSQL by ODBC/mssql.non-local_db.log_send_queue_size["{#GROUP_NAME}*{#REPLICA_NAME}*{#DBNAME}"],#2) and last(/MSSQL by ODBC/mssql.non-local_db.log_send_queue_size["{#GROUP_NAME}*{#REPLICA_NAME}*{#DBNAME}"],#2)>last(/MSSQL by ODBC/mssql.non-local_db.log_send_queue_size["{#GROUP_NAME}*{#REPLICA_NAME}*{#DBNAME}"],#3) |
HIGH | |
MSSQL AG '{#GROUP_NAME}' Non-Local DB '*{#REPLICA_NAME}*{#DBNAME}': Redo log queue size is growing | The log records in the log files of the secondary replica have not yet been redone. |
last(/MSSQL by ODBC/mssql.non-local_db.redo_queue_size["{#GROUP_NAME}*{#REPLICA_NAME}*{#DBNAME}"],#1)>last(/MSSQL by ODBC/mssql.non-local_db.redo_queue_size["{#GROUP_NAME}*{#REPLICA_NAME}*{#DBNAME}"],#2) and last(/MSSQL by ODBC/mssql.non-local_db.redo_queue_size["{#GROUP_NAME}*{#REPLICA_NAME}*{#DBNAME}"],#2)>last(/MSSQL by ODBC/mssql.non-local_db.redo_queue_size["{#GROUP_NAME}*{#REPLICA_NAME}*{#DBNAME}"],#3) |
HIGH | |
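Both queue triggers above encode "growing" as three strictly increasing history values: in Zabbix notation, `#1 > #2 > #3`, where `#1` is the newest sample. A minimal sketch of that check:

```python
# The log/redo queue triggers compare the last three history values
# (#1 is the newest): they fire only when the queue grew twice in a row,
# which filters out single-sample spikes.

def queue_is_growing(history):
    """history[0] is the most recent sample, matching Zabbix #1, #2, #3."""
    if len(history) < 3:
        return False
    newest, prev, oldest = history[0], history[1], history[2]
    return newest > prev > oldest


print(queue_is_growing([300, 200, 100]))  # strictly increasing -> True
print(queue_is_growing([300, 300, 100]))  # plateau breaks the chain -> False
```

Requiring strict growth across two consecutive deltas is a cheap way to avoid flapping on a queue that merely oscillates around a constant size.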
MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': {#REPLICA_NAME} is disconnected | The response of an availability replica to the DISCONNECTED state depends on its role: on the primary replica, if a secondary replica is disconnected, its secondary databases are marked as NOT SYNCHRONIZED on the primary replica, which waits for the secondary to reconnect; on a secondary replica, upon detecting that it is disconnected, the secondary replica attempts to reconnect to the primary replica. |
last(/MSSQL by ODBC/mssql.replica.connected_state["{#GROUP_NAME}_{#REPLICA_NAME}"])=0 and last(/MSSQL by ODBC/mssql.replica.role["{#GROUP_NAME}_{#REPLICA_NAME}"])=2 |
WARNING | |
MSSQL AG '{#GROUPNAME}' Replica '{#REPLICANAME}': {#REPLICA_NAME} is {ITEM.VALUE} | The operational state of the replica in a given availability group is "Pending" or "Offline". |
last(/MSSQL by ODBC/mssql.replica.operational_state["{#GROUP_NAME}_{#REPLICA_NAME}"])=0 or last(/MSSQL by ODBC/mssql.replica.operational_state["{#GROUP_NAME}_{#REPLICA_NAME}"])=1 or last(/MSSQL by ODBC/mssql.replica.operational_state["{#GROUP_NAME}_{#REPLICA_NAME}"])=3 |
WARNING | |
MSSQL AG '{#GROUPNAME}' Replica '{#REPLICANAME}': {#REPLICA_NAME} is {ITEM.VALUE} | The operational state of the replica in a given availability group is "Failed". |
last(/MSSQL by ODBC/mssql.replica.operational_state["{#GROUP_NAME}_{#REPLICA_NAME}"])=4 |
AVERAGE | |
MSSQL AG '{#GROUPNAME}' Replica '{#REPLICANAME}': {#REPLICA_NAME} is {ITEM.VALUE} | The operational state of the replica in a given availability group is "Failed, no quorum". |
last(/MSSQL by ODBC/mssql.replica.operational_state["{#GROUP_NAME}_{#REPLICA_NAME}"])=5 |
HIGH | |
MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': {#REPLICA_NAME} Recovery in progress | At least one joined database has a database state other than ONLINE. |
last(/MSSQL by ODBC/mssql.replica.recovery_health["{#GROUP_NAME}_{#REPLICA_NAME}"])=0 |
INFO | |
MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': {#REPLICA_NAME} is Not healthy | At least one joined database is in the NOT SYNCHRONIZING state. |
last(/MSSQL by ODBC/mssql.replica.synchronization_health["{#GROUP_NAME}_{#REPLICA_NAME}"])=0 |
AVERAGE | |
MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': {#REPLICA_NAME} is Partially healthy | Some replicas are not in the target synchronization state: synchronous-commit replicas should be synchronized, and asynchronous-commit replicas should be synchronizing. |
last(/MSSQL by ODBC/mssql.replica.synchronization_health["{#GROUP_NAME}_{#REPLICA_NAME}"])=1 |
WARNING | |
MSSQL Mirroring '{#DBNAME}': "{#DBNAME}" is {ITEM.VALUE} | The state of the mirror database and of the database mirroring session is "Suspended", "Disconnected from the other partner", or "Synchronizing". |
last(/MSSQL by ODBC/mssql.mirroring.state["{#DBNAME}"])>=0 and last(/MSSQL by ODBC/mssql.mirroring.state["{#DBNAME}"])<=2 |
INFO | |
MSSQL Mirroring '{#DBNAME}': "{#DBNAME}" is {ITEM.VALUE} | The state of the mirror database and of the database mirroring session is "Pending Failover". |
last(/MSSQL by ODBC/mssql.mirroring.state["{#DBNAME}"])=3 |
WARNING | |
MSSQL Mirroring '{#DBNAME}': "{#DBNAME}" is {ITEM.VALUE} | The state of the mirror database and of the database mirroring session is "Not synchronized". The partners are not synchronized. A failover is not possible now. |
last(/MSSQL by ODBC/mssql.mirroring.state["{#DBNAME}"])=5 |
HIGH | |
MSSQL Mirroring '{#DBNAME}': "{#DBNAME}" Witness is disconnected | The state of the witness in the database mirroring session of the database is "Disconnected". |
last(/MSSQL by ODBC/mssql.mirroring.witness_state["{#DBNAME}"])=2 |
WARNING | |
MSSQL Job '{#JOBNAME}': Failed to run | The last run of the job has failed. |
last(/MSSQL by ODBC/mssql.job.runstatus["{#JOBNAME}"])=0 |
WARNING | Manual close: YES |
MSSQL Job '{#JOBNAME}': Job duration is high | The job is taking too long. |
last(/MSSQL by ODBC/mssql.job.run_duration["{#JOBNAME}"])>{$MSSQL.BACKUP_DURATION.WARN:"{#JOBNAME}"} |
WARNING | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
http://www.grumpyolddba.co.uk/monitoring/Performance%20Counter%20Guidance%20-%20SQL%20Server.htm https://docs.microsoft.com/en-us/sql/relational-databases/performance-monitor/sql-server-access-methods-object?view=sql-server-ver15
For Zabbix version: 6.2 and higher
The template to monitor a MongoDB sharded cluster by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
MongoDB cluster by Zabbix agent 2
— collects metrics from the mongos proxy (router) by polling Zabbix agent 2.
This template was tested on:
See Zabbix template operation for basic instructions.
Note that, depending on the number of DBs and collections, the discovery operation may be expensive. Use filters with the macros {$MONGODB.LLD.FILTER.DB.MATCHES}, {$MONGODB.LLD.FILTER.DB.NOT_MATCHES}, {$MONGODB.LLD.FILTER.COLLECTION.MATCHES}, {$MONGODB.LLD.FILTER.COLLECTION.NOT_MATCHES}.
All sharded MongoDB nodes (mongod) will be discovered with the attached template "MongoDB node by Zabbix agent 2".
Test availability: zabbix_get -s mongos.node -k 'mongodb.ping["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"]'
No specific Zabbix configuration is required.
Name | Description | Default | ||
---|---|---|---|---|
{$MONGODB.CONNS.AVAILABLE.MIN.WARN} | Minimum number of available connections |
1000 |
||
{$MONGODB.CONNSTRING} | Connection string in the URI format (password is not used). This parameter overrides the value of the "Server" option in the configuration file (if it is set); otherwise, the plugin's default value "tcp://localhost:27017" is used. |
tcp://localhost:27017 |
||
{$MONGODB.CURSOR.OPEN.MAX.WARN} | Maximum number of open cursors |
10000 |
||
{$MONGODB.CURSOR.TIMEOUT.MAX.WARN} | Maximum number of cursors timing out per second |
1 |
||
{$MONGODB.LLD.FILTER.COLLECTION.MATCHES} | Filter of discoverable collections |
.* |
||
{$MONGODB.LLD.FILTER.COLLECTION.NOT_MATCHES} | Filter to exclude discovered collections |
CHANGE_IF_NEEDED |
||
{$MONGODB.LLD.FILTER.DB.MATCHES} | Filter of discoverable databases |
.* |
||
{$MONGODB.LLD.FILTER.DB.NOT_MATCHES} | Filter to exclude discovered databases |
`(admin|config|local)` |
{$MONGODB.PASSWORD} | MongoDB user password |
`` | ||
{$MONGODB.USER} | MongoDB username |
`` |
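The `MATCHES`/`NOT_MATCHES` macros above are regular expressions that Zabbix applies to each discovered `{#DBNAME}` (and `{#COLLECTION}`) value. A rough approximation of that filtering with Python's `re`, using the defaults from the table:

```python
import re

# Approximation of the LLD filter behavior: an entity is discovered when it
# matches the MATCHES pattern and does not match the NOT_MATCHES pattern.
MATCHES = r".*"                        # {$MONGODB.LLD.FILTER.DB.MATCHES}
NOT_MATCHES = r"(admin|config|local)"  # {$MONGODB.LLD.FILTER.DB.NOT_MATCHES}


def discovered(dbname: str) -> bool:
    return (re.search(MATCHES, dbname) is not None
            and re.search(NOT_MATCHES, dbname) is None)


print([db for db in ["admin", "config", "local", "appdb"] if discovered(db)])
# -> ['appdb']
```

This is why the default `NOT_MATCHES` value excludes the three MongoDB system databases while `.*` lets every user database through; Zabbix's own regex semantics may differ in edge cases, so treat this as an illustration only.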
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Collection discovery | Collect collections metrics. Note, depending on the number of DBs and collections this discovery operation may be expensive. Use filters with macros {$MONGODB.LLD.FILTER.DB.MATCHES}, {$MONGODB.LLD.FILTER.DB.NOT_MATCHES}, {$MONGODB.LLD.FILTER.COLLECTION.MATCHES}, {$MONGODB.LLD.FILTER.COLLECTION.NOT_MATCHES}. |
ZABBIX_PASSIVE | mongodb.collections.discovery["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] Filter: AND - {#DBNAME} MATCHES_REGEX - {#DBNAME} NOT_MATCHES_REGEX - {#COLLECTION} MATCHES_REGEX - {#COLLECTION} NOT_MATCHES_REGEX |
Config servers discovery | Discovery of sharded cluster config servers. |
ZABBIX_PASSIVE | mongodb.cfg.discovery["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] |
Database discovery | Collect database metrics. Note, depending on the number of DBs this discovery operation may be expensive. Use filters with macros {$MONGODB.LLD.FILTER.DB.MATCHES}, {$MONGODB.LLD.FILTER.DB.NOT_MATCHES}. |
ZABBIX_PASSIVE | mongodb.db.discovery["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] Filter: AND - {#DBNAME} MATCHES_REGEX - {#DBNAME} NOT_MATCHES_REGEX |
Shards discovery | Discovery of sharded cluster hosts. |
ZABBIX_PASSIVE | mongodb.sh.discovery["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
MongoDB sharded cluster | MongoDB cluster: Ping | Test if a connection is alive or not. |
ZABBIX_PASSIVE | mongodb.ping["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
MongoDB sharded cluster | MongoDB cluster: Jumbo chunks | Total number of 'jumbo' chunks in the mongo cluster. |
ZABBIX_PASSIVE | mongodb.jumbo_chunks.count["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] |
MongoDB sharded cluster | MongoDB cluster: Mongos version | Version of the Mongos server |
DEPENDENT | mongodb.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MongoDB sharded cluster | MongoDB cluster: Uptime | Number of seconds since Mongos server start |
DEPENDENT | mongodb.uptime Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB cluster: Operations: command | "The number of commands issued to the database per second. Counts all commands except the write commands: insert, update, and delete." |
DEPENDENT | mongodb.opcounters.command.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB sharded cluster | MongoDB cluster: Operations: delete | The number of delete operations received by the mongos instance per second. |
DEPENDENT | mongodb.opcounters.delete.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB sharded cluster | MongoDB cluster: Operations: update, rate | The number of update operations received by the mongos instance per second. |
DEPENDENT | mongodb.opcounters.update.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB sharded cluster | MongoDB cluster: Operations: query, rate | The number of queries received by the mongos instance per second. |
DEPENDENT | mongodb.opcounters.query.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB sharded cluster | MongoDB cluster: Operations: insert, rate | The number of insert operations received by the mongos instance per second. |
DEPENDENT | mongodb.opcounters.insert.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB sharded cluster | MongoDB cluster: Operations: getmore, rate | "The number of “getmore” operations received by the mongos instance per second. This counter can be high even if the query count is low. Secondary nodes send getMore operations as part of the replication process." |
DEPENDENT | mongodb.opcounters.getmore.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
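The opcounter values MongoDB reports are cumulative totals since server start; the CHANGE_PER_SECOND preprocessing step on the items above converts them to rates. A sketch of that derivative, with hypothetical counter readings:

```python
# CHANGE_PER_SECOND preprocessing turns a monotonically increasing counter
# into a rate: (value - previous value) / (timestamp - previous timestamp).

def change_per_second(prev_value, prev_ts, value, ts):
    if ts <= prev_ts:
        raise ValueError("timestamps must increase")
    return (value - prev_value) / (ts - prev_ts)


# e.g. opcounters.query went from 10000 to 10600 over 60 seconds:
print(change_per_second(10000, 0, 10600, 60))  # -> 10.0 queries/s
```

Zabbix additionally discards the first value after a counter reset (the raw delta would be negative); this sketch omits that detail.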
MongoDB sharded cluster | MongoDB cluster: Last seen configserver | The latest optime of the CSRS primary that the mongos has seen. |
DEPENDENT | mongodb.last_seen_config_server Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB cluster: Configserver heartbeat | Difference between the latest optime of the CSRS primary that the mongos has seen and cluster time. |
DEPENDENT | mongodb.config_server_heartbeat Preprocessing: - JAVASCRIPT: |
MongoDB sharded cluster | MongoDB cluster: Bytes in, rate | The total number of bytes that the server has received over network connections initiated by clients or other mongod/mongos instances per second. |
DEPENDENT | mongodb.network.bytes_in.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB sharded cluster | MongoDB cluster: Bytes out, rate | The total number of bytes that the server has sent over network connections initiated by clients or other mongod/mongos instances per second. |
DEPENDENT | mongodb.network.bytes_out.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB sharded cluster | MongoDB cluster: Requests, rate | Number of distinct requests that the server has received per second |
DEPENDENT | mongodb.network.numRequests.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB sharded cluster | MongoDB cluster: Connections, current | "The number of incoming connections from clients to the database server. This number includes the current shell session" |
DEPENDENT | mongodb.connections.current Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB cluster: New connections, rate | "Rate of all incoming connections created to the server." |
DEPENDENT | mongodb.connections.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB sharded cluster | MongoDB cluster: Connections, active | "The number of active client connections to the server. Active client connections refers to client connections that currently have operations in progress. Available starting in 4.0.7, 0 for older versions." |
DEPENDENT | mongodb.connections.active Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
MongoDB sharded cluster | MongoDB cluster: Connections, available | "The number of unused incoming connections available." |
DEPENDENT | mongodb.connections.available Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB cluster: Connection pool: client connections | The number of active and stored outgoing synchronous connections from the current mongos instance to other members of the sharded cluster. |
DEPENDENT | mongodb.connection_pool.client Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB cluster: Connection pool: scoped | Number of active and stored outgoing scoped synchronous connections from the current mongos instance to other members of the sharded cluster. |
DEPENDENT | mongodb.connection_pool.scoped Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB cluster: Connection pool: created, rate | The total number of outgoing connections created per second by the current mongos instance to other members of the sharded cluster. |
DEPENDENT | mongodb.connection_pool.created.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB sharded cluster | MongoDB cluster: Connection pool: available | The total number of available outgoing connections from the current mongos instance to other members of the sharded cluster. |
DEPENDENT | mongodb.connection_pool.available Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB cluster: Connection pool: in use | Reports the total number of outgoing connections from the current mongos instance to other members of the sharded cluster set that are currently in use. |
DEPENDENT | mongodb.connection_pool.in_use Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB cluster: Connection pool: refreshing | Reports the total number of outgoing connections from the current mongos instance to other members of the sharded cluster that are currently being refreshed. |
DEPENDENT | mongodb.connection_pool.refreshing Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB cluster: Cursor: open no timeout | Number of open cursors with the option DBQuery.Option.noTimeout set to prevent timeout after a period of inactivity. |
DEPENDENT | mongodb.metrics.cursor.open.no_timeout Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
MongoDB sharded cluster | MongoDB cluster: Cursor: open pinned | Number of pinned open cursors. |
DEPENDENT | mongodb.cursor.open.pinned Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB cluster: Cursor: open total | Number of cursors that MongoDB is maintaining for clients. |
DEPENDENT | mongodb.cursor.open.total Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB cluster: Cursor: timed out, rate | Number of cursors that time out, per second. |
DEPENDENT | mongodb.cursor.timed_out.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB sharded cluster | MongoDB cluster: Architecture | A number, either 64 or 32, that indicates whether the MongoDB instance is compiled for 64-bit or 32-bit architecture. |
DEPENDENT | mongodb.mem.bits Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MongoDB sharded cluster | MongoDB cluster: Memory: resident | Amount of memory currently used by the database process. |
DEPENDENT | mongodb.mem.resident Preprocessing: - JSONPATH: - MULTIPLIER: |
MongoDB sharded cluster | MongoDB cluster: Memory: virtual | Amount of virtual memory used by the mongos process. |
DEPENDENT | mongodb.mem.virtual Preprocessing: - JSONPATH: - MULTIPLIER: |
MongoDB sharded cluster | MongoDB {#DBNAME}: Objects, avg size | The average size of each document in bytes. |
DEPENDENT | mongodb.db.size["{#DBNAME}"] Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB {#DBNAME}: Size, data | Total size of the data held in this database including the padding factor. |
DEPENDENT | mongodb.db.data_size["{#DBNAME}"] Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB {#DBNAME}: Size, file | Total size of the data held in this database including the padding factor (only available with the mmapv1 storage engine). |
DEPENDENT | mongodb.db.file_size["{#DBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
MongoDB sharded cluster | MongoDB {#DBNAME}: Size, index | Total size of all indexes created on this database. |
DEPENDENT | mongodb.db.index_size["{#DBNAME}"] Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB {#DBNAME}: Size, storage | Total amount of space allocated to collections in this database for document storage. |
DEPENDENT | mongodb.db.storage_size["{#DBNAME}"] Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB {#DBNAME}: Objects, count | Number of objects (documents) in the database across all collections. |
DEPENDENT | mongodb.db.objects["{#DBNAME}"] Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB {#DBNAME}: Extents | Contains a count of the number of extents in the database across all collections. |
DEPENDENT | mongodb.db.extents["{#DBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
MongoDB sharded cluster | MongoDB {#DBNAME}.{#COLLECTION}: Size | The total size in bytes of the data in the collection plus the size of every indexes on the mongodb.collection. |
DEPENDENT | mongodb.collection.size["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB {#DBNAME}.{#COLLECTION}: Objects, avg size | The size of the average object in the collection in bytes. |
DEPENDENT | mongodb.collection.avgobjsize["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
MongoDB sharded cluster | MongoDB {#DBNAME}.{#COLLECTION}: Objects, count | Total number of objects in the collection. |
DEPENDENT | mongodb.collection.count["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB {#DBNAME}.{#COLLECTION}: Capped, max number | Maximum number of documents in a capped collection. |
DEPENDENT | mongodb.collection.max["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
MongoDB sharded cluster | MongoDB {#DBNAME}.{#COLLECTION}: Capped, max size | Maximum size of a capped collection in bytes. |
DEPENDENT | mongodb.collection.max_size["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
MongoDB sharded cluster | MongoDB {#DBNAME}.{#COLLECTION}: Storage size | Total storage space allocated to this collection for document storage. |
DEPENDENT | mongodb.collection.storage_size["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB {#DBNAME}.{#COLLECTION}: Indexes | Total number of indices on the collection. |
DEPENDENT | mongodb.collection.nindexes["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: |
MongoDB sharded cluster | MongoDB {#DBNAME}.{#COLLECTION}: Capped | Whether or not the collection is capped. |
DEPENDENT | mongodb.collection.capped["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: |
Zabbix raw items | MongoDB cluster: Get server status | The mongos statistic |
ZABBIX_PASSIVE | mongodb.server.status["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] |
Zabbix raw items | MongoDB cluster: Get mongodb.connpool.stats | Returns current info about connpool.stats. |
ZABBIX_PASSIVE | mongodb.connpool.stats["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] |
Zabbix raw items | MongoDB {#DBNAME}: Get db stats {#DBNAME} | Returns statistics reflecting the database system's state. |
ZABBIX_PASSIVE | mongodb.db.stats["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}","{#DBNAME}"] |
Zabbix raw items | MongoDB {#DBNAME}.{#COLLECTION}: Get collection stats {#DBNAME}.{#COLLECTION} | Returns a variety of storage statistics for a given collection. |
ZABBIX_PASSIVE | mongodb.collection.stats["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}","{#DBNAME}","{#COLLECTION}"] |
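The "Zabbix raw items" above are master items: a single `serverStatus`/`dbStats` fetch feeds all the DEPENDENT items, each of which extracts one field with a JSONPath step. A rough equivalent of that fan-out pattern using plain dict access over a trimmed, hypothetical payload (field names follow MongoDB's `serverStatus` output; the values are invented):

```python
import json

# One master fetch, many dependent extractions -- the pattern behind the
# DEPENDENT items in this template. Payload values here are hypothetical.
raw = json.dumps({
    "version": "4.4.10",
    "uptime": 86400,
    "opcounters": {"query": 10600, "insert": 204},
})

doc = json.loads(raw)
dependent_items = {
    "mongodb.version": doc["version"],    # JSONPath $.version
    "mongodb.uptime": doc["uptime"],      # JSONPath $.uptime
    # raw counter; CHANGE_PER_SECOND would be applied on top of this:
    "mongodb.opcounters.query.rate": doc["opcounters"]["query"],
}
print(dependent_items["mongodb.version"])  # -> 4.4.10
```

The design choice matters for load: the cluster is queried once per interval, and every extraction happens inside Zabbix preprocessing rather than as a separate round-trip to mongos.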
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
MongoDB cluster: Connection to mongos proxy is unavailable | Connection to mongos proxy instance is currently unavailable. |
last(/MongoDB cluster by Zabbix agent 2/mongodb.ping["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"])=0 |
HIGH | |
MongoDB cluster: Version has changed | MongoDB cluster version has changed. Ack to close. |
last(/MongoDB cluster by Zabbix agent 2/mongodb.version,#1)<>last(/MongoDB cluster by Zabbix agent 2/mongodb.version,#2) and length(last(/MongoDB cluster by Zabbix agent 2/mongodb.version))>0 |
INFO | Manual close: YES |
MongoDB cluster: has been restarted | Uptime is less than 10 minutes. |
last(/MongoDB cluster by Zabbix agent 2/mongodb.uptime)<10m |
INFO | Manual close: YES |
MongoDB cluster: Failed to fetch info data | Zabbix has not received data for items for the last 10 minutes |
nodata(/MongoDB cluster by Zabbix agent 2/mongodb.uptime,10m)=1 |
WARNING | Manual close: YES Depends on: - MongoDB cluster: Connection to mongos proxy is unavailable |
MongoDB cluster: Available connections is low | "Too few available connections. Consider this value in combination with the value of connections current to understand the connection load on the database" |
max(/MongoDB cluster by Zabbix agent 2/mongodb.connections.available,5m)<{$MONGODB.CONNS.AVAILABLE.MIN.WARN} |
WARNING | |
MongoDB cluster: Too many cursors opened by MongoDB for clients | - |
min(/MongoDB cluster by Zabbix agent 2/mongodb.cursor.open.total,5m)>{$MONGODB.CURSOR.OPEN.MAX.WARN} |
WARNING | |
MongoDB cluster: Too many cursors are timing out | - |
min(/MongoDB cluster by Zabbix agent 2/mongodb.cursor.timed_out.rate,5m)>{$MONGODB.CURSOR.TIMEOUT.MAX.WARN} |
WARNING |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor a single MongoDB server by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
MongoDB node by Zabbix Agent 2
— collects metrics by polling zabbix-agent2.
This template was tested on:
See Zabbix template operation for basic instructions.
Note that, depending on the number of DBs and collections, the discovery operation may be expensive. Use filters with the macros {$MONGODB.LLD.FILTER.DB.MATCHES}, {$MONGODB.LLD.FILTER.DB.NOT_MATCHES}, {$MONGODB.LLD.FILTER.COLLECTION.MATCHES}, {$MONGODB.LLD.FILTER.COLLECTION.NOT_MATCHES}.
Test availability: zabbix_get -s mongodb.node -k 'mongodb.ping["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"]'
No specific Zabbix configuration is required.
Name | Description | Default | ||
---|---|---|---|---|
{$MONGODB.CONNS.PCT.USED.MAX.WARN} | Maximum percentage of used connections |
80 |
||
{$MONGODB.CONNSTRING} | Connection string in the URI format (password is not used). This parameter overrides the value of the "Server" option in the configuration file (if it is set); otherwise, the plugin's default value "tcp://localhost:27017" is used. |
tcp://localhost:27017 |
||
{$MONGODB.CURSOR.OPEN.MAX.WARN} | Maximum number of open cursors |
10000 |
||
{$MONGODB.CURSOR.TIMEOUT.MAX.WARN} | Maximum number of cursors timing out per second |
1 |
||
{$MONGODB.LLD.FILTER.COLLECTION.MATCHES} | Filter of discoverable collections |
.* |
||
{$MONGODB.LLD.FILTER.COLLECTION.NOT_MATCHES} | Filter to exclude discovered collections |
CHANGE_IF_NEEDED |
||
{$MONGODB.LLD.FILTER.DB.MATCHES} | Filter of discoverable databases |
.* |
||
{$MONGODB.LLD.FILTER.DB.NOT_MATCHES} | Filter to exclude discovered databases |
`(admin|config|local)` |
{$MONGODB.PASSWORD} | MongoDB user password |
`` | ||
{$MONGODB.REPL.LAG.MAX.WARN} | Maximum replication lag in seconds |
10s |
||
{$MONGODB.USER} | MongoDB username |
`` | ||
{$MONGODB.WIRED_TIGER.TICKETS.AVAILABLE.MIN.WARN} | Minimum number of available WiredTiger read or write tickets remaining |
5 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Collection discovery | Collect collections metrics. Note, depending on the number of DBs and collections this discovery operation may be expensive. Use filters with macros {$MONGODB.LLD.FILTER.DB.MATCHES}, {$MONGODB.LLD.FILTER.DB.NOT_MATCHES}, {$MONGODB.LLD.FILTER.COLLECTION.MATCHES}, {$MONGODB.LLD.FILTER.COLLECTION.NOT_MATCHES}. |
ZABBIX_PASSIVE | mongodb.collections.discovery["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] Filter: AND - {#DBNAME} MATCHES_REGEX - {#DBNAME} NOT_MATCHES_REGEX - {#COLLECTION} MATCHES_REGEX - {#COLLECTION} NOT_MATCHES_REGEX |
Database discovery | Collect database metrics. Note, depending on the number of DBs this discovery operation may be expensive. Use filters with macros {$MONGODB.LLD.FILTER.DB.MATCHES}, {$MONGODB.LLD.FILTER.DB.NOT_MATCHES}. |
ZABBIX_PASSIVE | mongodb.db.discovery["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] Filter: AND - {#DBNAME} MATCHES_REGEX - {#DBNAME} NOT_MATCHES_REGEX |
Replication discovery | Collect metrics by Zabbix agent if it exists |
DEPENDENT | mongodb.rs.discovery Preprocessing: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: Overrides: Primary metrics - ITEMPROTOTYPE LIKE Unhealthy replicas - DISCOVER- ITEMPROTOTYPE LIKE Number of unhealthy replicas - DISCOVER- ITEMPROTOTYPE LIKE Replication lag - NODISCOVERArbiter metrics 7 - ITEMPROTOTYPE LIKE Replication lag - NO_DISCOVER |
WiredTiger metrics | Collect metrics of WiredTiger Storage Engine if it exists |
DEPENDENT | mongodb.wiredtiger.discovery Preprocessing: - JAVASCRIPT: - DISCARD UNCHANGED_HEARTBEAT:6h |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
MongoDB | MongoDB: Ping | Test if a connection is alive or not. |
ZABBIX_PASSIVE | mongodb.ping["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
MongoDB | MongoDB: MongoDB version | Version of the MongoDB server. |
DEPENDENT | mongodb.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MongoDB | MongoDB: Uptime | Number of seconds that the mongod process has been active. |
DEPENDENT | mongodb.uptime Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Asserts: message, rate | The number of message assertions raised per second. Check the log file for more information about these messages. |
DEPENDENT | mongodb.asserts.msg.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Asserts: user, rate | The number of “user asserts” that have occurred per second. These are errors that a user may generate, such as out of disk space or a duplicate key. |
DEPENDENT | mongodb.asserts.user.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Asserts: warning, rate | The number of warnings raised per second. |
DEPENDENT | mongodb.asserts.warning.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Asserts: regular, rate | The number of regular assertions raised per second. Check the log file for more information about these messages. |
DEPENDENT | mongodb.asserts.regular.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Asserts: rollovers, rate | Number of times that the rollover counters roll over per second. The counters roll over to zero every 2^30 assertions. |
DEPENDENT | mongodb.asserts.rollovers.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Active clients: writers | The number of active client connections performing write operations. |
DEPENDENT | mongodb.active_clients.writers Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Active clients: readers | The number of the active client connections performing read operations. |
DEPENDENT | mongodb.active_clients.readers Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Active clients: total | The total number of internal client connections to the database including system threads as well as queued readers and writers. |
DEPENDENT | mongodb.active_clients.total Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Current queue: writers | The number of operations that are currently queued and waiting for the write lock. A consistently small write-queue, particularly of shorter operations, is no cause for concern. |
DEPENDENT | mongodb.current_queue.writers Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Current queue: readers | The number of operations that are currently queued and waiting for the read lock. A consistently small read-queue, particularly of shorter operations, should cause no concern. |
DEPENDENT | mongodb.current_queue.readers Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Current queue: total | The total number of operations queued waiting for the lock. |
DEPENDENT | mongodb.current_queue.total Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Operations: command, rate | The number of commands issued to the database on the mongod instance per second. Counts all commands except the write commands: insert, update, and delete. |
DEPENDENT | mongodb.opcounters.command.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Operations: delete, rate | The number of delete operations received by the mongod instance per second. |
DEPENDENT | mongodb.opcounters.delete.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Operations: update, rate | The number of update operations received by the mongod instance per second. |
DEPENDENT | mongodb.opcounters.update.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Operations: query, rate | The number of queries received by the mongod instance per second. |
DEPENDENT | mongodb.opcounters.query.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Operations: insert, rate | The number of insert operations received by the mongod instance per second. |
DEPENDENT | mongodb.opcounters.insert.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Operations: getmore, rate | The number of “getmore” operations received by the mongod instance per second. This counter can be high even if the query count is low. Secondary nodes send getMore operations as part of the replication process. |
DEPENDENT | mongodb.opcounters.getmore.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Connections, current | The number of incoming connections from clients to the database server. This number includes the current shell session. |
DEPENDENT | mongodb.connections.current Preprocessing: - JSONPATH: |
MongoDB | MongoDB: New connections, rate | Rate of all incoming connections created to the server. |
DEPENDENT | mongodb.connections.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Connections, available | The number of unused incoming connections available. |
DEPENDENT | mongodb.connections.available Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Connections, active | The number of active client connections to the server. Active client connections refers to client connections that currently have operations in progress. Available starting in 4.0.7, 0 for older versions. |
DEPENDENT | mongodb.connections.active Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
MongoDB | MongoDB: Bytes in, rate | The total number of bytes that the server has received over network connections initiated by clients or other mongod/mongos instances per second. |
DEPENDENT | mongodb.network.bytes_in.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Bytes out, rate | The total number of bytes that the server has sent over network connections initiated by clients or other mongod/mongos instances per second. |
DEPENDENT | mongodb.network.bytes_out.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Requests, rate | Number of distinct requests that the server has received per second. |
DEPENDENT | mongodb.network.numRequests.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Document: deleted, rate | Number of documents deleted per second. |
DEPENDENT | mongod.document.deleted.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Document: inserted, rate | Number of documents inserted per second. |
DEPENDENT | mongod.document.inserted.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Document: returned, rate | Number of documents returned by queries per second. |
DEPENDENT | mongod.document.returned.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Document: updated, rate | Number of documents updated per second. |
DEPENDENT | mongod.document.updated.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Cursor: open no timeout | Number of open cursors with the option DBQuery.Option.noTimeout set to prevent timeout after a period of inactivity. |
DEPENDENT | mongodb.metrics.cursor.open.no_timeout Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Cursor: open pinned | Number of pinned open cursors. |
DEPENDENT | mongodb.cursor.open.pinned Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Cursor: open total | Number of cursors that MongoDB is maintaining for clients. |
DEPENDENT | mongodb.cursor.open.total Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Cursor: timed out, rate | Number of cursors that time out, per second. |
DEPENDENT | mongodb.cursor.timed_out.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Architecture | A number, either 64 or 32, that indicates whether the MongoDB instance is compiled for 64-bit or 32-bit architecture. |
DEPENDENT | mongodb.mem.bits Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MongoDB | MongoDB: Memory: mapped | Amount of mapped memory by the database. |
DEPENDENT | mongodb.mem.mapped Preprocessing: - JSONPATH: ⛔️ON_FAIL: - MULTIPLIER: |
MongoDB | MongoDB: Memory: mapped with journal | The amount of mapped memory, including the memory used for journaling. |
DEPENDENT | mongodb.mem.mapped_with_journal Preprocessing: - JSONPATH: ⛔️ON_FAIL: - MULTIPLIER: |
MongoDB | MongoDB: Memory: resident | Amount of memory currently used by the database process. |
DEPENDENT | mongodb.mem.resident Preprocessing: - JSONPATH: - MULTIPLIER: |
MongoDB | MongoDB: Memory: virtual | Amount of virtual memory used by the mongod process. |
DEPENDENT | mongodb.mem.virtual Preprocessing: - JSONPATH: - MULTIPLIER: |
MongoDB | MongoDB {#DBNAME}: Objects, avg size | The average size of each document in bytes. |
DEPENDENT | mongodb.db.size["{#DBNAME}"] Preprocessing: - JSONPATH: |
MongoDB | MongoDB {#DBNAME}: Size, data | Total size of the data held in this database including the padding factor. |
DEPENDENT | mongodb.db.data_size["{#DBNAME}"] Preprocessing: - JSONPATH: |
MongoDB | MongoDB {#DBNAME}: Size, file | Total size of the data held in this database including the padding factor (only available with the mmapv1 storage engine). |
DEPENDENT | mongodb.db.file_size["{#DBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
MongoDB | MongoDB {#DBNAME}: Size, index | Total size of all indexes created on this database. |
DEPENDENT | mongodb.db.index_size["{#DBNAME}"] Preprocessing: - JSONPATH: |
MongoDB | MongoDB {#DBNAME}: Size, storage | Total amount of space allocated to collections in this database for document storage. |
DEPENDENT | mongodb.db.storage_size["{#DBNAME}"] Preprocessing: - JSONPATH: |
MongoDB | MongoDB {#DBNAME}: Collections | Contains a count of the number of collections in that database. |
DEPENDENT | mongodb.db.collections["{#DBNAME}"] Preprocessing: - JSONPATH: |
MongoDB | MongoDB {#DBNAME}: Objects, count | Number of objects (documents) in the database across all collections. |
DEPENDENT | mongodb.db.objects["{#DBNAME}"] Preprocessing: - JSONPATH: |
MongoDB | MongoDB {#DBNAME}: Extents | Contains a count of the number of extents in the database across all collections. |
DEPENDENT | mongodb.db.extents["{#DBNAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Size | The total size in bytes of the data in the collection plus the size of every index on the collection. |
DEPENDENT | mongodb.collection.size["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Objects, avg size | The size of the average object in the collection in bytes. |
DEPENDENT | mongodb.collection.avgobjsize["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Objects, count | Total number of objects in the collection. |
DEPENDENT | mongodb.collection.count["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Capped: max number | Maximum number of documents that may be present in a capped collection. |
DEPENDENT | mongodb.collection.max_number["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Capped: max size | Maximum size of a capped collection in bytes. |
DEPENDENT | mongodb.collection.max_size["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Storage size | Total storage space allocated to this collection for document storage. |
DEPENDENT | mongodb.collection.storage_size["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Indexes | Total number of indices on the collection. |
DEPENDENT | mongodb.collection.nindexes["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Capped | Whether or not the collection is capped. |
DEPENDENT | mongodb.collection.capped["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: total, rate | The number of operations per second. |
DEPENDENT | mongodb.collection.ops.total.rate["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Read lock, rate | The number of operations per second. |
DEPENDENT | mongodb.collection.readlock.rate["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Write lock, rate | The number of operations per second. |
DEPENDENT | mongodb.collection.writelock.rate["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: queries, rate | The number of operations per second. |
DEPENDENT | mongodb.collection.ops.queries.rate["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: getmore, rate | The number of operations per second. |
DEPENDENT | mongodb.collection.ops.getmore.rate["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: insert, rate | The number of operations per second. |
DEPENDENT | mongodb.collection.ops.insert.rate["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: update, rate | The number of operations per second. |
DEPENDENT | mongodb.collection.ops.update.rate["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: remove, rate | The number of operations per second. |
DEPENDENT | mongodb.collection.ops.remove.rate["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: commands, rate | The number of operations per second. |
DEPENDENT | mongodb.collection.ops.commands.rate["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: total, ms/s | Fraction of time (ms/s) the mongod has spent on operations. |
DEPENDENT | mongodb.collection.ops.total.ms["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Read lock, ms/s | Fraction of time (ms/s) the mongod has spent on operations. |
DEPENDENT | mongodb.collection.readlock.ms["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Write lock, ms/s | Fraction of time (ms/s) the mongod has spent on operations. |
DEPENDENT | mongodb.collection.writelock.ms["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: queries, ms/s | Fraction of time (ms/s) the mongod has spent on operations. |
DEPENDENT | mongodb.collection.ops.queries.ms["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: getmore, ms/s | Fraction of time (ms/s) the mongod has spent on operations. |
DEPENDENT | mongodb.collection.ops.getmore.ms["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: insert, ms/s | Fraction of time (ms/s) the mongod has spent on operations. |
DEPENDENT | mongodb.collection.ops.insert.ms["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: update, ms/s | Fraction of time (ms/s) the mongod has spent on operations. |
DEPENDENT | mongodb.collection.ops.update.ms["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: remove, ms/s | Fraction of time (ms/s) the mongod has spent on operations. |
DEPENDENT | mongodb.collection.ops.remove.ms["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#DBNAME}.{#COLLECTION}: Operations: commands, ms/s | Fraction of time (ms/s) the mongod has spent on operations. |
DEPENDENT | mongodb.collection.ops.commands.ms["{#DBNAME}","{#COLLECTION}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Node state | An integer between 0 and 10 that represents the replica state of the current member. |
DEPENDENT | mongodb.rs.state[{#RS_NAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
MongoDB | MongoDB: Replication lag | Delay between a write operation on the primary and its copy to a secondary. |
DEPENDENT | mongodb.rs.lag[{#RS_NAME}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Number of replicas | The number of replicated nodes in current ReplicaSet. |
DEPENDENT | mongodb.rs.total_nodes[{#RS_NAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MongoDB | MongoDB: Number of unhealthy replicas | The number of replicated nodes with member health value = 0. |
DEPENDENT | mongodb.rs.unhealthy_count[{#RS_NAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
MongoDB | MongoDB: Unhealthy replicas | The replicated nodes in current ReplicaSet with member health value = 0. |
DEPENDENT | mongodb.rs.unhealthy[{#RS_NAME}] Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
MongoDB | MongoDB: Apply batches, rate | Number of batches applied across all databases per second. |
DEPENDENT | mongodb.rs.apply.batches.rate[{#RS_NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Apply batches, ms/s | Fraction of time (ms/s) the mongod has spent applying operations from the oplog. |
DEPENDENT | mongodb.rs.apply.batches.ms.rate[{#RS_NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Apply ops, rate | Number of oplog operations applied per second. |
DEPENDENT | mongodb.rs.apply.rate[{#RS_NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Buffer | Number of operations in the oplog buffer. |
DEPENDENT | mongodb.rs.buffer.count[{#RS_NAME}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Buffer, max size | Maximum size of the buffer. |
DEPENDENT | mongodb.rs.buffer.max_size[{#RS_NAME}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Buffer, size | Current size of the contents of the oplog buffer. |
DEPENDENT | mongodb.rs.buffer.size[{#RS_NAME}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Network bytes, rate | Amount of data read from the replication sync source per second. |
DEPENDENT | mongodb.rs.network.bytes.rate[{#RS_NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Network getmores, rate | Number of getmore operations per second. |
DEPENDENT | mongodb.rs.network.getmores.rate[{#RS_NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Network getmores, ms/s | Fraction of time (ms/s) required to collect data from getmore operations. |
DEPENDENT | mongodb.rs.network.getmores.ms.rate[{#RS_NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Network ops, rate | Number of operations read from the replication source per second. |
DEPENDENT | mongodb.rs.network.ops.rate[{#RS_NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB: Network readers created, rate | Number of oplog query processes created per second. |
DEPENDENT | mongodb.rs.network.readers.rate[{#RS_NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
MongoDB | MongoDB {#RS_NAME}: Oplog time diff | Oplog window: difference between the first and last operation in the oplog. Only present if there are entries in the oplog. |
DEPENDENT | mongodb.rs.oplog.timediff[{#RS_NAME}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: Preload docs, rate | Number of documents loaded per second during the pre-fetch stage of replication. |
DEPENDENT | mongodb.rs.preload.docs.rate[{#RS_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
MongoDB | MongoDB: Preload docs, ms/s | Fraction of time (ms/s) spent loading documents as part of the pre-fetch stage of replication. |
DEPENDENT | mongodb.rs.preload.docs.ms.rate[{#RS_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
MongoDB | MongoDB: Preload indexes, rate | Number of index entries loaded by members before updating documents as part of the pre-fetch stage of replication. |
DEPENDENT | mongodb.rs.preload.indexes.rate[{#RS_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
MongoDB | MongoDB: Preload indexes, ms/s | Fraction of time (ms/s) spent loading index entries as part of the pre-fetch stage of replication. |
DEPENDENT | mongodb.rs.preload.indexes.ms.rate[{#RS_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
MongoDB | MongoDB: WiredTiger cache: bytes | Size of the data currently in cache. |
DEPENDENT | mongodb.wired_tiger.cache.bytes_in_cache[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger cache: in-memory page splits | In-memory page splits. |
DEPENDENT | mongodb.wired_tiger.cache.splits[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger cache: bytes, max | Maximum cache size. |
DEPENDENT | mongodb.wired_tiger.cache.maximum_bytes_configured[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger cache: max page size at eviction | Maximum page size at eviction. |
DEPENDENT | mongodb.wired_tiger.cache.max_page_size_eviction[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger cache: modified pages evicted | Number of pages that have been modified and evicted from the cache. |
DEPENDENT | mongodb.wired_tiger.cache.modified_pages_evicted[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger cache: pages read into cache | Number of pages read into the cache. |
DEPENDENT | mongodb.wired_tiger.cache.pages_read[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger cache: pages written from cache | Number of pages written from the cache. |
DEPENDENT | mongodb.wired_tiger.cache.pages_written[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger cache: pages held in cache | Number of pages currently held in the cache. |
DEPENDENT | mongodb.wired_tiger.cache.pages_in_cache[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger cache: pages evicted by application threads, rate | Number of pages evicted by application threads per second. |
DEPENDENT | mongodb.wired_tiger.cache.pages_evicted_threads.rate[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger cache: tracked dirty bytes in the cache | Size of the dirty data in the cache. |
DEPENDENT | mongodb.wired_tiger.cache.tracked_dirty_bytes[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger cache: unmodified pages evicted | Number of pages that were not modified and were evicted from the cache. |
DEPENDENT | mongodb.wired_tiger.cache.unmodified_pages_evicted[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger concurrent transactions: read, available | Number of available read tickets (concurrent transactions) remaining. |
DEPENDENT | mongodb.wired_tiger.concurrent_transactions.read.available[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger concurrent transactions: read, out | Number of read tickets (concurrent transactions) in use. |
DEPENDENT | mongodb.wired_tiger.concurrent_transactions.read.out[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger concurrent transactions: read, total tickets | Total number of read tickets (concurrent transactions) available. |
DEPENDENT | mongodb.wired_tiger.concurrent_transactions.read.totalTickets[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger concurrent transactions: write, available | Number of available write tickets (concurrent transactions) remaining. |
DEPENDENT | mongodb.wired_tiger.concurrent_transactions.write.available[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger concurrent transactions: write, out | Number of write tickets (concurrent transactions) in use. |
DEPENDENT | mongodb.wired_tiger.concurrent_transactions.write.out[{#SINGLETON}] Preprocessing: - JSONPATH: |
MongoDB | MongoDB: WiredTiger concurrent transactions: write, total tickets | Total number of write tickets (concurrent transactions) available. |
DEPENDENT | mongodb.wired_tiger.concurrent_transactions.write.totalTickets[{#SINGLETON}] Preprocessing: - JSONPATH: |
Zabbix raw items | MongoDB: Get server status | Returns a database's state. |
ZABBIX_PASSIVE | mongodb.server.status["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] |
Zabbix raw items | MongoDB: Get Replica Set status | Returns the replica set status from the point of view of the member where the method is run. |
ZABBIX_PASSIVE | mongodb.rs.status["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] |
Zabbix raw items | MongoDB: Get oplog stats | Returns status of the replica set, using data polled from the oplog. |
ZABBIX_PASSIVE | mongodb.oplog.stats["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] |
Zabbix raw items | MongoDB: Get collections usage stats | Returns usage statistics for each collection. |
ZABBIX_PASSIVE | mongodb.collections.usage["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"] |
Zabbix raw items | MongoDB {#DBNAME}: Get db stats {#DBNAME} | Returns statistics reflecting the database system's state. |
ZABBIX_PASSIVE | mongodb.db.stats["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}","{#DBNAME}"] |
Zabbix raw items | MongoDB {#DBNAME}.{#COLLECTION}: Get collection stats {#DBNAME}.{#COLLECTION} | Returns a variety of storage statistics for a given collection. |
ZABBIX_PASSIVE | mongodb.collection.stats["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}","{#DBNAME}","{#COLLECTION}"] |
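Many of the rate items above rely on the CHANGE_PER_SECOND preprocessing step, which turns a monotonically growing counter into a per-second rate. A minimal sketch of that calculation (names are illustrative):

```python
def change_per_second(prev_value, prev_ts, value, ts):
    """Zabbix CHANGE_PER_SECOND: (value - prev_value) / (ts - prev_ts).
    Returns None when there is no elapsed time to divide by."""
    dt = ts - prev_ts
    if dt <= 0:
        return None
    return (value - prev_value) / dt

# A counter that grew from 100 to 160 over 30 seconds -> 2 ops/s
print(change_per_second(100, 1000, 160, 1030))  # 2.0
```

Note that Zabbix discards the first collected value (there is nothing to diff against), so rate items start reporting from the second poll.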
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
MongoDB: Connection to MongoDB is unavailable | Connection to MongoDB instance is currently unavailable. |
last(/MongoDB node by Zabbix agent 2/mongodb.ping["{$MONGODB.CONNSTRING}","{$MONGODB.USER}","{$MONGODB.PASSWORD}"])=0 |
HIGH | |
MongoDB: Version has changed | MongoDB version has changed. Ack to close. |
last(/MongoDB node by Zabbix agent 2/mongodb.version,#1)<>last(/MongoDB node by Zabbix agent 2/mongodb.version,#2) and length(last(/MongoDB node by Zabbix agent 2/mongodb.version))>0 |
INFO | Manual close: YES |
MongoDB: has been restarted | Uptime is less than 10 minutes. |
last(/MongoDB node by Zabbix agent 2/mongodb.uptime)<10m |
INFO | Manual close: YES |
MongoDB: Failed to fetch info data | Zabbix has not received data for items for the last 10 minutes. |
nodata(/MongoDB node by Zabbix agent 2/mongodb.uptime,10m)=1 |
WARNING | Manual close: YES Depends on: - MongoDB: Connection to MongoDB is unavailable |
MongoDB: Total number of open connections is too high | Too few available connections. If MongoDB runs low on connections, it may not be able to handle incoming requests in a timely manner. |
min(/MongoDB node by Zabbix agent 2/mongodb.connections.current,5m)/(last(/MongoDB node by Zabbix agent 2/mongodb.connections.available)+last(/MongoDB node by Zabbix agent 2/mongodb.connections.current))*100>{$MONGODB.CONNS.PCT.USED.MAX.WARN} |
WARNING | |
MongoDB: Too many cursors opened by MongoDB for clients | - |
min(/MongoDB node by Zabbix agent 2/mongodb.cursor.open.total,5m)>{$MONGODB.CURSOR.OPEN.MAX.WARN} |
WARNING | |
MongoDB: Too many cursors are timing out | - |
min(/MongoDB node by Zabbix agent 2/mongodb.cursor.timed_out.rate,5m)>{$MONGODB.CURSOR.TIMEOUT.MAX.WARN} |
WARNING | |
MongoDB: Node in ReplicaSet changed the state | Node in ReplicaSet changed the state. Ack to close. |
last(/MongoDB node by Zabbix agent 2/mongodb.rs.state[{#RS_NAME}],#1)<>last(/MongoDB node by Zabbix agent 2/mongodb.rs.state[{#RS_NAME}],#2) |
WARNING | Manual close: YES |
MongoDB: Replication lag with primary is too high | - |
min(/MongoDB node by Zabbix agent 2/mongodb.rs.lag[{#RS_NAME}],5m)>{$MONGODB.REPL.LAG.MAX.WARN} |
WARNING | |
MongoDB: There are unhealthy replicas in ReplicaSet | - |
last(/MongoDB node by Zabbix agent 2/mongodb.rs.unhealthy_count[{#RS_NAME}])>0 and length(last(/MongoDB node by Zabbix agent 2/mongodb.rs.unhealthy[{#RS_NAME}]))>0 |
AVERAGE | |
MongoDB: Available WiredTiger read tickets is low | "Too few available read tickets. When the number of available read tickets remaining reaches zero, new read requests will be queued until a new read ticket is available." |
max(/MongoDB node by Zabbix agent 2/mongodb.wired_tiger.concurrent_transactions.read.available[{#SINGLETON}],5m)<{$MONGODB.WIRED_TIGER.TICKETS.AVAILABLE.MIN.WARN} |
WARNING | |
MongoDB: Available WiredTiger write tickets is low | "Too few available write tickets. When the number of available write tickets remaining reaches zero, new write requests will be queued until a new write ticket is available." |
max(/MongoDB node by Zabbix agent 2/mongodb.wired_tiger.concurrent_transactions.write.available[{#SINGLETON}],5m)<{$MONGODB.WIRED_TIGER.TICKETS.AVAILABLE.MIN.WARN} |
WARNING |
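The "Total number of open connections is too high" trigger computes the share of the connection limit in use from the `mongodb.connections.current` and `mongodb.connections.available` items. The same arithmetic as a standalone sketch:

```python
def connections_used_pct(current, available):
    """Percentage of the connection limit in use, as in the trigger
    expression: current / (available + current) * 100."""
    return current / (available + current) * 100

# 80 connections in use out of a limit of 100 -> 80%
print(connections_used_pct(80, 20))  # 80.0
```

The trigger fires when this percentage exceeds {$MONGODB.CONNS.PCT.USED.MAX.WARN} for 5 minutes.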
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor InfluxDB by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template InfluxDB by HTTP
— collects metrics by HTTP agent from InfluxDB /metrics endpoint.
See:
This template was tested on:
See Zabbix template operation for basic instructions.
This template works with self-hosted InfluxDB instances. Internal service metrics are collected from the InfluxDB /metrics endpoint. For organizations discovery, the template needs to use authorization via an API token. See docs: https://docs.influxdata.com/influxdb/v2.0/security/tokens/
Don't forget to change the macros {$INFLUXDB.URL}, {$INFLUXDB.API.TOKEN}. Also, see the Macros section for a list of macros used to set trigger values. NOTE: Some metrics may not be collected depending on your InfluxDB instance version and configuration.
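Organizations discovery authenticates against the InfluxDB v2 API with an `Authorization: Token <token>` header. A minimal sketch of building such a request with the Python standard library (the URL, token value, and the `/api/v2/orgs` endpoint here are illustrative assumptions, not taken from the template itself):

```python
import urllib.request

INFLUXDB_URL = "http://localhost:8086"   # {$INFLUXDB.URL}
API_TOKEN = "my-secret-token"            # {$INFLUXDB.API.TOKEN}, placeholder

# The standard InfluxDB v2 endpoint for listing organizations
req = urllib.request.Request(
    f"{INFLUXDB_URL}/api/v2/orgs",
    headers={"Authorization": f"Token {API_TOKEN}"},
)
print(req.get_header("Authorization"))  # Token my-secret-token
# urllib.request.urlopen(req) would then return the JSON list of organizations
```

If the token lacks read access to orgs, the API returns 401/403 and the discovery rule collects nothing.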
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$INFLUXDB.API.TOKEN} | InfluxDB API Authorization Token |
`` |
{$INFLUXDB.ORG_NAME.MATCHES} | Filter of discoverable organizations |
.* |
{$INFLUXDB.ORG_NAME.NOT_MATCHES} | Filter to exclude discovered organizations |
CHANGE_IF_NEEDED |
{$INFLUXDB.REQ.FAIL.MAX.WARN} | Maximum number of query requests failures for trigger expression. |
2 |
{$INFLUXDB.TASK.RUN.FAIL.MAX.WARN} | Maximum number of task run failures for trigger expression. |
2 |
{$INFLUXDB.URL} | InfluxDB instance URL |
http://localhost:8086 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Organizations discovery | Discovery of organizations metrics. |
HTTP_AGENT | influxdb.orgs.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#ORG_NAME} NOT_MATCHES_REGEX - {#ORG_NAME} MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
InfluxDB | InfluxDB: Instance status | Get the health of an instance. |
HTTP_AGENT | influx.healthcheck Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
InfluxDB | InfluxDB: Boltdb reads, rate | Total number of boltdb reads per second. |
DEPENDENT | influxdb.boltdb_reads.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
InfluxDB | InfluxDB: Boltdb writes, rate | Total number of boltdb writes per second. |
DEPENDENT | influxdb.boltdb_writes.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
InfluxDB | InfluxDB: Buckets, total | Number of total buckets on the server. |
DEPENDENT | influxdb.buckets.total Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
InfluxDB | InfluxDB: Dashboards, total | Number of total dashboards on the server. |
DEPENDENT | influxdb.dashboards.total Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
InfluxDB | InfluxDB: Organizations, total | Number of total organizations on the server. |
DEPENDENT | influxdb.organizations.total Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
InfluxDB | InfluxDB: Scrapers, total | Number of total scrapers on the server. |
DEPENDENT | influxdb.scrapers.total Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
InfluxDB | InfluxDB: Telegraf plugins, total | Number of individual telegraf plugins configured. |
DEPENDENT | influxdb.telegraf_plugins.total Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - DISCARD_UNCHANGED_HEARTBEAT: |
InfluxDB | InfluxDB: Telegrafs, total | Number of total telegraf configurations on the server. |
DEPENDENT | influxdb.telegrafs.total Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
InfluxDB | InfluxDB: Tokens, total | Number of total tokens on the server. |
DEPENDENT | influxdb.tokens.total Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
InfluxDB | InfluxDB: Users, total | Number of total users on the server. |
DEPENDENT | influxdb.users.total Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
InfluxDB | InfluxDB: Version | Version of the InfluxDB instance. |
DEPENDENT | influxdb.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
InfluxDB | InfluxDB: Uptime | InfluxDB process uptime in seconds. |
DEPENDENT | influxdb.uptime Preprocessing: - JSONPATH: |
InfluxDB | InfluxDB: Workers currently running | Total number of workers currently running tasks. |
DEPENDENT | influxdb.task_executor_runs_active.total Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
InfluxDB | InfluxDB: Workers busy, pct | Percent of total available workers that are currently busy. |
DEPENDENT | influxdb.task_executor_workers_busy.pct Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
InfluxDB | InfluxDB: Task runs failed, rate | Total number of failure runs across all tasks. |
DEPENDENT | influxdb.task_executor_complete.failed.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
InfluxDB | InfluxDB: Task runs successful, rate | Total number of successfully completed runs across all tasks. |
DEPENDENT | influxdb.task_executor_complete.successful.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
InfluxDB | InfluxDB: [{#ORG_NAME}] Query requests bytes, success | Count of bytes received with status 200 per second. |
DEPENDENT | influxdb.org.query_request_bytes.success.rate["{#ORG_NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
InfluxDB | InfluxDB: [{#ORG_NAME}] Query requests bytes, failed | Count of bytes received with status not 200 per second. |
DEPENDENT | influxdb.org.query_request_bytes.failed.rate["{#ORG_NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
InfluxDB | InfluxDB: [{#ORG_NAME}] Query requests, failed | Total number of query requests with status not 200 per second. |
DEPENDENT | influxdb.org.query_request.failed.rate["{#ORG_NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
InfluxDB | InfluxDB: [{#ORG_NAME}] Query requests, success | Total number of query requests with status 200 per second. |
DEPENDENT | influxdb.org.query_request.success.rate["{#ORG_NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
InfluxDB | InfluxDB: [{#ORG_NAME}] Query response bytes, success | Count of bytes returned with status 200 per second. |
DEPENDENT | influxdb.org.http_query_response_bytes.success.rate["{#ORG_NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
InfluxDB | InfluxDB: [{#ORG_NAME}] Query response bytes, failed | Count of bytes returned with status not 200 per second. |
DEPENDENT | influxdb.org.http_query_response_bytes.failed.rate["{#ORG_NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Zabbix raw items | InfluxDB: Get instance metrics | - |
HTTP_AGENT | influx.get_metrics Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: DISCARD_VALUE -> - PROMETHEUS_TO_JSON |
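The master item above applies PROMETHEUS_TO_JSON, which converts the Prometheus exposition text fetched from /metrics into a JSON array that the dependent items then query with JSONPath. A simplified sketch of the idea; the real Zabbix step emits additional fields (such as the raw line and metric type) and handles comments and histograms more carefully, and the metric names below are made up for illustration:

```python
import json
import re

# One line of Prometheus exposition format:
#   metric_name{label="value",...} 42
LINE_RE = re.compile(r'^(\w+)(?:\{(.*)\})?\s+(\S+)$')

def prometheus_to_json(text: str) -> str:
    """Very rough sketch of Zabbix's PROMETHEUS_TO_JSON step:
    each sample becomes {"name": ..., "value": ..., "labels": {...}}."""
    out = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comments
            continue
        m = LINE_RE.match(line)
        if not m:
            continue
        name, raw_labels, value = m.groups()
        labels = dict(re.findall(r'(\w+)="([^"]*)"', raw_labels or ""))
        out.append({"name": name, "value": value, "labels": labels})
    return json.dumps(out)

sample = '''# HELP boltdb_reads_total Total number of boltdb reads
boltdb_reads_total 123
task_executor_total_runs_complete{status="failed"} 4
'''
print(prometheus_to_json(sample))
```

The dependent items then select individual samples from this array with JSONPath expressions, which is why a single HTTP request can feed the whole item set.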
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
InfluxDB: Health check failed | The InfluxDB instance is not available or unhealthy. |
last(/InfluxDB by HTTP/influx.healthcheck)=0 |
HIGH | |
InfluxDB: Version has changed | InfluxDB version has changed. Ack to close. |
last(/InfluxDB by HTTP/influxdb.version,#1)<>last(/InfluxDB by HTTP/influxdb.version,#2) and length(last(/InfluxDB by HTTP/influxdb.version))>0 |
INFO | Manual close: YES |
InfluxDB: has been restarted | Uptime is less than 10 minutes. |
last(/InfluxDB by HTTP/influxdb.uptime)<10m |
INFO | Manual close: YES |
InfluxDB: Too many tasks failure runs | The number of failure runs completed across all tasks is too high. |
min(/InfluxDB by HTTP/influxdb.task_executor_complete.failed.rate,5m)>{$INFLUXDB.TASK.RUN.FAIL.MAX.WARN} |
WARNING | |
InfluxDB: [{#ORG_NAME}]: Too many request failures | Too many query requests failed. |
min(/InfluxDB by HTTP/influxdb.org.query_request.failed.rate["{#ORG_NAME}"],5m)>{$INFLUXDB.REQ.FAIL.MAX.WARN} |
WARNING |
For Zabbix version: 6.2 and higher
Official JMX Template for Apache Ignite computing platform.
This template is based on the original template developed by Igor Akkuratov, Senior Engineer at GridGain Systems and Apache Ignite Contributor.
This template was tested on:
See Zabbix template operation for basic instructions.
This template works with standalone and cluster instances. Metrics are collected by JMX. All metrics are discoverable.
Use the JVM option
-DIGNITE_MBEAN_APPEND_CLASS_LOADER_ID=false
to exclude one level with the Classloader name. You can configure which Cache and Data Region metrics to collect using the official guide. No specific Zabbix configuration is required.
Name | Description | Default | |||
---|---|---|---|---|---|
{$IGNITE.CHECKPOINT.PUSED.MAX.HIGH} | The maximum percent of checkpoint buffer utilization for high trigger expression. |
80 |
|||
{$IGNITE.CHECKPOINT.PUSED.MAX.WARN} | The maximum percent of checkpoint buffer utilization for warning trigger expression. |
66 |
|||
{$IGNITE.DATA.REGION.PUSED.MAX.HIGH} | The maximum percent of data region utilization for high trigger expression. |
90 |
|||
{$IGNITE.DATA.REGION.PUSED.MAX.WARN} | The maximum percent of data region utilization for warning trigger expression. |
80 |
|||
{$IGNITE.JOBS.QUEUE.MAX.WARN} | The maximum number of queued jobs for trigger expression. |
10 |
|||
{$IGNITE.LLD.FILTER.CACHE.MATCHES} | Filter of discoverable cache groups. |
.* |
|||
{$IGNITE.LLD.FILTER.CACHE.NOT_MATCHES} | Filter to exclude discovered cache groups. |
CHANGE_IF_NEEDED |
|||
{$IGNITE.LLD.FILTER.DATA.REGION.MATCHES} | Filter of discoverable data regions. |
.* |
|||
{$IGNITE.LLD.FILTER.DATA.REGION.NOT_MATCHES} | Filter to exclude discovered data regions. |
`^(sysMemPlc|TxLog)$` |
|||
{$IGNITE.LLD.FILTER.THREAD.POOL.MATCHES} | Filter of discoverable thread pools. |
.* |
|||
{$IGNITE.LLD.FILTER.THREAD.POOL.NOT_MATCHES} | Filter to exclude discovered thread pools. |
`^(GridCallbackExecutor|GridRebalanceStripedExecutor|GridDataStreamExecutor|StripedExecutor)$` |
|||
{$IGNITE.PASSWORD} | - |
<secret> |
|||
{$IGNITE.PME.DURATION.MAX.HIGH} | The maximum PME duration in ms for high trigger expression. |
60000 |
|||
{$IGNITE.PME.DURATION.MAX.WARN} | The maximum PME duration in ms for warning trigger expression. |
10000 |
|||
{$IGNITE.THREAD.QUEUE.MAX.WARN} | Threshold for thread pool queue size. Can be used with thread pool name as context. |
1000 |
|||
{$IGNITE.THREADS.COUNT.MAX.WARN} | The maximum number of running threads for trigger expression. |
1000 |
|||
{$IGNITE.USER} | - |
zabbix |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Cache groups | - |
JMX | jmx.discovery[beans,"org.apache:group=\"Cache groups\",*"] Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#JMXNAME} MATCHES_REGEX - {#JMXNAME} NOT_MATCHES_REGEX |
Cache metrics | - |
JMX | jmx.discovery[beans,"org.apache:name=\"org.apache.ignite.internal.processors.cache.CacheLocalMetricsMXBeanImpl\",*"] Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#JMXGROUP} MATCHES_REGEX - {#JMXGROUP} NOT_MATCHES_REGEX |
Cluster metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=Kernal,name=ClusterMetricsMXBeanImpl,*"] Preprocessing: - JAVASCRIPT: |
Data region metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=DataRegionMetrics,*"] Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#JMXNAME} MATCHES_REGEX - {#JMXNAME} NOT_MATCHES_REGEX |
Ignite kernal metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=Kernal,name=IgniteKernal,*"] Preprocessing: - JAVASCRIPT: |
Local node metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=Kernal,name=ClusterLocalNodeMetricsMXBeanImpl,*"] Preprocessing: - JAVASCRIPT: |
TCP Communication SPI metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=SPIs,name=TcpCommunicationSpi,*"] Preprocessing: - JAVASCRIPT: |
TCP discovery SPI | - |
JMX | jmx.discovery[beans,"org.apache:group=SPIs,name=TcpDiscoverySpi,*"] Preprocessing: - JAVASCRIPT: |
Thread pool metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=\"Thread Pools\",*"] Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#JMXNAME} MATCHES_REGEX - {#JMXNAME} NOT_MATCHES_REGEX |
Transaction metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=TransactionMetrics,name=TransactionMetricsMxBeanImpl,*"] Preprocessing: - JAVASCRIPT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Uptime | Uptime of Ignite instance. |
JMX | jmx["{#JMXOBJ}",UpTime] Preprocessing: - MULTIPLIER: |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Version | Version of Ignite instance. |
JMX | jmx["{#JMXOBJ}",FullVersion] Preprocessing: - REGEX: - DISCARD_UNCHANGED_HEARTBEAT: |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Local node ID | Unique identifier for this node within grid. |
JMX | jmx["{#JMXOBJ}",LocalNodeId] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Nodes, Baseline | Total baseline nodes that are registered in the baseline topology. |
JMX | jmx["{#JMXOBJ}",TotalBaselineNodes] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Nodes, Active baseline | The number of nodes that are currently active in the baseline topology. |
JMX | jmx["{#JMXOBJ}",ActiveBaselineNodes] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Nodes, Client | The number of client nodes in the cluster. |
JMX | jmx["{#JMXOBJ}",TotalClientNodes] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Nodes, total | Total number of nodes. |
JMX | jmx["{#JMXOBJ}",TotalNodes] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Nodes, Server | The number of server nodes in the cluster. |
JMX | jmx["{#JMXOBJ}",TotalServerNodes] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Jobs cancelled, current | Number of cancelled jobs that are still running. |
JMX | jmx["{#JMXOBJ}",CurrentCancelledJobs] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Jobs rejected, current | Number of jobs rejected after more recent collision resolution operation. |
JMX | jmx["{#JMXOBJ}",CurrentRejectedJobs] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Jobs waiting, current | Number of queued jobs currently waiting to be executed. |
JMX | jmx["{#JMXOBJ}",CurrentWaitingJobs] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Jobs active, current | Number of currently active jobs concurrently executing on the node. |
JMX | jmx["{#JMXOBJ}",CurrentActiveJobs] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Jobs executed, rate | Total number of jobs handled by the node per second. |
JMX | jmx["{#JMXOBJ}",TotalExecutedJobs] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Jobs cancelled, rate | Total number of jobs cancelled by the node per second. |
JMX | jmx["{#JMXOBJ}",TotalCancelledJobs] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Jobs rejects, rate | Total number of jobs this node rejects during collision resolution operations since node startup per second. |
JMX | jmx["{#JMXOBJ}",TotalRejectedJobs] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: PME duration, current | Current PME duration in milliseconds. |
JMX | jmx["{#JMXOBJ}",CurrentPmeDuration] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Threads count, current | Current number of live threads. |
JMX | jmx["{#JMXOBJ}",CurrentThreadCount] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Heap memory used | Current heap size that is used for object allocation. |
JMX | jmx["{#JMXOBJ}",HeapMemoryUsed] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Coordinator | Current coordinator UUID. |
JMX | jmx["{#JMXOBJ}",Coordinator] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Nodes left | Nodes left count. |
JMX | jmx["{#JMXOBJ}",NodesLeft] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Nodes joined | Nodes join count. |
JMX | jmx["{#JMXOBJ}",NodesJoined] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Nodes failed | Nodes failed count. |
JMX | jmx["{#JMXOBJ}",NodesFailed] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Discovery message worker queue | Message worker queue current size. |
JMX | jmx["{#JMXOBJ}",MessageWorkerQueueSize] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Discovery reconnect, rate | Number of times node tries to (re)establish connection to another node per second. |
JMX | jmx["{#JMXOBJ}",ReconnectCount] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Discovery messages processed, rate | The number of messages processed per second. |
JMX | jmx["{#JMXOBJ}",TotalProcessedMessages] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Discovery messages received, rate | The number of messages received per second. |
JMX | jmx["{#JMXOBJ}",TotalReceivedMessages] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Communication outbound messages queue | Outbound messages queue size. |
JMX | jmx["{#JMXOBJ}",OutboundMessagesQueueSize] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Communication messages received, rate | The number of messages received per second. |
JMX | jmx["{#JMXOBJ}",ReceivedMessagesCount] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Communication messages sent, rate | The number of messages sent per second. |
JMX | jmx["{#JMXOBJ}",SentMessagesCount] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Locked keys | The number of keys locked on the node. |
JMX | jmx["{#JMXOBJ}",LockedKeysNumber] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Transactions owner, current | The number of active transactions for which this node is the initiator. |
JMX | jmx["{#JMXOBJ}",OwnerTransactionsNumber] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Transactions holding lock, current | The number of active transactions holding at least one key lock. |
JMX | jmx["{#JMXOBJ}",TransactionsHoldingLockNumber] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Transactions rolledback, rate | The number of transactions which were rolled back per second. |
JMX | jmx["{#JMXOBJ}",TransactionsRolledBackNumber] |
Ignite | Ignite [{#JMXIGNITEINSTANCENAME}]: Transactions committed, rate | The number of transactions which were committed per second. |
JMX | jmx["{#JMXOBJ}",TransactionsCommittedNumber] |
Ignite | Cache group [{#JMXGROUP}]: Cache gets, rate | The number of gets to the cache per second. |
JMX | jmx["{#JMXOBJ}",CacheGets] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Cache group [{#JMXGROUP}]: Cache puts, rate | The number of puts to the cache per second. |
JMX | jmx["{#JMXOBJ}",CachePuts] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Cache group [{#JMXGROUP}]: Cache removals, rate | The number of removals from the cache per second. |
JMX | jmx["{#JMXOBJ}",CacheRemovals] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Cache group [{#JMXGROUP}]: Cache hits, pct | Percentage of successful hits. |
JMX | jmx["{#JMXOBJ}",CacheHitPercentage] |
Ignite | Cache group [{#JMXGROUP}]: Cache misses, pct | Percentage of accesses that failed to find anything. |
JMX | jmx["{#JMXOBJ}",CacheMissPercentage] |
Ignite | Cache group [{#JMXGROUP}]: Cache transaction commits, rate | The number of transaction commits per second. |
JMX | jmx["{#JMXOBJ}",CacheTxCommits] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Cache group [{#JMXGROUP}]: Cache transaction rollbacks, rate | The number of transaction rollbacks per second. |
JMX | jmx["{#JMXOBJ}",CacheTxRollbacks] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Cache group [{#JMXGROUP}]: Cache size | The number of non-null values in the cache as a long value. |
JMX | jmx["{#JMXOBJ}",CacheSize] |
Ignite | Cache group [{#JMXGROUP}]: Cache heap entries | The number of entries in heap memory. |
JMX | jmx["{#JMXOBJ}",HeapEntriesCount] Preprocessing: - CHANGE_PER_SECOND |
Ignite | Data region {#JMXNAME}: Allocation, rate | Allocation rate (pages per second) averaged across rateTimeInterval. |
JMX | jmx["{#JMXOBJ}",AllocationRate] |
Ignite | Data region {#JMXNAME}: Allocated, bytes | Total size of memory allocated in bytes. |
JMX | jmx["{#JMXOBJ}",TotalAllocatedSize] |
Ignite | Data region {#JMXNAME}: Dirty pages | Number of pages in memory not yet synchronized with persistent storage. |
JMX | jmx["{#JMXOBJ}",DirtyPages] |
Ignite | Data region {#JMXNAME}: Eviction, rate | Eviction rate (pages per second). |
JMX | jmx["{#JMXOBJ}",EvictionRate] |
Ignite | Data region {#JMXNAME}: Size, max | Maximum memory region size defined by its data region. |
JMX | jmx["{#JMXOBJ}",MaxSize] |
Ignite | Data region {#JMXNAME}: Offheap size | Offheap size in bytes. |
JMX | jmx["{#JMXOBJ}",OffHeapSize] |
Ignite | Data region {#JMXNAME}: Offheap used size | Total used offheap size in bytes. |
JMX | jmx["{#JMXOBJ}",OffheapUsedSize] |
Ignite | Data region {#JMXNAME}: Pages fill factor | The percentage of the used space. |
JMX | jmx["{#JMXOBJ}",PagesFillFactor] |
Ignite | Data region {#JMXNAME}: Pages replace, rate | Rate at which pages in memory are replaced with pages from persistent storage (pages per second). |
JMX | jmx["{#JMXOBJ}",PagesReplaceRate] |
Ignite | Data region {#JMXNAME}: Used checkpoint buffer size | Used checkpoint buffer size in bytes. |
JMX | jmx["{#JMXOBJ}",UsedCheckpointBufferSize] |
Ignite | Data region {#JMXNAME}: Checkpoint buffer size | Total size in bytes for checkpoint buffer. |
JMX | jmx["{#JMXOBJ}",CheckpointBufferSize] |
Ignite | Cache group [{#JMXNAME}]: Backups | Count of backups configured for cache group. |
JMX | jmx["{#JMXOBJ}",Backups] |
Ignite | Cache group [{#JMXNAME}]: Partitions | Count of partitions for cache group. |
JMX | jmx["{#JMXOBJ}",Partitions] |
Ignite | Cache group [{#JMXNAME}]: Caches | List of caches. |
JMX | jmx["{#JMXOBJ}",Caches] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Ignite | Cache group [{#JMXNAME}]: Local node partitions, moving | Count of partitions with state MOVING for this cache group located on this node. |
JMX | jmx["{#JMXOBJ}",LocalNodeMovingPartitionsCount] |
Ignite | Cache group [{#JMXNAME}]: Local node partitions, renting | Count of partitions with state RENTING for this cache group located on this node. |
JMX | jmx["{#JMXOBJ}",LocalNodeRentingPartitionsCount] |
Ignite | Cache group [{#JMXNAME}]: Local node entries, renting | Count of entries remains to evict in RENTING partitions located on this node for this cache group. |
JMX | jmx["{#JMXOBJ}",LocalNodeRentingEntriesCount] |
Ignite | Cache group [{#JMXNAME}]: Local node partitions, owning | Count of partitions with state OWNING for this cache group located on this node. |
JMX | jmx["{#JMXOBJ}",LocalNodeOwningPartitionsCount] |
Ignite | Cache group [{#JMXNAME}]: Partition copies, min | Minimum number of partition copies for all partitions of this cache group. |
JMX | jmx["{#JMXOBJ}",MinimumNumberOfPartitionCopies] |
Ignite | Cache group [{#JMXNAME}]: Partition copies, max | Maximum number of partition copies for all partitions of this cache group. |
JMX | jmx["{#JMXOBJ}",MaximumNumberOfPartitionCopies] |
Ignite | Thread pool [{#JMXNAME}]: Queue size | Current size of the execution queue. |
JMX | jmx["{#JMXOBJ}",QueueSize] |
Ignite | Thread pool [{#JMXNAME}]: Pool size | Current number of threads in the pool. |
JMX | jmx["{#JMXOBJ}",PoolSize] |
Ignite | Thread pool [{#JMXNAME}]: Pool size, max | The maximum allowed number of threads. |
JMX | jmx["{#JMXOBJ}",MaximumPoolSize] |
Ignite | Thread pool [{#JMXNAME}]: Pool size, core | The core number of threads. |
JMX | jmx["{#JMXOBJ}",CorePoolSize] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Ignite [{#JMXIGNITEINSTANCENAME}]: has been restarted | Uptime is less than 10 minutes. |
last(/Ignite by JMX/jmx["{#JMXOBJ}",UpTime])<10m |
INFO | Manual close: YES |
Ignite [{#JMXIGNITEINSTANCENAME}]: Failed to fetch info data | Zabbix has not received data for items for the last 10 minutes. |
nodata(/Ignite by JMX/jmx["{#JMXOBJ}",UpTime],10m)=1 |
WARNING | Manual close: YES |
Ignite [{#JMXIGNITEINSTANCENAME}]: Version has changed | Ignite [{#JMXIGNITEINSTANCENAME}] version has changed. Ack to close. |
last(/Ignite by JMX/jmx["{#JMXOBJ}",FullVersion],#1)<>last(/Ignite by JMX/jmx["{#JMXOBJ}",FullVersion],#2) and length(last(/Ignite by JMX/jmx["{#JMXOBJ}",FullVersion]))>0 |
INFO | Manual close: YES |
Ignite [{#JMXIGNITEINSTANCENAME}]: Server node left the topology | One or more server nodes left the topology. Ack to close. |
change(/Ignite by JMX/jmx["{#JMXOBJ}",TotalServerNodes])<0 |
WARNING | Manual close: YES |
Ignite [{#JMXIGNITEINSTANCENAME}]: Server node added to the topology | One or more server nodes were added to the topology. Ack to close. |
change(/Ignite by JMX/jmx["{#JMXOBJ}",TotalServerNodes])>0 |
INFO | Manual close: YES |
Ignite [{#JMXIGNITEINSTANCENAME}]: There are nodes not in the topology | One or more server nodes are not in the topology. Ack to close. |
last(/Ignite by JMX/jmx["{#JMXOBJ}",TotalServerNodes])>last(/Ignite by JMX/jmx["{#JMXOBJ}",TotalBaselineNodes]) |
INFO | Manual close: YES |
Ignite [{#JMXIGNITEINSTANCENAME}]: Number of queued jobs is too high | Number of queued jobs is over {$IGNITE.JOBS.QUEUE.MAX.WARN}. |
min(/Ignite by JMX/jmx["{#JMXOBJ}",CurrentWaitingJobs],15m) > {$IGNITE.JOBS.QUEUE.MAX.WARN} |
WARNING | |
Ignite [{#JMXIGNITEINSTANCENAME}]: PME duration is too long | PME duration is over {$IGNITE.PME.DURATION.MAX.WARN}ms. |
min(/Ignite by JMX/jmx["{#JMXOBJ}",CurrentPmeDuration],5m) > {$IGNITE.PME.DURATION.MAX.WARN} |
WARNING | Depends on: - Ignite [{#JMXIGNITEINSTANCENAME}]: PME duration is too long |
Ignite [{#JMXIGNITEINSTANCENAME}]: PME duration is too long | PME duration is over {$IGNITE.PME.DURATION.MAX.HIGH}ms. Looks like PME is hung. |
min(/Ignite by JMX/jmx["{#JMXOBJ}",CurrentPmeDuration],5m) > {$IGNITE.PME.DURATION.MAX.HIGH} |
HIGH | |
Ignite [{#JMXIGNITEINSTANCENAME}]: Number of running threads is too high | Number of running threads is over {$IGNITE.THREADS.COUNT.MAX.WARN}. |
min(/Ignite by JMX/jmx["{#JMXOBJ}",CurrentThreadCount],15m) > {$IGNITE.THREADS.COUNT.MAX.WARN} |
WARNING | Depends on: - Ignite [{#JMXIGNITEINSTANCENAME}]: PME duration is too long |
Ignite [{#JMXIGNITEINSTANCENAME}]: Coordinator has changed | Ignite [{#JMXIGNITEINSTANCENAME}] coordinator has changed. Ack to close. |
last(/Ignite by JMX/jmx["{#JMXOBJ}",Coordinator],#1)<>last(/Ignite by JMX/jmx["{#JMXOBJ}",Coordinator],#2) and length(last(/Ignite by JMX/jmx["{#JMXOBJ}",Coordinator]))>0 |
WARNING | Manual close: YES |
Cache group [{#JMXGROUP}]: There are no success transactions for cache for 5m | - |
min(/Ignite by JMX/jmx["{#JMXOBJ}",CacheTxRollbacks],5m)>0 and max(/Ignite by JMX/jmx["{#JMXOBJ}",CacheTxCommits],5m)=0 |
AVERAGE | |
Cache group [{#JMXGROUP}]: Success transactions less than rollbacks for 5m | - |
min(/Ignite by JMX/jmx["{#JMXOBJ}",CacheTxRollbacks],5m) > max(/Ignite by JMX/jmx["{#JMXOBJ}",CacheTxCommits],5m) |
WARNING | Depends on: - Cache group [{#JMXGROUP}]: There are no success transactions for cache for 5m |
Cache group [{#JMXGROUP}]: All entries are in heap | All entries are in heap. Possibly you are using eager queries; this may cause out-of-memory exceptions for big caches. Ack to close. |
last(/Ignite by JMX/jmx["{#JMXOBJ}",CacheSize])=last(/Ignite by JMX/jmx["{#JMXOBJ}",HeapEntriesCount]) |
INFO | Manual close: YES |
Data region {#JMXNAME}: Node started to evict pages | You store more data than the region can accommodate. Data has started to move to disk, which can slow down requests. Ack to close. |
min(/Ignite by JMX/jmx["{#JMXOBJ}",EvictionRate],5m)>0 |
INFO | Manual close: YES |
Data region {#JMXNAME}: Data region utilization is too high | Data region utilization is high. Increase the data region size or delete unused data. |
min(/Ignite by JMX/jmx["{#JMXOBJ}",OffheapUsedSize],5m)/last(/Ignite by JMX/jmx["{#JMXOBJ}",OffHeapSize])*100>{$IGNITE.DATA.REGION.PUSED.MAX.WARN} |
WARNING | Depends on: - Data region {#JMXNAME}: Data region utilization is too high |
Data region {#JMXNAME}: Data region utilization is too high | Data region utilization is high. Increase the data region size or delete unused data. |
min(/Ignite by JMX/jmx["{#JMXOBJ}",OffheapUsedSize],5m)/last(/Ignite by JMX/jmx["{#JMXOBJ}",OffHeapSize])*100>{$IGNITE.DATA.REGION.PUSED.MAX.HIGH} |
HIGH | |
Data region {#JMXNAME}: Pages replace rate more than 0 | There is more data than DataRegionMaxSize. The cluster has started to replace pages in memory. Page replacement can slow down operations. |
min(/Ignite by JMX/jmx["{#JMXOBJ}",PagesReplaceRate],5m)>0 |
WARNING | |
Data region {#JMXNAME}: Checkpoint buffer utilization is too high | Checkpoint buffer utilization is high. Threads will be throttled to avoid buffer overflow. It can be caused by high disk utilization. |
min(/Ignite by JMX/jmx["{#JMXOBJ}",UsedCheckpointBufferSize],5m)/last(/Ignite by JMX/jmx["{#JMXOBJ}",CheckpointBufferSize])*100>{$IGNITE.CHECKPOINT.PUSED.MAX.WARN} |
WARNING | Depends on: - Data region {#JMXNAME}: Checkpoint buffer utilization is too high |
Data region {#JMXNAME}: Checkpoint buffer utilization is too high | Checkpoint buffer utilization is high. Threads will be throttled to avoid buffer overflow. It can be caused by high disk utilization. |
min(/Ignite by JMX/jmx["{#JMXOBJ}",UsedCheckpointBufferSize],5m)/last(/Ignite by JMX/jmx["{#JMXOBJ}",CheckpointBufferSize])*100>{$IGNITE.CHECKPOINT.PUSED.MAX.HIGH} |
HIGH | |
Cache group [{#JMXNAME}]: One or more backups are unavailable | - |
min(/Ignite by JMX/jmx["{#JMXOBJ}",Backups],5m)>=max(/Ignite by JMX/jmx["{#JMXOBJ}",MinimumNumberOfPartitionCopies],5m) |
WARNING | |
Cache group [{#JMXNAME}]: List of caches has changed | List of caches has changed. Significant changes have occurred in the cluster. Ack to close. |
last(/Ignite by JMX/jmx["{#JMXOBJ}",Caches],#1)<>last(/Ignite by JMX/jmx["{#JMXOBJ}",Caches],#2) and length(last(/Ignite by JMX/jmx["{#JMXOBJ}",Caches]))>0 |
INFO | Manual close: YES |
Cache group [{#JMXNAME}]: Rebalance in progress | Ack to close. |
max(/Ignite by JMX/jmx["{#JMXOBJ}",LocalNodeMovingPartitionsCount],30m)>0 |
INFO | Manual close: YES |
Cache group [{#JMXNAME}]: There is no copy for partitions | - |
max(/Ignite by JMX/jmx["{#JMXOBJ}",MinimumNumberOfPartitionCopies],30m)=0 |
WARNING | |
Thread pool [{#JMXNAME}]: Too many messages in queue | The number of messages in the queue is more than {$IGNITE.THREAD.QUEUE.MAX.WARN:"{#JMXNAME}"}. |
min(/Ignite by JMX/jmx["{#JMXOBJ}",QueueSize],5m) > {$IGNITE.THREAD.QUEUE.MAX.WARN:"{#JMXNAME}"} |
AVERAGE |
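Several triggers above come in WARNING/HIGH pairs computed over the same ratio, with the WARNING trigger depending on the HIGH one so that only the most severe alert fires; the checkpoint buffer utilization triggers are one example. A sketch of that evaluation logic, with the macro defaults from the table above hard-coded as assumed values:

```python
# Macro defaults from the table above (assumed values):
CHECKPOINT_PUSED_MAX_WARN = 66.0  # {$IGNITE.CHECKPOINT.PUSED.MAX.WARN}
CHECKPOINT_PUSED_MAX_HIGH = 80.0  # {$IGNITE.CHECKPOINT.PUSED.MAX.HIGH}

def checkpoint_severity(used_bytes: float, total_bytes: float) -> str:
    """Mirror the trigger pair: utilization over the HIGH threshold wins;
    the WARNING trigger is suppressed by its dependency on the HIGH one."""
    pct = used_bytes / total_bytes * 100.0
    if pct > CHECKPOINT_PUSED_MAX_HIGH:
        return "HIGH"
    if pct > CHECKPOINT_PUSED_MAX_WARN:
        return "WARNING"
    return "OK"

print(checkpoint_severity(70, 100))  # 70% -> WARNING
print(checkpoint_severity(90, 100))  # 90% -> HIGH
print(checkpoint_severity(10, 100))  # 10% -> OK
```

The same two-tier pattern applies to the data region utilization and PME duration triggers, each with its own pair of macros.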
For Zabbix version: 6.2 and higher
Official JMX Template for GridGain In-Memory Computing Platform.
This template is based on the original template developed by Igor Akkuratov, Senior Engineer at GridGain Systems and GridGain In-Memory Computing Platform Contributor.
This template was tested on:
See Zabbix template operation for basic instructions.
This template works with standalone and cluster instances. Metrics are collected by JMX. All metrics are discoverable.
Use the JVM option
-DIGNITE_MBEAN_APPEND_CLASS_LOADER_ID=false
to exclude one level with the Classloader name. You can configure which Cache and Data Region metrics to collect using the official guide. No specific Zabbix configuration is required.
Name | Description | Default | |||
---|---|---|---|---|---|
{$GRIDGAIN.CHECKPOINT.PUSED.MAX.HIGH} | The maximum percent of checkpoint buffer utilization for high trigger expression. |
80 |
|||
{$GRIDGAIN.CHECKPOINT.PUSED.MAX.WARN} | The maximum percent of checkpoint buffer utilization for warning trigger expression. |
66 |
|||
{$GRIDGAIN.DATA.REGION.PUSED.MAX.HIGH} | The maximum percent of data region utilization for high trigger expression. |
90 |
|||
{$GRIDGAIN.DATA.REGION.PUSED.MAX.WARN} | The maximum percent of data region utilization for warning trigger expression. |
80 |
|||
{$GRIDGAIN.JOBS.QUEUE.MAX.WARN} | The maximum number of queued jobs for trigger expression. |
10 |
|||
{$GRIDGAIN.LLD.FILTER.CACHE.MATCHES} | Filter of discoverable cache groups. |
.* |
|||
{$GRIDGAIN.LLD.FILTER.CACHE.NOT_MATCHES} | Filter to exclude discovered cache groups. |
CHANGE_IF_NEEDED |
|||
{$GRIDGAIN.LLD.FILTER.DATA.REGION.MATCHES} | Filter of discoverable data regions. |
.* |
|||
{$GRIDGAIN.LLD.FILTER.DATA.REGION.NOT_MATCHES} | Filter to exclude discovered data regions. |
`^(sysMemPlc|TxLog)$` |
|||
{$GRIDGAIN.LLD.FILTER.THREAD.POOL.MATCHES} | Filter of discoverable thread pools. |
.* |
|||
{$GRIDGAIN.LLD.FILTER.THREAD.POOL.NOT_MATCHES} | Filter to exclude discovered thread pools. |
`^(GridCallbackExecutor|GridRebalanceStripedExecutor|GridDataStreamExecutor|StripedExecutor)$` |
|||
{$GRIDGAIN.PASSWORD} | - |
<secret> |
|||
{$GRIDGAIN.PME.DURATION.MAX.HIGH} | The maximum PME duration in ms for high trigger expression. |
60000 |
|||
{$GRIDGAIN.PME.DURATION.MAX.WARN} | The maximum PME duration in ms for warning trigger expression. |
10000 |
|||
{$GRIDGAIN.THREAD.QUEUE.MAX.WARN} | Threshold for thread pool queue size. Can be used with thread pool name as context. |
1000 |
|||
{$GRIDGAIN.THREADS.COUNT.MAX.WARN} | The maximum number of running threads for trigger expression. |
1000 |
|||
{$GRIDGAIN.USER} | - |
zabbix |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Cache groups | - |
JMX | jmx.discovery[beans,"org.apache:group=\"Cache groups\",*"] Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#JMXNAME} MATCHES_REGEX - {#JMXNAME} NOT_MATCHES_REGEX |
Cache metrics | - |
JMX | jmx.discovery[beans,"org.apache:name=\"org.apache.gridgain.internal.processors.cache.CacheLocalMetricsMXBeanImpl\",*"] Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#JMXGROUP} MATCHES_REGEX - {#JMXGROUP} NOT_MATCHES_REGEX |
Cluster metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=Kernal,name=ClusterMetricsMXBeanImpl,*"] Preprocessing: - JAVASCRIPT: |
Data region metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=DataRegionMetrics,*"] Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#JMXNAME} MATCHES_REGEX - {#JMXNAME} NOT_MATCHES_REGEX |
GridGain kernal metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=Kernal,name=IgniteKernal,*"] Preprocessing: - JAVASCRIPT: |
Local node metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=Kernal,name=ClusterLocalNodeMetricsMXBeanImpl,*"] Preprocessing: - JAVASCRIPT: |
TCP Communication SPI metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=SPIs,name=TcpCommunicationSpi,*"] Preprocessing: - JAVASCRIPT: |
TCP discovery SPI | - |
JMX | jmx.discovery[beans,"org.apache:group=SPIs,name=TcpDiscoverySpi,*"] Preprocessing: - JAVASCRIPT: |
Thread pool metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=\"Thread Pools\",*"] Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#JMXNAME} MATCHES_REGEX - {#JMXNAME} NOT_MATCHES_REGEX |
Transaction metrics | - |
JMX | jmx.discovery[beans,"org.apache:group=TransactionMetrics,name=TransactionMetricsMxBeanImpl,*"] Preprocessing: - JAVASCRIPT: |
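Several discovery rules above combine a MATCHES_REGEX and a NOT_MATCHES_REGEX filter condition with AND. A minimal sketch of how such a filter pair selects thread pools, using the regex values from the macros above (the candidate pool names are invented for illustration):

```python
import re

# Filter macros from the thread pool discovery rule.
MATCHES = r".*"
NOT_MATCHES = r"^(GridCallbackExecutor|GridRebalanceStripedExecutor|GridDataStreamExecutor|StripedExecutor)$"

def keep(name: str) -> bool:
    """AND of the MATCHES_REGEX and NOT_MATCHES_REGEX conditions, as Zabbix LLD filters do."""
    return bool(re.search(MATCHES, name)) and not re.search(NOT_MATCHES, name)

# Hypothetical discovered {#JMXNAME} values, for illustration only.
pools = ["StripedExecutor", "GridExecutionExecutor", "GridDataStreamExecutor"]
print([p for p in pools if keep(p)])  # only GridExecutionExecutor survives
```

Adjusting the NOT_MATCHES macro on the host is the intended way to exclude more pools from monitoring.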
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Uptime | Uptime of GridGain instance. |
JMX | jmx["{#JMXOBJ}",UpTime] Preprocessing: - MULTIPLIER: |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Version | Version of GridGain instance. |
JMX | jmx["{#JMXOBJ}",FullVersion] Preprocessing: - REGEX: - DISCARD_UNCHANGED_HEARTBEAT: |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Local node ID | Unique identifier for this node within the grid. |
JMX | jmx["{#JMXOBJ}",LocalNodeId] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Nodes, Baseline | Total baseline nodes that are registered in the baseline topology. |
JMX | jmx["{#JMXOBJ}",TotalBaselineNodes] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Nodes, Active baseline | The number of nodes that are currently active in the baseline topology. |
JMX | jmx["{#JMXOBJ}",ActiveBaselineNodes] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Nodes, Client | The number of client nodes in the cluster. |
JMX | jmx["{#JMXOBJ}",TotalClientNodes] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Nodes, total | Total number of nodes. |
JMX | jmx["{#JMXOBJ}",TotalNodes] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Nodes, Server | The number of server nodes in the cluster. |
JMX | jmx["{#JMXOBJ}",TotalServerNodes] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Jobs cancelled, current | Number of cancelled jobs that are still running. |
JMX | jmx["{#JMXOBJ}",CurrentCancelledJobs] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Jobs rejected, current | Number of jobs rejected after the most recent collision resolution operation. |
JMX | jmx["{#JMXOBJ}",CurrentRejectedJobs] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Jobs waiting, current | Number of queued jobs currently waiting to be executed. |
JMX | jmx["{#JMXOBJ}",CurrentWaitingJobs] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Jobs active, current | Number of currently active jobs concurrently executing on the node. |
JMX | jmx["{#JMXOBJ}",CurrentActiveJobs] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Jobs executed, rate | Total number of jobs handled by the node per second. |
JMX | jmx["{#JMXOBJ}",TotalExecutedJobs] Preprocessing: - CHANGE_PER_SECOND |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Jobs cancelled, rate | Total number of jobs cancelled by the node per second. |
JMX | jmx["{#JMXOBJ}",TotalCancelledJobs] Preprocessing: - CHANGE_PER_SECOND |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Jobs rejected, rate | Total number of jobs this node has rejected during collision resolution operations since node startup, per second. |
JMX | jmx["{#JMXOBJ}",TotalRejectedJobs] Preprocessing: - CHANGE_PER_SECOND |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: PME duration, current | Current PME duration in milliseconds. |
JMX | jmx["{#JMXOBJ}",CurrentPmeDuration] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Threads count, current | Current number of live threads. |
JMX | jmx["{#JMXOBJ}",CurrentThreadCount] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Heap memory used | Current heap size that is used for object allocation. |
JMX | jmx["{#JMXOBJ}",HeapMemoryUsed] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Coordinator | Current coordinator UUID. |
JMX | jmx["{#JMXOBJ}",Coordinator] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Nodes left | Nodes left count. |
JMX | jmx["{#JMXOBJ}",NodesLeft] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Nodes joined | Nodes join count. |
JMX | jmx["{#JMXOBJ}",NodesJoined] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Nodes failed | Nodes failed count. |
JMX | jmx["{#JMXOBJ}",NodesFailed] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Discovery message worker queue | Message worker queue current size. |
JMX | jmx["{#JMXOBJ}",MessageWorkerQueueSize] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Discovery reconnect, rate | Number of times node tries to (re)establish connection to another node per second. |
JMX | jmx["{#JMXOBJ}",ReconnectCount] Preprocessing: - CHANGE_PER_SECOND |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Discovery messages processed, rate | The number of discovery messages processed per second. |
JMX | jmx["{#JMXOBJ}",TotalProcessedMessages] Preprocessing: - CHANGE_PER_SECOND |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Discovery messages received, rate | The number of discovery messages received per second. |
JMX | jmx["{#JMXOBJ}",TotalReceivedMessages] Preprocessing: - CHANGE_PER_SECOND |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Communication outbound messages queue | Outbound messages queue size. |
JMX | jmx["{#JMXOBJ}",OutboundMessagesQueueSize] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Communication messages received, rate | The number of messages received per second. |
JMX | jmx["{#JMXOBJ}",ReceivedMessagesCount] Preprocessing: - CHANGE_PER_SECOND |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Communication messages sent, rate | The number of messages sent per second. |
JMX | jmx["{#JMXOBJ}",SentMessagesCount] Preprocessing: - CHANGE_PER_SECOND |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Communication reconnect rate | The number of reconnect attempts made when establishing connections with remote nodes, per second. |
JMX | jmx["{#JMXOBJ}",ReconnectCount] Preprocessing: - CHANGE_PER_SECOND |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Locked keys | The number of keys locked on the node. |
JMX | jmx["{#JMXOBJ}",LockedKeysNumber] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Transactions owner, current | The number of active transactions for which this node is the initiator. |
JMX | jmx["{#JMXOBJ}",OwnerTransactionsNumber] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Transactions holding lock, current | The number of active transactions holding at least one key lock. |
JMX | jmx["{#JMXOBJ}",TransactionsHoldingLockNumber] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Transactions rolledback, rate | The number of transactions rolled back per second. |
JMX | jmx["{#JMXOBJ}",TransactionsRolledBackNumber] |
GridGain | GridGain [{#JMXIGNITEINSTANCENAME}]: Transactions committed, rate | The number of transactions which were committed per second. |
JMX | jmx["{#JMXOBJ}",TransactionsCommittedNumber] |
GridGain | Cache group [{#JMXGROUP}]: Cache gets, rate | The number of gets to the cache per second. |
JMX | jmx["{#JMXOBJ}",CacheGets] Preprocessing: - CHANGE_PER_SECOND |
GridGain | Cache group [{#JMXGROUP}]: Cache puts, rate | The number of puts to the cache per second. |
JMX | jmx["{#JMXOBJ}",CachePuts] Preprocessing: - CHANGE_PER_SECOND |
GridGain | Cache group [{#JMXGROUP}]: Cache removals, rate | The number of removals from the cache per second. |
JMX | jmx["{#JMXOBJ}",CacheRemovals] Preprocessing: - CHANGE_PER_SECOND |
GridGain | Cache group [{#JMXGROUP}]: Cache hits, pct | Percentage of successful hits. |
JMX | jmx["{#JMXOBJ}",CacheHitPercentage] |
GridGain | Cache group [{#JMXGROUP}]: Cache misses, pct | Percentage of accesses that failed to find anything. |
JMX | jmx["{#JMXOBJ}",CacheMissPercentage] |
GridGain | Cache group [{#JMXGROUP}]: Cache transaction commits, rate | The number of transaction commits per second. |
JMX | jmx["{#JMXOBJ}",CacheTxCommits] Preprocessing: - CHANGE_PER_SECOND |
GridGain | Cache group [{#JMXGROUP}]: Cache transaction rollbacks, rate | The number of transaction rollbacks per second. |
JMX | jmx["{#JMXOBJ}",CacheTxRollbacks] Preprocessing: - CHANGE_PER_SECOND |
GridGain | Cache group [{#JMXGROUP}]: Cache size | The number of non-null values in the cache as a long value. |
JMX | jmx["{#JMXOBJ}",CacheSize] |
GridGain | Cache group [{#JMXGROUP}]: Cache heap entries | The number of entries in heap memory. |
JMX | jmx["{#JMXOBJ}",HeapEntriesCount] Preprocessing: - CHANGE_PER_SECOND |
GridGain | Data region {#JMXNAME}: Allocation, rate | Allocation rate (pages per second) averaged across rateTimeInterval. |
JMX | jmx["{#JMXOBJ}",AllocationRate] |
GridGain | Data region {#JMXNAME}: Allocated, bytes | Total size of memory allocated in bytes. |
JMX | jmx["{#JMXOBJ}",TotalAllocatedSize] |
GridGain | Data region {#JMXNAME}: Dirty pages | Number of pages in memory not yet synchronized with persistent storage. |
JMX | jmx["{#JMXOBJ}",DirtyPages] |
GridGain | Data region {#JMXNAME}: Eviction, rate | Eviction rate (pages per second). |
JMX | jmx["{#JMXOBJ}",EvictionRate] |
GridGain | Data region {#JMXNAME}: Size, max | Maximum memory region size defined by its data region. |
JMX | jmx["{#JMXOBJ}",MaxSize] |
GridGain | Data region {#JMXNAME}: Offheap size | Offheap size in bytes. |
JMX | jmx["{#JMXOBJ}",OffHeapSize] |
GridGain | Data region {#JMXNAME}: Offheap used size | Total used offheap size in bytes. |
JMX | jmx["{#JMXOBJ}",OffheapUsedSize] |
GridGain | Data region {#JMXNAME}: Pages fill factor | The percentage of the used space. |
JMX | jmx["{#JMXOBJ}",PagesFillFactor] |
GridGain | Data region {#JMXNAME}: Pages replace, rate | Rate at which pages in memory are replaced with pages from persistent storage (pages per second). |
JMX | jmx["{#JMXOBJ}",PagesReplaceRate] |
GridGain | Data region {#JMXNAME}: Used checkpoint buffer size | Used checkpoint buffer size in bytes. |
JMX | jmx["{#JMXOBJ}",UsedCheckpointBufferSize] |
GridGain | Data region {#JMXNAME}: Checkpoint buffer size | Total size in bytes for checkpoint buffer. |
JMX | jmx["{#JMXOBJ}",CheckpointBufferSize] |
GridGain | Cache group [{#JMXNAME}]: Backups | Count of backups configured for cache group. |
JMX | jmx["{#JMXOBJ}",Backups] |
GridGain | Cache group [{#JMXNAME}]: Partitions | Count of partitions for cache group. |
JMX | jmx["{#JMXOBJ}",Partitions] |
GridGain | Cache group [{#JMXNAME}]: Caches | List of caches. |
JMX | jmx["{#JMXOBJ}",Caches] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
GridGain | Cache group [{#JMXNAME}]: Local node partitions, moving | Count of partitions with state MOVING for this cache group located on this node. |
JMX | jmx["{#JMXOBJ}",LocalNodeMovingPartitionsCount] |
GridGain | Cache group [{#JMXNAME}]: Local node partitions, renting | Count of partitions with state RENTING for this cache group located on this node. |
JMX | jmx["{#JMXOBJ}",LocalNodeRentingPartitionsCount] |
GridGain | Cache group [{#JMXNAME}]: Local node entries, renting | Count of entries remains to evict in RENTING partitions located on this node for this cache group. |
JMX | jmx["{#JMXOBJ}",LocalNodeRentingEntriesCount] |
GridGain | Cache group [{#JMXNAME}]: Local node partitions, owning | Count of partitions with state OWNING for this cache group located on this node. |
JMX | jmx["{#JMXOBJ}",LocalNodeOwningPartitionsCount] |
GridGain | Cache group [{#JMXNAME}]: Partition copies, min | Minimum number of partition copies for all partitions of this cache group. |
JMX | jmx["{#JMXOBJ}",MinimumNumberOfPartitionCopies] |
GridGain | Cache group [{#JMXNAME}]: Partition copies, max | Maximum number of partition copies for all partitions of this cache group. |
JMX | jmx["{#JMXOBJ}",MaximumNumberOfPartitionCopies] |
GridGain | Thread pool [{#JMXNAME}]: Queue size | Current size of the execution queue. |
JMX | jmx["{#JMXOBJ}",QueueSize] |
GridGain | Thread pool [{#JMXNAME}]: Pool size | Current number of threads in the pool. |
JMX | jmx["{#JMXOBJ}",PoolSize] |
GridGain | Thread pool [{#JMXNAME}]: Pool size, max | The maximum allowed number of threads. |
JMX | jmx["{#JMXOBJ}",MaximumPoolSize] |
GridGain | Thread pool [{#JMXNAME}]: Pool size, core | The core number of threads. |
JMX | jmx["{#JMXOBJ}",CorePoolSize] |
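Many of the counters above (TotalExecutedJobs, CacheGets, SentMessagesCount, and so on) are monotonically increasing totals that the template turns into per-second rates with the CHANGE_PER_SECOND preprocessing step. The conversion is simply the value delta divided by the time delta between two polls; a minimal sketch with invented sample values:

```python
def change_per_second(prev, curr):
    """Zabbix CHANGE_PER_SECOND: (value - prev_value) / (clock - prev_clock)."""
    (t0, v0), (t1, v1) = prev, curr
    return (v1 - v0) / (t1 - t0)

# Hypothetical TotalExecutedJobs samples (unix_time, value) taken 60 s apart.
rate = change_per_second((1000, 150000), (1060, 153000))
print(rate)  # 50.0 jobs per second
```

Note that the first poll after startup produces no value, since there is no previous sample to diff against.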
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
GridGain [{#JMXIGNITEINSTANCENAME}]: has been restarted | Uptime is less than 10 minutes. |
last(/GridGain by JMX/jmx["{#JMXOBJ}",UpTime])<10m |
INFO | Manual close: YES |
GridGain [{#JMXIGNITEINSTANCENAME}]: Failed to fetch info data | Zabbix has not received data for items for the last 10 minutes. |
nodata(/GridGain by JMX/jmx["{#JMXOBJ}",UpTime],10m)=1 |
WARNING | Manual close: YES |
GridGain [{#JMXIGNITEINSTANCENAME}]: Version has changed | GridGain [{#JMXIGNITEINSTANCENAME}] version has changed. Ack to close. |
last(/GridGain by JMX/jmx["{#JMXOBJ}",FullVersion],#1)<>last(/GridGain by JMX/jmx["{#JMXOBJ}",FullVersion],#2) and length(last(/GridGain by JMX/jmx["{#JMXOBJ}",FullVersion]))>0 |
INFO | Manual close: YES |
GridGain [{#JMXIGNITEINSTANCENAME}]: Server node left the topology | One or more server nodes have left the topology. Ack to close. |
change(/GridGain by JMX/jmx["{#JMXOBJ}",TotalServerNodes])<0 |
WARNING | Manual close: YES |
GridGain [{#JMXIGNITEINSTANCENAME}]: Server node added to the topology | One or more server nodes have been added to the topology. Ack to close. |
change(/GridGain by JMX/jmx["{#JMXOBJ}",TotalServerNodes])>0 |
INFO | Manual close: YES |
GridGain [{#JMXIGNITEINSTANCENAME}]: There are nodes not in the baseline topology | One or more server nodes are not in the baseline topology. Ack to close. |
last(/GridGain by JMX/jmx["{#JMXOBJ}",TotalServerNodes])>last(/GridGain by JMX/jmx["{#JMXOBJ}",TotalBaselineNodes]) |
INFO | Manual close: YES |
GridGain [{#JMXIGNITEINSTANCENAME}]: Number of queued jobs is too high | Number of queued jobs is over {$GRIDGAIN.JOBS.QUEUE.MAX.WARN}. |
min(/GridGain by JMX/jmx["{#JMXOBJ}",CurrentWaitingJobs],15m) > {$GRIDGAIN.JOBS.QUEUE.MAX.WARN} |
WARNING | |
GridGain [{#JMXIGNITEINSTANCENAME}]: PME duration is too long | PME duration is over {$GRIDGAIN.PME.DURATION.MAX.WARN}ms. |
min(/GridGain by JMX/jmx["{#JMXOBJ}",CurrentPmeDuration],5m) > {$GRIDGAIN.PME.DURATION.MAX.WARN} |
WARNING | Depends on: - GridGain [{#JMXIGNITEINSTANCENAME}]: PME duration is too long |
GridGain [{#JMXIGNITEINSTANCENAME}]: PME duration is too long | PME duration is over {$GRIDGAIN.PME.DURATION.MAX.HIGH}ms. Looks like PME is hung. |
min(/GridGain by JMX/jmx["{#JMXOBJ}",CurrentPmeDuration],5m) > {$GRIDGAIN.PME.DURATION.MAX.HIGH} |
HIGH | |
GridGain [{#JMXIGNITEINSTANCENAME}]: Number of running threads is too high | Number of running threads is over {$GRIDGAIN.THREADS.COUNT.MAX.WARN}. |
min(/GridGain by JMX/jmx["{#JMXOBJ}",CurrentThreadCount],15m) > {$GRIDGAIN.THREADS.COUNT.MAX.WARN} |
WARNING | Depends on: - GridGain [{#JMXIGNITEINSTANCENAME}]: PME duration is too long |
GridGain [{#JMXIGNITEINSTANCENAME}]: Coordinator has changed | GridGain [{#JMXIGNITEINSTANCENAME}] coordinator has changed. Ack to close. |
last(/GridGain by JMX/jmx["{#JMXOBJ}",Coordinator],#1)<>last(/GridGain by JMX/jmx["{#JMXOBJ}",Coordinator],#2) and length(last(/GridGain by JMX/jmx["{#JMXOBJ}",Coordinator]))>0 |
WARNING | Manual close: YES |
Cache group [{#JMXGROUP}]: There are no success transactions for cache for 5m | - |
min(/GridGain by JMX/jmx["{#JMXOBJ}",CacheTxRollbacks],5m)>0 and max(/GridGain by JMX/jmx["{#JMXOBJ}",CacheTxCommits],5m)=0 |
AVERAGE | |
Cache group [{#JMXGROUP}]: Success transactions less than rollbacks for 5m | - |
min(/GridGain by JMX/jmx["{#JMXOBJ}",CacheTxRollbacks],5m) > max(/GridGain by JMX/jmx["{#JMXOBJ}",CacheTxCommits],5m) |
WARNING | Depends on: - Cache group [{#JMXGROUP}]: There are no success transactions for cache for 5m |
Cache group [{#JMXGROUP}]: All entries are in heap | All entries are in heap. Possibly you use eager queries; this may cause out-of-memory exceptions for big caches. Ack to close. |
last(/GridGain by JMX/jmx["{#JMXOBJ}",CacheSize])=last(/GridGain by JMX/jmx["{#JMXOBJ}",HeapEntriesCount]) |
INFO | Manual close: YES |
Data region {#JMXNAME}: Node started to evict pages | You store more data than the region can accommodate. Data has started to move to disk, which can slow down requests. Ack to close. |
min(/GridGain by JMX/jmx["{#JMXOBJ}",EvictionRate],5m)>0 |
INFO | Manual close: YES |
Data region {#JMXNAME}: Data region utilization is too high | Data region utilization is high. Increase data region size or delete any data. |
min(/GridGain by JMX/jmx["{#JMXOBJ}",OffheapUsedSize],5m)/last(/GridGain by JMX/jmx["{#JMXOBJ}",OffHeapSize])*100>{$GRIDGAIN.DATA.REGION.PUSED.MAX.WARN} |
WARNING | Depends on: - Data region {#JMXNAME}: Data region utilization is too high |
Data region {#JMXNAME}: Data region utilization is too high | Data region utilization is high. Increase data region size or delete any data. |
min(/GridGain by JMX/jmx["{#JMXOBJ}",OffheapUsedSize],5m)/last(/GridGain by JMX/jmx["{#JMXOBJ}",OffHeapSize])*100>{$GRIDGAIN.DATA.REGION.PUSED.MAX.HIGH} |
HIGH | |
Data region {#JMXNAME}: Pages replace rate more than 0 | There is more data than DataRegionMaxSize. Cluster started to replace pages in memory. Page replacement can slow down operations. |
min(/GridGain by JMX/jmx["{#JMXOBJ}",PagesReplaceRate],5m)>0 |
WARNING | |
Data region {#JMXNAME}: Checkpoint buffer utilization is too high | Checkpoint buffer utilization is high. Threads will be throttled to avoid buffer overflow. It can be caused by high disk utilization. |
min(/GridGain by JMX/jmx["{#JMXOBJ}",UsedCheckpointBufferSize],5m)/last(/GridGain by JMX/jmx["{#JMXOBJ}",CheckpointBufferSize])*100>{$GRIDGAIN.CHECKPOINT.PUSED.MAX.WARN} |
WARNING | Depends on: - Data region {#JMXNAME}: Checkpoint buffer utilization is too high |
Data region {#JMXNAME}: Checkpoint buffer utilization is too high | Checkpoint buffer utilization is high. Threads will be throttled to avoid buffer overflow. It can be caused by high disk utilization. |
min(/GridGain by JMX/jmx["{#JMXOBJ}",UsedCheckpointBufferSize],5m)/last(/GridGain by JMX/jmx["{#JMXOBJ}",CheckpointBufferSize])*100>{$GRIDGAIN.CHECKPOINT.PUSED.MAX.HIGH} |
HIGH | |
Cache group [{#JMXNAME}]: One or more backups are unavailable | - |
min(/GridGain by JMX/jmx["{#JMXOBJ}",Backups],5m)>=max(/GridGain by JMX/jmx["{#JMXOBJ}",MinimumNumberOfPartitionCopies],5m) |
WARNING | |
Cache group [{#JMXNAME}]: List of caches has changed | List of caches has changed. Significant changes have occurred in the cluster. Ack to close. |
last(/GridGain by JMX/jmx["{#JMXOBJ}",Caches],#1)<>last(/GridGain by JMX/jmx["{#JMXOBJ}",Caches],#2) and length(last(/GridGain by JMX/jmx["{#JMXOBJ}",Caches]))>0 |
INFO | Manual close: YES |
Cache group [{#JMXNAME}]: Rebalance in progress | Ack to close. |
max(/GridGain by JMX/jmx["{#JMXOBJ}",LocalNodeMovingPartitionsCount],30m)>0 |
INFO | Manual close: YES |
Cache group [{#JMXNAME}]: There is no copy for partitions | - |
max(/GridGain by JMX/jmx["{#JMXOBJ}",MinimumNumberOfPartitionCopies],30m)=0 |
WARNING | |
Thread pool [{#JMXNAME}]: Too many messages in queue | The number of messages in the queue is more than {$GRIDGAIN.THREAD.QUEUE.MAX.WARN:"{#JMXNAME}"}. |
min(/GridGain by JMX/jmx["{#JMXOBJ}",QueueSize],5m) > {$GRIDGAIN.THREAD.QUEUE.MAX.WARN:"{#JMXNAME}"} |
AVERAGE |
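The paired checkpoint-buffer triggers above fire on the percentage of the buffer in use, computed from the UsedCheckpointBufferSize and CheckpointBufferSize items against the {$GRIDGAIN.CHECKPOINT.PUSED.MAX.WARN} and {$GRIDGAIN.CHECKPOINT.PUSED.MAX.HIGH} macros. A sketch of the same arithmetic (the byte values are invented for illustration):

```python
WARN, HIGH = 66, 80  # {$GRIDGAIN.CHECKPOINT.PUSED.MAX.WARN} / {$GRIDGAIN.CHECKPOINT.PUSED.MAX.HIGH}

def checkpoint_severity(used_bytes: int, total_bytes: int) -> str:
    """Mirrors min(used,5m)/last(total)*100 > threshold from the trigger expressions."""
    pused = used_bytes / total_bytes * 100
    if pused > HIGH:
        return "HIGH"
    if pused > WARN:
        return "WARNING"
    return "OK"

print(checkpoint_severity(700 * 2**20, 1024 * 2**20))  # ~68.4% -> WARNING
```

The WARNING trigger depends on the HIGH one, so only the more severe problem is raised when both thresholds are crossed.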
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor CockroachDB nodes by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template CockroachDB node by HTTP
— collects metrics by HTTP agent from Prometheus endpoint and health endpoints.
This template was tested on:
See Zabbix template operation for basic instructions.
Internal node metrics are collected from Prometheus /_status/vars endpoint. Node health metrics are collected from /health and /health?ready=1 endpoints. The template doesn't require the usage of a session token.
Don't forget to change the macro {$COCKROACHDB.API.SCHEME} according to your setup (secure/insecure node). Also, see the Macros section for a list of macros used to set trigger values.
NOTE. Some metrics may not be collected depending on your CockroachDB version and configuration.
No specific Zabbix configuration is required.
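The dependent items below are extracted from the Prometheus exposition text served at /_status/vars with PROMETHEUS_PATTERN preprocessing steps, sometimes followed by a MULTIPLIER (e.g. 0.000000001 to convert nanoseconds to seconds). A simplified sketch of that extraction against an invented two-line scrape:

```python
import re

# Invented sample of the /_status/vars Prometheus exposition format.
scrape = """\
sys_cpu_sys_ns 2.5e9
liveness_livenodes 3
"""

def prometheus_pattern(text: str, metric: str) -> float:
    """Roughly what a PROMETHEUS_PATTERN 'metric : value' step does: grab the sample value."""
    m = re.search(rf"^{metric}(?:\{{[^}}]*\}})?\s+(\S+)$", text, re.M)
    if m is None:
        raise ValueError(f"metric {metric} not found")
    return float(m.group(1))

seconds = prometheus_pattern(scrape, "sys_cpu_sys_ns") * 1e-9  # MULTIPLIER step
print(seconds)  # about 2.5 s of system CPU time
```

The real preprocessing also supports label filters and aggregation, which this sketch omits.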
Name | Description | Default |
---|---|---|
{$COCKROACHDB.API.PORT} | The port of CockroachDB API and Prometheus endpoint. |
8080 |
{$COCKROACHDB.API.SCHEME} | Request scheme which may be http or https. |
http |
{$COCKROACHDB.CERT.CA.EXPIRY.WARN} | Number of days until the CA certificate expires. |
90 |
{$COCKROACHDB.CERT.NODE.EXPIRY.WARN} | Number of days until the node certificate expires. |
30 |
{$COCKROACHDB.CLOCK.OFFSET.MAX.WARN} | Maximum clock offset of the node against the rest of the cluster in milliseconds for trigger expression. |
300 |
{$COCKROACHDB.OPEN.FDS.MAX.WARN} | Maximum percentage of used file descriptors. |
80 |
{$COCKROACHDB.STATEMENTS.ERRORS.MAX.WARN} | Maximum number of SQL statements errors for trigger expression. |
2 |
{$COCKROACHDB.STORE.USED.MIN.CRIT} | The critical threshold of the available disk space in percent. |
10 |
{$COCKROACHDB.STORE.USED.MIN.WARN} | The warning threshold of the available disk space in percent. |
20 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Storage metrics discovery | Discover per store metrics. |
DEPENDENT | cockroachdb.store.discovery Preprocessing: - PROMETHEUS_TO_JSON: - DISCARD_UNCHANGED_HEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
CockroachDB | CockroachDB: Service ping | Check if HTTP/HTTPS service accepts TCP connections. |
SIMPLE | net.tcp.service["{$COCKROACHDB.API.SCHEME}","{HOST.CONN}","{$COCKROACHDB.API.PORT}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
CockroachDB | CockroachDB: Clock offset | Mean clock offset of the node against the rest of the cluster. |
DEPENDENT | cockroachdb.clock.offset Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Version | Build information. |
DEPENDENT | cockroachdb.version Preprocessing: - PROMETHEUS_PATTERN: - DISCARD_UNCHANGED_HEARTBEAT: |
CockroachDB | CockroachDB: CPU: System time | System CPU time. |
DEPENDENT | cockroachdb.cpu.system_time Preprocessing: - PROMETHEUS_PATTERN: sys_cpu_sys_ns - CHANGE_PER_SECOND - MULTIPLIER: 0.000000001 |
CockroachDB | CockroachDB: CPU: User time | User CPU time. |
DEPENDENT | cockroachdb.cpu.user_time Preprocessing: - PROMETHEUS_PATTERN: sys_cpu_user_ns - CHANGE_PER_SECOND - MULTIPLIER: 0.000000001 |
CockroachDB | CockroachDB: CPU: Utilization | CPU utilization in %. |
DEPENDENT | cockroachdb.cpu.util Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Disk: IOPS in progress, rate | Number of disk IO operations currently in progress on this host. |
DEPENDENT | cockroachdb.disk.iops.in_progress.rate Preprocessing: - PROMETHEUS_PATTERN: sys_host_disk_iopsinprogress - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Disk: Reads, rate | Bytes read from all disks per second since this process started. |
DEPENDENT | cockroachdb.disk.read.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Disk: Read IOPS, rate | Number of disk read operations per second across all disks since this process started. |
DEPENDENT | cockroachdb.disk.iops.read.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Disk: Writes, rate | Bytes written to all disks per second since this process started. |
DEPENDENT | cockroachdb.disk.write.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Disk: Write IOPS, rate | Disk write operations per second across all disks since this process started. |
DEPENDENT | cockroachdb.disk.iops.write.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: File descriptors: Limit | Open file descriptors soft limit of the process. |
DEPENDENT | cockroachdb.descriptors.limit Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: File descriptors: Open | The number of open file descriptors. |
DEPENDENT | cockroachdb.descriptors.open Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: GC: Pause time | The amount of processor time used by Go's garbage collector across all nodes. During garbage collection, application code execution is paused. |
DEPENDENT | cockroachdb.gc.pause_time Preprocessing: - PROMETHEUS_PATTERN: sys_gc_pause_ns - CHANGE_PER_SECOND - MULTIPLIER: 0.000000001 |
CockroachDB | CockroachDB: GC: Runs, rate | The number of times that Go's garbage collector was invoked per second across all nodes. |
DEPENDENT | cockroachdb.gc.runs.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Go: Goroutines count | Current number of Goroutines. This count should rise and fall based on load. |
DEPENDENT | cockroachdb.go.goroutines.count Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: KV transactions: Aborted, rate | Number of aborted KV transactions per second. |
DEPENDENT | cockroachdb.kv.transactions.aborted.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: KV transactions: Committed, rate | Number of KV transactions (including 1PC) committed per second. |
DEPENDENT | cockroachdb.kv.transactions.committed.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Live nodes count | The number of live nodes in the cluster (will be 0 if this node is not itself live). |
DEPENDENT | cockroachdb.live_count Preprocessing: - PROMETHEUS_PATTERN: liveness_livenodes - DISCARD_UNCHANGED_HEARTBEAT: 3h |
CockroachDB | CockroachDB: Liveness heartbeats, rate | Number of successful node liveness heartbeats per second from this node. |
DEPENDENT | cockroachdb.heartbeaths.success.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Memory: Allocated by Cgo | Current bytes of memory allocated by the C layer. |
DEPENDENT | cockroachdb.memory.cgo.allocated Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Memory: Allocated by Go | Current bytes of memory allocated by the Go layer. |
DEPENDENT | cockroachdb.memory.go.allocated Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Memory: Managed by Cgo | Total bytes of memory managed by the C layer. |
DEPENDENT | cockroachdb.memory.cgo.managed Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Memory: Managed by Go | Total bytes of memory managed by the Go layer. |
DEPENDENT | cockroachdb.memory.go.managed Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Memory: Total usage | Resident set size (RSS) of memory in use by the node. |
DEPENDENT | cockroachdb.memory.total Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Network: Bytes received, rate | Bytes received per second on all network interfaces since this process started. |
DEPENDENT | cockroachdb.network.bytes.received.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Network: Bytes sent, rate | Bytes sent per second on all network interfaces since this process started. |
DEPENDENT | cockroachdb.network.bytes.sent.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Time series: Sample errors, rate | The number of errors encountered while attempting to write metrics to disk, per second. |
DEPENDENT | cockroachdb.ts.samples.errors.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Time series: Samples written, rate | The number of successfully written metric samples per second. |
DEPENDENT | cockroachdb.ts.samples.written.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Slow requests: DistSender RPCs | Number of RPCs stuck or retrying for a long time. |
DEPENDENT | cockroachdb.slow_requests.rpc Preprocessing: - PROMETHEUS_PATTERN: requests_slow_distsender |
CockroachDB | CockroachDB: SQL: Bytes received, rate | Total amount of incoming SQL client network traffic in bytes per second. |
DEPENDENT | cockroachdb.sql.bytes.received.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: SQL: Bytes sent, rate | Total amount of outgoing SQL client network traffic in bytes per second. |
DEPENDENT | cockroachdb.sql.bytes.sent.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Memory: Allocated by SQL | Current SQL statement memory usage for root. |
DEPENDENT | cockroachdb.memory.sql Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: SQL: Schema changes, rate | Total number of SQL DDL statements successfully executed per second. |
DEPENDENT | cockroachdb.sql.schema_changes.rate Preprocessing: - PROMETHEUS_PATTERN: sql_ddl_count - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: SQL sessions: Open | Total number of open SQL sessions. |
DEPENDENT | cockroachdb.sql.sessions Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: SQL statements: Active | Total number of SQL statements currently active. |
DEPENDENT | cockroachdb.sql.statements.active Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: SQL statements: DELETE, rate | A moving average of the number of DELETE statements successfully executed per second. |
DEPENDENT | cockroachdb.sql.statements.delete.rate Preprocessing: - PROMETHEUSPATTERN: - CHANGEPER_SECOND |
CockroachDB | CockroachDB: SQL statements: Executed, rate | Number of SQL queries executed per second. |
DEPENDENT | cockroachdb.sql.statements.executed.rate Preprocessing: - PROMETHEUSPATTERN: - CHANGEPER_SECOND |
CockroachDB | CockroachDB: SQL statements: Denials, rate | The number of statements denied per second by a feature flag. |
DEPENDENT | cockroachdb.sql.statements.denials.rate Preprocessing: - PROMETHEUSPATTERN: - CHANGEPER_SECOND |
CockroachDB | CockroachDB: SQL statements: Active flows distributed, rate | The number of distributed SQL flows currently active per second. |
DEPENDENT | cockroachdb.sql.statements.flows.active.rate Preprocessing: - PROMETHEUSPATTERN: - CHANGEPER_SECOND |
CockroachDB | CockroachDB: SQL statements: INSERT, rate | A moving average of the number of INSERT statements successfully executed per second. |
DEPENDENT | cockroachdb.sql.statements.insert.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: SQL statements: SELECT, rate | A moving average of the number of SELECT statements successfully executed per second. |
DEPENDENT | cockroachdb.sql.statements.select.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: SQL statements: UPDATE, rate | A moving average of the number of UPDATE statements successfully executed per second. |
DEPENDENT | cockroachdb.sql.statements.update.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: SQL statements: Contention, rate | Total number of SQL statements that experienced contention per second. |
DEPENDENT | cockroachdb.sql.statements.contention.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: SQL statements: Errors, rate | Total number of statements which returned a planning or runtime error per second. |
DEPENDENT | cockroachdb.sql.statements.errors.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: SQL transactions: Open | Total number of currently open SQL transactions. |
DEPENDENT | cockroachdb.sql.transactions.open Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: SQL transactions: Aborted, rate | Total number of SQL transaction abort errors per second. |
DEPENDENT | cockroachdb.sql.transactions.aborted.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: SQL transactions: Committed, rate | Total number of SQL transaction COMMIT statements successfully executed per second. |
DEPENDENT | cockroachdb.sql.transactions.committed.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: SQL transactions: Initiated, rate | Total number of SQL transaction BEGIN statements successfully executed per second. |
DEPENDENT | cockroachdb.sql.transactions.initiated.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: SQL transactions: Rolled back, rate | Total number of SQL transaction ROLLBACK statements successfully executed per second. |
DEPENDENT | cockroachdb.sql.transactions.rollbacks.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Uptime | Process uptime. |
DEPENDENT | cockroachdb.uptime Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Node certificate expiration date | Node certificate expires at that date. |
DEPENDENT | cockroachdb.cert.expire_date.node Preprocessing: - PROMETHEUS_PATTERN: `security_certificate_expiration_node`: `value` ⛔️ON_FAIL: DISCARD_VALUE - DISCARD_UNCHANGED_HEARTBEAT: 6h |
CockroachDB | CockroachDB: CA certificate expiration date | CA certificate expires at that date. |
DEPENDENT | cockroachdb.cert.expire_date.ca Preprocessing: - PROMETHEUS_PATTERN: `security_certificate_expiration_ca`: `value` ⛔️ON_FAIL: DISCARD_VALUE - DISCARD_UNCHANGED_HEARTBEAT: 6h |
CockroachDB | CockroachDB: Storage [{#STORE}]: Bytes: Live | Number of logical bytes stored in live key-value pairs on this node. Live data excludes historical and deleted data. |
DEPENDENT | cockroachdb.storage.bytes.[{#STORE},live] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Bytes: System | Number of physical bytes stored in system key-value pairs. |
DEPENDENT | cockroachdb.storage.bytes.[{#STORE},system] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Capacity available | Available storage capacity. |
DEPENDENT | cockroachdb.storage.capacity.[{#STORE},available] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Capacity total | Total storage capacity. This value may be explicitly set using --store. If a store size has not been set, this metric displays the actual disk capacity. |
DEPENDENT | cockroachdb.storage.capacity.[{#STORE},total] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Capacity used | Disk space in use by CockroachDB data on this node. This excludes the Cockroach binary, operating system, and other system files. |
DEPENDENT | cockroachdb.storage.capacity.[{#STORE},used] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Capacity available in % | Available storage capacity in %. |
CALCULATED | cockroachdb.storage.capacity.[{#STORE},available_percent] Expression: last(//cockroachdb.storage.capacity.[{#STORE},available]) / last(//cockroachdb.storage.capacity.[{#STORE},total]) * 100 |
CockroachDB | CockroachDB: Storage [{#STORE}]: Replication: Lease holders | Number of lease holders. |
DEPENDENT | cockroachdb.replication.[{#STORE},leaseholders] Preprocessing: - PROMETHEUS_PATTERN: `replicas_leaseholders{store="{#STORE}"}`: `value` |
CockroachDB | CockroachDB: Storage [{#STORE}]: Bytes: Logical | Number of logical bytes stored in key-value pairs on this node. This includes historical and deleted data. |
DEPENDENT | cockroachdb.storage.bytes.[{#STORE},logical] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Rebalancing: Average queries, rate | Number of kv-level requests received per second by the store, averaged over a large time period as used in rebalancing decisions. |
DEPENDENT | cockroachdb.rebalancing.queries.average.[{#STORE},rate] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Rebalancing: Average writes, rate | Number of keys written (i.e. applied by raft) per second to the store, averaged over a large time period as used in rebalancing decisions. |
DEPENDENT | cockroachdb.rebalancing.writes.average.[{#STORE},rate] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Queue processing failures: Consistency, rate | Number of replicas which failed processing in the consistency checker queue per second. |
DEPENDENT | cockroachdb.queue.processing_failures.consistency.[{#STORE},rate] Preprocessing: - PROMETHEUS_PATTERN: `queue_consistency_process_failure{store="{#STORE}"}`: `value` - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Storage [{#STORE}]: Queue processing failures: GC, rate | Number of replicas which failed processing in the GC queue per second. |
DEPENDENT | cockroachdb.queue.processing_failures.gc.[{#STORE},rate] Preprocessing: - PROMETHEUS_PATTERN: `queue_gc_process_failure{store="{#STORE}"}`: `value` - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Storage [{#STORE}]: Queue processing failures: Raft log, rate | Number of replicas which failed processing in the Raft log queue per second. |
DEPENDENT | cockroachdb.queue.processing_failures.raftlog.[{#STORE},rate] Preprocessing: - PROMETHEUS_PATTERN: `queue_raftlog_process_failure{store="{#STORE}"}`: `value` - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Storage [{#STORE}]: Queue processing failures: Raft snapshot, rate | Number of replicas which failed processing in the Raft repair queue per second. |
DEPENDENT | cockroachdb.queue.processing_failures.raftsnapshot.[{#STORE},rate] Preprocessing: - PROMETHEUS_PATTERN: `queue_raftsnapshot_process_failure{store="{#STORE}"}`: `value` - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Storage [{#STORE}]: Queue processing failures: Replica GC, rate | Number of replicas which failed processing in the replica GC queue per second. |
DEPENDENT | cockroachdb.queue.processing_failures.gcreplica.[{#STORE},rate] Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Storage [{#STORE}]: Queue processing failures: Replicate, rate | Number of replicas which failed processing in the replicate queue per second. |
DEPENDENT | cockroachdb.queue.processing_failures.replicate.[{#STORE},rate] Preprocessing: - PROMETHEUS_PATTERN: `queue_replicate_process_failure{store="{#STORE}"}`: `value` - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Storage [{#STORE}]: Queue processing failures: Split, rate | Number of replicas which failed processing in the split queue per second. |
DEPENDENT | cockroachdb.queue.processing_failures.split.[{#STORE},rate] Preprocessing: - PROMETHEUS_PATTERN: `queue_split_process_failure{store="{#STORE}"}`: `value` - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Storage [{#STORE}]: Queue processing failures: Time series maintenance, rate | Number of replicas which failed processing in the time series maintenance queue per second. |
DEPENDENT | cockroachdb.queue.processing_failures.tsmaintenance.[{#STORE},rate] Preprocessing: - PROMETHEUS_PATTERN: `queue_tsmaintenance_process_failure{store="{#STORE}"}`: `value` - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Storage [{#STORE}]: Ranges count | Number of ranges. |
DEPENDENT | cockroachdb.ranges.[{#STORE},count] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Ranges unavailable | Number of ranges with fewer live replicas than needed for quorum. |
DEPENDENT | cockroachdb.ranges.[{#STORE},unavailable] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Ranges underreplicated | Number of ranges with fewer live replicas than the replication target. |
DEPENDENT | cockroachdb.ranges.[{#STORE},underreplicated] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: RocksDB read amplification | The average number of real read operations executed per logical read operation. |
DEPENDENT | cockroachdb.rocksdb.[{#STORE},readamp] Preprocessing: - PROMETHEUS_PATTERN: `rocksdb_read_amplification{store="{#STORE}"}`: `value` |
CockroachDB | CockroachDB: Storage [{#STORE}]: RocksDB cache hits, rate | Count of block cache hits per second. |
DEPENDENT | cockroachdb.rocksdb.cache.hits.[{#STORE},rate] Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Storage [{#STORE}]: RocksDB cache misses, rate | Count of block cache misses per second. |
DEPENDENT | cockroachdb.rocksdb.cache.misses.[{#STORE},rate] Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
CockroachDB | CockroachDB: Storage [{#STORE}]: RocksDB cache hit ratio | Block cache hit ratio in %. |
CALCULATED | cockroachdb.rocksdb.cache.[{#STORE},hit_ratio] Expression: last(//cockroachdb.rocksdb.cache.hits.[{#STORE},rate]) / (last(//cockroachdb.rocksdb.cache.hits.[{#STORE},rate]) + last(//cockroachdb.rocksdb.cache.misses.[{#STORE},rate])) * 100 |
CockroachDB | CockroachDB: Storage [{#STORE}]: Replication: Replicas | Number of replicas. |
DEPENDENT | cockroachdb.replication.replicas.[{#STORE},count] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Replication: Replicas quiesced | Number of quiesced replicas. |
DEPENDENT | cockroachdb.replication.replicas.[{#STORE},quiesced] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Slow requests: Latch acquisitions | Number of requests that have been stuck for a long time acquiring latches. |
DEPENDENT | cockroachdb.slowrequests.[{#STORE},latchacquisitions] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Slow requests: Lease acquisitions | Number of requests that have been stuck for a long time acquiring a lease. |
DEPENDENT | cockroachdb.slowrequests.[{#STORE},leaseacquisitions] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: Slow requests: Raft proposals | Number of requests that have been stuck for a long time in raft. |
DEPENDENT | cockroachdb.slowrequests.[{#STORE},raftproposals] Preprocessing: - PROMETHEUS_PATTERN: |
CockroachDB | CockroachDB: Storage [{#STORE}]: RocksDB SSTables | The number of SSTables in use. |
DEPENDENT | cockroachdb.rocksdb.[{#STORE},sstables] Preprocessing: - PROMETHEUS_PATTERN: |
Zabbix raw items | CockroachDB: Get metrics | Get raw metrics from the Prometheus endpoint. |
HTTP_AGENT | cockroachdb.get_metrics Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: DISCARD_VALUE |
Zabbix raw items | CockroachDB: Get health | Get node /health endpoint. |
HTTP_AGENT | cockroachdb.get_health Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: DISCARD_VALUE - REGEX: - DISCARD_UNCHANGED_HEARTBEAT: |
Zabbix raw items | CockroachDB: Get readiness | Get node /health?ready=1 endpoint. |
HTTP_AGENT | cockroachdb.get_readiness Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: DISCARD_VALUE - REGEX: - DISCARD_UNCHANGED_HEARTBEAT: |
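Nearly all items above are dependent items that apply PROMETHEUS_PATTERN preprocessing to the raw text fetched by the master "Get metrics" HTTP agent item. As a rough illustration of what that step does, here is a minimal Python sketch that pulls one sample out of a Prometheus exposition payload; the metric names follow the patterns above, but the payload and its values are made up:

```python
# Shortened, made-up Prometheus exposition text of the kind CockroachDB
# serves on its metrics endpoint (a real payload has thousands of samples).
METRICS = """\
sql_bytes_sent 123456
replicas_leaseholders{store="1"} 7
rocksdb_read_amplification{store="1"} 3
"""

def prometheus_pattern(text, pattern):
    """Return the value of the first sample whose name{labels} part matches
    `pattern` exactly, loosely mimicking Zabbix PROMETHEUS_PATTERN with the
    `value` output option."""
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        name_labels, _, value = line.rpartition(" ")
        if name_labels == pattern:
            return float(value)
    return None  # in Zabbix, this is where ⛔️ON_FAIL handling would apply

print(prometheus_pattern(METRICS, 'replicas_leaseholders{store="1"}'))  # 7.0
```

The real preprocessing also supports label-based filtering and aggregation; this sketch only covers the exact-match case used by the items above.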
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
CockroachDB: Service is down | - |
last(/CockroachDB by HTTP/net.tcp.service["{$COCKROACHDB.API.SCHEME}","{HOST.CONN}","{$COCKROACHDB.API.PORT}"]) = 0 |
AVERAGE | |
CockroachDB: Clock offset is too high | Cockroach-measured clock offset is nearing limit (by default, servers kill themselves at 400ms from the mean). |
min(/CockroachDB by HTTP/cockroachdb.clock.offset,5m) > {$COCKROACHDB.CLOCK.OFFSET.MAX.WARN} * 0.001 |
WARNING | |
CockroachDB: Version has changed | - |
last(/CockroachDB by HTTP/cockroachdb.version) <> last(/CockroachDB by HTTP/cockroachdb.version,#2) and length(last(/CockroachDB by HTTP/cockroachdb.version)) > 0 |
INFO | |
CockroachDB: Current number of open files is too high | Getting close to open file descriptor limit. |
min(/CockroachDB by HTTP/cockroachdb.descriptors.open,10m) / last(/CockroachDB by HTTP/cockroachdb.descriptors.limit) * 100 > {$COCKROACHDB.OPEN.FDS.MAX.WARN} |
WARNING | |
CockroachDB: Node is not executing SQL | Node is not executing SQL despite having connections. |
last(/CockroachDB by HTTP/cockroachdb.sql.sessions) > 0 and last(/CockroachDB by HTTP/cockroachdb.sql.statements.executed.rate) = 0 |
WARNING | |
CockroachDB: SQL statements errors rate is too high | - |
min(/CockroachDB by HTTP/cockroachdb.sql.statements.errors.rate,5m) > {$COCKROACHDB.STATEMENTS.ERRORS.MAX.WARN} |
WARNING | |
CockroachDB: Node has been restarted | Uptime is less than 10 minutes. |
last(/CockroachDB by HTTP/cockroachdb.uptime) < 10m |
INFO | |
CockroachDB: Failed to fetch node data | Zabbix has not received data for items for the last 5 minutes. |
nodata(/CockroachDB by HTTP/cockroachdb.uptime,5m) = 1 |
WARNING | Depends on: - CockroachDB: Service is down |
CockroachDB: Node certificate expires soon | Node certificate expires soon. |
(last(/CockroachDB by HTTP/cockroachdb.cert.expire_date.node) - now()) / 86400 < {$COCKROACHDB.CERT.NODE.EXPIRY.WARN} |
WARNING | |
CockroachDB: CA certificate expires soon | CA certificate expires soon. |
(last(/CockroachDB by HTTP/cockroachdb.cert.expire_date.ca) - now()) / 86400 < {$COCKROACHDB.CERT.CA.EXPIRY.WARN} |
WARNING | |
CockroachDB: Storage [{#STORE}]: Available storage capacity is low | Storage is running low on free space (less than {$COCKROACHDB.STORE.USED.MIN.WARN}% available). |
max(/CockroachDB by HTTP/cockroachdb.storage.capacity.[{#STORE},available_percent],5m) < {$COCKROACHDB.STORE.USED.MIN.WARN} Recovery expression: min(/CockroachDB by HTTP/cockroachdb.storage.capacity.[{#STORE},available_percent],5m) > {$COCKROACHDB.STORE.USED.MIN.WARN} |
WARNING | Depends on: - CockroachDB: Storage [{#STORE}]: Available storage capacity is critically low |
CockroachDB: Storage [{#STORE}]: Available storage capacity is critically low | Storage is running critically low on free space (less than {$COCKROACHDB.STORE.USED.MIN.CRIT}% available). |
max(/CockroachDB by HTTP/cockroachdb.storage.capacity.[{#STORE},available_percent],5m) < {$COCKROACHDB.STORE.USED.MIN.CRIT} Recovery expression: min(/CockroachDB by HTTP/cockroachdb.storage.capacity.[{#STORE},available_percent],5m) > {$COCKROACHDB.STORE.USED.MIN.CRIT} |
AVERAGE | |
CockroachDB: Node is unhealthy | Node's /health endpoint has returned HTTP 500 Internal Server Error, which indicates that the node is unhealthy. |
last(/CockroachDB by HTTP/cockroachdb.get_health) = 500 |
AVERAGE | Depends on: - CockroachDB: Service is down |
CockroachDB: Node is not ready | Node's /health?ready=1 endpoint has returned HTTP 503 Service Unavailable. Possible reasons: - node is in the wait phase of the node shutdown sequence; - node is unable to communicate with a majority of the other nodes in the cluster, likely because the cluster is unavailable due to too many nodes being down. |
last(/CockroachDB by HTTP/cockroachdb.get_readiness) = 503 and last(/CockroachDB by HTTP/cockroachdb.uptime) > 5m |
AVERAGE | Depends on: - CockroachDB: Service is down |
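The two certificate triggers above share one shape: take the expiration timestamp collected from the metrics, subtract the current time, convert to days, and compare against the macro. A minimal sketch of that arithmetic (the timestamps and threshold here are hypothetical):

```python
def cert_expires_soon(expire_ts, warn_days, now_ts):
    """Mirror the trigger expression
    (last(cert.expire_date) - now()) / 86400 < {$COCKROACHDB.CERT.*.EXPIRY.WARN}."""
    days_left = (expire_ts - now_ts) / 86400
    return days_left < warn_days

NOW = 1_700_000_000  # hypothetical "current" unix time
print(cert_expires_soon(NOW + 20 * 86400, warn_days=30, now_ts=NOW))  # True: 20 days left
print(cert_expires_soon(NOW + 90 * 86400, warn_days=30, now_ts=NOW))  # False: 90 days left
```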
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor ClickHouse by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
This template was tested on:
See Zabbix template operation for basic instructions.
Create a user to monitor the service:
Create the file /etc/clickhouse-server/users.d/zabbix.xml with the following content:
<yandex>
    <users>
        <zabbix>
            <password>zabbix_pass</password>
            <networks incl="networks" />
            <profile>web</profile>
            <quota>default</quota>
            <allow_databases>
                <database>test</database>
            </allow_databases>
        </zabbix>
    </users>
</yandex>
Login and password are also set in macros:
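Before pointing the template at the server, the new user can be checked against ClickHouse's HTTP interface, which accepts `user` and `password` as URL parameters. A small Python sketch that assembles such a probe URL from the macro values; the host is an assumption for a local default installation:

```python
from urllib.parse import urlencode

scheme = "http"           # {$CLICKHOUSE.SCHEME}
host = "localhost"        # assumed local installation
port = 8123               # {$CLICKHOUSE.PORT}
user = "zabbix"           # {$CLICKHOUSE.USER}
password = "zabbix_pass"  # {$CLICKHOUSE.PASSWORD}

params = urlencode({"user": user, "password": password, "query": "SELECT 1"})
url = f"{scheme}://{host}:{port}/?{params}"
print(url)  # feed this to curl or a browser; a healthy server answers "1"
```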
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$CLICKHOUSE.DELAYED.FILES.DISTRIBUTED.COUNT.MAX.WARN} | Maximum size of distributed files queue to insert for trigger expression. |
600 |
{$CLICKHOUSE.DELAYED.INSERTS.MAX.WARN} | Maximum number of delayed inserts for trigger expression. |
0 |
{$CLICKHOUSE.LLD.FILTER.DB.MATCHES} | Filter of discoverable databases |
.* |
{$CLICKHOUSE.LLD.FILTER.DB.NOT_MATCHES} | Filter to exclude discovered databases |
CHANGE_IF_NEEDED |
{$CLICKHOUSE.LLD.FILTER.DICT.MATCHES} | Filter of discoverable dictionaries |
.* |
{$CLICKHOUSE.LLD.FILTER.DICT.NOT_MATCHES} | Filter to exclude discovered dictionaries |
CHANGE_IF_NEEDED |
{$CLICKHOUSE.LOG_POSITION.DIFF.MAX.WARN} | Maximum diff between log_pointer and log_max_index. |
30 |
{$CLICKHOUSE.NETWORK.ERRORS.MAX.WARN} | Maximum number of network errors for trigger expression. |
5 |
{$CLICKHOUSE.PARTS.PER.PARTITION.WARN} | Maximum number of parts per partition for trigger expression. |
300 |
{$CLICKHOUSE.PASSWORD} | - |
zabbix_pass |
{$CLICKHOUSE.PORT} | The port of ClickHouse HTTP endpoint |
8123 |
{$CLICKHOUSE.QUERY_TIME.MAX.WARN} | Maximum ClickHouse query time in seconds for trigger expression |
600 |
{$CLICKHOUSE.QUEUE.SIZE.MAX.WARN} | Maximum size of the queue for operations waiting to be performed for trigger expression. |
20 |
{$CLICKHOUSE.REPLICA.MAX.WARN} | Replication lag across all tables for trigger expression. |
600 |
{$CLICKHOUSE.SCHEME} | Request scheme which may be http or https |
http |
{$CLICKHOUSE.USER} | - |
zabbix |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Dictionaries | Info about dictionaries |
DEPENDENT | clickhouse.dictionaries.discovery Filter: AND - {#NAME} MATCHES_REGEX - {#NAME} NOT_MATCHES_REGEX |
Replicas | Info about replicas |
DEPENDENT | clickhouse.replicas.discovery Filter: AND - {#DB} MATCHES_REGEX - {#DB} NOT_MATCHES_REGEX |
Tables | Info about tables |
DEPENDENT | clickhouse.tables.discovery Filter: AND - {#DB} MATCHES_REGEX - {#DB} NOT_MATCHES_REGEX |
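The discovery rules, and most of the items below, are dependent items: one master HTTP agent item queries ClickHouse system tables (queries run with FORMAT JSON return a {"data": [...]} envelope), and each dependent item extracts its value with a JSONPATH preprocessing step. A toy Python illustration of that extraction; the table names and figures here are invented:

```python
import json

# Made-up excerpt of what a query like
# "SELECT ... FROM system.tables FORMAT JSON" returns.
RAW = json.loads("""
{"data": [
  {"database": "test", "table": "events", "bytes": 1048576, "parts": 4, "rows": 20000},
  {"database": "test", "table": "users",  "bytes": 65536,   "parts": 1, "rows": 300}
]}
""")

def pick(doc, table, field):
    """Toy stand-in for a Zabbix JSONPATH step such as
    $.data[?(@.table == 'events')].bytes.first()"""
    for row in doc["data"]:
        if row["table"] == table:
            return row[field]
    return None  # in Zabbix, this is where ⛔️ON_FAIL handling kicks in

print(pick(RAW, "events", "bytes"))  # 1048576
```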
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
ClickHouse | ClickHouse: Longest currently running query time | Get longest running query. |
HTTP_AGENT | clickhouse.process.elapsed |
ClickHouse | ClickHouse: Check port availability | - |
SIMPLE | net.tcp.service[{$CLICKHOUSE.SCHEME},"{HOST.CONN}","{$CLICKHOUSE.PORT}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
ClickHouse | ClickHouse: Ping | - |
HTTP_AGENT | clickhouse.ping Preprocessing: - REGEX: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
ClickHouse | ClickHouse: Version | Version of the server |
HTTP_AGENT | clickhouse.version Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
ClickHouse | ClickHouse: Revision | Revision of the server. |
DEPENDENT | clickhouse.revision Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Uptime | Number of seconds since ClickHouse server start |
DEPENDENT | clickhouse.uptime Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: New queries per second | Number of queries to be interpreted and potentially executed. Does not include queries that failed to parse or were rejected due to AST size limits, quota limits or limits on the number of simultaneously running queries. May include internal queries initiated by ClickHouse itself. Does not count subqueries. |
DEPENDENT | clickhouse.query.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
ClickHouse | ClickHouse: New SELECT queries per second | Number of SELECT queries to be interpreted and potentially executed. Does not include queries that failed to parse or were rejected due to AST size limits, quota limits or limits on the number of simultaneously running queries. May include internal queries initiated by ClickHouse itself. Does not count subqueries. |
DEPENDENT | clickhouse.select_query.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
ClickHouse | ClickHouse: New INSERT queries per second | Number of INSERT queries to be interpreted and potentially executed. Does not include queries that failed to parse or were rejected due to AST size limits, quota limits or limits on the number of simultaneously running queries. May include internal queries initiated by ClickHouse itself. Does not count subqueries. |
DEPENDENT | clickhouse.insert_query.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
ClickHouse | ClickHouse: Delayed insert queries | "Number of INSERT queries that are throttled due to high number of active data parts for partition in a MergeTree table." |
DEPENDENT | clickhouse.insert.delay Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Current running queries | Number of executing queries |
DEPENDENT | clickhouse.query.current Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Current running merges | Number of executing background merges |
DEPENDENT | clickhouse.merge.current Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Inserted bytes per second | The number of uncompressed bytes inserted in all tables. |
DEPENDENT | clickhouse.inserted_bytes.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
ClickHouse | ClickHouse: Read bytes per second | "Number of bytes (the number of bytes before decompression) read from compressed sources (files, network)." |
DEPENDENT | clickhouse.read_bytes.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
ClickHouse | ClickHouse: Inserted rows per second | The number of rows inserted in all tables. |
DEPENDENT | clickhouse.inserted_rows.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
ClickHouse | ClickHouse: Merged rows per second | Rows read for background merges. |
DEPENDENT | clickhouse.merge_rows.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
ClickHouse | ClickHouse: Uncompressed bytes merged per second | Uncompressed bytes that were read for background merges |
DEPENDENT | clickhouse.merge_bytes.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
ClickHouse | ClickHouse: Max count of parts per partition across all tables | The ClickHouse MergeTree table engine splits each INSERT query into partitions (PARTITION BY expression) and adds one or more PARTS per INSERT inside each partition, after which the background merge process runs. |
DEPENDENT | clickhouse.max.part.count.for.partition Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Current TCP connections | Number of connections to TCP server (clients with native interface). |
DEPENDENT | clickhouse.connections.tcp Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Current HTTP connections | Number of connections to HTTP server. |
DEPENDENT | clickhouse.connections.http Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Current distribute connections | Number of connections to remote servers sending data that was INSERTed into Distributed tables. |
DEPENDENT | clickhouse.connections.distribute Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Current MySQL connections | Number of connections to MySQL server. |
DEPENDENT | clickhouse.connections.mysql Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
ClickHouse | ClickHouse: Current Interserver connections | Number of connections from other replicas to fetch parts. |
DEPENDENT | clickhouse.connections.interserver Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Network errors per second | Network errors (timeouts and connection failures) during query execution, background pool tasks and DNS cache update. |
DEPENDENT | clickhouse.network.error.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
ClickHouse | ClickHouse: Read syscalls in fly | Number of read (read, pread, io_getevents, etc.) syscalls in fly |
DEPENDENT | clickhouse.read Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Write syscalls in fly | Number of write (write, pwrite, io_getevents, etc.) syscalls in fly |
DEPENDENT | clickhouse.write Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Allocated bytes | "Total number of bytes allocated by the application." |
DEPENDENT | clickhouse.jemalloc.allocated Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Resident memory | Maximum number of bytes in physically resident data pages mapped by the allocator, comprising all pages dedicated to allocator metadata, pages backing active allocations, and unused dirty pages. |
DEPENDENT | clickhouse.jemalloc.resident Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Mapped memory | "Total number of bytes in active extents mapped by the allocator." |
DEPENDENT | clickhouse.jemalloc.mapped Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Memory used for queries | "Total amount of memory (bytes) allocated in currently executing queries." |
DEPENDENT | clickhouse.memory.tracking Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Memory used for background merges | "Total amount of memory (bytes) allocated in background processing pool (that is dedicated for background merges, mutations and fetches). Note that this value may include a drift when the memory was allocated in a context of background processing pool and freed in other context or vice-versa. This happens naturally due to caches for tables indexes and doesn't indicate memory leaks." |
DEPENDENT | clickhouse.memory.tracking.background Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Memory used for background moves | "Total amount of memory (bytes) allocated in background processing pool (that is dedicated for background moves). Note that this value may include a drift when the memory was allocated in a context of background processing pool and freed in other context or vice-versa. This happens naturally due to caches for tables indexes and doesn't indicate memory leaks." |
DEPENDENT | clickhouse.memory.tracking.background.moves Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
ClickHouse | ClickHouse: Memory used for background schedule pool | "Total amount of memory (bytes) allocated in background schedule pool (that is dedicated for bookkeeping tasks of Replicated tables)." |
DEPENDENT | clickhouse.memory.tracking.schedule.pool Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Memory used for merges | Total amount of memory (bytes) allocated for background merges. Included in MemoryTrackingInBackgroundProcessingPool. Note that this value may include a drift when the memory was allocated in a context of background processing pool and freed in other context or vice-versa. This happens naturally due to caches for tables indexes and doesn't indicate memory leaks. |
DEPENDENT | clickhouse.memory.tracking.merges Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Current distributed files to insert | Number of pending files to process for asynchronous insertion into Distributed tables. Number of files for every shard is summed. |
DEPENDENT | clickhouse.distributed.files Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Distributed connection fail with retry per second | Connection retries in replicated DB connection pool |
DEPENDENT | clickhouse.distributed.files.retry.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
ClickHouse | ClickHouse: Distributed connection fail with failover per second | "Connection failures after all retries in replicated DB connection pool" |
DEPENDENT | clickhouse.distributed.files.fail.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
ClickHouse | ClickHouse: Replication lag across all tables | Maximum replica queue delay relative to current time |
DEPENDENT | clickhouse.replicas.max.absolute.delay Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Total replication tasks in queue | - |
DEPENDENT | clickhouse.replicas.sum.queue.size Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Total number of read-only replicas | Number of Replicated tables that are currently in readonly state due to re-initialization after ZooKeeper session loss or due to startup without ZooKeeper configured. |
DEPENDENT | clickhouse.replicas.readonly.total Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Bytes | Table size in bytes. Database: {#DB}, table: {#TABLE} |
DEPENDENT | clickhouse.table.bytes["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Parts | Number of parts of the table. Database: {#DB}, table: {#TABLE} |
DEPENDENT | clickhouse.table.parts["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Rows | Number of rows in the table. Database: {#DB}, table: {#TABLE} |
DEPENDENT | clickhouse.table.rows["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}: Bytes | Database size in bytes. |
DEPENDENT | clickhouse.db.bytes["{#DB}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Replica readonly | Whether the replica is in read-only mode. This mode is turned on if the config doesn't have sections with ZooKeeper, if an unknown error occurred when re-initializing sessions in ZooKeeper, and during session re-initialization in ZooKeeper. |
DEPENDENT | clickhouse.replica.is_readonly["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Replica session expired | True if the ZooKeeper session expired |
DEPENDENT | clickhouse.replica.is_session_expired["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Replica future parts | Number of data parts that will appear as the result of INSERTs or merges that haven't been done yet. |
DEPENDENT | clickhouse.replica.future_parts["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Replica parts to check | Number of data parts in the queue for verification. A part is put in the verification queue if there is suspicion that it might be damaged. |
DEPENDENT | clickhouse.replica.parts_to_check["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Replica queue size | Size of the queue for operations waiting to be performed. |
DEPENDENT | clickhouse.replica.queue_size["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Replica queue inserts size | Number of inserts of blocks of data that need to be made. |
DEPENDENT | clickhouse.replica.inserts_in_queue["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Replica queue merges size | Number of merges waiting to be made. |
DEPENDENT | clickhouse.replica.merges_in_queue["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Replica log max index | Maximum entry number in the log of general activity. (Has a non-zero value only when there is an active session with ZooKeeper.) |
DEPENDENT | clickhouse.replica.log_max_index["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Replica log pointer | Maximum entry number in the log of general activity that the replica copied to its execution queue, plus one. (Has a non-zero value only when there is an active session with ZooKeeper.) |
DEPENDENT | clickhouse.replica.log_pointer["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Total replicas | Total number of known replicas of this table. (Has a non-zero value only when there is an active session with ZooKeeper.) |
DEPENDENT | clickhouse.replica.total_replicas["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Active replicas | Number of replicas of this table that have a session in ZooKeeper (i.e., the number of functioning replicas). (Has a non-zero value only when there is an active session with ZooKeeper.) |
DEPENDENT | clickhouse.replica.active_replicas["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: {#DB}.{#TABLE}: Replica lag | Difference between log_max_index and log_pointer. |
DEPENDENT | clickhouse.replica.lag["{#DB}.{#TABLE}"] Preprocessing: - JSONPATH: |
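The replica lag item is a derived value: log_max_index minus log_pointer for each replicated table. A minimal Python sketch of the same computation, assuming a JSONEachRow payload shaped like the output of the raw "Get replicas info" item (the exact column list here is an illustrative assumption):

```python
import json

# Sample JSONEachRow payload; ClickHouse quotes 64-bit integers as
# strings in JSON output by default, which the parser below handles.
payload = "\n".join([
    '{"database":"db1","table":"events","log_max_index":"1200","log_pointer":"1195"}',
    '{"database":"db1","table":"users","log_max_index":"300","log_pointer":"300"}',
])

def replica_lag(rows: str) -> dict:
    """Per-table replication lag, computed as log_max_index - log_pointer."""
    lag = {}
    for line in rows.splitlines():
        row = json.loads(line)
        lag[f'{row["database"]}.{row["table"]}'] = (
            int(row["log_max_index"]) - int(row["log_pointer"])
        )
    return lag

print(replica_lag(payload))  # {'db1.events': 5, 'db1.users': 0}
```

In the template itself this subtraction is done by the item's JavaScript preprocessing step rather than externally; the sketch only illustrates the arithmetic.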
ClickHouse | ClickHouse: Dictionary {#NAME}: Bytes allocated | The amount of RAM the dictionary uses. |
DEPENDENT | clickhouse.dictionary.bytes_allocated["{#NAME}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Dictionary {#NAME}: Element count | Number of items stored in the dictionary. |
DEPENDENT | clickhouse.dictionary.element_count["{#NAME}"] Preprocessing: - JSONPATH: |
ClickHouse | ClickHouse: Dictionary {#NAME}: Load factor | The percentage filled in the dictionary (for a hashed dictionary, the percentage filled in the hash table). |
DEPENDENT | clickhouse.dictionary.load_factor["{#NAME}"] Preprocessing: - JSONPATH: - MULTIPLIER: |
ClickHouse ZooKeeper | ClickHouse: ZooKeeper sessions | Number of sessions (connections) to ZooKeeper. Should be no more than one. |
DEPENDENT | clickhouse.zookeeper.session Preprocessing: - JSONPATH: |
ClickHouse ZooKeeper | ClickHouse: ZooKeeper watches | Number of watches (e.g., event subscriptions) in ZooKeeper. |
DEPENDENT | clickhouse.zookeeper.watch Preprocessing: - JSONPATH: |
ClickHouse ZooKeeper | ClickHouse: ZooKeeper requests | Number of requests to ZooKeeper in progress. |
DEPENDENT | clickhouse.zookeeper.request Preprocessing: - JSONPATH: |
ClickHouse ZooKeeper | ClickHouse: ZooKeeper wait time | Time spent in waiting for ZooKeeper operations. |
DEPENDENT | clickhouse.zookeeper.wait.time Preprocessing: - JSONPATH: ⛔️ON FAIL: - MULTIPLIER: - CHANGE_PER_SECOND |
ClickHouse ZooKeeper | ClickHouse: ZooKeeper exceptions per second | Count of ZooKeeper exceptions that do not belong to user/hardware exceptions. |
DEPENDENT | clickhouse.zookeeper.exceptions.rate Preprocessing: - JSONPATH: ⛔️ON FAIL: - CHANGE_PER_SECOND |
ClickHouse ZooKeeper | ClickHouse: ZooKeeper hardware exceptions per second | Count of ZooKeeper exceptions caused by session moved/expired, connection loss, marshalling error, operation timed out and invalid zhandle state. |
DEPENDENT | clickhouse.zookeeper.hw_exceptions.rate Preprocessing: - JSONPATH: ⛔️ON FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
ClickHouse ZooKeeper | ClickHouse: ZooKeeper user exceptions per second | Count of ZooKeeper exceptions caused by no znodes, bad version, node exists, node empty and no children for ephemeral. |
DEPENDENT | clickhouse.zookeeper.user_exceptions.rate Preprocessing: - JSONPATH: ⛔️ON FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
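The three exception items above turn monotonically increasing counters into per-second rates via the CHANGE_PER_SECOND preprocessing step. A minimal sketch of that step, assuming Zabbix's documented behavior of discarding a sample whose counter value decreased (e.g. after a restart reset the counter):

```python
def change_per_second(prev_value, prev_ts, value, ts):
    """Sketch of Zabbix's CHANGE_PER_SECOND step: (value - prev) / (ts - prev_ts).
    Zabbix stores no value when the counter decreases; None models that here."""
    if ts <= prev_ts or value < prev_value:
        return None
    return (value - prev_value) / (ts - prev_ts)

# A ZooKeeperExceptions counter that grew from 120 to 180 over 30 seconds:
print(change_per_second(120, 1000, 180, 1030))  # 2.0
```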
Zabbix raw items | ClickHouse: Get system.events | Get information about the number of events that have occurred in the system. |
HTTP_AGENT | clickhouse.system.events Preprocessing: - JSONPATH: |
Zabbix raw items | ClickHouse: Get system.metrics | Get metrics that can be calculated instantly or have a current value. Format: JSONEachRow. |
HTTP_AGENT | clickhouse.system.metrics Preprocessing: - JSONPATH: |
Zabbix raw items | ClickHouse: Get system.asynchronous_metrics | Get metrics that are calculated periodically in the background |
HTTP_AGENT | clickhouse.system.asynchronous_metrics Preprocessing: - JSONPATH: |
Zabbix raw items | ClickHouse: Get system.settings | Get information about settings that are currently in use. |
HTTP_AGENT | clickhouse.system.settings Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Zabbix raw items | ClickHouse: Get replicas info | - |
HTTP_AGENT | clickhouse.replicas Preprocessing: - JSONPATH: |
Zabbix raw items | ClickHouse: Get tables info | - |
HTTP_AGENT | clickhouse.tables Preprocessing: - JSONPATH: |
Zabbix raw items | ClickHouse: Get dictionaries info | - |
HTTP_AGENT | clickhouse.dictionaries Preprocessing: - JSONPATH: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
ClickHouse: There are long-running queries | - |
last(/ClickHouse by HTTP/clickhouse.process.elapsed)>{$CLICKHOUSE.QUERY_TIME.MAX.WARN} |
AVERAGE | Manual close: YES |
ClickHouse: Port {$CLICKHOUSE.PORT} is unavailable | - |
last(/ClickHouse by HTTP/net.tcp.service[{$CLICKHOUSE.SCHEME},"{HOST.CONN}","{$CLICKHOUSE.PORT}"])=0 |
AVERAGE | Manual close: YES |
ClickHouse: Service is down | - |
last(/ClickHouse by HTTP/clickhouse.ping)=0 or last(/ClickHouse by HTTP/net.tcp.service[{$CLICKHOUSE.SCHEME},"{HOST.CONN}","{$CLICKHOUSE.PORT}"]) = 0 |
AVERAGE | Manual close: YES Depends on: - ClickHouse: Port {$CLICKHOUSE.PORT} is unavailable |
ClickHouse: Version has changed | ClickHouse version has changed. Ack to close. |
last(/ClickHouse by HTTP/clickhouse.version,#1)<>last(/ClickHouse by HTTP/clickhouse.version,#2) and length(last(/ClickHouse by HTTP/clickhouse.version))>0 |
INFO | Manual close: YES |
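The version-change expression above compares the two most recently collected values and also requires the newest value to be non-empty, so a failed fetch does not fire the trigger. A small Python model of that logic:

```python
def version_changed(history):
    """Model of: last(#1) <> last(#2) and length(last(#1)) > 0.
    `history` is ordered oldest-first; the trigger fires when the two
    most recent values differ and the newest one is non-empty."""
    newest, previous = history[-1], history[-2]
    return newest != previous and len(newest) > 0

print(version_changed(["22.8.5.29", "22.8.5.29"]))  # False
print(version_changed(["22.3.2.2", "22.8.5.29"]))   # True
print(version_changed(["22.8.5.29", ""]))           # False (empty fetch)
```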
ClickHouse: Server has been restarted | Uptime is less than 10 minutes. |
last(/ClickHouse by HTTP/clickhouse.uptime)<10m |
INFO | Manual close: YES |
ClickHouse: Failed to fetch info data | Zabbix has not received data for the items during the last 30 minutes. |
nodata(/ClickHouse by HTTP/clickhouse.uptime,30m)=1 |
WARNING | Manual close: YES Depends on: - ClickHouse: Service is down |
ClickHouse: Too many throttled insert queries | ClickHouse has INSERT queries that are throttled due to a high number of active data parts for a partition in a MergeTree table. Decrease the INSERT frequency. |
min(/ClickHouse by HTTP/clickhouse.insert.delay,5m)>{$CLICKHOUSE.DELAYED.INSERTS.MAX.WARN} |
WARNING | Manual close: YES |
ClickHouse: Too many MergeTree parts | Decrease the frequency of INSERT queries. The ClickHouse MergeTree table engine splits each INSERT query into partitions (by the PARTITION BY expression) and adds one or more parts per INSERT inside each partition. A background process then merges the parts; when a partition accumulates too many unmerged parts, SELECT performance can degrade significantly, so ClickHouse delays or aborts inserts. |
min(/ClickHouse by HTTP/clickhouse.max.part.count.for.partition,5m)>{$CLICKHOUSE.PARTS.PER.PARTITION.WARN} * 0.9 |
WARNING | Manual close: YES |
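The expression takes the minimum over the 5-minute window, so the trigger fires only if every collected sample exceeds 90% of the macro value. A sketch, using 300 as a hypothetical value for {$CLICKHOUSE.PARTS.PER.PARTITION.WARN}:

```python
def too_many_parts(samples, warn_limit):
    """Model of: min(/.../clickhouse.max.part.count.for.partition,5m)
    > {$CLICKHOUSE.PARTS.PER.PARTITION.WARN} * 0.9.
    `samples` are the values collected in the 5-minute window; a single
    low sample keeps the trigger from firing."""
    return min(samples) > warn_limit * 0.9

# With a hypothetical macro value of 300, the threshold is 270:
print(too_many_parts([280, 295, 310], 300))  # True
print(too_many_parts([150, 295, 310], 300))  # False
```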
ClickHouse: Too many network errors | Number of errors (timeouts and connection failures) during query execution, background pool tasks and DNS cache update is too high. |
min(/ClickHouse by HTTP/clickhouse.network.error.rate,5m)>{$CLICKHOUSE.NETWORK.ERRORS.MAX.WARN} |
WARNING | |
ClickHouse: Too many distributed files to insert | The number of pending files to process for asynchronous insertion into Distributed tables is too high. See https://clickhouse.tech/docs/en/operations/table_engines/distributed/ |
min(/ClickHouse by HTTP/clickhouse.distributed.files,5m)>{$CLICKHOUSE.DELAYED.FILES.DISTRIBUTED.COUNT.MAX.WARN} |
WARNING | Manual close: YES |
ClickHouse: Replication lag is too high | When a replica has too much lag, it can be skipped by distributed SELECT queries without an error, and you will get wrong query results. |
min(/ClickHouse by HTTP/clickhouse.replicas.max.absolute.delay,5m)>{$CLICKHOUSE.REPLICA.MAX.WARN} |
WARNING | Manual close: YES |
ClickHouse: {#DB}.{#TABLE} Replica is readonly | This mode is turned on if the config doesn't have sections with ZooKeeper, if an unknown error occurred when re-initializing sessions in ZooKeeper, and during session re-initialization in ZooKeeper. |
min(/ClickHouse by HTTP/clickhouse.replica.is_readonly["{#DB}.{#TABLE}"],5m)=1 |
WARNING | |
ClickHouse: {#DB}.{#TABLE} Replica session is expired | The ZooKeeper session of this replica has expired. The replica stays in read-only mode until the session is re-initialized. |
min(/ClickHouse by HTTP/clickhouse.replica.is_session_expired["{#DB}.{#TABLE}"],5m)=1 |
WARNING | |
ClickHouse: {#DB}.{#TABLE}: Too many operations in queue | - |
min(/ClickHouse by HTTP/clickhouse.replica.queue_size["{#DB}.{#TABLE}"],5m)>{$CLICKHOUSE.QUEUE.SIZE.MAX.WARN:"{#TABLE}"} |
WARNING | |
ClickHouse: {#DB}.{#TABLE}: Number of active replicas less than number of total replicas | - |
max(/ClickHouse by HTTP/clickhouse.replica.active_replicas["{#DB}.{#TABLE}"],5m) < last(/ClickHouse by HTTP/clickhouse.replica.total_replicas["{#DB}.{#TABLE}"]) |
WARNING | |
ClickHouse: {#DB}.{#TABLE}: Difference between log_max_index and log_pointer is too high | - |
min(/ClickHouse by HTTP/clickhouse.replica.lag["{#DB}.{#TABLE}"],5m) > {$CLICKHOUSE.LOG_POSITION.DIFF.MAX.WARN} |
WARNING | |
ClickHouse: Too many ZooKeeper sessions opened | Number of sessions (connections) to ZooKeeper. Should be no more than one, because using more than one connection to ZooKeeper may lead to bugs caused by the lack of linearizability (stale reads) that the ZooKeeper consistency model allows. |
min(/ClickHouse by HTTP/clickhouse.zookeeper.session,5m)>1 |
WARNING | |
ClickHouse: Configuration has been changed | ClickHouse configuration has been changed. Ack to close. |
last(/ClickHouse by HTTP/clickhouse.system.settings,#1)<>last(/ClickHouse by HTTP/clickhouse.system.settings,#2) and length(last(/ClickHouse by HTTP/clickhouse.system.settings))>0 |
INFO | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
Official JMX template for the Apache Cassandra DBMS.
This template was tested on:
See Zabbix template operation for basic instructions.
This template works with standalone and cluster instances. Metrics are collected by JMX.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$CASSANDRA.KEY_SPACE.MATCHES} | Filter of discoverable keyspaces |
.* |
{$CASSANDRA.KEY_SPACE.NOT_MATCHES} | Filter to exclude discovered keyspaces |
`(system|system_auth|system_distributed|system_schema)` |
{$CASSANDRA.PASSWORD} | - |
zabbix |
{$CASSANDRA.PENDING_TASKS.MAX.HIGH} | - |
500 |
{$CASSANDRA.PENDING_TASKS.MAX.WARN} | - |
350 |
{$CASSANDRA.USER} | - |
zabbix |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Tables | Info about keyspaces and tables |
JMX | jmx.discovery[beans,"org.apache.cassandra.metrics:type=Table,keyspace=*,scope=*,name=ReadLatency"] Filter: AND - {#JMXKEYSPACE} MATCHES_REGEX - {#JMXKEYSPACE} NOT_MATCHES_REGEX |
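The discovery filter keeps a keyspace only when it satisfies the MATCHES_REGEX condition and fails the NOT_MATCHES_REGEX condition. A Python sketch using the template's default macro values (Zabbix regex filters match anywhere in the value unless anchored):

```python
import re

# Default macro values from the Macros section above:
MATCHES = r".*"
NOT_MATCHES = r"(system|system_auth|system_distributed|system_schema)"

def keep_keyspace(name: str) -> bool:
    """Mimic the LLD filter: keep {#JMXKEYSPACE} when it satisfies the
    match filter and does not satisfy the exclude filter."""
    return bool(re.search(MATCHES, name)) and not re.search(NOT_MATCHES, name)

discovered = ["system", "system_auth", "sales", "analytics"]
print([ks for ks in discovered if keep_keyspace(ks)])  # ['sales', 'analytics']
```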
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Cassandra | Cluster: Nodes down | - |
JMX | jmx["org.apache.cassandra.net:type=FailureDetector","DownEndpointCount"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Cassandra | Cluster: Nodes up | - |
JMX | jmx["org.apache.cassandra.net:type=FailureDetector","UpEndpointCount"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Cassandra | Cluster: Name | - |
JMX | jmx["org.apache.cassandra.db:type=StorageService","ClusterName"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Cassandra | Version | - |
JMX | jmx["org.apache.cassandra.db:type=StorageService","ReleaseVersion"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Cassandra | Dropped messages: Write (Mutation) | Number of dropped regular write messages. |
JMX | jmx["org.apache.cassandra.metrics:type=DroppedMessage,scope=MUTATION,name=Dropped","Count"] |
Cassandra | Dropped messages: Read | Number of dropped regular read messages. |
JMX | jmx["org.apache.cassandra.metrics:type=DroppedMessage,scope=READ,name=Dropped","Count"] |
Cassandra | Storage: Used (bytes) | Size, in bytes, of the on-disk data this node manages. |
JMX | jmx["org.apache.cassandra.metrics:type=Storage,name=Load","Count"] |
Cassandra | Storage: Errors | Number of internal exceptions caught. Under normal conditions this should be zero. |
JMX | jmx["org.apache.cassandra.metrics:type=Storage,name=Exceptions","Count"] |
Cassandra | Storage: Hints | Number of hint messages written to this node since [re]start. Includes one entry for each host to be hinted per hint. |
JMX | jmx["org.apache.cassandra.metrics:type=Storage,name=TotalHints","Count"] |
Cassandra | Compaction: Number of completed tasks | Number of completed compactions since server [re]start. |
JMX | jmx["org.apache.cassandra.metrics:name=CompletedTasks,type=Compaction","Value"] |
Cassandra | Compaction: Total compactions completed | Throughput of completed compactions since server [re]start. |
JMX | jmx["org.apache.cassandra.metrics:name=TotalCompactionsCompleted,type=Compaction","Count"] |
Cassandra | Compaction: Pending tasks | Estimated number of compactions remaining to perform. |
JMX | jmx["org.apache.cassandra.metrics:type=Compaction,name=PendingTasks","Value"] |
Cassandra | Commitlog: Pending tasks | Number of commit log messages written but yet to be fsync'd. |
JMX | jmx["org.apache.cassandra.metrics:name=PendingTasks,type=CommitLog","Value"] |
Cassandra | Commitlog: Total size | Current size, in bytes, used by all the commit log segments. |
JMX | jmx["org.apache.cassandra.metrics:name=TotalCommitLogSize,type=CommitLog","Value"] |
Cassandra | Latency: Read median | Read latency from disk in milliseconds - median. |
JMX | jmx["org.apache.cassandra.metrics:name=ReadLatency,type=Table","50thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | Latency: Read 75 percentile | Read latency from disk in milliseconds - p75. |
JMX | jmx["org.apache.cassandra.metrics:name=ReadLatency,type=Table","75thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | Latency: Read 95 percentile | Read latency from disk in milliseconds - p95. |
JMX | jmx["org.apache.cassandra.metrics:name=ReadLatency,type=Table","95thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | Latency: Write median | Write latency to disk in milliseconds - median. |
JMX | jmx["org.apache.cassandra.metrics:name=WriteLatency,type=Table","50thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | Latency: Write 75 percentile | Write latency to disk in milliseconds - p75. |
JMX | jmx["org.apache.cassandra.metrics:name=WriteLatency,type=Table","75thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | Latency: Write 95 percentile | Write latency to disk in milliseconds - p95. |
JMX | jmx["org.apache.cassandra.metrics:name=WriteLatency,type=Table","95thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | Latency: Client request read median | Total latency serving data to clients in milliseconds - median. |
JMX | jmx["org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency","50thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | Latency: Client request read 75 percentile | Total latency serving data to clients in milliseconds - p75. |
JMX | jmx["org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency","75thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | Latency: Client request read 95 percentile | Total latency serving data to clients in milliseconds - p95. |
JMX | jmx["org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency","95thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | Latency: Client request write median | Total latency serving write requests from clients in milliseconds - median. |
JMX | jmx["org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency","50thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | Latency: Client request write 75 percentile | Total latency serving write requests from clients in milliseconds - p75. |
JMX | jmx["org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency","75thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | Latency: Client request write 95 percentile | Total latency serving write requests from clients in milliseconds - p95. |
JMX | jmx["org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency","95thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | KeyCache: Capacity | Cache capacity in bytes. |
JMX | jmx["org.apache.cassandra.metrics:type=Cache,scope=KeyCache,name=Capacity","Value"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Cassandra | KeyCache: Entries | Total number of cache entries. |
JMX | jmx["org.apache.cassandra.metrics:type=Cache,scope=KeyCache,name=Entries","Value"] |
Cassandra | KeyCache: HitRate | All time cache hit rate. |
JMX | jmx["org.apache.cassandra.metrics:type=Cache,scope=KeyCache,name=HitRate","Value"] Preprocessing: - MULTIPLIER: |
Cassandra | KeyCache: Hits per second | Rate of cache hits. |
JMX | jmx["org.apache.cassandra.metrics:type=Cache,scope=KeyCache,name=Hits","Count"] Preprocessing: - CHANGE_PER_SECOND |
Cassandra | KeyCache: Requests per second | Rate of cache requests. |
JMX | jmx["org.apache.cassandra.metrics:type=Cache,scope=KeyCache,name=Requests","Count"] Preprocessing: - CHANGE_PER_SECOND |
Cassandra | KeyCache: Size | Total size of occupied cache, in bytes. |
JMX | jmx["org.apache.cassandra.metrics:type=Cache,scope=KeyCache,name=Size","Value"] |
Cassandra | Client connections: Native | Number of clients connected to this node's native protocol server. |
JMX | jmx["org.apache.cassandra.metrics:type=Client,name=connectedNativeClients","Value"] |
Cassandra | Client connections: Thrift | Number of Thrift clients connected to this node. |
JMX | jmx["org.apache.cassandra.metrics:type=Client,name=connectedThriftClients","Value"] |
Cassandra | Client request: Read per second | The number of local read requests per second. |
JMX | jmx["org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency","Count"] Preprocessing: - CHANGE_PER_SECOND |
Cassandra | Client request: Write per second | The number of local write requests per second. |
JMX | jmx["org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency","Count"] Preprocessing: - CHANGE_PER_SECOND |
Cassandra | Client request: Write Timeouts | Number of write request timeouts encountered. |
JMX | jmx["org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Timeouts","Count"] |
Cassandra | Thread pool MutationStage: Pending tasks | Number of tasks queued up on this pool. MutationStage: Responsible for writes (excluding materialized view and counter writes). |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=MutationStage,name=PendingTasks","Value"] |
Cassandra | Thread pool MutationStage: Currently blocked tasks | Number of tasks that are currently blocked due to queue saturation but on retry will become unblocked. MutationStage: Responsible for writes (excluding materialized view and counter writes). |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=MutationStage,name=CurrentlyBlockedTasks","Count"] |
Cassandra | Thread pool MutationStage: Total blocked tasks | Number of tasks that were blocked due to queue saturation. MutationStage: Responsible for writes (excluding materialized view and counter writes). |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=MutationStage,name=TotalBlockedTasks","Count"] |
Cassandra | Thread pool CounterMutationStage: Pending tasks | Number of tasks queued up on this pool. CounterMutationStage: Responsible for counter writes. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=CounterMutationStage,name=PendingTasks","Value"] |
Cassandra | Thread pool CounterMutationStage: Currently blocked tasks | Number of tasks that are currently blocked due to queue saturation but on retry will become unblocked. CounterMutationStage: Responsible for counter writes. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=CounterMutationStage,name=CurrentlyBlockedTasks","Count"] |
Cassandra | Thread pool CounterMutationStage: Total blocked tasks | Number of tasks that were blocked due to queue saturation. CounterMutationStage: Responsible for counter writes. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=CounterMutationStage,name=TotalBlockedTasks","Count"] |
Cassandra | Thread pool ReadStage: Pending tasks | Number of tasks queued up on this pool. ReadStage: Local reads run on this thread pool. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadStage,name=PendingTasks","Value"] |
Cassandra | Thread pool ReadStage: Currently blocked tasks | Number of tasks that are currently blocked due to queue saturation but on retry will become unblocked. ReadStage: Local reads run on this thread pool. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadStage,name=CurrentlyBlockedTasks","Count"] |
Cassandra | Thread pool ReadStage: Total blocked tasks | Number of tasks that were blocked due to queue saturation. ReadStage: Local reads run on this thread pool. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadStage,name=TotalBlockedTasks","Count"] |
Cassandra | Thread pool ViewMutationStage: Pending tasks | Number of tasks queued up on this pool. ViewMutationStage: Responsible for materialized view writes. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ViewMutationStage,name=PendingTasks","Value"] |
Cassandra | Thread pool ViewMutationStage: Currently blocked tasks | Number of tasks that are currently blocked due to queue saturation but on retry will become unblocked. ViewMutationStage: Responsible for materialized view writes. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ViewMutationStage,name=CurrentlyBlockedTasks","Count"] |
Cassandra | Thread pool ViewMutationStage: Total blocked tasks | Number of tasks that were blocked due to queue saturation. ViewMutationStage: Responsible for materialized view writes. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ViewMutationStage,name=TotalBlockedTasks","Count"] |
Cassandra | Thread pool MemtableFlushWriter: Pending tasks | Number of tasks queued up on this pool. MemtableFlushWriter: Writes memtables to disk. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MemtableFlushWriter,name=PendingTasks","Value"] |
Cassandra | Thread pool MemtableFlushWriter: Currently blocked tasks | Number of tasks that are currently blocked due to queue saturation but on retry will become unblocked. MemtableFlushWriter: Writes memtables to disk. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MemtableFlushWriter,name=CurrentlyBlockedTasks","Count"] |
Cassandra | Thread pool MemtableFlushWriter: Total blocked tasks | Number of tasks that were blocked due to queue saturation. MemtableFlushWriter: Writes memtables to disk. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MemtableFlushWriter,name=TotalBlockedTasks","Count"] |
Cassandra | Thread pool HintsDispatcher: Pending tasks | Number of tasks queued up on this pool. HintsDispatcher: Performs hinted handoff. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=HintsDispatcher,name=PendingTasks","Value"] |
Cassandra | Thread pool HintsDispatcher: Currently blocked tasks | Number of tasks that are currently blocked due to queue saturation but on retry will become unblocked. HintsDispatcher: Performs hinted handoff. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=HintsDispatcher,name=CurrentlyBlockedTasks","Count"] |
Cassandra | Thread pool HintsDispatcher: Total blocked tasks | Number of tasks that were blocked due to queue saturation. HintsDispatcher: Performs hinted handoff. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=HintsDispatcher,name=TotalBlockedTasks","Count"] |
Cassandra | Thread pool MemtablePostFlush: Pending tasks | Number of tasks queued up on this pool. MemtablePostFlush: Cleans up commit log after memtable is written to disk. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MemtablePostFlush,name=PendingTasks","Value"] |
Cassandra | Thread pool MemtablePostFlush: Currently blocked tasks | Number of tasks that are currently blocked due to queue saturation but on retry will become unblocked. MemtablePostFlush: Cleans up commit log after memtable is written to disk. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MemtablePostFlush,name=CurrentlyBlockedTasks","Count"] |
Cassandra | Thread pool MemtablePostFlush: Total blocked tasks | Number of tasks that were blocked due to queue saturation. MemtablePostFlush: Cleans up commit log after memtable is written to disk. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MemtablePostFlush,name=TotalBlockedTasks","Count"] |
Cassandra | Thread pool MigrationStage: Pending tasks | Number of tasks queued up on this pool. MigrationStage: Runs schema migrations. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MigrationStage,name=PendingTasks","Value"] |
Cassandra | Thread pool MigrationStage: Currently blocked tasks | Number of tasks that are currently blocked due to queue saturation but on retry will become unblocked. MigrationStage: Runs schema migrations. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MigrationStage,name=CurrentlyBlockedTasks","Count"] |
Cassandra | Thread pool MigrationStage: Total blocked tasks | Number of tasks that were blocked due to queue saturation. MigrationStage: Runs schema migrations. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MigrationStage,name=TotalBlockedTasks","Count"] |
Cassandra | Thread pool MiscStage: Pending tasks | Number of tasks queued up on this pool. MiscStage: Miscellaneous tasks run here. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MiscStage,name=PendingTasks","Value"] |
Cassandra | Thread pool MiscStage: Currently blocked tasks | Number of tasks that are currently blocked due to queue saturation but on retry will become unblocked. MiscStage: Miscellaneous tasks run here. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MiscStage,name=CurrentlyBlockedTasks","Count"] |
Cassandra | Thread pool MiscStage: Total blocked tasks | Number of tasks that were blocked due to queue saturation. MiscStage: Miscellaneous tasks run here. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MiscStage,name=TotalBlockedTasks","Count"] |
Cassandra | Thread pool SecondaryIndexManagement: Pending tasks | Number of tasks queued up on this pool. SecondaryIndexManagement: Performs updates to secondary indexes. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=SecondaryIndexManagement,name=PendingTasks","Value"] |
Cassandra | Thread pool SecondaryIndexManagement: Currently blocked tasks | Number of tasks that are currently blocked due to queue saturation but on retry will become unblocked. SecondaryIndexManagement: Performs updates to secondary indexes. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=SecondaryIndexManagement,name=CurrentlyBlockedTasks","Count"] |
Cassandra | Thread pool SecondaryIndexManagement: Total blocked tasks | Number of tasks that were blocked due to queue saturation. SecondaryIndexManagement: Performs updates to secondary indexes. |
JMX | jmx["org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=SecondaryIndexManagement,name=TotalBlockedTasks","Count"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: SSTables per read 75 percentile | The number of SSTable data files accessed per read - p75. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=SSTablesPerReadHistogram","75thPercentile"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: SSTables per read 95 percentile | The number of SSTable data files accessed per read - p95. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=SSTablesPerReadHistogram","95thPercentile"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Tombstone scanned 75 percentile | Number of tombstones scanned per read - p75. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=TombstoneScannedHistogram","75thPercentile"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Tombstone scanned 95 percentile | Number of tombstones scanned per read - p95. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=TombstoneScannedHistogram","95thPercentile"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Waiting on free memtable space 75 percentile | The time spent waiting for free memtable space either on- or off-heap - p75. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=WaitingOnFreeMemtableSpace","75thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Waiting on free memtable space 95 percentile | The time spent waiting for free memtable space either on- or off-heap - p95. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=WaitingOnFreeMemtableSpace","95thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Col update time delta 75 percentile | The column update time delta - p75. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=ColUpdateTimeDeltaHistogram","75thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Col update time delta 95 percentile | The column update time delta - p95. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=ColUpdateTimeDeltaHistogram","95thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Bloom filter false ratio | The ratio of Bloom filter false positives to total checks. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=BloomFilterFalseRatio","Value"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Compression ratio | The compression ratio for all SSTables. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=CompressionRatio","Value"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: KeyCache hit rate | The key cache hit rate. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=KeyCacheHitRate","Value"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Live SSTables | Number of "live" (in use) SSTables. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=LiveSSTableCount","Value"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Max partition size | The size of the largest compacted partition. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=MaxPartitionSize","Value"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Mean partition size | The average size of compacted partitions. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=MeanPartitionSize","Value"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Pending compactions | The number of pending compactions. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=PendingCompactions","Value"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Snapshots size | The disk space truly used by snapshots. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=SnapshotsSize","Value"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Compaction bytes written | The amount of data that was compacted since (re)start. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=CompactionBytesWritten","Count"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Bytes flushed | The amount of data that was flushed since (re)start. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=BytesFlushed","Count"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Pending flushes | The number of pending flushes. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=PendingFlushes","Count"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Live disk space used | The disk space used by "live" SSTables (only counts in use files). |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=LiveDiskSpaceUsed","Count"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Disk space used | Disk space used. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=TotalDiskSpaceUsed","Count"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Out of row cache hits | The number of row cache hits that did not satisfy the query filter and went to disk. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=RowCacheHitOutOfRange","Count"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Row cache hits | The number of row cache hits. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=RowCacheHit","Count"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Row cache misses | The number of table row cache misses. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=RowCacheMiss","Count"] |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Read latency 75 percentile | The latency of reads from disk - p75, in milliseconds. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=ReadLatency","75thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Read latency 95 percentile | The latency of reads from disk - p95, in milliseconds. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=ReadLatency","95thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Read per second | The number of local read requests per second. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=ReadLatency","Count"] Preprocessing: - CHANGEPERSECOND |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Write latency 75 percentile | The latency of writes to disk - p75, in milliseconds. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=WriteLatency","75thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Write latency 95 percentile | The latency of writes to disk - p95, in milliseconds. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=WriteLatency","95thPercentile"] Preprocessing: - MULTIPLIER: |
Cassandra | {#JMXKEYSPACE}.{#JMXSCOPE}: Write per second | The number of local write requests per second. |
JMX | jmx["org.apache.cassandra.metrics:type=Table,keyspace={#JMXKEYSPACE},scope={#JMXSCOPE},name=WriteLatency","Count"] Preprocessing: - CHANGEPERSECOND |
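The "Read per second" and "Write per second" items above take the monotonically increasing `Count` attribute of the latency MBeans and apply Zabbix's "Change per second" preprocessing step, which divides the counter delta by the elapsed time between polls. A minimal sketch of that calculation (the function name and values are illustrative, not part of the template or the Zabbix API):

```python
def change_per_second(prev_value, prev_ts, value, ts):
    """Approximate Zabbix 'Change per second' preprocessing:
    rate = (value - prev_value) / (ts - prev_ts).
    Assumes a monotonically increasing counter and ts > prev_ts."""
    if ts <= prev_ts:
        raise ValueError("timestamps must strictly increase between polls")
    return (value - prev_value) / (ts - prev_ts)

# Example: WriteLatency Count grew from 12000 to 12600 over a 60 s poll interval
rate = change_per_second(12000, 0, 12600, 60)
print(rate)  # 10.0 local write requests per second
```

Note that if the counter resets (e.g. the Cassandra node restarts), the delta goes negative; Zabbix discards such samples rather than reporting a negative rate.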
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
There are down nodes in cluster | - |
last(/Apache Cassandra by JMX/jmx["org.apache.cassandra.net:type=FailureDetector","DownEndpointCount"])>0 |
AVERAGE | |
Version has changed | Cassandra version has changed. Ack to close. |
last(/Apache Cassandra by JMX/jmx["org.apache.cassandra.db:type=StorageService","ReleaseVersion"],#1)<>last(/Apache Cassandra by JMX/jmx["org.apache.cassandra.db:type=StorageService","ReleaseVersion"],#2) and length(last(/Apache Cassandra by JMX/jmx["org.apache.cassandra.db:type=StorageService","ReleaseVersion"]))>0 |
INFO | Manual close: YES |
Failed to fetch info data | Zabbix has not received any data for the items during the last 15 minutes. |
nodata(/Apache Cassandra by JMX/jmx["org.apache.cassandra.metrics:type=Storage,name=Load","Count"],15m)=1 |
WARNING | |
Too many storage exceptions | - |
min(/Apache Cassandra by JMX/jmx["org.apache.cassandra.metrics:type=Storage,name=Exceptions","Count"],5m)>0 |
WARNING | |
Many pending tasks | - |
min(/Apache Cassandra by JMX/jmx["org.apache.cassandra.metrics:type=Compaction,name=PendingTasks","Value"],15m)>{$CASSANDRA.PENDING_TASKS.MAX.WARN} |
WARNING | Depends on: - Too many pending tasks |
Too many pending tasks | - |
min(/Apache Cassandra by JMX/jmx["org.apache.cassandra.metrics:type=Compaction,name=PendingTasks","Value"],15m)>{$CASSANDRA.PENDING_TASKS.MAX.HIGH} |
AVERAGE |
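Several triggers above use expressions of the form `min(/host/item,Nm)>threshold`. Because the *minimum* over the window is compared against the threshold, the trigger fires only when every sample in the window exceeds it, which filters out short spikes. A small sketch of that semantics (function name and sample values are illustrative only):

```python
def min_over_window_exceeds(values, threshold):
    """Emulate a Zabbix trigger of the form min(/host/item,15m) > threshold:
    it fires only when EVERY sample collected within the evaluation window
    is above the threshold, so a brief spike does not raise a problem."""
    return min(values) > threshold

# A single spike to 30 does not fire the trigger with threshold 10...
print(min_over_window_exceeds([2, 30, 3], 10))    # False
# ...but a sustained elevation does.
print(min_over_window_exceeds([12, 30, 15], 10))  # True
```

The "Many pending tasks" / "Too many pending tasks" pair uses the same expression with two different thresholds ({$CASSANDRA.PENDING_TASKS.MAX.WARN} and {$CASSANDRA.PENDING_TASKS.MAX.HIGH}); the dependency suppresses the WARNING trigger while the AVERAGE one is active.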
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.