For Zabbix version: 6.2 and higher
The template to monitor Apache Zookeeper by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
This template was tested on:
See Zabbix template operation for basic instructions.
This template works with standalone and cluster instances. Metrics are collected from each Zookeeper node by requests to AdminServer. By default, AdminServer is enabled and listens on port 8080. You can enable or configure the AdminServer parameters according to the official documentation. Don't forget to change the macros {$ZOOKEEPER.COMMAND_URL}, {$ZOOKEEPER.PORT}, and {$ZOOKEEPER.SCHEME} accordingly.
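If AdminServer is disabled on your nodes, a minimal sketch of the relevant `zoo.cfg` lines might look like the following (these are the documented AdminServer properties; the values shown are the defaults this template assumes, so adjust them to your environment):

```
# zoo.cfg - embedded Jetty AdminServer settings (defaults shown)
admin.enableServer=true
admin.serverPort=8080
admin.commandURL=/commands
```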
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$ZOOKEEPER.COMMAND_URL} | The URL for listing and issuing commands relative to the root URL (admin.commandURL). | `commands` |
{$ZOOKEEPER.FILE_DESCRIPTORS.MAX.WARN} | Maximum percentage of file descriptors usage alert threshold (for trigger expression). | `85` |
{$ZOOKEEPER.OUTSTANDING_REQ.MAX.WARN} | Maximum number of outstanding requests (for trigger expression). | `10` |
{$ZOOKEEPER.PENDING_SYNCS.MAX.WARN} | Maximum number of pending syncs from the followers (for trigger expression). | `10` |
{$ZOOKEEPER.PORT} | The port the embedded Jetty server listens on (admin.serverPort). | `8080` |
{$ZOOKEEPER.SCHEME} | Request scheme, which may be http or https. | `http` |
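To check that the macro values are correct before linking the template, you can query AdminServer by hand; a sketch, assuming the default macros above and a node reachable as `zk-node-1` (a placeholder hostname):

```
# List the commands AdminServer exposes
curl http://zk-node-1:8080/commands

# Fetch server metrics; on recent ZooKeeper versions the "monitor"
# command mirrors the mntr four-letter word
curl http://zk-node-1:8080/commands/monitor
```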
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Clients discovery | Get the list of client connections. Note: depending on the number of client connections, this operation may be expensive (i.e., it may impact server performance). | HTTP_AGENT | zookeeper.clients Preprocessing: - JAVASCRIPT |
Leader metrics discovery | Additional metrics for the leader node. | DEPENDENT | zookeeper.metrics.leader Preprocessing: - JSONPATH - JAVASCRIPT |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Zabbix raw items | Zookeeper: Get server metrics | - | HTTP_AGENT | zookeeper.get_metrics |
Zabbix raw items | Zookeeper: Get connections stats | Get information on client connections to the server. Note: depending on the number of client connections, this operation may be expensive (i.e., it may impact server performance). | HTTP_AGENT | zookeeper.get_connections_stats |
Zookeeper | Zookeeper: Server mode | Mode of the server. In an ensemble, this may be either leader or follower; otherwise, it is standalone. | DEPENDENT | zookeeper.server_state Preprocessing: - JSONPATH - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Zookeeper | Zookeeper: Uptime | Uptime that a peer has been in a table leading/following/observing state. | DEPENDENT | zookeeper.uptime Preprocessing: - JSONPATH - MULTIPLIER |
Zookeeper | Zookeeper: Version | Version of the Zookeeper server. | DEPENDENT | zookeeper.version Preprocessing: - JSONPATH - REGEX - DISCARD_UNCHANGED_HEARTBEAT |
Zookeeper | Zookeeper: Approximate data size | Data tree size in bytes. The size includes the znode path and its value. | DEPENDENT | zookeeper.approximate_data_size Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: File descriptors, max | Maximum number of file descriptors that a zookeeper server can open. | DEPENDENT | zookeeper.max_file_descriptor_count Preprocessing: - JSONPATH - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Zookeeper | Zookeeper: File descriptors, open | Number of file descriptors that a zookeeper server has open. | DEPENDENT | zookeeper.open_file_descriptor_count Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Outstanding requests | The number of queued requests when the server is under load and is receiving more sustained requests than it can process. | DEPENDENT | zookeeper.outstanding_requests Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Commit per sec | The number of commits performed per second. | DEPENDENT | zookeeper.commit_count.rate Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper: Diff syncs per sec | Number of diff syncs performed per second. | DEPENDENT | zookeeper.diff_count.rate Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper: Snap syncs per sec | Number of snap syncs performed per second. | DEPENDENT | zookeeper.snap_count.rate Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper: Looking per sec | Rate of transitions into the looking state. | DEPENDENT | zookeeper.looking_count.rate Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper: Alive connections | Number of active clients connected to a zookeeper server. | DEPENDENT | zookeeper.num_alive_connections Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Global sessions | Number of global sessions. | DEPENDENT | zookeeper.global_sessions Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Local sessions | Number of local sessions. | DEPENDENT | zookeeper.local_sessions Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Drop connections per sec | Rate of connection drops. | DEPENDENT | zookeeper.connection_drop_count.rate Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper: Rejected connections per sec | Rate of connections rejected. | DEPENDENT | zookeeper.connection_rejected.rate Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper: Revalidate connections per sec | Rate of connection revalidations. | DEPENDENT | zookeeper.connection_revalidate_count.rate Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper: Revalidate per sec | Rate of revalidations. | DEPENDENT | zookeeper.revalidate_count.rate Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper: Latency, max | The maximum amount of time it takes for the server to respond to a client request. | DEPENDENT | zookeeper.max_latency Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Latency, min | The minimum amount of time it takes for the server to respond to a client request. | DEPENDENT | zookeeper.min_latency Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Latency, avg | The average amount of time it takes for the server to respond to a client request. | DEPENDENT | zookeeper.avg_latency Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Znode count | The number of znodes in the ZooKeeper namespace (the data). | DEPENDENT | zookeeper.znode_count Preprocessing: - JSONPATH - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Zookeeper | Zookeeper: Ephemeral nodes count | Number of ephemeral nodes that a zookeeper server has in its data tree. | DEPENDENT | zookeeper.ephemerals_count Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Watch count | Number of watches currently set on the local ZooKeeper process. | DEPENDENT | zookeeper.watch_count Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Packets sent per sec | The number of zookeeper packets sent from a server per second. | DEPENDENT | zookeeper.packets_sent Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper: Packets received per sec | The number of zookeeper packets received by a server per second. | DEPENDENT | zookeeper.packets_received.rate Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper: Bytes received per sec | Number of bytes received per second. | DEPENDENT | zookeeper.bytes_received_count.rate Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper: Election time, avg | Time between entering and leaving election. | DEPENDENT | zookeeper.avg_election_time Preprocessing: - JAVASCRIPT |
Zookeeper | Zookeeper: Elections | Number of elections that have happened. | DEPENDENT | zookeeper.cnt_election_time Preprocessing: - JAVASCRIPT |
Zookeeper | Zookeeper: Fsync time, avg | Time to fsync the transaction log. | DEPENDENT | zookeeper.avg_fsynctime Preprocessing: - JAVASCRIPT |
Zookeeper | Zookeeper: Fsync | Count of performed fsyncs. | DEPENDENT | zookeeper.cnt_fsynctime Preprocessing: - JAVASCRIPT: `var metrics = JSON.parse(value); return metrics.cnt_fsynctime || metrics.fsynctime_count` |
Zookeeper | Zookeeper: Snapshot write time, avg | Average time to write a snapshot. | DEPENDENT | zookeeper.avg_snapshottime Preprocessing: - JAVASCRIPT |
Zookeeper | Zookeeper: Snapshot writes | Count of performed snapshot writes. | DEPENDENT | zookeeper.cnt_snapshottime Preprocessing: - JAVASCRIPT: `var metrics = JSON.parse(value); return metrics.snapshottime_count || metrics.cnt_snapshottime` |
Zookeeper | Zookeeper: Pending syncs{#SINGLETON} | Number of pending syncs to carry out to ZooKeeper ensemble followers. | DEPENDENT | zookeeper.pending_syncs[{#SINGLETON}] Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Quorum size{#SINGLETON} | - | DEPENDENT | zookeeper.quorum_size[{#SINGLETON}] Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Synced followers{#SINGLETON} | Number of synced followers reported when a node server_state is leader. | DEPENDENT | zookeeper.synced_followers[{#SINGLETON}] Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Synced non-voting follower{#SINGLETON} | Number of synced non-voting followers reported when a node server_state is leader. | DEPENDENT | zookeeper.synced_non_voting_followers[{#SINGLETON}] Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Synced observers{#SINGLETON} | Number of synced observers. | DEPENDENT | zookeeper.synced_observers[{#SINGLETON}] Preprocessing: - JSONPATH |
Zookeeper | Zookeeper: Learners{#SINGLETON} | Number of learners. | DEPENDENT | zookeeper.learners[{#SINGLETON}] Preprocessing: - JSONPATH |
Zookeeper | Zookeeper client {#TYPE} [{#CLIENT}]: Latency, max | The maximum amount of time it takes for the server to respond to a client request. | DEPENDENT | zookeeper.max_latency[{#TYPE},{#CLIENT}] Preprocessing: - JSONPATH |
Zookeeper | Zookeeper client {#TYPE} [{#CLIENT}]: Latency, min | The minimum amount of time it takes for the server to respond to a client request. | DEPENDENT | zookeeper.min_latency[{#TYPE},{#CLIENT}] Preprocessing: - JSONPATH |
Zookeeper | Zookeeper client {#TYPE} [{#CLIENT}]: Latency, avg | The average amount of time it takes for the server to respond to a client request. | DEPENDENT | zookeeper.avg_latency[{#TYPE},{#CLIENT}] Preprocessing: - JSONPATH |
Zookeeper | Zookeeper client {#TYPE} [{#CLIENT}]: Packets sent per sec | The number of packets sent. | DEPENDENT | zookeeper.packets_sent[{#TYPE},{#CLIENT}] Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper client {#TYPE} [{#CLIENT}]: Packets received per sec | The number of packets received. | DEPENDENT | zookeeper.packets_received[{#TYPE},{#CLIENT}] Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zookeeper | Zookeeper client {#TYPE} [{#CLIENT}]: Outstanding requests | The number of queued requests when the server is under load and is receiving more sustained requests than it can process. | DEPENDENT | zookeeper.outstanding_requests[{#TYPE},{#CLIENT}] Preprocessing: - JSONPATH |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Zookeeper: Server mode has changed | Zookeeper node state has changed. Ack to close. | last(/Zookeeper by HTTP/zookeeper.server_state,#1)<>last(/Zookeeper by HTTP/zookeeper.server_state,#2) and length(last(/Zookeeper by HTTP/zookeeper.server_state))>0 | INFO | Manual close: YES |
Zookeeper: Failed to fetch info data | Zabbix has not received data for the items for the last 10 minutes. | nodata(/Zookeeper by HTTP/zookeeper.uptime,10m)=1 | WARNING | Manual close: YES |
Zookeeper: Version has changed | Zookeeper version has changed. Ack to close. | last(/Zookeeper by HTTP/zookeeper.version,#1)<>last(/Zookeeper by HTTP/zookeeper.version,#2) and length(last(/Zookeeper by HTTP/zookeeper.version))>0 | INFO | Manual close: YES |
Zookeeper: Too many file descriptors used | The number of file descriptors used is more than {$ZOOKEEPER.FILE_DESCRIPTORS.MAX.WARN}% of the available number of file descriptors. | min(/Zookeeper by HTTP/zookeeper.open_file_descriptor_count,5m) * 100 / last(/Zookeeper by HTTP/zookeeper.max_file_descriptor_count) > {$ZOOKEEPER.FILE_DESCRIPTORS.MAX.WARN} | WARNING | |
Zookeeper: Too many queued requests | Number of queued requests in the server. This goes up when the server receives more requests than it can process. | min(/Zookeeper by HTTP/zookeeper.outstanding_requests,5m)>{$ZOOKEEPER.OUTSTANDING_REQ.MAX.WARN} | AVERAGE | Manual close: YES |
Zookeeper: Too many pending syncs | - | min(/Zookeeper by HTTP/zookeeper.pending_syncs[{#SINGLETON}],5m)>{$ZOOKEEPER.PENDING_SYNCS.MAX.WARN} | AVERAGE | Manual close: YES |
Zookeeper: Too few active followers | The number of followers should equal the total size of your ZooKeeper ensemble minus 1 (the leader is not included in the follower count). If the ensemble fails to maintain quorum, all automatic failover features are suspended. | last(/Zookeeper by HTTP/zookeeper.synced_followers[{#SINGLETON}]) < last(/Zookeeper by HTTP/zookeeper.quorum_size[{#SINGLETON}])-1 | AVERAGE | |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher. This template is designed to monitor internal Zabbix metrics on the remote Zabbix server.
Specify the address of the remote Zabbix server by changing the {$ADDRESS} and {$PORT} macros. Don't forget to adjust the `StatsAllowedIP` parameter in the remote server's configuration file to allow the collection of statistics.
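For example, a minimal sketch of the relevant line in the remote server's `zabbix_server.conf`, assuming the monitoring Zabbix server is reachable at 192.0.2.10 (a placeholder address):

```
# zabbix_server.conf on the remote (monitored) Zabbix server:
# allow the monitoring server to request internal statistics
StatsAllowedIP=192.0.2.10
```

Restart the remote Zabbix server after changing the parameter for it to take effect.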
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$ADDRESS} | - | `` |
{$PORT} | - | `` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
High availability cluster node discovery | LLD rule with item and trigger prototypes for the node discovery. | DEPENDENT | zabbix.nodes.discovery Preprocessing: - JSONPATH |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Cluster | Cluster node [{#NODE.NAME}]: Stats | Provides the statistics of a node. | DEPENDENT | zabbix.nodes.stats[{#NODE.ID}] Preprocessing: - JSONPATH |
Cluster | Cluster node [{#NODE.NAME}]: Address | The IPv4 address of a node. | DEPENDENT | zabbix.nodes.address[{#NODE.ID}] Preprocessing: - JSONPATH - DISCARD_UNCHANGED_HEARTBEAT |
Cluster | Cluster node [{#NODE.NAME}]: Last access time | Last access time. | DEPENDENT | zabbix.nodes.lastaccess.time[{#NODE.ID}] Preprocessing: - JSONPATH |
Cluster | Cluster node [{#NODE.NAME}]: Last access age | The time between the database's `unix_timestamp()` and the last access time. | DEPENDENT | zabbix.nodes.lastaccess.age[{#NODE.ID}] Preprocessing: - JSONPATH |
Cluster | Cluster node [{#NODE.NAME}]: Status | The status of a node. | DEPENDENT | zabbix.nodes.status[{#NODE.ID}] Preprocessing: - JSONPATH - DISCARD_UNCHANGED_HEARTBEAT |
Zabbix raw items | Remote Zabbix server: Zabbix stats | The master item of Zabbix server statistics. | INTERNAL | zabbix[stats,{$ADDRESS},{$PORT}] |
Zabbix server | Remote Zabbix server: Zabbix stats queue over 10m | The number of monitored items in the queue, which are delayed at least by 10 minutes. | INTERNAL | zabbix[stats,{$ADDRESS},{$PORT},queue,10m] Preprocessing: - JSONPATH |
Zabbix server | Remote Zabbix server: Zabbix stats queue | The number of monitored items in the queue, which are delayed at least by 6 seconds. | INTERNAL | zabbix[stats,{$ADDRESS},{$PORT},queue] Preprocessing: - JSONPATH |
Zabbix server | Remote Zabbix server: Utilization of alert manager internal processes, in % | The average percentage of the time during which the alert manager processes have been busy for the last minute. | DEPENDENT | process.alert_manager.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "alert manager" processes started. |
Zabbix server | Remote Zabbix server: Utilization of alert syncer internal processes, in % | The average percentage of the time during which the alert syncer processes have been busy for the last minute. | DEPENDENT | process.alert_syncer.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "alert syncer" processes started. |
Zabbix server | Remote Zabbix server: Utilization of alerter internal processes, in % | The average percentage of the time during which the alerter processes have been busy for the last minute. | DEPENDENT | process.alerter.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "alerter" processes started. |
Zabbix server | Remote Zabbix server: Utilization of availability manager internal processes, in % | The average percentage of the time during which the availability manager processes have been busy for the last minute. | DEPENDENT | process.availability_manager.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "availability manager" processes started. |
Zabbix server | Remote Zabbix server: Utilization of configuration syncer internal processes, in % | The average percentage of the time during which the configuration syncer processes have been busy for the last minute. | DEPENDENT | process.configuration_syncer.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "configuration syncer" processes started. |
Zabbix server | Remote Zabbix server: Utilization of discoverer data collector processes, in % | The average percentage of the time during which the discoverer processes have been busy for the last minute. | DEPENDENT | process.discoverer.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "discoverer" processes started. |
Zabbix server | Remote Zabbix server: Utilization of escalator internal processes, in % | The average percentage of the time during which the escalator processes have been busy for the last minute. | DEPENDENT | process.escalator.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "escalator" processes started. |
Zabbix server | Remote Zabbix server: Utilization of history poller data collector processes, in % | The average percentage of the time during which the history poller processes have been busy for the last minute. | DEPENDENT | process.history_poller.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "history poller" processes started. |
Zabbix server | Remote Zabbix server: Utilization of ODBC poller data collector processes, in % | The average percentage of the time during which the ODBC poller processes have been busy for the last minute. | DEPENDENT | process.odbc_poller.avg.busy Preprocessing: - JSONPATH |
Zabbix server | Remote Zabbix server: Utilization of history syncer internal processes, in % | The average percentage of the time during which the history syncer processes have been busy for the last minute. | DEPENDENT | process.history_syncer.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "history syncer" processes started. |
Zabbix server | Remote Zabbix server: Utilization of housekeeper internal processes, in % | The average percentage of the time during which the housekeeper processes have been busy for the last minute. | DEPENDENT | process.housekeeper.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "housekeeper" processes started. |
Zabbix server | Remote Zabbix server: Utilization of http poller data collector processes, in % | The average percentage of the time during which the http poller processes have been busy for the last minute. | DEPENDENT | process.http_poller.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "http poller" processes started. |
Zabbix server | Remote Zabbix server: Utilization of icmp pinger data collector processes, in % | The average percentage of the time during which the icmp pinger processes have been busy for the last minute. | DEPENDENT | process.icmp_pinger.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "icmp pinger" processes started. |
Zabbix server | Remote Zabbix server: Utilization of ipmi manager internal processes, in % | The average percentage of the time during which the ipmi manager processes have been busy for the last minute. | DEPENDENT | process.ipmi_manager.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "ipmi manager" processes started. |
Zabbix server | Remote Zabbix server: Utilization of ipmi poller data collector processes, in % | The average percentage of the time during which the ipmi poller processes have been busy for the last minute. | DEPENDENT | process.ipmi_poller.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "ipmi poller" processes started. |
Zabbix server | Remote Zabbix server: Utilization of java poller data collector processes, in % | The average percentage of the time during which the java poller processes have been busy for the last minute. | DEPENDENT | process.java_poller.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "java poller" processes started. |
Zabbix server | Remote Zabbix server: Utilization of LLD manager internal processes, in % | The average percentage of the time during which the lld manager processes have been busy for the last minute. | DEPENDENT | process.lld_manager.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "LLD manager" processes started. |
Zabbix server | Remote Zabbix server: Utilization of LLD worker internal processes, in % | The average percentage of the time during which the lld worker processes have been busy for the last minute. | DEPENDENT | process.lld_worker.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "LLD worker" processes started. |
Zabbix server | Remote Zabbix server: Utilization of poller data collector processes, in % | The average percentage of the time during which the poller processes have been busy for the last minute. | DEPENDENT | process.poller.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "poller" processes started. |
Zabbix server | Remote Zabbix server: Utilization of preprocessing worker internal processes, in % | The average percentage of the time during which the preprocessing worker processes have been busy for the last minute. | DEPENDENT | process.preprocessing_worker.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "preprocessing worker" processes started. |
Zabbix server | Remote Zabbix server: Utilization of preprocessing manager internal processes, in % | The average percentage of the time during which the preprocessing manager processes have been busy for the last minute. | DEPENDENT | process.preprocessing_manager.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "preprocessing manager" processes started. |
Zabbix server | Remote Zabbix server: Utilization of proxy poller data collector processes, in % | The average percentage of the time during which the proxy poller processes have been busy for the last minute. | DEPENDENT | process.proxy_poller.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "proxy poller" processes started. |
Zabbix server | Remote Zabbix server: Utilization of report manager internal processes, in % | The average percentage of the time during which the report manager processes have been busy for the last minute. | DEPENDENT | process.report_manager.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "report manager" processes started. |
Zabbix server | Remote Zabbix server: Utilization of report writer internal processes, in % | The average percentage of the time during which the report writer processes have been busy for the last minute. | DEPENDENT | process.report_writer.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "report writer" processes started. |
Zabbix server | Remote Zabbix server: Utilization of self-monitoring internal processes, in % | The average percentage of the time during which the self-monitoring processes have been busy for the last minute. | DEPENDENT | process.self-monitoring.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "self-monitoring" processes started. |
Zabbix server | Remote Zabbix server: Utilization of snmp trapper data collector processes, in % | The average percentage of the time during which the snmp trapper processes have been busy for the last minute. | DEPENDENT | process.snmp_trapper.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "snmp trapper" processes started. |
Zabbix server | Remote Zabbix server: Utilization of task manager internal processes, in % | The average percentage of the time during which the task manager processes have been busy for the last minute. | DEPENDENT | process.task_manager.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "task manager" processes started. |
Zabbix server | Remote Zabbix server: Utilization of timer internal processes, in % | The average percentage of the time during which the timer processes have been busy for the last minute. | DEPENDENT | process.timer.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "timer" processes started. |
Zabbix server | Remote Zabbix server: Utilization of service manager internal processes, in % | The average percentage of the time during which the service manager processes have been busy for the last minute. | DEPENDENT | process.service_manager.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "service manager" processes started. |
Zabbix server | Remote Zabbix server: Utilization of trigger housekeeper internal processes, in % | The average percentage of the time during which the trigger housekeeper processes have been busy for the last minute. | DEPENDENT | process.trigger_housekeeper.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "trigger housekeeper" processes started. |
Zabbix server | Remote Zabbix server: Utilization of trapper data collector processes, in % | The average percentage of the time during which the trapper processes have been busy for the last minute. | DEPENDENT | process.trapper.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "trapper" processes started. |
Zabbix server | Remote Zabbix server: Utilization of unreachable poller data collector processes, in % | The average percentage of the time during which the unreachable poller processes have been busy for the last minute. | DEPENDENT | process.unreachable_poller.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "unreachable poller" processes started. |
Zabbix server | Remote Zabbix server: Utilization of vmware data collector processes, in % | The average percentage of the time during which the vmware collector processes have been busy for the last minute. | DEPENDENT | process.vmware_collector.avg.busy Preprocessing: - JSONPATH ⛔️ON_FAIL: CUSTOM_ERROR -> No "vmware collector" processes started. |
Zabbix server | Remote Zabbix server: Configuration cache, % used | The availability statistics of Zabbix configuration cache. The percentage of used data buffer. | DEPENDENT | rcache.buffer.pused Preprocessing: - JSONPATH |
Zabbix server | Remote Zabbix server: Trend function cache, % of unique requests | The effectiveness statistics of Zabbix trend function cache. The percentage of cached items calculated from the sum of cached items plus requests. A low percentage most likely means that the cache size can be reduced. | DEPENDENT | tcache.pitems Preprocessing: - JSONPATH ⛔️ON_FAIL |
Zabbix server | Remote Zabbix server: Trend function cache, % of misses | The effectiveness statistics of Zabbix trend function cache. The percentage of cache misses. | DEPENDENT | tcache.pmisses Preprocessing: - JSONPATH ⛔️ON_FAIL |
Zabbix server | Remote Zabbix server: Value cache, % used | The availability statistics of Zabbix value cache. The percentage of used data buffer. | DEPENDENT | vcache.buffer.pused Preprocessing: - JSONPATH |
Zabbix server | Remote Zabbix server: Value cache hits | The effectiveness statistics of Zabbix value cache. The number of cache hits (history values taken from the cache). | DEPENDENT | vcache.cache.hits Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zabbix server | Remote Zabbix server: Value cache misses | The effectiveness statistics of Zabbix value cache. The number of cache misses (history values taken from the database). | DEPENDENT | vcache.cache.misses Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zabbix server | Remote Zabbix server: Value cache operating mode | The operating mode of the value cache. | DEPENDENT | vcache.cache.mode Preprocessing: - JSONPATH |
Zabbix server | Remote Zabbix server: Version | The version of the Zabbix server. | DEPENDENT | version Preprocessing: - JSONPATH - DISCARD_UNCHANGED_HEARTBEAT |
Zabbix server | Remote Zabbix server: VMware cache, % used | The availability statistics of Zabbix vmware cache. The percentage of used data buffer. | DEPENDENT | vmware.buffer.pused Preprocessing: - JSONPATH ⛔️ON_FAIL |
Zabbix server | Remote Zabbix server: History write cache, % used | The statistics and availability of Zabbix write cache. The percentage of used history buffer. The history cache is used to store item values. A high number indicates performance problems on the database side. | DEPENDENT | wcache.history.pused Preprocessing: - JSONPATH |
Zabbix server | Remote Zabbix server: History index cache, % used | The statistics and availability of Zabbix write cache. The percentage of used history index buffer. The history index cache is used to index values stored in the history cache. | DEPENDENT | wcache.index.pused Preprocessing: - JSONPATH |
Zabbix server | Remote Zabbix server: Trend write cache, % used | The statistics and availability of Zabbix write cache. The percentage of used trend buffer. The trend cache stores the aggregate of all items that have received data for the current hour. | DEPENDENT | wcache.trend.pused Preprocessing: - JSONPATH |
Zabbix server | Remote Zabbix server: Number of processed values per second | The statistics and availability of Zabbix write cache. The total number of values processed by Zabbix server or Zabbix proxy, except unsupported items. | DEPENDENT | wcache.values Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zabbix server | Remote Zabbix server: Number of processed numeric (float) values per second | The statistics and availability of Zabbix write cache. The number of processed float values. | DEPENDENT | wcache.values.float Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zabbix server | Remote Zabbix server: Number of processed log values per second | The statistics and availability of Zabbix write cache. The number of processed log values. | DEPENDENT | wcache.values.log Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zabbix server | Remote Zabbix server: Number of processed not supported values per second | The statistics and availability of Zabbix write cache. The number of times the item processing resulted in an item becoming unsupported or keeping that state. | DEPENDENT | wcache.values.not_supported Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zabbix server | Remote Zabbix server: Number of processed character values per second | The statistics and availability of Zabbix write cache. The number of processed character/string values. | DEPENDENT | wcache.values.str Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zabbix server | Remote Zabbix server: Number of processed text values per second | The statistics and availability of Zabbix write cache. The number of processed text values. | DEPENDENT | wcache.values.text Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Zabbix server | Remote Zabbix server: LLD queue | The count of values enqueued in the low-level discovery processing queue. | DEPENDENT | lld_queue Preprocessing: - JSONPATH |
Zabbix server | Remote Zabbix server: Preprocessing queue | The count of values enqueued in the preprocessing queue. | DEPENDENT | preprocessing_queue Preprocessing: - JSONPATH |
Zabbix server | Remote Zabbix server: Number of processed numeric (unsigned) values per second | The statistics and availability of Zabbix write cache. The number of processed numeric (unsigned) values. | DEPENDENT | wcache.values.uint Preprocessing: - JSONPATH - CHANGE_PER_SECOND |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Cluster node [{#NODE.NAME}]: Status changed | The state of the node has changed. Confirm to close. | last(/Remote Zabbix server health/zabbix.nodes.status[{#NODE.ID}],#1)<>last(/Remote Zabbix server health/zabbix.nodes.status[{#NODE.ID}],#2) | INFO | Manual close: YES |
Remote Zabbix server: More than 100 items having missing data for more than 10 minutes | More than 100 monitored items are missing data for more than 10 minutes. | min(/Remote Zabbix server health/zabbix[stats,{$ADDRESS},{$PORT},queue,10m],10m)>100 | WARNING | |
Remote Zabbix server: Utilization of alert manager processes is high | - | avg(/Remote Zabbix server health/process.alert_manager.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.alert_manager.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of alert syncer processes is high | - | avg(/Remote Zabbix server health/process.alert_syncer.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.alert_syncer.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of alerter processes is high | - | avg(/Remote Zabbix server health/process.alerter.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.alerter.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of availability manager processes is high | - | avg(/Remote Zabbix server health/process.availability_manager.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.availability_manager.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of configuration syncer processes is high | - | avg(/Remote Zabbix server health/process.configuration_syncer.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.configuration_syncer.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of discoverer processes is high | - | avg(/Remote Zabbix server health/process.discoverer.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.discoverer.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of escalator processes is high | - | avg(/Remote Zabbix server health/process.escalator.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.escalator.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of history poller processes is high | - | avg(/Remote Zabbix server health/process.history_poller.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.history_poller.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of ODBC poller processes is high | - | avg(/Remote Zabbix server health/process.odbc_poller.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.odbc_poller.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of history syncer processes is high | - | avg(/Remote Zabbix server health/process.history_syncer.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.history_syncer.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of housekeeper processes is high | - | avg(/Remote Zabbix server health/process.housekeeper.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.housekeeper.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of http poller processes is high | - | avg(/Remote Zabbix server health/process.http_poller.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.http_poller.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of icmp pinger processes is high | - | avg(/Remote Zabbix server health/process.icmp_pinger.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.icmp_pinger.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of ipmi manager processes is high | - | avg(/Remote Zabbix server health/process.ipmi_manager.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.ipmi_manager.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of ipmi poller processes is high | - | avg(/Remote Zabbix server health/process.ipmi_poller.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.ipmi_poller.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of java poller processes is high | - | avg(/Remote Zabbix server health/process.java_poller.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.java_poller.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of lld manager processes is high | - | avg(/Remote Zabbix server health/process.lld_manager.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.lld_manager.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of lld worker processes is high | - | avg(/Remote Zabbix server health/process.lld_worker.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.lld_worker.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of poller processes is high | - | avg(/Remote Zabbix server health/process.poller.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.poller.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of preprocessing worker processes is high | - | avg(/Remote Zabbix server health/process.preprocessing_worker.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.preprocessing_worker.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of preprocessing manager processes is high | - | avg(/Remote Zabbix server health/process.preprocessing_manager.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.preprocessing_manager.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of proxy poller processes is high | - | avg(/Remote Zabbix server health/process.proxy_poller.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.proxy_poller.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of report manager processes is high | - | avg(/Remote Zabbix server health/process.report_manager.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.report_manager.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of report writer processes is high | - | avg(/Remote Zabbix server health/process.report_writer.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.report_writer.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of self-monitoring processes is high | - | avg(/Remote Zabbix server health/process.self-monitoring.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.self-monitoring.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of snmp trapper processes is high | - | avg(/Remote Zabbix server health/process.snmp_trapper.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.snmp_trapper.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of task manager processes is high | - | avg(/Remote Zabbix server health/process.task_manager.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.task_manager.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of timer processes is high | - | avg(/Remote Zabbix server health/process.timer.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.timer.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of service manager processes is high | - | avg(/Remote Zabbix server health/process.service_manager.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.service_manager.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of trigger housekeeper processes is high | - | avg(/Remote Zabbix server health/process.trigger_housekeeper.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.trigger_housekeeper.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of trapper processes is high | - | avg(/Remote Zabbix server health/process.trapper.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.trapper.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of unreachable poller processes is high | - | avg(/Remote Zabbix server health/process.unreachable_poller.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.unreachable_poller.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: Utilization of vmware collector processes is high | - | avg(/Remote Zabbix server health/process.vmware_collector.avg.busy,10m)>75 Recovery expression: avg(/Remote Zabbix server health/process.vmware_collector.avg.busy,10m)<65 | AVERAGE | |
Remote Zabbix server: More than 75% used in the configuration cache | Consider increasing `CacheSize` in the zabbix_server.conf configuration file. | max(/Remote Zabbix server health/rcache.buffer.pused,10m)>75 | AVERAGE | |
Remote Zabbix server: More than 95% used in the value cache | Consider increasing `ValueCacheSize` in the zabbix_server.conf configuration file. | max(/Remote Zabbix server health/vcache.buffer.pused,10m)>95 | AVERAGE | |
Remote Zabbix server: Zabbix value cache working in low memory mode | Once the low memory mode has been switched on, the value cache will remain in this state for 24 hours, even if the problem that triggered this mode is resolved sooner. | last(/Remote Zabbix server health/vcache.cache.mode)=1 | HIGH | |
Remote Zabbix server: Version has changed | Zabbix server version has changed. Acknowledge to close manually. | last(/Remote Zabbix server health/version,#1)<>last(/Remote Zabbix server health/version,#2) and length(last(/Remote Zabbix server health/version))>0 | INFO | Manual close: YES |
Remote Zabbix server: More than 75% used in the vmware cache | Consider increasing `VMwareCacheSize` in the zabbix_server.conf configuration file. | max(/Remote Zabbix server health/vmware.buffer.pused,10m)>75 | AVERAGE | |
Remote Zabbix server: More than 75% used in the history cache | Consider increasing `HistoryCacheSize` in the zabbix_server.conf configuration file. | max(/Remote Zabbix server health/wcache.history.pused,10m)>75 | AVERAGE | |
Remote Zabbix server: More than 75% used in the history index cache | Consider increasing `HistoryIndexCacheSize` in the zabbix_server.conf configuration file. | max(/Remote Zabbix server health/wcache.index.pused,10m)>75 | AVERAGE | |
Remote Zabbix server: More than 75% used in the trends cache | Consider increasing `TrendCacheSize` in the zabbix_server.conf configuration file. | max(/Remote Zabbix server health/wcache.trend.pused,10m)>75 | AVERAGE | |
Please report any issues with the template at https://support.zabbix.com.
For Zabbix version: 6.2 and higher. This template is designed to monitor internal Zabbix metrics on the local Zabbix server.
Link this template to the local Zabbix server host.
No specific Zabbix configuration is required.
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
High availability cluster node discovery | LLD rule with item and trigger prototypes for the node discovery. | DEPENDENT | zabbix.nodes.discovery |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Cluster | Cluster node [{#NODE.NAME}]: Stats | Provides the statistics of a node. |
DEPENDENT | zabbix.nodes.stats[{#NODE.ID}] Preprocessing: - JSONPATH: |
Cluster | Cluster node [{#NODE.NAME}]: Address | The IPv4 address of a node. |
DEPENDENT | zabbix.nodes.address[{#NODE.ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Cluster | Cluster node [{#NODE.NAME}]: Last access time | Last access time. |
DEPENDENT | zabbix.nodes.lastaccess.time[{#NODE.ID}] Preprocessing: - JSONPATH: |
Cluster | Cluster node [{#NODE.NAME}]: Last access age | The time between the database's |
DEPENDENT | zabbix.nodes.lastaccess.age[{#NODE.ID}] Preprocessing: - JSONPATH: |
Cluster | Cluster node [{#NODE.NAME}]: Status | The status of a node. |
DEPENDENT | zabbix.nodes.status[{#NODE.ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Zabbix raw items | Zabbix stats cluster | The master item of Zabbix cluster statistics. |
INTERNAL | zabbix[cluster,discovery,nodes] |
Zabbix server | Zabbix server: Queue over 10 minutes | The number of monitored items in the queue, which are delayed at least by 10 minutes. |
INTERNAL | zabbix[queue,10m] |
Zabbix server | Zabbix server: Queue | The number of monitored items in the queue, which are delayed at least by 6 seconds. |
INTERNAL | zabbix[queue] |
Zabbix server | Zabbix server: Utilization of alert manager internal processes, in % | The average percentage of the time during which the alert manager processes have been busy for the last minute. |
INTERNAL | zabbix[process,alert manager,avg,busy] |
Zabbix server | Zabbix server: Utilization of alert syncer internal processes, in % | The average percentage of the time during which the alert syncer processes have been busy for the last minute. |
INTERNAL | zabbix[process,alert syncer,avg,busy] |
Zabbix server | Zabbix server: Utilization of alerter internal processes, in % | The average percentage of the time during which the alerter processes have been busy for the last minute. |
INTERNAL | zabbix[process,alerter,avg,busy] |
Zabbix server | Zabbix server: Utilization of availability manager internal processes, in % | The average percentage of the time during which the availability manager processes have been busy for the last minute. |
INTERNAL | zabbix[process,availability manager,avg,busy] |
Zabbix server | Zabbix server: Utilization of configuration syncer internal processes, in % | The average percentage of the time during which the configuration syncer processes have been busy for the last minute. |
INTERNAL | zabbix[process,configuration syncer,avg,busy] |
Zabbix server | Zabbix server: Utilization of discoverer data collector processes, in % | The average percentage of the time during which the discoverer processes have been busy for the last minute. |
INTERNAL | zabbix[process,discoverer,avg,busy] |
Zabbix server | Zabbix server: Utilization of escalator internal processes, in % | The average percentage of the time during which the escalator processes have been busy for the last minute. |
INTERNAL | zabbix[process,escalator,avg,busy] |
Zabbix server | Zabbix server: Utilization of history poller data collector processes, in % | The average percentage of the time during which the history poller processes have been busy for the last minute. |
INTERNAL | zabbix[process,history poller,avg,busy] |
Zabbix server | Zabbix server: Utilization of ODBC poller data collector processes, in % | The average percentage of the time during which the ODBC poller processes have been busy for the last minute. |
INTERNAL | zabbix[process,odbc poller,avg,busy] |
Zabbix server | Zabbix server: Utilization of history syncer internal processes, in % | The average percentage of the time during which the history syncer processes have been busy for the last minute. |
INTERNAL | zabbix[process,history syncer,avg,busy] |
Zabbix server | Zabbix server: Utilization of housekeeper internal processes, in % | The average percentage of the time during which the housekeeper processes have been busy for the last minute. |
INTERNAL | zabbix[process,housekeeper,avg,busy] |
Zabbix server | Zabbix server: Utilization of http poller data collector processes, in % | The average percentage of the time during which the http poller processes have been busy for the last minute. |
INTERNAL | zabbix[process,http poller,avg,busy] |
Zabbix server | Zabbix server: Utilization of icmp pinger data collector processes, in % | The average percentage of the time during which the icmp pinger processes have been busy for the last minute. |
INTERNAL | zabbix[process,icmp pinger,avg,busy] |
Zabbix server | Zabbix server: Utilization of ipmi manager internal processes, in % | The average percentage of the time during which the ipmi manager processes have been busy for the last minute. |
INTERNAL | zabbix[process,ipmi manager,avg,busy] |
Zabbix server | Zabbix server: Utilization of ipmi poller data collector processes, in % | The average percentage of the time during which the ipmi poller processes have been busy for the last minute. |
INTERNAL | zabbix[process,ipmi poller,avg,busy] |
Zabbix server | Zabbix server: Utilization of java poller data collector processes, in % | The average percentage of the time during which the java poller processes have been busy for the last minute. |
INTERNAL | zabbix[process,java poller,avg,busy] |
Zabbix server | Zabbix server: Utilization of LLD manager internal processes, in % | The average percentage of the time during which the lld manager processes have been busy for the last minute. |
INTERNAL | zabbix[process,lld manager,avg,busy] |
Zabbix server | Zabbix server: Utilization of LLD worker internal processes, in % | The average percentage of the time during which the lld worker processes have been busy for the last minute. |
INTERNAL | zabbix[process,lld worker,avg,busy] |
Zabbix server | Zabbix server: Utilization of poller data collector processes, in % | The average percentage of the time during which the poller processes have been busy for the last minute. |
INTERNAL | zabbix[process,poller,avg,busy] |
Zabbix server | Zabbix server: Utilization of preprocessing worker internal processes, in % | The average percentage of the time during which the preprocessing worker processes have been busy for the last minute. |
INTERNAL | zabbix[process,preprocessing worker,avg,busy] |
Zabbix server | Zabbix server: Utilization of preprocessing manager internal processes, in % | The average percentage of the time during which the preprocessing manager processes have been busy for the last minute. |
INTERNAL | zabbix[process,preprocessing manager,avg,busy] |
Zabbix server | Zabbix server: Utilization of proxy poller data collector processes, in % | The average percentage of the time during which the proxy poller processes have been busy for the last minute. |
INTERNAL | zabbix[process,proxy poller,avg,busy] |
Zabbix server | Zabbix server: Utilization of report manager internal processes, in % | The average percentage of the time during which the report manager processes have been busy for the last minute. |
INTERNAL | zabbix[process,report manager,avg,busy] |
Zabbix server | Zabbix server: Utilization of report writer internal processes, in % | The average percentage of the time during which the report writer processes have been busy for the last minute. |
INTERNAL | zabbix[process,report writer,avg,busy] |
Zabbix server | Zabbix server: Utilization of self-monitoring internal processes, in % | The average percentage of the time during which the self-monitoring processes have been busy for the last minute. |
INTERNAL | zabbix[process,self-monitoring,avg,busy] |
Zabbix server | Zabbix server: Utilization of snmp trapper data collector processes, in % | The average percentage of the time during which the snmp trapper processes have been busy for the last minute. |
INTERNAL | zabbix[process,snmp trapper,avg,busy] |
Zabbix server | Zabbix server: Utilization of task manager internal processes, in % | The average percentage of the time during which the task manager processes have been busy for the last minute. |
INTERNAL | zabbix[process,task manager,avg,busy] |
Zabbix server | Zabbix server: Utilization of timer internal processes, in % | The average percentage of the time during which the timer processes have been busy for the last minute. |
INTERNAL | zabbix[process,timer,avg,busy] |
Zabbix server | Zabbix server: Utilization of service manager internal processes, in % | The average percentage of the time during which the service manager processes have been busy for the last minute. |
INTERNAL | zabbix[process,service manager,avg,busy] |
Zabbix server | Zabbix server: Utilization of trigger housekeeper internal processes, in % | The average percentage of the time during which the trigger housekeeper processes have been busy for the last minute. |
INTERNAL | zabbix[process,trigger housekeeper,avg,busy] |
Zabbix server | Zabbix server: Utilization of trapper data collector processes, in % | The average percentage of the time during which the trapper processes have been busy for the last minute. |
INTERNAL | zabbix[process,trapper,avg,busy] |
Zabbix server | Zabbix server: Utilization of unreachable poller data collector processes, in % | The average percentage of the time during which the unreachable poller processes have been busy for the last minute. |
INTERNAL | zabbix[process,unreachable poller,avg,busy] |
Zabbix server | Zabbix server: Utilization of vmware data collector processes, in % | The average percentage of the time during which the vmware collector processes have been busy for the last minute. |
INTERNAL | zabbix[process,vmware collector,avg,busy] |
Zabbix server | Zabbix server: Configuration cache, % used | The availability statistics of Zabbix configuration cache. The percentage of used data buffer. |
INTERNAL | zabbix[rcache,buffer,pused] |
Zabbix server | Zabbix server: Trend function cache, % of unique requests | The effectiveness statistics of Zabbix trend function cache. The percentage of cached items calculated from the sum of cached items plus requests. Low percentage most likely means that the cache size can be reduced. |
INTERNAL | zabbix[tcache,cache,pitems] |
Zabbix server | Zabbix server: Trend function cache, % of misses | The effectiveness statistics of Zabbix trend function cache. The percentage of cache misses. |
INTERNAL | zabbix[tcache,cache,pmisses] |
Zabbix server | Zabbix server: Value cache, % used | The availability statistics of Zabbix value cache. The percentage of used data buffer. |
INTERNAL | zabbix[vcache,buffer,pused] |
Zabbix server | Zabbix server: Value cache hits | The effectiveness statistics of Zabbix value cache. The number of cache hits (history values taken from the cache). |
INTERNAL | zabbix[vcache,cache,hits] Preprocessing: - CHANGEPERSECOND |
Zabbix server | Zabbix server: Value cache misses | The effectiveness statistics of Zabbix value cache. The number of cache misses (history values taken from the database). |
INTERNAL | zabbix[vcache,cache,misses] Preprocessing: - CHANGEPERSECOND |
Zabbix server | Zabbix server: Value cache operating mode | The operating mode of the value cache. |
INTERNAL | zabbix[vcache,cache,mode] |
Zabbix server | Zabbix server: Version | A version of Zabbix server. |
INTERNAL | zabbix[version] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Zabbix server | Zabbix server: VMware cache, % used | The availability statistics of Zabbix vmware cache. The percentage of used data buffer. |
INTERNAL | zabbix[vmware,buffer,pused] |
Zabbix server | Zabbix server: History write cache, % used | The statistics and availability of Zabbix write cache. The percentage of used history buffer. The history cache is used to store item values. A high number indicates performance problems on the database side. |
INTERNAL | zabbix[wcache,history,pused] |
Zabbix server | Zabbix server: History index cache, % used | The statistics and availability of Zabbix write cache. The percentage of used history index buffer. The history index cache is used to index values stored in the history cache. |
INTERNAL | zabbix[wcache,index,pused] |
Zabbix server | Zabbix server: Trend write cache, % used | The statistics and availability of Zabbix write cache. The percentage of used trend buffer. The trend cache stores the aggregate of all items that have received data for the current hour. |
INTERNAL | zabbix[wcache,trend,pused] |
Zabbix server | Zabbix server: Number of processed values per second | The statistics and availability of Zabbix write cache. The total number of values processed by Zabbix server or Zabbix proxy, except unsupported items. |
INTERNAL | zabbix[wcache,values] Preprocessing: - CHANGEPERSECOND |
Zabbix server | Zabbix server: Number of processed numeric (float) values per second | The statistics and availability of Zabbix write cache. The number of processed float values. |
INTERNAL | zabbix[wcache,values,float] Preprocessing: - CHANGEPERSECOND |
Zabbix server | Zabbix server: Number of processed log values per second | The statistics and availability of Zabbix write cache. The number of processed log values. |
INTERNAL | zabbix[wcache,values,log] Preprocessing: - CHANGEPERSECOND |
Zabbix server | Zabbix server: Number of processed not supported values per second | The statistics and availability of Zabbix write cache. The number of times the item processing resulted in an item becoming unsupported or keeping that state. |
INTERNAL | zabbix[wcache,values,not supported] Preprocessing: - CHANGEPERSECOND |
Zabbix server | Zabbix server: Number of processed character values per second | The statistics and availability of Zabbix write cache. The number of processed character/string values. |
INTERNAL | zabbix[wcache,values,str] Preprocessing: - CHANGEPERSECOND |
Zabbix server | Zabbix server: Number of processed text values per second | The statistics and availability of Zabbix write cache. The number of processed text values. |
INTERNAL | zabbix[wcache,values,text] Preprocessing: - CHANGE_PER_SECOND |
Zabbix server | Zabbix server: LLD queue | The count of values enqueued in the low-level discovery processing queue. |
INTERNAL | zabbix[lld_queue] |
Zabbix server | Zabbix server: Preprocessing queue | The count of values enqueued in the preprocessing queue. |
INTERNAL | zabbix[preprocessing_queue] |
Zabbix server | Zabbix server: Number of processed numeric (unsigned) values per second | The statistics and availability of Zabbix write cache. The number of processed numeric (unsigned) values. |
INTERNAL | zabbix[wcache,values,uint] Preprocessing: - CHANGE_PER_SECOND |
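All of the "per second" items above are derived from monotonically increasing counters by the CHANGE_PER_SECOND preprocessing step. A minimal sketch of that computation, with illustrative sample values (not real data):

```python
# Sketch of the CHANGE_PER_SECOND preprocessing step:
# rate = (value - previous value) / (timestamp - previous timestamp).
def change_per_second(prev_value, prev_ts, value, ts):
    if ts <= prev_ts:
        return None  # no time elapsed; Zabbix also discards the very first sample
    if value < prev_value:
        return None  # counter went backwards (e.g. a restart); the step fails in Zabbix
    return (value - prev_value) / (ts - prev_ts)

# Two readings of zabbix[wcache,values] taken 60 seconds apart (illustrative numbers):
print(change_per_second(1_000_000, 0, 1_000_600, 60))  # -> 10.0 values per second
```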
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Cluster node [{#NODE.NAME}]: Status changed | The state of the node has changed. Confirm to close. |
last(/Zabbix server health/zabbix.nodes.status[{#NODE.ID}],#1)<>last(/Zabbix server health/zabbix.nodes.status[{#NODE.ID}],#2) |
INFO | Manual close: YES |
Zabbix server: More than 100 items having missing data for more than 10 minutes | The zabbix[queue,10m] item is collecting data about how many items are missing data for more than 10 minutes. |
min(/Zabbix server health/zabbix[queue,10m],10m)>100 |
WARNING | |
Zabbix server: Utilization of alert manager processes is high | - |
avg(/Zabbix server health/zabbix[process,alert manager,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,alert manager,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of alert syncer processes is high | - |
avg(/Zabbix server health/zabbix[process,alert syncer,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,alert syncer,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of alerter processes is high | - |
avg(/Zabbix server health/zabbix[process,alerter,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,alerter,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of availability manager processes is high | - |
avg(/Zabbix server health/zabbix[process,availability manager,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,availability manager,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of configuration syncer processes is high | - |
avg(/Zabbix server health/zabbix[process,configuration syncer,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,configuration syncer,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of discoverer processes is high | - |
avg(/Zabbix server health/zabbix[process,discoverer,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,discoverer,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of escalator processes is high | - |
avg(/Zabbix server health/zabbix[process,escalator,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,escalator,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of history poller processes is high | - |
avg(/Zabbix server health/zabbix[process,history poller,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,history poller,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of ODBC poller processes is high | - |
avg(/Zabbix server health/zabbix[process,odbc poller,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,odbc poller,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of history syncer processes is high | - |
avg(/Zabbix server health/zabbix[process,history syncer,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,history syncer,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of housekeeper processes is high | - |
avg(/Zabbix server health/zabbix[process,housekeeper,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,housekeeper,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of http poller processes is high | - |
avg(/Zabbix server health/zabbix[process,http poller,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,http poller,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of icmp pinger processes is high | - |
avg(/Zabbix server health/zabbix[process,icmp pinger,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,icmp pinger,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of ipmi manager processes is high | - |
avg(/Zabbix server health/zabbix[process,ipmi manager,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,ipmi manager,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of ipmi poller processes is high | - |
avg(/Zabbix server health/zabbix[process,ipmi poller,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,ipmi poller,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of java poller processes is high | - |
avg(/Zabbix server health/zabbix[process,java poller,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,java poller,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of lld manager processes is high | - |
avg(/Zabbix server health/zabbix[process,lld manager,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,lld manager,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of lld worker processes is high | - |
avg(/Zabbix server health/zabbix[process,lld worker,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,lld worker,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of poller processes is high | - |
avg(/Zabbix server health/zabbix[process,poller,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,poller,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of preprocessing worker processes is high | - |
avg(/Zabbix server health/zabbix[process,preprocessing worker,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,preprocessing worker,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of preprocessing manager processes is high | - |
avg(/Zabbix server health/zabbix[process,preprocessing manager,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,preprocessing manager,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of proxy poller processes is high | - |
avg(/Zabbix server health/zabbix[process,proxy poller,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,proxy poller,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of report manager processes is high | - |
avg(/Zabbix server health/zabbix[process,report manager,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,report manager,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of report writer processes is high | - |
avg(/Zabbix server health/zabbix[process,report writer,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,report writer,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of self-monitoring processes is high | - |
avg(/Zabbix server health/zabbix[process,self-monitoring,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,self-monitoring,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of snmp trapper processes is high | - |
avg(/Zabbix server health/zabbix[process,snmp trapper,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,snmp trapper,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of task manager processes is high | - |
avg(/Zabbix server health/zabbix[process,task manager,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,task manager,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of timer processes is high | - |
avg(/Zabbix server health/zabbix[process,timer,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,timer,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of service manager processes is high | - |
avg(/Zabbix server health/zabbix[process,service manager,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,service manager,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of trigger housekeeper processes is high | - |
avg(/Zabbix server health/zabbix[process,trigger housekeeper,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,trigger housekeeper,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of trapper processes is high | - |
avg(/Zabbix server health/zabbix[process,trapper,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,trapper,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of unreachable poller processes is high | - |
avg(/Zabbix server health/zabbix[process,unreachable poller,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,unreachable poller,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: Utilization of vmware collector processes is high | - |
avg(/Zabbix server health/zabbix[process,vmware collector,avg,busy],10m)>75 Recovery expression: avg(/Zabbix server health/zabbix[process,vmware collector,avg,busy],10m)<65 |
AVERAGE | |
Zabbix server: More than 75% used in the configuration cache | Consider increasing CacheSize in the zabbix_server.conf configuration file. |
max(/Zabbix server health/zabbix[rcache,buffer,pused],10m)>75 |
AVERAGE | |
Zabbix server: More than 95% used in the value cache | Consider increasing ValueCacheSize in the zabbix_server.conf configuration file. |
max(/Zabbix server health/zabbix[vcache,buffer,pused],10m)>95 |
AVERAGE | |
Zabbix server: Zabbix value cache working in low memory mode | Once the low memory mode has been switched on, the value cache will remain in this state for 24 hours, even if the problem that triggered this mode is resolved sooner. |
last(/Zabbix server health/zabbix[vcache,cache,mode])=1 |
HIGH | |
Zabbix server: Version has changed | Zabbix server version has changed. Acknowledge to close manually. |
last(/Zabbix server health/zabbix[version],#1)<>last(/Zabbix server health/zabbix[version],#2) and length(last(/Zabbix server health/zabbix[version]))>0 |
INFO | Manual close: YES |
Zabbix server: More than 75% used in the vmware cache | Consider increasing VMwareCacheSize in the zabbix_server.conf configuration file. |
max(/Zabbix server health/zabbix[vmware,buffer,pused],10m)>75 |
AVERAGE | |
Zabbix server: More than 75% used in the history cache | Consider increasing HistoryCacheSize in the zabbix_server.conf configuration file. |
max(/Zabbix server health/zabbix[wcache,history,pused],10m)>75 |
AVERAGE | |
Zabbix server: More than 75% used in the history index cache | Consider increasing HistoryIndexCacheSize in the zabbix_server.conf configuration file. |
max(/Zabbix server health/zabbix[wcache,index,pused],10m)>75 |
AVERAGE | |
Zabbix server: More than 75% used in the trends cache | Consider increasing TrendCacheSize in the zabbix_server.conf configuration file. |
max(/Zabbix server health/zabbix[wcache,trend,pused],10m)>75 |
AVERAGE |
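Every utilization trigger above pairs a problem expression (10-minute average busyness above 75%) with a recovery expression (below 65%). The gap between the two thresholds is deliberate hysteresis: once in the problem state, the trigger stays there until utilization drops below the lower bound, so it does not flap around a single threshold. A small sketch of that state machine, with the 75/65 defaults hard-coded as assumptions:

```python
# Hysteresis as used by the utilization triggers: problem at >75, recover at <65.
PROBLEM_THRESHOLD = 75.0   # mirrors avg(...,10m)>75
RECOVERY_THRESHOLD = 65.0  # mirrors avg(...,10m)<65

def next_state(in_problem: bool, avg_busy_10m: float) -> bool:
    if not in_problem:
        return avg_busy_10m > PROBLEM_THRESHOLD
    # Already in PROBLEM: only a drop below the recovery threshold clears it,
    # so values in the 65..75 band keep the trigger active.
    return not (avg_busy_10m < RECOVERY_THRESHOLD)

state = False
for sample in (70.0, 80.0, 72.0, 66.0, 64.0):
    state = next_state(state, sample)
    print(f"{sample:5.1f} -> {'PROBLEM' if state else 'OK'}")
```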
Please report any issues with the template at https://support.zabbix.com.
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$ZABBIX.PROXY.ADDRESS} | IP/DNS/network mask list of proxies to be remotely queried (default is 127.0.0.1). |
127.0.0.1 |
{$ZABBIX.PROXY.PORT} | Port of proxy to be remotely queried (default is 10051). |
10051 |
{$ZABBIX.PROXY.UTIL.MAX} | Maximum average percentage of time processes busy in the last minute (default is 75). |
75 |
{$ZABBIX.PROXY.UTIL.MIN} | Minimum average percentage of time processes busy in the last minute (default is 65). |
65 |
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Zabbix raw items | Remote Zabbix proxy: Zabbix stats | Zabbix proxy statistics master item. The underlying protocol exchange is sketched after this table. |
INTERNAL | zabbix[stats,{$ZABBIX.PROXY.ADDRESS},{$ZABBIX.PROXY.PORT}] |
Zabbix proxy | Remote Zabbix proxy: Zabbix stats queue over 10m | Number of monitored items in the queue which are delayed at least by 10 minutes. |
INTERNAL | zabbix[stats,{$ZABBIX.PROXY.ADDRESS},{$ZABBIX.PROXY.PORT},queue,10m] Preprocessing: - JSONPATH: |
Zabbix proxy | Remote Zabbix proxy: Zabbix stats queue | Number of monitored items in the queue which are delayed at least by 6 seconds. |
INTERNAL | zabbix[stats,{$ZABBIX.PROXY.ADDRESS},{$ZABBIX.PROXY.PORT},queue] Preprocessing: - JSONPATH: |
Zabbix proxy | Remote Zabbix proxy: Utilization of data sender internal processes, in % | Average percentage of time data sender processes have been busy in the last minute. |
DEPENDENT | process.data_sender.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes data sender not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of availability manager internal processes, in % | Average percentage of time availability manager processes have been busy in the last minute. |
DEPENDENT | process.availability_manager.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes availability manager not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of configuration syncer internal processes, in % | Average percentage of time configuration syncer processes have been busy in the last minute. |
DEPENDENT | process.configuration_syncer.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes configuration syncer not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of discoverer data collector processes, in % | Average percentage of time discoverer processes have been busy in the last minute. |
DEPENDENT | process.discoverer.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Zabbix proxy | Remote Zabbix proxy: Utilization of heartbeat sender internal processes, in % | Average percentage of time heartbeat sender processes have been busy in the last minute. |
DEPENDENT | process.heartbeat_sender.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes heartbeat sender not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of ODBC poller data collector processes, in % | Average percentage of time ODBC poller processes have been busy in the last minute. |
DEPENDENT | process.odbc_poller.avg.busy Preprocessing: - JSONPATH: |
Zabbix proxy | Remote Zabbix proxy: Utilization of history poller data collector processes, in % | Average percentage of time history poller processes have been busy in the last minute. |
DEPENDENT | process.history_poller.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes history poller not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of history syncer internal processes, in % | Average percentage of time history syncer processes have been busy in the last minute. |
DEPENDENT | process.history_syncer.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes history syncer not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of housekeeper internal processes, in % | Average percentage of time housekeeper processes have been busy in the last minute. |
DEPENDENT | process.housekeeper.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Zabbix proxy | Remote Zabbix proxy: Utilization of http poller data collector processes, in % | Average percentage of time http poller processes have been busy in the last minute. |
DEPENDENT | process.http_poller.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes http poller not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of icmp pinger data collector processes, in % | Average percentage of time icmp pinger processes have been busy in the last minute. |
DEPENDENT | process.icmp_pinger.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes icmp pinger not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of ipmi manager internal processes, in % | Average percentage of time ipmi manager processes have been busy in the last minute. |
DEPENDENT | process.ipmi_manager.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes ipmi manager not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of ipmi poller data collector processes, in % | Average percentage of time ipmi poller processes have been busy in the last minute. |
DEPENDENT | process.ipmi_poller.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes ipmi poller not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of java poller data collector processes, in % | Average percentage of time java poller processes have been busy in the last minute. |
DEPENDENT | process.java_poller.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes java poller not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of poller data collector processes, in % | Average percentage of time poller processes have been busy in the last minute. |
DEPENDENT | process.poller.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Zabbix proxy | Remote Zabbix proxy: Utilization of preprocessing worker internal processes, in % | Average percentage of time preprocessing worker processes have been busy in the last minute. |
DEPENDENT | process.preprocessing_worker.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes preprocessing worker not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of preprocessing manager internal processes, in % | Average percentage of time preprocessing manager processes have been busy in the last minute. |
DEPENDENT | process.preprocessing_manager.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes preprocessing manager not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of self-monitoring internal processes, in % | Average percentage of time self-monitoring processes have been busy in the last minute. |
DEPENDENT | process.self-monitoring.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Zabbix proxy | Remote Zabbix proxy: Utilization of snmp trapper data collector processes, in % | Average percentage of time snmp trapper processes have been busy in the last minute. |
DEPENDENT | process.snmp_trapper.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes snmp trapper not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of task manager internal processes, in % | Average percentage of time task manager processes have been busy in the last minute. |
DEPENDENT | process.task_manager.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes task manager not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of trapper data collector processes, in % | Average percentage of time trapper processes have been busy in the last minute. |
DEPENDENT | process.trapper.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Zabbix proxy | Remote Zabbix proxy: Utilization of unreachable poller data collector processes, in % | Average percentage of time unreachable poller processes have been busy in the last minute. |
DEPENDENT | process.unreachable_poller.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes unreachable poller not started |
Zabbix proxy | Remote Zabbix proxy: Utilization of vmware data collector processes, in % | Average percentage of time vmware collector processes have been busy in the last minute. |
DEPENDENT | process.vmware_collector.avg.busy Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> Processes vmware collector not started |
Zabbix proxy | Remote Zabbix proxy: Configuration cache, % used | Availability statistics of Zabbix configuration cache. Percentage of used buffer. |
DEPENDENT | rcache.buffer.pused Preprocessing: - JSONPATH: |
Zabbix proxy | Remote Zabbix proxy: Version | Version of Zabbix proxy. |
DEPENDENT | version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Zabbix proxy | Remote Zabbix proxy: VMware cache, % used | Availability statistics of Zabbix vmware cache. Percentage of used buffer. |
DEPENDENT | vmware.buffer.pused Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Zabbix proxy | Remote Zabbix proxy: History write cache, % used | Statistics and availability of Zabbix write cache. Percentage of used history buffer. History cache is used to store item values. A high number indicates performance problems on the database side. |
DEPENDENT | wcache.history.pused Preprocessing: - JSONPATH: |
Zabbix proxy | Remote Zabbix proxy: History index cache, % used | Statistics and availability of Zabbix write cache. Percentage of used history index buffer. History index cache is used to index values stored in history cache. |
DEPENDENT | wcache.index.pused Preprocessing: - JSONPATH: |
Zabbix proxy | Remote Zabbix proxy: Number of processed values per second | Statistics and availability of Zabbix write cache. Total number of values processed by Zabbix server or Zabbix proxy, except unsupported items. |
DEPENDENT | wcache.values Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Zabbix proxy | Remote Zabbix proxy: Number of processed numeric (float) values per second | Statistics and availability of Zabbix write cache. Number of processed float values. |
DEPENDENT | wcache.values.float Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Zabbix proxy | Remote Zabbix proxy: Number of processed log values per second | Statistics and availability of Zabbix write cache. Number of processed log values. |
DEPENDENT | wcache.values.log Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Zabbix proxy | Remote Zabbix proxy: Number of processed not supported values per second | Statistics and availability of Zabbix write cache. Number of times item processing resulted in item becoming unsupported or keeping that state. |
DEPENDENT | wcache.values.not_supported Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Zabbix proxy | Remote Zabbix proxy: Number of processed character values per second | Statistics and availability of Zabbix write cache. Number of processed character/string values. |
DEPENDENT | wcache.values.str Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Zabbix proxy | Remote Zabbix proxy: Number of processed text values per second | Statistics and availability of Zabbix write cache. Number of processed text values. |
DEPENDENT | wcache.values.text Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Zabbix proxy | Remote Zabbix proxy: Preprocessing queue | Count of values enqueued in the preprocessing queue. |
DEPENDENT | preprocessing_queue Preprocessing: - JSONPATH: |
Zabbix proxy | Remote Zabbix proxy: Number of processed numeric (unsigned) values per second | Statistics and availability of Zabbix write cache. Number of processed numeric (unsigned) values. |
DEPENDENT | wcache.values.uint Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Zabbix proxy | Remote Zabbix proxy: Required performance | Required performance of Zabbix proxy, in new values per second expected. |
DEPENDENT | required_performance Preprocessing: - JSONPATH: |
Zabbix proxy | Remote Zabbix proxy: Uptime | Uptime of Zabbix proxy process in seconds. |
DEPENDENT | uptime Preprocessing: - JSONPATH: |
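The dependent items above all hang off the zabbix[stats,{$ZABBIX.PROXY.ADDRESS},{$ZABBIX.PROXY.PORT}] master item, which queries the proxy's internal statistics over the Zabbix protocol; each dependent item then extracts one value with a JSONPath step. A rough sketch of that exchange, assuming the querying host is allowed by the proxy's StatsAllowedIP setting; the JSONPath shown at the end is illustrative, not copied verbatim from the template:

```python
import json
import socket
import struct

def zabbix_request(host: str, port: int, payload: dict) -> dict:
    """Send one request over the Zabbix protocol: b"ZBXD" + flags + length + JSON body."""
    body = json.dumps(payload).encode()
    packet = b"ZBXD\x01" + struct.pack("<Q", len(body)) + body
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(packet)
        header = sock.recv(13)                       # ZBXD(4) + flags(1) + length(8)
        (length,) = struct.unpack("<Q", header[5:13])
        data = b""
        while len(data) < length:
            chunk = sock.recv(length - len(data))
            if not chunk:
                break
            data += chunk
    return json.loads(data)

# Ask for internal statistics, as the master item does (macro defaults: 127.0.0.1:10051).
stats = zabbix_request("127.0.0.1", 10051, {"request": "zabbix.stats"})
# A dependent item would now apply a JSONPath, e.g. something like $.data.uptime.
print(stats.get("data", {}).get("uptime"))
```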
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Remote Zabbix proxy: More than 100 items having missing data for more than 10 minutes | zabbix[stats,{$ZABBIX.PROXY.ADDRESS},{$ZABBIX.PROXY.PORT},queue,10m] item is collecting data about how many items are missing data for more than 10 minutes. |
min(/Remote Zabbix proxy health/zabbix[stats,{$ZABBIX.PROXY.ADDRESS},{$ZABBIX.PROXY.PORT},queue,10m],10m)>100 |
WARNING | |
Remote Zabbix proxy: Utilization of data sender processes is high | - |
avg(/Remote Zabbix proxy health/process.data_sender.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"data sender"} Recovery expression: avg(/Remote Zabbix proxy health/process.data_sender.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"data sender"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of availability manager processes is high | - |
avg(/Remote Zabbix proxy health/process.availability_manager.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"availability manager"} Recovery expression: avg(/Remote Zabbix proxy health/process.availability_manager.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"availability manager"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of configuration syncer processes is high | - |
avg(/Remote Zabbix proxy health/process.configuration_syncer.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"configuration syncer"} Recovery expression: avg(/Remote Zabbix proxy health/process.configuration_syncer.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"configuration syncer"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of discoverer processes is high | - |
avg(/Remote Zabbix proxy health/process.discoverer.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"discoverer"} Recovery expression: avg(/Remote Zabbix proxy health/process.discoverer.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"discoverer"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of heartbeat sender processes is high | - |
avg(/Remote Zabbix proxy health/process.heartbeat_sender.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"heartbeat sender"} Recovery expression: avg(/Remote Zabbix proxy health/process.heartbeat_sender.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"heartbeat sender"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of ODBC poller processes is high | - |
avg(/Remote Zabbix proxy health/process.odbc_poller.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"ODBC poller"} Recovery expression: avg(/Remote Zabbix proxy health/process.odbc_poller.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"ODBC poller"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of history poller processes is high | - |
avg(/Remote Zabbix proxy health/process.history_poller.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"history poller"} Recovery expression: avg(/Remote Zabbix proxy health/process.history_poller.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"history poller"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of history syncer processes is high | - |
avg(/Remote Zabbix proxy health/process.history_syncer.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"history syncer"} Recovery expression: avg(/Remote Zabbix proxy health/process.history_syncer.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"history syncer"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of housekeeper processes is high | - |
avg(/Remote Zabbix proxy health/process.housekeeper.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"housekeeper"} Recovery expression: avg(/Remote Zabbix proxy health/process.housekeeper.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"housekeeper"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of http poller processes is high | - |
avg(/Remote Zabbix proxy health/process.http_poller.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"http poller"} Recovery expression: avg(/Remote Zabbix proxy health/process.http_poller.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"http poller"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of icmp pinger processes is high | - |
avg(/Remote Zabbix proxy health/process.icmp_pinger.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"icmp pinger"} Recovery expression: avg(/Remote Zabbix proxy health/process.icmp_pinger.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"icmp pinger"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of ipmi manager processes is high | - |
avg(/Remote Zabbix proxy health/process.ipmi_manager.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"ipmi manager"} Recovery expression: avg(/Remote Zabbix proxy health/process.ipmi_manager.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"ipmi manager"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of ipmi poller processes is high | - |
avg(/Remote Zabbix proxy health/process.ipmi_poller.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"ipmi poller"} Recovery expression: avg(/Remote Zabbix proxy health/process.ipmi_poller.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"ipmi poller"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of java poller processes is high | - |
avg(/Remote Zabbix proxy health/process.java_poller.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"java poller"} Recovery expression: avg(/Remote Zabbix proxy health/process.java_poller.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"java poller"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of poller processes is high | - |
avg(/Remote Zabbix proxy health/process.poller.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"poller"} Recovery expression: avg(/Remote Zabbix proxy health/process.poller.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"poller"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of preprocessing worker processes is high | - |
avg(/Remote Zabbix proxy health/process.preprocessing_worker.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"preprocessing worker"} Recovery expression: avg(/Remote Zabbix proxy health/process.preprocessing_worker.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"preprocessing worker"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of preprocessing manager processes is high | - |
avg(/Remote Zabbix proxy health/process.preprocessing_manager.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"preprocessing manager"} Recovery expression: avg(/Remote Zabbix proxy health/process.preprocessing_manager.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"preprocessing manager"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of self-monitoring processes is high | - |
avg(/Remote Zabbix proxy health/process.self-monitoring.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"self-monitoring"} Recovery expression: avg(/Remote Zabbix proxy health/process.self-monitoring.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"self-monitoring"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of snmp trapper processes is high | - |
avg(/Remote Zabbix proxy health/process.snmp_trapper.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"snmp trapper"} Recovery expression: avg(/Remote Zabbix proxy health/process.snmp_trapper.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"snmp trapper"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of task manager processes is high | - |
avg(/Remote Zabbix proxy health/process.task_manager.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"task manager"} Recovery expression: avg(/Remote Zabbix proxy health/process.task_manager.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"task manager"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of trapper processes is high | - |
avg(/Remote Zabbix proxy health/process.trapper.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"trapper"} Recovery expression: avg(/Remote Zabbix proxy health/process.trapper.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"trapper"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of unreachable poller processes is high | - |
avg(/Remote Zabbix proxy health/process.unreachable_poller.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"unreachable poller"} Recovery expression: avg(/Remote Zabbix proxy health/process.unreachable_poller.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"unreachable poller"} |
AVERAGE | |
Remote Zabbix proxy: Utilization of vmware collector processes is high | - |
avg(/Remote Zabbix proxy health/process.vmware_collector.avg.busy,10m)>{$ZABBIX.PROXY.UTIL.MAX:"vmware collector"} Recovery expression: avg(/Remote Zabbix proxy health/process.vmware_collector.avg.busy,10m)<{$ZABBIX.PROXY.UTIL.MIN:"vmware collector"} |
AVERAGE | |
Remote Zabbix proxy: More than {$ZABBIX.PROXY.UTIL.MAX}% used in the configuration cache | Consider increasing CacheSize in the zabbix_proxy.conf configuration file. |
max(/Remote Zabbix proxy health/rcache.buffer.pused,10m)>{$ZABBIX.PROXY.UTIL.MAX} |
AVERAGE | |
Remote Zabbix proxy: Version has changed | Remote Zabbix proxy version has changed. Acknowledge to close manually. |
last(/Remote Zabbix proxy health/version,#1)<>last(/Remote Zabbix proxy health/version,#2) and length(last(/Remote Zabbix proxy health/version))>0 |
INFO | Manual close: YES |
Remote Zabbix proxy: More than {$ZABBIX.PROXY.UTIL.MAX}% used in the vmware cache | Consider increasing VMwareCacheSize in the zabbix_proxy.conf configuration file. |
max(/Remote Zabbix proxy health/vmware.buffer.pused,10m)>{$ZABBIX.PROXY.UTIL.MAX} |
AVERAGE | |
Remote Zabbix proxy: More than {$ZABBIX.PROXY.UTIL.MAX}% used in the history cache | Consider increasing HistoryCacheSize in the zabbix_proxy.conf configuration file. |
max(/Remote Zabbix proxy health/wcache.history.pused,10m)>{$ZABBIX.PROXY.UTIL.MAX} |
AVERAGE | |
Remote Zabbix proxy: More than {$ZABBIX.PROXY.UTIL.MAX}% used in the history index cache | Consider increasing HistoryIndexCacheSize in the zabbix_proxy.conf configuration file. |
max(/Remote Zabbix proxy health/wcache.index.pused,10m)>{$ZABBIX.PROXY.UTIL.MAX} |
AVERAGE | |
Remote Zabbix proxy: has been restarted | Uptime is less than 10 minutes. |
last(/Remote Zabbix proxy health/uptime)<10m |
INFO | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$ZABBIX.PROXY.UTIL.MAX} | Maximum average percentage of time processes busy in the last minute (default is 75). |
75 |
{$ZABBIX.PROXY.UTIL.MIN} | Minimum average percentage of time processes busy in the last minute (default is 65). |
65 |
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Zabbix proxy | Zabbix proxy: Queue over 10 minutes | Number of monitored items in the queue which are delayed at least by 10 minutes. |
INTERNAL | zabbix[queue,10m] |
Zabbix proxy | Zabbix proxy: Queue | Number of monitored items in the queue which are delayed at least by 6 seconds. |
INTERNAL | zabbix[queue] |
Zabbix proxy | Zabbix proxy: Utilization of data sender internal processes, in % | Average percentage of time data sender processes have been busy in the last minute. |
INTERNAL | zabbix[process,data sender,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of availability manager internal processes, in % | Average percentage of time availability manager processes have been busy in the last minute. |
INTERNAL | zabbix[process,availability manager,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of configuration syncer internal processes, in % | Average percentage of time configuration syncer processes have been busy in the last minute. |
INTERNAL | zabbix[process,configuration syncer,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of discoverer data collector processes, in % | Average percentage of time discoverer processes have been busy in the last minute. |
INTERNAL | zabbix[process,discoverer,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of heartbeat sender internal processes, in % | Average percentage of time heartbeat sender processes have been busy in the last minute. |
INTERNAL | zabbix[process,heartbeat sender,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of ODBC poller data collector processes, in % | Average percentage of time ODBC poller processes have been busy in the last minute. |
INTERNAL | zabbix[process,odbc poller,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of history poller data collector processes, in % | Average percentage of time history poller processes have been busy in the last minute. |
INTERNAL | zabbix[process,history poller,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of history syncer internal processes, in % | Average percentage of time history syncer processes have been busy in the last minute. |
INTERNAL | zabbix[process,history syncer,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of housekeeper internal processes, in % | Average percentage of time housekeeper processes have been busy in the last minute. |
INTERNAL | zabbix[process,housekeeper,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of http poller data collector processes, in % | Average percentage of time http poller processes have been busy in the last minute. |
INTERNAL | zabbix[process,http poller,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of icmp pinger data collector processes, in % | Average percentage of time icmp pinger processes have been busy in the last minute. |
INTERNAL | zabbix[process,icmp pinger,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of ipmi manager internal processes, in % | Average percentage of time ipmi manager processes have been busy in the last minute. |
INTERNAL | zabbix[process,ipmi manager,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of ipmi poller data collector processes, in % | Average percentage of time ipmi poller processes have been busy in the last minute. |
INTERNAL | zabbix[process,ipmi poller,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of java poller data collector processes, in % | Average percentage of time java poller processes have been busy in the last minute. |
INTERNAL | zabbix[process,java poller,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of poller data collector processes, in % | Average percentage of time poller processes have been busy in the last minute. |
INTERNAL | zabbix[process,poller,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of preprocessing worker internal processes, in % | Average percentage of time preprocessing worker processes have been busy in the last minute. |
INTERNAL | zabbix[process,preprocessing worker,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of preprocessing manager internal processes, in % | Average percentage of time preprocessing manager processes have been busy in the last minute. |
INTERNAL | zabbix[process,preprocessing manager,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of self-monitoring internal processes, in % | Average percentage of time self-monitoring processes have been busy in the last minute. |
INTERNAL | zabbix[process,self-monitoring,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of snmp trapper data collector processes, in % | Average percentage of time snmp trapper processes have been busy in the last minute. |
INTERNAL | zabbix[process,snmp trapper,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of task manager internal processes, in % | Average percentage of time task manager processes have been busy in the last minute. |
INTERNAL | zabbix[process,task manager,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of trapper data collector processes, in % | Average percentage of time trapper processes have been busy in the last minute. |
INTERNAL | zabbix[process,trapper,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of unreachable poller data collector processes, in % | Average percentage of time unreachable poller processes have been busy in the last minute. |
INTERNAL | zabbix[process,unreachable poller,avg,busy] |
Zabbix proxy | Zabbix proxy: Utilization of vmware data collector processes, in % | Average percentage of time vmware collector processes have been busy in the last minute. |
INTERNAL | zabbix[process,vmware collector,avg,busy] |
Zabbix proxy | Zabbix proxy: Configuration cache, % used | Availability statistics of Zabbix configuration cache. Percentage of used buffer. |
INTERNAL | zabbix[rcache,buffer,pused] |
Zabbix proxy | Zabbix proxy: Version | Version of Zabbix proxy. |
INTERNAL | zabbix[version] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Zabbix proxy | Zabbix proxy: VMware cache, % used | Availability statistics of Zabbix vmware cache. Percentage of used buffer. |
INTERNAL | zabbix[vmware,buffer,pused] |
Zabbix proxy | Zabbix proxy: History write cache, % used | Statistics and availability of Zabbix write cache. Percentage of used history buffer. History cache is used to store item values. A high number indicates performance problems on the database side. |
INTERNAL | zabbix[wcache,history,pused] |
Zabbix proxy | Zabbix proxy: History index cache, % used | Statistics and availability of Zabbix write cache. Percentage of used history index buffer. History index cache is used to index values stored in history cache. |
INTERNAL | zabbix[wcache,index,pused] |
Zabbix proxy | Zabbix proxy: Number of processed values per second | Statistics and availability of Zabbix write cache. Total number of values processed by Zabbix server or Zabbix proxy, except unsupported items. |
INTERNAL | zabbix[wcache,values] Preprocessing: - CHANGE_PER_SECOND |
Zabbix proxy | Zabbix proxy: Number of processed numeric (float) values per second | Statistics and availability of Zabbix write cache. Number of processed float values. |
INTERNAL | zabbix[wcache,values,float] Preprocessing: - CHANGE_PER_SECOND |
Zabbix proxy | Zabbix proxy: Number of processed log values per second | Statistics and availability of Zabbix write cache. Number of processed log values. |
INTERNAL | zabbix[wcache,values,log] Preprocessing: - CHANGE_PER_SECOND |
Zabbix proxy | Zabbix proxy: Number of processed not supported values per second | Statistics and availability of Zabbix write cache. Number of times item processing resulted in item becoming unsupported or keeping that state. |
INTERNAL | zabbix[wcache,values,not supported] Preprocessing: - CHANGE_PER_SECOND |
Zabbix proxy | Zabbix proxy: Number of processed character values per second | Statistics and availability of Zabbix write cache. Number of processed character/string values. |
INTERNAL | zabbix[wcache,values,str] Preprocessing: - CHANGE_PER_SECOND |
Zabbix proxy | Zabbix proxy: Number of processed text values per second | Statistics and availability of Zabbix write cache. Number of processed text values. |
INTERNAL | zabbix[wcache,values,text] Preprocessing: - CHANGE_PER_SECOND |
Zabbix proxy | Zabbix proxy: Preprocessing queue | Count of values enqueued in the preprocessing queue. |
INTERNAL | zabbix[preprocessing_queue] |
Zabbix proxy | Zabbix proxy: Number of processed numeric (unsigned) values per second | Statistics and availability of Zabbix write cache. Number of processed numeric (unsigned) values. |
INTERNAL | zabbix[wcache,values,uint] Preprocessing: - CHANGE_PER_SECOND |
Zabbix proxy | Zabbix proxy: Values waiting to be sent | Number of values in the proxy history table waiting to be sent to the server. |
INTERNAL | zabbix[proxy_history] |
Zabbix proxy | Zabbix proxy: Required performance | Required performance of Zabbix proxy, in new values per second expected. |
INTERNAL | zabbix[requiredperformance] |
Zabbix proxy | Zabbix proxy: Uptime | Uptime of Zabbix proxy process in seconds. |
INTERNAL | zabbix[uptime] |
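Two items above combine into a useful derived view that the template itself does not ship: dividing the backlog (zabbix[proxy_history], values waiting to be sent) by the required performance (zabbix[requiredperformance], new values per second) estimates how many seconds the proxy is behind the server. A sketch with illustrative numbers:

```python
# Rough proxy lag estimate: backlog (values) / throughput (values per second) = seconds behind.
def proxy_lag_seconds(proxy_history: int, required_nvps: float) -> float:
    if required_nvps <= 0:
        return 0.0  # avoid division by zero when the proxy has nothing to do
    return proxy_history / required_nvps

print(proxy_lag_seconds(12_000, 100.0))  # -> 120.0 seconds of backlog (illustrative)
```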
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Zabbix proxy: More than 100 items having missing data for more than 10 minutes | The zabbix[queue,10m] item is collecting data about how many items are missing data for more than 10 minutes. |
min(/Zabbix proxy health/zabbix[queue,10m],10m)>100 |
WARNING | |
Zabbix proxy: Utilization of data sender processes is high | - |
avg(/Zabbix proxy health/zabbix[process,data sender,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"data sender"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,data sender,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"data sender"} |
AVERAGE | |
Zabbix proxy: Utilization of availability manager processes is high | - |
avg(/Zabbix proxy health/zabbix[process,availability manager,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"availability manager"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,availability manager,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"availability manager"} |
AVERAGE | |
Zabbix proxy: Utilization of configuration syncer processes is high | - |
avg(/Zabbix proxy health/zabbix[process,configuration syncer,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"configuration syncer"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,configuration syncer,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"configuration syncer"} |
AVERAGE | |
Zabbix proxy: Utilization of discoverer processes is high | - |
avg(/Zabbix proxy health/zabbix[process,discoverer,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"discoverer"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,discoverer,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"discoverer"} |
AVERAGE | |
Zabbix proxy: Utilization of heartbeat sender processes is high | - |
avg(/Zabbix proxy health/zabbix[process,heartbeat sender,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"heartbeat sender"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,heartbeat sender,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"heartbeat sender"} |
AVERAGE | |
Zabbix proxy: Utilization of ODBC poller processes is high | - |
avg(/Zabbix proxy health/zabbix[process,odbc poller,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"ODBC poller"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,odbc poller,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"ODBC poller"} |
AVERAGE | |
Zabbix proxy: Utilization of history poller processes is high | - |
avg(/Zabbix proxy health/zabbix[process,history poller,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"history poller"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,history poller,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"history poller"} |
AVERAGE | |
Zabbix proxy: Utilization of history syncer processes is high | - |
avg(/Zabbix proxy health/zabbix[process,history syncer,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"history syncer"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,history syncer,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"history syncer"} |
AVERAGE | |
Zabbix proxy: Utilization of housekeeper processes is high | - |
avg(/Zabbix proxy health/zabbix[process,housekeeper,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"housekeeper"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,housekeeper,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"housekeeper"} |
AVERAGE | |
Zabbix proxy: Utilization of http poller processes is high | - |
avg(/Zabbix proxy health/zabbix[process,http poller,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"http poller"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,http poller,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"http poller"} |
AVERAGE | |
Zabbix proxy: Utilization of icmp pinger processes is high | - |
avg(/Zabbix proxy health/zabbix[process,icmp pinger,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"icmp pinger"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,icmp pinger,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"icmp pinger"} |
AVERAGE | |
Zabbix proxy: Utilization of ipmi manager processes is high | - |
avg(/Zabbix proxy health/zabbix[process,ipmi manager,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"ipmi manager"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,ipmi manager,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"ipmi manager"} |
AVERAGE | |
Zabbix proxy: Utilization of ipmi poller processes is high | - |
avg(/Zabbix proxy health/zabbix[process,ipmi poller,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"ipmi poller"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,ipmi poller,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"ipmi poller"} |
AVERAGE | |
Zabbix proxy: Utilization of java poller processes is high | - |
avg(/Zabbix proxy health/zabbix[process,java poller,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"java poller"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,java poller,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"java poller"} |
AVERAGE | |
Zabbix proxy: Utilization of poller processes is high | - |
avg(/Zabbix proxy health/zabbix[process,poller,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"poller"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,poller,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"poller"} |
AVERAGE | |
Zabbix proxy: Utilization of preprocessing worker processes is high | - |
avg(/Zabbix proxy health/zabbix[process,preprocessing worker,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"preprocessing worker"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,preprocessing worker,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"preprocessing worker"} |
AVERAGE | |
Zabbix proxy: Utilization of preprocessing manager processes is high | - |
avg(/Zabbix proxy health/zabbix[process,preprocessing manager,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"preprocessing manager"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,preprocessing manager,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"preprocessing manager"} |
AVERAGE | |
Zabbix proxy: Utilization of self-monitoring processes is high | - |
avg(/Zabbix proxy health/zabbix[process,self-monitoring,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"self-monitoring"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,self-monitoring,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"self-monitoring"} |
AVERAGE | |
Zabbix proxy: Utilization of snmp trapper processes is high | - |
avg(/Zabbix proxy health/zabbix[process,snmp trapper,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"snmp trapper"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,snmp trapper,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"snmp trapper"} |
AVERAGE | |
Zabbix proxy: Utilization of task manager processes is high | - |
avg(/Zabbix proxy health/zabbix[process,task manager,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"task manager"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,task manager,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"task manager"} |
AVERAGE | |
Zabbix proxy: Utilization of trapper processes is high | - |
avg(/Zabbix proxy health/zabbix[process,trapper,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"trapper"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,trapper,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"trapper"} |
AVERAGE | |
Zabbix proxy: Utilization of unreachable poller processes is high | - |
avg(/Zabbix proxy health/zabbix[process,unreachable poller,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"unreachable poller"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,unreachable poller,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"unreachable poller"} |
AVERAGE | |
Zabbix proxy: Utilization of vmware collector processes is high | - |
avg(/Zabbix proxy health/zabbix[process,vmware collector,avg,busy],10m)>{$ZABBIX.PROXY.UTIL.MAX:"vmware collector"} Recovery expression: avg(/Zabbix proxy health/zabbix[process,vmware collector,avg,busy],10m)<{$ZABBIX.PROXY.UTIL.MIN:"vmware collector"} |
AVERAGE | |
Zabbix proxy: More than {$ZABBIX.PROXY.UTIL.MAX}% used in the configuration cache | Consider increasing CacheSize in the zabbix_proxy.conf configuration file. |
max(/Zabbix proxy health/zabbix[rcache,buffer,pused],10m)>{$ZABBIX.PROXY.UTIL.MAX} |
AVERAGE | |
Zabbix proxy: Version has changed | Zabbix proxy version has changed. Acknowledge to close manually. |
last(/Zabbix proxy health/zabbix[version],#1)<>last(/Zabbix proxy health/zabbix[version],#2) and length(last(/Zabbix proxy health/zabbix[version]))>0 |
INFO | Manual close: YES |
Zabbix proxy: More than {$ZABBIX.PROXY.UTIL.MAX}% used in the vmware cache | Consider increasing VMwareCacheSize in the zabbix_proxy.conf configuration file. |
max(/Zabbix proxy health/zabbix[vmware,buffer,pused],10m)>{$ZABBIX.PROXY.UTIL.MAX} |
AVERAGE | |
Zabbix proxy: More than {$ZABBIX.PROXY.UTIL.MAX}% used in the history cache | Consider increasing HistoryCacheSize in the zabbix_proxy.conf configuration file. |
max(/Zabbix proxy health/zabbix[wcache,history,pused],10m)>{$ZABBIX.PROXY.UTIL.MAX} |
AVERAGE | |
Zabbix proxy: More than {$ZABBIX.PROXY.UTIL.MAX}% used in the history index cache | Consider increasing HistoryIndexCacheSize in the zabbix_proxy.conf configuration file. |
max(/Zabbix proxy health/zabbix[wcache,index,pused],10m)>{$ZABBIX.PROXY.UTIL.MAX} |
AVERAGE | |
Zabbix proxy: has been restarted | Uptime is less than 10 minutes. |
last(/Zabbix proxy health/zabbix[uptime])<10m |
INFO | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
Official JMX Template for WildFly server.
This template was tested on:
See Zabbix template operation for basic instructions.
Metrics are collected by JMX. This template works with standalone and domain instances.
Copy jboss-client.jar from /(wildfly,EAP,Jboss,AS)/bin/client into the directory /usr/share/zabbix-java-gateway/lib.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$WILDFLY.CONN.USAGE.WARN.MAX} | The maximum connection usage percent for trigger expression. |
80 |
{$WILDFLY.CONN.WAIT.MAX.WARN} | The maximum number of waiting connections for trigger expression. |
300 |
{$WILDFLY.DEPLOYMENT.MATCHES} | Filter of discoverable deployments |
.* |
{$WILDFLY.DEPLOYMENT.NOT_MATCHES} | Filter to exclude discovered deployments |
CHANGE_IF_NEEDED |
{$WILDFLY.JMX.PROTOCOL} | - |
remote+http |
{$WILDFLY.PASSWORD} | - |
zabbix |
{$WILDFLY.USER} | - |
zabbix |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Deployments discovery | Discovery of deployment metrics. The filter logic is sketched after this table. |
JMX | jmx.get[beans,"jboss.as.expr:deployment=*"] Filter: AND- {#DEPLOYMENT} MATCHESREGEX - {#DEPLOYMENT} NOTMATCHES_REGEX |
JDBC metrics discovery | - |
JMX | jmx.get[beans,"jboss.as:subsystem=datasources,data-source=*,statistics=jdbc"] |
Pools metrics discovery | - |
JMX | jmx.get[beans,"jboss.as:subsystem=datasources,data-source=*,statistics=pool"] |
Undertow metrics discovery | - |
JMX | jmx.get[beans,"jboss.as:subsystem=undertow,server=,http-listener="] |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
WildFly | WildFly: Launch type | The manner in which the server process was launched. Either "DOMAIN" for a domain mode server launched by a Host Controller, "STANDALONE" for a standalone server launched from the command line, or "EMBEDDED" for a standalone server launched as an embedded part of an application running in the same virtual machine. |
JMX | jmx["jboss.as:management-root=server","launchType"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly: Name | For standalone mode: The name of this server. If not set, defaults to the runtime value of InetAddress.getLocalHost().getHostName(). For domain mode: The name given to this domain |
JMX | jmx["jboss.as:management-root=server","name"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly: Process type | The type of process represented by this root resource. |
JMX | jmx["jboss.as:management-root=server","processType"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly: Runtime configuration state | The current persistent configuration state, one of starting, ok, reload-required, restart-required, stopping or stopped. |
JMX | jmx["jboss.as:management-root=server","runtimeConfigurationState"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly: Server controller state | The current state of the server controller; either STARTING, RUNNING, RESTART_REQUIRED, RELOAD_REQUIRED or STOPPING. |
JMX | jmx["jboss.as:management-root=server","serverState"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly: Version | The version of the WildFly Core based product release |
JMX | jmx["jboss.as:management-root=server","productVersion"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly: Uptime | WildFly server uptime. |
JMX | jmx["java.lang:type=Runtime","Uptime"] Preprocessing: - MULTIPLIER: |
WildFly | WildFly: Transactions: Total, rate | The total number of transactions (top-level and nested) created per second. |
JMX | jmx["jboss.as:subsystem=transactions","numberOfTransactions"] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly: Transactions: Aborted, rate | The number of aborted (i.e. rolled back) transactions per second. |
JMX | jmx["jboss.as:subsystem=transactions","numberOfAbortedTransactions"] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly: Transactions: Application rollbacks, rate | The number of transactions that have been rolled back by application request. This includes those that timeout, since the timeout behavior is considered an attribute of the application configuration. |
JMX | jmx["jboss.as:subsystem=transactions","numberOfApplicationRollbacks"] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly: Transactions: Committed, rate | The number of committed transactions |
JMX | jmx["jboss.as:subsystem=transactions","numberOfCommittedTransactions"] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly: Transactions: Heuristics, rate | The number of transactions which have terminated with heuristic outcomes. |
JMX | jmx["jboss.as:subsystem=transactions","numberOfHeuristics"] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly: Transactions: Current | The number of transactions that have begun but not yet terminated. |
JMX | jmx["jboss.as:subsystem=transactions","numberOfInflightTransactions"] |
WildFly | WildFly: Transactions: Nested, rate | The total number of nested (sub) transactions created per second. |
JMX | jmx["jboss.as:subsystem=transactions","numberOfNestedTransactions"] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly: Transactions: ResourceRollbacks, rate | The number of transactions that rolled back due to resource (participant) failure. |
JMX | jmx["jboss.as:subsystem=transactions","numberOfResourceRollbacks"] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly: Transactions: System rollbacks, rate | The number of transactions that have been rolled back due to internal system errors. |
JMX | jmx["jboss.as:subsystem=transactions","numberOfSystemRollbacks"] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly: Transactions: Timed out, rate | The number of transactions that have rolled back due to timeout. |
JMX | jmx["jboss.as:subsystem=transactions","numberOfTimedOutTransactions"] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly deployment [{#DEPLOYMENT}]: Status | The current runtime status of a deployment. Possible status modes are OK, FAILED, and STOPPED. FAILED indicates a dependency is missing or a service could not start. STOPPED indicates that the deployment was not enabled or was manually stopped. |
JMX | jmx["{#JMXOBJ}",status] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly deployment [{#DEPLOYMENT}]: Enabled | Boolean indicating whether the deployment content is currently deployed in the runtime (or should be deployed in the runtime the next time the server starts.) |
JMX | jmx["{#JMXOBJ}",enabled] Preprocessing: - BOOLTODECIMAL - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly deployment [{#DEPLOYMENT}]: Managed | Indicates if the deployment is managed (aka uses the ContentRepository). |
JMX | jmx["{#JMXOBJ}",managed] Preprocessing: - BOOLTODECIMAL - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly deployment [{#DEPLOYMENT}]: Persistent | Indicates if the deployment is persistent, i.e. whether its presence is recorded in the persistent server configuration. |
JMX | jmx["{#JMXOBJ}",persistent] Preprocessing: - BOOLTODECIMAL - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly deployment [{#DEPLOYMENT}]: Enabled time | The time when the deployment was last enabled. |
JMX | jmx["{#JMXOBJ}",enabledTime] Preprocessing: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly {#JMXDATASOURCE}: Cache access, rate | The number of times that the statement cache was accessed per second. |
JMX | jmx["{#JMXOBJ}",PreparedStatementCacheAccessCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: Cache add, rate | The number of statements added to the statement cache per second. |
JMX | jmx["{#JMXOBJ}",PreparedStatementCacheAddCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: Cache current size | The number of prepared and callable statements currently cached in the statement cache. |
JMX | jmx["{#JMXOBJ}",PreparedStatementCacheCurrentSize] |
WildFly | WildFly {#JMXDATASOURCE}: Cache delete, rate | The number of statements discarded from the cache per second. |
JMX | jmx["{#JMXOBJ}",PreparedStatementCacheDeleteCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: Cache hit, rate | The number of times that statements from the cache were used per second. |
JMX | jmx["{#JMXOBJ}",PreparedStatementCacheHitCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: Cache miss, rate | The number of times that a statement request could not be satisfied with a statement from the cache per second. |
JMX | jmx["{#JMXOBJ}",PreparedStatementCacheMissCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: Statistics enabled | Define whether runtime statistics are enabled or not. |
JMX | jmx["{#JMXOBJ}",statisticsEnabled] Preprocessing: - BOOLTODECIMAL - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Active | The number of open connections. |
JMX | jmx["{#JMXOBJ}",ActiveCount] |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Available | The number of available connections in the pool. |
JMX | jmx["{#JMXOBJ}",AvailableCount] |
WildFly | WildFly {#JMXDATASOURCE}: Blocking time, avg | The average time spent blocking while waiting for a connection from the pool. |
JMX | jmx["{#JMXOBJ}",AverageBlockingTime] |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Creating time, avg | The average time spent creating a physical connection. |
JMX | jmx["{#JMXOBJ}",AverageCreationTime] |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Get time, avg | The average time spent obtaining a physical connection. |
JMX | jmx["{#JMXOBJ}",AverageGetTime] |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Pool time, avg | The average time for a physical connection spent in the pool. |
JMX | jmx["{#JMXOBJ}",AveragePoolTime] |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Usage time, avg | The average time spent using a physical connection. |
JMX | jmx["{#JMXOBJ}",AverageUsageTime] |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Blocking failure, rate | The number of failures trying to obtain a physical connection per second. |
JMX | jmx["{#JMXOBJ}",BlockingFailureCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Created, rate | The number of physical connections created per second. |
JMX | jmx["{#JMXOBJ}",CreatedCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Destroyed, rate | The number of physical connections destroyed per second. |
JMX | jmx["{#JMXOBJ}",DestroyedCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Idle | The number of physical connections currently idle. |
JMX | jmx["{#JMXOBJ}",IdleCount] |
WildFly | WildFly {#JMXDATASOURCE}: Connections: In use | The number of physical connections currently in use. |
JMX | jmx["{#JMXOBJ}",InUseCount] |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Used, max | The maximum number of connections used. |
JMX | jmx["{#JMXOBJ}",MaxUsedCount] |
WildFly | WildFly {#JMXDATASOURCE}: Statistics enabled | Define whether runtime statistics are enabled or not. |
JMX | jmx["{#JMXOBJ}",statisticsEnabled] Preprocessing: - BOOLTODECIMAL - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Timed out, rate | The number of connections that timed out per second. |
JMX | jmx["{#JMXOBJ}",TimedOut] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: Connections: Wait | The number of requests that had to wait to obtain a physical connection. |
JMX | jmx["{#JMXOBJ}",WaitCount] |
WildFly | WildFly {#JMXDATASOURCE}: XA: Commit time, avg | The average time for a XAResource commit invocation. |
JMX | jmx["{#JMXOBJ}",XACommitAverageTime] |
WildFly | WildFly {#JMXDATASOURCE}: XA: Commit, rate | The number of XAResource commit invocations per second. |
JMX | jmx["{#JMXOBJ}",XACommitCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: XA: End time, avg | The average time for a XAResource end invocation. |
JMX | jmx["{#JMXOBJ}",XAEndAverageTime] |
WildFly | WildFly {#JMXDATASOURCE}: XA: End, rate | The number of XAResource end invocations per second. |
JMX | jmx["{#JMXOBJ}",XAEndCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: XA: Forget time, avg | The average time for a XAResource forget invocation. |
JMX | jmx["{#JMXOBJ}",XAForgetAverageTime] |
WildFly | WildFly {#JMXDATASOURCE}: XA: Forget, rate | The number of XAResource forget invocations per second. |
JMX | jmx["{#JMXOBJ}",XAForgetCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: XA: Prepare time, avg | The average time for a XAResource prepare invocation. |
JMX | jmx["{#JMXOBJ}",XAPrepareAverageTime] |
WildFly | WildFly {#JMXDATASOURCE}: XA: Prepare, rate | The number of XAResource prepare invocations per second. |
JMX | jmx["{#JMXOBJ}",XAPrepareCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: XA: Recover time, avg | The average time for a XAResource recover invocation. |
JMX | jmx["{#JMXOBJ}",XARecoverAverageTime] |
WildFly | WildFly {#JMXDATASOURCE}: XA: Recover, rate | The number of XAResource recover invocations per second. |
JMX | jmx["{#JMXOBJ}",XARecoverCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: XA: Rollback time, avg | The average time for a XAResource rollback invocation. |
JMX | jmx["{#JMXOBJ}",XARollbackAverageTime] |
WildFly | WildFly {#JMXDATASOURCE}: XA: Rollback, rate | The number of XAResource rollback invocations per second. |
JMX | jmx["{#JMXOBJ}",XARollbackCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly {#JMXDATASOURCE}: XA: Start time, avg | The average time for a XAResource start invocation. |
JMX | jmx["{#JMXOBJ}",XAStartAverageTime] |
WildFly | WildFly {#JMXDATASOURCE}: XA: Start, rate | The number of XAResource start invocations per second. |
JMX | jmx["{#JMXOBJ}",XAStartCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly listener {#HTTP_LISTENER}: Errors, rate | The number of 500 responses that have been sent by this listener per second. |
JMX | jmx["{#JMXOBJ}",errorCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly listener {#HTTP_LISTENER}: Requests, rate | The number of requests this listener has served per second. |
JMX | jmx["{#JMXOBJ}",requestCount] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly listener {#HTTP_LISTENER}: Bytes sent, rate | The number of bytes that have been sent out on this listener per second. |
JMX | jmx["{#JMXOBJ}",bytesSent] Preprocessing: - CHANGEPERSECOND |
WildFly | WildFly listener {#HTTP_LISTENER}: Bytes received, rate | The number of bytes that have been received by this listener per second. |
JMX | jmx["{#JMXOBJ}",bytesReceived] Preprocessing: - CHANGEPERSECOND |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
WildFly: Server needs to restart for configuration change. | - |
find(/WildFly Server by JMX/jmx["jboss.as:management-root=server","runtimeConfigurationState"],,"like","ok")=0 |
WARNING | |
WildFly: Server controller is not in RUNNING state | - |
find(/WildFly Server by JMX/jmx["jboss.as:management-root=server","serverState"],,"like","running")=0 |
WARNING | Depends on: - WildFly: Server needs to restart for configuration change. |
WildFly: Version has changed | WildFly version has changed. Ack to close. |
last(/WildFly Server by JMX/jmx["jboss.as:management-root=server","productVersion"],#1)<>last(/WildFly Server by JMX/jmx["jboss.as:management-root=server","productVersion"],#2) and length(last(/WildFly Server by JMX/jmx["jboss.as:management-root=server","productVersion"]))>0 |
INFO | Manual close: YES |
WildFly: has been restarted | Uptime is less than 10 minutes. |
last(/WildFly Server by JMX/jmx["java.lang:type=Runtime","Uptime"])<10m |
INFO | Manual close: YES |
WildFly: Failed to fetch info data | Zabbix has not received data for items for the last 15 minutes |
nodata(/WildFly Server by JMX/jmx["java.lang:type=Runtime","Uptime"],15m)=1 |
WARNING | |
WildFly deployment [{#DEPLOYMENT}]: Deployment status has changed | Deployment status has changed. Ack to close. |
last(/WildFly Server by JMX/jmx["{#JMXOBJ}",status],#1)<>last(/WildFly Server by JMX/jmx["{#JMXOBJ}",status],#2) and length(last(/WildFly Server by JMX/jmx["{#JMXOBJ}",status]))>0 |
WARNING | Manual close: YES |
WildFly {#JMXDATASOURCE}: JDBC monitoring statistic is not enabled | - |
last(/WildFly Server by JMX/jmx["{#JMXOBJ}",statisticsEnabled])=0 |
INFO | |
WildFly {#JMXDATASOURCE}: There are no active connections for 5m | - |
max(/WildFly Server by JMX/jmx["{#JMXOBJ}",ActiveCount],5m)=0 |
WARNING | |
WildFly {#JMXDATASOURCE}: Connection usage is too high | - |
min(/WildFly Server by JMX/jmx["{#JMXOBJ}",InUseCount],5m)/last(/WildFly Server by JMX/jmx["{#JMXOBJ}",AvailableCount])*100>{$WILDFLY.CONN.USAGE.WARN.MAX} |
HIGH | |
WildFly {#JMXDATASOURCE}: Pools monitoring statistic is not enabled | - |
last(/WildFly Server by JMX/jmx["{#JMXOBJ}",statisticsEnabled])=0 |
INFO | |
WildFly {#JMXDATASOURCE}: There are timeout connections | - |
last(/WildFly Server by JMX/jmx["{#JMXOBJ}",TimedOut])>0 |
WARNING | |
WildFly {#JMXDATASOURCE}: Too many waiting connections | - |
min(/WildFly Server by JMX/jmx["{#JMXOBJ}",WaitCount],5m)>{$WILDFLY.CONN.WAIT.MAX.WARN} |
WARNING | |
WildFly listener {#HTTP_LISTENER}: There are 500 responses by this listener. | - |
last(/WildFly Server by JMX/jmx["{#JMXOBJ}",errorCount])>0 |
WARNING |
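The connection-usage trigger above is simple arithmetic over the pool items. A sketch of the same calculation in plain code (the 80% threshold is an assumption standing in for {$WILDFLY.CONN.USAGE.WARN.MAX}):

```java
public class ConnUsageCheck {
    // Mirrors: min(InUseCount,5m) / last(AvailableCount) * 100 > {$WILDFLY.CONN.USAGE.WARN.MAX}
    static boolean connectionUsageTooHigh(long minInUse5m, long lastAvailable, double warnMaxPercent) {
        if (lastAvailable == 0) {
            return false; // nothing to compare against when the pool reports no connections
        }
        double usagePercent = (double) minInUse5m / lastAvailable * 100.0;
        return usagePercent > warnMaxPercent;
    }

    public static void main(String[] args) {
        // e.g. at least 18 of 20 connections in use over the last 5 minutes, 80% threshold
        System.out.println(connectionUsageTooHigh(18, 20, 80.0)); // true
    }
}
```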
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
Official JMX Template for WildFly Domain Controller.
This template was tested on:
See Zabbix template operation for basic instructions.
Metrics are collected by JMX. This template works with a Domain Controller.
In order to use this template, copy jboss-client.jar from /(wildfly,EAP,Jboss,AS)/bin/client
into the directory /usr/share/zabbix-java-gateway/lib and restart the Zabbix Java gateway.
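Before linking the template it can help to verify that the Java gateway host actually reaches the WildFly management endpoint over remote+http. A minimal sketch, assuming jboss-client.jar is on the classpath, a management user matching {$WILDFLY.USER}/{$WILDFLY.PASSWORD}, and the hypothetical host wildfly.example.com with the default management port 9990:

```java
import java.util.HashMap;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class WildFlyJmxCheck {
    public static void main(String[] args) throws Exception {
        // The remote+http protocol handler is provided by jboss-client.jar
        JMXServiceURL url = new JMXServiceURL("service:jmx:remote+http://wildfly.example.com:9990");
        Map<String, Object> env = new HashMap<>();
        // Credentials correspond to {$WILDFLY.USER} / {$WILDFLY.PASSWORD}
        env.put(JMXConnector.CREDENTIALS, new String[] {"zabbix", "zabbix"});
        try (JMXConnector connector = JMXConnectorFactory.connect(url, env)) {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            ObjectName root = new ObjectName("jboss.as:management-root=server");
            System.out.println("launchType = " + mbsc.getAttribute(root, "launchType"));
        }
    }
}
```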
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$WILDFLY.DEPLOYMENT.MATCHES} | Filter of discoverable deployments |
.* |
{$WILDFLY.DEPLOYMENT.NOT_MATCHES} | Filter to exclude discovered deployments |
CHANGE_IF_NEEDED |
{$WILDFLY.JMX.PROTOCOL} | - |
remote+http |
{$WILDFLY.PASSWORD} | - |
zabbix |
{$WILDFLY.SERVER.MATCHES} | Filter of discoverable servers |
.* |
{$WILDFLY.SERVER.NOT_MATCHES} | Filter to exclude discovered servers |
CHANGE_IF_NEEDED |
{$WILDFLY.USER} | - |
zabbix |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Deployments discovery | Discovery of deployments. |
JMX | jmx.get[beans,"jboss.as.expr:deployment=*,server-group=*"] Filter: AND- {#DEPLOYMENT} MATCHES_REGEX - {#DEPLOYMENT} NOT_MATCHES_REGEX |
Servers discovery | Discovery of server instances in the domain. |
JMX | jmx.get[beans,"jboss.as:host=master,server-config=*"] Filter: AND- {#SERVER} MATCHES_REGEX - {#SERVER} NOT_MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
WildFly | WildFly: Launch type | The manner in which the server process was launched. Either "DOMAIN" for a domain mode server launched by a Host Controller, "STANDALONE" for a standalone server launched from the command line, or "EMBEDDED" for a standalone server launched as an embedded part of an application running in the same virtual machine. |
JMX | jmx["jboss.as:management-root=server","launchType"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly: Name | For standalone mode: The name of this server. If not set, defaults to the runtime value of InetAddress.getLocalHost().getHostName(). For domain mode: The name given to this domain |
JMX | jmx["jboss.as:management-root=server","name"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly: Process type | The type of process represented by this root resource. |
JMX | jmx["jboss.as:management-root=server","processType"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly: Version | The version of the WildFly Core based product release |
JMX | jmx["jboss.as:management-root=server","productVersion"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly: Uptime | WildFly server uptime. |
JMX | jmx["java.lang:type=Runtime","Uptime"] Preprocessing: - MULTIPLIER: |
WildFly | WildFly deployment [{#DEPLOYMENT}]: Enabled | Boolean indicating whether the deployment content is currently deployed in the runtime (or should be deployed in the runtime the next time the server starts.) |
JMX | jmx["{#JMXOBJ}",enabled] Preprocessing: - BOOLTODECIMAL - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly deployment [{#DEPLOYMENT}]: Managed | Indicates if the deployment is managed (aka uses the ContentRepository). |
JMX | jmx["{#JMXOBJ}",managed] Preprocessing: - BOOLTODECIMAL - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly domain: Server {#SERVER}: Autostart | Whether or not this server should be started when the Host Controller starts. |
JMX | jmx["{#JMXOBJ}",autoStart] Preprocessing: - BOOLTODECIMAL - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly domain: Server {#SERVER}: Status | The current status of the server. |
JMX | jmx["{#JMXOBJ}",status] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
WildFly | WildFly domain: Server {#SERVER}: Server group | The name of a server group from the domain model. |
JMX | jmx["{#JMXOBJ}",group] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
WildFly: Version has changed | WildFly version has changed. Ack to close. |
last(/WildFly Domain by JMX/jmx["jboss.as:management-root=server","productVersion"],#1)<>last(/WildFly Domain by JMX/jmx["jboss.as:management-root=server","productVersion"],#2) and length(last(/WildFly Domain by JMX/jmx["jboss.as:management-root=server","productVersion"]))>0 |
INFO | Manual close: YES |
WildFly: has been restarted | Uptime is less than 10 minutes. |
last(/WildFly Domain by JMX/jmx["java.lang:type=Runtime","Uptime"])<10m |
INFO | Manual close: YES |
WildFly domain: Server {#SERVER}: Server status has changed | Server status has changed. Ack to close. |
last(/WildFly Domain by JMX/jmx["{#JMXOBJ}",status],#1)<>last(/WildFly Domain by JMX/jmx["{#JMXOBJ}",status],#2) and length(last(/WildFly Domain by JMX/jmx["{#JMXOBJ}",status]))>0 |
WARNING | Manual close: YES |
WildFly domain: Server {#SERVER}: Server group has changed | Server group has changed. Ack to close. |
last(/WildFly Domain by JMX/jmx["{#JMXOBJ}",group],#1)<>last(/WildFly Domain by JMX/jmx["{#JMXOBJ}",group],#2) and length(last(/WildFly Domain by JMX/jmx["{#JMXOBJ}",group]))>0 |
INFO | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor VMware vCenter and ESX hypervisor.
The "VMware Hypervisor" and "VMware Guest" templates are used by discovery and normally should not be manually linked to a host.
For additional information please check https://www.zabbix.com/documentation/6.2/manual/vm_monitoring
{$VMWARE.URL}
{$VMWARE.USERNAME}
{$VMWARE.PASSWORD}
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$VMWARE.PASSWORD} | VMware service {$VMWARE.USERNAME} user password |
`` |
{$VMWARE.URL} | VMware service (vCenter or ESX hypervisor) SDK URL (https://servername/sdk) |
`` |
{$VMWARE.USERNAME} | VMware service user name |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Discover VMware clusters | Discovery of clusters |
SIMPLE | vmware.cluster.discovery[{$VMWARE.URL}] |
Discover VMware datastores | - |
SIMPLE | vmware.datastore.discovery[{$VMWARE.URL}] |
Discover VMware hypervisors | Discovery of hypervisors. |
SIMPLE | vmware.hv.discovery[{$VMWARE.URL}] |
Discover VMware VMs FQDN | Discovery of guest virtual machines. |
SIMPLE | vmware.vm.discovery[{$VMWARE.URL}] Filter: AND- {#VM.DNS} NOT_MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
VMware | VMware: Event log | Collect VMware event log. See also: https://www.zabbix.com/documentation/6.2/manual/config/items/preprocessing/examples#filtering_vmware_event_log_records |
SIMPLE | vmware.eventlog[{$VMWARE.URL},skip] |
VMware | VMware: Full name | VMware service full name. |
SIMPLE | vmware.fullname[{$VMWARE.URL}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Version | VMware service version. |
SIMPLE | vmware.version[{$VMWARE.URL}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Status of "{#CLUSTER.NAME}" cluster | VMware cluster status. |
SIMPLE | vmware.cluster.status[{$VMWARE.URL},{#CLUSTER.NAME}] |
VMware | VMware: Average read latency of the datastore {#DATASTORE} | Amount of time for a read operation from the datastore (milliseconds). |
SIMPLE | vmware.datastore.read[{$VMWARE.URL},{#DATASTORE},latency] |
VMware | VMware: Free space on datastore {#DATASTORE} (percentage) | VMware datastore free space as a percentage of the total. |
SIMPLE | vmware.datastore.size[{$VMWARE.URL},{#DATASTORE},pfree] |
VMware | VMware: Total size of datastore {#DATASTORE} | VMware datastore space in bytes. |
SIMPLE | vmware.datastore.size[{$VMWARE.URL},{#DATASTORE}] |
VMware | VMware: Average write latency of the datastore {#DATASTORE} | Amount of time for a write operation to the datastore (milliseconds). |
SIMPLE | vmware.datastore.write[{$VMWARE.URL},{#DATASTORE},latency] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$VMWARE.PASSWORD} | VMware service {$VMWARE.USERNAME} user password |
`` |
{$VMWARE.URL} | VMware service (vCenter or ESX hypervisor) SDK URL (https://servername/sdk) |
`` |
{$VMWARE.USERNAME} | VMware service user name |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Disk device discovery | Discovery of all disk devices. |
SIMPLE | vmware.vm.vfs.dev.discovery[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
Mounted filesystem discovery | Discovery of all guest file systems. |
SIMPLE | vmware.vm.vfs.fs.discovery[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
Network device discovery | Discovery of all network devices. |
SIMPLE | vmware.vm.net.if.discovery[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
VMware | VMware: Cluster name | Cluster name of the guest VM. |
SIMPLE | vmware.vm.cluster.name[{$VMWARE.URL},{$VMWARE.VM.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Number of virtual CPUs | Number of virtual CPUs assigned to the guest. |
SIMPLE | vmware.vm.cpu.num[{$VMWARE.URL},{$VMWARE.VM.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: CPU ready | Time that the virtual machine was ready but could not get scheduled to run on the physical CPU during the last measurement interval (the VMware vCenter/ESXi Server performance counter sampling interval is 20 seconds); see the conversion sketch after this table. |
SIMPLE | vmware.vm.cpu.ready[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: CPU usage | Current upper-bound on CPU usage. The upper-bound is based on the host the virtual machine is currently running on, as well as limits configured on the virtual machine itself or any parent resource pool. Valid while the virtual machine is running. |
SIMPLE | vmware.vm.cpu.usage[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Datacenter name | Datacenter name of the guest VM. |
SIMPLE | vmware.vm.datacenter.name[{$VMWARE.URL},{$VMWARE.VM.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Hypervisor name | Hypervisor name of the guest VM. |
SIMPLE | vmware.vm.hv.name[{$VMWARE.URL},{$VMWARE.VM.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Ballooned memory | The amount of guest physical memory that is currently reclaimed through the balloon driver. |
SIMPLE | vmware.vm.memory.size.ballooned[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Compressed memory | The amount of memory currently in the compression cache for this VM. |
SIMPLE | vmware.vm.memory.size.compressed[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Private memory | Amount of memory backed by host memory and not being shared. |
SIMPLE | vmware.vm.memory.size.private[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Shared memory | The amount of guest physical memory shared through transparent page sharing. |
SIMPLE | vmware.vm.memory.size.shared[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Swapped memory | The amount of guest physical memory swapped out to the VM's swap device by ESX. |
SIMPLE | vmware.vm.memory.size.swapped[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Guest memory usage | The amount of guest physical memory that is being used by the VM. |
SIMPLE | vmware.vm.memory.size.usage.guest[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Host memory usage | The amount of host physical memory allocated to the VM, accounting for saving from memory sharing with other VMs. |
SIMPLE | vmware.vm.memory.size.usage.host[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Memory size | Total size of configured memory. |
SIMPLE | vmware.vm.memory.size[{$VMWARE.URL},{$VMWARE.VM.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Power state | The current power state of the virtual machine. |
SIMPLE | vmware.vm.powerstate[{$VMWARE.URL},{$VMWARE.VM.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Committed storage space | Total storage space, in bytes, committed to this virtual machine across all datastores. |
SIMPLE | vmware.vm.storage.committed[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Uncommitted storage space | Additional storage space, in bytes, potentially used by this virtual machine on all datastores. |
SIMPLE | vmware.vm.storage.uncommitted[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Unshared storage space | Total storage space, in bytes, occupied by the virtual machine across all datastores, that is not shared with any other virtual machine. |
SIMPLE | vmware.vm.storage.unshared[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Uptime | System uptime. |
SIMPLE | vmware.vm.uptime[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Guest memory swapped | Amount of guest physical memory that is swapped out to the swap space. |
SIMPLE | vmware.vm.guest.memory.size.swapped[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Host memory consumed | Amount of host physical memory consumed for backing guest physical memory pages. |
SIMPLE | vmware.vm.memory.size.consumed[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Host memory usage in percents | Percentage of host physical memory that has been consumed. |
SIMPLE | vmware.vm.memory.usage[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: CPU usage in percents | CPU usage as a percentage during the interval. |
SIMPLE | vmware.vm.cpu.usage.perf[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: CPU latency in percents | Percentage of time the virtual machine is unable to run because it is contending for access to the physical CPU(s). |
SIMPLE | vmware.vm.cpu.latency[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: CPU readiness latency in percents | Percentage of time that the virtual machine was ready, but could not get scheduled to run on the physical CPU. |
SIMPLE | vmware.vm.cpu.readiness[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: CPU swap-in latency in percents | Percentage of CPU time spent waiting for swap-in. |
SIMPLE | vmware.vm.cpu.swapwait[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Uptime of guest OS | Total time elapsed since the last operating system boot-up (in seconds). |
SIMPLE | vmware.vm.guest.osuptime[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Number of bytes received on interface {#IFDESC} | VMware virtual machine network interface input statistics (bytes per second). |
SIMPLE | vmware.vm.net.if.in[{$VMWARE.URL},{$VMWARE.VM.UUID},{#IFNAME},bps] |
VMware | VMware: Number of packets received on interface {#IFDESC} | VMware virtual machine network interface input statistics (packets per second). |
SIMPLE | vmware.vm.net.if.in[{$VMWARE.URL},{$VMWARE.VM.UUID},{#IFNAME},pps] |
VMware | VMware: Number of bytes transmitted on interface {#IFDESC} | VMware virtual machine network interface output statistics (bytes per second). |
SIMPLE | vmware.vm.net.if.out[{$VMWARE.URL},{$VMWARE.VM.UUID},{#IFNAME},bps] |
VMware | VMware: Number of packets transmitted on interface {#IFDESC} | VMware virtual machine network interface output statistics (packets per second). |
SIMPLE | vmware.vm.net.if.out[{$VMWARE.URL},{$VMWARE.VM.UUID},{#IFNAME},pps] |
VMware | VMware: Network utilization on interface {#IFDESC} | VMware virtual machine network utilization (combined transmit-rates and receive-rates) during the interval. |
SIMPLE | vmware.vm.net.if.usage[{$VMWARE.URL},{$VMWARE.VM.UUID},{#IFNAME}] Preprocessing: - MULTIPLIER: |
VMware | VMware: Average number of bytes read from the disk {#DISKDESC} | VMware virtual machine disk device read statistics (bytes per second). |
SIMPLE | vmware.vm.vfs.dev.read[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME},bps] |
VMware | VMware: Average number of reads from the disk {#DISKDESC} | VMware virtual machine disk device read statistics (operations per second). |
SIMPLE | vmware.vm.vfs.dev.read[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME},ops] |
VMware | VMware: Average number of bytes written to the disk {#DISKDESC} | VMware virtual machine disk device write statistics (bytes per second). |
SIMPLE | vmware.vm.vfs.dev.write[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME},bps] |
VMware | VMware: Average number of writes to the disk {#DISKDESC} | VMware virtual machine disk device write statistics (operations per second). |
SIMPLE | vmware.vm.vfs.dev.write[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME},ops] |
VMware | VMware: Average number of outstanding read requests to the disk {#DISKDESC} | Average number of outstanding read requests to the virtual disk during the collection interval. |
SIMPLE | vmware.vm.storage.readoio[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME}] |
VMware | VMware: Average number of outstanding write requests to the disk {#DISKDESC} | Average number of outstanding write requests to the virtual disk during the collection interval. |
SIMPLE | vmware.vm.storage.writeoio[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME}] |
VMware | VMware: Average write latency to the disk {#DISKDESC} | The average time a write to the virtual disk takes. |
SIMPLE | vmware.vm.storage.totalwritelatency[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME}] |
VMware | VMware: Average read latency to the disk {#DISKDESC} | The average time a read from the virtual disk takes. |
SIMPLE | vmware.vm.storage.totalreadlatency[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME}] |
VMware | VMware: Free disk space on {#FSNAME} | VMware virtual machine file system statistics (bytes). |
SIMPLE | vmware.vm.vfs.fs.size[{$VMWARE.URL},{$VMWARE.VM.UUID},{#FSNAME},free] |
VMware | VMware: Free disk space on {#FSNAME} (percentage) | VMware virtual machine file system statistics (percentages). |
SIMPLE | vmware.vm.vfs.fs.size[{$VMWARE.URL},{$VMWARE.VM.UUID},{#FSNAME},pfree] |
VMware | VMware: Total disk space on {#FSNAME} | VMware virtual machine total disk space (bytes). |
SIMPLE | vmware.vm.vfs.fs.size[{$VMWARE.URL},{$VMWARE.VM.UUID},{#FSNAME},total] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Used disk space on {#FSNAME} | VMware virtual machine used disk space (bytes). |
SIMPLE | vmware.vm.vfs.fs.size[{$VMWARE.URL},{$VMWARE.VM.UUID},{#FSNAME},used] |
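As noted for the CPU ready item above, the raw value is the ready time in milliseconds accumulated over a 20-second sampling interval; converting it to a percentage is a one-line calculation. A sketch, assuming the standard 20 000 ms interval:

```java
public class CpuReady {
    // CPU ready %: ready time (ms) divided by the 20 s (20000 ms) sampling interval
    static double cpuReadyPercent(double readyMs) {
        return readyMs / 20000.0 * 100.0;
    }

    public static void main(String[] args) {
        System.out.println(cpuReadyPercent(200.0)); // 200 ms ready in a 20 s window = 1.0%
    }
}
```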
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
VMware: VM has been restarted | Uptime is less than 10 minutes. |
last(/VMware Guest/vmware.vm.guest.osuptime[{$VMWARE.URL},{$VMWARE.VM.UUID}])<10m |
WARNING | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$VMWARE.PASSWORD} | VMware service {$VMWARE.USERNAME} user password |
`` |
{$VMWARE.URL} | VMware service (vCenter or ESX hypervisor) SDK URL (https://servername/sdk) |
`` |
{$VMWARE.USERNAME} | VMware service user name |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Datastore discovery | - |
SIMPLE | vmware.hv.datastore.discovery[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
Healthcheck discovery | VMware Rollup Health State sensor discovery |
DEPENDENT | vmware.hv.healthcheck.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
VMware | VMware: Hypervisor ping | Checks if the hypervisor is running and accepting ICMP pings. |
SIMPLE | icmpping[] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Cluster name | Cluster name of the hypervisor. |
SIMPLE | vmware.hv.cluster.name[{$VMWARE.URL},{$VMWARE.HV.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: CPU usage | Aggregated CPU usage across all cores on the host in Hz. This is only available if the host is connected. |
SIMPLE | vmware.hv.cpu.usage[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: CPU usage in percents | CPU usage as a percentage during the interval. |
SIMPLE | vmware.hv.cpu.usage.perf[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: CPU utilization | CPU usage as a percentage during the interval; the value depends on power management and hyper-threading (HT). |
SIMPLE | vmware.hv.cpu.utilization[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Power usage | Current power usage. |
SIMPLE | vmware.hv.power[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Power usage maximum allowed | Maximum allowed power usage. |
SIMPLE | vmware.hv.power[{$VMWARE.URL},{$VMWARE.HV.UUID},max] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Datacenter name | Datacenter name of the hypervisor. |
SIMPLE | vmware.hv.datacenter.name[{$VMWARE.URL},{$VMWARE.HV.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Full name | The complete product name, including the version information. |
SIMPLE | vmware.hv.fullname[{$VMWARE.URL},{$VMWARE.HV.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: CPU frequency | The speed of the CPU cores. This is an average value if there are multiple speeds. The product of CPU frequency and number of cores is approximately equal to the sum of the MHz for all the individual cores on the host. |
SIMPLE | vmware.hv.hw.cpu.freq[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: CPU model | The CPU model. |
SIMPLE | vmware.hv.hw.cpu.model[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: CPU cores | Number of physical CPU cores on the host. Physical CPU cores are the processors contained by a CPU package. |
SIMPLE | vmware.hv.hw.cpu.num[{$VMWARE.URL},{$VMWARE.HV.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: CPU threads | Number of physical CPU threads on the host. |
SIMPLE | vmware.hv.hw.cpu.threads[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Total memory | The physical memory size. |
SIMPLE | vmware.hv.hw.memory[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Model | The system model identification. |
SIMPLE | vmware.hv.hw.model[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Bios UUID | The hardware BIOS identification. |
SIMPLE | vmware.hv.hw.uuid[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Vendor | The hardware vendor identification. |
SIMPLE | vmware.hv.hw.vendor[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Ballooned memory | The amount of guest physical memory that is currently reclaimed through the balloon driver. Sum of all guest VMs. |
SIMPLE | vmware.hv.memory.size.ballooned[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Used memory | Physical memory usage on the host. |
SIMPLE | vmware.hv.memory.used[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Number of bytes received | VMware hypervisor network input statistics (bytes per second). |
SIMPLE | vmware.hv.network.in[{$VMWARE.URL},{$VMWARE.HV.UUID},bps] |
VMware | VMware: Number of bytes transmitted | VMware hypervisor network output statistics (bytes per second). |
SIMPLE | vmware.hv.network.out[{$VMWARE.URL},{$VMWARE.HV.UUID},bps] |
VMware | VMware: Overall status | The overall alarm status of the host: gray - unknown, green - ok, red - it has a problem, yellow - it might have a problem. |
SIMPLE | vmware.hv.status[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Uptime | System uptime. |
SIMPLE | vmware.hv.uptime[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Version | Dot-separated version string. |
SIMPLE | vmware.hv.version[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Number of guest VMs | Number of guest virtual machines. |
SIMPLE | vmware.hv.vm.num[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Average read latency of the datastore {#DATASTORE} | Average amount of time for a read operation from the datastore (milliseconds). |
SIMPLE | vmware.hv.datastore.read[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE},latency] |
VMware | VMware: Free space on datastore {#DATASTORE} (percentage) | VMware datastore free space as a percentage of the total. |
SIMPLE | vmware.hv.datastore.size[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE},pfree] |
VMware | VMware: Total size of datastore {#DATASTORE} | VMware datastore space in bytes. |
SIMPLE | vmware.hv.datastore.size[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE}] |
VMware | VMware: Average write latency of the datastore {#DATASTORE} | Average amount of time for a write operation to the datastore (milliseconds). |
SIMPLE | vmware.hv.datastore.write[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE},latency] |
VMware | VMware: Multipath count for datastore {#DATASTORE} | Number of available datastore paths. |
SIMPLE | vmware.hv.datastore.multipath[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE}] |
VMware | VMware: Health state rollup | The host health state rollup sensor value: gray - unknown, green - ok, red - it has a problem, yellow - it might have a problem. |
DEPENDENT | vmware.hv.sensor.health.state[{#SINGLETON}] Preprocessing: - JSONPATH: |
Zabbix raw items | VMware: Get sensors | Master item for sensors data. |
SIMPLE | vmware.hv.sensors.get[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
VMware: Hypervisor is down | The service is unavailable or does not accept ICMP ping. |
last(/VMware Hypervisor/icmpping[])=0 |
AVERAGE | Manual close: YES |
VMware: The {$VMWARE.HV.UUID} health is Red | One or more components in the appliance might be in an unusable status and the appliance might become unresponsive soon. Security patches might be available. |
last(/VMware Hypervisor/vmware.hv.status[{$VMWARE.URL},{$VMWARE.HV.UUID}])=3 |
HIGH | |
VMware: The {$VMWARE.HV.UUID} health is Yellow | One or more components in the appliance might become overloaded soon. |
last(/VMware Hypervisor/vmware.hv.status[{$VMWARE.URL},{$VMWARE.HV.UUID}])=2 |
AVERAGE | Depends on: - VMware: The {$VMWARE.HV.UUID} health is Red |
VMware: Hypervisor has been restarted | Uptime is less than 10 minutes. |
last(/VMware Hypervisor/vmware.hv.uptime[{$VMWARE.URL},{$VMWARE.HV.UUID}])<10m |
WARNING | Manual close: YES |
VMware: The multipath count has been changed | The number of available datastore paths is less than the registered number ({#MULTIPATH.COUNT}). |
last(/VMware Hypervisor/vmware.hv.datastore.multipath[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE}],#1)<>last(/VMware Hypervisor/vmware.hv.datastore.multipath[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE}],#2) and last(/VMware Hypervisor/vmware.hv.datastore.multipath[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE}])<{#MULTIPATH.COUNT} |
AVERAGE | Manual close: YES |
VMware: The {$VMWARE.HV.UUID} health is Red | One or more components in the appliance might be in an unusable status and the appliance might become unresponsive soon. Security patches might be available. |
last(/VMware Hypervisor/vmware.hv.sensor.health.state[{#SINGLETON}])="Red" |
HIGH | Depends on: - VMware: The {$VMWARE.HV.UUID} health is Red |
VMware: The {$VMWARE.HV.UUID} health is Yellow | One or more components in the appliance might become overloaded soon. |
last(/VMware Hypervisor/vmware.hv.sensor.health.state[{#SINGLETON}])="Yellow" |
AVERAGE | Depends on: - VMware: The {$VMWARE.HV.UUID} health is Red - VMware: The {$VMWARE.HV.UUID} health is Red - VMware: The {$VMWARE.HV.UUID} health is Yellow |
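The Red/Yellow triggers above compare against the numeric codes returned by vmware.hv.status (0 - gray, 1 - green, 2 - yellow, 3 - red). A small sketch of that mapping as the trigger expressions use it:

```java
public class HvStatus {
    // Numeric codes returned by vmware.hv.status, per the trigger expressions above
    static String statusName(int code) {
        switch (code) {
            case 0:  return "gray (unknown)";
            case 1:  return "green (ok)";
            case 2:  return "yellow (might have a problem)"; // AVERAGE trigger fires on =2
            case 3:  return "red (has a problem)";           // HIGH trigger fires on =3
            default: return "unexpected code " + code;
        }
    }

    public static void main(String[] args) {
        System.out.println(statusName(3)); // red (has a problem)
    }
}
```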
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
The template to monitor VMware vCenter and ESX hypervisor.
The "VMware Hypervisor" and "VMware Guest" templates are used by discovery and normally should not be manually linked to a host.
For additional information please check https://www.zabbix.com/documentation/6.2/manual/vm_monitoring
{$VMWARE.URL}
{$VMWARE.USERNAME}
{$VMWARE.PASSWORD}
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$VMWARE.PASSWORD} | VMware service {$VMWARE.USERNAME} user password |
`` |
{$VMWARE.URL} | VMware service (vCenter or ESX hypervisor) SDK URL (https://servername/sdk) |
`` |
{$VMWARE.USERNAME} | VMware service user name |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Discover VMware clusters | Discovery of clusters |
SIMPLE | vmware.cluster.discovery[{$VMWARE.URL}] |
Discover VMware datastores | - |
SIMPLE | vmware.datastore.discovery[{$VMWARE.URL}] |
Discover VMware hypervisors | Discovery of hypervisors. |
SIMPLE | vmware.hv.discovery[{$VMWARE.URL}] |
Discover VMware VMs | Discovery of guest virtual machines. |
SIMPLE | vmware.vm.discovery[{$VMWARE.URL}] |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
VMware | VMware: Event log | Collect VMware event log. See also: https://www.zabbix.com/documentation/6.2/manual/config/items/preprocessing/examples#filtering_vmware_event_log_records |
SIMPLE | vmware.eventlog[{$VMWARE.URL},skip] |
VMware | VMware: Full name | VMware service full name. |
SIMPLE | vmware.fullname[{$VMWARE.URL}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Version | VMware service version. |
SIMPLE | vmware.version[{$VMWARE.URL}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Status of "{#CLUSTER.NAME}" cluster | VMware cluster status. |
SIMPLE | vmware.cluster.status[{$VMWARE.URL},{#CLUSTER.NAME}] |
VMware | VMware: Average read latency of the datastore {#DATASTORE} | Amount of time for a read operation from the datastore (milliseconds). |
SIMPLE | vmware.datastore.read[{$VMWARE.URL},{#DATASTORE},latency] |
VMware | VMware: Free space on datastore {#DATASTORE} (percentage) | VMware datastore free space as a percentage of the total. |
SIMPLE | vmware.datastore.size[{$VMWARE.URL},{#DATASTORE},pfree] |
VMware | VMware: Total size of datastore {#DATASTORE} | VMware datastore space in bytes. |
SIMPLE | vmware.datastore.size[{$VMWARE.URL},{#DATASTORE}] |
VMware | VMware: Average write latency of the datastore {#DATASTORE} | Amount of time for a write operation to the datastore (milliseconds). |
SIMPLE | vmware.datastore.write[{$VMWARE.URL},{#DATASTORE},latency] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$VMWARE.PASSWORD} | VMware service {$USERNAME} user password |
`` |
{$VMWARE.URL} | VMware service (vCenter or ESX hypervisor) SDK URL (https://servername/sdk) |
`` |
{$VMWARE.USERNAME} | VMware service user name |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Disk device discovery | Discovery of all disk devices. |
SIMPLE | vmware.vm.vfs.dev.discovery[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
Mounted filesystem discovery | Discovery of all guest file systems. |
SIMPLE | vmware.vm.vfs.fs.discovery[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
Network device discovery | Discovery of all network devices. |
SIMPLE | vmware.vm.net.if.discovery[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
VMware | VMware: Cluster name | Cluster name of the guest VM. |
SIMPLE | vmware.vm.cluster.name[{$VMWARE.URL},{$VMWARE.VM.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Number of virtual CPUs | Number of virtual CPUs assigned to the guest. |
SIMPLE | vmware.vm.cpu.num[{$VMWARE.URL},{$VMWARE.VM.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: CPU ready | Time that the virtual machine was ready, but could not get scheduled to run on the physical CPU during last measurement interval (VMware vCenter/ESXi Server performance counter sampling interval - 20 seconds) |
SIMPLE | vmware.vm.cpu.ready[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: CPU usage | Current upper-bound on CPU usage. The upper-bound is based on the host the virtual machine is current running on, as well as limits configured on the virtual machine itself or any parent resource pool. Valid while the virtual machine is running. |
SIMPLE | vmware.vm.cpu.usage[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Datacenter name | Datacenter name of the guest VM. |
SIMPLE | vmware.vm.datacenter.name[{$VMWARE.URL},{$VMWARE.VM.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Hypervisor name | Hypervisor name of the guest VM. |
SIMPLE | vmware.vm.hv.name[{$VMWARE.URL},{$VMWARE.VM.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Ballooned memory | The amount of guest physical memory that is currently reclaimed through the balloon driver. |
SIMPLE | vmware.vm.memory.size.ballooned[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Compressed memory | The amount of memory currently in the compression cache for this VM. |
SIMPLE | vmware.vm.memory.size.compressed[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Private memory | Amount of memory backed by host memory and not being shared. |
SIMPLE | vmware.vm.memory.size.private[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Shared memory | The amount of guest physical memory shared through transparent page sharing. |
SIMPLE | vmware.vm.memory.size.shared[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Swapped memory | The amount of guest physical memory swapped out to the VM's swap device by ESX. |
SIMPLE | vmware.vm.memory.size.swapped[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Guest memory usage | The amount of guest physical memory that is being used by the VM. |
SIMPLE | vmware.vm.memory.size.usage.guest[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Host memory usage | The amount of host physical memory allocated to the VM, accounting for saving from memory sharing with other VMs. |
SIMPLE | vmware.vm.memory.size.usage.host[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Memory size | Total size of configured memory. |
SIMPLE | vmware.vm.memory.size[{$VMWARE.URL},{$VMWARE.VM.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Power state | The current power state of the virtual machine. |
SIMPLE | vmware.vm.powerstate[{$VMWARE.URL},{$VMWARE.VM.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Committed storage space | Total storage space, in bytes, committed to this virtual machine across all datastores. |
SIMPLE | vmware.vm.storage.committed[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Uncommitted storage space | Additional storage space, in bytes, potentially used by this virtual machine on all datastores. |
SIMPLE | vmware.vm.storage.uncommitted[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Unshared storage space | Total storage space, in bytes, occupied by the virtual machine across all datastores, that is not shared with any other virtual machine. |
SIMPLE | vmware.vm.storage.unshared[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Uptime | System uptime. |
SIMPLE | vmware.vm.uptime[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Guest memory swapped | Amount of guest physical memory that is swapped out to the swap space. |
SIMPLE | vmware.vm.guest.memory.size.swapped[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Host memory consumed | Amount of host physical memory consumed for backing up guest physical memory pages. |
SIMPLE | vmware.vm.memory.size.consumed[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Host memory usage in percents | Percentage of host physical memory that has been consumed. |
SIMPLE | vmware.vm.memory.usage[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: CPU usage in percents | CPU usage as a percentage during the interval. |
SIMPLE | vmware.vm.cpu.usage.perf[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: CPU latency in percents | Percentage of time the virtual machine is unable to run because it is contending for access to the physical CPU(s). |
SIMPLE | vmware.vm.cpu.latency[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: CPU readiness latency in percents | Percentage of time that the virtual machine was ready, but could not get scheduled to run on the physical CPU. |
SIMPLE | vmware.vm.cpu.readiness[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: CPU swap-in latency in percents | Percentage of CPU time spent waiting for swap-in. |
SIMPLE | vmware.vm.cpu.swapwait[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Uptime of guest OS | Total time elapsed since the last operating system boot-up (in seconds). |
SIMPLE | vmware.vm.guest.osuptime[{$VMWARE.URL},{$VMWARE.VM.UUID}] |
VMware | VMware: Number of bytes received on interface {#IFDESC} | VMware virtual machine network interface input statistics (bytes per second). |
SIMPLE | vmware.vm.net.if.in[{$VMWARE.URL},{$VMWARE.VM.UUID},{#IFNAME},bps] |
VMware | VMware: Number of packets received on interface {#IFDESC} | VMware virtual machine network interface input statistics (packets per second). |
SIMPLE | vmware.vm.net.if.in[{$VMWARE.URL},{$VMWARE.VM.UUID},{#IFNAME},pps] |
VMware | VMware: Number of bytes transmitted on interface {#IFDESC} | VMware virtual machine network interface output statistics (bytes per second). |
SIMPLE | vmware.vm.net.if.out[{$VMWARE.URL},{$VMWARE.VM.UUID},{#IFNAME},bps] |
VMware | VMware: Number of packets transmitted on interface {#IFDESC} | VMware virtual machine network interface output statistics (packets per second). |
SIMPLE | vmware.vm.net.if.out[{$VMWARE.URL},{$VMWARE.VM.UUID},{#IFNAME},pps] |
VMware | VMware: Network utilization on interface {#IFDESC} | VMware virtual machine network utilization (combined transmit-rates and receive-rates) during the interval. |
SIMPLE | vmware.vm.net.if.usage[{$VMWARE.URL},{$VMWARE.VM.UUID},{#IFNAME}] Preprocessing: - MULTIPLIER: |
VMware | VMware: Average number of bytes read from the disk {#DISKDESC} | VMware virtual machine disk device read statistics (bytes per second). |
SIMPLE | vmware.vm.vfs.dev.read[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME},bps] |
VMware | VMware: Average number of reads from the disk {#DISKDESC} | VMware virtual machine disk device read statistics (operations per second). |
SIMPLE | vmware.vm.vfs.dev.read[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME},ops] |
VMware | VMware: Average number of bytes written to the disk {#DISKDESC} | VMware virtual machine disk device write statistics (bytes per second). |
SIMPLE | vmware.vm.vfs.dev.write[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME},bps] |
VMware | VMware: Average number of writes to the disk {#DISKDESC} | VMware virtual machine disk device write statistics (operations per second). |
SIMPLE | vmware.vm.vfs.dev.write[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME},ops] |
VMware | VMware: Average number of outstanding read requests to the disk {#DISKDESC} | Average number of outstanding read requests to the virtual disk during the collection interval. |
SIMPLE | vmware.vm.storage.readoio[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME}] |
VMware | VMware: Average number of outstanding write requests to the disk {#DISKDESC} | Average number of outstanding write requests to the virtual disk during the collection interval. |
SIMPLE | vmware.vm.storage.writeoio[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME}] |
VMware | VMware: Average write latency to the disk {#DISKDESC} | The average time a write to the virtual disk takes. |
SIMPLE | vmware.vm.storage.totalwritelatency[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME}] |
VMware | VMware: Average read latency to the disk {#DISKDESC} | The average time a read from the virtual disk takes. |
SIMPLE | vmware.vm.storage.totalreadlatency[{$VMWARE.URL},{$VMWARE.VM.UUID},{#DISKNAME}] |
VMware | VMware: Free disk space on {#FSNAME} | VMware virtual machine file system statistics (bytes). |
SIMPLE | vmware.vm.vfs.fs.size[{$VMWARE.URL},{$VMWARE.VM.UUID},{#FSNAME},free] |
VMware | VMware: Free disk space on {#FSNAME} (percentage) | VMware virtual machine file system statistics (percentages). |
SIMPLE | vmware.vm.vfs.fs.size[{$VMWARE.URL},{$VMWARE.VM.UUID},{#FSNAME},pfree] |
VMware | VMware: Total disk space on {#FSNAME} | VMware virtual machine total disk space (bytes). |
SIMPLE | vmware.vm.vfs.fs.size[{$VMWARE.URL},{$VMWARE.VM.UUID},{#FSNAME},total] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Used disk space on {#FSNAME} | VMware virtual machine used disk space (bytes). |
SIMPLE | vmware.vm.vfs.fs.size[{$VMWARE.URL},{$VMWARE.VM.UUID},{#FSNAME},used] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
VMware: VM has been restarted | Uptime is less than 10 minutes. |
last(/VMware Guest/vmware.vm.guest.osuptime[{$VMWARE.URL},{$VMWARE.VM.UUID}])<10m |
WARNING | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$VMWARE.PASSWORD} | VMware service {$USERNAME} user password |
`` |
{$VMWARE.URL} | VMware service (vCenter or ESX hypervisor) SDK URL (https://servername/sdk) |
`` |
{$VMWARE.USERNAME} | VMware service user name |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Datastore discovery | - |
SIMPLE | vmware.hv.datastore.discovery[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
Healthcheck discovery | VMware Rollup Health State sensor discovery |
DEPENDENT | vmware.hv.healthcheck.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
VMware | VMware: Hypervisor ping | Checks if the hypervisor is running and accepting ICMP pings. |
SIMPLE | icmpping[] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Cluster name | Cluster name of the guest VM. |
SIMPLE | vmware.hv.cluster.name[{$VMWARE.URL},{$VMWARE.HV.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: CPU usage | Aggregated CPU usage across all cores on the host in Hz. This is only available if the host is connected. |
SIMPLE | vmware.hv.cpu.usage[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: CPU usage in percents | CPU usage as a percentage during the interval. |
SIMPLE | vmware.hv.cpu.usage.perf[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: CPU utilization | CPU usage as a percentage during the interval depends on power management or HT. |
SIMPLE | vmware.hv.cpu.utilization[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Power usage | Current power usage. |
SIMPLE | vmware.hv.power[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Power usage maximum allowed | Maximum allowed power usage. |
SIMPLE | vmware.hv.power[{$VMWARE.URL},{$VMWARE.HV.UUID},max] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Datacenter name | Datacenter name of the hypervisor. |
SIMPLE | vmware.hv.datacenter.name[{$VMWARE.URL},{$VMWARE.HV.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: Full name | The complete product name, including the version information. |
SIMPLE | vmware.hv.fullname[{$VMWARE.URL},{$VMWARE.HV.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: CPU frequency | The speed of the CPU cores. This is an average value if there are multiple speeds. The product of CPU frequency and number of cores is approximately equal to the sum of the MHz for all the individual cores on the host. |
SIMPLE | vmware.hv.hw.cpu.freq[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: CPU model | The CPU model. |
SIMPLE | vmware.hv.hw.cpu.model[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: CPU cores | Number of physical CPU cores on the host. Physical CPU cores are the processors contained by a CPU package. |
SIMPLE | vmware.hv.hw.cpu.num[{$VMWARE.URL},{$VMWARE.HV.UUID}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
VMware | VMware: CPU threads | Number of physical CPU threads on the host. |
SIMPLE | vmware.hv.hw.cpu.threads[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Total memory | The physical memory size. |
SIMPLE | vmware.hv.hw.memory[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Model | The system model identification. |
SIMPLE | vmware.hv.hw.model[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Bios UUID | The hardware BIOS identification. |
SIMPLE | vmware.hv.hw.uuid[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Vendor | The hardware vendor identification. |
SIMPLE | vmware.hv.hw.vendor[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Ballooned memory | The amount of guest physical memory that is currently reclaimed through the balloon driver. Sum of all guest VMs. |
SIMPLE | vmware.hv.memory.size.ballooned[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Used memory | Physical memory usage on the host. |
SIMPLE | vmware.hv.memory.used[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Number of bytes received | VMware hypervisor network input statistics (bytes per second). |
SIMPLE | vmware.hv.network.in[{$VMWARE.URL},{$VMWARE.HV.UUID},bps] |
VMware | VMware: Number of bytes transmitted | VMware hypervisor network output statistics (bytes per second). |
SIMPLE | vmware.hv.network.out[{$VMWARE.URL},{$VMWARE.HV.UUID},bps] |
VMware | VMware: Overall status | The overall alarm status of the host: gray - unknown, green - ok, red - it has a problem, yellow - it might have a problem. |
SIMPLE | vmware.hv.status[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Uptime | System uptime. |
SIMPLE | vmware.hv.uptime[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Version | Dot-separated version string. |
SIMPLE | vmware.hv.version[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Number of guest VMs | Number of guest virtual machines. |
SIMPLE | vmware.hv.vm.num[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
VMware | VMware: Average read latency of the datastore {#DATASTORE} | Average amount of time for a read operation from the datastore (milliseconds). |
SIMPLE | vmware.hv.datastore.read[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE},latency] |
VMware | VMware: Free space on datastore {#DATASTORE} (percentage) | VMware datastore free space as a percentage of the total. |
SIMPLE | vmware.hv.datastore.size[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE},pfree] |
VMware | VMware: Total size of datastore {#DATASTORE} | VMware datastore space in bytes. |
SIMPLE | vmware.hv.datastore.size[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE}] |
VMware | VMware: Average write latency of the datastore {#DATASTORE} | Average amount of time for a write operation to the datastore (milliseconds). |
SIMPLE | vmware.hv.datastore.write[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE},latency] |
VMware | VMware: Multipath count for datastore {#DATASTORE} | Number of available datastore paths. |
SIMPLE | vmware.hv.datastore.multipath[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE}] |
VMware | VMware: Health state rollup | The host health state rollup sensor value: gray - unknown, green - ok, red - it has a problem, yellow - it might have a problem. |
DEPENDENT | vmware.hv.sensor.health.state[{#SINGLETON}] Preprocessing: - JSONPATH: |
Zabbix raw items | VMware: Get sensors | Master item for sensors data. |
SIMPLE | vmware.hv.sensors.get[{$VMWARE.URL},{$VMWARE.HV.UUID}] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
VMware: Hypervisor is down | The service is unavailable or does not accept ICMP ping. |
last(/VMware Hypervisor/icmpping[])=0 |
AVERAGE | Manual close: YES |
VMware: The {$VMWARE.HV.UUID} health is Red | One or more components in the appliance might be in an unusable status and the appliance might become unresponsive soon. Security patches might be available. |
last(/VMware Hypervisor/vmware.hv.status[{$VMWARE.URL},{$VMWARE.HV.UUID}])=3 |
HIGH | |
VMware: The {$VMWARE.HV.UUID} health is Yellow | One or more components in the appliance might become overloaded soon. |
last(/VMware Hypervisor/vmware.hv.status[{$VMWARE.URL},{$VMWARE.HV.UUID}])=2 |
AVERAGE | Depends on: - VMware: The {$VMWARE.HV.UUID} health is Red |
VMware: Hypervisor has been restarted | Uptime is less than 10 minutes. |
last(/VMware Hypervisor/vmware.hv.uptime[{$VMWARE.URL},{$VMWARE.HV.UUID}])<10m |
WARNING | Manual close: YES |
VMware: The multipath count has been changed | The number of available datastore paths is less than the registered count ({#MULTIPATH.COUNT}); the trigger logic is sketched after this table. |
last(/VMware Hypervisor/vmware.hv.datastore.multipath[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE}],#1)<>last(/VMware Hypervisor/vmware.hv.datastore.multipath[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE}],#2) and last(/VMware Hypervisor/vmware.hv.datastore.multipath[{$VMWARE.URL},{$VMWARE.HV.UUID},{#DATASTORE}])<{#MULTIPATH.COUNT} |
AVERAGE | Manual close: YES |
VMware: The {$VMWARE.HV.UUID} health is Red | One or more components in the appliance might be in an unusable status and the appliance might become unresponsive soon. Security patches might be available. |
last(/VMware Hypervisor/vmware.hv.sensor.health.state[{#SINGLETON}])="Red" |
HIGH | Depends on: - VMware: The {$VMWARE.HV.UUID} health is Red |
VMware: The {$VMWARE.HV.UUID} health is Yellow | One or more components in the appliance might become overloaded soon. |
last(/VMware Hypervisor/vmware.hv.sensor.health.state[{#SINGLETON}])="Yellow" |
AVERAGE | Depends on: - VMware: The {$VMWARE.HV.UUID} health is Red - VMware: The {$VMWARE.HV.UUID} health is Red - VMware: The {$VMWARE.HV.UUID} health is Yellow |
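The multipath trigger above fires only when the path count changed between the two most recent values and the latest value is below the registered count {#MULTIPATH.COUNT}. A minimal sketch of that logic with hypothetical values (Zabbix evaluates the real expression server-side):

```python
# Sketch of the multipath trigger semantics; 'history' stands in for the last
# two collected values of vmware.hv.datastore.multipath[...].
def multipath_problem(history: list[int], registered: int) -> bool:
    """True when the path count changed and is now below the registered count."""
    last, previous = history[-1], history[-2]
    return last != previous and last < registered

print(multipath_problem([4, 2], registered=4))  # True: a path was lost, trigger fires
print(multipath_problem([4, 4], registered=4))  # False: count unchanged
```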
Please report any issues with the template at https://support.zabbix.com
This template is designed to monitor Veeam Backup Enterprise Manager. The Veeam Backup Enterprise Manager REST API lets Zabbix query information about Veeam Backup Enterprise Manager objects. The template works without any external scripts and uses the script item.
For Zabbix version: 6.2 and higher.
See Zabbix template operation for basic instructions.
Create a user with the Portal Administrator role.
> See the Veeam Help Center for more details.

Set the macros {$VEEAM.MANAGER.API.URL}, {$VEEAM.MANAGER.USER}, {$VEEAM.MANAGER.PASSWORD}; a minimal login sketch follows below.
No specific Zabbix configuration is required.
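Below is a minimal sketch of the login flow the script item performs, assuming the Enterprise Manager REST session endpoint /api/sessionMngr/?v=latest and the X-RestSvcSessionId header from Veeam's REST documentation; the URL and credentials map to the macros above and are placeholders here:

```python
# Hypothetical login against the Veeam Backup Enterprise Manager REST API;
# assumes the third-party requests library.
import requests

base = "https://localhost:9398"  # {$VEEAM.MANAGER.API.URL}
resp = requests.post(f"{base}/api/sessionMngr/?v=latest",
                     auth=("admin", "secret"),  # {$VEEAM.MANAGER.USER} / {$VEEAM.MANAGER.PASSWORD}
                     verify=False, timeout=10)  # lab use only; validate certificates
resp.raise_for_status()
session_id = resp.headers["X-RestSvcSessionId"]  # reused on subsequent calls

jobs = requests.get(f"{base}/api/jobs",
                    headers={"X-RestSvcSessionId": session_id},
                    verify=False, timeout=10)
print(jobs.status_code)
```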
Name | Description | Default |
---|---|---|
{$BACKUP.NAME.MATCHES} | This macro is used in backup discovery rule. |
.* |
{$BACKUP.NAME.NOT_MATCHES} | This macro is used in backup discovery rule. |
CHANGE_IF_NEEDED |
{$BACKUP.TYPE.MATCHES} | This macro is used in backup discovery rule. |
.* |
{$BACKUP.TYPE.NOT_MATCHES} | This macro is used in backup discovery rule. |
CHANGE_IF_NEEDED |
{$VEEAM.MANAGER.API.URL} | Veeam Backup Enterprise Manager API endpoint is a URL in the format: |
https://localhost:9398 |
{$VEEAM.MANAGER.DATA.TIMEOUT} | A response timeout for API. |
10 |
{$VEEAM.MANAGER.HTTP.PROXY} | Sets the HTTP proxy to |
`` |
{$VEEAM.MANAGER.JOB.MAX.FAIL} | The maximum score of failed jobs (for a trigger expression). |
5 |
{$VEEAM.MANAGER.JOB.MAX.WARN} | The maximum score of warning jobs (for a trigger expression). |
10 |
{$VEEAM.MANAGER.PASSWORD} | The |
`` |
{$VEEAM.MANAGER.USER} | The |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Backup Files discovery | Discovery of all backup files created on, or imported to the backup servers that are connected to Veeam Backup Enterprise Manager. |
DEPENDENT | veeam.backup.files.discovery Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: Filter: AND- {#TYPE} MATCHESREGEX - {#TYPE} NOTMATCHESREGEX - {#NAME} MATCHESREGEX - {#NAME} NOTMATCHESREGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Veeam | Veeam Manager: Get metrics | The result of API requests is expressed in JSON. |
SCRIPT | veeam.manager.get.metrics Expression: The text is too long. Please see the template. |
Veeam | Veeam Manager: Get errors | The errors from API requests. |
DEPENDENT | veeam.manager.get.errors Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Veeam | Veeam Manager: Running Jobs | Informs about the running jobs. |
DEPENDENT | veeam.manager.running.jobs Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Veeam | Veeam Manager: Scheduled Jobs | Informs about the scheduled jobs. |
DEPENDENT | veeam.manager.scheduled.jobs Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Veeam | Veeam Manager: Scheduled Backup Jobs | Informs about the scheduled backup jobs. |
DEPENDENT | veeam.manager.scheduled.backup.jobs Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Veeam | Veeam Manager: Scheduled Replica Jobs | Informs about the scheduled replica jobs. |
DEPENDENT | veeam.manager.scheduled.replica.jobs Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Veeam | Veeam Manager: Total Job Runs | Informs about the total job runs. |
DEPENDENT | veeam.manager.scheduled.total.jobs Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Veeam | Veeam Manager: Warnings Job Runs | Informs about the warning job runs. |
DEPENDENT | veeam.manager.warning.jobs Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Veeam | Veeam Manager: Failed Job Runs | Informs about the failed job runs. |
DEPENDENT | veeam.manager.failed.jobs Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Veeam | Veeam Manager: Backup Size [{#NAME}] | Gets the backup size with the name |
DEPENDENT | veeam.backup.file.size[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Veeam | Veeam Manager: Data Size [{#NAME}] | Gets the data size with the name |
DEPENDENT | veeam.backup.data.size[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Veeam | Veeam Manager: Compression ratio [{#NAME}] | Gets the data compression ratio with the name |
DEPENDENT | veeam.backup.compress.ratio[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Veeam | Veeam Manager: Deduplication Ratio [{#NAME}] | Gets the data deduplication ratio with the name |
DEPENDENT | veeam.backup.deduplication.ratio[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Veeam Manager: There are errors in requests to API | Zabbix has received errors in response to API requests. |
length(last(/Veeam Backup Enterprise Manager by HTTP/veeam.manager.get.errors))>0 |
AVERAGE | |
Veeam Manager: Warning job runs is too high | - |
last(/Veeam Backup Enterprise Manager by HTTP/veeam.manager.warning.jobs)>{$VEEAM.MANAGER.JOB.MAX.WARN} |
WARNING | Manual close: YES |
Veeam Manager: Failed job runs is too high | - |
last(/Veeam Backup Enterprise Manager by HTTP/veeam.manager.failed.jobs)>{$VEEAM.MANAGER.JOB.MAX.FAIL} |
AVERAGE | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
This template is designed to monitor Veeam Backup and Replication version 11.0. It works without any external scripts and uses the script item.
For Zabbix version: 6.2 and higher.
See Zabbix template operation for basic instructions.
Set the macros {$VEEAM.API.URL}, {$VEEAM.USER}, and {$VEEAM.PASSWORD}; a minimal token-request sketch follows below.
No specific Zabbix configuration is required.
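Below is a minimal sketch of the token request the script item issues, assuming the v11 REST API's /api/oauth2/token endpoint and the 1.0-rev1 x-api-version header from Veeam's documentation; the URL and credentials map to the macros above and are placeholders here:

```python
# Hypothetical OAuth2 password-grant login against the Veeam Backup and
# Replication v11 REST API; assumes the third-party requests library.
import requests

base = "https://localhost:9419"  # {$VEEAM.API.URL}
resp = requests.post(
    f"{base}/api/oauth2/token",
    headers={"x-api-version": "1.0-rev1"},
    data={"grant_type": "password", "username": "admin", "password": "secret"},
    verify=False, timeout=10)  # lab use only; validate certificates
resp.raise_for_status()
token = resp.json()["access_token"]

states = requests.get(f"{base}/api/v1/jobs/states",
                      headers={"x-api-version": "1.0-rev1",
                               "Authorization": f"Bearer {token}"},
                      verify=False, timeout=10)
print(states.status_code)
```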
Name | Description | Default |
---|---|---|
{$CREATED.AFTER} | Returns sessions created within the last specified number of days. |
7 |
{$JOB.NAME.MATCHES} | This macro is used in discovery rule to evaluate the states of jobs. |
.* |
{$JOB.NAME.NOT_MATCHES} | This macro is used in discovery rule to evaluate the states of jobs. |
CHANGE_IF_NEEDED |
{$JOB.STATUS.MATCHES} | This macro is used in discovery rule to evaluate the states of jobs. |
.* |
{$JOB.STATUS.NOT_MATCHES} | This macro is used in discovery rule to evaluate the states of jobs. |
CHANGE_IF_NEEDED |
{$JOB.TYPE.MATCHES} | This macro is used in discovery rule to evaluate the states of jobs. |
.* |
{$JOB.TYPE.NOT_MATCHES} | This macro is used in discovery rule to evaluate the states of jobs. |
CHANGE_IF_NEEDED |
{$PROXIES.NAME.MATCHES} | This macro is used in proxies discovery rule. |
.* |
{$PROXIES.NAME.NOT_MATCHES} | This macro is used in proxies discovery rule. |
CHANGE_IF_NEEDED |
{$PROXIES.TYPE.MATCHES} | This macro is used in proxies discovery rule. |
.* |
{$PROXIES.TYPE.NOT_MATCHES} | This macro is used in proxies discovery rule. |
CHANGE_IF_NEEDED |
{$REPOSITORIES.NAME.MATCHES} | This macro is used in repositories discovery rule. |
.* |
{$REPOSITORIES.NAME.NOT_MATCHES} | This macro is used in repositories discovery rule. |
CHANGE_IF_NEEDED |
{$REPOSITORIES.TYPE.MATCHES} | This macro is used in repositories discovery rule. |
.* |
{$REPOSITORIES.TYPE.NOT_MATCHES} | This macro is used in repositories discovery rule. |
CHANGE_IF_NEEDED |
{$SESSION.NAME.MATCHES} | This macro is used in discovery rule to evaluate sessions. |
.* |
{$SESSION.NAME.NOT_MATCHES} | This macro is used in discovery rule to evaluate sessions. |
CHANGE_IF_NEEDED |
{$SESSION.TYPE.MATCHES} | This macro is used in discovery rule to evaluate sessions. |
.* |
{$SESSION.TYPE.NOT_MATCHES} | This macro is used in discovery rule to evaluate sessions. |
CHANGE_IF_NEEDED |
{$VEEAM.API.URL} | The Veeam API endpoint is a URL in the format |
https://localhost:9419 |
{$VEEAM.DATA.TIMEOUT} | A response timeout for the API. |
10 |
{$VEEAM.HTTP.PROXY} | Sets the HTTP proxy to |
`` |
{$VEEAM.PASSWORD} | The |
`` |
{$VEEAM.USER} | The |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Jobs states discovery | Discovery of job states (the filter logic is sketched after this table). |
DEPENDENT | veeam.job.state.discovery Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: Filter: AND- {#TYPE} MATCHESREGEX - {#TYPE} NOTMATCHESREGEX - {#NAME} MATCHESREGEX - {#NAME} NOTMATCHESREGEX - {#JOB.STATUS} MATCHESREGEX - {#JOB.STATUS} NOTMATCHES_REGEX |
Proxies discovery | Discovery of proxies. |
DEPENDENT | veeam.proxies.discovery Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: Filter: AND- {#TYPE} MATCHESREGEX - {#TYPE} NOTMATCHESREGEX - {#NAME} MATCHESREGEX - {#NAME} NOTMATCHESREGEX |
Repositories discovery | Discovery of repositories. |
DEPENDENT | veeam.repositories.discovery Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: Filter: AND- {#TYPE} MATCHESREGEX - {#TYPE} NOTMATCHESREGEX - {#NAME} MATCHESREGEX - {#NAME} NOTMATCHESREGEX |
Sessions discovery | Discovery of sessions. |
DEPENDENT | veeam.sessions.discovery Preprocessing: - JSONPATH: - JAVASCRIPT - DISCARDUNCHANGEDHEARTBEAT: Filter: AND- {#TYPE} MATCHESREGEX - {#TYPE} NOTMATCHESREGEX - {#NAME} MATCHESREGEX - {#NAME} NOTMATCHESREGEX |
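All of the discovery rules above apply the same AND-combined filter: a discovered row is kept only if every LLD macro matches its MATCHES pattern and matches none of its NOT_MATCHES patterns. A small sketch of that logic with hypothetical values:

```python
# Sketch of the AND-combined LLD filter used by the discovery rules above.
import re

def keep(row: dict, filters: list[tuple[str, str, str]]) -> bool:
    """filters holds (lld_macro, matches_regex, not_matches_regex) triples."""
    return all(
        re.search(matches, row[macro]) and not re.search(not_matches, row[macro])
        for macro, matches, not_matches in filters
    )

row = {"{#NAME}": "Nightly backup", "{#TYPE}": "Backup"}  # hypothetical LLD row
filters = [("{#NAME}", r".*", r"CHANGE_IF_NEEDED"),  # {$JOB.NAME.MATCHES} / .NOT_MATCHES
           ("{#TYPE}", r".*", r"CHANGE_IF_NEEDED")]  # {$JOB.TYPE.MATCHES} / .NOT_MATCHES
print(keep(row, filters))  # True: the row survives the filter
```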
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Veeam | Veeam: Get metrics | The result of API requests is expressed in JSON. |
SCRIPT | veeam.get.metrics Expression: The text is too long. Please see the template. |
Veeam | Veeam: Get errors | The errors from API requests. |
DEPENDENT | veeam.get.errors Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Veeam | Veeam: Server [{#NAME}]: Get data | Gets raw data collected by the proxy server. |
DEPENDENT | veeam.proxy.server.raw[{#NAME}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Proxy [{#NAME}] [{#TYPE}]: Get data | Gets raw data collected by the proxy with the name |
DEPENDENT | veeam.proxy.raw[{#NAME}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Proxy [{#NAME}] [{#TYPE}]: Max Task Count | The maximum number of concurrent tasks. |
DEPENDENT | veeam.proxy.maxtask[{#NAME}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Proxy [{#NAME}] [{#TYPE}]: Host name | The name of the proxy server. |
DEPENDENT | veeam.proxy.server.name[{#NAME}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Proxy [{#NAME}] [{#TYPE}]: Host type | The type of the proxy server. |
DEPENDENT | veeam.proxy.server.type[{#NAME}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Repository [{#NAME}] [{#TYPE}]: Get data | Gets raw data from repository with the name: |
DEPENDENT | veeam.repositories.raw[{#NAME}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Repository [{#NAME}] [{#TYPE}]: Used space [{#PATH}] | Space used by repositories, expressed in gigabytes (GB). |
DEPENDENT | veeam.repository.capacity[{#NAME}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Repository [{#NAME}] [{#TYPE}]: Free space [{#PATH}] | Free space of repositories expressed in gigabytes (GB). |
DEPENDENT | veeam.repository.free.space[{#NAME}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Session [{#NAME}] [{#TYPE}]: Get data | Gets raw data from session with the name: |
DEPENDENT | veeam.sessions.raw[{#ID}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Veeam | Veeam: Session [{#NAME}] [{#TYPE}]: State | The state of the session. The enums used: |
DEPENDENT | veeam.sessions.state[{#ID}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Session [{#NAME}] [{#TYPE}]: Result | The result of the session. The enums used: |
DEPENDENT | veeam.sessions.result[{#ID}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Session [{#NAME}] [{#TYPE}]: Message | A message that explains the session result. |
DEPENDENT | veeam.sessions.message[{#ID}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Session progress percent [{#NAME}] [{#TYPE}] | The progress of the session expressed as a percentage. |
DEPENDENT | veeam.sessions.progress.percent[{#ID}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Job states [{#NAME}] [{#TYPE}]: Get data | Gets raw data from the job states with the name |
DEPENDENT | veeam.jobs.states.raw[{#ID}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Job states [{#NAME}] [{#TYPE}]: Status | The current status of the job. The enums used: |
DEPENDENT | veeam.jobs.status[{#ID}] Preprocessing: - JSONPATH: |
Veeam | Veeam: Job states [{#NAME}] [{#TYPE}]: Last result | The result of the session. The enums used: |
DEPENDENT | veeam.jobs.last.result[{#ID}] Preprocessing: - JSONPATH: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Veeam: There are errors in requests to API | Zabbix has received errors in response to API requests. |
length(last(/Veeam Backup and Replication by HTTP/veeam.get.errors))>0 |
AVERAGE | |
Veeam: Last result session failed | - |
find(/Veeam Backup and Replication by HTTP/veeam.sessions.result[{#ID}],,"like","Failed")=1 |
AVERAGE | Manual close: YES |
Veeam: Last result job failed | - |
find(/Veeam Backup and Replication by HTTP/veeam.jobs.last.result[{#ID}],,"like","Failed")=1 |
AVERAGE | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor HashiCorp Vault by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template Vault by HTTP collects metrics by the HTTP agent from the /sys/metrics API endpoint.
See https://www.vaultproject.io/api-docs/system/metrics.
This template was tested on:
See Zabbix template operation for basic instructions.
Configure the Vault API. See Vault Configuration.
Create a Vault service token and set it in the macro {$VAULT.TOKEN}.
No specific Zabbix configuration is required.
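A minimal sketch of the request the HTTP agent items perform against Vault's documented /sys/metrics endpoint; the scheme, host, port, and token map to {$VAULT.API.SCHEME}, {$VAULT.HOST}, {$VAULT.API.PORT}, and {$VAULT.TOKEN}, and the values below are placeholders:

```python
# Hypothetical pull of Vault telemetry in Prometheus text format;
# assumes the third-party requests library.
import requests

base = "http://vault.example.com:8200"  # {$VAULT.API.SCHEME}://{$VAULT.HOST}:{$VAULT.API.PORT}
resp = requests.get(f"{base}/v1/sys/metrics",
                    params={"format": "prometheus"},
                    headers={"X-Vault-Token": "s.xxxxxxxx"},  # {$VAULT.TOKEN}
                    timeout=10)
resp.raise_for_status()
print("\n".join(resp.text.splitlines()[:5]))  # first few exposition lines
```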
Name | Description | Default |
---|---|---|
{$VAULT.API.PORT} | Vault port. |
8200 |
{$VAULT.API.SCHEME} | Vault API scheme. |
http |
{$VAULT.HOST} | Vault host name. |
<PUT YOUR VAULT HOST> |
{$VAULT.LEADERSHIP.LOSSES.MAX.WARN} | Maximum number of Vault leadership losses. |
5 |
{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} | Maximum number of Vault leadership setup failures. |
5 |
{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} | Maximum number of Vault leadership step downs. |
5 |
{$VAULT.LLD.FILTER.STORAGE.MATCHES} | Filter of discoverable storage backends. |
.+ |
{$VAULT.OPEN.FDS.MAX.WARN} | Maximum percentage of used file descriptors for trigger expression. |
90 |
{$VAULT.TOKEN.ACCESSORS} | Vault accessors separated by spaces for monitoring token expiration time. |
`` |
{$VAULT.TOKEN.TTL.MIN.CRIT} | Token TTL critical threshold. |
3d |
{$VAULT.TOKEN.TTL.MIN.WARN} | Token TTL warning threshold. |
7d |
{$VAULT.TOKEN} | Vault auth token. |
<PUT YOUR AUTH TOKEN> |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Mountpoint metrics discovery | Mountpoint metrics discovery. |
DEPENDENT | vault.mountpoint.discovery |
Replication metrics discovery | Discovery for replication metrics. |
DEPENDENT | vault.replication.discovery |
Storage metrics discovery | Storage backend metrics discovery. |
DEPENDENT | vault.storage.discovery Filter: AND- {#STORAGE} MATCHES_REGEX |
Token metrics discovery | Tokens metrics discovery. |
DEPENDENT | vault.tokens.discovery |
WAL metrics discovery | Discovery for WAL metrics. |
DEPENDENT | vault.wal.discovery |
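Conceptually, the raw "check discovery" items further below feed these rules: a PROMETHEUS_TO_JSON step selects a family of counters and a JavaScript step reshapes the metric names into LLD macros. A sketch of that reshaping for the storage discovery, with hypothetical metric names and casing:

```python
# Sketch of deriving {#STORAGE}/{#OPERATION} LLD pairs from counter names
# such as vault_consul_get_count; names and casing are hypothetical.
import json
import re

metric_names = ["vault_consul_get_count", "vault_consul_put_count",
                "vault_raft_list_count"]

pattern = re.compile(r"^vault_(?P<storage>.+)_(?P<op>get|put|list|delete)_count$")
rows = [{"{#STORAGE}": m["storage"], "{#OPERATION}": m["op"].upper()}
        for m in (pattern.match(n) for n in metric_names) if m]
print(json.dumps(rows, indent=2))
```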
Group | Name | Description | Type | Key and additional info | |||
---|---|---|---|---|---|---|---|
Vault | Vault: Initialized | Initialization status. |
DEPENDENT | vault.health.initialized Preprocessing: - JSONPATH: ⛔️ONFAIL: - BOOLTODECIMAL - DISCARDUNCHANGED_HEARTBEAT: |
|||
Vault | Vault: Sealed | Seal status. |
DEPENDENT | vault.health.sealed Preprocessing: - JSONPATH: ⛔️ONFAIL: - BOOLTODECIMAL - DISCARDUNCHANGED_HEARTBEAT: |
|||
Vault | Vault: Standby | Standby status. |
DEPENDENT | vault.health.standby Preprocessing: - JSONPATH: ⛔️ONFAIL: - BOOLTODECIMAL - DISCARDUNCHANGED_HEARTBEAT: |
|||
Vault | Vault: Performance standby | Performance standby status. |
DEPENDENT | vault.health.performancestandby Preprocessing: - JSONPATH: ⛔️ON FAIL:DISCARD_VALUE -> - BOOLTODECIMAL - DISCARDUNCHANGEDHEARTBEAT: |
|||
Vault | Vault: Performance replication | Performance replication mode https://www.vaultproject.io/docs/enterprise/replication |
DEPENDENT | vault.health.replicationperformancemode Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
|||
Vault | Vault: Disaster Recovery replication | Disaster recovery replication mode https://www.vaultproject.io/docs/enterprise/replication |
DEPENDENT | vault.health.replicationdrmode Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
|||
Vault | Vault: Version | Server version. |
DEPENDENT | vault.health.version Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
|||
Vault | Vault: Healthcheck | Vault healthcheck. |
DEPENDENT | vault.health.check Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
|||
Vault | Vault: HA enabled | HA enabled status. |
DEPENDENT | vault.leader.haenabled Preprocessing: - JSONPATH: - BOOL TODECIMAL- DISCARD UNCHANGED_HEARTBEAT:1h |
|||
Vault | Vault: Is leader | Leader status. |
DEPENDENT | vault.leader.isself Preprocessing: - JSONPATH: - BOOL TODECIMAL- DISCARD UNCHANGED_HEARTBEAT:1h |
|||
Vault | Vault: Get metrics error | Get metrics error. |
DEPENDENT | vault.getmetrics.error Preprocessing: - JSONPATH: ⛔️ON FAIL:CUSTOM_VALUE -> - DISCARDUNCHANGEDHEARTBEAT: |
|||
Vault | Vault: Process CPU seconds, total | Total user and system CPU time spent in seconds. |
DEPENDENT | vault.metrics.process.cpu.seconds.total Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Open file descriptors, max | Maximum number of open file descriptors. |
DEPENDENT | vault.metrics.process.max.fds Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - DISCARDUNCHANGEDHEARTBEAT: |
|||
Vault | Vault: Open file descriptors, current | Number of open file descriptors. |
DEPENDENT | vault.metrics.process.open.fds Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Process resident memory | Resident memory size in bytes. |
DEPENDENT | vault.metrics.process.residentmemory.bytes Preprocessing: - PROMETHEUS PATTERN:process_resident_memory_bytes ⛔️ON_FAIL: |
|||
Vault | Vault: Uptime | Server uptime. |
DEPENDENT | vault.metrics.process.uptime Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - JAVASCRIPT: |
|||
Vault | Vault: Process virtual memory, current | Virtual memory size in bytes. |
DEPENDENT | vault.metrics.process.virtualmemory.bytes Preprocessing: - PROMETHEUS PATTERN:process_virtual_memory_bytes ⛔️ON_FAIL: |
|||
Vault | Vault: Process virtual memory, max | Maximum amount of virtual memory available in bytes. |
DEPENDENT | vault.metrics.process.virtualmemory.max.bytes Preprocessing: - PROMETHEUS PATTERN:process_virtual_memory_max_bytes ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
|||
Vault | Vault: Audit log requests, rate | Number of all audit log requests across all audit log devices. |
DEPENDENT | vault.metrics.audit.log.request.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Audit log request failures, rate | Number of audit log request failures. |
DEPENDENT | vault.metrics.audit.log.request.failure.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Audit log response, rate | Number of audit log responses across all audit log devices. |
DEPENDENT | vault.metrics.audit.log.response.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Audit log response failures, rate | Number of audit log response failures. |
DEPENDENT | vault.metrics.audit.log.response.failure.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Barrier DELETE ops, rate | Number of DELETE operations at the barrier. |
DEPENDENT | vault.metrics.barrier.delete.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Barrier GET ops, rate | Number of GET operations at the barrier. |
DEPENDENT | vault.metrics.vault.barrier.get.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Barrier LIST ops, rate | Number of LIST operations at the barrier. |
DEPENDENT | vault.metrics.barrier.list.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Barrier PUT ops, rate | Number of PUT operations at the barrier. |
DEPENDENT | vault.metrics.barrier.put.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Cache hit, rate | Number of times a value was retrieved from the LRU cache. |
DEPENDENT | vault.metrics.cache.hit.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Cache miss, rate | Number of times a value was not in the LRU cache. This results in a read from the configured storage. |
DEPENDENT | vault.metrics.cache.miss.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Cache write, rate | Number of times a value was written to the LRU cache. |
DEPENDENT | vault.metrics.cache.write.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Check token, rate | Number of token checks handled by Vault core. |
DEPENDENT | vault.metrics.core.check.token.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Fetch ACL and token, rate | Number of ACL and corresponding token entry fetches handled by Vault core. |
DEPENDENT | vault.metrics.core.fetch.aclandtoken Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Requests, rate | Number of requests handled by Vault core. |
DEPENDENT | vault.metrics.core.handle.request Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Leadership setup failed, counter | Cluster leadership setup failures which have occurred in a highly available Vault cluster. |
DEPENDENT | vault.metrics.core.leadership.setupfailed Preprocessing: - PROMETHEUS TOJSON:vault_core_leadership_setup_failed - JSONPATH: ⛔️ON FAIL:CUSTOM_VALUE -> 0 |
|||
Vault | Vault: Leadership setup lost, counter | Cluster leadership losses which have occurred in a highly available Vault cluster. |
DEPENDENT | vault.metrics.core.leadershiplost Preprocessing: - PROMETHEUS TOJSON:vault_core_leadership_lost_count - JSONPATH: ⛔️ON FAIL:CUSTOM_VALUE -> 0 |
|||
Vault | Vault: Post-unseal ops, counter | Duration of time taken by post-unseal operations handled by Vault core. |
DEPENDENT | vault.metrics.core.postunseal Preprocessing: - PROMETHEUS PATTERN:vault_core_post_unseal_count ⛔️ON_FAIL: |
|||
Vault | Vault: Pre-seal ops, counter | Duration of time taken by pre-seal operations. |
DEPENDENT | vault.metrics.core.preseal Preprocessing: - PROMETHEUS PATTERN:vault_core_pre_seal_count ⛔️ON_FAIL: |
|||
Vault | Vault: Requested seal ops, counter | Duration of time taken by requested seal operations. |
DEPENDENT | vault.metrics.core.sealwithrequest Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Seal ops, counter | Duration of time taken by seal operations. |
DEPENDENT | vault.metrics.core.seal Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Internal seal ops, counter | Duration of time taken by internal seal operations. |
DEPENDENT | vault.metrics.core.sealinternal Preprocessing: - PROMETHEUS PATTERN:vault_core_seal_internal_count ⛔️ON_FAIL: |
|||
Vault | Vault: Leadership step downs, counter | Cluster leadership step down. |
DEPENDENT | vault.metrics.core.stepdown Preprocessing: - PROMETHEUS TOJSON:vault_core_step_down_count - JSONPATH: ⛔️ON FAIL:CUSTOM_VALUE -> 0 |
|||
Vault | Vault: Unseal ops, counter | Duration of time taken by unseal operations. |
DEPENDENT | vault.metrics.core.unseal Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Fetch lease times, counter | Time taken to fetch lease times. |
DEPENDENT | vault.metrics.expire.fetch.lease.times Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Fetch lease times by token, counter | Time taken to fetch lease times by token. |
DEPENDENT | vault.metrics.expire.fetch.lease.times.bytoken Preprocessing: - PROMETHEUS PATTERN:vault_expire_fetch_lease_times_by_token_count ⛔️ON_FAIL: |
|||
Vault | Vault: Number of expiring leases | Number of all leases which are eligible for eventual expiry. |
DEPENDENT | vault.metrics.expire.numleases Preprocessing: - PROMETHEUS PATTERN:vault_expire_num_leases ⛔️ON_FAIL: |
|||
Vault | Vault: Expire revoke, count | Time taken to revoke a token. |
DEPENDENT | vault.metrics.expire.revoke Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Expire revoke force, count | Time taken to forcibly revoke a token. |
DEPENDENT | vault.metrics.expire.revoke.force Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Expire revoke prefix, count | Time taken to revoke tokens on a prefix. |
DEPENDENT | vault.metrics.expire.revoke.prefix Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Revoke secrets by token, count | Time taken to revoke all secrets issued with a given token. |
DEPENDENT | vault.metrics.expire.revoke.bytoken Preprocessing: - PROMETHEUS PATTERN:vault_expire_revoke_by_token_count ⛔️ON_FAIL: |
|||
Vault | Vault: Expire renew, count | Time taken to renew a lease. |
DEPENDENT | vault.metrics.expire.renew Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Renew token, count | Time taken to renew a token which does not need to invoke a logical backend. |
DEPENDENT | vault.metrics.expire.renewtoken Preprocessing: - PROMETHEUS PATTERN:vault_expire_renew_token_count ⛔️ON_FAIL: |
|||
Vault | Vault: Register ops, count | Time taken for register operations. |
DEPENDENT | vault.metrics.expire.register Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Register auth ops, count | Time taken for register authentication operations which create lease entries without lease ID. |
DEPENDENT | vault.metrics.expire.register.auth Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Policy GET ops, rate | Number of operations to get a policy. |
DEPENDENT | vault.metrics.policy.getpolicy.rate Preprocessing: - PROMETHEUS PATTERN:vault_policy_get_policy_count ⛔️ONFAIL: - CHANGEPER_SECOND |
|||
Vault | Vault: Policy LIST ops, rate | Number of operations to list policies. |
DEPENDENT | vault.metrics.policy.listpolicies.rate Preprocessing: - PROMETHEUS PATTERN:vault_policy_list_policies_count ⛔️ONFAIL: - CHANGEPER_SECOND |
|||
Vault | Vault: Policy DELETE ops, rate | Number of operations to delete a policy. |
DEPENDENT | vault.metrics.policy.deletepolicy.rate Preprocessing: - PROMETHEUS PATTERN:vault_policy_delete_policy_count ⛔️ONFAIL: - CHANGEPER_SECOND |
|||
Vault | Vault: Policy SET ops, rate | Number of operations to set a policy. |
DEPENDENT | vault.metrics.policy.setpolicy.rate Preprocessing: - PROMETHEUS PATTERN:vault_policy_set_policy_count ⛔️ONFAIL: - CHANGEPER_SECOND |
|||
Vault | Vault: Token create, count | The time taken to create a token. |
DEPENDENT | vault.metrics.token.create Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Token createAccessor, count | The time taken to create a token accessor. |
DEPENDENT | vault.metrics.token.createAccessor Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Token lookup, rate | Number of token lookups. |
DEPENDENT | vault.metrics.token.lookup.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Token revoke, count | Time taken to revoke a token. |
DEPENDENT | vault.metrics.token.revoke Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Token revoke tree, count | Time taken to revoke a token tree. |
DEPENDENT | vault.metrics.token.revoke.tree Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Token store, count | Time taken to store an updated token entry without writing to the secondary index. |
DEPENDENT | vault.metrics.token.store Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Runtime allocated bytes | Number of bytes allocated by the Vault process. This could burst from time to time, but should return to a steady state value. |
DEPENDENT | vault.metrics.runtime.alloc.bytes Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Runtime freed objects | Number of freed objects. |
DEPENDENT | vault.metrics.runtime.free.count Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Runtime heap objects | Number of objects on the heap. This is a good general memory pressure indicator worth establishing a baseline and thresholds for alerting. |
DEPENDENT | vault.metrics.runtime.heap.objects Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Runtime malloc count | Cumulative count of allocated heap objects. |
DEPENDENT | vault.metrics.runtime.malloc.count Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Runtime num goroutines | Number of goroutines. This serves as a general system load indicator worth establishing a baseline and thresholds for alerting. |
DEPENDENT | vault.metrics.runtime.numgoroutines Preprocessing: - PROMETHEUS PATTERN:vault_runtime_num_goroutines ⛔️ON_FAIL: |
|||
Vault | Vault: Runtime sys bytes | Number of bytes allocated to Vault. This includes what is being used by Vault's heap and what has been reclaimed but not given back to the operating system. |
DEPENDENT | vault.metrics.runtime.sys.bytes Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Runtime GC pause, total | The total garbage collector pause time since Vault was last started. |
DEPENDENT | vault.metrics.total.gc.pause Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - MULTIPLIER: |
|||
Vault | Vault: Runtime GC runs, total | Total number of garbage collection runs since Vault was last started. |
DEPENDENT | vault.metrics.runtime.total.gc.runs Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Token count, total | Total number of service tokens available for use; counts all un-expired and un-revoked tokens in Vault's token store. This measurement is performed every 10 minutes. |
DEPENDENT | vault.metrics.token Preprocessing: - PROMETHEUSTOJSON: - JSONPATH: ⛔️ON_FAIL: |
|||
Vault | Vault: Token count by auth, total | Total number of service tokens that were created by an auth method. |
DEPENDENT | vault.metrics.token.byauth Preprocessing: - PROMETHEUS TOJSON:vault_token_count_by_auth - JSONPATH: ⛔️ON FAIL:CUSTOM_VALUE -> 0 |
|||
Vault | Vault: Token count by policy, total | Total number of service tokens that have a policy attached. |
DEPENDENT | vault.metrics.token.bypolicy Preprocessing: - PROMETHEUS TOJSON:vault_token_count_by_policy - JSONPATH: ⛔️ON FAIL:CUSTOM_VALUE -> 0 |
|||
Vault | Vault: Token count by ttl, total | Number of service tokens, grouped by the TTL range they were assigned at creation. |
DEPENDENT | vault.metrics.token.byttl Preprocessing: - PROMETHEUS TOJSON:vault_token_count_by_ttl - JSONPATH: ⛔️ON FAIL:CUSTOM_VALUE -> 0 |
|||
Vault | Vault: Token creation, rate | Number of service or batch tokens created. |
DEPENDENT | vault.metrics.token.creation.rate Preprocessing: - PROMETHEUSTOJSON: - JSONPATH: ⛔️ONFAIL: - CHANGEPER_SECOND |
|||
Vault | Vault: Secret kv entries | Number of entries in each key-value secret engine. |
DEPENDENT | vault.metrics.secret.kv.count Preprocessing: - PROMETHEUSTOJSON: - JSONPATH: ⛔️ON_FAIL: |
|||
Vault | Vault: Token secret lease creation, rate | Counts the number of leases created by secret engines. |
DEPENDENT | vault.metrics.secret.lease.creation.rate Preprocessing: - PROMETHEUSTOJSON: - JSONPATH: ⛔️ONFAIL: - CHANGEPER_SECOND |
|||
Vault | Vault: Storage [{#STORAGE}] {#OPERATION} ops, rate | Number of {#OPERATION} operations against the {#STORAGE} storage backend. |
DEPENDENT | vault.metrics.storage.rate[{#STORAGE}, {#OPERATION}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Rollback attempt [{#MOUNTPOINT}] ops, rate | Number of rollback operations performed on the given mount point. |
DEPENDENT | vault.metrics.rollback.attempt.rate[{#MOUNTPOINT}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Route rollback [{#MOUNTPOINT}] ops, rate | Number of operations to dispatch a rollback operation to a backend, and for that backend to process it. Rollback operations are automatically scheduled to clean up partial errors. |
DEPENDENT | vault.metrics.route.rollback.rate[{#MOUNTPOINT}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
|||
Vault | Vault: Delete WALs, count{#SINGLETON} | Time taken to delete a Write Ahead Log (WAL). |
DEPENDENT | vault.metrics.wal.deletewals[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: GC deleted WAL{#SINGLETON} | Number of Write Ahead Logs (WAL) deleted during each garbage collection run. |
DEPENDENT | vault.metrics.wal.gc.deleted[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: WALs on disk, total{#SINGLETON} | Total number of Write Ahead Logs (WAL) on disk. |
DEPENDENT | vault.metrics.wal.gc.total[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Load WALs, count{#SINGLETON} | Time taken to load a Write Ahead Log (WAL). |
DEPENDENT | vault.metrics.wal.loadWAL[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Persist WALs, count{#SINGLETON} | Time taken to persist a Write Ahead Log (WAL). |
DEPENDENT | vault.metrics.wal.persistwals[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Flush ready WAL, count{#SINGLETON} | Time taken to flush a ready Write Ahead Log (WAL) to storage. |
DEPENDENT | vault.metrics.wal.flushready[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Stream WAL missing guard, count{#SINGLETON} | Number of incidents where the starting Merkle Tree index used to begin streaming WAL entries is not matched/found. |
DEPENDENT | vault.metrics.logshipper.streamWALs.missingguard[{#SINGLETON}] Preprocessing: - PROMETHEUS PATTERN:logshipper_streamWALs_missing_guard ⛔️ON_FAIL: |
|||
Vault | Vault: Stream WAL guard found, count{#SINGLETON} | Number of incidents where the starting Merkle Tree index used to begin streaming WAL entries is matched/found. |
DEPENDENT | vault.metrics.logshipper.streamWALs.guardfound[{#SINGLETON}] Preprocessing: - PROMETHEUS PATTERN:logshipper_streamWALs_guard_found ⛔️ON_FAIL: |
|||
Vault | Vault: Merkle commit index{#SINGLETON} | The last committed index in the Merkle Tree. |
DEPENDENT | vault.metrics.replication.merkle.commitindex[{#SINGLETON}] Preprocessing: - PROMETHEUS PATTERN:replication_merkle_commit_index ⛔️ON_FAIL: |
|||
Vault | Vault: Last WAL{#SINGLETON} | The index of the last WAL. |
DEPENDENT | vault.metrics.replication.wal.lastwal[{#SINGLETON}] Preprocessing: - PROMETHEUS PATTERN:replication_wal_last_wal ⛔️ON_FAIL: |
|||
Vault | Vault: Last DR WAL{#SINGLETON} | The index of the last DR WAL. |
DEPENDENT | vault.metrics.replication.wal.lastdrwal[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Last performance WAL{#SINGLETON} | The index of the last Performance WAL. |
DEPENDENT | vault.metrics.replication.wal.lastperformancewal[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Last remote WAL{#SINGLETON} | The index of the last remote WAL. |
DEPENDENT | vault.metrics.replication.fsm.lastremotewal[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
|||
Vault | Vault: Token [{#TOKEN_NAME}] error | Token lookup error text. |
DEPENDENT | vault.tokenviaaccessor.error["{#ACCESSOR}"] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
|||
Vault | Vault: Token [{#TOKEN_NAME}] has TTL | Whether the token has a TTL. |
DEPENDENT | vault.tokenviaaccessor.hasttl["{#ACCESSOR}"] Preprocessing: - JSONPATH: - BOOL TODECIMAL- DISCARD UNCHANGED_HEARTBEAT:1h |
|||
Vault | Vault: Token [{#TOKEN_NAME}] TTL | The TTL period of the token. |
DEPENDENT | vault.tokenviaaccessor.ttl["{#ACCESSOR}"] Preprocessing: - JSONPATH: |
|||
Zabbix raw items | Vault: Get health | - |
HTTP_AGENT | vault.gethealth Preprocessing: - CHECK NOTSUPPORTED⛔️ON FAIL:CUSTOM_VALUE -> {"healthcheck": 0} |
|||
Zabbix raw items | Vault: Get leader | - |
HTTP_AGENT | vault.getleader Preprocessing: - CHECK NOT_SUPPORTED |
|||
Zabbix raw items | Vault: Get metrics | - |
HTTP_AGENT | vault.getmetrics Preprocessing: - CHECK NOT_SUPPORTED |
|||
Zabbix raw items | Vault: Clear metrics | - |
DEPENDENT | vault.clearmetrics Preprocessing: - CHECK JSONERROR:$.errors ⛔️ON FAIL:DISCARD_VALUE -> |
|||
Zabbix raw items | Vault: Get tokens | Get information about tokens via their accessors. Accessors are defined in the macro "{$VAULT.TOKEN.ACCESSORS}". |
SCRIPT | vault.get_tokens Expression: The text is too long. Please see the template. |
|||
Zabbix raw items | Vault: Check WAL discovery | - |
DEPENDENT | vault.checkwaldiscovery Preprocessing: - PROMETHEUSTOJSON: ⛔️ONFAIL: - JAVASCRIPT: - DISCARDUNCHANGED_HEARTBEAT: |
|||
Zabbix raw items | Vault: Check replication discovery | - |
DEPENDENT | vault.checkreplicationdiscovery Preprocessing: - PROMETHEUSTOJSON: ⛔️ONFAIL: - JAVASCRIPT: - DISCARDUNCHANGED_HEARTBEAT: |
|||
Zabbix raw items | Vault: Check storage discovery | - |
DEPENDENT | vault.checkstoragediscovery Preprocessing: - PROMETHEUSTOJSON: `{name=~"^vault(?:.+)(?:get|put|list|delete)count$"}` ⛔️ONFAIL: DISCARD_VALUE -> - JAVASCRIPT: The text is too long. Please see the template. - DISCARDUNCHANGEDHEARTBEAT: 15m |
Zabbix raw items | Vault: Check mountpoint discovery | - |
DEPENDENT | vault.checkmountpointdiscovery Preprocessing: - PROMETHEUSTOJSON: ⛔️ONFAIL: - JAVASCRIPT: - DISCARDUNCHANGED_HEARTBEAT: |
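Many of the items above pair a PROMETHEUS_PATTERN step with CHANGE_PER_SECOND, i.e. they turn a monotonically increasing counter into a per-second rate. A sketch of that arithmetic on two hypothetical samples:

```python
# CHANGE_PER_SECOND semantics: (value_now - value_prev) / (time_now - time_prev).
# Samples are hypothetical (timestamp, value) readings of a counter such as
# an audit log request count.
def change_per_second(prev: tuple[float, float], now: tuple[float, float]) -> float:
    (t0, v0), (t1, v1) = prev, now
    return (v1 - v0) / (t1 - t0)

print(change_per_second((1000.0, 4200.0), (1060.0, 4500.0)))  # 5.0 per second
```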
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Vault: Vault server is sealed | https://www.vaultproject.io/docs/concepts/seal |
last(/HashiCorp Vault by HTTP/vault.health.sealed)=1 |
AVERAGE | |
Vault: Version has changed | Vault version has changed. Ack to close. |
last(/HashiCorp Vault by HTTP/vault.health.version,#1)<>last(/HashiCorp Vault by HTTP/vault.health.version,#2) and length(last(/HashiCorp Vault by HTTP/vault.health.version))>0 |
INFO | Manual close: YES |
Vault: Vault server is not responding | - |
last(/HashiCorp Vault by HTTP/vault.health.check)=0 |
HIGH | |
Vault: Failed to get metrics | - |
length(last(/HashiCorp Vault by HTTP/vault.get_metrics.error))>0 |
WARNING | Depends on: - Vault: Vault server is sealed |
Vault: Current number of open files is too high | - |
min(/HashiCorp Vault by HTTP/vault.metrics.process.open.fds,5m)/last(/HashiCorp Vault by HTTP/vault.metrics.process.max.fds)*100>{$VAULT.OPEN.FDS.MAX.WARN} |
WARNING | |
Vault: has been restarted | Uptime is less than 10 minutes. |
last(/HashiCorp Vault by HTTP/vault.metrics.process.uptime)<10m |
INFO | Manual close: YES |
Vault: High frequency of leadership setup failures | There have been more than {$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} Vault leadership setup failures in the past 1h. |
(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h))>{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} |
AVERAGE | |
Vault: High frequency of leadership losses | There have been more than {$VAULT.LEADERSHIP.LOSSES.MAX.WARN} Vault leadership losses in the past 1h. |
(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h))>{$VAULT.LEADERSHIP.LOSSES.MAX.WARN} |
AVERAGE | |
Vault: High frequency of leadership step downs | There have been more than {$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} Vault leadership step downs in the past 1h. |
(max(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h))>{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} |
AVERAGE | |
Vault: Token [{#TOKEN_NAME}] lookup error occurred | - |
length(last(/HashiCorp Vault by HTTP/vault.token_via_accessor.error["{#ACCESSOR}"]))>0 |
WARNING | Depends on: - Vault: Vault server is sealed |
Vault: Token [{#TOKEN_NAME}] will expire soon | - |
last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.CRIT} |
AVERAGE | |
Vault: Token [{#TOKEN_NAME}] will expire soon | - |
last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.WARN} |
WARNING | Depends on: - Vault: Token [{#TOKEN_NAME}] will expire soon |
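The token items and the two expiry triggers above rely on looking tokens up via the accessors listed in {$VAULT.TOKEN.ACCESSORS}. A minimal sketch using Vault's documented /v1/auth/token/lookup-accessor endpoint, with a hypothetical accessor and the default warning/critical thresholds:

```python
# Hypothetical accessor lookup mirroring the token TTL items and triggers;
# assumes the third-party requests library.
import requests

base = "http://vault.example.com:8200"
resp = requests.post(f"{base}/v1/auth/token/lookup-accessor",
                     headers={"X-Vault-Token": "s.xxxxxxxx"},
                     json={"accessor": "hbGdj..."},  # one entry from {$VAULT.TOKEN.ACCESSORS}
                     timeout=10)
resp.raise_for_status()
data = resp.json()["data"]

WARN, CRIT = 7 * 86400, 3 * 86400  # {$VAULT.TOKEN.TTL.MIN.WARN} / .CRIT in seconds
if data.get("expire_time"):        # the token actually has a TTL
    if data["ttl"] < CRIT:
        print("AVERAGE: token will expire soon")
    elif data["ttl"] < WARN:
        print("WARNING: token will expire soon")
```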
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher. Template for monitoring TrueNAS by SNMP.
This template was tested on:
See Zabbix template operation for basic instructions.
No specific Zabbix configuration is required.
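Everything in this template is read over SNMP, mostly from UCD-SNMP-MIB and FREENAS-MIB. As a quick out-of-band check that the agent answers, here is a minimal sketch assuming the third-party pysnmp library (4.x synchronous API) and an SNMPv2c community of public; the host is a placeholder:

```python
# Polls two UCD-SNMP-MIB values the template also uses:
# laLoad.1 (1-minute load average) and memTotalReal.0 (total real memory, kB).
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

error, status, index, var_binds = next(getCmd(
    SnmpEngine(),
    CommunityData("public"),
    UdpTransportTarget(("truenas.example.com", 161)),
    ContextData(),
    ObjectType(ObjectIdentity("1.3.6.1.4.1.2021.10.1.3.1")),  # laLoad.1
    ObjectType(ObjectIdentity("1.3.6.1.4.1.2021.4.5.0")),     # memTotalReal.0
))
if error:
    raise RuntimeError(error)
for oid, value in var_binds:
    print(oid.prettyPrint(), "=", value.prettyPrint())
```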
Name | Description | Default | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
{$CPU.UTIL.CRIT} | Threshold of CPU utilization for warning trigger in %. |
90 |
||||||||||||
{$DATASET.FREE.MIN.CRIT} | This macro is used for the trigger expression. It can be overridden on the host or linked template level. |
5G |
||||||||||||
{$DATASET.FREE.MIN.WARN} | This macro is used for the trigger expression. It can be overridden on the host or linked template level. |
5G |
||||||||||||
{$DATASET.NAME.MATCHES} | This macro is used in datasets discovery. Can be overridden on the host or linked template level |
.+ |
||||||||||||
{$DATASET.NAME.NOT_MATCHES} | This macro is used in datasets discovery. Can be overridden on the host or linked template level |
`^(boot|.+\.system(.+)?$)` |
||||||||||||
{$DATASET.PUSED.MAX.CRIT} | Threshold of used dataset space for average severity trigger in %. |
90 |
||||||||||||
{$DATASET.PUSED.MAX.WARN} | Threshold of used dataset space for warning trigger in %. |
80 |
||||||||||||
{$ICMPLOSSWARN} | Threshold of ICMP packets loss for warning trigger in %. |
20 |
||||||||||||
{$ICMPRESPONSETIME_WARN} | Threshold of average ICMP response time for warning trigger in seconds. |
0.15 |
||||||||||||
{$IF.ERRORS.WARN} | Threshold of error packets rate for warning trigger. Can be used with interface name as context. |
2 |
||||||||||||
{$IF.UTIL.MAX} | Threshold of interface bandwidth utilization for warning trigger in %. Can be used with interface name as context. |
90 |
||||||||||||
{$IFCONTROL} | Macro for operational state of the interface for link down trigger. Can be used with interface name as context. |
1 |
||||||||||||
{$LOADAVGPER_CPU.MAX.WARN} | Load per CPU considered sustainable. Tune if needed. |
1.5 |
||||||||||||
{$MEMORY.AVAILABLE.MIN} | Threshold of available memory for trigger in bytes. |
20M |
||||||||||||
{$MEMORY.UTIL.MAX} | Threshold of memory utilization for trigger in % |
90 |
||||||||||||
{$NET.IF.IFADMINSTATUS.MATCHES} | This macro is used in filters of network interfaces discovery rule. |
^.* |
||||||||||||
{$NET.IF.IFADMINSTATUS.NOT_MATCHES} | Ignore down(2) administrative status |
^2$ |
||||||||||||
{$NET.IF.IFALIAS.MATCHES} | This macro is used in filters of network interfaces discovery rule. |
.* |
||||||||||||
{$NET.IF.IFALIAS.NOT_MATCHES} | This macro is used in filters of network interfaces discovery rule. |
CHANGE_IF_NEEDED |
||||||||||||
{$NET.IF.IFDESCR.MATCHES} | This macro is used in filters of network interfaces discovery rule. |
.* |
||||||||||||
{$NET.IF.IFDESCR.NOT_MATCHES} | This macro is used in filters of network interfaces discovery rule. |
CHANGE_IF_NEEDED |
||||||||||||
{$NET.IF.IFNAME.NOT_MATCHES} | This macro is used in filters of network interfaces discovery rule. |
CHANGE_IF_NEEDED |
||||||||||||
{$NET.IF.IFOPERSTATUS.MATCHES} | This macro is used in filters of network interfaces discovery rule. |
^.*$ |
||||||||||||
{$NET.IF.IFOPERSTATUS.NOT_MATCHES} | Ignore notPresent(6) |
^6$ |
||||||||||||
{$NET.IF.IFTYPE.MATCHES} | This macro is used in filters of network interfaces discovery rule. |
.* |
||||||||||||
{$NET.IF.IFTYPE.NOT_MATCHES} | This macro is used in filters of network interfaces discovery rule. |
CHANGE_IF_NEEDED |
||||||||||||
{$SNMP.TIMEOUT} | The time interval for SNMP availability trigger. |
5m |
||||||||||||
{$SWAP.PFREE.MIN.WARN} | Threshold of free swap space for warning trigger in %. |
50 |
||||||||||||
{$TEMPERATURE.MAX.CRIT} | This macro is used for the trigger expression. It can be overridden on the host or linked template level. |
65 |
||||||||||||
{$TEMPERATURE.MAX.WARN} | This macro is used for the trigger expression. It can be overridden on the host or linked template level. |
50 |
||||||||||||
{$VFS.DEV.DEVNAME.MATCHES} | This macro is used in block devices discovery. Can be overridden on the host or linked template level |
.+ |
||||||||||||
{$VFS.DEV.DEVNAME.NOT_MATCHES} | This macro is used in block devices discovery. Can be overridden on the host or linked template level |
`^(loop[0-9]*|sd[a-z][0-9]+|nbd[0-9]+|sr[0-9]+|fd[0-9]+|dm-[0-9]+|ram[0-9]+|ploop[a-z0-9]+|md[0-9]*|hcp[0-9]*|cd[0-9]*|pass[0-9]*|zram[0-9]*)`
{$ZPOOL.FREE.MIN.CRIT} | This macro is used for the trigger expression. It can be overridden on the host or linked template level. |
5G |
||||||||||||
{$ZPOOL.FREE.MIN.WARN} | This macro is used for the trigger expression. It can be overridden on the host or linked template level. |
5G |
||||||||||||
{$ZPOOL.PUSED.MAX.CRIT} | Threshold of used pool space for average severity trigger in %. |
90 |
||||||||||||
{$ZPOOL.PUSED.MAX.WARN} | Threshold of used pool space for warning trigger in %. |
80 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Block devices discovery | Block devices are discovered from UCD-DISKIO-MIB::diskIOTable (http://net-snmp.sourceforge.net/docs/mibs/ucdDiskIOMIB.html#diskIOTable). |
SNMP | vfs.dev.discovery Filter: AND- {#DEVNAME} MATCHESREGEX - {#DEVNAME} NOTMATCHES_REGEX |
CPU discovery | This discovery creates a set of per-core CPU metrics from UCD-SNMP-MIB, using {#CPU.COUNT} in preprocessing; that is the only reason LLD is used. |
DEPENDENT | cpu.discovery Preprocessing: - JAVASCRIPT: |
Disks temperature discovery | Disks temperature discovery from FREENAS-MIB. |
SNMP | truenas.disk.temp.discovery Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces discovery | Discovering interfaces from IF-MIB. |
SNMP | net.if.discovery Filter: AND- {#IFADMINSTATUS} MATCHESREGEX - {#IFADMINSTATUS} NOTMATCHESREGEX - {#IFOPERSTATUS} MATCHESREGEX - {#IFOPERSTATUS} NOTMATCHESREGEX - {#IFNAME} MATCHESREGEX - {#IFNAME} NOTMATCHESREGEX - {#IFDESCR} MATCHESREGEX - {#IFDESCR} NOTMATCHESREGEX - {#IFALIAS} MATCHESREGEX - {#IFALIAS} NOTMATCHESREGEX - {#IFTYPE} MATCHESREGEX - {#IFTYPE} NOTMATCHESREGEX |
ZFS datasets discovery | ZFS datasets discovery from FREENAS-MIB. |
SNMP | truenas.zfs.dataset.discovery Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: Filter: AND- {#DATASETNAME} MATCHESREGEX - {#DATASETNAME} NOTMATCHES_REGEX |
ZFS pools discovery | ZFS pools discovery from FREENAS-MIB. |
SNMP | truenas.zfs.pools.discovery Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
ZFS volumes discovery | ZFS volumes discovery from FREENAS-MIB. |
SNMP | truenas.zfs.zvols.discovery Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
CPU | TrueNAS: Interrupts per second | MIB: UCD-SNMP-MIB Number of interrupts processed. |
SNMP | system.cpu.intr Preprocessing: - CHANGEPERSECOND |
CPU | TrueNAS: Context switches per second | MIB: UCD-SNMP-MIB Number of context switches. |
SNMP | system.cpu.switches Preprocessing: - CHANGEPERSECOND |
CPU | TrueNAS: Load average (1m avg) | MIB: UCD-SNMP-MIB The 1-minute load average. |
SNMP | system.cpu.load.avg1 |
CPU | TrueNAS: Load average (5m avg) | MIB: UCD-SNMP-MIB The 5-minute load average. |
SNMP | system.cpu.load.avg5 |
CPU | TrueNAS: Load average (15m avg) | MIB: UCD-SNMP-MIB The 15-minute load average. |
SNMP | system.cpu.load.avg15 |
CPU | TrueNAS: Number of CPUs | MIB: HOST-RESOURCES-MIB The number of CPU cores, counted from the cores discovered in hrProcessorTable using LLD. |
SNMP | system.cpu.num Preprocessing: - JAVASCRIPT: |
CPU | TrueNAS: CPU idle time | MIB: UCD-SNMP-MIB The time the CPU has spent doing nothing. |
SNMP | system.cpu.idle[{#SNMPINDEX}] |
CPU | TrueNAS: CPU system time | MIB: UCD-SNMP-MIB The time the CPU has spent running the kernel and its processes. |
SNMP | system.cpu.system[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - JAVASCRIPT: |
CPU | TrueNAS: CPU user time | MIB: UCD-SNMP-MIB The time the CPU has spent running users' processes that are not niced. |
SNMP | system.cpu.user[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - JAVASCRIPT: |
CPU | TrueNAS: CPU nice time | MIB: UCD-SNMP-MIB The time the CPU has spent running users' processes that have been niced. |
SNMP | system.cpu.nice[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - JAVASCRIPT: |
CPU | TrueNAS: CPU iowait time | MIB: UCD-SNMP-MIB The amount of time the CPU has been waiting for I/O to complete. |
SNMP | system.cpu.iowait[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - JAVASCRIPT: |
CPU | TrueNAS: CPU interrupt time | MIB: UCD-SNMP-MIB The amount of time the CPU has been servicing hardware interrupts. |
SNMP | system.cpu.interrupt[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - JAVASCRIPT: |
CPU | TrueNAS: CPU utilization | The CPU utilization expressed in %. |
DEPENDENT | system.cpu.util[{#SNMPINDEX}] Preprocessing: - JAVASCRIPT: |
General | TrueNAS: System contact details | MIB: SNMPv2-MIB The textual identification of the contact person for this managed node, together with information on how to contact this person. If no contact information is known, the value is the zero-length string. |
SNMP | system.contact Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
General | TrueNAS: System description | MIB: SNMPv2-MIB System description of the host. |
SNMP | system.descr Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
General | TrueNAS: System location | MIB: SNMPv2-MIB The physical location of this node. If the location is unknown, the value is the zero-length string. |
SNMP | system.location Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
General | TrueNAS: System name | MIB: SNMPv2-MIB The host name of the system. |
SNMP | system.name Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
General | TrueNAS: System object ID | MIB: SNMPv2-MIB The vendor's authoritative identification of the network management subsystem contained in the entity. This value is allocated within the SMI enterprises subtree (1.3.6.1.4.1) and provides an easy and unambiguous means for determining what kind of box is being managed. |
SNMP | system.objectid Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Memory | TrueNAS: Free memory | MIB: UCD-SNMP-MIB The amount of real/physical memory currently unused or available. |
SNMP | vm.memory.free Preprocessing: - MULTIPLIER: |
Memory | TrueNAS: Memory (buffers) | MIB: UCD-SNMP-MIB The total amount of real or virtual memory currently allocated for use as memory buffers. |
SNMP | vm.memory.buffers Preprocessing: - MULTIPLIER: |
Memory | TrueNAS: Memory (cached) | MIB: UCD-SNMP-MIB The total amount of real or virtual memory currently allocated for use as cached memory. |
SNMP | vm.memory.cached Preprocessing: - MULTIPLIER: |
Memory | TrueNAS: Total memory | MIB: UCD-SNMP-MIB The total memory expressed in Bytes. |
SNMP | vm.memory.total Preprocessing: - MULTIPLIER: |
Memory | TrueNAS: Available memory | Please note that memory utilization is a rough estimate, since memory available is calculated as free+buffers+cached, which is not 100% accurate, but the best we can get using SNMP. |
CALCULATED | vm.memory.available Expression: last(//vm.memory.free)+last(//vm.memory.buffers)+last(//vm.memory.cached) |
Memory | TrueNAS: Memory utilization | Please note that memory utilization is a rough estimate, since memory available is calculated as free+buffers+cached, which is not 100% accurate, but the best we can get using SNMP. |
CALCULATED | vm.memory.util Expression: (last(//vm.memory.total)-(last(//vm.memory.free)+last(//vm.memory.buffers)+last(//vm.memory.cached)))/last(//vm.memory.total)*100 |
Memory | TrueNAS: Total swap space | MIB: UCD-SNMP-MIB The total amount of swap space configured for this host. |
SNMP | system.swap.total Preprocessing: - MULTIPLIER: |
Memory | TrueNAS: Free swap space | MIB: UCD-SNMP-MIB The amount of swap space currently unused or available. |
SNMP | system.swap.free Preprocessing: - MULTIPLIER: |
Memory | TrueNAS: Free swap space in % | The free space of the swap volume/file expressed in %. |
CALCULATED | system.swap.pfree Expression: last(//system.swap.free)/last(//system.swap.total)*100 |
Network interfaces | TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Inbound packets discarded | MIB: IF-MIB The number of inbound packets which were chosen to be discarded even though no errors had been detected to prevent their being deliverable to a higher-layer protocol. One possible reason for discarding such a packet could be to free up buffer space. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.in.discards[{#SNMPINDEX}] Preprocessing: - CHANGE_PER_SECOND |
Network interfaces | TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Inbound packets with errors | MIB: IF-MIB For packet-oriented interfaces, the number of inbound packets that contained errors preventing them from being deliverable to a higher-layer protocol. For character-oriented or fixed-length interfaces, the number of inbound transmission units that contained errors preventing them from being deliverable to a higher-layer protocol. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.in.errors[{#SNMPINDEX}] Preprocessing: - CHANGE_PER_SECOND |
Network interfaces | TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Bits received | MIB: IF-MIB The total number of octets received on the interface, including framing characters. This object is a 64-bit version of ifInOctets. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.in[{#SNMPINDEX}] Preprocessing: - CHANGE_PER_SECOND |
Network interfaces | TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Outbound packets discarded | MIB: IF-MIB The number of outbound packets which were chosen to be discarded even though no errors had been detected to prevent their being deliverable to a higher-layer protocol. One possible reason for discarding such a packet could be to free up buffer space. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.out.discards[{#SNMPINDEX}] Preprocessing: - CHANGE_PER_SECOND |
Network interfaces | TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Outbound packets with errors | MIB: IF-MIB For packet-oriented interfaces, the number of outbound packets that contained errors preventing them from being deliverable to a higher-layer protocol. For character-oriented or fixed-length interfaces, the number of outbound transmission units that contained errors preventing them from being deliverable to a higher-layer protocol. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.out.errors[{#SNMPINDEX}] Preprocessing: - CHANGE_PER_SECOND |
Network interfaces | TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Bits sent | MIB: IF-MIB The total number of octets transmitted out of the interface, including framing characters. This object is a 64-bit version of ifOutOctets. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.out[{#SNMPINDEX}] Preprocessing: - CHANGE_PER_SECOND |
Network interfaces | TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Speed | MIB: IF-MIB An estimate of the interface's current bandwidth in units of 1,000,000 bits per second. If this object reports a value of n, then the speed of the interface is somewhere in the range of n-500,000 to n+499,999. For interfaces which do not vary in bandwidth or for those where no accurate estimation can be made, this object should contain the nominal bandwidth. |
SNMP | net.if.speed[{#SNMPINDEX}] Preprocessing: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Operational status | MIB: IF-MIB The current operational state of the interface. - The testing(3) state indicates that no operational packets can be passed - If ifAdminStatus is down(2) then ifOperStatus should be down(2) - If ifAdminStatus is changed to up(1) then ifOperStatus should change to up(1) if the interface is ready to transmit and receive network traffic - It should change to dormant(5) if the interface is waiting for external actions (such as a serial line waiting for an incoming connection) - It should remain in the down(2) state if and only if there is a fault that prevents it from going to the up(1) state - It should remain in the notPresent(6) state if the interface has missing (typically, hardware) components. |
SNMP | net.if.status[{#SNMPINDEX}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Interface type | MIB: IF-MIB The type of interface. Additional values for ifType are assigned by the Internet Assigned Numbers Authority (IANA), through updating the syntax of the IANAifType textual convention. |
SNMP | net.if.type[{#SNMPINDEX}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Status | TrueNAS: ICMP ping | Host accessibility by ICMP. 0 - ICMP ping fails. 1 - ICMP ping successful. |
SIMPLE | icmpping |
Status | TrueNAS: ICMP loss | Percentage of lost packets. |
SIMPLE | icmppingloss |
Status | TrueNAS: ICMP response time | ICMP ping response time (in seconds). |
SIMPLE | icmppingsec |
Status | TrueNAS: Uptime | MIB: SNMPv2-MIB The system uptime expressed in the following format: 'N days, hh:mm:ss'. |
SNMP | system.uptime Preprocessing: - MULTIPLIER: |
Status | TrueNAS: SNMP agent availability | Availability of SNMP checks on the host. The value of this item corresponds to availability icons in the host list. Possible values: 0 - not available, 1 - available, 2 - unknown. |
INTERNAL | zabbix[host,snmp,available] |
Storage | TrueNAS: [{#DEVNAME}]: Disk read rate | MIB: UCD-DISKIO-MIB The number of read accesses from this device since boot. |
SNMP | vfs.dev.read.rate[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Storage | TrueNAS: [{#DEVNAME}]: Disk write rate | MIB: UCD-DISKIO-MIB The number of write accesses from this device since boot. |
SNMP | vfs.dev.write.rate[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Storage | TrueNAS: [{#DEVNAME}]: Disk utilization | MIB: UCD-DISKIO-MIB The 1-minute average disk load (%). |
SNMP | vfs.dev.util[{#SNMPINDEX}] |
TrueNAS | TrueNAS: ARC size | MIB: FREENAS-MIB ARC size in bytes. |
SNMP | truenas.zfs.arc.size Preprocessing: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
TrueNAS | TrueNAS: ARC metadata size | MIB: FREENAS-MIB ARC metadata size used in bytes. |
SNMP | truenas.zfs.arc.meta Preprocessing: - MULTIPLIER: |
TrueNAS | TrueNAS: ARC data size | MIB: FREENAS-MIB ARC data size used in bytes. |
SNMP | truenas.zfs.arc.data Preprocessing: - MULTIPLIER: |
TrueNAS | TrueNAS: ARC hits | MIB: FREENAS-MIB Total number of ARC cache hits per second. |
SNMP | truenas.zfs.arc.hits Preprocessing: - CHANGE_PER_SECOND |
TrueNAS | TrueNAS: ARC misses | MIB: FREENAS-MIB Total number of ARC cache misses per second. |
SNMP | truenas.zfs.arc.misses Preprocessing: - CHANGE_PER_SECOND |
TrueNAS | TrueNAS: ARC target size of cache | MIB: FREENAS-MIB ARC target size of cache in bytes. |
SNMP | truenas.zfs.arc.c Preprocessing: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
TrueNAS | TrueNAS: ARC target size of MRU | MIB: FREENAS-MIB ARC target size of MRU in bytes. |
SNMP | truenas.zfs.arc.p Preprocessing: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
TrueNAS | TrueNAS: ARC cache hit ratio | MIB: FREENAS-MIB ARC cache hit ratio percentage. |
SNMP | truenas.zfs.arc.hit.ratio |
TrueNAS | TrueNAS: ARC cache miss ratio | MIB: FREENAS-MIB ARC cache miss ratio percentage. |
SNMP | truenas.zfs.arc.miss.ratio |
TrueNAS | TrueNAS: L2ARC hits | MIB: FREENAS-MIB Hits to the L2 cache per second. |
SNMP | truenas.zfs.l2arc.hits Preprocessing: - CHANGEPERSECOND |
TrueNAS | TrueNAS: L2ARC misses | MIB: FREENAS-MIB Misses to the L2 cache per second. |
SNMP | truenas.zfs.l2arc.misses Preprocessing: - CHANGEPERSECOND |
TrueNAS | TrueNAS: L2ARC read rate | MIB: FREENAS-MIB Read rate from L2 cache in bytes per second. |
SNMP | truenas.zfs.l2arc.read Preprocessing: - CHANGEPERSECOND |
TrueNAS | TrueNAS: L2ARC write rate | MIB: FREENAS-MIB Write rate from L2 cache in bytes per second. |
SNMP | truenas.zfs.l2arc.write Preprocessing: - CHANGEPERSECOND |
TrueNAS | TrueNAS: L2ARC size | MIB: FREENAS-MIB L2ARC size in bytes. |
SNMP | truenas.zfs.l2arc.size Preprocessing: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
TrueNAS | TrueNAS: ZIL operations 1 second | MIB: FREENAS-MIB The ops column parsed from the command zilstat 1 1. |
SNMP | truenas.zfs.zil.ops1 |
TrueNAS | TrueNAS: ZIL operations 5 seconds | MIB: FREENAS-MIB The ops column parsed from the command zilstat 5 1. |
SNMP | truenas.zfs.zil.ops5 |
TrueNAS | TrueNAS: ZIL operations 10 seconds | MIB: FREENAS-MIB The ops column parsed from the command zilstat 10 1. |
SNMP | truenas.zfs.zil.ops10 |
TrueNAS | TrueNAS: Pool [{#POOLNAME}]: Total space | MIB: FREENAS-MIB The size of the storage pool in bytes. |
SNMP | truenas.zpool.size.total[{#POOLNAME}] Preprocessing: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
TrueNAS | TrueNAS: Pool [{#POOLNAME}]: Used space | MIB: FREENAS-MIB The used size of the storage pool in bytes. |
SNMP | truenas.zpool.used[{#POOLNAME}] Preprocessing: - MULTIPLIER: |
TrueNAS | TrueNAS: Pool [{#POOLNAME}]: Available space | MIB: FREENAS-MIB The available size of the storage pool in bytes. |
SNMP | truenas.zpool.avail[{#POOLNAME}] Preprocessing: - MULTIPLIER: |
TrueNAS | TrueNAS: Pool [{#POOLNAME}]: Usage in % | The used size of the storage pool in %. |
CALCULATED | truenas.zpool.pused[{#POOLNAME}] Expression: last(//truenas.zpool.used[{#POOLNAME}]) * 100 / last(//truenas.zpool.size.total[{#POOLNAME}]) |
TrueNAS | TrueNAS: Pool [{#POOLNAME}]: Health | MIB: FREENAS-MIB The current health of the containing pool, as reported by zpool status. |
SNMP | truenas.zpool.health[{#POOLNAME}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
TrueNAS | TrueNAS: Pool [{#POOLNAME}]: Read operations rate | MIB: FREENAS-MIB The number of read I/O operations sent to the pool or device, including metadata requests (averaged since system booted). |
SNMP | truenas.zpool.read.ops[{#POOLNAME}] Preprocessing: - CHANGEPERSECOND |
TrueNAS | TrueNAS: Pool [{#POOLNAME}]: Write operations rate | MIB: FREENAS-MIB The number of write I/O operations sent to the pool or device (averaged since system booted). |
SNMP | truenas.zpool.write.ops[{#POOLNAME}] Preprocessing: - CHANGEPERSECOND |
TrueNAS | TrueNAS: Pool [{#POOLNAME}]: Read rate | MIB: FREENAS-MIB The bandwidth of all read operations (including metadata), expressed as units per second (averaged since system booted). |
SNMP | truenas.zpool.read.bytes[{#POOLNAME}] Preprocessing: - MULTIPLIER: - CHANGEPERSECOND |
TrueNAS | TrueNAS: Pool [{#POOLNAME}]: Write rate | MIB: FREENAS-MIB The bandwidth of all write operations, expressed as units per second (averaged since system booted). |
SNMP | truenas.zpool.write.bytes[{#POOLNAME}] Preprocessing: - MULTIPLIER: - CHANGEPERSECOND |
TrueNAS | TrueNAS: Dataset [{#DATASET_NAME}]: Total space | MIB: FREENAS-MIB The size of the dataset in bytes. |
SNMP | truenas.dataset.size.total[{#DATASET_NAME}] Preprocessing: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
TrueNAS | TrueNAS: Dataset [{#DATASET_NAME}]: Used space | MIB: FREENAS-MIB The used size of the dataset in bytes. |
SNMP | truenas.dataset.used[{#DATASET_NAME}] Preprocessing: - MULTIPLIER: |
TrueNAS | TrueNAS: Dataset [{#DATASET_NAME}]: Available space | MIB: FREENAS-MIB The available size of the dataset in bytes. |
SNMP | truenas.dataset.avail[{#DATASET_NAME}] Preprocessing: - MULTIPLIER: |
TrueNAS | TrueNAS: Dataset [{#DATASET_NAME}]: Usage in % | The used size of the dataset in %. |
CALCULATED | truenas.dataset.pused[{#DATASET_NAME}] Expression: last(//truenas.dataset.used[{#DATASET_NAME}]) * 100 / last(//truenas.dataset.size.total[{#DATASET_NAME}]) |
TrueNAS | TrueNAS: ZFS volume [{#ZVOL_NAME}]: Total space | MIB: FREENAS-MIB The size of the ZFS volume in bytes. |
SNMP | truenas.zvol.size.total[{#ZVOL_NAME}] Preprocessing: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
TrueNAS | TrueNAS: ZFS volume [{#ZVOL_NAME}]: Used space | MIB: FREENAS-MIB The used size of the ZFS volume in bytes. |
SNMP | truenas.zvol.used[{#ZVOL_NAME}] Preprocessing: - MULTIPLIER: |
TrueNAS | TrueNAS: ZFS volume [{#ZVOL_NAME}]: Available space | MIB: FREENAS-MIB The available size of the ZFS volume in bytes. |
SNMP | truenas.zvol.avail[{#ZVOL_NAME}] Preprocessing: - MULTIPLIER: |
TrueNAS | TrueNAS: Disk [{#DISK_NAME}]: Temperature | MIB: FREENAS-MIB The temperature of this HDD in m°C (thousandths of a degree Celsius). |
SNMP | truenas.disk.temp[{#DISK_NAME}] Preprocessing: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
TrueNAS: Load average is too high | Per CPU load average is too high. Your system may be slow to respond. |
min(/TrueNAS by SNMP/system.cpu.load.avg1,5m)/last(/TrueNAS by SNMP/system.cpu.num)>{$LOAD_AVG_PER_CPU.MAX.WARN} and last(/TrueNAS by SNMP/system.cpu.load.avg5)>0 and last(/TrueNAS by SNMP/system.cpu.load.avg15)>0 |
AVERAGE | |
TrueNAS: High CPU utilization | The CPU utilization is too high. The system might be slow to respond. |
min(/TrueNAS by SNMP/system.cpu.util[{#SNMPINDEX}],5m)>{$CPU.UTIL.CRIT} |
WARNING | Depends on: - TrueNAS: Load average is too high |
TrueNAS: System name has changed | The name of the system has changed. Ack to close the problem manually. |
last(/TrueNAS by SNMP/system.name,#1)<>last(/TrueNAS by SNMP/system.name,#2) and length(last(/TrueNAS by SNMP/system.name))>0 |
INFO | Manual close: YES |
TrueNAS: Lack of available memory | The system is running out of memory. |
min(/TrueNAS by SNMP/vm.memory.available,5m)<{$MEMORY.AVAILABLE.MIN} and last(/TrueNAS by SNMP/vm.memory.total)>0 |
AVERAGE | |
TrueNAS: High memory utilization | The system is running out of free memory. |
min(/TrueNAS by SNMP/vm.memory.util,5m)>{$MEMORY.UTIL.MAX} |
AVERAGE | Depends on: - TrueNAS: Lack of available memory |
TrueNAS: High swap space usage | This trigger is ignored, if there is no swap configured. |
min(/TrueNAS by SNMP/system.swap.pfree,5m)<{$SWAP.PFREE.MIN.WARN} and last(/TrueNAS by SNMP/system.swap.total)>0 |
WARNING | Depends on: - TrueNAS: High memory utilization - TrueNAS: Lack of available memory |
TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: High input error rate | Recovers when below 80% of {$IF.ERRORS.WARN:"{#IFNAME}"} threshold. |
min(/TrueNAS by SNMP/net.if.in.errors[{#SNMPINDEX}],5m)>{$IF.ERRORS.WARN:"{#IFNAME}"} Recovery expression: max(/TrueNAS by SNMP/net.if.in.errors[{#SNMPINDEX}],5m)<{$IF.ERRORS.WARN:"{#IFNAME}"}*0.8 |
WARNING | Depends on: - TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Link down |
TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: High inbound bandwidth usage | The network interface utilization is close to its estimated maximum bandwidth. |
(avg(/TrueNAS by SNMP/net.if.in[{#SNMPINDEX}],15m)>({$IF.UTIL.MAX:"{#IFNAME}"}/100)*last(/TrueNAS by SNMP/net.if.speed[{#SNMPINDEX}])) and last(/TrueNAS by SNMP/net.if.speed[{#SNMPINDEX}])>0 Recovery expression: avg(/TrueNAS by SNMP/net.if.in[{#SNMPINDEX}],15m)<(({$IF.UTIL.MAX:"{#IFNAME}"}-3)/100)*last(/TrueNAS by SNMP/net.if.speed[{#SNMPINDEX}]) |
WARNING | Depends on: - TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Link down |
TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: High output error rate | Recovers when below 80% of {$IF.ERRORS.WARN:"{#IFNAME}"} threshold. |
min(/TrueNAS by SNMP/net.if.out.errors[{#SNMPINDEX}],5m)>{$IF.ERRORS.WARN:"{#IFNAME}"} Recovery expression: max(/TrueNAS by SNMP/net.if.out.errors[{#SNMPINDEX}],5m)<{$IF.ERRORS.WARN:"{#IFNAME}"}*0.8 |
WARNING | Depends on: - TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Link down |
TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: High outbound bandwidth usage | The network interface utilization is close to its estimated maximum bandwidth. |
(avg(/TrueNAS by SNMP/net.if.out[{#SNMPINDEX}],15m)>({$IF.UTIL.MAX:"{#IFNAME}"}/100)*last(/TrueNAS by SNMP/net.if.speed[{#SNMPINDEX}])) and last(/TrueNAS by SNMP/net.if.speed[{#SNMPINDEX}])>0 Recovery expression: avg(/TrueNAS by SNMP/net.if.out[{#SNMPINDEX}],15m)<(({$IF.UTIL.MAX:"{#IFNAME}"}-3)/100)*last(/TrueNAS by SNMP/net.if.speed[{#SNMPINDEX}]) |
WARNING | Depends on: - TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Link down |
TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Ethernet has changed to lower speed than it was before | This Ethernet connection has transitioned down from its known maximum speed. This might be a sign of autonegotiation issues. Ack to close. |
change(/TrueNAS by SNMP/net.if.speed[{#SNMPINDEX}])<0 and last(/TrueNAS by SNMP/net.if.speed[{#SNMPINDEX}])>0 and ( last(/TrueNAS by SNMP/net.if.type[{#SNMPINDEX}])=6 or last(/TrueNAS by SNMP/net.if.type[{#SNMPINDEX}])=7 or last(/TrueNAS by SNMP/net.if.type[{#SNMPINDEX}])=11 or last(/TrueNAS by SNMP/net.if.type[{#SNMPINDEX}])=62 or last(/TrueNAS by SNMP/net.if.type[{#SNMPINDEX}])=69 or last(/TrueNAS by SNMP/net.if.type[{#SNMPINDEX}])=117 ) and (last(/TrueNAS by SNMP/net.if.status[{#SNMPINDEX}])<>2) Recovery expression: (change(/TrueNAS by SNMP/net.if.speed[{#SNMPINDEX}])>0 and last(/TrueNAS by SNMP/net.if.speed[{#SNMPINDEX}],#2)>0) or (last(/TrueNAS by SNMP/net.if.status[{#SNMPINDEX}])=2) |
INFO | Depends on: - TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Link down |
TrueNAS: Interface [{#IFNAME}({#IFALIAS})]: Link down | This trigger expression works as follows: 1. It can be triggered if the operational status is down. 2. {$IFCONTROL:"{#IFNAME}"}=1 - the user can redefine this context macro to 0, which marks the interface as not important; no new trigger will be fired if this interface is down. |
{$IFCONTROL:"{#IFNAME}"}=1 and (last(/TrueNAS by SNMP/net.if.status[{#SNMPINDEX}])=2) |
AVERAGE | |
TrueNAS: Unavailable by ICMP ping | Last three attempts returned timeout. Please check device connectivity. |
max(/TrueNAS by SNMP/icmpping,#3)=0 |
HIGH | |
TrueNAS: High ICMP ping loss | ICMP packets loss detected. |
min(/TrueNAS by SNMP/icmppingloss,5m)>{$ICMP_LOSS_WARN} and min(/TrueNAS by SNMP/icmppingloss,5m)<100 |
WARNING | Depends on: - TrueNAS: Unavailable by ICMP ping |
TrueNAS: High ICMP ping response time | Average ICMP response time is too high. |
avg(/TrueNAS by SNMP/icmppingsec,5m)>{$ICMP_RESPONSE_TIME_WARN} |
WARNING | Depends on: - TrueNAS: Unavailable by ICMP ping |
TrueNAS: Host has been restarted | The host uptime is less than 10 minutes. |
last(/TrueNAS by SNMP/system.uptime)<10m |
INFO | Manual close: YES |
TrueNAS: No SNMP data collection | SNMP is not available for polling. Please check device connectivity and SNMP settings. |
max(/TrueNAS by SNMP/zabbix[host,snmp,available],{$SNMP.TIMEOUT})=0 |
WARNING | Depends on: - TrueNAS: Unavailable by ICMP ping |
TrueNAS: Pool [{#POOLNAME}]: Very high space usage | Two conditions must be met: space utilization is above {$ZPOOL.PUSED.MAX.CRIT:"{#POOLNAME}"}%, and the pool free space is less than {$ZPOOL.FREE.MIN.CRIT:"{#POOLNAME}"}. |
min(/TrueNAS by SNMP/truenas.zpool.pused[{#POOLNAME}],5m) > {$ZPOOL.PUSED.MAX.CRIT:"{#POOLNAME}"} and last(/TrueNAS by SNMP/truenas.zpool.avail[{#POOLNAME}]) < {$ZPOOL.FREE.MIN.CRIT:"{#POOLNAME}"} |
AVERAGE | |
TrueNAS: Pool [{#POOLNAME}]: High space usage | Two conditions must be met: space utilization is above {$ZPOOL.PUSED.MAX.WARN:"{#POOLNAME}"}%, and the pool free space is less than {$ZPOOL.FREE.MIN.WARN:"{#POOLNAME}"}. |
min(/TrueNAS by SNMP/truenas.zpool.pused[{#POOLNAME}],5m) > {$ZPOOL.PUSED.MAX.WARN:"{#POOLNAME}"} and last(/TrueNAS by SNMP/truenas.zpool.avail[{#POOLNAME}]) < {$ZPOOL.FREE.MIN.WARN:"{#POOLNAME}"} |
WARNING | Depends on: - TrueNAS: Pool [{#POOLNAME}]: Very high space usage |
TrueNAS: Pool [{#POOLNAME}]: Status is not online | Please check pool status. |
last(/TrueNAS by SNMP/truenas.zpool.health[{#POOLNAME}]) <> 0 |
AVERAGE | |
TrueNAS: Dataset [{#DATASET_NAME}]: Very high space usage | Two conditions must be met: space utilization is above {$DATASET.PUSED.MAX.CRIT:"{#DATASET_NAME}"}%, and the dataset free space is less than {$DATASET.FREE.MIN.CRIT:"{#POOLNAME}"}. |
min(/TrueNAS by SNMP/truenas.dataset.pused[{#DATASET_NAME}],5m) > {$DATASET.PUSED.MAX.CRIT:"{#DATASET_NAME}"} and last(/TrueNAS by SNMP/truenas.dataset.avail[{#DATASET_NAME}]) < {$DATASET.FREE.MIN.CRIT:"{#POOLNAME}"} |
AVERAGE | |
TrueNAS: Dataset [{#DATASET_NAME}]: High space usage | Two conditions must be met: space utilization is above {$DATASET.PUSED.MAX.WARN:"{#DATASET_NAME}"}%, and the dataset free space is less than {$DATASET.FREE.MIN.WARN:"{#POOLNAME}"}. |
min(/TrueNAS by SNMP/truenas.dataset.pused[{#DATASET_NAME}],5m) > {$DATASET.PUSED.MAX.WARN:"{#DATASET_NAME}"} and last(/TrueNAS by SNMP/truenas.dataset.avail[{#DATASET_NAME}]) < {$DATASET.FREE.MIN.WARN:"{#POOLNAME}"} |
WARNING | Depends on: - TrueNAS: Dataset [{#DATASET_NAME}]: Very high space usage |
TrueNAS: Disk [{#DISK_NAME}]: Average disk temperature is too high | Disk temperature is high. |
avg(/TrueNAS by SNMP/truenas.disk.temp[{#DISK_NAME}],5m) > {$TEMPERATURE.MAX.CRIT:"{#DISK_NAME}"} |
AVERAGE | |
TrueNAS: Disk [{#DISK_NAME}]: Average disk temperature is too high | Disk temperature is high. |
avg(/TrueNAS by SNMP/truenas.disk.temp[{#DISK_NAME}],5m) > {$TEMPERATURE.MAX.WARN:"{#DISK_NAME}"} |
WARNING |
Please report any issues with the template at https://support.zabbix.com.
For Zabbix version: 6.2 and higher
The template to monitor Travis CI by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
This template was tested on:
See Zabbix template operation for basic instructions.
You must set the {$TRAVIS.API.TOKEN} and {$TRAVIS.API.URL} macros. {$TRAVIS.API.TOKEN} is a Travis API authentication token, located in User -> Settings -> API authentication. {$TRAVIS.API.URL} can take one of two values: api.travis-ci.com (for repositories hosted on travis-ci.com) or api.travis-ci.org (for repositories on the legacy travis-ci.org).
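The token and URL can be verified before linking the template (a minimal sketch; <YOUR_TOKEN> is a placeholder, and the host should match your {$TRAVIS.API.URL} value):
curl -H "Travis-API-Version: 3" -H "Authorization: token <YOUR_TOKEN>" https://api.travis-ci.com/repos
A 200 response with a JSON list of your repositories confirms that the macros are set correctly.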
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$TRAVIS.API.TOKEN} | Travis API Token |
`` |
{$TRAVIS.API.URL} | Travis API URL |
api.travis-ci.com |
{$TRAVIS.BUILDS.SUCCESS.PERCENT} | Percent of successful builds in the repo (for trigger expression) |
80 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Repos metrics discovery | Metrics for Repos statistics. |
DEPENDENT | travis.repos.discovery Preprocessing: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Travis | Travis: Get health | Getting home JSON using Travis API. |
HTTP_AGENT | travis.get_health Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - JAVASCRIPT: |
Travis | Travis: Jobs passed | Total count of passed jobs in all repos. |
DEPENDENT | travis.jobs.total Preprocessing: - JSONPATH: |
Travis | Travis: Jobs active | Active jobs in all repos. |
DEPENDENT | travis.jobs.active Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Travis | Travis: Jobs in queue | Jobs in queue in all repos. |
DEPENDENT | travis.jobs.queue Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Travis | Travis: Builds | Total count of builds in all repos. |
DEPENDENT | travis.builds.total Preprocessing: - JSONPATH: |
Travis | Travis: Builds duration | Sum of all builds durations in all repos. |
DEPENDENT | travis.builds.duration Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Travis | Travis: Repo [{#SLUG}]: Cache files | Count of cache files in {#SLUG} repo. |
DEPENDENT | travis.repo.caches.files[{#SLUG}] Preprocessing: - JSONPATH: |
Travis | Travis: Repo [{#SLUG}]: Cache size | Total size of cache files in {#SLUG} repo. |
DEPENDENT | travis.repo.caches.size[{#SLUG}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Travis | Travis: Repo [{#SLUG}]: Builds passed | Count of all passed builds in {#SLUG} repo. |
DEPENDENT | travis.repo.builds.passed[{#SLUG}] Preprocessing: - JAVASCRIPT: |
Travis | Travis: Repo [{#SLUG}]: Builds failed | Count of all failed builds in {#SLUG} repo. |
DEPENDENT | travis.repo.builds.failed[{#SLUG}] Preprocessing: - JAVASCRIPT: |
Travis | Travis: Repo [{#SLUG}]: Builds total | Count of total builds in {#SLUG} repo. |
DEPENDENT | travis.repo.builds.total[{#SLUG}] Preprocessing: - JSONPATH: |
Travis | Travis: Repo [{#SLUG}]: Builds passed, % | Percent of passed builds in {#SLUG} repo. |
CALCULATED | travis.repo.builds.passed.pct[{#SLUG}] Expression: last(//travis.repo.builds.passed[{#SLUG}])/last(//travis.repo.builds.total[{#SLUG}])*100 |
Travis | Travis: Repo [{#SLUG}]: Description | Description of Travis repo (git project description). |
DEPENDENT | travis.repo.description[{#SLUG}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Travis | Travis: Repo [{#SLUG}]: Last build duration | Last build duration in {#SLUG} repo. |
DEPENDENT | travis.repo.last_build.duration[{#SLUG}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE - DISCARD_UNCHANGED_HEARTBEAT: |
Travis | Travis: Repo [{#SLUG}]: Last build state | Last build state in {#SLUG} repo. |
DEPENDENT | travis.repo.last_build.state[{#SLUG}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Travis | Travis: Repo [{#SLUG}]: Last build number | Last build number in {#SLUG} repo. |
DEPENDENT | travis.repo.last_build.number[{#SLUG}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Travis | Travis: Repo [{#SLUG}]: Last build id | Last build id in {#SLUG} repo. |
DEPENDENT | travis.repo.last_build.id[{#SLUG}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Zabbix raw items | Travis: Get repos | Getting repos using Travis API. |
HTTP_AGENT | travis.get_repos |
Zabbix raw items | Travis: Get builds | Getting builds using Travis API. |
HTTP_AGENT | travis.get_builds |
Zabbix raw items | Travis: Get jobs | Getting jobs using Travis API. |
HTTP_AGENT | travis.get_jobs |
Zabbix raw items | Travis: Repo [{#SLUG}]: Get builds | Getting builds of {#SLUG} using Travis API. |
HTTP_AGENT | travis.repo.get_builds[{#SLUG}] |
Zabbix raw items | Travis: Repo [{#SLUG}]: Get caches | Getting caches of {#SLUG} using Travis API. |
HTTP_AGENT | travis.repo.get_caches[{#SLUG}] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Travis: Service is unavailable | Travis API is unavailable. Please check if the correct macros are set. |
last(/Travis CI by HTTP/travis.get_health)=0 |
HIGH | Manual close: YES |
Travis: Failed to fetch home page | Zabbix has not received data for items for the last 30 minutes. |
nodata(/Travis CI by HTTP/travis.get_health,30m)=1 |
WARNING | Manual close: YES |
Travis: Repo [{#SLUG}]: Percent of successful builds | Low successful builds rate. |
last(/Travis CI by HTTP/travis.repo.builds.passed.pct[{#SLUG}])<{$TRAVIS.BUILDS.SUCCESS.PERCENT} |
WARNING | Manual close: YES |
Travis: Repo [{#SLUG}]: Last build status is 'errored' | Last build status is errored. |
find(/Travis CI by HTTP/travis.repo.last_build.state[{#SLUG}],,"like","errored")=1 |
WARNING | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
Official JMX Template for Apache Tomcat.
This template was tested on:
See Zabbix template operation for basic instructions.
Metrics are collected by JMX.
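Remote JMX access must be enabled on Tomcat and reachable by the Zabbix Java gateway. A minimal sketch of the standard JVM options (added, for example, to CATALINA_OPTS in bin/setenv.sh; the port is illustrative, and authentication/SSL are disabled here only for brevity):
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=12345
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
If JMX authentication is enabled instead, set {$TOMCAT.USER} and {$TOMCAT.PASSWORD} to the matching credentials.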
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$TOMCAT.LLD.FILTER.MATCHES} | Filter for discoverable objects. Can be used with following contexts: "GlobalRequestProcessor", "ThreadPool", "Manager" |
.* |
{$TOMCAT.LLD.FILTER.NOT_MATCHES} | Filter to exclude discovered objects. Can be used with following contexts: "GlobalRequestProcessor", "ThreadPool", "Manager" |
CHANGE_IF_NEEDED |
{$TOMCAT.PASSWORD} | Password for JMX |
`` |
{$TOMCAT.THREADS.MAX.PCT} | Threshold for busy worker threads trigger. Can be used with {#JMXNAME} as context. |
75 |
{$TOMCAT.THREADS.MAX.TIME} | The time during which the number of busy threads can exceed the threshold. Can be used with {#JMXNAME} as context. |
5m |
{$TOMCAT.USER} | User for JMX |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Contexts discovery | Discovery for contexts |
JMX | jmx.discovery[beans,"Catalina:type=Manager,host=*,context=*"] Filter: AND- {#JMXHOST} MATCHES_REGEX - {#JMXHOST} NOT_MATCHES_REGEX |
Global request processors discovery | Discovery for GlobalRequestProcessor |
JMX | jmx.discovery[beans,"Catalina:type=GlobalRequestProcessor,name=*"] Filter: AND- {#JMXNAME} MATCHES_REGEX - {#JMXNAME} NOT_MATCHES_REGEX |
Protocol handlers discovery | Discovery for ProtocolHandler |
JMX | jmx.discovery[attributes,"Catalina:type=ProtocolHandler,port=*"] Filter: AND- {#JMXATTR} MATCHES_REGEX |
Thread pools discovery | Discovery for ThreadPool |
JMX | jmx.discovery[beans,"Catalina:type=ThreadPool,name=*"] Filter: AND- {#JMXNAME} MATCHES_REGEX - {#JMXNAME} NOT_MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Tomcat | Tomcat: Version | The version of Tomcat. |
JMX | jmx["Catalina:type=Server",serverInfo] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Tomcat | {#JMXNAME}: Bytes received per second | Bytes received rate by processor {#JMXNAME} |
JMX | jmx[{#JMXOBJ},bytesReceived] Preprocessing: - CHANGEPERSECOND |
Tomcat | {#JMXNAME}: Bytes sent per second | Bytes sent rate by processor {#JMXNAME} |
JMX | jmx[{#JMXOBJ},bytesSent] Preprocessing: - CHANGEPERSECOND |
Tomcat | {#JMXNAME}: Errors per second | Error rate of request processor {#JMXNAME} |
JMX | jmx[{#JMXOBJ},errorCount] Preprocessing: - CHANGEPERSECOND |
Tomcat | {#JMXNAME}: Requests per second | Rate of requests served by request processor {#JMXNAME} |
JMX | jmx[{#JMXOBJ},requestCount] Preprocessing: - CHANGEPERSECOND |
Tomcat | {#JMXNAME}: Requests processing time | The total time to process all incoming requests of request processor {#JMXNAME} |
JMX | jmx[{#JMXOBJ},processingTime] Preprocessing: - MULTIPLIER: |
Tomcat | {#JMXVALUE}: Gzip compression status | Gzip compression status on {#JMXNAME}. Enabling gzip compression may save server bandwidth. |
JMX | jmx[{#JMXOBJ},compression] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Tomcat | {#JMXNAME}: Threads count | The number of threads the thread pool has right now, both busy and free. |
JMX | jmx[{#JMXOBJ},currentThreadCount] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Tomcat | {#JMXNAME}: Threads limit | Limit of the threads count. When the currentThreadsBusy counter reaches the maxThreads limit, no more requests can be handled, and the application chokes. |
JMX | jmx[{#JMXOBJ},maxThreads] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Tomcat | {#JMXNAME}: Threads busy | Number of the requests that are being currently handled. |
JMX | jmx[{#JMXOBJ},currentThreadsBusy] |
Tomcat | {#JMXHOST}{#JMXCONTEXT}: Sessions active | Active sessions of the application. |
JMX | jmx[{#JMXOBJ},activeSessions] |
Tomcat | {#JMXHOST}{#JMXCONTEXT}: Sessions active maximum so far | Maximum number of active sessions so far. |
JMX | jmx[{#JMXOBJ},maxActive] |
Tomcat | {#JMXHOST}{#JMXCONTEXT}: Sessions created per second | Rate of sessions created by this application per second. |
JMX | jmx[{#JMXOBJ},sessionCounter] Preprocessing: - CHANGEPERSECOND |
Tomcat | {#JMXHOST}{#JMXCONTEXT}: Sessions rejected per second | Rate of sessions we rejected due to maxActive being reached. |
JMX | jmx[{#JMXOBJ},rejectedSessions] Preprocessing: - CHANGEPERSECOND |
Tomcat | {#JMXHOST}{#JMXCONTEXT}: Sessions allowed maximum | The maximum number of active Sessions allowed, or -1 for no limit. |
JMX | jmx[{#JMXOBJ},maxActiveSessions] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Tomcat: Version has been changed | Tomcat version has changed. Ack to close. |
last(/Apache Tomcat by JMX/jmx["Catalina:type=Server",serverInfo],#1)<>last(/Apache Tomcat by JMX/jmx["Catalina:type=Server",serverInfo],#2) and length(last(/Apache Tomcat by JMX/jmx["Catalina:type=Server",serverInfo]))>0 |
INFO | Manual close: YES |
{#JMXVALUE}: Gzip compression is disabled | gzip compression is disabled for connector {#JMXVALUE}. |
find(/Apache Tomcat by JMX/jmx[{#JMXOBJ},compression],,"like","off") = 1 |
INFO | Manual close: YES |
{#JMXNAME}: Busy worker threads count is high | When the busy threads counter reaches the limit, no more requests can be handled, and the application chokes. |
min(/Apache Tomcat by JMX/jmx[{#JMXOBJ},currentThreadsBusy],{$TOMCAT.THREADS.MAX.TIME:"{#JMXNAME}"})>last(/Apache Tomcat by JMX/jmx[{#JMXOBJ},maxThreads])*{$TOMCAT.THREADS.MAX.PCT:"{#JMXNAME}"}/100 |
HIGH |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Services | Telnet service is running | - |
SIMPLE | net.tcp.service[telnet] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Telnet service is down on {HOST.NAME} | - |
max(/Telnet Service/net.tcp.service[telnet],#3)=0 |
AVERAGE |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
The template to monitor systemd units.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Systemd by Zabbix agent 2
— collects metrics by polling zabbix-agent2.
This template was tested on:
See Zabbix template operation for basic instructions.
No specific Zabbix configuration is required.
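The item keys can be tested locally before linking the template (a minimal sketch; sshd.service is just an example unit name):
zabbix_agent2 -t 'systemd.unit.discovery[service]'
zabbix_agent2 -t 'systemd.unit.get["sshd.service"]'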
Name | Description | Default |
---|---|---|
{$SYSTEMD.ACTIVESTATE.SERVICE.MATCHES} | Filter of systemd service units by active state |
active |
{$SYSTEMD.ACTIVESTATE.SERVICE.NOT_MATCHES} | Filter of systemd service units by active state |
CHANGE_IF_NEEDED |
{$SYSTEMD.ACTIVESTATE.SOCKET.MATCHES} | Filter of systemd socket units by active state |
active |
{$SYSTEMD.ACTIVESTATE.SOCKET.NOT_MATCHES} | Filter of systemd socket units by active state |
CHANGE_IF_NEEDED |
{$SYSTEMD.NAME.SERVICE.MATCHES} | Filter of systemd service units by name |
.* |
{$SYSTEMD.NAME.SERVICE.NOT_MATCHES} | Filter of systemd service units by name |
CHANGE_IF_NEEDED |
{$SYSTEMD.NAME.SOCKET.MATCHES} | Filter of systemd socket units by name |
.* |
{$SYSTEMD.NAME.SOCKET.NOT_MATCHES} | Filter of systemd socket units by name |
CHANGE_IF_NEEDED |
{$SYSTEMD.UNITFILESTATE.SERVICE.MATCHES} | Filter of systemd service units by unit file state |
enabled |
{$SYSTEMD.UNITFILESTATE.SERVICE.NOT_MATCHES} | Filter of systemd service units by unit file state |
CHANGE_IF_NEEDED |
{$SYSTEMD.UNITFILESTATE.SOCKET.MATCHES} | Filter of systemd socket units by unit file state |
enabled |
{$SYSTEMD.UNITFILESTATE.SOCKET.NOT_MATCHES} | Filter of systemd socket units by unit file state |
CHANGE_IF_NEEDED |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Service units discovery | Discover systemd service units and their details. |
ZABBIX_PASSIVE | systemd.unit.discovery[service] Filter: AND- {#UNIT.ACTIVESTATE} MATCHES_REGEX - {#UNIT.ACTIVESTATE} NOT_MATCHES_REGEX - {#UNIT.UNITFILESTATE} MATCHES_REGEX - {#UNIT.UNITFILESTATE} NOT_MATCHES_REGEX - {#UNIT.NAME} NOT_MATCHES_REGEX - {#UNIT.NAME} MATCHES_REGEX |
Socket units discovery | Discover systemd socket units and their details. |
ZABBIX_PASSIVE | systemd.unit.discovery[socket] Filter: AND- {#UNIT.ACTIVESTATE} MATCHES_REGEX - {#UNIT.ACTIVESTATE} NOT_MATCHES_REGEX - {#UNIT.UNITFILESTATE} MATCHES_REGEX - {#UNIT.UNITFILESTATE} NOT_MATCHES_REGEX - {#UNIT.NAME} NOT_MATCHES_REGEX - {#UNIT.NAME} MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Systemd | {#UNIT.NAME}: Active state | State value that reflects whether the unit is currently active or not. The following states are currently defined: "active", "reloading", "inactive", "failed", "activating", and "deactivating". |
DEPENDENT | systemd.service.active_state["{#UNIT.NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 30m |
Systemd | {#UNIT.NAME}: Load state | State value that reflects whether the configuration file of this unit has been loaded. The following states are currently defined: "loaded", "error", and "masked". |
DEPENDENT | systemd.service.load_state["{#UNIT.NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 30m |
Systemd | {#UNIT.NAME}: Unit file state | Encodes the install state of the unit file of FragmentPath. It currently knows the following states: "enabled", "enabled-runtime", "linked", "linked-runtime", "masked", "masked-runtime", "static", "disabled", and "invalid". |
DEPENDENT | systemd.service.unitfile_state["{#UNIT.NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 30m |
Systemd | {#UNIT.NAME}: Active time | Number of seconds since unit entered the active state. |
DEPENDENT | systemd.service.uptime["{#UNIT.NAME}"] Preprocessing: - JAVASCRIPT: |
Systemd | {#UNIT.NAME}: Connections accepted per sec | The number of accepted socket connections (NAccepted) per second. |
DEPENDENT | systemd.socket.conn_accepted.rate["{#UNIT.NAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Systemd | {#UNIT.NAME}: Connections connected | The current number of socket connections (NConnections). |
DEPENDENT | systemd.socket.conn_count["{#UNIT.NAME}"] Preprocessing: - JSONPATH: |
Zabbix raw items | {#UNIT.NAME}: Get unit info | Returns all properties of a systemd service unit. Unit description: {#UNIT.DESCRIPTION}. |
ZABBIX_PASSIVE | systemd.unit.get["{#UNIT.NAME}"] |
Zabbix raw items | {#UNIT.NAME}: Get unit info | Returns all properties of a systemd socket unit. Unit description: {#UNIT.DESCRIPTION}. |
ZABBIX_PASSIVE | systemd.unit.get["{#UNIT.NAME}",Socket] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
{#UNIT.NAME}: Service is not running | - |
last(/Systemd by Zabbix agent 2/systemd.service.active_state["{#UNIT.NAME}"])<>1 |
WARNING | Manual close: YES |
{#UNIT.NAME}: has been restarted | Uptime is less than 10 minutes. |
last(/Systemd by Zabbix agent 2/systemd.service.uptime["{#UNIT.NAME}"])<10m |
INFO | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Services | SSH service is running | - |
SIMPLE | net.tcp.service[ssh] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
SSH service is down on {HOST.NAME} | - |
max(/SSH Service/net.tcp.service[ssh],#3)=0 |
AVERAGE |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher.
This template was tested on:
See Zabbix template operation for basic instructions.
Enable SNMP support following official documentation. Required parameters in squid.conf:
snmp_port <port_number>
acl <zbx_acl_name> snmp_community <community_name>
snmp_access allow <zbx_acl_name> <zabbix_server_ip>
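For example, a sketch that follows the placeholders above (the ACL names, community, and Zabbix server address are illustrative; a src ACL is used so the snmp_access line references ACL names only):
snmp_port 3401
acl zbx_acl snmp_community public
acl zbx_server src 192.0.2.10/32
snmp_access allow zbx_acl zbx_server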
1. Import the template template_app_squid_snmp.yaml into Zabbix.
2. Set values for {$SQUID.SNMP.COMMUNITY}, {$SQUID.SNMP.PORT} and {$SQUID.HTTP.PORT} as configured in squid.conf.
3. Link the imported template to a host with Squid.
4. Add SNMPv2 interface to Squid host. Set Port as {$SQUID.SNMP.PORT} and SNMP community as {$SQUID.SNMP.COMMUNITY}.
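SNMP connectivity can then be verified from the Zabbix server or proxy (a minimal sketch; the host, port, and community are examples, and .1.3.6.1.4.1.3495 is the Squid enterprise OID subtree):
snmpwalk -v2c -c public 192.0.2.20:3401 .1.3.6.1.4.1.3495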
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$SQUID.FILE.DESC.WARN.MIN} | The threshold for minimum number of available file descriptors |
100 |
{$SQUID.HTTP.PORT} | http_port configured in squid.conf (Default: 3128) |
3128 |
{$SQUID.PAGE.FAULT.WARN} | The threshold for sys page faults rate in percent of received HTTP requests |
90 |
{$SQUID.SNMP.COMMUNITY} | SNMP community allowed by ACL in squid.conf |
public |
{$SQUID.SNMP.PORT} | snmp_port configured in squid.conf (Default: 3401) |
3401 |
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Squid | Squid: Service ping | - |
SIMPLE | net.tcp.service[tcp,,{$SQUID.HTTP.PORT}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Squid | Squid: Uptime | The uptime of the cache in timeticks (hundredths of a second), converted by preprocessing |
SNMP | squid[cacheUptime] Preprocessing: - MULTIPLIER: |
Squid | Squid: Version | Cache Software Version |
SNMP | squid[cacheVersionId] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Squid | Squid: CPU usage | The percentage use of the CPU |
SNMP | squid[cacheCpuUsage] |
Squid | Squid: Memory maximum resident size | Maximum Resident Size |
SNMP | squid[cacheMaxResSize] Preprocessing: - MULTIPLIER: |
Squid | Squid: Memory maximum cache size | The value of the cache_mem parameter |
SNMP | squid[cacheMemMaxSize] Preprocessing: - MULTIPLIER: |
Squid | Squid: Memory cache usage | Total accounted memory |
SNMP | squid[cacheMemUsage] Preprocessing: - MULTIPLIER: |
Squid | Squid: Cache swap low water mark | Cache Swap Low Water Mark |
SNMP | squid[cacheSwapLowWM] |
Squid | Squid: Cache swap high water mark | Cache Swap High Water Mark |
SNMP | squid[cacheSwapHighWM] |
Squid | Squid: Cache swap directory size | The total of the cache_dir space allocated |
SNMP | squid[cacheSwapMaxSize] Preprocessing: - MULTIPLIER: |
Squid | Squid: Cache swap current size | Storage Swap Size |
SNMP | squid[cacheCurrentSwapSize] |
Squid | Squid: File descriptor count - current used | Number of file descriptors in use |
SNMP | squid[cacheCurrentFileDescrCnt] |
Squid | Squid: File descriptor count - current maximum | Highest number of file descriptors in use |
SNMP | squid[cacheCurrentFileDescrMax] |
Squid | Squid: File descriptor count - current reserved | Reserved number of file descriptors |
SNMP | squid[cacheCurrentResFileDescrCnt] |
Squid | Squid: File descriptor count - current available | Available number of file descriptors |
SNMP | squid[cacheCurrentUnusedFDescrCnt] |
Squid | Squid: Byte hit ratio per 1 minute | Byte Hit Ratios |
SNMP | squid[cacheRequestByteRatio.1] |
Squid | Squid: Byte hit ratio per 5 minutes | Byte Hit Ratios |
SNMP | squid[cacheRequestByteRatio.5] |
Squid | Squid: Byte hit ratio per 1 hour | Byte Hit Ratios |
SNMP | squid[cacheRequestByteRatio.60] |
Squid | Squid: Request hit ratio per 1 minute | Request Hit Ratios |
SNMP | squid[cacheRequestHitRatio.1] |
Squid | Squid: Request hit ratio per 5 minutes | Request Hit Ratios |
SNMP | squid[cacheRequestHitRatio.5] |
Squid | Squid: Request hit ratio per 1 hour | Request Hit Ratios |
SNMP | squid[cacheRequestHitRatio.60] |
Squid | Squid: Sys page faults per second | Page faults with physical I/O |
SNMP | squid[cacheSysPageFaults] Preprocessing: - CHANGEPERSECOND |
Squid | Squid: HTTP requests received per second | Number of HTTP requests received |
SNMP | squid[cacheProtoClientHttpRequests] Preprocessing: - CHANGEPERSECOND |
Squid | Squid: HTTP traffic received per second | Amount of HTTP traffic received from clients |
SNMP | squid[cacheHttpInKb] Preprocessing: - MULTIPLIER: - CHANGE_PER_SECOND |
Squid | Squid: HTTP traffic sent per second | Amount of HTTP traffic sent to clients |
SNMP | squid[cacheHttpOutKb] Preprocessing: - MULTIPLIER: - CHANGE_PER_SECOND |
Squid | Squid: HTTP Hits sent from cache per second | Number of HTTP Hits sent to clients from cache |
SNMP | squid[cacheHttpHits] Preprocessing: - CHANGEPERSECOND |
Squid | Squid: HTTP Errors sent per second | Number of HTTP Errors sent to clients |
SNMP | squid[cacheHttpErrors] Preprocessing: - CHANGEPERSECOND |
Squid | Squid: ICP messages sent per second | Number of ICP messages sent |
SNMP | squid[cacheIcpPktsSent] Preprocessing: - CHANGEPERSECOND |
Squid | Squid: ICP messages received per second | Number of ICP messages received |
SNMP | squid[cacheIcpPktsRecv] Preprocessing: - CHANGEPERSECOND |
Squid | Squid: ICP traffic transmitted per second | Amount of ICP traffic transmitted |
SNMP | squid[cacheIcpKbSent] Preprocessing: - MULTIPLIER: - CHANGE_PER_SECOND |
Squid | Squid: ICP traffic received per second | Amount of ICP traffic received |
SNMP | squid[cacheIcpKbRecv] Preprocessing: - MULTIPLIER: - CHANGE_PER_SECOND |
Squid | Squid: DNS server requests per second | Number of external DNS server requests |
SNMP | squid[cacheDnsRequests] Preprocessing: - CHANGE_PER_SECOND |
Squid | Squid: DNS server replies per second | Number of external DNS server replies |
SNMP | squid[cacheDnsReplies] Preprocessing: - CHANGE_PER_SECOND |
Squid | Squid: FQDN cache requests per second | Number of FQDN Cache requests |
SNMP | squid[cacheFqdnRequests] Preprocessing: - CHANGEPERSECOND |
Squid | Squid: FQDN cache hits per second | Number of FQDN Cache hits |
SNMP | squid[cacheFqdnHits] Preprocessing: - CHANGEPERSECOND |
Squid | Squid: FQDN cache misses per second | Number of FQDN Cache misses |
SNMP | squid[cacheFqdnMisses] Preprocessing: - CHANGEPERSECOND |
Squid | Squid: IP cache requests per second | Number of IP Cache requests |
SNMP | squid[cacheIpRequests] Preprocessing: - CHANGEPERSECOND |
Squid | Squid: IP cache hits per second | Number of IP Cache hits |
SNMP | squid[cacheIpHits] Preprocessing: - CHANGEPERSECOND |
Squid | Squid: IP cache misses per second | Number of IP Cache misses |
SNMP | squid[cacheIpMisses] Preprocessing: - CHANGEPERSECOND |
Squid | Squid: Objects count | Number of objects stored by the cache |
SNMP | squid[cacheNumObjCount] |
Squid | Squid: Objects LRU expiration age | Storage LRU Expiration Age |
SNMP | squid[cacheCurrentLRUExpiration] Preprocessing: - MULTIPLIER: |
Squid | Squid: Objects unlinkd requests | Requests given to unlinkd |
SNMP | squid[cacheCurrentUnlinkRequests] |
Squid | Squid: HTTP all service time per 5 minutes | HTTP all service time per 5 minutes |
SNMP | squid[cacheHttpAllSvcTime.5] Preprocessing: - MULTIPLIER: |
Squid | Squid: HTTP all service time per hour | HTTP all service time per hour |
SNMP | squid[cacheHttpAllSvcTime.60] Preprocessing: - MULTIPLIER: |
Squid | Squid: HTTP miss service time per 5 minutes | HTTP miss service time per 5 minutes |
SNMP | squid[cacheHttpMissSvcTime.5] Preprocessing: - MULTIPLIER: |
Squid | Squid: HTTP miss service time per hour | HTTP miss service time per hour |
SNMP | squid[cacheHttpMissSvcTime.60] Preprocessing: - MULTIPLIER: |
Squid | Squid: HTTP hit service time per 5 minutes | HTTP hit service time per 5 minutes |
SNMP | squid[cacheHttpHitSvcTime.5] Preprocessing: - MULTIPLIER: |
Squid | Squid: HTTP hit service time per hour | HTTP hit service time per hour |
SNMP | squid[cacheHttpHitSvcTime.60] Preprocessing: - MULTIPLIER: |
Squid | Squid: ICP query service time per 5 minutes | ICP query service time per 5 minutes |
SNMP | squid[cacheIcpQuerySvcTime.5] Preprocessing: - MULTIPLIER: |
Squid | Squid: ICP query service time per hour | ICP query service time per hour |
SNMP | squid[cacheIcpQuerySvcTime.60] Preprocessing: - MULTIPLIER: |
Squid | Squid: ICP reply service time per 5 minutes | ICP reply service time per 5 minutes |
SNMP | squid[cacheIcpReplySvcTime.5] Preprocessing: - MULTIPLIER: |
Squid | Squid: ICP reply service time per hour | ICP reply service time per hour |
SNMP | squid[cacheIcpReplySvcTime.60] Preprocessing: - MULTIPLIER: |
Squid | Squid: DNS service time per 5 minutes | DNS service time per 5 minutes |
SNMP | squid[cacheDnsSvcTime.5] Preprocessing: - MULTIPLIER: |
Squid | Squid: DNS service time per hour | DNS service time per hour |
SNMP | squid[cacheDnsSvcTime.60] Preprocessing: - MULTIPLIER: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Squid: Port {$SQUID.HTTP.PORT} is down | - |
last(/Squid by SNMP/net.tcp.service[tcp,,{$SQUID.HTTP.PORT}])=0 |
AVERAGE | Manual close: YES |
Squid: Squid has been restarted | Uptime is less than 10 minutes. |
last(/Squid by SNMP/squid[cacheUptime])<10m |
INFO | Manual close: YES |
Squid: Squid version has been changed | Squid version has changed. Ack to close. |
last(/Squid by SNMP/squid[cacheVersionId],#1)<>last(/Squid by SNMP/squid[cacheVersionId],#2) and length(last(/Squid by SNMP/squid[cacheVersionId]))>0 |
INFO | Manual close: YES |
Squid: Swap usage is more than low watermark | - |
last(/Squid by SNMP/squid[cacheCurrentSwapSize])>last(/Squid by SNMP/squid[cacheSwapLowWM])*last(/Squid by SNMP/squid[cacheSwapMaxSize])/100 |
WARNING | |
Squid: Swap usage is more than high watermark | - |
last(/Squid by SNMP/squid[cacheCurrentSwapSize])>last(/Squid by SNMP/squid[cacheSwapHighWM])*last(/Squid by SNMP/squid[cacheSwapMaxSize])/100 |
HIGH | |
Squid: Squid is running out of file descriptors | - |
last(/Squid by SNMP/squid[cacheCurrentUnusedFDescrCnt])<{$SQUID.FILE.DESC.WARN.MIN} |
WARNING | |
Squid: High sys page faults rate | - |
avg(/Squid by SNMP/squid[cacheSysPageFaults],5m)>avg(/Squid by SNMP/squid[cacheProtoClientHttpRequests],5m)/100*{$SQUID.PAGE.FAULT.WARN} |
WARNING |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Services | SMTP service is running | - |
SIMPLE | net.tcp.service[smtp] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
SMTP service is down on {HOST.NAME} | - |
max(/SMTP Service/net.tcp.service[smtp],#3)=0 |
AVERAGE |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
SharePoint includes a Representational State Transfer (REST) service. Developers can perform read operations from their SharePoint Add-ins, solutions, and client applications, using REST web technologies and standard Open Data Protocol (OData) syntax. Details in
https://docs.microsoft.com/ru-ru/sharepoint/dev/sp-add-ins/get-to-know-the-sharepoint-rest-service?tabs=csom
This template was tested on:
See Zabbix template operation for basic instructions.
Create a new host. Define macros according to your Sharepoint web portal. It is recommended to fill in the values of the filter macros to avoid getting redundant data.
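The portal and account can be checked against the REST service beforehand (a minimal sketch; the URL mirrors the {$SHAREPOINT.URL} example, the credentials are placeholders, and NTLM authentication is assumed):
curl --ntlm -u 'DOMAIN\zbx_user:<PASSWORD>' -H "Accept: application/json;odata=verbose" "http://sharepoint.companyname.local/_api/web"
A JSON document describing the root web confirms that the REST service is reachable.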
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$SHAREPOINT.GET_INTERVAL} | - |
1m |
{$SHAREPOINT.LLD.FILTER.FULL_PATH.MATCHES} | Filter of discoverable dictionaries by full path. |
^/ |
{$SHAREPOINT.LLD.FILTER.FULL_PATH.NOT_MATCHES} | Filter to exclude discovered dictionaries by full path. |
CHANGE_IF_NEEDED |
{$SHAREPOINT.LLD.FILTER.NAME.MATCHES} | Filter of discoverable dictionaries by name. |
.* |
{$SHAREPOINT.LLD.FILTER.NAME.NOT_MATCHES} | Filter to exclude discovered dictionaries by name. |
CHANGE_IF_NEEDED |
{$SHAREPOINT.LLD.FILTER.TYPE.MATCHES} | Filter of discoverable types. |
FOLDER |
{$SHAREPOINT.LLD.FILTER.TYPE.NOT_MATCHES} | Filter to exclude discovered types. |
CHANGE_IF_NEEDED |
{$SHAREPOINT.LLD_INTERVAL} | - |
3h |
{$SHAREPOINT.MAX_HEALT_SCORE} | Must be in the range from 0 to 10. Details: https://docs.microsoft.com/en-us/openspecs/sharepoint_protocols/ms-wsshp/c60ddeb6-4113-4a73-9e97-26b5c3907d33 |
5 |
{$SHAREPOINT.PASSWORD} | - |
`` |
{$SHAREPOINT.ROOT} | - |
/Shared Documents |
{$SHAREPOINT.URL} | Portal page URL. For example http://sharepoint.companyname.local/ |
`` |
{$SHAREPOINT.USER} | - |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Directory discovery | - |
SCRIPT | sharepoint.directory.discovery Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND- {#SHAREPOINT.LLD.NAME} MATCHES_REGEX - {#SHAREPOINT.LLD.NAME} NOT_MATCHES_REGEX - {#SHAREPOINT.LLD.FULL_PATH} MATCHES_REGEX - {#SHAREPOINT.LLD.FULL_PATH} NOT_MATCHES_REGEX - {#SHAREPOINT.LLD.TYPE} MATCHES_REGEX - {#SHAREPOINT.LLD.TYPE} NOT_MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Sharepoint | Sharepoint: Get directory structure: Status | HTTP response (status) code. Indicates whether the HTTP request was successfully completed. Additional information is available in the server log file. |
DEPENDENT | sharepoint.get_dir.status Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> DISCARD_VALUE - DISCARD_UNCHANGED_HEARTBEAT: |
Sharepoint | Sharepoint: Get directory structure: Exec time | The time taken to execute the script for obtaining the data structure (in ms). Less is better. |
DEPENDENT | sharepoint.get_dir.time Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_ERROR -> DISCARD_VALUE - DISCARD_UNCHANGED_HEARTBEAT: |
Sharepoint | Sharepoint: Health score | This item specifies a value between 0 and 10, where 0 represents a low load and a high ability to process requests and 10 represents a high load and that the server is throttling requests to maintain adequate throughput. |
HTTP_AGENT | sharepoint.health_score Preprocessing: - REGEX: - IN_RANGE: 0 10 ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
Sharepoint | Sharepoint: Size ({#SHAREPOINT.LLD.FULL_PATH}) | Size of: {#SHAREPOINT.LLD.FULL_PATH} |
DEPENDENT | sharepoint.size["{#SHAREPOINT.LLD.FULL_PATH}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE - DISCARD_UNCHANGED_HEARTBEAT: |
Sharepoint | Sharepoint: Modified ({#SHAREPOINT.LLD.FULL_PATH}) | Date of change: {#SHAREPOINT.LLD.FULL_PATH} |
DEPENDENT | sharepoint.modified["{#SHAREPOINT.LLD.FULL_PATH}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE - DISCARD_UNCHANGED_HEARTBEAT: |
Sharepoint | Sharepoint: Created ({#SHAREPOINT.LLD.FULL_PATH}) | Date of creation: {#SHAREPOINT.LLD.FULL_PATH} |
DEPENDENT | sharepoint.created["{#SHAREPOINT.LLD.FULL_PATH}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE - DISCARD_UNCHANGED_HEARTBEAT: |
Zabbix raw items | Sharepoint: Get directory structure | Used to get directory structure information |
SCRIPT | sharepoint.get_dir Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: CUSTOM_VALUE -> {"status":520,"data":{},"time":0} Expression: The text is too long. Please see the template. |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Sharepoint: Error getting directory structure. | Error getting directory structure. Check the Zabbix server log for more details. |
last(/Microsoft SharePoint by HTTP/sharepoint.get_dir.status)<>200 |
WARNING | |
Sharepoint: Server responds slowly to API request | - |
last(/Microsoft SharePoint by HTTP/sharepoint.get_dir.time)>2000 |
WARNING | |
Sharepoint: Bad health score | - |
last(/Microsoft SharePoint by HTTP/sharepoint.health_score)>"{$SHAREPOINT.MAX_HEALT_SCORE}" |
AVERAGE | |
Sharepoint: Sharepoint object is changed | The modification date of the folder/file has been updated. |
last(/Microsoft SharePoint by HTTP/sharepoint.modified["{#SHAREPOINT.LLD.FULL_PATH}"],#1)<>last(/Microsoft SharePoint by HTTP/sharepoint.modified["{#SHAREPOINT.LLD.FULL_PATH}"],#2) |
INFO | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com.
For Zabbix version: 6.2 and higher. The template to monitor RabbitMQ by Zabbix that works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template RabbitMQ Cluster collects metrics by polling the RabbitMQ management plugin with HTTP agent remotely.
This template was tested on:
See Zabbix template operation for basic instructions.
Enable the RabbitMQ management plugin. See RabbitMQ's documentation to enable it.
Create a user to monitor the service:
rabbitmqctl add_user zbx_monitor <PASSWORD>
rabbitmqctl set_permissions -p / zbx_monitor "" "" ".*"
rabbitmqctl set_user_tags zbx_monitor monitoring
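Before assigning the template, you can verify that the new user can read the management API. A minimal sketch, assuming the management plugin listens on the default port 15672 on the local host:
curl -s -u zbx_monitor:<PASSWORD> http://localhost:15672/api/overview
# A JSON document should be returned; its object_totals, queue_totals and
# message_stats objects are the sources of the dependent items below.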
Login and password are also set in macros:
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$RABBITMQ.API.PASSWORD} | - |
zabbix |
{$RABBITMQ.API.PORT} | The port of RabbitMQ API endpoint |
15672 |
{$RABBITMQ.API.SCHEME} | Request scheme which may be http or https |
http |
{$RABBITMQ.API.USER} | - |
zbx_monitor |
{$RABBITMQ.LLD.FILTER.EXCHANGE.MATCHES} | Filter of discoverable exchanges |
.* |
{$RABBITMQ.LLD.FILTER.EXCHANGE.NOT_MATCHES} | Filter to exclude discovered exchanges |
CHANGE_IF_NEEDED |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Exchanges discovery | Individual exchange metrics |
DEPENDENT | rabbitmq.exchanges.discovery Filter: AND- {#EXCHANGE} MATCHES_REGEX - {#EXCHANGE} NOT_MATCHES_REGEX |
Health Check 3.8.10+ discovery | Version 3.8.10+ specific metrics |
DEPENDENT | rabbitmq.healthcheck.v3810.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
RabbitMQ | RabbitMQ: Connections total | The total number of connections. |
DEPENDENT | rabbitmq.overview.object_totals.connections Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Channels total | The total number of channels. |
DEPENDENT | rabbitmq.overview.object_totals.channels Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queues total | The total number of queues. |
DEPENDENT | rabbitmq.overview.object_totals.queues Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Consumers total | The total number of consumers. |
DEPENDENT | rabbitmq.overview.object_totals.consumers Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Exchanges total | The total number of exchanges. |
DEPENDENT | rabbitmq.overview.object_totals.exchanges Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Messages total | The total number of messages (ready, plus unacknowledged). |
DEPENDENT | rabbitmq.overview.queue_totals.messages Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages ready for delivery | The number of messages ready for delivery. |
DEPENDENT | rabbitmq.overview.queue_totals.messages.ready Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages unacknowledged | The number of unacknowledged messages. |
DEPENDENT | rabbitmq.overview.queue_totals.messages.unacknowledged Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages acknowledged | The number of messages delivered to clients and acknowledged. |
DEPENDENT | rabbitmq.overview.messages.ack Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages acknowledged per second | The rate of messages (per second) delivered to clients and acknowledged. |
DEPENDENT | rabbitmq.overview.messages.ack.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages confirmed | The count of confirmed messages. |
DEPENDENT | rabbitmq.overview.messages.confirm Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages confirmed per second | The rate of confirmed messages per second. |
DEPENDENT | rabbitmq.overview.messages.confirm.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages delivered | The sum of messages delivered to consumers: in acknowledgement mode and in no-acknowledgement mode; delivered to consumers in response to basic.get. |
DEPENDENT | rabbitmq.overview.messages.deliver_get Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages delivered per second | The rate of the sum of messages (per second) delivered to consumers: in acknowledgement mode and in no-acknowledgement mode; delivered to consumers in response to basic.get. |
DEPENDENT | rabbitmq.overview.messages.deliver_get.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages published | The count of published messages. |
DEPENDENT | rabbitmq.overview.messages.publish Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages published per second | The rate of published messages per second. |
DEPENDENT | rabbitmq.overview.messages.publish.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages publish_in | The count of messages published from the channels into this overview. |
DEPENDENT | rabbitmq.overview.messages.publish_in Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages publish_in per second | The rate of messages (per second) published from the channels into this overview. |
DEPENDENT | rabbitmq.overview.messages.publish_in.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages publish_out | The count of messages published from this overview into queues. |
DEPENDENT | rabbitmq.overview.messages.publish_out Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages publish_out per second | The rate of messages (per second) published from this overview into queues. |
DEPENDENT | rabbitmq.overview.messages.publish_out.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages returned unroutable | The count of messages returned to a publisher as unroutable. |
DEPENDENT | rabbitmq.overview.messages.return_unroutable Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages returned unroutable per second | The rate of messages (per second) returned to a publisher as unroutable. |
DEPENDENT | rabbitmq.overview.messages.return_unroutable.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages returned redeliver | The count of the subset of messages in deliver_get which had the redelivered flag set. |
DEPENDENT | rabbitmq.overview.messages.redeliver Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages returned redeliver per second | The rate of the subset of messages (per second) in deliver_get which had the redelivered flag set. |
DEPENDENT | rabbitmq.overview.messages.redeliver.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Healthcheck: alarms in effect in the cluster{#SINGLETON} | Responds with a 200 OK if there are no alarms in effect in the cluster, otherwise responds with a 503 Service Unavailable. |
HTTP_AGENT | rabbitmq.healthcheck.alarms[{#SINGLETON}] Preprocessing: - REGEX: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages acknowledged | The number of messages delivered to clients and acknowledged. |
DEPENDENT | rabbitmq.exchange.messages.ack["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages acknowledged per second | The rate of messages (per second) delivered to clients and acknowledged. |
DEPENDENT | rabbitmq.exchange.messages.ack.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages confirmed | The count of confirmed messages. |
DEPENDENT | rabbitmq.exchange.messages.confirm["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages confirmed per second | The rate of messages confirmed per second. |
DEPENDENT | rabbitmq.exchange.messages.confirm.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages delivered | The sum of messages delivered to consumers: in acknowledgement mode and in no-acknowledgement mode; delivered to consumers in response to the basic.get. |
DEPENDENT | rabbitmq.exchange.messages.deliver_get["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages delivered per second | The rate of the sum of messages (per second) delivered to consumers: in acknowledgement mode and in no-acknowledgement mode; delivered to consumers in response to the basic.get. |
DEPENDENT | rabbitmq.exchange.messages.deliver_get.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages published | The count of published messages. |
DEPENDENT | rabbitmq.exchange.messages.publish["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages published per second | The rate of messages published per second. |
DEPENDENT | rabbitmq.exchange.messages.publish.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages publish_in | The count of messages published from the channels into this overview. |
DEPENDENT | rabbitmq.exchange.messages.publish_in["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages publish_in per second | The rate of messages (per second) published from the channels into this overview. |
DEPENDENT | rabbitmq.exchange.messages.publish_in.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages publish_out | The count of messages published from this overview into queues. |
DEPENDENT | rabbitmq.exchange.messages.publish_out["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages publish_out per second | The rate of messages (per second) published from this overview into queues. |
DEPENDENT | rabbitmq.exchange.messages.publish_out.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages returned unroutable | The count of messages returned to a publisher as unroutable. |
DEPENDENT | rabbitmq.exchange.messages.return_unroutable["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages returned unroutable per second | The rate of messages (per second) returned to a publisher as unroutable. |
DEPENDENT | rabbitmq.exchange.messages.return_unroutable.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages redelivered | The count of the subset of messages in deliver_get which had the redelivered flag set. |
DEPENDENT | rabbitmq.exchange.messages.redeliver["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages redelivered per second | The rate of the subset of messages (per second) in deliver_get which had the redelivered flag set. |
DEPENDENT | rabbitmq.exchange.messages.redeliver.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Zabbix raw items | RabbitMQ: Get overview | The HTTP API endpoint that returns cluster-wide metrics. |
HTTP_AGENT | rabbitmq.get_overview |
Zabbix raw items | RabbitMQ: Get exchanges | The HTTP API endpoint that returns exchanges metrics. |
HTTP_AGENT | rabbitmq.get_exchanges |
Zabbix raw items | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Get data | The HTTP API endpoint that returns [{#VHOST}][{#EXCHANGE}][{#TYPE}] exchanges metrics |
DEPENDENT | rabbitmq.get_exchanges["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
RabbitMQ: There are active alarms in the cluster | This is the default API endpoint path: http://{HOST.CONN}:{$RABBITMQ.API.PORT}/api/index.html. |
last(/RabbitMQ cluster by HTTP/rabbitmq.healthcheck.alarms[{#SINGLETON}])=0 |
AVERAGE | |
RabbitMQ: Failed to fetch overview data | Zabbix has not received any data for items for the last 30 minutes. |
nodata(/RabbitMQ cluster by HTTP/rabbitmq.get_overview,30m)=1 |
WARNING | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher. The template to monitor RabbitMQ by Zabbix that works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template RabbitMQ Node (Zabbix version >= 4.2) collects metrics by polling the RabbitMQ management plugin with HTTP agent remotely.
This template was tested on:
Enable the RabbitMQ management plugin. See RabbitMQ's documentation to enable it.
Create a user to monitor the service:
rabbitmqctl add_user zbx_monitor <PASSWORD>
rabbitmqctl set_permissions -p / zbx_monitor "" "" ".*"
rabbitmqctl set_user_tags zbx_monitor monitoring
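To confirm the credentials against the endpoint this template polls, you can request the nodes API directly. A minimal sketch, assuming the default port 15672 on the local host:
curl -s -u zbx_monitor:<PASSWORD> http://localhost:15672/api/nodes
# A JSON array with one object per node should be returned, including the fields
# used by the node items below (fd_used, disk_free, mem_used, partitions, running, ...).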
Login and password are also set in macros:
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$RABBITMQ.API.PASSWORD} | - |
zabbix |
{$RABBITMQ.API.PORT} | The port of RabbitMQ API endpoint |
15672 |
{$RABBITMQ.API.SCHEME} | Request scheme which may be http or https |
http |
{$RABBITMQ.API.USER} | - |
zbx_monitor |
{$RABBITMQ.CLUSTER.NAME} | The name of RabbitMQ cluster |
rabbit |
{$RABBITMQ.LLD.FILTER.QUEUE.MATCHES} | Filter of discoverable queues |
.* |
{$RABBITMQ.LLD.FILTER.QUEUE.NOT_MATCHES} | Filter to exclude discovered queues |
CHANGE_IF_NEEDED |
{$RABBITMQ.MESSAGES.MAX.WARN} | Maximum number of messages in the queue for trigger expression |
1000 |
{$RABBITMQ.RESPONSE_TIME.MAX.WARN} | Maximum RabbitMQ response time in seconds for trigger expression |
10 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Health Check 3.8.10+ discovery | Version 3.8.10+ specific metrics |
DEPENDENT | rabbitmq.healthcheck.v3810.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: |
Health Check 3.8.9- discovery | Specific metrics up to and including version 3.8.4 |
DEPENDENT | rabbitmq.healthcheck.v389.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: |
Queues discovery | Individual queue metrics |
DEPENDENT | rabbitmq.queues.discovery Filter: AND- {#QUEUE} MATCHES_REGEX - {#QUEUE} NOT_MATCHES_REGEX - {#NODE} MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
RabbitMQ | RabbitMQ: Management plugin version | The version of the management plugin in use. |
DEPENDENT | rabbitmq.node.overview.management_version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT:1d |
RabbitMQ | RabbitMQ: RabbitMQ version | The version of the RabbitMQ on the node, which processed this request. |
DEPENDENT | rabbitmq.node.overview.rabbitmq_version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT:1d |
RabbitMQ | RabbitMQ: Used file descriptors | The number of file descriptors currently in use. |
DEPENDENT | rabbitmq.node.fd_used Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Free disk space | The current free disk space. |
DEPENDENT | rabbitmq.node.disk_free Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Disk free limit | The free space limit of a disk expressed in bytes. |
DEPENDENT | rabbitmq.node.disk_free_limit Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Memory used | The memory usage expressed in bytes. |
DEPENDENT | rabbitmq.node.mem_used Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Memory limit | The memory usage with high watermark properties expressed in bytes. |
DEPENDENT | rabbitmq.node.mem_limit Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Runtime run queue | The average number of Erlang processes waiting to run. |
DEPENDENT | rabbitmq.node.run_queue Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Sockets used | The number of file descriptors used as sockets. |
DEPENDENT | rabbitmq.node.sockets_used Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Sockets available | The file descriptors available for use as sockets. |
DEPENDENT | rabbitmq.node.sockets_total Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Number of network partitions | The number of network partitions this node "sees". |
DEPENDENT | rabbitmq.node.partitions Preprocessing: - JSONPATH: - JAVASCRIPT: |
RabbitMQ | RabbitMQ: Is running | Whether the node is running. |
DEPENDENT | rabbitmq.node.running Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
RabbitMQ | RabbitMQ: Memory alarm | Whether the node has a memory alarm in effect. |
DEPENDENT | rabbitmq.node.mem_alarm Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
RabbitMQ | RabbitMQ: Disk free alarm | Whether the node has a disk alarm in effect. |
DEPENDENT | rabbitmq.node.disk_free_alarm Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
RabbitMQ | RabbitMQ: Uptime | Uptime expressed in milliseconds. |
DEPENDENT | rabbitmq.node.uptime Preprocessing: - JSONPATH: - MULTIPLIER: |
RabbitMQ | RabbitMQ: Service ping | - |
SIMPLE | net.tcp.service["{$RABBITMQ.API.SCHEME}","{HOST.CONN}","{$RABBITMQ.API.PORT}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
RabbitMQ | RabbitMQ: Service response time | - |
SIMPLE | net.tcp.service.perf["{$RABBITMQ.API.SCHEME}","{HOST.CONN}","{$RABBITMQ.API.PORT}"] |
RabbitMQ | RabbitMQ: Healthcheck: local alarms in effect on this node{#SINGLETON} | Responds with a 200 OK if there are no local alarms in effect on the target node, otherwise responds with a 503 Service Unavailable. |
HTTP_AGENT | rabbitmq.healthcheck.local_alarms[{#SINGLETON}] Preprocessing: - REGEX: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT:3h |
RabbitMQ | RabbitMQ: Healthcheck: expiration date on the certificates{#SINGLETON} | Checks the expiration date on the certificates for every listener configured to use TLS. Responds with a 200 OK if all certificates are valid (have not expired), otherwise responds with a 503 Service Unavailable. |
HTTP_AGENT | rabbitmq.healthcheck.certificate_expiration[{#SINGLETON}] Preprocessing: - REGEX: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT:3h |
RabbitMQ | RabbitMQ: Healthcheck: virtual hosts on this node{#SINGLETON} | Responds with a 200 OK if all virtual hosts are running on the target node, otherwise responds with a 503 Service Unavailable. |
HTTP_AGENT | rabbitmq.healthcheck.virtual_hosts[{#SINGLETON}] Preprocessing: - REGEX: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT:3h |
RabbitMQ | RabbitMQ: Healthcheck: classic mirrored queues without synchronized mirrors online{#SINGLETON} | Checks if there are classic mirrored queues without synchronized mirrors online (queues that would potentially lose data if the target node is shut down). Responds with a 200 OK if there are no such classic mirrored queues, otherwise responds with a 503 Service Unavailable. |
HTTP_AGENT | rabbitmq.healthcheck.mirror_sync[{#SINGLETON}] Preprocessing: - REGEX: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT:3h |
RabbitMQ | RabbitMQ: Healthcheck: queues with minimum online quorum{#SINGLETON} | Checks if there are quorum queues with minimum online quorum (queues that would lose their quorum and availability if the target node is shut down). Responds with a 200 OK if there are no such quorum queues, otherwise responds with a 503 Service Unavailable. |
HTTP_AGENT | rabbitmq.healthcheck.quorum[{#SINGLETON}] Preprocessing: - REGEX: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
RabbitMQ | RabbitMQ: Healthcheck{#SINGLETON} | Checks whether the RabbitMQ application is running, whether channels and queues can be listed successfully, and whether no alarms are in effect. |
HTTP_AGENT | rabbitmq.healthcheck[{#SINGLETON}] Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Get data | The HTTP API endpoint that returns [{#VHOST}][{#QUEUE}] queue metrics |
DEPENDENT | rabbitmq.get_exchanges["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages | The count of total messages in the queue. |
DEPENDENT | rabbitmq.queue.messages["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages per second | The count of total messages per second in the queue. |
DEPENDENT | rabbitmq.queue.messages.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Consumers | The number of consumers. |
DEPENDENT | rabbitmq.queue.consumers["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Memory | The bytes of memory consumed by the Erlang process associated with the queue, including stack, heap and internal structures. |
DEPENDENT | rabbitmq.queue.memory["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages ready | The number of messages ready to be delivered to clients. |
DEPENDENT | rabbitmq.queue.messages_ready["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages ready per second | The number of messages per second ready to be delivered to clients. |
DEPENDENT | rabbitmq.queue.messages_ready.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages unacknowledged | The number of messages delivered to clients but not yet acknowledged. |
DEPENDENT | rabbitmq.queue.messages_unacknowledged["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages unacknowledged per second | The number of messages per second delivered to clients but not yet acknowledged. |
DEPENDENT | rabbitmq.queue.messages_unacknowledged.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages acknowledged | The number of messages delivered to clients and acknowledged. |
DEPENDENT | rabbitmq.queue.messages.ack["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages acknowledged per second | The number of messages (per second) delivered to clients and acknowledged. |
DEPENDENT | rabbitmq.queue.messages.ack.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages delivered | The count of messages delivered to consumers in acknowledgement mode. |
DEPENDENT | rabbitmq.queue.messages.deliver["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages delivered per second | The count of messages (per second) delivered to consumers in acknowledgement mode. |
DEPENDENT | rabbitmq.queue.messages.deliver.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Sum of messages delivered | The sum of messages delivered to consumers: in acknowledgement mode and in no-acknowledgement mode; delivered to consumers in response to basic.get. |
DEPENDENT | rabbitmq.queue.messages.deliver_get["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Sum of messages delivered per second | The rate of delivery per second. The sum of messages delivered (per second) to consumers: in acknowledgement mode and in no-acknowledgement mode; delivered to consumers in response to basic.get. |
DEPENDENT | rabbitmq.queue.messages.deliver_get.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages published | The count of published messages. |
DEPENDENT | rabbitmq.queue.messages.publish["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages published per second | The rate of published messages per second. |
DEPENDENT | rabbitmq.queue.messages.publish.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages redelivered | The count of the subset of messages in deliver_get which had the redelivered flag set. |
DEPENDENT | rabbitmq.queue.messages.redeliver["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages redelivered per second | The rate of messages redelivered per second. |
DEPENDENT | rabbitmq.queue.messages.redeliver.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Zabbix raw items | RabbitMQ: Get node overview | The HTTP API endpoint that returns cluster-wide metrics. |
HTTP_AGENT | rabbitmq.get_node_overview |
Zabbix raw items | RabbitMQ: Get nodes | The HTTP API endpoint that returns metrics of the nodes. |
HTTP_AGENT | rabbitmq.get_nodes |
Zabbix raw items | RabbitMQ: Get queues | The HTTP API endpoint that returns metrics of the queues. |
HTTP_AGENT | rabbitmq.get_queues |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
RabbitMQ: Version has changed | The RabbitMQ version has changed. Acknowledge (Ack) to close manually. |
last(/RabbitMQ node by HTTP/rabbitmq.node.overview.rabbitmq_version,#1)<>last(/RabbitMQ node by HTTP/rabbitmq.node.overview.rabbitmq_version,#2) and length(last(/RabbitMQ node by HTTP/rabbitmq.node.overview.rabbitmq_version))>0 |
INFO | Manual close: YES |
RabbitMQ: Number of network partitions is too high | https://www.rabbitmq.com/partitions.html#detecting |
min(/RabbitMQ node by HTTP/rabbitmq.node.partitions,5m)>0 |
WARNING | |
RabbitMQ: Node is not running | RabbitMQ node is not running |
max(/RabbitMQ node by HTTP/rabbitmq.node.running,5m)=0 |
AVERAGE | Depends on: - RabbitMQ: Service is down |
RabbitMQ: Memory alarm | https://www.rabbitmq.com/memory.html |
last(/RabbitMQ node by HTTP/rabbitmq.node.mem_alarm)=1 |
AVERAGE | |
RabbitMQ: Free disk space alarm | https://www.rabbitmq.com/disk-alarms.html |
last(/RabbitMQ node by HTTP/rabbitmq.node.disk_free_alarm)=1 |
AVERAGE | |
RabbitMQ: Host has been restarted | The host uptime is less than 10 minutes. |
last(/RabbitMQ node by HTTP/rabbitmq.node.uptime)<10m |
INFO | Manual close: YES |
RabbitMQ: Service is down | - |
last(/RabbitMQ node by HTTP/net.tcp.service["{$RABBITMQ.API.SCHEME}","{HOST.CONN}","{$RABBITMQ.API.PORT}"])=0 |
AVERAGE | Manual close: YES |
RabbitMQ: Service response time is too high | - |
min(/RabbitMQ node by HTTP/net.tcp.service.perf["{$RABBITMQ.API.SCHEME}","{HOST.CONN}","{$RABBITMQ.API.PORT}"],5m)>{$RABBITMQ.RESPONSE_TIME.MAX.WARN} |
WARNING | Manual close: YES Depends on: - RabbitMQ: Service is down |
RabbitMQ: There are active alarms in the node | This is the default API endpoint path: http://{HOST.CONN}:{$RABBITMQ.API.PORT}/api/index.html. |
last(/RabbitMQ node by HTTP/rabbitmq.healthcheck.local_alarms[{#SINGLETON}])=0 |
AVERAGE | |
RabbitMQ: There are valid TLS certificates expiring in the next month | This is the default API endpoint path: http://{HOST.CONN}:{$RABBITMQ.API.PORT}/api/index.html. |
last(/RabbitMQ node by HTTP/rabbitmq.healthcheck.certificate_expiration[{#SINGLETON}])=0 |
AVERAGE | |
RabbitMQ: There are virtual hosts that are not running | This is the default API endpoint path: http://{HOST.CONN}:{$RABBITMQ.API.PORT}/api/index.html. |
last(/RabbitMQ node by HTTP/rabbitmq.healthcheck.virtual_hosts[{#SINGLETON}])=0 |
AVERAGE | |
RabbitMQ: There are queues that could potentially lose data if this node goes offline. | This is the default API endpoint path: http://{HOST.CONN}:{$RABBITMQ.API.PORT}/api/index.html. |
last(/RabbitMQ node by HTTP/rabbitmq.healthcheck.mirror_sync[{#SINGLETON}])=0 |
AVERAGE | |
RabbitMQ: There are queues that would lose their quorum and availability if this node is shut down. | This is the default API endpoint path: http://{HOST.CONN}:{$RABBITMQ.API.PORT}/api/index.html. |
last(/RabbitMQ node by HTTP/rabbitmq.healthcheck.quorum[{#SINGLETON}])=0 |
AVERAGE | |
RabbitMQ: Node healthcheck failed | https://www.rabbitmq.com/monitoring.html#health-checks |
last(/RabbitMQ node by HTTP/rabbitmq.healthcheck[{#SINGLETON}])=0 |
AVERAGE | |
RabbitMQ: Too many messages in queue [{#VHOST}][{#QUEUE}] | - |
min(/RabbitMQ node by HTTP/rabbitmq.queue.messages["{#VHOST}/{#QUEUE}"],5m)>{$RABBITMQ.MESSAGES.MAX.WARN:"{#QUEUE}"} |
WARNING | |
RabbitMQ: Failed to fetch nodes data | Zabbix has not received data for items for the last 30 minutes. |
nodata(/RabbitMQ node by HTTP/rabbitmq.get_nodes,30m)=1 |
WARNING | Manual close: YES Depends on: - RabbitMQ: Service is down |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher. This template is developed to monitor the messaging broker RabbitMQ by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template RabbitMQ Cluster collects metrics by polling the RabbitMQ management plugin with Zabbix agent.
This template was tested on:
See Zabbix template operation for basic instructions.
Enable the RabbitMQ management plugin. See RabbitMQ documentation for the instructions.
Create a user to monitor the service:
rabbitmqctl add_user zbx_monitor <PASSWORD>
rabbitmqctl set_permissions -p / zbx_monitor "" "" ".*"
rabbitmqctl set_user_tags zbx_monitor monitoring
A login name and password are also set in macros:
If your cluster consists of several nodes, it is recommended to assign the cluster template to a separate balancing host.
In the case of a single-node installation, you can assign the cluster template to the same host as the node template.
If you use another API endpoint, then don't forget to change the {$RABBITMQ.API.CLUSTER_HOST} macro.
Install and set up Zabbix agent.
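With the agent running, you can test the exact key this template uses with zabbix_get from the Zabbix server or proxy. A minimal sketch, assuming the default macro values and an agent listening on 127.0.0.1:
zabbix_get -s 127.0.0.1 -k 'web.page.get["http://zbx_monitor:<PASSWORD>@127.0.0.1:15672/api/overview"]'
# The HTTP response headers followed by the JSON body of /api/overview should be printed.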
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$RABBITMQ.API.CLUSTER_HOST} | The hostname or an IP of the API endpoint for the RabbitMQ cluster. |
127.0.0.1 |
{$RABBITMQ.API.PASSWORD} | - |
zabbix |
{$RABBITMQ.API.PORT} | The port of the RabbitMQ API endpoint. |
15672 |
{$RABBITMQ.API.SCHEME} | The request scheme, which may be HTTP or HTTPS. |
http |
{$RABBITMQ.API.USER} | - |
zbx_monitor |
{$RABBITMQ.LLD.FILTER.EXCHANGE.MATCHES} | This macro is used in the discovery of exchanges. It can be overridden at host level or its linked template level. |
.* |
{$RABBITMQ.LLD.FILTER.EXCHANGE.NOT_MATCHES} | This macro is used in the discovery of exchanges. It can be overridden at host level or its linked template level. |
CHANGE_IF_NEEDED |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Exchanges discovery | The metrics for an individual exchange. |
DEPENDENT | rabbitmq.exchanges.discovery Filter: AND- {#EXCHANGE} MATCHES_REGEX - {#EXCHANGE} NOT_MATCHES_REGEX |
Health Check 3.8.10+ discovery | Specific metrics for versions 3.8.10 and newer. |
DEPENDENT | rabbitmq.healthcheck.v3810.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
RabbitMQ | RabbitMQ: Connections total | The total number of connections. |
DEPENDENT | rabbitmq.overview.object_totals.connections Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Channels total | The total number of channels. |
DEPENDENT | rabbitmq.overview.object_totals.channels Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queues total | The total number of queues. |
DEPENDENT | rabbitmq.overview.object_totals.queues Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Consumers total | The total number of consumers. |
DEPENDENT | rabbitmq.overview.object_totals.consumers Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Exchanges total | The total number of exchanges. |
DEPENDENT | rabbitmq.overview.object_totals.exchanges Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Messages total | The total number of messages (ready, plus unacknowledged). |
DEPENDENT | rabbitmq.overview.queue_totals.messages Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages ready for delivery | The number of messages ready for delivery. |
DEPENDENT | rabbitmq.overview.queue_totals.messages.ready Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages unacknowledged | The number of unacknowledged messages. |
DEPENDENT | rabbitmq.overview.queue_totals.messages.unacknowledged Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages acknowledged | The number of messages delivered to clients and acknowledged. |
DEPENDENT | rabbitmq.overview.messages.ack Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages acknowledged per second | The rate of messages (per second) delivered to clients and acknowledged. |
DEPENDENT | rabbitmq.overview.messages.ack.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages confirmed | The count of confirmed messages. |
DEPENDENT | rabbitmq.overview.messages.confirm Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages confirmed per second | The rate of confirmed messages per second. |
DEPENDENT | rabbitmq.overview.messages.confirm.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages delivered | The sum of messages delivered to consumers: in acknowledgement mode and in no-acknowledgement mode; delivered to consumers in response to basic.get. |
DEPENDENT | rabbitmq.overview.messages.deliver_get Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages delivered per second | The rate of the sum of messages (per second) delivered to consumers: in acknowledgement mode and in no-acknowledgement mode; delivered to consumers in response to basic.get. |
DEPENDENT | rabbitmq.overview.messages.deliver_get.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages published | The count of published messages. |
DEPENDENT | rabbitmq.overview.messages.publish Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages published per second | The rate of published messages per second. |
DEPENDENT | rabbitmq.overview.messages.publish.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages publish_in | The count of messages published from the channels into this overview. |
DEPENDENT | rabbitmq.overview.messages.publish_in Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages publish_in per second | The rate of messages (per second) published from the channels into this overview. |
DEPENDENT | rabbitmq.overview.messages.publish_in.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages publish_out | The count of messages published from this overview into queues. |
DEPENDENT | rabbitmq.overview.messages.publish_out Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages publish_out per second | The rate of messages (per second) published from this overview into queues. |
DEPENDENT | rabbitmq.overview.messages.publish_out.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages returned unroutable | The count of messages returned to a publisher as unroutable. |
DEPENDENT | rabbitmq.overview.messages.return_unroutable Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages returned unroutable per second | The rate of messages (per second) returned to a publisher as unroutable. |
DEPENDENT | rabbitmq.overview.messages.return_unroutable.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Messages returned redeliver | The count of the subset of messages in deliver_get which had the redelivered flag set. |
DEPENDENT | rabbitmq.overview.messages.redeliver Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Messages returned redeliver per second | The rate of the subset of messages (per second) in deliver_get which had the redelivered flag set. |
DEPENDENT | rabbitmq.overview.messages.redeliver.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Healthcheck: alarms in effect in the cluster{#SINGLETON} | It responds with a status code 200 OK if there are no alarms in effect in the cluster; otherwise, it responds with a status code 503 Service Unavailable. |
ZABBIX_PASSIVE | web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.CLUSTER_HOST}:{$RABBITMQ.API.PORT}/api/health/checks/alarms{#SINGLETON}"] Preprocessing: - REGEX: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT:3h |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages acknowledged | The number of messages delivered to clients and acknowledged. |
DEPENDENT | rabbitmq.exchange.messages.ack["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages acknowledged per second | The rate of messages (per second) delivered to clients and acknowledged. |
DEPENDENT | rabbitmq.exchange.messages.ack.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages confirmed | The count of confirmed messages. |
DEPENDENT | rabbitmq.exchange.messages.confirm["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages confirmed per second | The rate of messages confirmed per second. |
DEPENDENT | rabbitmq.exchange.messages.confirm.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages delivered | The sum of messages delivered to consumers: in acknowledgement mode and in no-acknowledgement mode; delivered to consumers in response to the basic.get. |
DEPENDENT | rabbitmq.exchange.messages.deliver_get["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages delivered per second | The rate of the sum of messages (per second) delivered to consumers: in acknowledgement mode and in no-acknowledgement mode; delivered to consumers in response to the basic.get. |
DEPENDENT | rabbitmq.exchange.messages.deliver_get.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages published | The count of published messages. |
DEPENDENT | rabbitmq.exchange.messages.publish["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages published per second | The rate of messages published per second. |
DEPENDENT | rabbitmq.exchange.messages.publish.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages publish_in | The count of messages published from the channels into this overview. |
DEPENDENT | rabbitmq.exchange.messages.publish_in["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages publish_in per second | The rate of messages (per second) published from the channels into this overview. |
DEPENDENT | rabbitmq.exchange.messages.publish_in.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages publish_out | The count of messages published from this overview into queues. |
DEPENDENT | rabbitmq.exchange.messages.publish_out["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages publish_out per second | The rate of messages (per second) published from this overview into queues. |
DEPENDENT | rabbitmq.exchange.messages.publish_out.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages returned unroutable | The count of messages returned to a publisher as unroutable. |
DEPENDENT | rabbitmq.exchange.messages.return_unroutable["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages returned unroutable per second | The rate of messages (per second) returned to a publisher as unroutable. |
DEPENDENT | rabbitmq.exchange.messages.return_unroutable.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL:CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages redelivered | The count of the subset of messages in deliver_get which had the redelivered flag set. |
DEPENDENT | rabbitmq.exchange.messages.redeliver["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Messages redelivered per second | The rate of the subset of messages (per second) in deliver_get which had the redelivered flag set. |
DEPENDENT | rabbitmq.exchange.messages.redeliver.rate["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Zabbix raw items | RabbitMQ: Get overview | The HTTP API endpoint that returns cluster-wide metrics. |
ZABBIX_PASSIVE | web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.CLUSTER_HOST}:{$RABBITMQ.API.PORT}/api/overview"] Preprocessing: - REGEX: |
Zabbix raw items | RabbitMQ: Get exchanges | The HTTP API endpoint that returns exchanges metrics. |
ZABBIX_PASSIVE | web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.CLUSTER_HOST}:{$RABBITMQ.API.PORT}/api/exchanges"] Preprocessing: - REGEX: |
Zabbix raw items | RabbitMQ: Exchange [{#VHOST}][{#EXCHANGE}][{#TYPE}]: Get data | The HTTP API endpoint that returns [{#VHOST}][{#EXCHANGE}][{#TYPE}] exchanges metrics |
DEPENDENT | rabbitmq.get_exchanges["{#VHOST}/{#EXCHANGE}/{#TYPE}"] Preprocessing: - JSONPATH: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
RabbitMQ: There are active alarms in the cluster | This is the default API endpoint path: http://{HOST.CONN}:{$RABBITMQ.API.PORT}/api/index.html. |
last(/RabbitMQ cluster by Zabbix agent/web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.CLUSTER_HOST}:{$RABBITMQ.API.PORT}/api/health/checks/alarms{#SINGLETON}"])=0 |
AVERAGE | |
RabbitMQ: Failed to fetch overview data | Zabbix has not received any data for items for the last 30 minutes. |
nodata(/RabbitMQ cluster by Zabbix agent/web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.CLUSTER_HOST}:{$RABBITMQ.API.PORT}/api/overview"],30m)=1 |
WARNING | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher.
This template is developed to monitor RabbitMQ by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template RabbitMQ Node (Zabbix version >= 4.2) collects metrics by polling the RabbitMQ management plugin with Zabbix agent.
It also uses Zabbix agent to collect RabbitMQ Linux process statistics, such as CPU usage, memory usage, and whether the process is running or not.
This template was tested on:
Enable the RabbitMQ management plugin. See RabbitMQ documentation for the instructions.
Create a user to monitor the service:
rabbitmqctl add_user zbx_monitor <PASSWORD>
rabbitmqctl set_permissions -p / zbx_monitor "" "" ".*"
rabbitmqctl set_user_tags zbx_monitor monitoring
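You can probe the health check endpoints used by this template manually before assigning it. A minimal sketch, assuming RabbitMQ 3.8.10 or newer with the management plugin on the default port of the local host:
curl -s -u zbx_monitor:<PASSWORD> http://127.0.0.1:15672/api/health/checks/local-alarms
# {"status":"ok"} is returned when no local alarms are in effect; the item's REGEX and
# JAVASCRIPT preprocessing steps reduce the response to 1/0 for the trigger.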
A login name and password are also set in macros:
If you use another API endpoint, then don't forget to change the {$RABBITMQ.API.HOST} macro.
Install and set up Zabbix agent.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$RABBITMQ.API.HOST} | The hostname or an IP of the API endpoint for the RabbitMQ. |
127.0.0.1 |
{$RABBITMQ.API.PASSWORD} | - |
zabbix |
{$RABBITMQ.API.PORT} | The port of the RabbitMQ API endpoint. |
15672 |
{$RABBITMQ.API.SCHEME} | The request scheme, which may be HTTP or HTTPS. |
http |
{$RABBITMQ.API.USER} | - |
zbx_monitor |
{$RABBITMQ.CLUSTER.NAME} | The name of the RabbitMQ cluster. |
rabbit |
{$RABBITMQ.LLD.FILTER.QUEUE.MATCHES} | This macro is used in the discovery of queues. It can be overridden at host level or its linked template level. |
.* |
{$RABBITMQ.LLD.FILTER.QUEUE.NOT_MATCHES} | This macro is used in the discovery of queues. It can be overridden at host level or its linked template level. |
CHANGE_IF_NEEDED |
{$RABBITMQ.MESSAGES.MAX.WARN} | The maximum number of messages in the queue for a trigger expression. |
1000 |
{$RABBITMQ.PROCESS_NAME} | The name of the RabbitMQ server process. |
beam.smp |
{$RABBITMQ.RESPONSE_TIME.MAX.WARN} | The maximum response time by the RabbitMQ expressed in seconds for a trigger expression. |
10 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Health Check 3.8.10+ discovery | Specific metrics for versions 3.8.10 and newer. |
DEPENDENT | rabbitmq.healthcheck.v3810.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: |
Health Check 3.8.9- discovery | Specific metrics for versions up to and including 3.8.4. |
DEPENDENT | rabbitmq.healthcheck.v389.discovery Preprocessing: - JSONPATH: - JAVASCRIPT: |
Queues discovery | The metrics for an individual queue. |
DEPENDENT | rabbitmq.queues.discovery Filter: AND- {#QUEUE} MATCHES_REGEX - {#QUEUE} NOT_MATCHES_REGEX - {#NODE} MATCHES_REGEX |
RabbitMQ process discovery | The discovery of the RabbitMQ summary processes. |
DEPENDENT | rabbitmq.proc.discovery Filter: AND- {#NAME} MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
RabbitMQ | RabbitMQ: Get nodes | The HTTP API endpoint that returns metrics from the nodes. |
ZABBIX_PASSIVE | web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/nodes/{$RABBITMQ.CLUSTER.NAME}@{HOST.NAME}?memory=true"] Preprocessing: - REGEX: |
RabbitMQ | RabbitMQ: Management plugin version | The version of the management plugin in use. |
DEPENDENT | rabbitmq.node.overview.management_version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT:1d |
RabbitMQ | RabbitMQ: RabbitMQ version | The version of the RabbitMQ on the node, which processed this request. |
DEPENDENT | rabbitmq.node.overview.rabbitmq_version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT:1d |
RabbitMQ | RabbitMQ: Used file descriptors | The number of file descriptors currently in use. |
DEPENDENT | rabbitmq.node.fd_used Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Free disk space | The current free disk space. |
DEPENDENT | rabbitmq.node.disk_free Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Memory used | The memory usage expressed in bytes. |
DEPENDENT | rabbitmq.node.mem_used Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Memory limit | The memory usage with high watermark properties expressed in bytes. |
DEPENDENT | rabbitmq.node.mem_limit Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Disk free limit | The free space limit of a disk expressed in bytes. |
DEPENDENT | rabbitmq.node.disk_free_limit Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Runtime run queue | The average number of Erlang processes waiting to run. |
DEPENDENT | rabbitmq.node.run_queue Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Sockets used | The number of file descriptors used as sockets. |
DEPENDENT | rabbitmq.node.sockets_used Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Sockets available | The file descriptors available for use as sockets. |
DEPENDENT | rabbitmq.node.sockets_total Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Number of network partitions | The number of network partitions this node "sees". |
DEPENDENT | rabbitmq.node.partitions Preprocessing: - JSONPATH: - JAVASCRIPT: |
RabbitMQ | RabbitMQ: Is running | Whether the node is running. |
DEPENDENT | rabbitmq.node.running Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
RabbitMQ | RabbitMQ: Memory alarm | Whether the node has a memory alarm in effect. |
DEPENDENT | rabbitmq.node.mem_alarm Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
RabbitMQ | RabbitMQ: Disk free alarm | Whether the node has a disk alarm in effect. |
DEPENDENT | rabbitmq.node.disk_free_alarm Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
RabbitMQ | RabbitMQ: Uptime | Uptime expressed in milliseconds. |
DEPENDENT | rabbitmq.node.uptime Preprocessing: - JSONPATH: - MULTIPLIER: |
RabbitMQ | RabbitMQ: Get processes summary | The aggregated data of summary metrics for all processes. |
ZABBIX_PASSIVE | proc.get[,,,summary] |
RabbitMQ | RabbitMQ: Service ping | - |
ZABBIX_PASSIVE | net.tcp.service["{$RABBITMQ.API.SCHEME}","{$RABBITMQ.API.HOST}","{$RABBITMQ.API.PORT}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
RabbitMQ | RabbitMQ: Service response time | - |
ZABBIX_PASSIVE | net.tcp.service.perf["{$RABBITMQ.API.SCHEME}","{$RABBITMQ.API.HOST}","{$RABBITMQ.API.PORT}"] |
RabbitMQ | RabbitMQ: Get process data | The summary metrics aggregated by a process {#NAME}. |
DEPENDENT | rabbitmq.proc.get[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Number of running processes | The number of running processes {#NAME}. |
DEPENDENT | rabbitmq.proc.num[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
RabbitMQ | RabbitMQ: Memory usage (rss) | The summary of resident set size memory used by a process {#NAME} expressed in bytes. |
DEPENDENT | rabbitmq.proc.rss[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Memory usage (vsize) | The summary of virtual memory used by a process {#NAME} expressed in bytes. |
DEPENDENT | rabbitmq.proc.vmem[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Memory usage, % | The percentage of real memory used by a process {#NAME}. |
DEPENDENT | rabbitmq.proc.pmem[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: CPU utilization | The percentage of the CPU utilization by a process {#NAME}. |
ZABBIX_PASSIVE | proc.cpu.util[{#NAME}] |
RabbitMQ | RabbitMQ: Healthcheck: local alarms in effect on this node{#SINGLETON} | It responds with a status code 200 OK if there are no local alarms in effect on the target node; otherwise, it responds with a status code 503 Service Unavailable. |
ZABBIX_PASSIVE | web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/health/checks/local-alarms{#SINGLETON}"] Preprocessing: - REGEX: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
RabbitMQ | RabbitMQ: Healthcheck: expiration date on the certificates{#SINGLETON} | It checks the expiration date on the certificates for every listener configured to use the Transport Layer Security (TLS). It responds with a status code 200 OK if all certificates are valid (have not expired); otherwise, it responds with a status code 503 Service Unavailable. |
ZABBIX_PASSIVE | web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/health/checks/certificate-expiration/1/months{#SINGLETON}"] Preprocessing: - REGEX: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
RabbitMQ | RabbitMQ: Healthcheck: virtual hosts on this node{#SINGLETON} | It responds with a status code 200 OK if all virtual hosts are running on the target node; otherwise, it responds with a status code 503 Service Unavailable. |
ZABBIX_PASSIVE | web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/health/checks/virtual-hosts{#SINGLETON}"] Preprocessing: - REGEX: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
RabbitMQ | RabbitMQ: Healthcheck: classic mirrored queues without synchronized mirrors online{#SINGLETON} | It checks if there are classic mirrored queues without synchronized mirrors online (queues that would potentially lose data if the target node is shut down). It responds with a status code 200 OK if there are no such classic mirrored queues; otherwise, it responds with a status code 503 Service Unavailable. |
ZABBIX_PASSIVE | web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/health/checks/node-is-mirror-sync-critical{#SINGLETON}"] Preprocessing: - REGEX: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
RabbitMQ | RabbitMQ: Healthcheck: queues with minimum online quorum{#SINGLETON} | It checks if there are quorum queues with minimum online quorum (queues that would lose their quorum and availability if the target node is shut down). It responds with a status code 200 OK if there are no such quorum queues; otherwise, it responds with a status code 503 Service Unavailable. |
ZABBIX_PASSIVE | web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/health/checks/node-is-quorum-critical{#SINGLETON}"] Preprocessing: - REGEX: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
RabbitMQ | RabbitMQ: Healthcheck{#SINGLETON} | It checks whether the RabbitMQ application is running, whether channels and queues can be listed successfully, and whether no alarms are in effect. |
ZABBIX_PASSIVE | web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/healthchecks/node{#SINGLETON}"] Preprocessing: - REGEX: - JSONPATH: - BOOL_TO_DECIMAL ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Get data | The HTTP API endpoint that returns [{#VHOST}][{#QUEUE}] queue metrics |
DEPENDENT | rabbitmq.get_exchanges["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages | The count of total messages in the queue. |
DEPENDENT | rabbitmq.queue.messages["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages per second | The count of total messages per second in the queue. |
DEPENDENT | rabbitmq.queue.messages.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Consumers | The number of consumers. |
DEPENDENT | rabbitmq.queue.consumers["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Memory | The bytes of memory consumed by the Erlang process associated with the queue, including stack, heap and internal structures. |
DEPENDENT | rabbitmq.queue.memory["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages ready | The number of messages ready to be delivered to clients. |
DEPENDENT | rabbitmq.queue.messages_ready["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages ready per second | The number of messages per second ready to be delivered to clients. |
DEPENDENT | rabbitmq.queue.messages_ready.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages unacknowledged | The number of messages delivered to clients but not yet acknowledged. |
DEPENDENT | rabbitmq.queue.messages_unacknowledged["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages unacknowledged per second | The number of messages per second delivered to clients but not yet acknowledged. |
DEPENDENT | rabbitmq.queue.messages_unacknowledged.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages acknowledged | The number of messages delivered to clients and acknowledged. |
DEPENDENT | rabbitmq.queue.messages.ack["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages acknowledged per second | The number of messages (per second) delivered to clients and acknowledged. |
DEPENDENT | rabbitmq.queue.messages.ack.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages delivered | The count of messages delivered to consumers in acknowledgement mode. |
DEPENDENT | rabbitmq.queue.messages.deliver["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages delivered per second | The count of messages (per second) delivered to consumers in acknowledgement mode. |
DEPENDENT | rabbitmq.queue.messages.deliver.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Sum of messages delivered | The sum of messages delivered to consumers: in acknowledgement mode and in no-acknowledgement mode; delivered to consumers in response to basic.get: in acknowledgement mode and in no-acknowledgement mode. |
DEPENDENT | rabbitmq.queue.messages.deliver_get["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Sum of messages delivered per second | The rate of delivery per second. The sum of messages delivered (per second) to consumers: in acknowledgement mode and in no-acknowledgement mode; delivered to consumers in response to basic.get: in acknowledgement mode and in no-acknowledgement mode. |
DEPENDENT | rabbitmq.queue.messages.deliver_get.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages published | The count of published messages. |
DEPENDENT | rabbitmq.queue.messages.publish["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages published per second | The rate of published messages per second. |
DEPENDENT | rabbitmq.queue.messages.publish.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages redelivered | The count of the subset of messages in deliver_get that had the redelivered flag set. |
DEPENDENT | rabbitmq.queue.messages.redeliver["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
RabbitMQ | RabbitMQ: Queue [{#VHOST}][{#QUEUE}]: Messages redelivered per second | The rate of messages redelivered per second. |
DEPENDENT | rabbitmq.queue.messages.redeliver.rate["{#VHOST}/{#QUEUE}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Zabbix raw items | RabbitMQ: Get node overview | The HTTP API endpoint that returns cluster-wide metrics. |
ZABBIX_PASSIVE | web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/overview"] Preprocessing: - REGEX: |
Zabbix raw items | RabbitMQ: Get queues | The HTTP API endpoint that returns queue metrics. |
ZABBIX_PASSIVE | web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/queues"] Preprocessing: - REGEX: |
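Before relying on the template, you can check by hand that the management API endpoints it polls are reachable. A minimal check with curl, assuming the management plugin listens on localhost:15672 with the default guest account (replace the host, port, and credentials with the values of your {$RABBITMQ.API.*} macros):
$ curl -s -u guest:guest http://localhost:15672/api/overview
$ curl -s -u guest:guest http://localhost:15672/api/queues
# Health check endpoints answer HTTP 200 when healthy and 503 otherwise:
$ curl -s -o /dev/null -w '%{http_code}\n' -u guest:guest http://localhost:15672/api/health/checks/local-alarms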
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
RabbitMQ: Version has changed | The RabbitMQ version has changed. Acknowledge (Ack) to close manually. |
last(/RabbitMQ node by Zabbix agent/rabbitmq.node.overview.rabbitmq_version,#1)<>last(/RabbitMQ node by Zabbix agent/rabbitmq.node.overview.rabbitmq_version,#2) and length(last(/RabbitMQ node by Zabbix agent/rabbitmq.node.overview.rabbitmq_version))>0 |
INFO | Manual close: YES |
RabbitMQ: Number of network partitions is too high | For more details see Detecting Network Partitions. |
min(/RabbitMQ node by Zabbix agent/rabbitmq.node.partitions,5m)>0 |
WARNING | |
RabbitMQ: Memory alarm | For more details see Memory Alarms. |
last(/RabbitMQ node by Zabbix agent/rabbitmq.node.mem_alarm)=1 |
AVERAGE | |
RabbitMQ: Free disk space alarm | For more details see Free Disk Space Alarms. |
last(/RabbitMQ node by Zabbix agent/rabbitmq.node.disk_free_alarm)=1 |
AVERAGE | |
RabbitMQ: Host has been restarted | The host uptime is less than 10 minutes. |
last(/RabbitMQ node by Zabbix agent/rabbitmq.node.uptime)<10m |
INFO | Manual close: YES |
RabbitMQ: Process is not running | - |
last(/RabbitMQ node by Zabbix agent/rabbitmq.proc.num[{#NAME}])=0 |
HIGH | |
RabbitMQ: Service is down | - |
last(/RabbitMQ node by Zabbix agent/net.tcp.service["{$RABBITMQ.API.SCHEME}","{$RABBITMQ.API.HOST}","{$RABBITMQ.API.PORT}"])=0 and last(/RabbitMQ node by Zabbix agent/rabbitmq.proc.num[{#NAME}])>0 |
AVERAGE | Manual close: YES |
RabbitMQ: Failed to fetch nodes data | Zabbix has not received any data for items for the last 30 minutes. |
nodata(/RabbitMQ node by Zabbix agent/web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/nodes/{$RABBITMQ.CLUSTER.NAME}@{HOST.NAME}?memory=true"],30m)=1 and last(/RabbitMQ node by Zabbix agent/rabbitmq.proc.num[{#NAME}])>0 |
WARNING | Manual close: YES Depends on: - RabbitMQ: Process is not running |
RabbitMQ: Node is not running | The RabbitMQ node is not running. |
max(/RabbitMQ node by Zabbix agent/rabbitmq.node.running,5m)=0 and last(/RabbitMQ node by Zabbix agent/rabbitmq.proc.num[{#NAME}])>0 |
AVERAGE | Depends on: - RabbitMQ: Service is down |
RabbitMQ: Service response time is too high | - |
min(/RabbitMQ node by Zabbix agent/net.tcp.service.perf["{$RABBITMQ.API.SCHEME}","{$RABBITMQ.API.HOST}","{$RABBITMQ.API.PORT}"],5m)>{$RABBITMQ.RESPONSE_TIME.MAX.WARN} and last(/RabbitMQ node by Zabbix agent/rabbitmq.proc.num[{#NAME}])>0 |
WARNING | Manual close: YES Depends on: - RabbitMQ: Service is down |
RabbitMQ: There are active alarms in the node | It checks the active alarms in the nodes via API. This is the default API endpoint path: http://{HOST.CONN}:{$RABBITMQ.API.PORT}/api/index.html. |
last(/RabbitMQ node by Zabbix agent/web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/health/checks/local-alarms{#SINGLETON}"])=0 |
AVERAGE | |
RabbitMQ: There are valid TLS certificates expiring in the next month | It checks if there are valid TLS certificates expiring in the next month. This is the default API endpoint path: http://{HOST.CONN}:{$RABBITMQ.API.PORT}/api/index.html. |
last(/RabbitMQ node by Zabbix agent/web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/health/checks/certificate-expiration/1/months{#SINGLETON}"])=0 |
AVERAGE | |
RabbitMQ: There are not running virtual hosts | It checks via API whether any virtual hosts are not running. This is the default API endpoint path: http://{HOST.CONN}:{$RABBITMQ.API.PORT}/api/index.html. |
last(/RabbitMQ node by Zabbix agent/web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/health/checks/virtual-hosts{#SINGLETON}"])=0 |
AVERAGE | |
RabbitMQ: There are queues that could potentially lose data if this node goes offline. | It checks via API whether there are queues that could potentially lose data if this node goes offline. This is the default API endpoint path: http://{HOST.CONN}:{$RABBITMQ.API.PORT}/api/index.html. |
last(/RabbitMQ node by Zabbix agent/web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/health/checks/node-is-mirror-sync-critical{#SINGLETON}"])=0 |
AVERAGE | |
RabbitMQ: There are queues that would lose their quorum and availability if this node is shut down. | It checks via API whether there are queues that would lose their quorum and availability if this node is shut down. This is the default API endpoint path: http://{HOST.CONN}:{$RABBITMQ.API.PORT}/api/index.html. |
last(/RabbitMQ node by Zabbix agent/web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/health/checks/node-is-quorum-critical{#SINGLETON}"])=0 |
AVERAGE | |
RabbitMQ: Node healthcheck failed | For more details see Health Checks. |
last(/RabbitMQ node by Zabbix agent/web.page.get["{$RABBITMQ.API.SCHEME}://{$RABBITMQ.API.USER}:{$RABBITMQ.API.PASSWORD}@{$RABBITMQ.API.HOST}:{$RABBITMQ.API.PORT}/api/healthchecks/node{#SINGLETON}"])=0 |
AVERAGE | |
RabbitMQ: Too many messages in queue [{#VHOST}][{#QUEUE}] | - |
min(/RabbitMQ node by Zabbix agent/rabbitmq.queue.messages["{#VHOST}/{#QUEUE}"],5m)>{$RABBITMQ.MESSAGES.MAX.WARN:"{#QUEUE}"} |
WARNING |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
Proxmox VE uses a REST-like API. The concept is described in Resource Oriented Architecture (ROA).
JSON is used as the primary data format, and the whole API is formally defined using JSON Schema.
You can explore the API documentation at http://pve.proxmox.com/pve-docs/api-viewer/index.html
Create an API token for the monitoring user. Important note: for security reasons, it is recommended to create a separate user (Datacenter - Permissions).
For the created API token and user, provide the necessary access levels:
Check: ["perm","/",["Sys.Audit"]]
Check: ["perm","/nodes/{node}",["Sys.Audit"]]
Check: ["perm","/vms/{vmid}",["VM.Audit"]]
Copy the resulting Token ID and Secret into host macros.
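For example, a dedicated monitoring user and token can be created from the PVE shell; the user and token names below are placeholders, and the exact pveum option names may differ between PVE versions. The built-in PVEAuditor role covers the Sys.Audit and VM.Audit permissions listed above:
$ pveum user add monitor@pve --comment "Zabbix monitoring"
$ pveum acl modify / --users monitor@pve --roles PVEAuditor
$ pveum user token add monitor@pve zabbix --privsep 0
You can then verify the token before filling in the host macros (replace the host address and secret):
$ curl -k -H 'Authorization: PVEAPIToken=monitor@pve!zabbix=<secret>' https://192.0.2.10:8006/api2/json/cluster/resources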
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$PVE.CPU.PUSE.MAX.WARN} | Maximum used CPU in percentage. |
90 |
{$PVE.LXC.CPU.PUSE.MAX.WARN} | Maximum used CPU in percentage. |
90 |
{$PVE.LXC.MEMORY.PUSE.MAX.WARN} | Maximum used memory in percentage. |
90 |
{$PVE.MEMORY.PUSE.MAX.WARN} | Maximum used memory in percentage. |
90 |
{$PVE.ROOT.PUSE.MAX.WARN} | Maximum used root space in percentage. |
90 |
{$PVE.STORAGE.PUSE.MAX.WARN} | Maximum used storage space in percentage. |
90 |
{$PVE.SWAP.PUSE.MAX.WARN} | Maximum used swap space in percentage. |
90 |
{$PVE.TOKEN.ID} | API tokens allow stateless access to most parts of the REST API by another system, software or API client. |
USER@REALM!TOKENID |
{$PVE.TOKEN.SECRET} | Secret key. |
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx |
{$PVE.URL.PORT} | The API uses the HTTPS protocol and the server listens to port 8006 by default. |
8006 |
{$PVE.VM.CPU.PUSE.MAX.WARN} | Maximum used CPU in percentage. |
90 |
{$PVE.VM.MEMORY.PUSE.MAX.WARN} | Maximum used memory in percentage. |
90 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Cluster discovery | - |
DEPENDENT | proxmox.cluster.discovery Filter: AND- {#RESOURCE.TYPE} MATCHES_REGEX |
LXC discovery | - |
DEPENDENT | proxmox.lxc.discovery Filter: AND- {#RESOURCE.TYPE} MATCHES_REGEX |
Node discovery | - |
DEPENDENT | proxmox.node.discovery Filter: AND- {#RESOURCE.TYPE} MATCHES_REGEX |
QEMU discovery | - |
DEPENDENT | proxmox.qemu.discovery Filter: AND- {#RESOURCE.TYPE} MATCHES_REGEX |
Storage discovery | - |
DEPENDENT | proxmox.storage.discovery Filter: AND- {#RESOURCE.TYPE} MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
CPU | Proxmox: Node [{#NODE.NAME}]: CPU, usage | CPU usage. |
DEPENDENT | proxmox.node.cpu[{#NODE.NAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
CPU | Proxmox: Node [{#NODE.NAME}]: CPU, loadavg | CPU average load. |
DEPENDENT | proxmox.node.loadavg[{#NODE.NAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
CPU | Proxmox: Node [{#NODE.NAME}]: CPU, iowait | CPU iowait time. |
DEPENDENT | proxmox.node.iowait[{#NODE.NAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
CPU | Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})]: CPU usage | CPU load. |
DEPENDENT | proxmox.qemu.cpu[{#QEMU.ID}] Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
CPU | Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})]: CPU usage | CPU load. |
DEPENDENT | proxmox.lxc.cpu[{#LXC.ID}] Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
General | Proxmox: Node [{#NODE.NAME}]: Time zone | Time zone. |
DEPENDENT | proxmox.node.timezone[{#NODE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
General | Proxmox: Node [{#NODE.NAME}]: Localtime | Seconds since 1970-01-01 00:00:00 (local time). |
DEPENDENT | proxmox.node.localtime[{#NODE.NAME}] Preprocessing: - JSONPATH: |
General | Proxmox: Node [{#NODE.NAME}]: Time | Seconds since 1970-01-01 00:00:00 UTC. |
DEPENDENT | proxmox.node.utctime[{#NODE.NAME}] Preprocessing: - JSONPATH: |
Inventory | Proxmox: Node [{#NODE.NAME}]: PVE version | PVE manager version. |
DEPENDENT | proxmox.node.pveversion[{#NODE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Inventory | Proxmox: Node [{#NODE.NAME}]: Kernel version | Kernel version info. |
DEPENDENT | proxmox.node.kernelversion[{#NODE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Memory | Proxmox: Node [{#NODE.NAME}]: Memory, used | Memory usage. |
DEPENDENT | proxmox.node.memused[{#NODE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Memory | Proxmox: Node [{#NODE.NAME}]: Memory, total | Memory total. |
DEPENDENT | proxmox.node.memtotal[{#NODE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Memory | Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})]: Memory usage | Used memory in Bytes. |
DEPENDENT | proxmox.qemu.mem[{#QEMU.ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Memory | Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})]: Memory total | Total memory in Bytes. |
DEPENDENT | proxmox.qemu.maxmem[{#QEMU.ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Memory | Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})]: Memory usage | Used memory in Bytes. |
DEPENDENT | proxmox.lxc.mem[{#LXC.ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Memory | Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})]: Memory total | Total memory in Bytes. |
DEPENDENT | proxmox.lxc.maxmem[{#LXC.ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | Proxmox: Node [{#NODE.NAME}]: Outgoing data, rate | Network usage. |
DEPENDENT | proxmox.node.netout[{#NODE.NAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | Proxmox: Node [{#NODE.NAME}]: Incoming data, rate | Network usage. |
DEPENDENT | proxmox.node.netin[{#NODE.NAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})]: Incoming data, rate | Incoming data rate. |
DEPENDENT | proxmox.qemu.netin[{#QEMU.ID}] Preprocessing: - JSONPATH: - CHANGEPERSECOND - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})]: Outgoing data, rate | Outgoing data rate. |
DEPENDENT | proxmox.qemu.netout[{#QEMU.ID}] Preprocessing: - JSONPATH: - CHANGEPERSECOND - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})]: Incoming data, rate | Incoming data rate. |
DEPENDENT | proxmox.lxc.netin[{#LXC.ID}] Preprocessing: - JSONPATH: - CHANGEPERSECOND - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})]: Outgoing data, rate | Outgoing data rate. |
DEPENDENT | proxmox.lxc.netout[{#LXC.ID}] Preprocessing: - JSONPATH: - CHANGEPERSECOND - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
Status | Proxmox: API service status | Get API service status. |
SCRIPT | proxmox.api.available Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: Expression: The text is too long. Please see the template. |
Status | Proxmox: Cluster [{#RESOURCE.NAME}]: Quorate | Indicates if there is a majority of nodes online to make decisions. |
DEPENDENT | proxmox.cluster.quorate[{#RESOURCE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Status | Proxmox: Node [{#NODE.NAME}]: Status | Indicates if the node is online or offline. |
DEPENDENT | proxmox.node.online[{#NODE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Status | Proxmox: Node [{#NODE.NAME}]: Uptime | System uptime in 'N days, hh:mm:ss' format. |
DEPENDENT | proxmox.node.uptime[{#NODE.NAME}] Preprocessing: - JSONPATH: |
Status | Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})]: Uptime | System uptime in 'N days, hh:mm:ss' format. |
DEPENDENT | proxmox.qemu.uptime[{#QEMU.ID}] Preprocessing: - JSONPATH: |
Status | Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})]: Status | - |
DEPENDENT | proxmox.qemu.vmstatus[{#QEMU.ID}] Preprocessing: - JSONPATH: |
Status | Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})]: Uptime | System uptime in 'N days, hh:mm:ss' format. |
DEPENDENT | proxmox.lxc.uptime[{#LXC.ID}] Preprocessing: - JSONPATH: |
Status | Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})]: Status | - |
DEPENDENT | proxmox.lxc.vmstatus[{#LXC.ID}] Preprocessing: - JSONPATH: |
Storage | Proxmox: Node [{#NODE.NAME}]: Root filesystem, used | Root filesystem usage. |
DEPENDENT | proxmox.node.rootused[{#NODE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Storage | Proxmox: Node [{#NODE.NAME}]: Root filesystem, total | Root filesystem total. |
DEPENDENT | proxmox.node.roottotal[{#NODE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Storage | Proxmox: Node [{#NODE.NAME}]: Swap filesystem, total | Swap total. |
DEPENDENT | proxmox.node.swaptotal[{#NODE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Storage | Proxmox: Node [{#NODE.NAME}]: Swap filesystem, used | Swap used. |
DEPENDENT | proxmox.node.swapused[{#NODE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Storage | Proxmox: Storage [{#NODE.NAME}/{#STORAGE.NAME}]: Type | More specific type, if available. |
DEPENDENT | proxmox.node.plugintype[{#NODE.NAME},{#STORAGE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Storage | Proxmox: Storage [{#NODE.NAME}/{#STORAGE.NAME}]: Size | Storage size in bytes. |
DEPENDENT | proxmox.node.maxdisk[{#NODE.NAME},{#STORAGE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Storage | Proxmox: Storage [{#NODE.NAME}/{#STORAGE.NAME}]: Content | Allowed storage content types. |
DEPENDENT | proxmox.node.content[{#NODE.NAME},{#STORAGE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Storage | Proxmox: Storage [{#NODE.NAME}/{#STORAGE.NAME}]: Used | Used disk space in bytes. |
DEPENDENT | proxmox.node.disk[{#NODE.NAME},{#STORAGE.NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Storage | Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})]: Disk write, rate | Disk write. |
DEPENDENT | proxmox.qemu.diskwrite[{#QEMU.ID}] Preprocessing: - JSONPATH: - CHANGEPERSECOND - DISCARDUNCHANGEDHEARTBEAT: |
Storage | Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})]: Disk read, rate | Disk read. |
DEPENDENT | proxmox.qemu.diskread[{#QEMU.ID}] Preprocessing: - JSONPATH: - CHANGEPERSECOND - DISCARDUNCHANGEDHEARTBEAT: |
Storage | Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})]: Disk write, rate | Disk write. |
DEPENDENT | proxmox.lxc.diskwrite[{#LXC.ID}] Preprocessing: - JSONPATH: - CHANGEPERSECOND - DISCARDUNCHANGEDHEARTBEAT: |
Storage | Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})]: Disk read, rate | Disk read. |
DEPENDENT | proxmox.lxc.diskread[{#LXC.ID}] Preprocessing: - JSONPATH: - CHANGEPERSECOND - DISCARDUNCHANGEDHEARTBEAT: |
Zabbix raw items | Proxmox: Get cluster resources | Resources index. |
HTTP_AGENT | proxmox.cluster.resources Preprocessing: - CHECKNOTSUPPORTED ⛔️ON_FAIL: |
Zabbix raw items | Proxmox: Get cluster status | Get cluster status information. |
HTTP_AGENT | proxmox.cluster.status Preprocessing: - CHECKNOTSUPPORTED ⛔️ON_FAIL: |
Zabbix raw items | Proxmox: Node [{#NODE.NAME}]: Status | Read node status. |
HTTP_AGENT | proxmox.node.status[{#NODE.NAME}] |
Zabbix raw items | Proxmox: Node [{#NODE.NAME}]: RRD statistics | Read node RRD statistics. |
HTTP_AGENT | proxmox.node.rrd[{#NODE.NAME}] Preprocessing: - JAVASCRIPT: |
Zabbix raw items | Proxmox: Node [{#NODE.NAME}]: Time | Read server time and time zone settings. |
HTTP_AGENT | proxmox.node.time[{#NODE.NAME}] |
Zabbix raw items | Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME}]: Status | Read VM status. |
HTTP_AGENT | proxmox.qemu.status[{#QEMU.ID}] |
Zabbix raw items | Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME}]: Status | Read LXC status. |
HTTP_AGENT | proxmox.lxc.status[{#LXC.ID}] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Proxmox: Node [{#NODE.NAME}] high CPU usage | CPU usage. |
min(/Proxmox VE by HTTP/proxmox.node.cpu[{#NODE.NAME}],5m) > {$PVE.CPU.PUSE.MAX.WARN:"{#NODE.NAME}"} |
WARNING | |
Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})] high CPU usage | CPU usage. |
min(/Proxmox VE by HTTP/proxmox.qemu.cpu[{#QEMU.ID}],5m) > {$PVE.VM.CPU.PUSE.MAX.WARN:"{#QEMU.ID}"} |
WARNING | |
Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})] high CPU usage | CPU usage. |
min(/Proxmox VE by HTTP/proxmox.lxc.cpu[{#LXC.ID}],5m) > {$PVE.LXC.CPU.PUSE.MAX.WARN:"{#LXC.ID}"} |
WARNING | |
Proxmox: Node [{#NODE.NAME}]: PVE manager has changed | The PVE manager version has changed. Ack to close. |
last(/Proxmox VE by HTTP/proxmox.node.pveversion[{#NODE.NAME}],#1)<>last(/Proxmox VE by HTTP/proxmox.node.pveversion[{#NODE.NAME}],#2) and length(last(/Proxmox VE by HTTP/proxmox.node.pveversion[{#NODE.NAME}]))>0 |
INFO | Manual close: YES |
Proxmox: Node [{#NODE.NAME}]: Kernel version has changed | The kernel version has changed. Ack to close. |
last(/Proxmox VE by HTTP/proxmox.node.kernelversion[{#NODE.NAME}],#1)<>last(/Proxmox VE by HTTP/proxmox.node.kernelversion[{#NODE.NAME}],#2) and length(last(/Proxmox VE by HTTP/proxmox.node.kernelversion[{#NODE.NAME}]))>0 |
INFO | Manual close: YES |
Proxmox: Node [{#NODE.NAME}] high memory usage | Memory usage. |
min(/Proxmox VE by HTTP/proxmox.node.memused[{#NODE.NAME}],5m) / last(/Proxmox VE by HTTP/proxmox.node.memtotal[{#NODE.NAME}]) * 100 >{$PVE.MEMORY.PUSE.MAX.WARN:"{#NODE.NAME}"} |
WARNING | |
Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})] high memory usage | Memory usage. |
min(/Proxmox VE by HTTP/proxmox.qemu.mem[{#QEMU.ID}],5m) / last(/Proxmox VE by HTTP/proxmox.qemu.maxmem[{#QEMU.ID}]) * 100 >{$PVE.VM.MEMORY.PUSE.MAX.WARN:"{#QEMU.ID}"} |
WARNING | |
Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})] high memory usage | Memory usage. |
min(/Proxmox VE by HTTP/proxmox.lxc.mem[{#LXC.ID}],5m) / last(/Proxmox VE by HTTP/proxmox.lxc.maxmem[{#LXC.ID}]) * 100 >{$PVE.LXC.MEMORY.PUSE.MAX.WARN:"{#LXC.ID}"} |
WARNING | |
Proxmox: API service not available | The API service is not available. Check your network and authorization settings. |
last(/Proxmox VE by HTTP/proxmox.api.available) <> 200 |
HIGH | |
Proxmox: Cluster [{#RESOURCE.NAME}] not quorum | Proxmox VE uses a quorum-based technique to provide a consistent state among all cluster nodes. |
last(/Proxmox VE by HTTP/proxmox.cluster.quorate[{#RESOURCE.NAME}]) <> 1 |
HIGH | |
Proxmox: Node [{#NODE.NAME}] offline | Node offline. |
last(/Proxmox VE by HTTP/proxmox.node.online[{#NODE.NAME}]) <> 1 |
HIGH | |
Proxmox: Node [{#NODE.NAME}]: has been restarted | Uptime is less than 10 minutes. |
last(/Proxmox VE by HTTP/proxmox.node.uptime[{#NODE.NAME}])<10m |
INFO | Manual close: YES Depends on: - Proxmox: Node [{#NODE.NAME}] offline |
Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME}]: has been restarted | Uptime is less than 10 minutes. |
last(/Proxmox VE by HTTP/proxmox.qemu.uptime[{#QEMU.ID}])<10m |
INFO | Manual close: YES Depends on: - Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})]: Not running |
Proxmox: VM [{#NODE.NAME}/{#QEMU.NAME} ({#QEMU.ID})]: Not running | VM state is not "running". |
last(/Proxmox VE by HTTP/proxmox.qemu.vmstatus[{#QEMU.ID}])<>"running" |
AVERAGE | |
Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME}]: has been restarted | Uptime is less than 10 minutes. |
last(/Proxmox VE by HTTP/proxmox.lxc.uptime[{#LXC.ID}])<10m |
INFO | Manual close: YES Depends on: - Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})]: Not running |
Proxmox: LXC [{#NODE.NAME}/{#LXC.NAME} ({#LXC.ID})]: Not running | LXC state is not "running". |
last(/Proxmox VE by HTTP/proxmox.lxc.vmstatus[{#LXC.ID}])<>"running" |
AVERAGE | |
Proxmox: Node [{#NODE.NAME}] high root filesystem space usage | Root filesystem space usage. |
min(/Proxmox VE by HTTP/proxmox.node.rootused[{#NODE.NAME}],5m) / last(/Proxmox VE by HTTP/proxmox.node.roottotal[{#NODE.NAME}]) * 100 >{$PVE.ROOT.PUSE.MAX.WARN:"{#NODE.NAME}"} |
WARNING | |
Proxmox: Node [{#NODE.NAME}] high swap space usage | This trigger is ignored if there is no swap configured. |
min(/Proxmox VE by HTTP/proxmox.node.swapused[{#NODE.NAME}],5m) / last(/Proxmox VE by HTTP/proxmox.node.swaptotal[{#NODE.NAME}]) * 100 > {$PVE.SWAP.PUSE.MAX.WARN:"{#NODE.NAME}"} and last(/Proxmox VE by HTTP/proxmox.node.swaptotal[{#NODE.NAME}]) > 0 |
WARNING | |
Proxmox: Storage [{#NODE.NAME}/{#STORAGE.NAME}] high filesystem space usage | Storage space usage. |
min(/Proxmox VE by HTTP/proxmox.node.disk[{#NODE.NAME},{#STORAGE.NAME}],5m) / last(/Proxmox VE by HTTP/proxmox.node.maxdisk[{#NODE.NAME},{#STORAGE.NAME}]) * 100 >{$PVE.STORAGE.PUSE.MAX.WARN:"{#NODE.NAME}/{#STORAGE.NAME}"} |
WARNING |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Services | POP service is running | - |
SIMPLE | net.tcp.service[pop] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
POP service is down on {HOST.NAME} | - |
max(/POP Service/net.tcp.service[pop],#3)=0 |
AVERAGE |
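The net.tcp.service[pop] simple check is executed by the Zabbix server or proxy: it opens a TCP connection to port 110 and expects a +OK POP3 greeting. You can reproduce the check by hand (the hostname is a placeholder, and the exact greeting text varies by server):
$ nc mail.example.com 110
+OK POP3 server ready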
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
The template to monitor PHP-FPM by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template PHP-FPM by HTTP collects metrics by polling the PHP-FPM status page remotely with the HTTP agent.
Note that this solution supports HTTPS and redirects.
This template was tested on:
See Zabbix template operation for basic instructions.
Open the php-fpm configuration file and enable the status page as shown:
pm.status_path = /status
ping.path = /ping
Check the syntax:
$ php-fpm7 -t
Reload the php-fpm service to make the change active:
$ systemctl reload php-fpm
# Enable php-fpm status page
location ~ ^/(status|ping)$ {
## disable access logging for request if you prefer
access_log off;
## Only allow trusted IPs for security, deny everyone else
# allow 127.0.0.1;
# allow 1.2.3.4; # your IP here
# deny all;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
fastcgi_index index.php;
include fastcgi_params;
## Now the port or socket of the php-fpm pool we want the status of
fastcgi_pass 127.0.0.1:9000;
# fastcgi_pass unix:/run/php-fpm/your_socket.sock;
}
Check the syntax:
$ nginx -t
Reload Nginx:
$ systemctl reload nginx
Verify:
curl -L 127.0.0.1/status
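The HTTP template parses the status page as JSON, so it is worth checking the JSON variant as well. The output below is only an illustration; your pool name and counter values will differ:
$ curl -L '127.0.0.1/status?json'
{"pool":"www","process manager":"dynamic","idle processes":3,"active processes":1,"total processes":4,"listen queue":0,"listen queue len":128,"max children reached":0,"slow requests":0}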
If you use another location of the status/ping page, don't forget to change the {$PHP_FPM.STATUS.PAGE}/{$PHP_FPM.PING.PAGE} macros.
If you use an atypical location for the PHP-FPM status page, don't forget to change the macros {$PHP_FPM.SCHEME} and {$PHP_FPM.PORT}.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$PHP_FPM.HOST} | Hostname or IP of PHP-FPM status host or container. |
localhost |
{$PHP_FPM.PING.PAGE} | The path of PHP-FPM ping page. |
ping |
{$PHP_FPM.PING.REPLY} | Expected reply to the ping. |
pong |
{$PHP_FPM.PORT} | The port of PHP-FPM status host or container. |
80 |
{$PHP_FPM.QUEUE.WARN.MAX} | The maximum PHP-FPM queue usage percent for trigger expression. |
80 |
{$PHP_FPM.SCHEME} | Request scheme which may be http or https |
http |
{$PHP_FPM.STATUS.PAGE} | The path of PHP-FPM status page. |
status |
There are no template links in this template.
Group | Name | Description | Type | Key and additional info | |
---|---|---|---|---|---|
PHP-FPM | PHP-FPM: Ping | - |
DEPENDENT | php-fpm.ping Preprocessing: - REGEX: `{$PHP_FPM.PING.REPLY}($|\r?\n)` -> 1 ⛔️ON_FAIL: CUSTOM_VALUE -> 0 |
PHP-FPM | PHP-FPM: Processes, active | The total number of active processes. |
DEPENDENT | php-fpm.processes_active Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Version | The current version of PHP, taken from the HTTP header "X-Powered-By"; it may not work if you have changed the default HTTP headers. |
DEPENDENT | php-fpm.version Preprocessing: - REGEX: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
|
PHP-FPM | PHP-FPM: Pool name | The name of current pool. |
DEPENDENT | php-fpm.name Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
|
PHP-FPM | PHP-FPM: Uptime | How long this pool has been running. |
DEPENDENT | php-fpm.uptime Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Start time | The time when this pool was started. |
DEPENDENT | php-fpm.start_time Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Processes, total | The total number of server processes currently running. |
DEPENDENT | php-fpm.processes_total Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Processes, idle | The total number of idle processes. |
DEPENDENT | php-fpm.processes_idle Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Process manager | The method used by the process manager to control the number of child processes for this pool. |
DEPENDENT | php-fpm.process_manager Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
|
PHP-FPM | PHP-FPM: Processes, max active | The highest value that 'active processes' has reached since the php-fpm server started. |
DEPENDENT | php-fpm.processes_max_active Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Accepted connections per second | The number of accepted requests per second. |
DEPENDENT | php-fpm.conn_accepted.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
|
PHP-FPM | PHP-FPM: Slow requests | The number of requests that exceeded your request_slowlog_timeout value. |
DEPENDENT | php-fpm.slow_requests Preprocessing: - JSONPATH: - SIMPLE_CHANGE |
|
PHP-FPM | PHP-FPM: Listen queue | The current number of connections that have been initiated, but not yet accepted. |
DEPENDENT | php-fpm.listen_queue Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Listen queue, max | The maximum number of requests in the queue of pending connections since this FPM pool has started. |
DEPENDENT | php-fpm.listen_queue_max Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Listen queue, len | Size of the socket queue of pending connections. |
DEPENDENT | php-fpm.listen_queue_len Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Queue usage | The utilization of the queue, in percent. The +(last(//php-fpm.listen_queue_len)=0) term in the expression guards against division by zero when the reported queue length is zero. |
CALCULATED | php-fpm.listen_queue_usage Expression: last(//php-fpm.listen_queue)/(last(//php-fpm.listen_queue_len)+(last(//php-fpm.listen_queue_len)=0))*100 |
|
PHP-FPM | PHP-FPM: Max children reached | The number of times that pm.max_children has been reached since the php-fpm pool started. |
DEPENDENT | php-fpm.max_children Preprocessing: - JSONPATH: - SIMPLE_CHANGE |
|
Zabbix raw items | PHP-FPM: Get ping page | - |
HTTP_AGENT | php-fpm.get_ping | |
Zabbix raw items | PHP-FPM: Get status page | - |
HTTP_AGENT | php-fpm.get_status |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
PHP-FPM: Service is down | - |
last(/PHP-FPM by HTTP/php-fpm.ping)=0 or nodata(/PHP-FPM by HTTP/php-fpm.ping,3m)=1 |
HIGH | Manual close: YES |
PHP-FPM: Version has changed | PHP-FPM version has changed. Ack to close. |
last(/PHP-FPM by HTTP/php-fpm.version,#1)<>last(/PHP-FPM by HTTP/php-fpm.version,#2) and length(last(/PHP-FPM by HTTP/php-fpm.version))>0 |
INFO | Manual close: YES |
PHP-FPM: Failed to fetch info data | Zabbix has not received data for items for the last 30 minutes |
nodata(/PHP-FPM by HTTP/php-fpm.uptime,30m)=1 |
INFO | Manual close: YES Depends on: - PHP-FPM: Service is down |
PHP-FPM: Pool has been restarted | Uptime is less than 10 minutes. |
last(/PHP-FPM by HTTP/php-fpm.uptime)<10m |
INFO | Manual close: YES |
PHP-FPM: Manager changed | PHP-FPM manager changed. Ack to close. |
last(/PHP-FPM by HTTP/php-fpm.process_manager,#1)<>last(/PHP-FPM by HTTP/php-fpm.process_manager,#2) |
INFO | Manual close: YES |
PHP-FPM: Detected slow requests | PHP-FPM has detected slow requests. A slow request is one that took more time to execute than expected (defined in the configuration of your pool). |
min(/PHP-FPM by HTTP/php-fpm.slow_requests,#3)>0 |
WARNING | |
PHP-FPM: Queue utilization is high | The queue for this pool reached {$PHP_FPM.QUEUE.WARN.MAX}% of its maximum capacity. Items in queue represent the current number of connections that have been initiated on this pool, but not yet accepted. |
min(/PHP-FPM by HTTP/php-fpm.listen_queue_usage,15m) > {$PHP_FPM.QUEUE.WARN.MAX} |
WARNING |
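The "Service is down" trigger relies on the ping page returning the expected reply ({$PHP_FPM.PING.REPLY}, pong by default), which you can confirm by hand:
$ curl -L 127.0.0.1/ping
pong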
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher. This template is developed to monitor the FastCGI Process Manager (PHP-FPM) by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template PHP-FPM by Zabbix agent collects metrics by polling the PHP-FPM status page locally with Zabbix agent.
Note that this template doesn't support HTTPS and redirects (limitations of web.page.get).
It also uses Zabbix agent to collect PHP-FPM Linux process statistics, such as CPU usage, memory usage, and whether the process is running or not.
This template was tested on:
See Zabbix template operation for basic instructions.
Open the php-fpm configuration file and enable the status page as shown.
pm.status_path = /status
ping.path = /ping
$ php-fpm7 -t
Reload the php-fpm service to make the change active.
$ systemctl reload php-fpm
# Enable php-fpm status page
location ~ ^/(status|ping)$ {
## disable access logging for request if you prefer
access_log off;
## Only allow trusted IPs for security, deny everyone else
# allow 127.0.0.1;
# allow 1.2.3.4; # your IP here
# deny all;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
fastcgi_index index.php;
include fastcgi_params;
## Now the port or socket of the php-fpm pool we want the status of
fastcgi_pass 127.0.0.1:9000;
# fastcgi_pass unix:/run/php-fpm/your_socket.sock;
}
Check the syntax again.
$ nginx -t
Reload Nginx server.
$ systemctl reload nginx
Verify it with this command line.
curl -L 127.0.0.1/status
If you use another location of the status/ping page, don't forget to change the {$PHP_FPM.STATUS.PAGE}/{$PHP_FPM.PING.PAGE} macros.
If you use an atypical location for the PHP-FPM status page, don't forget to change the {$PHP_FPM.PORT} macro.
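Because this template uses Zabbix agent items, both raw checks can be tried by hand with zabbix_get against the monitored host (127.0.0.1 below stands in for the agent address):
$ zabbix_get -s 127.0.0.1 -k 'web.page.get["localhost","ping","80"]'
$ zabbix_get -s 127.0.0.1 -k 'proc.get[,,,summary]'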
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$PHP_FPM.HOST} | The hostname or IP address of the PHP-FPM status host or container. |
localhost |
{$PHP_FPM.PING.PAGE} | The path of the PHP-FPM ping page. |
ping |
{$PHP_FPM.PING.REPLY} | The expected reply to the ping. |
pong |
{$PHP_FPM.PORT} | The port of the PHP-FPM status host or container. |
80 |
{$PHP_FPM.PROCESS_NAME} | The name of the PHP-FPM process. |
php-fpm |
{$PHP_FPM.QUEUE.WARN.MAX} | The maximum percent of the PHP-FPM queue usage for a trigger expression. |
80 |
{$PHP_FPM.STATUS.PAGE} | The path of the PHP-FPM status page. |
status |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
PHP-FPM process discovery | The discovery of the PHP-FPM summary processes. |
DEPENDENT | php-fpm.proc.discovery Filter: AND- {#NAME} MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info | |
---|---|---|---|---|---|
PHP-FPM | PHP-FPM: Get processes summary | The aggregated data of summary metrics for all processes. |
ZABBIX_PASSIVE | proc.get[,,,summary] | |
PHP-FPM | PHP-FPM: Ping | - |
DEPENDENT | php-fpm.ping Preprocessing: - REGEX: `{$PHP_FPM.PING.REPLY}($|\r?\n)` -> 1 ⛔️ON_FAIL: CUSTOM_VALUE -> 0 |
PHP-FPM | PHP-FPM: Processes, active | The total number of active processes. |
DEPENDENT | php-fpm.processes_active Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Version | The current version of PHP. You can get it from the HTTP header "X-Powered-By"; it may not work if you have changed the default HTTP headers. |
DEPENDENT | php-fpm.version Preprocessing: - REGEX: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
|
PHP-FPM | PHP-FPM: Pool name | The name of the current pool. |
DEPENDENT | php-fpm.name Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
|
PHP-FPM | PHP-FPM: Uptime | It indicates how long this pool has been running. |
DEPENDENT | php-fpm.uptime Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Start time | The time when this pool was started. |
DEPENDENT | php-fpm.start_time Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Processes, total | The total number of server processes running currently. |
DEPENDENT | php-fpm.processes_total Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Processes, idle | The total number of idle processes. |
DEPENDENT | php-fpm.processes_idle Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Queue usage | The utilization of the queue. |
CALCULATED | php-fpm.listen_queue_usage Expression: last(//php-fpm.listen_queue)/(last(//php-fpm.listen_queue_len)+(last(//php-fpm.listen_queue_len)=0))*100 |
|
PHP-FPM | PHP-FPM: Process manager | The method used by the process manager to control the number of child processes for this pool. |
DEPENDENT | php-fpm.process_manager Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
|
PHP-FPM | PHP-FPM: Processes, max active | The highest value of "active processes" since the PHP-FPM server was started. |
DEPENDENT | php-fpm.processes_max_active Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Accepted connections per second | The number of accepted requests per second. |
DEPENDENT | php-fpm.conn_accepted.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
|
PHP-FPM | PHP-FPM: Slow requests | The number of requests that have exceeded your request_slowlog_timeout value. |
DEPENDENT | php-fpm.slow_requests Preprocessing: - JSONPATH: - SIMPLE_CHANGE |
|
PHP-FPM | PHP-FPM: Listen queue | The current number of connections that have been initiated but not yet accepted. |
DEPENDENT | php-fpm.listen_queue Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Listen queue, max | The maximum number of requests in the queue of pending connections since this FPM pool was started. |
DEPENDENT | php-fpm.listen_queue_max Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Listen queue, len | The size of the socket queue of pending connections. |
DEPENDENT | php-fpm.listen_queue_len Preprocessing: - JSONPATH: |
|
PHP-FPM | PHP-FPM: Max children reached | The number of times that pm.max_children has been reached since the PHP-FPM pool was started. |
DEPENDENT | php-fpm.max_children Preprocessing: - JSONPATH: - SIMPLE_CHANGE |
|
PHP-FPM | PHP-FPM: Get process data | The summary metrics aggregated by a process |
DEPENDENT | php-fpm.proc.get[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
|
PHP-FPM | PHP-FPM: Memory usage (rss) | The summary of resident set size memory used by a process |
DEPENDENT | php-fpm.proc.rss[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
|
PHP-FPM | PHP-FPM: Memory usage (vsize) | The summary of virtual memory used by a process |
DEPENDENT | php-fpm.proc.vmem[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
|
PHP-FPM | PHP-FPM: Memory usage, % | The percentage of real memory used by a process |
DEPENDENT | php-fpm.proc.pmem[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
|
PHP-FPM | PHP-FPM: Number of running processes | The number of running processes |
DEPENDENT | php-fpm.proc.num[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
|
PHP-FPM | PHP-FPM: CPU utilization | The percentage of the CPU utilization by a process |
ZABBIX_PASSIVE | proc.cpu.util[{#NAME}] | |
Zabbix raw items | PHP-FPM: Get ping page | - |
ZABBIX_PASSIVE | web.page.get["{$PHPFPM.HOST}","{$PHPFPM.PING.PAGE}","{$PHP_FPM.PORT}"] | |
Zabbix raw items | PHP-FPM: Get status page | - |
ZABBIX_PASSIVE | web.page.get["{$PHPFPM.HOST}","{$PHPFPM.STATUS.PAGE}?json","{$PHP_FPM.PORT}"] Preprocessing: - REGEX: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
PHP-FPM: Version has changed | The PHP-FPM version has changed. Acknowledge (Ack) to close manually. |
last(/PHP-FPM by Zabbix agent/php-fpm.version,#1)<>last(/PHP-FPM by Zabbix agent/php-fpm.version,#2) and length(last(/PHP-FPM by Zabbix agent/php-fpm.version))>0 |
INFO | Manual close: YES |
PHP-FPM: Pool has been restarted | Uptime is less than 10 minutes. |
last(/PHP-FPM by Zabbix agent/php-fpm.uptime)<10m |
INFO | Manual close: YES |
PHP-FPM: Queue utilization is high | The queue for this pool has reached {$PHP_FPM.QUEUE.WARN.MAX}% of its maximum capacity. Items in the queue represent the current number of connections that have been initiated on this pool but not yet accepted. |
min(/PHP-FPM by Zabbix agent/php-fpm.listen_queue_usage,15m) > {$PHP_FPM.QUEUE.WARN.MAX} |
WARNING | |
PHP-FPM: Manager changed | The PHP-FPM manager has changed. |
last(/PHP-FPM by Zabbix agent/php-fpm.process_manager,#1)<>last(/PHP-FPM by Zabbix agent/php-fpm.process_manager,#2) |
INFO | Manual close: YES |
PHP-FPM: Detected slow requests | The PHP-FPM has detected a slow request. The slow request means that it took more time to execute than expected (defined in the configuration of your pool). |
min(/PHP-FPM by Zabbix agent/php-fpm.slow_requests,#3)>0 |
WARNING | |
PHP-FPM: Process is not running | - |
last(/PHP-FPM by Zabbix agent/php-fpm.proc.num[{#NAME}])=0 |
HIGH | |
PHP-FPM: Failed to fetch info data | Zabbix has not received any data for items for the last 30 minutes. |
nodata(/PHP-FPM by Zabbix agent/php-fpm.uptime,30m)=1 and last(/PHP-FPM by Zabbix agent/php-fpm.proc.num[{#NAME}])>0 |
INFO | Manual close: YES |
PHP-FPM: Service is down | - |
(last(/PHP-FPM by Zabbix agent/php-fpm.ping)=0 or nodata(/PHP-FPM by Zabbix agent/php-fpm.ping,3m)=1) and last(/PHP-FPM by Zabbix agent/php-fpm.proc.num[{#NAME}])>0 |
HIGH | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com.
For Zabbix version: 6.2 and higher. Template for monitoring pfSense by SNMP
This template was tested on:
See Zabbix template operation for basic instructions.
No specific Zabbix configuration is required.
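Make sure the SNMP service is enabled on pfSense (Services - SNMP) with the PF module active, since most items come from BEGEMOT-PF-MIB. Reachability can then be verified with snmpwalk; the community string and address below are placeholders, and 1.3.6.1.4.1.12325.1.200 should correspond to the BEGEMOT-PF-MIB pf subtree:
$ snmpwalk -v2c -c public 192.0.2.1 IF-MIB::ifDescr
$ snmpwalk -v2c -c public 192.0.2.1 1.3.6.1.4.1.12325.1.200.1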
Name | Description | Default |
---|---|---|
{$IF.ERRORS.WARN} | Threshold of the error packet rate for the warning trigger. Can be used with the interface name as context. |
2 |
{$IF.UTIL.MAX} | Threshold of interface bandwidth utilization for the warning trigger, in %. Can be used with the interface name as context. |
90 |
{$IFCONTROL} | Macro for the operational state of the interface, used by the link down trigger. Can be used with the interface name as context. |
1 |
{$NET.IF.IFADMINSTATUS.MATCHES} | This macro is used in filters of the network interfaces discovery rule. |
^.* |
{$NET.IF.IFADMINSTATUS.NOT_MATCHES} | Ignore down(2) administrative status. |
^2$ |
{$NET.IF.IFALIAS.MATCHES} | This macro is used in filters of the network interfaces discovery rule. |
.* |
{$NET.IF.IFALIAS.NOT_MATCHES} | This macro is used in filters of the network interfaces discovery rule. |
CHANGE_IF_NEEDED |
{$NET.IF.IFDESCR.MATCHES} | This macro is used in filters of the network interfaces discovery rule. |
.* |
{$NET.IF.IFDESCR.NOT_MATCHES} | This macro is used in filters of the network interfaces discovery rule. |
CHANGE_IF_NEEDED |
{$NET.IF.IFNAME.NOT_MATCHES} | This macro is used in filters of the network interfaces discovery rule. |
`(^pflog[0-9.]*$|^pfsync[0-9.]*$)` |
{$NET.IF.IFOPERSTATUS.MATCHES} | This macro is used in filters of the network interfaces discovery rule. |
^.*$ |
{$NET.IF.IFOPERSTATUS.NOT_MATCHES} | Ignore notPresent(6). |
^6$ |
{$NET.IF.IFTYPE.MATCHES} | This macro is used in filters of the network interfaces discovery rule. |
.* |
{$NET.IF.IFTYPE.NOT_MATCHES} | This macro is used in filters of the network interfaces discovery rule. |
CHANGE_IF_NEEDED |
{$SNMP.TIMEOUT} | The time interval for the SNMP availability trigger. |
5m |
{$SOURCE.TRACKING.TABLE.UTIL.MAX} | Threshold of the source tracking table utilization trigger, in %. |
90 |
{$STATE.TABLE.UTIL.MAX} | Threshold of the state table utilization trigger, in %. |
90 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Network interfaces discovery | Discovering interfaces from IF-MIB. |
SNMP | pfsense.net.if.discovery Filter: AND- {#IFADMINSTATUS} MATCHES_REGEX - {#IFADMINSTATUS} NOT_MATCHES_REGEX - {#IFOPERSTATUS} MATCHES_REGEX - {#IFOPERSTATUS} NOT_MATCHES_REGEX - {#IFNAME} MATCHES_REGEX - {#IFNAME} NOT_MATCHES_REGEX - {#IFDESCR} MATCHES_REGEX - {#IFDESCR} NOT_MATCHES_REGEX - {#IFALIAS} MATCHES_REGEX - {#IFALIAS} NOT_MATCHES_REGEX - {#IFTYPE} MATCHES_REGEX - {#IFTYPE} NOT_MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Inbound packets discarded | MIB: IF-MIB The number of inbound packets which were chosen to be discarded even though no errors had been detected to prevent their being deliverable to a higher-layer protocol. One possible reason for discarding such a packet could be to free up buffer space. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.in.discards[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND: `` |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Inbound packets with errors | MIB: IF-MIB For packet-oriented interfaces, the number of inbound packets that contained errors preventing them from being deliverable to a higher-layer protocol. For character-oriented or fixed-length interfaces, the number of inbound transmission units that contained errors preventing them from being deliverable to a higher-layer protocol. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.in.errors[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND: `` |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Bits received | MIB: IF-MIB The total number of octets received on the interface, including framing characters. This object is a 64-bit version of ifInOctets. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.in[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND: ` |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Outbound packets discarded | MIB: IF-MIB The number of outbound packets which were chosen to be discarded even though no errors had been detected to prevent their being deliverable to a higher-layer protocol. One possible reason for discarding such a packet could be to free up buffer space. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.out.discards[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND: `` |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Outbound packets with errors | MIB: IF-MIB For packet-oriented interfaces, the number of outbound packets that contained errors preventing them from being deliverable to a higher-layer protocol. For character-oriented or fixed-length interfaces, the number of outbound transmission units that contained errors preventing them from being deliverable to a higher-layer protocol. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.out.errors[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND: `` |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Bits sent | MIB: IF-MIB The total number of octets transmitted out of the interface, including framing characters. This object is a 64-bit version of ifOutOctets. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.out[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND: ` |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Speed | MIB: IF-MIB An estimate of the interface's current bandwidth in units of 1,000,000 bits per second. If this object reports a value of `n', then the speed of the interface is somewhere in the range of `n-500,000' to `n+499,999'. |
SNMP | net.if.speed[{#SNMPINDEX}] Preprocessing: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Operational status | MIB: IF-MIB The current operational state of the interface. - The testing(3) state indicates that no operational packets can be passed - If ifAdminStatus is down(2) then ifOperStatus should be down(2) - If ifAdminStatus is changed to up(1) then ifOperStatus should change to up(1) if the interface is ready to transmit and receive network traffic - It should change to dormant(5) if the interface is waiting for external actions (such as a serial line waiting for an incoming connection) - It should remain in the down(2) state if and only if there is a fault that prevents it from going to the up(1) state - It should remain in the notPresent(6) state if the interface has missing (typically, hardware) components. |
SNMP | net.if.status[{#SNMPINDEX}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Interface type | MIB: IF-MIB The type of interface. Additional values for ifType are assigned by the Internet Assigned Numbers Authority (IANA), through updating the syntax of the IANAifType textual convention. |
SNMP | net.if.type[{#SNMPINDEX}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Rules references count | MIB: BEGEMOT-PF-MIB The number of rules referencing this interface. |
SNMP | net.if.rules.refs[{#SNMPINDEX}] |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv4 traffic passed | MIB: BEGEMOT-PF-MIB IPv4 bits per second passed coming in on this interface. |
SNMP | net.if.in.pass.v4.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv4 traffic blocked | MIB: BEGEMOT-PF-MIB IPv4 bits per second blocked coming in on this interface. |
SNMP | net.if.in.block.v4.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv4 traffic passed | MIB: BEGEMOT-PF-MIB IPv4 bits per second passed going out on this interface. |
SNMP | net.if.out.pass.v4.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv4 traffic blocked | MIB: BEGEMOT-PF-MIB IPv4 bits per second blocked going out on this interface. |
SNMP | net.if.out.block.v4.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv4 packets passed | MIB: BEGEMOT-PF-MIB The number of IPv4 packets passed coming in on this interface. |
SNMP | net.if.in.pass.v4.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv4 packets blocked | MIB: BEGEMOT-PF-MIB The number of IPv4 packets blocked coming in on this interface. |
SNMP | net.if.in.block.v4.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv4 packets passed | MIB: BEGEMOT-PF-MIB The number of IPv4 packets passed going out on this interface. |
SNMP | net.if.out.pass.v4.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv4 packets blocked | MIB: BEGEMOT-PF-MIB The number of IPv4 packets blocked going out on this interface. |
SNMP | net.if.out.block.v4.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv6 traffic passed | MIB: BEGEMOT-PF-MIB IPv6 bits per second passed coming in on this interface. |
SNMP | net.if.in.pass.v6.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv6 traffic blocked | MIB: BEGEMOT-PF-MIB IPv6 bits per second blocked coming in on this interface. |
SNMP | net.if.in.block.v6.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv6 traffic passed | MIB: BEGEMOT-PF-MIB IPv6 bits per second passed going out on this interface. |
SNMP | net.if.out.pass.v6.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv6 traffic blocked | MIB: BEGEMOT-PF-MIB IPv6 bits per second blocked going out on this interface. |
SNMP | net.if.out.block.v6.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv6 packets passed | MIB: BEGEMOT-PF-MIB The number of IPv6 packets passed coming in on this interface. |
SNMP | net.if.in.pass.v6.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv6 packets blocked | MIB: BEGEMOT-PF-MIB The number of IPv6 packets blocked coming in on this interface. |
SNMP | net.if.in.block.v6.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv6 packets passed | MIB: BEGEMOT-PF-MIB The number of IPv6 packets passed going out on this interface. |
SNMP | net.if.out.pass.v6.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | PFSense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv6 packets blocked | MIB: BEGEMOT-PF-MIB The number of IPv6 packets blocked going out on this interface. |
SNMP | net.if.out.block.v6.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
pfSense | PFSense: Packet filter running status | MIB: BEGEMOT-PF-MIB True if packet filter is currently enabled. |
SNMP | pfsense.pf.status |
pfSense | PFSense: States table current | MIB: BEGEMOT-PF-MIB Number of entries in the state table. |
SNMP | pfsense.state.table.count |
pfSense | PFSense: States table limit | MIB: BEGEMOT-PF-MIB Maximum number of 'keep state' rules in the ruleset. |
SNMP | pfsense.state.table.limit |
pfSense | PFSense: States table utilization in % | Utilization of state table in %. |
CALCULATED | pfsense.state.table.pused Expression: last(//pfsense.state.table.count) * 100 / last(//pfsense.state.table.limit) |
pfSense | PFSense: Source tracking table current | MIB: BEGEMOT-PF-MIB Number of entries in the source tracking table. |
SNMP | pfsense.source.tracking.table.count |
pfSense | PFSense: Source tracking table limit | MIB: BEGEMOT-PF-MIB Maximum number of 'sticky-address' or 'source-track' rules in the ruleset. |
SNMP | pfsense.source.tracking.table.limit |
pfSense | PFSense: Source tracking table utilization in % | Utilization of source tracking table in %. |
CALCULATED | pfsense.source.tracking.table.pused Expression: last(//pfsense.source.tracking.table.count) * 100 / last(//pfsense.source.tracking.table.limit) |
pfSense | PFSense: DHCP server status | MIB: HOST-RESOURCES-MIB The status of DHCP server process. |
SNMP | pfsense.dhcpd.status Preprocessing: - CHECKNOTSUPPORTED: ` |
pfSense | PFSense: DNS server status | MIB: HOST-RESOURCES-MIB The status of DNS server process. |
SNMP | pfsense.dns.status Preprocessing: - CHECKNOTSUPPORTED: ` |
pfSense | PFSense: State of nginx process | MIB: HOST-RESOURCES-MIB The status of nginx process. |
SNMP | pfsense.nginx.status Preprocessing: - CHECKNOTSUPPORTED: ` |
pfSense | PFSense: Packets matched a filter rule | MIB: BEGEMOT-PF-MIB True if the packet was logged with the specified packet filter reason code. The known codes are: match, bad-offset, fragment, short, normalize, and memory. |
SNMP | pfsense.packets.match Preprocessing: - CHANGEPERSECOND |
pfSense | PFSense: Packets with bad offset | MIB: BEGEMOT-PF-MIB True if the packet was logged with the specified packet filter reason code. The known codes are: match, bad-offset, fragment, short, normalize, and memory. |
SNMP | pfsense.packets.bad.offset Preprocessing: - CHANGEPERSECOND |
pfSense | PFSense: Fragmented packets | MIB: BEGEMOT-PF-MIB True if the packet was logged with the specified packet filter reason code. The known codes are: match, bad-offset, fragment, short, normalize, and memory. |
SNMP | pfsense.packets.fragment Preprocessing: - CHANGEPERSECOND |
pfSense | PFSense: Short packets | MIB: BEGEMOT-PF-MIB True if the packet was logged with the specified packet filter reason code. The known codes are: match, bad-offset, fragment, short, normalize, and memory. |
SNMP | pfsense.packets.short Preprocessing: - CHANGEPERSECOND |
pfSense | PFSense: Normalized packets | MIB: BEGEMOT-PF-MIB True if the packet was logged with the specified packet filter reason code. The known codes are: match, bad-offset, fragment, short, normalize, and memory. |
SNMP | pfsense.packets.normalize Preprocessing: - CHANGEPERSECOND |
pfSense | PFSense: Packets dropped due to memory limitation | MIB: BEGEMOT-PF-MIB True if the packet was logged with the specified packet filter reason code. The known codes are: match, bad-offset, fragment, short, normalize, and memory. |
SNMP | pfsense.packets.mem.drop Preprocessing: - CHANGEPERSECOND |
pfSense | PFSense: Firewall rules count | MIB: BEGEMOT-PF-MIB The number of labeled filter rules on this system. |
SNMP | pfsense.rules.count |
Status | PFSense: SNMP agent availability | Availability of SNMP checks on the host. The value of this item corresponds to the availability icons in the host list. Possible values: 0 - not available, 1 - available, 2 - unknown. |
INTERNAL | zabbix[host,snmp,available] |
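Many of the SNMP counters above are normalized with the CHANGE_PER_SECOND preprocessing step, and the two utilization items are CALCULATED from their count/limit pairs. A minimal sketch of both computations in Python, with hypothetical sample values (Zabbix performs these server-side):

```python
# A sketch of the two derived values used above (hypothetical sample data).

def change_per_second(prev_value, prev_ts, cur_value, cur_ts):
    """CHANGE_PER_SECOND preprocessing: counter delta divided by elapsed seconds."""
    return (cur_value - prev_value) / (cur_ts - prev_ts)

def table_pused(count, limit):
    """CALCULATED item such as pfsense.state.table.pused: count * 100 / limit."""
    return count * 100 / limit

# e.g. pfsense.packets.match sampled 60 s apart (hypothetical raw counter values)
print(change_per_second(120_000, 0, 126_000, 60))  # -> 100.0 packets per second

# e.g. 40 000 entries in a state table limited to 100 000
print(table_pused(40_000, 100_000))                # -> 40.0 (%)
```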
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
PFSense: Interface [{#IFNAME}({#IFALIAS})]: High input error rate | Recovers when below 80% of {$IF.ERRORS.WARN:"{#IFNAME}"} threshold. |
min(/PFSense by SNMP/net.if.in.errors[{#SNMPINDEX}],5m)>{$IF.ERRORS.WARN:"{#IFNAME}"} Recovery expression: max(/PFSense by SNMP/net.if.in.errors[{#SNMPINDEX}],5m)<{$IF.ERRORS.WARN:"{#IFNAME}"}*0.8 |
WARNING | Depends on: - PFSense: Interface [{#IFNAME}({#IFALIAS})]: Link down |
PFSense: Interface [{#IFNAME}({#IFALIAS})]: High inbound bandwidth usage | The network interface utilization is close to its estimated maximum bandwidth. |
(avg(/PFSense by SNMP/net.if.in[{#SNMPINDEX}],15m)>({$IF.UTIL.MAX:"{#IFNAME}"}/100)*last(/PFSense by SNMP/net.if.speed[{#SNMPINDEX}])) and last(/PFSense by SNMP/net.if.speed[{#SNMPINDEX}])>0 Recovery expression: avg(/PFSense by SNMP/net.if.in[{#SNMPINDEX}],15m)<(({$IF.UTIL.MAX:"{#IFNAME}"}-3)/100)*last(/PFSense by SNMP/net.if.speed[{#SNMPINDEX}]) |
WARNING | Depends on: - PFSense: Interface [{#IFNAME}({#IFALIAS})]: Link down |
PFSense: Interface [{#IFNAME}({#IFALIAS})]: High output error rate | Recovers when below 80% of {$IF.ERRORS.WARN:"{#IFNAME}"} threshold. |
min(/PFSense by SNMP/net.if.out.errors[{#SNMPINDEX}],5m)>{$IF.ERRORS.WARN:"{#IFNAME}"} Recovery expression: max(/PFSense by SNMP/net.if.out.errors[{#SNMPINDEX}],5m)<{$IF.ERRORS.WARN:"{#IFNAME}"}*0.8 |
WARNING | Depends on: - PFSense: Interface [{#IFNAME}({#IFALIAS})]: Link down |
PFSense: Interface [{#IFNAME}({#IFALIAS})]: High outbound bandwidth usage | The network interface utilization is close to its estimated maximum bandwidth. |
(avg(/PFSense by SNMP/net.if.out[{#SNMPINDEX}],15m)>({$IF.UTIL.MAX:"{#IFNAME}"}/100)*last(/PFSense by SNMP/net.if.speed[{#SNMPINDEX}])) and last(/PFSense by SNMP/net.if.speed[{#SNMPINDEX}])>0 Recovery expression: avg(/PFSense by SNMP/net.if.out[{#SNMPINDEX}],15m)<(({$IF.UTIL.MAX:"{#IFNAME}"}-3)/100)*last(/PFSense by SNMP/net.if.speed[{#SNMPINDEX}]) |
WARNING | Depends on: - PFSense: Interface [{#IFNAME}({#IFALIAS})]: Link down |
PFSense: Interface [{#IFNAME}({#IFALIAS})]: Ethernet has changed to lower speed than it was before | This Ethernet connection has transitioned down from its known maximum speed. This might be a sign of autonegotiation issues. Ack to close. |
change(/PFSense by SNMP/net.if.speed[{#SNMPINDEX}])<0 and last(/PFSense by SNMP/net.if.speed[{#SNMPINDEX}])>0 and ( last(/PFSense by SNMP/net.if.type[{#SNMPINDEX}])=6 or last(/PFSense by SNMP/net.if.type[{#SNMPINDEX}])=7 or last(/PFSense by SNMP/net.if.type[{#SNMPINDEX}])=11 or last(/PFSense by SNMP/net.if.type[{#SNMPINDEX}])=62 or last(/PFSense by SNMP/net.if.type[{#SNMPINDEX}])=69 or last(/PFSense by SNMP/net.if.type[{#SNMPINDEX}])=117 ) and (last(/PFSense by SNMP/net.if.status[{#SNMPINDEX}])<>2) Recovery expression: (change(/PFSense by SNMP/net.if.speed[{#SNMPINDEX}])>0 and last(/PFSense by SNMP/net.if.speed[{#SNMPINDEX}],#2)>0) or (last(/PFSense by SNMP/net.if.status[{#SNMPINDEX}])=2) |
INFO | Depends on: - PFSense: Interface [{#IFNAME}({#IFALIAS})]: Link down |
PFSense: Interface [{#IFNAME}({#IFALIAS})]: Link down | This trigger expression works as follows: 1. It can be triggered if the operational status is down. 2. {$IFCONTROL:"{#IFNAME}"}=1 - the user can redefine the context macro to the value 0, which marks this interface as not important; no new trigger will be fired if this interface is down. |
{$IFCONTROL:"{#IFNAME}"}=1 and (last(/PFSense by SNMP/net.if.status[{#SNMPINDEX}])=2) |
AVERAGE | |
PFSense: Packet filter is not running | Please check PF status. |
last(/PFSense by SNMP/pfsense.pf.status)<>1 |
HIGH | |
PFSense: State table usage is high | Please check the number of connections; see https://docs.netgate.com/pfsense/en/latest/config/advanced-firewall-nat.html#config-advanced-firewall-maxstates |
min(/PFSense by SNMP/pfsense.state.table.pused,#3)>{$STATE.TABLE.UTIL.MAX} |
WARNING | |
PFSense: Source tracking table usage is high | Please check the number of sticky connections; see https://docs.netgate.com/pfsense/en/latest/monitoring/status/firewall-states-sources.html |
min(/PFSense by SNMP/pfsense.source.tracking.table.pused,#3)>{$SOURCE.TRACKING.TABLE.UTIL.MAX} |
WARNING | |
PFSense: DHCP server is not running | Please check DHCP server settings; see https://docs.netgate.com/pfsense/en/latest/services/dhcp/index.html |
last(/PFSense by SNMP/pfsense.dhcpd.status)=0 |
AVERAGE | |
PFSense: DNS server is not running | Please check DNS server settings; see https://docs.netgate.com/pfsense/en/latest/services/dns/index.html |
last(/PFSense by SNMP/pfsense.dns.status)=0 |
AVERAGE | |
PFSense: Web server is not running | Please check nginx service status. |
last(/PFSense by SNMP/pfsense.nginx.status)=0 |
AVERAGE | |
PFSense: No SNMP data collection | SNMP is not available for polling. Please check device connectivity and SNMP settings. |
max(/PFSense by SNMP/zabbix[host,snmp,available],{$SNMP.TIMEOUT})=0 |
WARNING |
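Note that the error-rate triggers above use hysteresis: the problem fires when the 5-minute minimum of the rate exceeds {$IF.ERRORS.WARN}, and it recovers only once the 5-minute maximum drops below 80% of that threshold. A sketch of this logic with hypothetical samples:

```python
# Hysteresis of the "High input/output error rate" triggers (hypothetical data).
# Problem:  min(rate over 5m) > threshold
# Recovery: max(rate over 5m) < threshold * 0.8

THRESHOLD = 2  # default {$IF.ERRORS.WARN}

def trigger_active(samples_5m, was_active):
    if not was_active:
        return min(samples_5m) > THRESHOLD           # fire the problem
    return not (max(samples_5m) < THRESHOLD * 0.8)   # stay active until recovery

print(trigger_active([3.0, 4.0, 5.0], was_active=False))  # True: fires
print(trigger_active([1.7, 1.5, 1.4], was_active=True))   # True: 1.7 >= 1.6, no recovery yet
print(trigger_active([1.5, 1.2, 1.1], was_active=True))   # False: recovered
```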
Please report any issues with the template at https://support.zabbix.com.
For Zabbix version: 6.2 and higher. Template for monitoring OPNsense by SNMP
This template was tested on:
See Zabbix template operation for basic instructions.
No specific Zabbix configuration is required.
Name | Description | Default | |
---|---|---|---|
{$IF.ERRORS.WARN} | Threshold of error packets rate for warning trigger. Can be used with interface name as context. |
2 |
|
{$IF.UTIL.MAX} | Threshold of interface bandwidth utilization for warning trigger in %. Can be used with interface name as context. |
90 |
|
{$IFCONTROL} | Macro for operational state of the interface for link down trigger. Can be used with interface name as context. |
1 |
|
{$NET.IF.IFADMINSTATUS.MATCHES} | This macro is used in filters of network interfaces discovery rule. |
^.* |
|
{$NET.IF.IFADMINSTATUS.NOT_MATCHES} | Ignore down(2) administrative status. |
^2$ |
|
{$NET.IF.IFALIAS.MATCHES} | This macro is used in filters of network interfaces discovery rule. |
.* |
|
{$NET.IF.IFALIAS.NOT_MATCHES} | This macro is used in filters of network interfaces discovery rule. |
CHANGE_IF_NEEDED |
|
{$NET.IF.IFDESCR.MATCHES} | This macro is used in filters of network interfaces discovery rule. |
.* |
|
{$NET.IF.IFDESCR.NOT_MATCHES} | This macro is used in filters of network interfaces discovery rule. |
CHANGE_IF_NEEDED |
|
{$NET.IF.IFNAME.NOT_MATCHES} | This macro is used in filters of network interfaces discovery rule. |
`(^pflog[0-9.]*$ | ^pfsync[0-9.]*$)` |
{$NET.IF.IFOPERSTATUS.MATCHES} | This macro is used in filters of network interfaces discovery rule. |
^.*$ |
|
{$NET.IF.IFOPERSTATUS.NOT_MATCHES} | Ignore notPresent(6). |
^6$ |
|
{$NET.IF.IFTYPE.MATCHES} | This macro is used in filters of network interfaces discovery rule. |
.* |
|
{$NET.IF.IFTYPE.NOT_MATCHES} | This macro is used in filters of network interfaces discovery rule. |
CHANGE_IF_NEEDED |
|
{$SNMP.TIMEOUT} | The time interval for SNMP availability trigger. |
5m |
|
{$SOURCE.TRACKING.TABLE.UTIL.MAX} | Threshold of source tracking table utilization trigger in %. |
90 |
|
{$STATE.TABLE.UTIL.MAX} | Threshold of state table utilization trigger in %. |
90 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Network interfaces discovery | Discovering interfaces from IF-MIB. |
SNMP | opnsense.net.if.discovery Filter: AND - {#IFADMINSTATUS} MATCHES_REGEX - {#IFADMINSTATUS} NOT_MATCHES_REGEX - {#IFOPERSTATUS} MATCHES_REGEX - {#IFOPERSTATUS} NOT_MATCHES_REGEX - {#IFNAME} MATCHES_REGEX - {#IFNAME} NOT_MATCHES_REGEX - {#IFDESCR} MATCHES_REGEX - {#IFDESCR} NOT_MATCHES_REGEX - {#IFALIAS} MATCHES_REGEX - {#IFALIAS} NOT_MATCHES_REGEX - {#IFTYPE} MATCHES_REGEX - {#IFTYPE} NOT_MATCHES_REGEX |
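The discovery filter macros are plain regular expressions. For instance, the default {$NET.IF.IFNAME.NOT_MATCHES} excludes the internal pflog/pfsync pseudo-interfaces; a small sketch of how that filter is applied (the interface list is hypothetical):

```python
import re

# Default {$NET.IF.IFNAME.NOT_MATCHES}: drop pflog*/pfsync* pseudo-interfaces.
NOT_MATCHES = re.compile(r"(^pflog[0-9.]*$|^pfsync[0-9.]*$)")

interfaces = ["em0", "igb1", "pflog0", "pfsync0", "lo0"]  # hypothetical LLD output
discovered = [name for name in interfaces if not NOT_MATCHES.search(name)]
print(discovered)  # ['em0', 'igb1', 'lo0']
```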
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Inbound packets discarded | MIB: IF-MIB The number of inbound packets which were chosen to be discarded even though no errors had been detected to prevent their being deliverable to a higher-layer protocol. One possible reason for discarding such a packet could be to free up buffer space. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.in.discards[{#SNMPINDEX}] Preprocessing: - CHANGE_PER_SECOND |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Inbound packets with errors | MIB: IF-MIB For packet-oriented interfaces, the number of inbound packets that contained errors preventing them from being deliverable to a higher-layer protocol. For character-oriented or fixed-length interfaces, the number of inbound transmission units that contained errors preventing them from being deliverable to a higher-layer protocol. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.in.errors[{#SNMPINDEX}] Preprocessing: - CHANGE_PER_SECOND |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Bits received | MIB: IF-MIB The total number of octets received on the interface, including framing characters. This object is a 64-bit version of ifInOctets. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.in[{#SNMPINDEX}] Preprocessing: - CHANGE_PER_SECOND - MULTIPLIER: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Outbound packets discarded | MIB: IF-MIB The number of outbound packets which were chosen to be discarded even though no errors had been detected to prevent their being deliverable to a higher-layer protocol. One possible reason for discarding such a packet could be to free up buffer space. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.out.discards[{#SNMPINDEX}] Preprocessing: - CHANGE_PER_SECOND |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Outbound packets with errors | MIB: IF-MIB For packet-oriented interfaces, the number of outbound packets that contained errors preventing them from being deliverable to a higher-layer protocol. For character-oriented or fixed-length interfaces, the number of outbound transmission units that contained errors preventing them from being deliverable to a higher-layer protocol. Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.out.errors[{#SNMPINDEX}] Preprocessing: - CHANGE_PER_SECOND |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Bits sent | MIB: IF-MIB The total number of octets transmitted out of the interface, including framing characters. This object is a 64-bit version of ifOutOctets.Discontinuities in the value of this counter can occur at re-initialization of the management system, and at other times as indicated by the value of ifCounterDiscontinuityTime. |
SNMP | net.if.out[{#SNMPINDEX}] Preprocessing: - CHANGE_PER_SECOND - MULTIPLIER: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Speed | MIB: IF-MIB An estimate of the interface's current bandwidth in units of 1,000,000 bits per second. If this object reports a value of n, then the speed of the interface is somewhere in the range of n-500,000 to n+499,999. For interfaces which do not vary in bandwidth or for those where no accurate estimation can be made, this object should contain the nominal bandwidth. For a sub-layer which has no concept of bandwidth, this object should be zero. |
SNMP | net.if.speed[{#SNMPINDEX}] Preprocessing: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Operational status | MIB: IF-MIB The current operational state of the interface. - The testing(3) state indicates that no operational packets can be passed - If ifAdminStatus is down(2) then ifOperStatus should be down(2) - If ifAdminStatus is changed to up(1) then ifOperStatus should change to up(1) if the interface is ready to transmit and receive network traffic - It should change to dormant(5) if the interface is waiting for external actions (such as a serial line waiting for an incoming connection) - It should remain in the down(2) state if and only if there is a fault that prevents it from going to the up(1) state - It should remain in the notPresent(6) state if the interface has missing (typically, hardware) components. |
SNMP | net.if.status[{#SNMPINDEX}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Interface type | MIB: IF-MIB The type of interface. Additional values for ifType are assigned by the Internet Assigned Numbers Authority (IANA), through updating the syntax of the IANAifType textual convention. |
SNMP | net.if.type[{#SNMPINDEX}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Rules references count | MIB: BEGEMOT-PF-MIB The number of rules referencing this interface. |
SNMP | net.if.rules.refs[{#SNMPINDEX}] |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv4 traffic passed | MIB: BEGEMOT-PF-MIB IPv4 bits per second passed coming in on this interface. |
SNMP | net.if.in.pass.v4.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv4 traffic blocked | MIB: BEGEMOT-PF-MIB IPv4 bits per second blocked coming in on this interface. |
SNMP | net.if.in.block.v4.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv4 traffic passed | MIB: BEGEMOT-PF-MIB IPv4 bits per second passed going out on this interface. |
SNMP | net.if.out.pass.v4.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv4 traffic blocked | MIB: BEGEMOT-PF-MIB IPv4 bits per second blocked going out on this interface. |
SNMP | net.if.out.block.v4.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv4 packets passed | MIB: BEGEMOT-PF-MIB The number of IPv4 packets passed coming in on this interface. |
SNMP | net.if.in.pass.v4.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv4 packets blocked | MIB: BEGEMOT-PF-MIB The number of IPv4 packets blocked coming in on this interface. |
SNMP | net.if.in.block.v4.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv4 packets passed | MIB: BEGEMOT-PF-MIB The number of IPv4 packets passed going out on this interface. |
SNMP | net.if.out.pass.v4.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv4 packets blocked | MIB: BEGEMOT-PF-MIB The number of IPv4 packets blocked going out on this interface. |
SNMP | net.if.out.block.v4.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv6 traffic passed | MIB: BEGEMOT-PF-MIB IPv6 bits per second passed coming in on this interface. |
SNMP | net.if.in.pass.v6.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv6 traffic blocked | MIB: BEGEMOT-PF-MIB IPv6 bits per second blocked coming in on this interface. |
SNMP | net.if.in.block.v6.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv6 traffic passed | MIB: BEGEMOT-PF-MIB IPv6 bits per second passed going out on this interface. |
SNMP | net.if.out.pass.v6.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv6 traffic blocked | MIB: BEGEMOT-PF-MIB IPv6 bits per second blocked going out on this interface. |
SNMP | net.if.out.block.v6.bps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND - MULTIPLIER: |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv6 packets passed | MIB: BEGEMOT-PF-MIB The number of IPv6 packets passed coming in on this interface. |
SNMP | net.if.in.pass.v6.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Inbound IPv6 packets blocked | MIB: BEGEMOT-PF-MIB The number of IPv6 packets blocked coming in on this interface. |
SNMP | net.if.in.block.v6.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv6 packets passed | MIB: BEGEMOT-PF-MIB The number of IPv6 packets passed going out on this interface. |
SNMP | net.if.out.pass.v6.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
Network interfaces | OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Outbound IPv6 packets blocked | MIB: BEGEMOT-PF-MIB The number of IPv6 packets blocked going out on this interface. |
SNMP | net.if.out.block.v6.pps[{#SNMPINDEX}] Preprocessing: - CHANGEPERSECOND |
OPNsense | OPNsense: Packet filter running status | MIB: BEGEMOT-PF-MIB True if packet filter is currently enabled. |
SNMP | opnsense.pf.status |
OPNsense | OPNsense: States table current | MIB: BEGEMOT-PF-MIB Number of entries in the state table. |
SNMP | opnsense.state.table.count |
OPNsense | OPNsense: States table limit | MIB: BEGEMOT-PF-MIB Maximum number of 'keep state' rules in the ruleset. |
SNMP | opnsense.state.table.limit |
OPNsense | OPNsense: States table utilization in % | Utilization of state table in %. |
CALCULATED | opnsense.state.table.pused Expression: last(//opnsense.state.table.count) * 100 / last(//opnsense.state.table.limit) |
OPNsense | OPNsense: Source tracking table current | MIB: BEGEMOT-PF-MIB Number of entries in the source tracking table. |
SNMP | opnsense.source.tracking.table.count |
OPNsense | OPNsense: Source tracking table limit | MIB: BEGEMOT-PF-MIB Maximum number of 'sticky-address' or 'source-track' rules in the ruleset. |
SNMP | opnsense.source.tracking.table.limit |
OPNsense | OPNsense: Source tracking table utilization in % | Utilization of source tracking table in %. |
CALCULATED | opnsense.source.tracking.table.pused Expression: last(//opnsense.source.tracking.table.count) * 100 / last(//opnsense.source.tracking.table.limit) |
OPNsense | OPNsense: DHCP server status | MIB: HOST-RESOURCES-MIB The status of DHCP server process. |
SNMP | opnsense.dhcpd.status Preprocessing: - CHECK_NOT_SUPPORTED |
OPNsense | OPNsense: DNS server status | MIB: HOST-RESOURCES-MIB The status of DNS server process. |
SNMP | opnsense.dns.status Preprocessing: - CHECK_NOT_SUPPORTED |
OPNsense | OPNsense: Web server status | MIB: HOST-RESOURCES-MIB The status of lighttpd process. |
SNMP | opnsense.lighttpd.status Preprocessing: - CHECK_NOT_SUPPORTED |
OPNsense | OPNsense: Packets matched a filter rule | MIB: BEGEMOT-PF-MIB True if the packet was logged with the specified packet filter reason code. The known codes are: match, bad-offset, fragment, short, normalize, and memory. |
SNMP | opnsense.packets.match Preprocessing: - CHANGEPERSECOND |
OPNsense | OPNsense: Packets with bad offset | MIB: BEGEMOT-PF-MIB True if the packet was logged with the specified packet filter reason code. The known codes are: match, bad-offset, fragment, short, normalize, and memory. |
SNMP | opnsense.packets.bad.offset Preprocessing: - CHANGEPERSECOND |
OPNsense | OPNsense: Fragmented packets | MIB: BEGEMOT-PF-MIB True if the packet was logged with the specified packet filter reason code. The known codes are: match, bad-offset, fragment, short, normalize, and memory. |
SNMP | opnsense.packets.fragment Preprocessing: - CHANGEPERSECOND |
OPNsense | OPNsense: Short packets | MIB: BEGEMOT-PF-MIB True if the packet was logged with the specified packet filter reason code. The known codes are: match, bad-offset, fragment, short, normalize, and memory. |
SNMP | opnsense.packets.short Preprocessing: - CHANGEPERSECOND |
OPNsense | OPNsense: Normalized packets | MIB: BEGEMOT-PF-MIB True if the packet was logged with the specified packet filter reason code. The known codes are: match, bad-offset, fragment, short, normalize, and memory. |
SNMP | opnsense.packets.normalize Preprocessing: - CHANGEPERSECOND |
OPNsense | OPNsense: Packets dropped due to memory limitation | MIB: BEGEMOT-PF-MIB True if the packet was logged with the specified packet filter reason code. The known codes are: match, bad-offset, fragment, short, normalize, and memory. |
SNMP | opnsense.packets.mem.drop Preprocessing: - CHANGEPERSECOND |
OPNsense | OPNsense: Firewall rules count | MIB: BEGEMOT-PF-MIB The number of labeled filter rules on this system. |
SNMP | opnsense.rules.count |
Status | OPNsense: SNMP agent availability | Availability of SNMP checks on the host. The value of this item corresponds to the availability icons in the host list. Possible values: 0 - not available, 1 - available, 2 - unknown. |
INTERNAL | zabbix[host,snmp,available] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
OPNsense: Interface [{#IFNAME}({#IFALIAS})]: High input error rate | Recovers when below 80% of {$IF.ERRORS.WARN:"{#IFNAME}"} threshold. |
min(/OPNsense by SNMP/net.if.in.errors[{#SNMPINDEX}],5m)>{$IF.ERRORS.WARN:"{#IFNAME}"} Recovery expression: max(/OPNsense by SNMP/net.if.in.errors[{#SNMPINDEX}],5m)<{$IF.ERRORS.WARN:"{#IFNAME}"}*0.8 |
WARNING | Depends on: - OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Link down |
OPNsense: Interface [{#IFNAME}({#IFALIAS})]: High inbound bandwidth usage | The network interface utilization is close to its estimated maximum bandwidth. |
(avg(/OPNsense by SNMP/net.if.in[{#SNMPINDEX}],15m)>({$IF.UTIL.MAX:"{#IFNAME}"}/100)*last(/OPNsense by SNMP/net.if.speed[{#SNMPINDEX}])) and last(/OPNsense by SNMP/net.if.speed[{#SNMPINDEX}])>0 Recovery expression: avg(/OPNsense by SNMP/net.if.in[{#SNMPINDEX}],15m)<(({$IF.UTIL.MAX:"{#IFNAME}"}-3)/100)*last(/OPNsense by SNMP/net.if.speed[{#SNMPINDEX}]) |
WARNING | Depends on: - OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Link down |
OPNsense: Interface [{#IFNAME}({#IFALIAS})]: High output error rate | Recovers when below 80% of {$IF.ERRORS.WARN:"{#IFNAME}"} threshold. |
min(/OPNsense by SNMP/net.if.out.errors[{#SNMPINDEX}],5m)>{$IF.ERRORS.WARN:"{#IFNAME}"} Recovery expression: max(/OPNsense by SNMP/net.if.out.errors[{#SNMPINDEX}],5m)<{$IF.ERRORS.WARN:"{#IFNAME}"}*0.8 |
WARNING | Depends on: - OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Link down |
OPNsense: Interface [{#IFNAME}({#IFALIAS})]: High outbound bandwidth usage | The network interface utilization is close to its estimated maximum bandwidth. |
(avg(/OPNsense by SNMP/net.if.out[{#SNMPINDEX}],15m)>({$IF.UTIL.MAX:"{#IFNAME}"}/100)*last(/OPNsense by SNMP/net.if.speed[{#SNMPINDEX}])) and last(/OPNsense by SNMP/net.if.speed[{#SNMPINDEX}])>0 Recovery expression: avg(/OPNsense by SNMP/net.if.out[{#SNMPINDEX}],15m)<(({$IF.UTIL.MAX:"{#IFNAME}"}-3)/100)*last(/OPNsense by SNMP/net.if.speed[{#SNMPINDEX}]) |
WARNING | Depends on: - OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Link down |
OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Ethernet has changed to lower speed than it was before | This Ethernet connection has transitioned down from its known maximum speed. This might be a sign of autonegotiation issues. Ack to close. |
change(/OPNsense by SNMP/net.if.speed[{#SNMPINDEX}])<0 and last(/OPNsense by SNMP/net.if.speed[{#SNMPINDEX}])>0 and ( last(/OPNsense by SNMP/net.if.type[{#SNMPINDEX}])=6 or last(/OPNsense by SNMP/net.if.type[{#SNMPINDEX}])=7 or last(/OPNsense by SNMP/net.if.type[{#SNMPINDEX}])=11 or last(/OPNsense by SNMP/net.if.type[{#SNMPINDEX}])=62 or last(/OPNsense by SNMP/net.if.type[{#SNMPINDEX}])=69 or last(/OPNsense by SNMP/net.if.type[{#SNMPINDEX}])=117 ) and (last(/OPNsense by SNMP/net.if.status[{#SNMPINDEX}])<>2) Recovery expression: (change(/OPNsense by SNMP/net.if.speed[{#SNMPINDEX}])>0 and last(/OPNsense by SNMP/net.if.speed[{#SNMPINDEX}],#2)>0) or (last(/OPNsense by SNMP/net.if.status[{#SNMPINDEX}])=2) |
INFO | Depends on: - OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Link down |
OPNsense: Interface [{#IFNAME}({#IFALIAS})]: Link down | This trigger expression works as follows: 1. It can be triggered if the operational status is down. 2. {$IFCONTROL:"{#IFNAME}"}=1 - the user can redefine the context macro to the value 0, which marks this interface as not important; no new trigger will be fired if this interface is down. |
{$IFCONTROL:"{#IFNAME}"}=1 and (last(/OPNsense by SNMP/net.if.status[{#SNMPINDEX}])=2) |
AVERAGE | |
OPNsense: Packet filter is not running | Please check PF status. |
last(/OPNsense by SNMP/opnsense.pf.status)<>1 |
HIGH | |
OPNsense: State table usage is high | Please check the number of connections. |
min(/OPNsense by SNMP/opnsense.state.table.pused,#3)>{$STATE.TABLE.UTIL.MAX} |
WARNING | |
OPNsense: Source tracking table usage is high | Please check the number of sticky connections. |
min(/OPNsense by SNMP/opnsense.source.tracking.table.pused,#3)>{$SOURCE.TRACKING.TABLE.UTIL.MAX} |
WARNING | |
OPNsense: DHCP server is not running | Please check DHCP server settings. |
last(/OPNsense by SNMP/opnsense.dhcpd.status)=0 |
AVERAGE | |
OPNsense: DNS server is not running | Please check DNS server settings. |
last(/OPNsense by SNMP/opnsense.dns.status)=0 |
AVERAGE | |
OPNsense: Web server is not running | Please check lighttpd service status. |
last(/OPNsense by SNMP/opnsense.lighttpd.status)=0 |
AVERAGE | |
OPNsense: No SNMP data collection | SNMP is not available for polling. Please check device connectivity and SNMP settings. |
max(/OPNsense by SNMP/zabbix[host,snmp,available],{$SNMP.TIMEOUT})=0 |
WARNING |
Please report any issues with the template at https://support.zabbix.com.
For Zabbix version: 6.2 and higher
Get weather metrics from OpenWeatherMap current weather API by HTTP.
It works without any external scripts and uses the Script item.
See Zabbix template operation for basic instructions.
Create a host.
Link the template to the host.
Customize the values of {$OPENWEATHERMAP.API.TOKEN} and {$LOCATION} macros.
OpenWeatherMap API Tokens are available in your OpenWeatherMap account https://home.openweathermap.org/api_keys.
Locations can be set in a few ways:
1. By geo coordinates (for example: 56.95,24.0833)
2. By location name (for example: Riga)
3. By location ID; the list of city IDs is available at http://bulk.openweathermap.org/sample/city.list.json.gz
4. By zip/post code with a country code (for example: 94040,us)
Several locations can be added to the macro at the same time, separated by the | delimiter. For example: 43.81821,7.76115|Riga|2643743|94040,us
Please note that API requests by city name, zip code and city ID will be deprecated soon. Language and units macros can be customized too if necessary. List of available languages: https://openweathermap.org/current#multi. Available units of measurement are: standard, metric and imperial https://openweathermap.org/current#data.
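For reference, a minimal sketch of the request that the Script item performs for each location, assuming a valid API key. It is simplified: every entry is sent via the q parameter (a city name), whereas the template's script also detects coordinates, city IDs and zip codes:

```python
import json
import urllib.parse
import urllib.request

API_ENDPOINT = "https://api.openweathermap.org/data/2.5/weather"
API_TOKEN = "YOUR_API_KEY"    # {$OPENWEATHERMAP.API.TOKEN}
LOCATIONS = "Riga|Berlin,de"  # {$LOCATION}, entries separated by the | delimiter

for location in LOCATIONS.split("|"):
    params = urllib.parse.urlencode(
        {"q": location, "appid": API_TOKEN, "units": "metric", "lang": "en"}
    )
    with urllib.request.urlopen(f"{API_ENDPOINT}?{params}", timeout=3) as resp:
        data = json.load(resp)
    print(location, data["main"]["temp"], data["weather"][0]["description"])
```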
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$LANG} | List of available languages https://openweathermap.org/current#multi. |
en |
{$LOCATION} | Locations can be set in a few ways: 1. by geo coordinates (for example: 56.95,24.0833) 2. by location name (for example: Riga) 3. by location ID; link to the list of city IDs: http://bulk.openweathermap.org/sample/city.list.json.gz 4. by zip/post code with a country code (for example: 94040,us). A few locations can be added to the macro at the same time, separated by the | delimiter. For example: 43.81821,7.76115|Riga|2643743|94040,us. Please note that API requests by city name, zip code and city ID will be deprecated soon. |
Riga |
{$OPENWEATHERMAP.API.ENDPOINT} | OpenWeatherMap API endpoint. |
api.openweathermap.org/data/2.5/weather? |
{$OPENWEATHERMAP.API.TOKEN} | Specify openweathermap API key. |
`` |
{$OPENWEATHERMAP.DATA.TIMEOUT} | Response timeout for OpenWeatherMap API. |
3s |
{$TEMP.CRIT.HIGH} | Threshold for high temperature trigger. |
30 |
{$TEMP.CRIT.LOW} | Threshold for low temperature trigger. |
-20 |
{$UNITS} | Available units of measurement are standard, metric and imperial https://openweathermap.org/current#data. |
metric |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Locations discovery | Weather metrics discovery by location. |
DEPENDENT | openweathermap.locations.discovery Preprocessing: - JSONPATH: - NOTMATCHESREGEX: - DISCARDUNCHANGEDHEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
OpenWeatherMap | [{#LOCATION}, {#COUNTRY}]: Data | JSON with result of OpenWeatherMap API request by location. |
DEPENDENT | openweathermap.location.data[{#ID}] Preprocessing: - JSONPATH: |
OpenWeatherMap | [{#LOCATION}, {#COUNTRY}]: Atmospheric pressure | Atmospheric pressure in Pa. |
DEPENDENT | openweathermap.pressure[{#ID}] Preprocessing: - JSONPATH: - MULTIPLIER: - DISCARDUNCHANGEDHEARTBEAT: |
OpenWeatherMap | [{#LOCATION}, {#COUNTRY}]: Cloudiness | Cloudiness in %. |
DEPENDENT | openweathermap.clouds[{#ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
OpenWeatherMap | [{#LOCATION}, {#COUNTRY}]: Humidity | Humidity in %. |
DEPENDENT | openweathermap.humidity[{#ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
OpenWeatherMap | [{#LOCATION}, {#COUNTRY}]: Rain volume for the last one hour | Rain volume for the last one hour in m. |
DEPENDENT | openweathermap.rain[{#ID}] Preprocessing: - JSONPATH: ⛔️ONFAIL: - MULTIPLIER: - DISCARDUNCHANGED_HEARTBEAT: |
OpenWeatherMap | [{#LOCATION}, {#COUNTRY}]: Short weather status | Short weather status description. |
DEPENDENT | openweathermap.description[{#ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
OpenWeatherMap | [{#LOCATION}, {#COUNTRY}]: Snow volume for the last one hour | Snow volume for the last one hour in m. |
DEPENDENT | openweathermap.snow[{#ID}] Preprocessing: - JSONPATH: ⛔️ONFAIL: - MULTIPLIER: - DISCARDUNCHANGED_HEARTBEAT: |
OpenWeatherMap | [{#LOCATION}, {#COUNTRY}]: Temperature | Atmospheric temperature value. |
DEPENDENT | openweathermap.temp[{#ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
OpenWeatherMap | [{#LOCATION}, {#COUNTRY}]: Visibility | Visibility in m. |
DEPENDENT | openweathermap.visibility[{#ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
OpenWeatherMap | [{#LOCATION}, {#COUNTRY}]: Wind direction | Wind direction in degrees. |
DEPENDENT | openweathermap.wind.direction[{#ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
OpenWeatherMap | [{#LOCATION}, {#COUNTRY}]: Wind speed | Wind speed value. |
DEPENDENT | openweathermap.wind.speed[{#ID}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Zabbix raw items | Openweathermap: Get data | JSON array with result of OpenWeatherMap API requests. |
SCRIPT | openweathermap.get.data Expression: The text is too long. Please see the template. |
Zabbix raw items | Openweathermap: Get data collection errors | Errors from get data requests by script item. |
DEPENDENT | openweathermap.get.errors Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
[{#LOCATION}, {#COUNTRY}]: Temperature is too high | Temperature value is too high. |
min(/OpenWeatherMap by HTTP/openweathermap.temp[{#ID}],#3)>{$TEMP.CRIT.HIGH} |
AVERAGE | Manual close: YES |
[{#LOCATION}, {#COUNTRY}]: Temperature is too low | Temperature value is too low. |
max(/OpenWeatherMap by HTTP/openweathermap.temp[{#ID}],#3)<{$TEMP.CRIT.LOW} |
AVERAGE | Manual close: YES |
Openweathermap: There are errors in requests to OpenWeatherMap API | Zabbix has received errors in requests to OpenWeatherMap API. |
length(last(/OpenWeatherMap by HTTP/openweathermap.get.errors))>0 |
AVERAGE | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Services | NTP service is running | - |
SIMPLE | net.udp.service[ntp] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
NTP service is down on {HOST.NAME} | - |
max(/NTP Service/net.udp.service[ntp],#3)=0 |
AVERAGE |
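For reference, the net.udp.service[ntp] simple check amounts to sending a minimal SNTP client packet to UDP port 123 and accepting any well-formed reply; a rough sketch (an approximation, not Zabbix's exact implementation, and the target host is hypothetical):

```python
import socket

def ntp_service_up(host, timeout=3.0):
    """Rough equivalent of net.udp.service[ntp]: send an SNTP v3 client
    request and report whether any plausible reply arrives."""
    packet = b"\x1b" + 47 * b"\0"  # LI=0, VN=3, Mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.sendto(packet, (host, 123))
            reply, _ = s.recvfrom(512)
        except OSError:
            return False
    return len(reply) >= 48

print(ntp_service_up("pool.ntp.org"))  # hypothetical target host
```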
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Services | NNTP service is running | - |
SIMPLE | net.tcp.service[nntp] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
NNTP service is down on {HOST.NAME} | - |
max(/NNTP Service/net.tcp.service[nntp],#3)=0 |
AVERAGE |
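Similarly, net.tcp.service[nntp] boils down to a TCP connection to port 119 followed by a check for a 200/201 greeting from the news server; a rough sketch (an approximation of the check; the host name is hypothetical):

```python
import socket

def nntp_service_up(host, port=119, timeout=3.0):
    """Rough equivalent of net.tcp.service[nntp]: connect and expect a
    '200'/'201' greeting line from the news server."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.settimeout(timeout)
            greeting = s.recv(256).decode("ascii", errors="replace")
    except OSError:
        return False
    return greeting.startswith(("200", "201"))

print(nntp_service_up("news.example.com"))  # hypothetical target host
```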
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher. This template is designed to monitor NGINX Plus by Zabbix that works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The monitoring data of the live activity is generated by the NGINX Plus API.
This template has been tested on:
See Zabbix template operation for basic instructions.
Set the {$NGINX.API.ENDPOINT} macro to the NGINX Plus API URL in the format <scheme>://<host>:<port>/<location>/
Note that, depending on the number of zones and upstreams, the discovery operation may be expensive. Therefore, use the filters with the LLD macros listed in the table below.
No specific Zabbix configuration is required.
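Before linking the template, it can help to query the API manually and confirm that the endpoint answers; a sketch, assuming the api module is enabled and that the version prefix (here /api/8) and host match your installation:

```python
import json
import urllib.request

# {$NGINX.API.ENDPOINT}, e.g. http://nginx.example.com:8080/api/8 (hypothetical host)
API = "http://nginx.example.com:8080/api/8"

for path in ("nginx", "connections", "http/server_zones", "http/upstreams"):
    with urllib.request.urlopen(f"{API}/{path}", timeout=3) as resp:
        print(path, "->", json.load(resp))
```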
Name | Description | Default |
---|---|---|
{$NGINX.API.ENDPOINT} | NGINX Plus API URL in the format <scheme>://<host>:<port>/<location>/ |
`` |
{$NGINX.DROP_RATE.MAX.WARN} | The critical rate of the dropped connections for a trigger expression. |
1 |
{$NGINX.HTTP.UPSTREAM.4XX.MAX.WARN} | The maximum percentage of errors with the status code 4xx (for a trigger expression). |
5 |
{$NGINX.HTTP.UPSTREAM.5XX.MAX.WARN} | The maximum percentage of errors with the status code 5xx (for a trigger expression). |
5 |
{$NGINX.LLD.FILTER.HTTP.LOCATION.ZONE.MATCHES} | The filter to include the necessary discovered HTTP location zones. |
.* |
{$NGINX.LLD.FILTER.HTTP.LOCATION.ZONE.NOT_MATCHES} | The filter to exclude discovered HTTP location zones. |
CHANGE_IF_NEEDED |
{$NGINX.LLD.FILTER.HTTP.UPSTREAM.MATCHES} | The filter to include the necessary discovered HTTP upstreams. |
.* |
{$NGINX.LLD.FILTER.HTTP.UPSTREAM.NOT_MATCHES} | The filter to exclude discovered HTTP upstreams. |
CHANGE_IF_NEEDED |
{$NGINX.LLD.FILTER.HTTP.ZONE.MATCHES} | The filter to include the necessary discovered HTTP server zones. |
.* |
{$NGINX.LLD.FILTER.HTTP.ZONE.NOT_MATCHES} | The filter to exclude discovered HTTP server zones. |
CHANGE_IF_NEEDED |
{$NGINX.LLD.FILTER.RESOLVER.MATCHES} | The filter to include the necessary discovered resolvers. |
.* |
{$NGINX.LLD.FILTER.RESOLVER.NOT_MATCHES} | The filter to exclude discovered resolvers. |
CHANGE_IF_NEEDED |
{$NGINX.LLD.FILTER.STREAM.UPSTREAM.MATCHES} | The filter to include the necessary discovered upstreams of the "stream" directive. |
.* |
{$NGINX.LLD.FILTER.STREAM.UPSTREAM.NOT_MATCHES} | The filter to exclude discovered upstreams of the "stream" directive |
CHANGE_IF_NEEDED |
{$NGINX.LLD.FILTER.STREAM.ZONE.MATCHES} | The filter to include discovered server zones of the "stream" directive. |
.* |
{$NGINX.LLD.FILTER.STREAM.ZONE.NOT_MATCHES} | The filter to exclude discovered server zones of the "stream" directive. |
CHANGE_IF_NEEDED |
There are no template links in this template.
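The upstream 4xx/5xx warning macros above are compared against an error percentage derived from the per-second response rates of a peer; a worked sketch with hypothetical rates:

```python
# Share of 5xx responses for an upstream peer (hypothetical per-second rates).
responses_5xx_rate = 0.4    # nginx.http.upstream.peer.responses.5xx.rate
responses_total_rate = 6.0  # nginx.http.upstream.peer.responses.total.rate

pct_5xx = responses_5xx_rate * 100 / responses_total_rate
print(f"{pct_5xx:.1f}%")    # 6.7% -> above the default {$NGINX.HTTP.UPSTREAM.5XX.MAX.WARN} of 5
```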
Name | Description | Type | Key and additional info |
---|---|---|---|
HTTP location zones discovery | - |
DEPENDENT | nginx.http.location_zones.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 30m Filter: AND - {#NAME} MATCHES_REGEX {$NGINX.LLD.FILTER.HTTP.LOCATION.ZONE.MATCHES} - {#NAME} NOT_MATCHES_REGEX {$NGINX.LLD.FILTER.HTTP.LOCATION.ZONE.NOT_MATCHES} |
HTTP server zones discovery | - |
DEPENDENT | nginx.http.server_zones.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 30m Filter: AND - {#NAME} MATCHES_REGEX {$NGINX.LLD.FILTER.HTTP.ZONE.MATCHES} - {#NAME} NOT_MATCHES_REGEX {$NGINX.LLD.FILTER.HTTP.ZONE.NOT_MATCHES} |
HTTP upstream peers discovery | - |
DEPENDENT | nginx.http.upstream.peers.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#UPSTREAM} MATCHES_REGEX {$NGINX.LLD.FILTER.HTTP.UPSTREAM.MATCHES} - {#UPSTREAM} NOT_MATCHES_REGEX {$NGINX.LLD.FILTER.HTTP.UPSTREAM.NOT_MATCHES} |
HTTP upstreams discovery | - |
DEPENDENT | nginx.http.upstreams.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#NAME} MATCHES_REGEX {$NGINX.LLD.FILTER.HTTP.UPSTREAM.MATCHES} - {#NAME} NOT_MATCHES_REGEX {$NGINX.LLD.FILTER.HTTP.UPSTREAM.NOT_MATCHES} |
Resolvers discovery | - |
DEPENDENT | nginx.resolvers.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#NAME} MATCHES_REGEX {$NGINX.LLD.FILTER.RESOLVER.MATCHES} - {#NAME} NOT_MATCHES_REGEX {$NGINX.LLD.FILTER.RESOLVER.NOT_MATCHES} |
Stream server zones discovery | - |
DEPENDENT | nginx.stream.server_zones.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 30m Filter: AND - {#NAME} MATCHES_REGEX {$NGINX.LLD.FILTER.STREAM.ZONE.MATCHES} - {#NAME} NOT_MATCHES_REGEX {$NGINX.LLD.FILTER.STREAM.ZONE.NOT_MATCHES} |
Stream upstream peers discovery | - |
DEPENDENT | nginx.stream.upstream.peers.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#UPSTREAM} MATCHES_REGEX {$NGINX.LLD.FILTER.STREAM.UPSTREAM.MATCHES} - {#UPSTREAM} NOT_MATCHES_REGEX {$NGINX.LLD.FILTER.STREAM.UPSTREAM.NOT_MATCHES} |
Stream upstreams discovery | - |
DEPENDENT | nginx.stream.upstreams.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: AND - {#NAME} MATCHES_REGEX {$NGINX.LLD.FILTER.STREAM.UPSTREAM.MATCHES} - {#NAME} NOT_MATCHES_REGEX {$NGINX.LLD.FILTER.STREAM.UPSTREAM.NOT_MATCHES} |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Nginx | Nginx: Get info error | The description of NGINX errors. |
DEPENDENT | nginx.info.error Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Nginx | Nginx: Version | A version number of NGINX. |
DEPENDENT | nginx.info.version Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Nginx | Nginx: Address | The address of the server that accepted status request. |
DEPENDENT | nginx.info.address Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Nginx | Nginx: Generation | The total number of configuration reloads. |
DEPENDENT | nginx.info.generation Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Nginx | Nginx: Uptime | The server uptime. |
DEPENDENT | nginx.info.uptime Preprocessing: - JSONPATH: ⛔️ON_FAIL: - JAVASCRIPT: |
Nginx | Nginx: Connections accepted, rate | The total number of accepted client connections per second. |
DEPENDENT | nginx.connections.accepted.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: Connections dropped | The total number of dropped client connections. |
DEPENDENT | nginx.connections.dropped Preprocessing: - JSONPATH: |
Nginx | Nginx: Connections active | The current number of active client connections. |
DEPENDENT | nginx.connections.active Preprocessing: - JSONPATH: |
Nginx | Nginx: Connections idle | The current number of idle client connections. |
DEPENDENT | nginx.connections.idle Preprocessing: - JSONPATH: |
Nginx | Nginx: SSL handshakes, rate | The total number of successful SSL handshakes per second. |
DEPENDENT | nginx.ssl.handshakes.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: SSL handshakes failed, rate | The total number of failed SSL handshakes per second. |
DEPENDENT | nginx.ssl.handshakesfailed.rate Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: SSL session reuses, rate | The total number of session reuses during SSL handshake per second. |
DEPENDENT | nginx.ssl.sessionreuses.rate Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: Requests total, rate | The total number of client requests per second. |
DEPENDENT | nginx.requests.total.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: Requests current | The current number of client requests. |
DEPENDENT | nginx.requests.current Preprocessing: - JSONPATH: |
Nginx | Nginx: HTTP server zone [{#NAME}]: Raw data | The raw data of the HTTP server zone with the name {#NAME}. |
DEPENDENT | nginx.http.server_zones.raw[{#NAME}] Preprocessing: - JSONPATH: |
Nginx | Nginx: HTTP server zone [{#NAME}]: Processing | The number of client requests that are currently being processed. |
DEPENDENT | nginx.http.server_zones.processing[{#NAME}] Preprocessing: - JSONPATH: |
Nginx | Nginx: HTTP server zone [{#NAME}]: Requests, rate | The total number of client requests received from clients per second. |
DEPENDENT | nginx.http.serverzones.requests.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP server zone [{#NAME}]: Responses 1xx, rate | The number of responses with 1xx status codes per second. |
DEPENDENT | nginx.http.serverzones.responses.1xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP server zone [{#NAME}]: Responses 2xx, rate | The number of responses with 2xx status codes per second. |
DEPENDENT | nginx.http.serverzones.responses.2xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP server zone [{#NAME}]: Responses 3xx, rate | The number of responses with 3xx status codes per second. |
DEPENDENT | nginx.http.serverzones.responses.3xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP server zone [{#NAME}]: Responses 4xx, rate | The number of responses with 4xx status codes per second. |
DEPENDENT | nginx.http.serverzones.responses.4xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP server zone [{#NAME}]: Responses 5xx, rate | The number of responses with 5xx status codes per second. |
DEPENDENT | nginx.http.serverzones.responses.5xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP server zone [{#NAME}]: Responses total, rate | The total number of responses sent to clients per second. |
DEPENDENT | nginx.http.serverzones.responses.total.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP server zone [{#NAME}]: Discarded, rate | The total number of requests completed without sending a response per second. |
DEPENDENT | nginx.http.serverzones.discarded.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP server zone [{#NAME}]: Received, rate | The total number of bytes received from clients per second. |
DEPENDENT | nginx.http.serverzones.received.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP server zone [{#NAME}]: Sent, rate | The total number of bytes sent to clients per second. |
DEPENDENT | nginx.http.serverzones.sent.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP location zone [{#NAME}]: Raw data | The raw data of the location zone with the name {#NAME}. |
DEPENDENT | nginx.http.location_zones.raw[{#NAME}] Preprocessing: - JSONPATH: |
Nginx | Nginx: HTTP location zone [{#NAME}]: Requests, rate | The total number of client requests received from clients per second. |
DEPENDENT | nginx.http.locationzones.requests.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP location zone [{#NAME}]: Responses 1xx, rate | The number of responses with 1xx status codes per second. |
DEPENDENT | nginx.http.locationzones.responses.1xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP location zone [{#NAME}]: Responses 2xx, rate | The number of responses with 2xx status codes per second. |
DEPENDENT | nginx.http.locationzones.responses.2xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP location zone [{#NAME}]: Responses 3xx, rate | The number of responses with 3xx status codes per second. |
DEPENDENT | nginx.http.locationzones.responses.3xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP location zone [{#NAME}]: Responses 4xx, rate | The number of responses with 4xx status codes per second. |
DEPENDENT | nginx.http.locationzones.responses.4xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP location zone [{#NAME}]: Responses 5xx, rate | The number of responses with 5xx status codes per second. |
DEPENDENT | nginx.http.locationzones.responses.5xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP location zone [{#NAME}]: Responses total, rate | The total number of responses sent to clients per second. |
DEPENDENT | nginx.http.locationzones.responses.total.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP location zone [{#NAME}]: Discarded, rate | The total number of requests completed without sending a response per second. |
DEPENDENT | nginx.http.locationzones.discarded.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP location zone [{#NAME}]: Received, rate | The total number of bytes received from clients per second. |
DEPENDENT | nginx.http.locationzones.received.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP location zone [{#NAME}]: Sent, rate | The total number of bytes sent to clients per second. |
DEPENDENT | nginx.http.locationzones.sent.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: HTTP upstream [{#NAME}]: Raw data | The raw data of the HTTP upstream with the name {#NAME}. |
DEPENDENT | nginx.http.upstreams.raw[{#NAME}] Preprocessing: - JSONPATH: |
Nginx | Nginx: HTTP upstream [{#NAME}]: Keepalive | The current number of idle keepalive connections. |
DEPENDENT | nginx.http.upstreams.keepalive[{#NAME}] Preprocessing: - JSONPATH: |
Nginx | Nginx: HTTP upstream [{#NAME}]: Zombies | The current number of servers removed from the group but still processing active client requests. |
DEPENDENT | nginx.http.upstreams.zombies[{#NAME}] Preprocessing: - JSONPATH: |
Nginx | Nginx: HTTP upstream [{#NAME}]: Zone | The name of the shared memory zone that keeps the group's configuration and run-time state. |
DEPENDENT | nginx.http.upstreams.zone[{#NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Raw data | The raw data of the HTTP upstream with the name {#UPSTREAM} and its peer {#PEER}. |
DEPENDENT | nginx.http.upstream.peer.raw[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: State | The current state, which may be one of “up”, “draining”, “down”, “unavail”, “checking”, and “unhealthy”. |
DEPENDENT | nginx.http.upstream.peer.state[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Active | The current number of active connections. |
DEPENDENT | nginx.http.upstream.peer.active[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Requests, rate | The total number of client requests forwarded to this server per second. |
DEPENDENT | nginx.http.upstream.peer.requests.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Responses 1xx, rate | The number of responses with 1xx status codes per second. |
DEPENDENT | nginx.http.upstream.peer.responses.1xx.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Responses 2xx, rate | The number of responses with 2xx status codes per second. |
DEPENDENT | nginx.http.upstream.peer.responses.2xx.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Responses 3xx, rate | The number of responses with 3xx status codes per second. |
DEPENDENT | nginx.http.upstream.peer.responses.3xx.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Responses 4xx, rate | The number of responses with 4xx status codes per second. |
DEPENDENT | nginx.http.upstream.peer.responses.4xx.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Responses 5xx, rate | The number of responses with 5xx status codes per second. |
DEPENDENT | nginx.http.upstream.peer.responses.5xx.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Responses total, rate | The total number of responses obtained from this server. |
DEPENDENT | nginx.http.upstream.peer.responses.total.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Sent, rate | The total number of bytes sent to this server per second. |
DEPENDENT | nginx.http.upstream.peer.sent.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Received, rate | The total number of bytes received from this server per second. |
DEPENDENT | nginx.http.upstream.peer.received.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Fails, rate | The total number of unsuccessful attempts to communicate with the server per second. |
DEPENDENT | nginx.http.upstream.peer.fails.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Unavail | Displays how many times the server has become unavailable for client requests (the state - “unavail”) due to the number of unsuccessful attempts reaching the max_fails threshold. |
DEPENDENT | nginx.http.upstream.peer.unavail.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Header time | The average time to get the response header from the server. |
DEPENDENT | nginx.http.upstream.peer.headertime.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Response time | The average time to get the full response from the server. |
DEPENDENT | nginx.http.upstream.peer.responsetime.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Health checks, check | The total number of health check requests made. |
DEPENDENT | nginx.http.upstream.peer.health_checks.checks[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Health checks, fails | The number of failed health checks. |
DEPENDENT | nginx.http.upstream.peer.health_checks.fails[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: HTTP upstream [{#UPSTREAM}] peer [{#PEER}]: Health checks, unhealthy | Displays how many times the server has become unhealthy (the state - “unhealthy”). |
DEPENDENT | nginx.http.upstream.peer.health_checks.unhealthy[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Stream server zone [{#NAME}]: Raw data | The raw data of the server zone with the name {#NAME}. |
DEPENDENT | nginx.stream.server_zones.raw[{#NAME}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Stream server zone [{#NAME}]: Processing | The number of client connections that are currently being processed. |
DEPENDENT | nginx.stream.server_zones.processing[{#NAME}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Stream server zone [{#NAME}]: Connections, rate | The total number of connections accepted from clients per second. |
DEPENDENT | nginx.stream.serverzones.connections.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: Stream server zone [{#NAME}]: Sessions 2xx, rate | The total number of sessions completed with status code 2xx per second. |
DEPENDENT | nginx.stream.serverzones.sessions.2xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: Stream server zone [{#NAME}]: Sessions 4xx, rate | The total number of sessions completed with status code 4xx per second. |
DEPENDENT | nginx.stream.serverzones.sessions.4xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: Stream server zone [{#NAME}]: Sessions 5xx, rate | The total number of sessions completed with status code 5xx per second. |
DEPENDENT | nginx.stream.serverzones.sessions.5xx.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: Stream server zone [{#NAME}]: Sessions total, rate | The total number of completed client sessions per second. |
DEPENDENT | nginx.stream.serverzones.sessions.total.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: Stream server zone [{#NAME}]: Discarded, rate | The total number of connections completed without creating a session per second. |
DEPENDENT | nginx.stream.serverzones.discarded.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: Stream server zone [{#NAME}]: Received, rate | The total number of bytes received from clients per second. |
DEPENDENT | nginx.stream.serverzones.received.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: Stream server zone [{#NAME}]: Sent, rate | The total number of bytes sent to clients per second. |
DEPENDENT | nginx.stream.serverzones.sent.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Nginx | Nginx: Stream upstream [{#NAME}]: Raw data | The raw data of the upstream with the name {#NAME}. |
DEPENDENT | nginx.stream.upstreams.raw[{#NAME}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Stream upstream [{#NAME}]: Zombies | - |
DEPENDENT | nginx.stream.upstreams.zombies[{#NAME}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Stream upstream [{#NAME}]: Zone | - |
DEPENDENT | nginx.stream.upstreams.zone[{#NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: Raw data | The raw data of the upstream with the name {#UPSTREAM} and its peer {#PEER}. |
DEPENDENT | nginx.stream.upstream.peer.raw[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: State | The current state, which may be one of “up”, “draining”, “down”, “unavail”, “checking”, and “unhealthy”. |
DEPENDENT | nginx.stream.upstream.peer.state[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: Active | The current number of connections. |
DEPENDENT | nginx.stream.upstream.peer.active[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: Sent, rate | The total number of bytes sent to this server per second. |
DEPENDENT | nginx.stream.upstream.peer.sent.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: Received, rate | The total number of bytes received from this server per second. |
DEPENDENT | nginx.stream.upstream.peer.received.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: Fails, rate | The total number of unsuccessful attempts to communicate with the server per second. |
DEPENDENT | nginx.stream.upstream.peer.fails.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: Unavail | Displays how many times the server has become unavailable for client requests (the state - “unavail”) due to the number of unsuccessful attempts reaching the max_fails threshold. |
DEPENDENT | nginx.stream.upstream.peer.unavail.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: Connections | The total number of client connections forwarded to this server. |
DEPENDENT | nginx.stream.upstream.peer.connections.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: Connect time | The average time to connect to the upstream server. |
DEPENDENT | nginx.stream.upstream.peer.connecttime.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: ⛔️ON FAIL:DISCARD_VALUE -> |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: First byte time | The average time to receive the first byte of data. |
DEPENDENT | nginx.stream.upstream.peer.firstbytetime.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: Response time | The average time to receive the last byte of data. |
DEPENDENT | nginx.stream.upstream.peer.responsetime.rate[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: ⛔️ON FAIL:DISCARD_VALUE -> |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: Health checks, check | The total number of health check requests made. |
DEPENDENT | nginx.stream.upstream.peer.health_checks.checks[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: Health checks, fails | The number of failed health checks. |
DEPENDENT | nginx.stream.upstream.peer.health_checks.fails[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Stream upstream [{#UPSTREAM}] peer [{#PEER}]: Health checks, unhealthy | Displays how many times the server has become unhealthy (the state “unhealthy”). |
DEPENDENT | nginx.stream.upstream.peer.health_checks.unhealthy[{#UPSTREAM},{#PEER}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Resolver [{#NAME}]: Raw data | The raw data of the resolver with the name {#NAME}. |
DEPENDENT | nginx.resolvers.raw[{#NAME}] Preprocessing: - JSONPATH: |
Nginx | Nginx: Resolver [{#NAME}]: Requests name, rate | The total number of requests to resolve names to addresses per second. |
DEPENDENT | nginx.resolvers.requests.name.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Nginx | Nginx: Resolver [{#NAME}]: Requests srv, rate | The total number of requests to resolve SRV records per second. |
DEPENDENT | nginx.resolvers.requests.srv.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Nginx | Nginx: Resolver [{#NAME}]: Requests addr, rate | The total number of requests to resolve addresses to names per second. |
DEPENDENT | nginx.resolvers.requests.addr.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Nginx | Nginx: Resolver [{#NAME}]: Responses noerror, rate | The total number of successful responses per second. |
DEPENDENT | nginx.resolvers.responses.noerror.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Nginx | Nginx: Resolver [{#NAME}]: Responses formerr, rate | The total number of FORMERR (format error) responses per second. |
DEPENDENT | nginx.resolvers.responses.formerr.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Nginx | Nginx: Resolver [{#NAME}]: Responses servfail, rate | The total number of SERVFAIL (server failure) responses per second. |
DEPENDENT | nginx.resolvers.responses.servfail.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Nginx | Nginx: Resolver [{#NAME}]: Responses nxdomain, rate | The total number of NXDOMAIN (host not found) responses per second. |
DEPENDENT | nginx.resolvers.responses.nxdomain.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Nginx | Nginx: Resolver [{#NAME}]: Responses notimp, rate | The total number of NOTIMP (unimplemented) responses per second. |
DEPENDENT | nginx.resolvers.responses.notimp.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Nginx | Nginx: Resolver [{#NAME}]: Responses refused, rate | The total number of REFUSED (operation refused) responses per second. |
DEPENDENT | nginx.resolvers.responses.refused.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Nginx | Nginx: Resolver [{#NAME}]: Responses timedout, rate | The total number of timed out requests per second. |
DEPENDENT | nginx.resolvers.responses.timedout.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Nginx | Nginx: Resolver [{#NAME}]: Responses unknown, rate | The total number of requests completed with an unknown error per second. |
DEPENDENT | nginx.resolvers.responses.unknown.rate[{#NAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Zabbix raw items | Nginx: Get info | Return status of the NGINX running instance. |
HTTP_AGENT | nginx.info |
Zabbix raw items | Nginx: Get connections | Returns the statistics of client connections. |
HTTP_AGENT | nginx.connections |
Zabbix raw items | Nginx: Get SSL | Returns the SSL statistics. |
HTTP_AGENT | nginx.ssl |
Zabbix raw items | Nginx: Get requests | Returns the status of the client's HTTP requests. |
HTTP_AGENT | nginx.requests |
Zabbix raw items | Nginx: Get HTTP zones | Returns the status information for each HTTP server zone. |
HTTP_AGENT | nginx.http.server_zones |
Zabbix raw items | Nginx: Get HTTP location zones | Returns the status information for each HTTP location zone. |
HTTP_AGENT | nginx.http.location_zones |
Zabbix raw items | Nginx: Get HTTP upstreams | Returns the status of each HTTP upstream server group and its servers. |
HTTP_AGENT | nginx.http.upstreams |
Zabbix raw items | Nginx: Get Stream server zones | Returns the status information for each server zone configured in the "stream" directive. |
HTTP_AGENT | nginx.stream.server_zones |
Zabbix raw items | Nginx: Get Stream upstreams | Returns the status of each stream upstream server group and its servers. |
HTTP_AGENT | nginx.stream.upstreams |
Zabbix raw items | Nginx: Get resolvers | Returns the status information for each Resolver zone. |
HTTP_AGENT | nginx.resolvers |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Nginx: Server response error | - |
length(last(/NGINX Plus by HTTP/nginx.info.error))>0 |
HIGH | |
Nginx: Version has changed | Nginx version has changed. Acknowledge to close manually. |
last(/NGINX Plus by HTTP/nginx.info.version,#1)<>last(/NGINX Plus by HTTP/nginx.info.version,#2) and length(last(/NGINX Plus by HTTP/nginx.info.version))>0 |
INFO | Manual close: YES |
Nginx: Host has been restarted | The host uptime is less than 10 minutes. |
last(/NGINX Plus by HTTP/nginx.info.uptime)<10m |
INFO | Manual close: YES |
Nginx: Failed to fetch info data | Zabbix has not received any data for metrics for the last 30 minutes |
nodata(/NGINX Plus by HTTP/nginx.info.uptime,30m)=1 |
WARNING | Manual close: YES |
Nginx: High connections drop rate | The rate of dropped connections is greater than {$NGINX.DROP_RATE.MAX.WARN} for the last 5 minutes. |
min(/NGINX Plus by HTTP/nginx.connections.dropped,5m) > {$NGINX.DROP_RATE.MAX.WARN} |
WARNING | |
Nginx: HTTP upstream server is not in UP or DOWN state. | - |
find(/NGINX Plus by HTTP/nginx.http.upstream.peer.state[{#UPSTREAM},{#PEER}],,"like","up")=0 and find(/NGINX Plus by HTTP/nginx.http.upstream.peer.state[{#UPSTREAM},{#PEER}],,"like","down")=0 |
WARNING | |
Nginx: Too many HTTP requests with code 4xx | - |
sum(/NGINX Plus by HTTP/nginx.http.upstream.peer.responses.4xx.rate[{#UPSTREAM},{#PEER}],5m) > (sum(/NGINX Plus by HTTP/nginx.http.upstream.peer.responses.total.rate[{#UPSTREAM},{#PEER}],5m)*({$NGINX.HTTP.UPSTREAM.4XX.MAX.WARN}/100)) |
WARNING | |
Nginx: Too many HTTP requests with code 5xx | - |
sum(/NGINX Plus by HTTP/nginx.http.upstream.peer.responses.5xx.rate[{#UPSTREAM},{#PEER}],5m) > (sum(/NGINX Plus by HTTP/nginx.http.upstream.peer.responses.total.rate[{#UPSTREAM},{#PEER}],5m)*({$NGINX.HTTP.UPSTREAM.5XX.MAX.WARN}/100)) |
HIGH | |
Nginx: Stream upstream server is not in UP or DOWN state. | - |
find(/NGINX Plus by HTTP/nginx.stream.upstream.peer.state[{#UPSTREAM},{#PEER}],,"like","up")=0 and find(/NGINX Plus by HTTP/nginx.stream.upstream.peer.state[{#UPSTREAM},{#PEER}],,"like","down")=0 |
WARNING |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor Nginx by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template Nginx by HTTP collects metrics by polling ngx_http_stub_status_module with HTTP agent remotely:
Active connections: 291
server accepts handled requests
16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
Note that this solution supports https and redirects.
This template was tested on:
See Zabbix template operation for basic instructions.
Set up ngx_http_stub_status_module.
Test availability of the http_stub_status_module with nginx -V 2>&1 | grep -o with-http_stub_status_module.
Example configuration of Nginx:
location = /basic_status {
stub_status;
allow <IP of your Zabbix server/proxy>;
deny all;
}
If you use another location, don't forget to change {$NGINX.STUB_STATUS.PATH} macro.
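To confirm the page is reachable before linking the template, you can request it from the Zabbix server or proxy; a minimal check assuming the default macro values and a placeholder hostname:
curl http://<nginx-host>:80/basic_status
A working endpoint returns the Active connections / accepts / handled / requests counters shown above.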
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$NGINX.DROP_RATE.MAX.WARN} | The critical rate of the dropped connections for trigger expression. |
1 |
{$NGINX.RESPONSE_TIME.MAX.WARN} | The Nginx maximum response time in seconds for trigger expression. |
10 |
{$NGINX.STUB_STATUS.PATH} | The path of Nginx stub_status page. |
basic_status |
{$NGINX.STUB_STATUS.PORT} | The port of Nginx stub_status host or container. |
80 |
{$NGINX.STUB_STATUS.SCHEME} | The protocol http or https of Nginx stub_status host or container. |
http |
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Nginx | Nginx: Service status | - |
SIMPLE | net.tcp.service[http,"{HOST.CONN}","{$NGINX.STUB_STATUS.PORT}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
Nginx | Nginx: Service response time | - |
SIMPLE | net.tcp.service.perf[http,"{HOST.CONN}","{$NGINX.STUB_STATUS.PORT}"] |
Nginx | Nginx: Requests total | The total number of client requests. |
DEPENDENT | nginx.requests.total Preprocessing: - REGEX: |
Nginx | Nginx: Requests per second | The total number of client requests. |
DEPENDENT | nginx.requests.total.rate Preprocessing: - REGEX: - CHANGE_PER_SECOND |
Nginx | Nginx: Connections accepted per second | The total number of accepted client connections. |
DEPENDENT | nginx.connections.accepted.rate Preprocessing: - REGEX: - CHANGE_PER_SECOND |
Nginx | Nginx: Connections dropped per second | The total number of dropped client connections. |
DEPENDENT | nginx.connections.dropped.rate Preprocessing: - JAVASCRIPT: - CHANGE_PER_SECOND |
Nginx | Nginx: Connections handled per second | The total number of handled connections. Generally, the parameter value is the same as accepts unless some resource limits have been reached (for example, the worker_connections limit). |
DEPENDENT | nginx.connections.handled.rate Preprocessing: - REGEX: - CHANGE_PER_SECOND |
Nginx | Nginx: Connections active | The current number of active client connections including Waiting connections. |
DEPENDENT | nginx.connections.active Preprocessing: - REGEX: |
Nginx | Nginx: Connections reading | The current number of connections where nginx is reading the request header. |
DEPENDENT | nginx.connections.reading Preprocessing: - REGEX: |
Nginx | Nginx: Connections waiting | The current number of idle client connections waiting for a request. |
DEPENDENT | nginx.connections.waiting Preprocessing: - REGEX: |
Nginx | Nginx: Connections writing | The current number of connections where nginx is writing the response back to the client. |
DEPENDENT | nginx.connections.writing Preprocessing: - REGEX: |
Nginx | Nginx: Version | - |
DEPENDENT | nginx.version Preprocessing: - REGEX: - DISCARD_UNCHANGED_HEARTBEAT: |
Zabbix raw items | Nginx: Get stub status page | The following status information is provided: Active connections - the current number of active client connections including Waiting connections. Accepts - the total number of accepted client connections. Handled - the total number of handled connections. Generally, the parameter value is the same as accepts unless some resource limits have been reached (for example, the worker_connections limit). Requests - the total number of client requests. Reading - the current number of connections where nginx is reading the request header. Writing - the current number of connections where nginx is writing the response back to the client. Waiting - the current number of idle client connections waiting for a request. https://nginx.org/en/docs/http/ngx_http_stub_status_module.html |
HTTP_AGENT | nginx.get_stub_status |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Nginx: Service is down | - |
last(/Nginx by HTTP/net.tcp.service[http,"{HOST.CONN}","{$NGINX.STUB_STATUS.PORT}"])=0 |
AVERAGE | Manual close: YES |
Nginx: Service response time is too high | - |
min(/Nginx by HTTP/net.tcp.service.perf[http,"{HOST.CONN}","{$NGINX.STUB_STATUS.PORT}"],5m)>{$NGINX.RESPONSE_TIME.MAX.WARN} |
WARNING | Manual close: YES Depends on: - Nginx: Service is down |
Nginx: High connections drop rate | The rate of dropped connections is greater than {$NGINX.DROP_RATE.MAX.WARN} for the last 5 minutes. |
min(/Nginx by HTTP/nginx.connections.dropped.rate,5m) > {$NGINX.DROP_RATE.MAX.WARN} |
WARNING | Depends on: - Nginx: Service is down |
Nginx: Version has changed | Nginx version has changed. Acknowledge to close manually. |
last(/Nginx by HTTP/nginx.version,#1)<>last(/Nginx by HTTP/nginx.version,#2) and length(last(/Nginx by HTTP/nginx.version))>0 |
INFO | Manual close: YES |
Nginx: Failed to fetch stub status page | Zabbix has not received data for items for the last 30 minutes. |
find(/Nginx by HTTP/nginx.get_stub_status,,"like","HTTP/1.1 200")=0 or nodata(/Nginx by HTTP/nginx.get_stub_status,30m)=1 |
WARNING | Manual close: YES Depends on: - Nginx: Service is down |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher. This template is developed to monitor Nginx by Zabbix that works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template Nginx by Zabbix agent collects metrics by polling the module ngx_http_stub_status_module locally with Zabbix agent:
Active connections: 291
server accepts handled requests
16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
Note that this template doesn't support HTTPS and redirects (limitations of web.page.get).
It also uses Zabbix agent to collect Nginx
Linux process statistics, such as CPU usage, memory usage and whether the process is running or not.
This template was tested on:
See Zabbix template operation for basic instructions.
See the setup instructions for ngx_http_stub_status_module.
Test the availability of the http_stub_status_module with nginx -V 2>&1 | grep -o with-http_stub_status_module.
Example configuration of Nginx:
location = /basic_status {
stub_status;
allow 127.0.0.1;
allow ::1;
deny all;
}
If you use another location, then don't forget to change the {$NGINX.STUB_STATUS.PATH} macro. Install and setup Zabbix agent.
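Since metrics are collected locally, you can verify the page from the Nginx host itself; a minimal check assuming the default macro values:
curl http://localhost:80/basic_status
This is essentially the request that the web.page.get item key performs through the Zabbix agent (plain HTTP only, as noted above).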
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$NGINX.DROP_RATE.MAX.WARN} | The critical rate of the dropped connections for a trigger expression. |
1 |
{$NGINX.PROCESS_NAME} | The process name of the Nginx server. |
nginx |
{$NGINX.RESPONSE_TIME.MAX.WARN} | The maximum response time of Nginx expressed in seconds for a trigger expression. |
10 |
{$NGINX.STUB_STATUS.HOST} | The hostname or an IP address of the Nginx host or Nginx container of stub_status. |
localhost |
{$NGINX.STUB_STATUS.PATH} | The path of the stub_status page. |
basic_status |
{$NGINX.STUB_STATUS.PORT} | The port of the stub_status host or container. |
80 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Nginx process discovery | The discovery of Nginx process summary. |
DEPENDENT | nginx.proc.discovery Filter: AND - {#NAME} MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Nginx | Nginx: Service status | - |
ZABBIX_PASSIVE | net.tcp.service[http,"{$NGINX.STUB_STATUS.HOST}","{$NGINX.STUB_STATUS.PORT}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Nginx | Nginx: Service response time | - |
ZABBIX_PASSIVE | net.tcp.service.perf[http,"{$NGINX.STUB_STATUS.HOST}","{$NGINX.STUB_STATUS.PORT}"] |
Nginx | Nginx: Requests total | The total number of client requests. |
DEPENDENT | nginx.requests.total Preprocessing: - REGEX: |
Nginx | Nginx: Requests per second | The total number of client requests. |
DEPENDENT | nginx.requests.total.rate Preprocessing: - REGEX: - CHANGE_PER_SECOND |
Nginx | Nginx: Connections accepted per second | The total number of accepted client connections. |
DEPENDENT | nginx.connections.accepted.rate Preprocessing: - REGEX: - CHANGE_PER_SECOND |
Nginx | Nginx: Connections dropped per second | The total number of dropped client connections. |
DEPENDENT | nginx.connections.dropped.rate Preprocessing: - JAVASCRIPT: - CHANGE_PER_SECOND |
Nginx | Nginx: Connections handled per second | The total number of handled connections. Generally, the parameter value is the same as for the accepted connections, unless some resource limits have been reached (for example, the worker_connections limit). |
DEPENDENT | nginx.connections.handled.rate Preprocessing: - REGEX: - CHANGE_PER_SECOND |
Nginx | Nginx: Connections active | The current number of active client connections including waiting connections. |
DEPENDENT | nginx.connections.active Preprocessing: - REGEX: |
Nginx | Nginx: Connections reading | The current number of connections where Nginx is reading the request header. |
DEPENDENT | nginx.connections.reading Preprocessing: - REGEX: |
Nginx | Nginx: Connections waiting | The current number of idle client connections waiting for a request. |
DEPENDENT | nginx.connections.waiting Preprocessing: - REGEX: |
Nginx | Nginx: Connections writing | The current number of connections where Nginx is writing a response back to the client. |
DEPENDENT | nginx.connections.writing Preprocessing: - REGEX: |
Nginx | Nginx: Get processes summary | The aggregated data of summary metrics for all processes. |
ZABBIX_PASSIVE | proc.get[,,,summary] |
Nginx | Nginx: Version | - |
DEPENDENT | nginx.version Preprocessing: - REGEX: - DISCARD_UNCHANGED_HEARTBEAT: |
Nginx | Nginx: CPU utilization | The percentage of the CPU utilization by a process {#NAME}. |
ZABBIX_PASSIVE | proc.cpu.util[{#NAME}] |
Nginx | Nginx: Get process data | The summary metrics aggregated by a process {#NAME}. |
DEPENDENT | nginx.proc.get[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Nginx | Nginx: Memory usage (vsize) | The summary of virtual memory used by a process {#NAME} expressed in bytes. |
DEPENDENT | nginx.proc.vmem[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Nginx | Nginx: Memory usage (rss) | The summary of resident set size memory used by a process {#NAME} expressed in bytes. |
DEPENDENT | nginx.proc.rss[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Nginx | Nginx: Memory usage, % | The percentage of real memory used by a process {#NAME}. |
DEPENDENT | nginx.proc.pmem[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Nginx | Nginx: Number of running processes | The number of running processes {#NAME}. |
DEPENDENT | nginx.proc.num[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
Zabbix raw items | Nginx: Get stub status page | The following status information is provided: See also Module ngx_http_stub_status_module. |
ZABBIX_PASSIVE | web.page.get["{$NGINX.STUB_STATUS.HOST}","{$NGINX.STUB_STATUS.PATH}","{$NGINX.STUB_STATUS.PORT}"] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Nginx: Version has changed | The Nginx version has changed. Acknowledge (Ack) to close manually. |
last(/Nginx by Zabbix agent/nginx.version,#1)<>last(/Nginx by Zabbix agent/nginx.version,#2) and length(last(/Nginx by Zabbix agent/nginx.version))>0 |
INFO | Manual close: YES |
Nginx: Process is not running | - |
last(/Nginx by Zabbix agent/nginx.proc.num[{#NAME}])=0 |
HIGH | |
Nginx: Service is down | - |
last(/Nginx by Zabbix agent/net.tcp.service[http,"{$NGINX.STUB_STATUS.HOST}","{$NGINX.STUB_STATUS.PORT}"])=0 and last(/Nginx by Zabbix agent/nginx.proc.num[{#NAME}])>0 |
AVERAGE | Manual close: YES |
Nginx: High connections drop rate | The rate of dropping connections has been greater than {$NGINX.DROP_RATE.MAX.WARN} for the last 5 minutes. |
min(/Nginx by Zabbix agent/nginx.connections.dropped.rate,5m) > {$NGINX.DROP_RATE.MAX.WARN} and last(/Nginx by Zabbix agent/nginx.proc.num[{#NAME}])>0 |
WARNING | Depends on: - Nginx: Service is down |
Nginx: Service response time is too high | - |
min(/Nginx by Zabbix agent/net.tcp.service.perf[http,"{$NGINX.STUB_STATUS.HOST}","{$NGINX.STUB_STATUS.PORT}"],5m)>{$NGINX.RESPONSE_TIME.MAX.WARN} and last(/Nginx by Zabbix agent/nginx.proc.num[{#NAME}])>0 |
WARNING | Manual close: YES Depends on: - Nginx: Service is down |
Nginx: Failed to fetch stub status page | Zabbix has not received any data for items for the last 30 minutes. |
(find(/Nginx by Zabbix agent/web.page.get["{$NGINX.STUB_STATUS.HOST}","{$NGINX.STUB_STATUS.PATH}","{$NGINX.STUB_STATUS.PORT}"],,"like","HTTP/1.1 200")=0 or nodata(/Nginx by Zabbix agent/web.page.get["{$NGINX.STUB_STATUS.HOST}","{$NGINX.STUB_STATUS.PATH}","{$NGINX.STUB_STATUS.PORT}"],30m)) and last(/Nginx by Zabbix agent/nginx.proc.num[{#NAME}])>0 |
WARNING | Manual close: YES Depends on: - Nginx: Service is down |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor Memcached server by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template Memcached by Zabbix agent 2
— collects metrics by polling zabbix-agent2.
This template was tested on:
See Zabbix template operation for basic instructions.
Setup and configure zabbix-agent2 compiled with the Memcached monitoring plugin.
Test availability: zabbix_get -s memcached-host -k memcached.ping
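If your Memcached instance does not listen on the default socket, you can either point the plugin at it in the zabbix_agent2 configuration file or override the {$MEMCACHED.CONN.URI} macro; an illustrative value for the parameter described in the Macros section below:
Plugins.Memcached.Uri=tcp://memcached-host:11211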
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$MEMCACHED.CONN.PRC.MAX.WARN} | Maximum percentage of connected clients |
80 |
{$MEMCACHED.CONN.QUEUED.MAX.WARN} | Maximum number of queued connections per second |
1 |
{$MEMCACHED.CONN.THROTTLED.MAX.WARN} | Maximum number of throttled connections per second |
1 |
{$MEMCACHED.CONN.URI} | Connection string in the URI format (password is not used). This param overwrites a value configured in the "Plugins.Memcached.Uri" option of the configuration file (if it's set), otherwise, the plugin's default value is used: "tcp://localhost:11211" |
tcp://localhost:11211 |
{$MEMCACHED.MEM.PUSED.MAX.WARN} | Maximum percentage of memory used |
90 |
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Memcached | Memcached: Ping | - | ZABBIX_PASSIVE | memcached.ping["{$MEMCACHED.CONN.URI}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Memcached | Memcached: Max connections | Max number of concurrent connections |
DEPENDENT | memcached.connections.max Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Memcached | Memcached: Maximum number of bytes | Maximum number of bytes allowed in cache. You can adjust this setting via a config file or the command line while starting your Memcached server. |
DEPENDENT | memcached.config.limit_maxbytes Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 30m |
Memcached | Memcached: CPU sys | System CPU consumed by the Memcached server |
DEPENDENT | memcached.cpu.sys Preprocessing: - JSONPATH: |
Memcached | Memcached: CPU user | User CPU consumed by the Memcached server |
DEPENDENT | memcached.cpu.user Preprocessing: - JSONPATH: |
Memcached | Memcached: Queued connections per second | Number of times that memcached has hit its connections limit and disabled its listener |
DEPENDENT | memcached.connections.queued.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Memcached | Memcached: New connections per second | Number of connections opened per second |
DEPENDENT | memcached.connections.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Memcached | Memcached: Throttled connections | Number of times a client connection was throttled. When sending GETs in batch mode and the connection contains too many requests (limited by -R parameter) the connection might be throttled to prevent starvation. |
DEPENDENT | memcached.connections.throttled.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Memcached | Memcached: Connection structures | Number of connection structures allocated by the server |
DEPENDENT | memcached.connections.structures Preprocessing: - JSONPATH: |
Memcached | Memcached: Open connections | The number of clients presently connected |
DEPENDENT | memcached.connections.current Preprocessing: - JSONPATH: |
Memcached | Memcached: Commands: FLUSH per second | The flush_all command invalidates all items in the database. This operation incurs a performance penalty and shouldn't take place in production, so check your debug scripts. |
DEPENDENT | memcached.commands.flush.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Memcached | Memcached: Commands: GET per second | Number of GET requests received by server per second. |
DEPENDENT | memcached.commands.get.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Memcached | Memcached: Commands: SET per second | Number of SET requests received by server per second. |
DEPENDENT | memcached.commands.set.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Memcached | Memcached: Process id | PID of the server process |
DEPENDENT | memcached.process_id Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Memcached | Memcached: Memcached version | Version of the Memcached server |
DEPENDENT | memcached.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Memcached | Memcached: Uptime | Number of seconds since Memcached server start |
DEPENDENT | memcached.uptime Preprocessing: - JSONPATH: |
Memcached | Memcached: Bytes used | Current number of bytes used to store items. |
DEPENDENT | memcached.stats.bytes Preprocessing: - JSONPATH: |
Memcached | Memcached: Written bytes per second | The network's write rate per second in B/sec |
DEPENDENT | memcached.stats.bytes_written.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Memcached | Memcached: Read bytes per second | The network's read rate per second in B/sec |
DEPENDENT | memcached.stats.bytes_read.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Memcached | Memcached: Hits per second | Number of successful GET requests (items requested and found) per second. |
DEPENDENT | memcached.stats.hits.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Memcached | Memcached: Misses per second | Number of missed GET requests (items requested but not found) per second. |
DEPENDENT | memcached.stats.misses.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Memcached | Memcached: Evictions per second | "An eviction is when an item that still has time to live is removed from the cache because a brand new item needs to be allocated. The item is selected with a pseudo-LRU mechanism. A high number of evictions coupled with a low hit rate means your application is setting a large number of keys that are never used again." |
DEPENDENT | memcached.stats.evictions.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Memcached | Memcached: New items per second | Number of new items stored per second. |
DEPENDENT | memcached.stats.total_items.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Memcached | Memcached: Current number of items stored | Current number of items stored by this instance. |
DEPENDENT | memcached.stats.curr_items Preprocessing: - JSONPATH: |
Memcached | Memcached: Threads | Number of worker threads requested |
DEPENDENT | memcached.stats.threads Preprocessing: - JSONPATH: |
Zabbix raw items | Memcached: Get status | - | ZABBIX_PASSIVE | memcached.stats["{$MEMCACHED.CONN.URI}"] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Memcached: Service is down | - |
last(/Memcached by Zabbix agent 2/memcached.ping["{$MEMCACHED.CONN.URI}"])=0 |
AVERAGE | Manual close: YES |
Memcached: Failed to fetch info data | Zabbix has not received data for items for the last 30 minutes |
nodata(/Memcached by Zabbix agent 2/memcached.cpu.sys,30m)=1 |
WARNING | Manual close: YES Depends on: - Memcached: Service is down |
Memcached: Too many queued connections | The max number of connections is reached and a new connection had to wait in the queue as a result. |
min(/Memcached by Zabbix agent 2/memcached.connections.queued.rate,5m)>{$MEMCACHED.CONN.QUEUED.MAX.WARN} |
WARNING | |
Memcached: Too many throttled connections | Number of times a client connection was throttled is too high. When sending GETs in batch mode and the connection contains too many requests (limited by -R parameter) the connection might be throttled to prevent starvation. |
min(/Memcached by Zabbix agent 2/memcached.connections.throttled.rate,5m)>{$MEMCACHED.CONN.THROTTLED.MAX.WARN} |
WARNING | |
Memcached: Total number of connected clients is too high | When the number of connections reaches the value of the "max_connections" parameter, new connections will be rejected. |
min(/Memcached by Zabbix agent 2/memcached.connections.current,5m)/last(/Memcached by Zabbix agent 2/memcached.connections.max)*100>{$MEMCACHED.CONN.PRC.MAX.WARN} |
WARNING | |
Memcached: Version has changed | Memcached version has changed. Acknowledge to close manually. |
last(/Memcached by Zabbix agent 2/memcached.version,#1)<>last(/Memcached by Zabbix agent 2/memcached.version,#2) and length(last(/Memcached by Zabbix agent 2/memcached.version))>0 |
INFO | Manual close: YES |
Memcached: has been restarted | Uptime is less than 10 minutes. |
last(/Memcached by Zabbix agent 2/memcached.uptime)<10m |
INFO | Manual close: YES |
Memcached: Memory usage is too high | - |
min(/Memcached by Zabbix agent 2/memcached.stats.bytes,5m)/last(/Memcached by Zabbix agent 2/memcached.config.limit_maxbytes)*100>{$MEMCACHED.MEM.PUSED.MAX.WARN} |
WARNING |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Services | LDAP service is running | - |
SIMPLE | net.tcp.service[ldap] |
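If a Zabbix agent runs on the monitored host, the same key can be checked from the command line (hypothetical hostname):
zabbix_get -s ldap-host -k 'net.tcp.service[ldap]'
A returned value of 1 means the LDAP port accepted the connection; 0 means the service is down.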
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
LDAP service is down on {HOST.NAME} | - |
max(/LDAP Service/net.tcp.service[ldap],#3)=0 |
AVERAGE |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher. The template to monitor Kubernetes state by Zabbix. It works without external scripts and uses the script item to make HTTP requests to the Kubernetes API.
Template Kubernetes cluster state by HTTP
— collects metrics by HTTP agent from kube-state-metrics endpoint and Kubernetes API.
Don't forget to change the macros {$KUBE.API.URL} and {$KUBE.API.TOKEN}. Also, see the Macros section for a list of macros used to set trigger values. Note: some metrics may not be collected depending on your Kubernetes version and configuration.
This template was tested on:
See Zabbix template operation for basic instructions.
Install the Zabbix Helm Chart in your Kubernetes cluster. Internal service metrics are collected from kube-state-metrics endpoint.
The template needs to use authorization via an API token.
Set the {$KUBE.API.URL} such as <scheme>://<host>:<port>.
Get the generated service account token using the command
kubectl get secret zabbix-service-account -n monitoring -o jsonpath={.data.token} | base64 -d
Then set it to the macro {$KUBE.API.TOKEN}.
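Before saving the macros, you can sanity-check the endpoint and the token; a minimal probe with placeholder values, using the /readyz endpoint from the macros below:
curl -sk -H "Authorization: Bearer <token>" https://<host>:6443/readyz
An ok response confirms that the API server accepts the token.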
Set {$KUBE.STATE.ENDPOINT.NAME} with the Kube state metrics endpoint name. See kubectl -n monitoring get ep. Default: zabbix-kube-state-metrics.
Set up the macros to filter the metrics of discovered worker nodes: {$KUBE.LLD.FILTER.WORKER_NODE.MATCHES}, {$KUBE.LLD.FILTER.WORKER_NODE.NOT_MATCHES}.
Set up macros to filter metrics by namespace: {$KUBE.LLD.FILTER.NAMESPACE.MATCHES}, {$KUBE.LLD.FILTER.NAMESPACE.NOT_MATCHES}.
Set up macros to filter node metrics by nodename: {$KUBE.LLD.FILTER.NODE.MATCHES}, {$KUBE.LLD.FILTER.NODE.NOT_MATCHES}.
Note: if you have a large cluster, it is highly recommended to set a filter for discoverable namespaces; see the illustrative example below.
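For example, to keep system namespaces out of discovery, the filters could be set to values such as these (illustrative regular expressions, not defaults):
{$KUBE.LLD.FILTER.NAMESPACE.MATCHES} = .*
{$KUBE.LLD.FILTER.NAMESPACE.NOT_MATCHES} = ^(kube-system|kube-public)$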
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$KUBE.API.COMPONENTSTATUSES.ENDPOINT} | Kubernetes API componentstatuses endpoint /api/v1/componentstatuses |
/api/v1/componentstatuses |
{$KUBE.API.LIVEZ.ENDPOINT} | Kubernetes API livez endpoint /livez |
/livez |
{$KUBE.API.READYZ.ENDPOINT} | Kubernetes API readyz endpoint /readyz |
/readyz |
{$KUBE.API.TOKEN} | Service account bearer token |
`` |
{$KUBE.API.URL} | Kubernetes API endpoint URL in the format |
https://localhost:6443 |
{$KUBE.API_SERVER.PORT} | Kubernetes API servers metrics endpoint port. Used in ControlPlane LLD. |
6443 |
{$KUBE.API_SERVER.SCHEME} | Kubernetes API servers metrics endpoint scheme. Used in ControlPlane LLD. |
https |
{$KUBE.CONTROLLER_MANAGER.PORT} | Kubernetes Controller manager metrics endpoint port. Used in ControlPlane LLD. |
10252 |
{$KUBE.CONTROLLER_MANAGER.SCHEME} | Kubernetes Controller manager metrics endpoint scheme. Used in ControlPlane LLD. |
http |
{$KUBE.KUBELET.PORT} | Kubernetes Kubelet manager metrics endpoint port. Used in Kubelet LLD. |
10250 |
{$KUBE.KUBELET.SCHEME} | Kubernetes Kubelet manager metrics endpoint scheme. Used in Kubelet LLD. |
https |
{$KUBE.LLD.FILTER.NAMESPACE.MATCHES} | Filter of discoverable pods by namespace |
.* |
{$KUBE.LLD.FILTER.NAMESPACE.NOT_MATCHES} | Filter to exclude discovered pods by namespace |
CHANGE_IF_NEEDED |
{$KUBE.LLD.FILTER.NODE.MATCHES} | Filter of discoverable nodes by nodename |
.* |
{$KUBE.LLD.FILTER.NODE.NOT_MATCHES} | Filter to exclude discovered nodes by nodename |
CHANGE_IF_NEEDED |
{$KUBE.LLD.FILTER.WORKER_NODE.MATCHES} | Filter of discoverable worker nodes by nodename |
.* |
{$KUBE.LLD.FILTER.WORKER_NODE.NOT_MATCHES} | Filter to exclude discovered worker nodes by nodename |
CHANGE_IF_NEEDED |
{$KUBE.SCHEDULER.PORT} | Kubernetes Scheduler manager metrics endpoint port. Used in ControlPlane LLD. |
10251 |
{$KUBE.SCHEDULER.SCHEME} | Kubernetes Scheduler manager metrics endpoint scheme. Used in ControlPlane LLD. |
http |
{$KUBE.STATE.ENDPOINT.NAME} | Kubernetes state endpoint name |
zabbix-kube-state-metrics |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
API servers discovery | - |
DEPENDENT | kube.api_servers.discovery |
Component statuses discovery | - |
DEPENDENT | kube.componentstatuses.discovery Preprocessing: - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT |
Controller manager nodes discovery | - |
DEPENDENT | kube.controller_manager.discovery |
CronJob discovery | - |
DEPENDENT | kube.cronjob.discovery Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT Filter: AND - {#NAMESPACE} MATCHES_REGEX - {#NAMESPACE} NOT_MATCHES_REGEX |
Daemonset discovery | - |
DEPENDENT | kube.daemonset.discovery Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT Filter: AND - {#NAMESPACE} MATCHES_REGEX - {#NAMESPACE} NOT_MATCHES_REGEX |
Deployment discovery | - |
DEPENDENT | kube.deployment.discovery Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT Filter: AND - {#NAMESPACE} MATCHES_REGEX - {#NAMESPACE} NOT_MATCHES_REGEX |
Endpoint discovery | - |
DEPENDENT | kube.endpoint.discovery Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT Filter: AND - {#NAMESPACE} MATCHES_REGEX - {#NAMESPACE} NOT_MATCHES_REGEX |
Job discovery | - |
DEPENDENT | kube.job.discovery Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT Filter: AND - {#NAMESPACE} MATCHES_REGEX - {#NAMESPACE} NOT_MATCHES_REGEX |
Kubelet discovery | - |
DEPENDENT | kube.kubelet.discovery Filter: AND - {#NAME} MATCHES_REGEX - {#NAME} NOT_MATCHES_REGEX |
Livez discovery | - |
DEPENDENT | kube.livez.discovery Preprocessing: - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT |
Node discovery | - |
DEPENDENT | kube.node.discovery Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT Filter: AND - {#NAME} MATCHES_REGEX - {#NAME} NOT_MATCHES_REGEX |
Pod discovery | - |
DEPENDENT | kube.pod.discovery Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT Filter: AND - {#NAMESPACE} MATCHES_REGEX - {#NAMESPACE} NOT_MATCHES_REGEX |
PodDisruptionBudget discovery | - |
DEPENDENT | kube.pdb.discovery Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT Filter: AND - {#NAMESPACE} MATCHES_REGEX - {#NAMESPACE} NOT_MATCHES_REGEX |
PVC discovery | - |
DEPENDENT | kube.pvc.discovery Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT Filter: AND - {#NAMESPACE} MATCHES_REGEX - {#NAMESPACE} NOT_MATCHES_REGEX |
Readyz discovery | - |
DEPENDENT | kube.readyz.discovery Preprocessing: - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT |
Replicaset discovery | - |
DEPENDENT | kube.replicaset.discovery Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT Filter: AND - {#NAMESPACE} MATCHES_REGEX - {#NAMESPACE} NOT_MATCHES_REGEX |
Scheduler servers nodes discovery | - |
DEPENDENT | kube.scheduler.discovery |
Statefulset discovery | - |
DEPENDENT | kube.statefulset.discovery Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT - DISCARD_UNCHANGED_HEARTBEAT Filter: AND - {#NAMESPACE} MATCHES_REGEX - {#NAMESPACE} NOT_MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Kubernetes | Kubernetes: Get state metrics | Collecting Kubernetes metrics from kube-state-metrics. |
SCRIPT | kube.state.metrics Expression: The text is too long. Please see the template. |
Kubernetes | Kubernetes: Control plane LLD | Generation of data for Control plane discovery rules. |
SCRIPT | kube.control_plane.lld Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 3h Expression: The text is too long. Please see the template. |
Kubernetes | Kubernetes: Node LLD | Generation of data for Kubelet discovery rules. |
SCRIPT | kube.node.lld Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: Expression: The text is too long. Please see the template. |
Kubernetes | Kubernetes: Get component statuses | - |
HTTP_AGENT | kube.componentstatuses Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Get readyz | - |
HTTP_AGENT | kube.readyz Preprocessing: - JAVASCRIPT: |
Kubernetes | Kubernetes: Get livez | - |
HTTP_AGENT | kube.livez Preprocessing: - JAVASCRIPT: |
Kubernetes | Kubernetes: Namespace count | The number of namespaces. |
DEPENDENT | kube.namespace.count Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: CronJob count | Number of cronjobs. |
DEPENDENT | kube.cronjob.count Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Job count | Number of jobs (generated by cronjob + job). |
DEPENDENT | kube.job.count Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Endpoint count | Number of endpoints. |
DEPENDENT | kube.endpoint.count Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Deployment count | The number of deployments. |
DEPENDENT | kube.deployment.count Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Service count | The number of services. |
DEPENDENT | kube.service.count Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Statefulset count | The number of statefulsets. |
DEPENDENT | kube.statefulset.count Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Node count | The number of nodes. |
DEPENDENT | kube.node.count Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Daemonset [{#NAME}]: Ready | The number of nodes that should be running the daemon pod and have one or more running and ready. |
DEPENDENT | kube.daemonset.ready[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Daemonset [{#NAME}]: Scheduled | The number of nodes running at least one daemon pod and are supposed to. |
DEPENDENT | kube.daemonset.scheduled[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Daemonset [{#NAME}]: Desired | The number of nodes that should be running the daemon pod. |
DEPENDENT | kube.daemonset.desired[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Daemonset [{#NAME}]: Misscheduled | The number of nodes running a daemon pod but are not supposed to. |
DEPENDENT | kube.daemonset.misscheduled[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Daemonset [{#NAME}]: Updated number scheduled | The total number of nodes that are running updated daemon pod. |
DEPENDENT | kube.daemonset.updated[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] PVC [{#NAME}] Status phase: Available | Persistent volume claim is currently in Available phase. |
DEPENDENT | kube.pvc.status_phase.active[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_persistentvolumeclaim_status_phase{namespace="{#NAMESPACE}", name="{#NAME}", phase="Available"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] PVC [{#NAME}] Status phase: Lost | Persistent volume claim is currently in Lost phase. |
DEPENDENT | kube.pvc.status_phase.lost[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_persistentvolumeclaim_status_phase{namespace="{#NAMESPACE}", name="{#NAME}", phase="Lost"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] PVC [{#NAME}] Status phase: Bound | Persistent volume claim is currently in Bound phase. |
DEPENDENT | kube.pvc.status_phase.bound[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_persistentvolumeclaim_status_phase{namespace="{#NAMESPACE}", name="{#NAME}", phase="Bound"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] PVC [{#NAME}] Status phase: Pending | Persistent volume claim is currently in Pending phase. |
DEPENDENT | kube.pvc.status_phase.pending[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_persistentvolumeclaim_status_phase{namespace="{#NAMESPACE}", name="{#NAME}", phase="Pending"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] PVC [{#NAME}] Requested storage | The capacity of storage requested by the persistent volume claim. |
DEPENDENT | kube.pvc.requested.storage[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Status phase: Pending, sum | Persistent volume claim is currently in Pending phase. |
DEPENDENT | kube.pvc.status_phase.pending.sum[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_persistentvolumeclaim_status_phase{namespace="{#NAMESPACE}", persistentvolumeclaim="{#NAME}", phase="Pending"} : function : sum ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Status phase: Active, sum | Persistent volume claim is currently in Active phase. |
DEPENDENT | kube.pvc.status_phase.active.sum[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_persistentvolumeclaim_status_phase{namespace="{#NAMESPACE}", persistentvolumeclaim="{#NAME}", phase="Active"} : function : sum ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Status phase: Bound, sum | Persistent volume claim is currently in Bound phase. |
DEPENDENT | kube.pvc.status_phase.bound.sum[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_persistentvolumeclaim_status_phase{namespace="{#NAMESPACE}", persistentvolumeclaim="{#NAME}", phase="Bound"} : function : sum ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Status phase: Lost, sum | Persistent volume claim is currently in Lost phase. |
DEPENDENT | kube.pvc.status_phase.lost.sum[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_persistentvolumeclaim_status_phase{namespace="{#NAMESPACE}", persistentvolumeclaim="{#NAME}", phase="Lost"} : function : sum ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Deployment [{#NAME}]: Paused | Whether the deployment is paused and will not be processed by the deployment controller. |
DEPENDENT | kube.deployment.spec_paused[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_deployment_spec_paused{namespace="{#NAMESPACE}", deployment="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Deployment [{#NAME}]: Replicas desired | Number of desired pods for a deployment. |
DEPENDENT | kube.deployment.replicas_desired[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_deployment_spec_replicas{namespace="{#NAMESPACE}", deployment="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Deployment [{#NAME}]: Rollingupdate max unavailable | Maximum number of unavailable replicas during a rolling update of a deployment. |
DEPENDENT | kube.deployment.rollingupdate.max_unavailable[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_deployment_spec_strategy_rollingupdate_max_unavailable{namespace="{#NAMESPACE}", deployment="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Deployment [{#NAME}]: Replicas | The number of replicas per deployment. |
DEPENDENT | kube.deployment.replicas[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Deployment [{#NAME}]: Replicas available | The number of available replicas per deployment. |
DEPENDENT | kube.deployment.replicas_available[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_deployment_status_replicas_available{namespace="{#NAMESPACE}", deployment="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Deployment [{#NAME}]: Replicas unavailable | The number of unavailable replicas per deployment. |
DEPENDENT | kube.deployment.replicas_unavailable[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_deployment_status_replicas_unavailable{namespace="{#NAMESPACE}", deployment="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Deployment [{#NAME}]: Replicas updated | The number of updated replicas per deployment. |
DEPENDENT | kube.deployment.replicas_updated[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_deployment_status_replicas_updated{namespace="{#NAMESPACE}", deployment="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Endpoint [{#NAME}]: Address available | Number of addresses available in endpoint. |
DEPENDENT | kube.endpoint.address_available[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_endpoint_address_available{namespace="{#NAMESPACE}", endpoint="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Endpoint [{#NAME}]: Address not ready | Number of addresses not ready in endpoint. |
DEPENDENT | kube.endpoint.address_not_ready[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Endpoint [{#NAME}]: Age | Endpoint age (number of seconds since creation). |
DEPENDENT | kube.endpoint.age[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Kubernetes | Kubernetes: Node [{#NAME}]: CPU allocatable | The CPU resources of a node that are available for scheduling. |
DEPENDENT | kube.node.cpu_allocatable[{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_node_status_allocatable{node="{#NAME}", resource="cpu"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Node [{#NAME}]: Memory allocatable | The Memory resources of a node that are available for scheduling. |
DEPENDENT | kube.node.memory_allocatable[{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_node_status_allocatable{node="{#NAME}", resource="memory"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Node [{#NAME}]: Pods allocatable | The Pods resources of a node that are available for scheduling. |
DEPENDENT | kube.node.pods_allocatable[{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_node_status_allocatable{node="{#NAME}", resource="pods"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Node [{#NAME}]: Ephemeral storage allocatable | The allocatable ephemeral-storage of a node that is available for scheduling. |
DEPENDENT | kube.node.ephemeral_storage_allocatable[{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Node [{#NAME}]: CPU capacity | The capacity for CPU resources of a node. |
DEPENDENT | kube.node.cpu_capacity[{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_node_status_capacity{node="{#NAME}", resource="cpu"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Node [{#NAME}]: Memory capacity | The capacity for Memory resources of a node. |
DEPENDENT | kube.node.memory_capacity[{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_node_status_capacity{node="{#NAME}", resource="memory"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Node [{#NAME}]: Ephemeral storage capacity | The ephemeral-storage capacity of a node. |
DEPENDENT | kube.node.ephemeral_storage_capacity[{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Node [{#NAME}]: Pods capacity | The capacity for Pods resources of a node. |
DEPENDENT | kube.node.pods_capacity[{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_node_status_capacity{node="{#NAME}", resource="pods"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}] Phase: Pending | Pod is in pending state. |
DEPENDENT | kube.pod.phase.pending[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}] Phase: Succeeded | Pod is in succeeded state. |
DEPENDENT | kube.pod.phase.succeeded[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}] Phase: Failed | Pod is in failed state. |
DEPENDENT | kube.pod.phase.failed[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}] Phase: Unknown | Pod is in unknown state. |
DEPENDENT | kube.pod.phase.unknown[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}] Phase: Running | Pod is in running state. |
DEPENDENT | kube.pod.phase.running[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Containers terminated | Describes whether the container is currently in terminated state. |
DEPENDENT | kube.pod.containers_terminated[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_pod_container_status_terminated{pod="{#NAME}"} : function : sum ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Containers waiting | Describes whether the container is currently in waiting state. |
DEPENDENT | kube.pod.containers_waiting[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_pod_container_status_waiting{pod="{#NAME}"} : function : sum ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Containers ready | Describes whether the container's readiness check succeeded. |
DEPENDENT | kube.pod.containers_ready[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_pod_container_status_ready{pod="{#NAME}"} : function : sum ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Containers restarts | The number of container restarts. |
DEPENDENT | kube.pod.containers_restarts[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_pod_container_status_restarts_total{pod="{#NAME}"} : function : sum ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Containers running | Describes whether the container is currently in running state. |
DEPENDENT | kube.pod.containers_running[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: kube_pod_container_status_running{pod="{#NAME}"} : function : sum ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Ready | Describes whether the pod is ready to serve requests. |
DEPENDENT | kube.pod.ready[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Scheduled | Describes the status of the scheduling process for the pod. |
DEPENDENT | kube.pod.scheduled[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Unschedulable | Describes the unschedulable status for the pod. |
DEPENDENT | kube.pod.unschedulable[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Containers CPU limits | The limit on CPU cores to be used by a container. |
DEPENDENT | kube.pod.containers.limits.cpu[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Containers memory limits | The limit on memory to be used by a container. |
DEPENDENT | kube.pod.containers.limits.memory[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Containers CPU requests | The number of requested CPU cores by a container. |
DEPENDENT | kube.pod.containers.requests.cpu[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Containers memory requests | The number of requested memory bytes by a container. |
DEPENDENT | kube.pod.containers.requests.memory[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Replicaset [{#NAME}]: Replicas | The number of replicas per ReplicaSet. |
DEPENDENT | kube.replicaset.replicas[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Replicaset [{#NAME}]: Desired replicas | Number of desired pods for a ReplicaSet. |
DEPENDENT | kube.replicaset.replicasdesired[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_replicaset_spec_replicas{namespace="{#NAMESPACE}", replicaset="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Replicaset [{#NAME}]: Fully labeled replicas | The number of fully labeled replicas per ReplicaSet. |
DEPENDENT | kube.replicaset.fullylabeledreplicas[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Replicaset [{#NAME}]: Ready | The number of ready replicas per ReplicaSet. |
DEPENDENT | kube.replicaset.ready[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Statefulset [{#NAME}]: Replicas | The number of replicas per StatefulSet. |
DEPENDENT | kube.statefulset.replicas[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Statefulset [{#NAME}]: Desired replicas | Number of desired pods for a StatefulSet. |
DEPENDENT | kube.statefulset.replicasdesired[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_statefulset_replicas{namespace="{#NAMESPACE}", statefulset="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Statefulset [{#NAME}]: Current replicas | The number of current replicas per StatefulSet. |
DEPENDENT | kube.statefulset.replicascurrent[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_statefulset_status_replicas_current{namespace="{#NAMESPACE}", statefulset="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Statefulset [{#NAME}]: Ready replicas | The number of ready replicas per StatefulSet. |
DEPENDENT | kube.statefulset.replicasready[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_statefulset_status_replicas_ready{namespace="{#NAMESPACE}", statefulset="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Statefulset [{#NAME}]: Updated replicas | The number of updated replicas per StatefulSet. |
DEPENDENT | kube.statefulset.replicasupdated[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_statefulset_status_replicas_updated{namespace="{#NAMESPACE}", statefulset="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] PodDisruptionBudget [{#NAME}]: Pods healthy | Current number of healthy pods. |
DEPENDENT | kube.pdb.podshealthy[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_poddisruptionbudget_status_current_healthy{namespace="{#NAMESPACE}", poddisruptionbudget="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] PodDisruptionBudget [{#NAME}]: Pods desired | Minimum desired number of healthy pods. |
DEPENDENT | kube.pdb.podsdesired[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_poddisruptionbudget_status_desired_healthy{namespace="{#NAMESPACE}", poddisruptionbudget="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] PodDisruptionBudget [{#NAME}]: Disruptions allowed | Number of pod disruptions that are allowed. |
DEPENDENT | kube.pdb.disruptionsallowed[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_poddisruptionbudget_status_pod_disruptions_allowed{namespace="{#NAMESPACE}", poddisruptionbudget="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] PodDisruptionBudget [{#NAME}]: Pods total | Total number of pods counted by this disruption budget. |
DEPENDENT | kube.pdb.podstotal[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_poddisruptionbudget_status_expected_pods{namespace="{#NAMESPACE}", poddisruptionbudget="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] CronJob [{#NAME}]: Suspend | Suspend flag tells the controller to suspend subsequent executions. |
DEPENDENT | kube.cronjob.specsuspend[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_cronjob_spec_suspend{namespace="{#NAMESPACE}", cronjob="{#NAME}"} ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] CronJob [{#NAME}]: Active | Active holds pointers to currently running jobs. |
DEPENDENT | kube.cronjob.statusactive[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_cronjob_status_active{namespace="{#NAMESPACE}", cronjob="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] CronJob [{#NAME}]: Last schedule | LastScheduleTime keeps information on when the job was last successfully scheduled. |
DEPENDENT | kube.cronjob.lastscheduletime[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - JAVASCRIPT: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] CronJob [{#NAME}]: Next schedule | Next time the cronjob should be scheduled. The time after lastScheduleTime, or after the cron job's creation time if it's never been scheduled. Use this to determine if the job is delayed. |
DEPENDENT | kube.cronjob.nextscheduletime[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - JAVASCRIPT: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] CronJob [{#NAME}]: Failed | The number of pods which reached Phase Failed and the reason for failure. |
DEPENDENT | kube.cronjob.statusfailed[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_job_status_failed{namespace="{#NAMESPACE}", job_name=~"{#NAME}-*"} : function : sum ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] CronJob [{#NAME}]: Succeeded | The number of pods which reached Phase Succeeded. |
DEPENDENT | kube.cronjob.statussucceeded[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_job_status_succeeded{namespace="{#NAMESPACE}", job_name=~"{#NAME}-*"} : function : sum ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] CronJob [{#NAME}]: Completion succeeded | The number of jobs that have successfully completed their execution. |
DEPENDENT | kube.cronjob.completion.succeeded[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] CronJob [{#NAME}]: Completion failed | The number of jobs that have failed their execution. |
DEPENDENT | kube.cronjob.completion.failed[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Job [{#NAME}]: Failed | The number of pods which reached Phase Failed and the reason for failure. |
DEPENDENT | kube.job.statusfailed[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_job_status_failed{namespace="{#NAMESPACE}", job_name="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Job [{#NAME}]: Succeeded | The number of pods which reached Phase Succeeded. |
DEPENDENT | kube.job.statussucceeded[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUS PATTERN:kube_job_status_succeeded{namespace="{#NAMESPACE}", job_name="{#NAME}"} ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Job [{#NAME}]: Completion succeeded | The number of jobs that have successfully completed their execution. |
DEPENDENT | kube.job.completion.succeeded[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Job [{#NAME}]: Completion failed | The number of jobs that have failed their execution. |
DEPENDENT | kube.job.completion.failed[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes | Kubernetes: Component [{#NAME}]: Healthy | Cluster component healthy. |
DEPENDENT | kube.componentstatuses.healthy[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Readyz [{#NAME}]: Healthcheck | Result of readyz healthcheck for component. |
DEPENDENT | kube.readyz.healthcheck[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Kubernetes | Kubernetes: Livez [{#NAME}]: Healthcheck | Result of livez healthcheck for component. |
DEPENDENT | kube.livez.healthcheck[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
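The CronJob items above aggregate several Prometheus series (one kube_job_status_failed/kube_job_status_succeeded series per job run) into a single value by applying the sum function to the matched pattern. A minimal sketch of what that aggregation does, written as plain ES5 JavaScript rather than Zabbix's actual PROMETHEUS_PATTERN preprocessor (the payload and job names are invented for illustration):

    // Sum all kube_job_status_failed series whose job_name label starts with
    // the CronJob name — a simplified stand-in for PROMETHEUS_PATTERN + sum.
    function sumJobFailures(exposition, cronjob, namespace) {
        var total = 0;
        exposition.split('\n').forEach(function (line) {
            if (line.indexOf('kube_job_status_failed{') !== 0) {
                return; // skip comments, blank lines and other metrics
            }
            var m = line.match(/\{(.*)\}\s+(\S+)$/);
            if (!m) {
                return;
            }
            if (m[1].indexOf('namespace="' + namespace + '"') !== -1
                    && m[1].indexOf('job_name="' + cronjob + '-') !== -1) {
                total += parseFloat(m[2]);
            }
        });
        return total;
    }

    // Two runs of a hypothetical "backup" CronJob:
    var payload =
        'kube_job_status_failed{namespace="apps", job_name="backup-27561", reason="BackoffLimitExceeded"} 1\n' +
        'kube_job_status_failed{namespace="apps", job_name="backup-27562", reason="BackoffLimitExceeded"} 0\n';

    sumJobFailures(payload, 'backup', 'apps'); // -> 1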
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Kubernetes: NS [{#NAMESPACE}] PVC [{#NAME}]: PVC is pending | - |
min(/Kubernetes cluster state by HTTP/kube.pvc.status_phase.pending[{#NAMESPACE}/{#NAME}],2m)>0 |
WARNING | |
Kubernetes: Namespace [{#NAMESPACE}] Deployment [{#NAME}]: Deployment replicas mismatch | - |
(last(/Kubernetes cluster state by HTTP/kube.deployment.replicas[{#NAMESPACE}/{#NAME}])-last(/Kubernetes cluster state by HTTP/kube.deployment.replicas_available[{#NAMESPACE}/{#NAME}]))<>0 |
WARNING | |
Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Pod is not healthy | - |
min(/Kubernetes cluster state by HTTP/kube.pod.phase.failed[{#NAMESPACE}/{#NAME}],10m)>0 or min(/Kubernetes cluster state by HTTP/kube.pod.phase.pending[{#NAMESPACE}/{#NAME}],10m)>0 or min(/Kubernetes cluster state by HTTP/kube.pod.phase.unknown[{#NAMESPACE}/{#NAME}],10m)>0 |
HIGH | |
Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}]: Pod is crash looping | - |
(last(/Kubernetes cluster state by HTTP/kube.pod.containers_restarts[{#NAMESPACE}/{#NAME}])-min(/Kubernetes cluster state by HTTP/kube.pod.containers_restarts[{#NAMESPACE}/{#NAME}],#3))>2 |
WARNING | |
Kubernetes: Namespace [{#NAMESPACE}] RS [{#NAME}]: ReplicaSet mismatch | - |
(last(/Kubernetes cluster state by HTTP/kube.replicaset.replicas[{#NAMESPACE}/{#NAME}])-last(/Kubernetes cluster state by HTTP/kube.replicaset.ready[{#NAMESPACE}/{#NAME}]))<>0 |
WARNING | |
Kubernetes: Namespace [{#NAMESPACE}] StatefulSet [{#NAME}]: StatefulSet is down | - |
(last(/Kubernetes cluster state by HTTP/kube.statefulset.replicas_ready[{#NAMESPACE}/{#NAME}]) / last(/Kubernetes cluster state by HTTP/kube.statefulset.replicas_current[{#NAMESPACE}/{#NAME}]))<>1 |
HIGH | |
Kubernetes: Namespace [{#NAMESPACE}] StatefulSet [{#NAME}]: StatefulSet replicas mismatch | - |
(last(/Kubernetes cluster state by HTTP/kube.statefulset.replicas[{#NAMESPACE}/{#NAME}])-last(/Kubernetes cluster state by HTTP/kube.statefulset.replicas_ready[{#NAMESPACE}/{#NAME}]))<>0 |
WARNING | |
Kubernetes: Component [{#NAME}] is unhealthy | - |
count(/Kubernetes cluster state by HTTP/kube.componentstatuses.healthy[{#NAME}],#3,,"True")<2 and length(last(/Kubernetes cluster state by HTTP/kube.componentstatuses.healthy[{#NAME}]))>0 |
WARNING | |
Kubernetes: Readyz [{#NAME}] is unhealthy | - |
count(/Kubernetes cluster state by HTTP/kube.readyz.healthcheck[{#NAME}],#3,,"ok")<2 and length(last(/Kubernetes cluster state by HTTP/kube.readyz.healthcheck[{#NAME}]))>0 |
WARNING | |
Kubernetes: Livez [{#NAME}] is unhealthy | - |
count(/Kubernetes cluster state by HTTP/kube.livez.healthcheck[{#NAME}],#3,,"ok")<2 and length(last(/Kubernetes cluster state by HTTP/kube.livez.healthcheck[{#NAME}]))>0 |
WARNING |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor Kubernetes Scheduler by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template Kubernetes Scheduler by HTTP
— collects metrics by HTTP agent from Scheduler /metrics endpoint.
This template was tested on:
See Zabbix template operation for basic instructions.
Internal service metrics are collected from the /metrics endpoint. The template requires authorization via an API token.
Don't forget to change the macros {$KUBE.SCHEDULER.SERVER.URL} and {$KUBE.API.TOKEN}. Also, see the Macros section for a list of macros used to set trigger values. NOTE: some metrics may not be collected depending on your Kubernetes Scheduler instance version and configuration.
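For reference, the authorization above amounts to sending the service account token as a Bearer header with each request to the /metrics endpoint. A hedged sketch in the JavaScript dialect used by Zabbix script items (the template itself uses a plain HTTP agent item; this only illustrates the same request):

    // Fetch the Scheduler /metrics endpoint with Bearer-token authorization.
    // The user macros resolve to the values configured in the Macros section.
    var request = new HttpRequest();
    request.addHeader('Authorization: Bearer ' + '{$KUBE.API.TOKEN}');

    var response = request.get('{$KUBE.SCHEDULER.SERVER.URL}');
    if (request.getStatus() !== 200) {
        throw 'Failed to read /metrics: HTTP ' + request.getStatus();
    }
    return response; // raw Prometheus exposition text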
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$KUBE.API.TOKEN} | API Authorization Token |
`` |
{$KUBE.SCHEDULER.ERROR} | Maximum number of scheduling failures with 'error' used for trigger |
2 |
{$KUBE.SCHEDULER.HTTP.CLIENT.ERROR} | Maximum number of HTTP client requests failures used for trigger |
2 |
{$KUBE.SCHEDULER.SERVER.URL} | Instance URL |
http://localhost:10251/metrics |
{$KUBE.SCHEDULER.UNSCHEDULABLE} | Maximum number of scheduling failures with 'unschedulable' used for trigger |
2 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Binding histogram | Discovery raw data of binding latency. |
DEPENDENT | kubernetes.scheduler.binding.discovery Preprocessing: - PROMETHEUSTOJSON: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: Overrides: bucket item total item |
e2e scheduling histogram | Discovery raw data and percentile items of e2e scheduling latency. |
DEPENDENT | kubernetes.controller.e2escheduling.discovery Preprocessing: - PROMETHEUS TOJSON:{__name__=~ "scheduler_e2e_scheduling_duration_*", result =~ ".*"} - JAVASCRIPT: - DISCARD UNCHANGEDHEARTBEAT:3h Overrides: bucket item buckets - ITEMPROTOTYPE LIKE bucket - DISCOVERtotal item totals - ITEMPROTOTYPE NOTLIKE bucket - DISCOVER |
Scheduling algorithm histogram | Discovery raw data of scheduling algorithm latency. |
DEPENDENT | kubernetes.scheduler.schedulingalgorithm.discovery Preprocessing: - PROMETHEUS TOJSON:{__name__=~ "scheduler_scheduling_algorithm_duration_seconds_*"} - JAVASCRIPT: - DISCARD UNCHANGEDHEARTBEAT:3h Overrides: bucket item buckets - ITEMPROTOTYPE LIKE bucket - DISCOVERtotal item totals - ITEMPROTOTYPE NOTLIKE bucket - DISCOVER |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Kubernetes Scheduler | Kubernetes Scheduler: Virtual memory, bytes | Virtual memory size in bytes. |
DEPENDENT | kubernetes.scheduler.processvirtualmemorybytes Preprocessing: - PROMETHEUS PATTERN:process_virtual_memory_bytes ⛔️ON_FAIL: |
Kubernetes Scheduler | Kubernetes Scheduler: Resident memory, bytes | Resident memory size in bytes. |
DEPENDENT | kubernetes.scheduler.processresidentmemorybytes Preprocessing: - PROMETHEUS PATTERN:process_resident_memory_bytes ⛔️ON_FAIL: |
Kubernetes Scheduler | Kubernetes Scheduler: CPU | Total user and system CPU usage ratio. |
DEPENDENT | kubernetes.scheduler.cpu.util Preprocessing: - PROMETHEUSPATTERN: - CHANGEPER_SECOND - MULTIPLIER: |
Kubernetes Scheduler | Kubernetes Scheduler: Goroutines | Number of goroutines that currently exist. |
DEPENDENT | kubernetes.scheduler.gogoroutines Preprocessing: - PROMETHEUS PATTERN:go_goroutines : function : sum ⛔️ON_FAIL: |
Kubernetes Scheduler | Kubernetes Scheduler: Go threads | Number of OS threads created. |
DEPENDENT | kubernetes.scheduler.gothreads Preprocessing: - PROMETHEUS PATTERN:go_threads ⛔️ON_FAIL: |
Kubernetes Scheduler | Kubernetes Scheduler: Fds open | Number of open file descriptors. |
DEPENDENT | kubernetes.scheduler.openfds Preprocessing: - PROMETHEUS PATTERN:process_open_fds ⛔️ON_FAIL: |
Kubernetes Scheduler | Kubernetes Scheduler: Fds max | Maximum allowed open file descriptors. |
DEPENDENT | kubernetes.scheduler.maxfds Preprocessing: - PROMETHEUS PATTERN:process_max_fds ⛔️ON_FAIL: |
Kubernetes Scheduler | Kubernetes Scheduler: REST Client requests: 2xx, rate | Number of HTTP requests with 2xx status code per second. |
DEPENDENT | kubernetes.scheduler.clienthttprequests200.rate Preprocessing: - PROMETHEUS PATTERN:rest_client_requests_total{code =~ "2.."} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes Scheduler | Kubernetes Scheduler: REST Client requests: 3xx, rate | Number of HTTP requests with 3xx status code per second. |
DEPENDENT | kubernetes.scheduler.clienthttprequests300.rate Preprocessing: - PROMETHEUS PATTERN:rest_client_requests_total{code =~ "3.."} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes Scheduler | Kubernetes Scheduler: REST Client requests: 4xx, rate | Number of HTTP requests with 4xx status code per second. |
DEPENDENT | kubernetes.scheduler.clienthttprequests400.rate Preprocessing: - PROMETHEUS PATTERN:rest_client_requests_total{code =~ "4.."} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes Scheduler | Kubernetes Scheduler: REST Client requests: 5xx, rate | Number of HTTP requests with 5xx status code per second. |
DEPENDENT | kubernetes.scheduler.clienthttprequests500.rate Preprocessing: - PROMETHEUS PATTERN:rest_client_requests_total{code =~ "5.."} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes Scheduler | Kubernetes Scheduler: Schedule attempts: scheduled | Number of attempts to schedule pods with result "scheduled" per second. |
DEPENDENT | kubernetes.scheduler.schedulerscheduleattempts.scheduled.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes Scheduler | Kubernetes Scheduler: Schedule attempts: unschedulable | Number of attempts to schedule pods with result "unschedulable" per second. |
DEPENDENT | kubernetes.scheduler.schedulerscheduleattempts.unschedulable.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes Scheduler | Kubernetes Scheduler: Schedule attempts: error | Number of attempts to schedule pods with result "error" per second. |
DEPENDENT | kubernetes.scheduler.schedulerscheduleattempts.error.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes Scheduler | Kubernetes Scheduler: Scheduling algorithm duration bucket, {#LE} | Scheduling algorithm latency in seconds. |
DEPENDENT | kubernetes.scheduler.schedulingalgorithmduration[{#LE}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes Scheduler | Kubernetes Scheduler: Scheduling algorithm duration, p90 | 90 percentile of scheduling algorithm latency in seconds. |
CALCULATED | kubernetes.scheduler.schedulingalgorithmduration_p90[{#SINGLETON}] Expression: bucket_percentile(//kubernetes.scheduler.scheduling_algorithm_duration[*],5m,90) |
Kubernetes Scheduler | Kubernetes Scheduler: Scheduling algorithm duration, p95 | 95 percentile of scheduling algorithm latency in seconds. |
CALCULATED | kubernetes.scheduler.schedulingalgorithmduration_p95[{#SINGLETON}] Expression: bucket_percentile(//kubernetes.scheduler.scheduling_algorithm_duration[*],5m,95) |
Kubernetes Scheduler | Kubernetes Scheduler: Scheduling algorithm duration, p99 | 99 percentile of scheduling algorithm latency in seconds. |
CALCULATED | kubernetes.scheduler.schedulingalgorithmduration_p99[{#SINGLETON}] Expression: bucket_percentile(//kubernetes.scheduler.scheduling_algorithm_duration[*],5m,99) |
Kubernetes Scheduler | Kubernetes Scheduler: Scheduling algorithm duration, p50 | 50 percentile of scheduling algorithm latency in seconds. |
CALCULATED | kubernetes.scheduler.schedulingalgorithmduration_p50[{#SINGLETON}] Expression: bucket_percentile(//kubernetes.scheduler.scheduling_algorithm_duration[*],5m,50) |
Kubernetes Scheduler | Kubernetes Scheduler: Binding duration bucket, {#LE} | Binding latency in seconds. |
DEPENDENT | kubernetes.scheduler.bindingduration[{#LE}] Preprocessing: - PROMETHEUS PATTERN:scheduler_binding_duration_seconds_bucket{le = "{#LE}"} ⛔️ON_FAIL: |
Kubernetes Scheduler | Kubernetes Scheduler: Binding duration, p90 | 90 percentile of binding latency in seconds. |
CALCULATED | kubernetes.scheduler.bindingdurationp90[{#SINGLETON}] Expression: bucket_percentile(//kubernetes.scheduler.binding_duration[*],5m,90) |
Kubernetes Scheduler | Kubernetes Scheduler: Binding duration, p95 | 95 percentile of binding latency in seconds. |
CALCULATED | kubernetes.scheduler.bindingdurationp95[{#SINGLETON}] Expression: bucket_percentile(//kubernetes.scheduler.binding_duration[*],5m,95) |
Kubernetes Scheduler | Kubernetes Scheduler: Binding duration, p99 | 99 percentile of binding latency in seconds. |
CALCULATED | kubernetes.scheduler.bindingdurationp99[{#SINGLETON}] Expression: bucket_percentile(//kubernetes.scheduler.binding_duration[*],5m,99) |
Kubernetes Scheduler | Kubernetes Scheduler: Binding duration, p50 | 50 percentile of binding latency in seconds. |
CALCULATED | kubernetes.scheduler.bindingdurationp50[{#SINGLETON}] Expression: bucket_percentile(//kubernetes.scheduler.binding_duration[*],5m,50) |
Kubernetes Scheduler | Kubernetes Scheduler: ["{#RESULT}"]: e2e scheduling seconds bucket, {#LE} | E2e scheduling latency in seconds (scheduling algorithm + binding) |
DEPENDENT | kubernetes.scheduler.e2eschedulingbucket[{#LE},"{#RESULT}"] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes Scheduler | Kubernetes Scheduler: ["{#RESULT}"]: e2e scheduling, p50 | 50 percentile of e2e scheduling latency. |
CALCULATED | kubernetes.scheduler.e2eschedulingp50["{#RESULT}"] Expression: bucket_percentile(//kubernetes.scheduler.e2e_scheduling_bucket[*,"{#RESULT}"],5m,50) |
Kubernetes Scheduler | Kubernetes Scheduler: ["{#RESULT}"]: e2e scheduling, p90 | 90 percentile of e2e scheduling latency. |
CALCULATED | kubernetes.scheduler.e2eschedulingp90["{#RESULT}"] Expression: bucket_percentile(//kubernetes.scheduler.e2e_scheduling_bucket[*,"{#RESULT}"],5m,90) |
Kubernetes Scheduler | Kubernetes Scheduler: ["{#RESULT}"]: e2e scheduling, p95 | 95 percentile of e2e scheduling latency. |
CALCULATED | kubernetes.scheduler.e2eschedulingp95["{#RESULT}"] Expression: bucket_percentile(//kubernetes.scheduler.e2e_scheduling_bucket[*,"{#RESULT}"],5m,95) |
Kubernetes Scheduler | Kubernetes Scheduler: ["{#RESULT}"]: e2e scheduling, p99 | 99 percentile of e2e scheduling latency. |
CALCULATED | kubernetes.scheduler.e2eschedulingp99["{#RESULT}"] Expression: bucket_percentile(//kubernetes.scheduler.e2e_scheduling_bucket[*,"{#RESULT}"],5m,99) |
Zabbix raw items | Kubernetes Scheduler: Get Scheduler metrics | Get raw metrics from Scheduler instance /metrics endpoint. |
HTTP_AGENT | kubernetes.scheduler.getmetrics Preprocessing: - CHECK NOTSUPPORTED⛔️ON FAIL:DISCARD_VALUE -> |
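Each latency histogram above is discovered as a set of cumulative {#LE} bucket items, and the p50–p99 items are CALCULATED from those buckets with bucket_percentile(), which performs roughly the same linear interpolation as Prometheus' histogram_quantile. A rough sketch of that estimate (Zabbix evaluates this internally; the bucket bounds and counts below are invented):

    // Percentile estimation over cumulative Prometheus-style buckets.
    // buckets: array of [upperBound, cumulativeCount] sorted by bound,
    // ending with the +Inf bucket [Infinity, totalCount].
    function bucketPercentile(buckets, q) {
        var total = buckets[buckets.length - 1][1];
        if (total === 0) {
            return null; // no observations yet
        }
        var rank = (q / 100) * total, prevBound = 0, prevCount = 0;
        for (var i = 0; i < buckets.length; i++) {
            var bound = buckets[i][0], count = buckets[i][1];
            if (count >= rank) {
                if (bound === Infinity) {
                    return prevBound; // cannot interpolate into +Inf
                }
                // Interpolate linearly inside the bucket crossing the rank.
                return prevBound + (bound - prevBound) * (rank - prevCount) / (count - prevCount);
            }
            prevBound = bound;
            prevCount = count;
        }
        return null;
    }

    // Hypothetical scheduler_binding_duration_seconds buckets:
    var buckets = [[0.001, 12], [0.01, 40], [0.1, 95], [1, 100], [Infinity, 100]];
    bucketPercentile(buckets, 90); // -> 0.01 + 0.09 * (90 - 40) / (95 - 40) ≈ 0.092 s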
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Kubernetes Scheduler: Too many REST Client errors | "Kubernetes Scheduler REST Client requests are experiencing a high error rate (with 5xx HTTP code)." |
min(/Kubernetes Scheduler by HTTP/kubernetes.scheduler.client_http_requests_500.rate,5m)>{$KUBE.SCHEDULER.HTTP.CLIENT.ERROR} |
WARNING | |
Kubernetes Scheduler: Too many unschedulable pods | "Number of attempts to schedule pods with 'unschedulable' result is too high. 'unschedulable' means a pod could not be scheduled." |
min(/Kubernetes Scheduler by HTTP/kubernetes.scheduler.scheduler_schedule_attempts.unschedulable.rate,5m)>{$KUBE.SCHEDULER.UNSCHEDULABLE} |
WARNING | |
Kubernetes Scheduler: Too many schedule attempts with errors | "Number of attempts to schedule pods with 'error' result is too high. 'error' means an internal scheduler problem." |
min(/Kubernetes Scheduler by HTTP/kubernetes.scheduler.scheduler_schedule_attempts.error.rate,5m)>{$KUBE.SCHEDULER.ERROR} |
WARNING |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher.
The template to monitor Kubernetes nodes by Zabbix that works without any external scripts.
It uses the script item to make HTTP requests to the Kubernetes API.
Install the Zabbix Helm Chart (https://git.zabbix.com/projects/ZT/repos/kubernetes-helm/browse?at=refs%2Fheads%2Frelease%2F6.2) in your Kubernetes cluster.
Set the {$KUBE.API.ENDPOINT.URL} macro, such as <scheme>://<host>:<port>/api.
Get the generated service account token using the command
kubectl get secret zabbix-service-account -n monitoring -o jsonpath={.data.token} | base64 -d
Then set it to the macro {$KUBE.API.TOKEN}.
Set {$KUBE.NODES.ENDPOINT.NAME} with the Zabbix agent's endpoint name (see kubectl -n monitoring get ep). Default: zabbix-zabbix-helm-chrt-agent.
Set up the macros to filter the metrics of discovered nodes
This template was tested on:
See Zabbix template operation for basic instructions.
Install the Zabbix Helm Chart in your Kubernetes cluster.
Set the {$KUBE.API.ENDPOINT.URL} macro, such as <scheme>://<host>:<port>/api.
Get the generated service account token using the command
kubectl get secret zabbix-service-account -n monitoring -o jsonpath={.data.token} | base64 -d
Then set it to the macro {$KUBE.API.TOKEN}.
Set {$KUBE.NODES.ENDPOINT.NAME} with the Zabbix agent's endpoint name (see kubectl -n monitoring get ep). Default: zabbix-zabbix-helm-chrt-agent.
Set up the macros to filter the metrics of discovered nodes:
Set up the macros to filter host creation based on host prototypes:
Set up macros to filter pod metrics by namespace:
Note: if you have a large cluster, it is highly recommended to set a filter for discoverable pods.
You can use the {$KUBE.NODE.FILTER.LABELS}, {$KUBE.POD.FILTER.LABELS}, {$KUBE.NODE.FILTER.ANNOTATIONS}, and {$KUBE.POD.FILTER.ANNOTATIONS} macros for advanced filtering of nodes and pods by labels and annotations. Macro values are specified as comma-separated key/value pairs, with support for regular expressions in the value.
For example: kubernetes.io/hostname: kubernetes-node[5-25], !node-role.kubernetes.io/ingress: .*. As a result, nodes 5-25 without the "ingress" role will be discovered (see the filter sketch after the note below).
See documentation for details:
Note: the discovered nodes will be created as separate hosts in Zabbix, with the Linux template automatically assigned to them.
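A minimal sketch of how such a comma-separated key/value filter string can be interpreted (the node discovery script embedded in the template does essentially this; the helper and sample values below are illustrative, not the template's code):

    // Decide whether an object's labels pass a filter string such as
    // 'kubernetes.io/hostname: worker-[0-9]+, !node-role.kubernetes.io/ingress: .*'.
    // A leading '!' excludes objects whose label value matches the regex.
    function passesFilter(labels, filterString) {
        return filterString.split(/\s*,\s*/).every(function (kv) {
            var pair = kv.split(/\s*:\s*/);
            if (pair.length < 2) {
                return true; // ignore malformed entries
            }
            var key = pair[0], re = new RegExp(pair[1]);
            if (key.charAt(0) === '!') {
                key = key.substring(1);
                return !(key in labels) || !re.test(labels[key]); // exclusion rule
            }
            return (key in labels) && re.test(labels[key]); // inclusion rule
        });
    }

    passesFilter(
        {'kubernetes.io/hostname': 'worker-07'},
        'kubernetes.io/hostname: worker-[0-9]+, !node-role.kubernetes.io/ingress: .*'
    ); // -> true: the hostname matches and there is no ingress role label to exclude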
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$KUBE.API.ENDPOINT.URL} | Kubernetes API endpoint URL in the format |
https://localhost:6443/api |
{$KUBE.API.TOKEN} | Service account bearer token |
`` |
{$KUBE.LLD.FILTER.NODE.MATCHES} | Filter of discoverable nodes |
.* |
{$KUBE.LLD.FILTER.NODE.NOT_MATCHES} | Filter to exclude discovered nodes |
CHANGE_IF_NEEDED |
{$KUBE.LLD.FILTER.NODE.ROLE.MATCHES} | Filter of discoverable nodes by role |
.* |
{$KUBE.LLD.FILTER.NODE.ROLE.NOT_MATCHES} | Filter to exclude discovered nodes by role |
CHANGE_IF_NEEDED |
{$KUBE.LLD.FILTER.NODE_HOST.MATCHES} | Filter of discoverable cluster nodes |
.* |
{$KUBE.LLD.FILTER.NODE_HOST.NOT_MATCHES} | Filter to exclude discovered cluster nodes |
CHANGE_IF_NEEDED |
{$KUBE.LLD.FILTER.NODE_HOST.ROLE.MATCHES} | Filter of discoverable cluster nodes by role |
.* |
{$KUBE.LLD.FILTER.NODE_HOST.ROLE.NOT_MATCHES} | Filter to exclude discovered cluster nodes by role |
CHANGE_IF_NEEDED |
{$KUBE.LLD.FILTER.POD.NAMESPACE.MATCHES} | Filter of discoverable pods by namespace |
.* |
{$KUBE.LLD.FILTER.POD.NAMESPACE.NOT_MATCHES} | Filter to exclude discovered pods by namespace |
CHANGE_IF_NEEDED |
{$KUBE.NODE.FILTER.ANNOTATIONS} | Annotations to filter nodes (regex in values are supported) |
`` |
{$KUBE.NODE.FILTER.LABELS} | Labels to filter nodes (regex in values are supported) |
`` |
{$KUBE.NODES.ENDPOINT.NAME} | Kubernetes nodes endpoint name. See kubectl -n monitoring get ep |
zabbix-zabbix-helm-chrt-agent |
{$KUBE.POD.FILTER.ANNOTATIONS} | Annotations to filter pods (regex in values are supported) |
`` |
{$KUBE.POD.FILTER.LABELS} | Labels to filter Pods (regex in values are supported) |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Cluster node discovery | - |
DEPENDENT | kube.nodehost.discovery Filter: AND- {#NAME} MATCHES REGEX{$KUBE.LLD.FILTER.NODE_HOST.MATCHES} - {#NAME} NOTMATCHESREGEX - {#ROLES} MATCHESREGEX - {#ROLES} NOTMATCHES_REGEX |
Node discovery | - |
DEPENDENT | kube.node.discovery Filter: AND- {#NAME} MATCHESREGEX - {#NAME} NOTMATCHESREGEX - {#ROLES} MATCHESREGEX - {#ROLES} NOTMATCHESREGEX |
Pod discovery | - |
DEPENDENT | kube.pod.discovery Preprocessing: - JAVASCRIPT - DISCARDUNCHANGEDHEARTBEAT Filter: AND- {#NODE} MATCHESREGEX - {#NODE} NOTMATCHESREGEX - {#NAMESPACE} MATCHESREGEX - {#NAMESPACE} NOTMATCHESREGEX |
Group | Name | Description | Type | Key and additional info | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Kubernetes | Kubernetes: Get nodes | Collecting and processing cluster nodes data via Kubernetes API. |
SCRIPT | kube.nodes Expression: The text is too long. Please see the template. |
||||||||||
Kubernetes | Get nodes check | Data collection check. |
DEPENDENT | kube.nodes.check Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
||||||||||
Kubernetes | Node LLD | Generation of data for node discovery rules. |
DEPENDENT | kube.nodes.lld Preprocessing: - JAVASCRIPT:

    function parseFilters(filter) {
        var pairs = {};
        filter.split(/\s*,\s*/).forEach(function (kv) {
            if (/([\w.-]+\/[\w.-]+):\s*.+/.test(kv)) {
                var pair = kv.split(/\s*:\s*/);
                pairs[pair[0]] = pair[1];
            }
        });
        return pairs;
    }

    function filter(name, data, filters) {
        var filtered = true;
        if (typeof data === 'object') {
            Object.keys(filters).some(function (filter) {
                var exclude = filter.match(/^!(.+)/);
                if (filter in data || (exclude && exclude[1] in data)) {
                    if ((exclude && new RegExp(filters[filter]).test(data[exclude[1]]))
                            || (!exclude && !(new RegExp(filters[filter]).test(data[filter])))) {
                        Zabbix.log(4, '[ Kubernetes discovery ] Discarded "' + name + '" by filter "' + filter + ': ' + filters[filter] + '"');
                        filtered = false;
                        return true;
                    }
                }
            });
        }
        return filtered;
    }

    try {
        var input = JSON.parse(value),
            output = [],
            api_url = '{$KUBE.API.ENDPOINT.URL}',
            hostname = api_url.match(/\/\/(.+):/);

        if (typeof hostname[1] === 'undefined') {
            Zabbix.log(4, '[ Kubernetes ] Received incorrect Kubernetes API url: ' + api_url + '. Expected format: <scheme>://<host>:<port>/api');
            throw 'Cannot get hostname from the Kubernetes API url. Check debug log for more information.';
        }

        if (typeof input !== 'object' || typeof input.items === 'undefined') {
            Zabbix.log(4, '[ Kubernetes ] Received incorrect JSON: ' + value);
            throw 'Incorrect JSON. Check debug log for more information.';
        }

        var filterLabels = parseFilters('{$KUBE.NODE.FILTER.LABELS}'),
            filterAnnotations = parseFilters('{$KUBE.NODE.FILTER.ANNOTATIONS}');

        input.items.forEach(function (node) {
            if (filter(node.metadata.name, node.metadata.labels, filterLabels)
                    && filter(node.metadata.name, node.metadata.annotations, filterAnnotations)) {
                Zabbix.log(4, '[ Kubernetes discovery ] Filtered node "' + node.metadata.name + '"');

                var internalIPs = node.status.addresses.filter(function (addr) {
                    return addr.type === 'InternalIP';
                });
                var internalIP = internalIPs.length && internalIPs[0].address;

                if (internalIP in input.endpointIPs) {
                    output.push({
                        '{#NAME}': node.metadata.name,
                        '{#IP}': internalIP,
                        '{#ROLES}': node.status.roles,
                        '{#ARCH}': node.metadata.labels['kubernetes.io/arch'] || '',
                        '{#OS}': node.metadata.labels['kubernetes.io/os'] || '',
                        '{#CLUSTER_HOSTNAME}': hostname[1]
                    });
                }
                else {
                    Zabbix.log(4, '[ Kubernetes discovery ] Node "' + node.metadata.name + '" is not included in the list of endpoint IPs');
                }
            }
        });

        return JSON.stringify(output);
    }
    catch (error) {
        error += (String(error).endsWith('.')) ? '' : '.';
        Zabbix.log(3, '[ Kubernetes discovery ] ERROR: ' + error);
        throw 'Discovery error: ' + error;
    }

- DISCARD_UNCHANGED_HEARTBEAT: 3h
|||||
Kubernetes | Node [{#NAME}]: Get data | Collecting and processing cluster by node [{#NAME}] data via Kubernetes API. |
DEPENDENT | kube.node.get[{#NAME}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NAME}] Addresses: External IP | Typically the IP address of the node that is externally routable (available from outside the cluster). |
DEPENDENT | kube.node.addresses.externalip[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON FAIL:DISCARD_VALUE -> - DISCARDUNCHANGEDHEARTBEAT: |
||||||||||
Kubernetes | Node [{#NAME}] Addresses: Internal IP | Typically the IP address of the node that is routable only within the cluster. |
DEPENDENT | kube.node.addresses.internalip[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON FAIL:DISCARD_VALUE -> - DISCARDUNCHANGEDHEARTBEAT: |
||||||||||
Kubernetes | Node [{#NAME}] Allocatable: CPU | Allocatable CPU. 'Allocatable' on a Kubernetes node is defined as the amount of compute resources that are available for pods. The scheduler does not over-subscribe 'Allocatable'. 'CPU', 'memory' and 'ephemeral-storage' are supported as of now. |
DEPENDENT | kube.node.allocatable.cpu[{#NAME}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NAME}] Allocatable: Memory | Allocatable Memory. 'Allocatable' on a Kubernetes node is defined as the amount of compute resources that are available for pods. The scheduler does not over-subscribe 'Allocatable'. 'CPU', 'memory' and 'ephemeral-storage' are supported as of now. |
DEPENDENT | kube.node.allocatable.memory[{#NAME}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NAME}] Allocatable: Pods | https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/ |
DEPENDENT | kube.node.allocatable.pods[{#NAME}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NAME}] Capacity: CPU | CPU resource capacity. https://kubernetes.io/docs/concepts/architecture/nodes/#capacity |
DEPENDENT | kube.node.capacity.cpu[{#NAME}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NAME}] Capacity: Memory | Memory resource capacity. https://kubernetes.io/docs/concepts/architecture/nodes/#capacity |
DEPENDENT | kube.node.capacity.memory[{#NAME}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NAME}] Capacity: Pods | https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/ |
DEPENDENT | kube.node.capacity.pods[{#NAME}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NAME}] Conditions: Disk pressure | True if pressure exists on the disk size - that is, if the disk capacity is low; otherwise False. |
DEPENDENT | kube.node.conditions.diskpressure[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - JAVASCRIPT: `return ['True', 'False', 'Unknown'].indexOf(value) + 1 || 'Problem with status processing in JS';` | |||||||||
Kubernetes | Node [{#NAME}] Conditions: Memory pressure | True if pressure exists on the node memory - that is, if the node memory is low; otherwise False. |
DEPENDENT | kube.node.conditions.memorypressure[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - JAVASCRIPT: `return ['True', 'False', 'Unknown'].indexOf(value) + 1 || 'Problem with status processing in JS';` | |||||||||
Kubernetes | Node [{#NAME}] Conditions: Network unavailable | True if the network for the node is not correctly configured, otherwise False. |
DEPENDENT | kube.node.conditions.networkunavailable[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - JAVASCRIPT: `return ['True', 'False', 'Unknown'].indexOf(value) + 1 || 'Problem with status processing in JS';` | |||||||||
Kubernetes | Node [{#NAME}] Conditions: PID pressure | True if pressure exists on the processes - that is, if there are too many processes on the node; otherwise False. |
DEPENDENT | kube.node.conditions.pidpressure[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - JAVASCRIPT: `return ['True', 'False', 'Unknown'].indexOf(value) + 1 || 'Problem with status processing in JS';` | |||||||||
Kubernetes | Node [{#NAME}] Conditions: Ready | True if the node is healthy and ready to accept pods, False if the node is not healthy and is not accepting pods, and Unknown if the node controller has not heard from the node in the last node-monitor-grace-period (default is 40 seconds). |
DEPENDENT | kube.node.conditions.ready[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - JAVASCRIPT: `return ['True', 'False', 'Unknown'].indexOf(value) + 1 || 'Problem with status processing in JS';` | |||||||||
Kubernetes | Node [{#NAME}] Info: Architecture | Node architecture. |
DEPENDENT | kube.node.info.architecture[{#NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
||||||||||
Kubernetes | Node [{#NAME}] Info: Container runtime | Container runtime. https://kubernetes.io/docs/setup/production-environment/container-runtimes/ |
DEPENDENT | kube.node.info.containerruntime[{#NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
||||||||||
Kubernetes | Node [{#NAME}] Info: Kernel version | Node kernel version. |
DEPENDENT | kube.node.info.kernelversion[{#NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
||||||||||
Kubernetes | Node [{#NAME}] Info: Kubelet version | Version of Kubelet. |
DEPENDENT | kube.node.info.kubeletversion[{#NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
||||||||||
Kubernetes | Node [{#NAME}] Info: KubeProxy version | Version of KubeProxy. |
DEPENDENT | kube.node.info.kubeproxyversion[{#NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
||||||||||
Kubernetes | Node [{#NAME}] Info: Operating system | Node operating system. |
DEPENDENT | kube.node.info.operatingsystem[{#NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
||||||||||
Kubernetes | Node [{#NAME}] Info: OS image | Node OS image. |
DEPENDENT | kube.node.info.osversion[{#NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
||||||||||
Kubernetes | Node [{#NAME}] Info: Roles | Node roles. |
DEPENDENT | kube.node.info.roles[{#NAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
||||||||||
Kubernetes | Node [{#NAME}] Limits: CPU | Node CPU limits. https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ |
DEPENDENT | kube.node.limits.cpu[{#NAME}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NAME}] Limits: Memory | Node Memory limits. https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ |
DEPENDENT | kube.node.limits.memory[{#NAME}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NAME}] Requests: CPU | Node CPU requests. https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ |
DEPENDENT | kube.node.requests.cpu[{#NAME}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NAME}] Requests: Memory | Node Memory requests. https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ |
DEPENDENT | kube.node.requests.memory[{#NAME}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NAME}] Uptime | Node uptime. |
DEPENDENT | kube.node.uptime[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - JAVASCRIPT: |
||||||||||
Kubernetes | Node [{#NAME}] Used: Pods | Current number of pods on the node. |
DEPENDENT | kube.node.used.pods[{#NAME}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NODE}] Pod [{#POD}]: Get data | Collecting and processing cluster by node [{#NODE}] data via Kubernetes API. |
DEPENDENT | kube.pod.get[{#POD}] Preprocessing: - JSONPATH: |
||||||||||
Kubernetes | Node [{#NODE}] Pod [{#POD}] Conditions: Containers ready | All containers in the Pod are ready. https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-conditions |
DEPENDENT | kube.pod.conditions.containersready[{#POD}] Preprocessing: - JSONPATH: ⛔️ON FAIL:DISCARD_VALUE -> - JAVASCRIPT: `return ['True', 'False', 'Unknown'].indexOf(value) + 1 || 'Problem with status processing in JS';` | |||||||||
Kubernetes | Node [{#NODE}] Pod [{#POD}] Conditions: Initialized | All init containers have started successfully. https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-conditions |
DEPENDENT | kube.pod.conditions.initialized[{#POD}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - JAVASCRIPT: `return ['True', 'False', 'Unknown'].indexOf(value) + 1 || 'Problem with status processing in JS';` | |||||||||
Kubernetes | Node [{#NODE}] Pod [{#POD}] Conditions: Ready | The Pod is able to serve requests and should be added to the load balancing pools of all matching Services. https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-conditions |
DEPENDENT | kube.pod.conditions.ready[{#POD}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - JAVASCRIPT: `return ['True', 'False', 'Unknown'].indexOf(value) + 1 || 'Problem with status processing in JS';` | |||||||||
Kubernetes | Node [{#NODE}] Pod [{#POD}] Conditions: Scheduled | The Pod has been scheduled to a node. https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-conditions |
DEPENDENT | kube.pod.conditions.scheduled[{#POD}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - JAVASCRIPT: `return ['True', 'False', 'Unknown'].indexOf(value) + 1 || 'Problem with status processing in JS';` | |||||||||
Kubernetes | Node [{#NODE}] Pod [{#POD}] Containers: Restarts | The number of times the container has been restarted, currently based on the number of dead containers that have not yet been removed. Note that this is calculated from dead containers. But those containers are subject to garbage collection. |
DEPENDENT | kube.pod.containers.restartcount[{#POD}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
||||||||||
Kubernetes | Node [{#NODE}] Pod [{#POD}] Status: Phase | The phase of a Pod is a simple, high-level summary of where the Pod is in its lifecycle. https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#pod-phase |
DEPENDENT | kube.pod.status.phase[{#POD}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - JAVASCRIPT: `return ['Pending', 'Running', 'Succeeded', 'Failed', 'Unknown'].indexOf(value) + 1 || 'Problem with status processing in JS';` | |||||||||
Kubernetes | Node [{#NODE}] Pod [{#POD}] Uptime | Pod uptime. |
DEPENDENT | kube.pod.uptime[{#POD}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - JAVASCRIPT: |
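All of the per-node and per-pod items above are dependent items: a single "Get data" request collects the JSON, and each metric is then extracted with a JSONPath expression. A toy illustration of the pattern, with an invented, heavily trimmed payload following the Kubernetes node status schema (the JSONPath step is emulated with plain property access):

    // What a JSONPATH preprocessing step extracts from the per-node JSON
    // collected by "Node [{#NAME}]: Get data" (payload trimmed and invented).
    var nodeData = {
        status: {
            capacity:    {cpu: '4', memory: '16416356Ki', pods: '110'},
            allocatable: {cpu: '3800m', memory: '15916356Ki', pods: '110'},
            nodeInfo:    {kubeletVersion: 'v1.24.3', osImage: 'Ubuntu 22.04 LTS'}
        }
    };

    // Equivalent of JSONPath $.status.allocatable.pods for "Allocatable: Pods":
    var allocatablePods = nodeData.status.allocatable.pods; // -> '110'
    // Equivalent of $.status.nodeInfo.kubeletVersion for "Info: Kubelet version":
    var kubeletVersion = nodeData.status.nodeInfo.kubeletVersion; // -> 'v1.24.3'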
Name | Description | Expression | Severity | Dependencies and additional info | ||
---|---|---|---|---|---|---|
Kubernetes: Failed to get nodes | - |
length(last(/Kubernetes nodes by HTTP/kube.nodes.check))>0 |
WARNING | |||
Node [{#NAME}] Conditions: Pressure exists on the disk size | True - pressure exists on the disk size - that is, if the disk capacity is low; otherwise False. |
last(/Kubernetes nodes by HTTP/kube.node.conditions.diskpressure[{#NAME}])=1 |
WARNING | |||
Node [{#NAME}] Conditions: Pressure exists on the node memory | True - pressure exists on the node memory - that is, if the node memory is low; otherwise False |
last(/Kubernetes nodes by HTTP/kube.node.conditions.memorypressure[{#NAME}])=1 |
WARNING | |||
Node [{#NAME}] Conditions: Network is not correctly configured | True - the network for the node is not correctly configured, otherwise False |
last(/Kubernetes nodes by HTTP/kube.node.conditions.networkunavailable[{#NAME}])=1 |
WARNING | |||
Node [{#NAME}] Conditions: Pressure exists on the processes | True - pressure exists on the processes - that is, if there are too many processes on the node; otherwise False |
last(/Kubernetes nodes by HTTP/kube.node.conditions.pidpressure[{#NAME}])=1 |
WARNING | |||
Node [{#NAME}] Conditions: Is not in Ready state | False - if the node is not healthy and is not accepting pods. Unknown - if the node controller has not heard from the node in the last node-monitor-grace-period (default is 40 seconds). |
last(/Kubernetes nodes by HTTP/kube.node.conditions.ready[{#NAME}])<>1 |
WARNING | |||
Node [{#NAME}] Limits: Total CPU limits are too high | - |
last(/Kubernetes nodes by HTTP/kube.node.limits.cpu[{#NAME}]) / last(/Kubernetes nodes by HTTP/kube.node.allocatable.cpu[{#NAME}]) > 0.9 |
WARNING | Depends on: - Node [{#NAME}] Limits: Total CPU limits are too high |
||
Node [{#NAME}] Limits: Total CPU limits are too high | - |
last(/Kubernetes nodes by HTTP/kube.node.limits.cpu[{#NAME}]) / last(/Kubernetes nodes by HTTP/kube.node.allocatable.cpu[{#NAME}]) > 1 |
AVERAGE | |||
Node [{#NAME}] Limits: Total memory limits are too high | - |
last(/Kubernetes nodes by HTTP/kube.node.limits.memory[{#NAME}]) / last(/Kubernetes nodes by HTTP/kube.node.allocatable.memory[{#NAME}]) > 0.9 |
WARNING | Depends on: - Node [{#NAME}] Limits: Total memory limits are too high |
||
Node [{#NAME}] Limits: Total memory limits are too high | - |
last(/Kubernetes nodes by HTTP/kube.node.limits.memory[{#NAME}]) / last(/Kubernetes nodes by HTTP/kube.node.allocatable.memory[{#NAME}]) > 1 |
AVERAGE | |||
Node [{#NAME}] Requests: Total CPU requests are too high | - |
last(/Kubernetes nodes by HTTP/kube.node.requests.cpu[{#NAME}]) / last(/Kubernetes nodes by HTTP/kube.node.allocatable.cpu[{#NAME}]) > 0.5 |
WARNING | Depends on: - Node [{#NAME}] Requests: Total CPU requests are too high |
||
Node [{#NAME}] Requests: Total CPU requests are too high | - |
last(/Kubernetes nodes by HTTP/kube.node.requests.cpu[{#NAME}]) / last(/Kubernetes nodes by HTTP/kube.node.allocatable.cpu[{#NAME}]) > 0.8 |
AVERAGE | |||
Node [{#NAME}] Requests: Total memory requests are too high | - |
last(/Kubernetes nodes by HTTP/kube.node.requests.memory[{#NAME}]) / last(/Kubernetes nodes by HTTP/kube.node.allocatable.memory[{#NAME}]) > 0.5 |
WARNING | Depends on: - Node [{#NAME}] Requests: Total memory requests are too high |
||
Node [{#NAME}] Requests: Total memory requests are too high | - |
last(/Kubernetes nodes by HTTP/kube.node.requests.memory[{#NAME}]) / last(/Kubernetes nodes by HTTP/kube.node.allocatable.memory[{#NAME}]) > 0.8 |
AVERAGE | |||
Node [{#NAME}]: Has been restarted | Uptime is less than 10 minutes |
last(/Kubernetes nodes by HTTP/kube.node.uptime[{#NAME}])<10 |
INFO | |||
Node [{#NAME}] Used: Kubelet too many pods | Kubelet is running at capacity. |
last(/Kubernetes nodes by HTTP/kube.node.used.pods[{#NAME}])/ last(/Kubernetes nodes by HTTP/kube.node.capacity.pods[{#NAME}]) > 0.9 |
WARNING | |||
Node [{#NODE}] Pod [{#POD}]: Pod is crash looping | Pod restarts more than 2 times in the last 3 minutes. |
(last(/Kubernetes nodes by HTTP/kube.pod.containers.restartcount[{#POD}])-min(/Kubernetes nodes by HTTP/kube.pod.containers.restartcount[{#POD}],3m))>2 |
WARNING | |||
Node [{#NODE}] Pod [{#POD}] Status: Kubernetes Pod not healthy | Pod has been in a non-ready state for longer than 10 minutes. |
count(/Kubernetes nodes by HTTP/kube.pod.status.phase[{#POD}],10m,"regexp","^(1|4|5)$")>=9 | HIGH |
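The "Pod not healthy" trigger above counts the numeric phase values produced by the item's JavaScript preprocessing, which maps the phase name to its 1-based index in the phase list, so the regexp ^(1|4|5)$ matches Pending, Failed, and Unknown. The mapping from the item definition, restated for reference:

    // The "Status: Phase" preprocessing maps a phase name to a 1-based index,
    // so the trigger regexp "^(1|4|5)$" matches Pending, Failed and Unknown.
    var PHASES = ['Pending', 'Running', 'Succeeded', 'Failed', 'Unknown'];

    function phaseToNumber(value) {
        var idx = PHASES.indexOf(value) + 1; // 0 (not found) becomes falsy
        if (!idx) {
            throw 'Problem with status processing in JS';
        }
        return idx;
    }

    phaseToNumber('Running'); // -> 2: healthy, not matched by ^(1|4|5)$
    phaseToNumber('Failed');  // -> 4: unhealthy, matched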
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor Kubernetes Kubelet by Zabbix that works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template Kubernetes Kubelet by HTTP
— collects metrics by HTTP agent from Kubelet /metrics endpoint.
Don't forget to change the macros {$KUBE.KUBELET.URL} and {$KUBE.API.TOKEN}. NOTE: some metrics may not be collected depending on your Kubernetes instance version and configuration.
This template was tested on:
See Zabbix template operation for basic instructions.
Internal service metrics are collected from the /metrics endpoint. The template requires authorization via an API token.
Don't forget to change the macros {$KUBE.KUBELET.URL} and {$KUBE.API.TOKEN}. NOTE: some metrics may not be collected depending on your Kubernetes instance version and configuration.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$KUBE.API.TOKEN} | Service account bearer token |
`` |
{$KUBE.KUBELET.CADVISOR.ENDPOINT} | cAdvisor metrics from Kubelet /metrics/cadvisor endpoint |
/metrics/cadvisor |
{$KUBE.KUBELET.METRIC.ENDPOINT} | Kubelet /metrics endpoint |
/metrics |
{$KUBE.KUBELET.PODS.ENDPOINT} | Kubelet /pods endpoint |
/pods |
{$KUBE.KUBELET.URL} | Instance URL |
https://localhost:10250 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Container memory discovery | DEPENDENT | kube.kubelet.container.memory.cache.discovery Preprocessing: - PROMETHEUSTOJSON - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
|
Pods discovery | DEPENDENT | kube.kubelet.pods.discovery Preprocessing: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
|
REST client requests discovery | DEPENDENT | kube.kubelet.rest.requests.discovery Preprocessing: - PROMETHEUSTOJSON - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
|
Runtime operations discovery | DEPENDENT | kube.kubelet.runtimeoperationsbucket.discovery Preprocessing: - PROMETHEUSTOJSON: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: Overrides: bucket item total item |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Kubernetes | Kubernetes: Get kubelet metrics | Collecting raw Kubelet metrics from /metrics endpoint. |
HTTP_AGENT | kube.kubelet.metrics |
Kubernetes | Kubernetes: Get cadvisor metrics | Collecting raw Kubelet metrics from /metrics/cadvisor endpoint. |
HTTP_AGENT | kube.cadvisor.metrics |
Kubernetes | Kubernetes: Get pods | Collecting raw Kubelet metrics from /pods endpoint. |
HTTP_AGENT | kube.pods |
Kubernetes | Kubernetes: Pods running | The number of running pods. |
DEPENDENT | kube.kubelet.pods.running Preprocessing: - JSONPATH: |
Kubernetes | Kubernetes: Containers running | The number of running containers. |
DEPENDENT | kube.kubelet.containers.running Preprocessing: - JSONPATH: |
Kubernetes | Kubernetes: Containers last state terminated | The number of containers that were previously terminated. |
DEPENDENT | kube.kublet.containers.terminated Preprocessing: - JSONPATH: |
Kubernetes | Kubernetes: Containers restarts | The number of times the container has been restarted. |
DEPENDENT | kube.kubelet.containers.restarts Preprocessing: - JSONPATH: |
Kubernetes | Kubernetes: CPU cores, total | The number of cores in this machine (available until kubernetes v1.18). |
DEPENDENT | kube.kubelet.cpu.cores Preprocessing: - PROMETHEUS_PATTERN: |
Kubernetes | Kubernetes: Machine memory, bytes | Resident memory size in bytes. |
DEPENDENT | kube.kubelet.machine.memory Preprocessing: - PROMETHEUS_PATTERN: |
Kubernetes | Kubernetes: Virtual memory, bytes | Virtual memory size in bytes. |
DEPENDENT | kube.kubelet.virtual.memory Preprocessing: - PROMETHEUS_PATTERN: |
Kubernetes | Kubernetes: File descriptors, max | Maximum number of open file descriptors. |
DEPENDENT | kube.kubelet.processmaxfds Preprocessing: - PROMETHEUS_PATTERN: |
Kubernetes | Kubernetes: File descriptors, open | Number of open file descriptors. |
DEPENDENT | kube.kubelet.processopenfds Preprocessing: - PROMETHEUS_PATTERN: |
Kubernetes | Kubernetes: [{#OP_TYPE}] Runtime operations bucket: {#LE} | Duration in seconds of runtime operations. Broken down by operation type. |
DEPENDENT | kube.kublet.runtimeopsdurationsecondsbucket[{#LE},"{#OPTYPE}"] Preprocessing: - PROMETHEUS PATTERN:kubelet_runtime_operations_duration_seconds_bucket{le="{#LE}",operation_type="{#OP_TYPE}"} : function : sum |
Kubernetes | Kubernetes: [{#OP_TYPE}] Runtime operations total, rate | Cumulative number of runtime operations by operation type. |
DEPENDENT | kube.kublet.runtimeopstotal.rate["{#OPTYPE}"] Preprocessing: - PROMETHEUS PATTERN:kubelet_runtime_operations_total{operation_type="{#OP_TYPE}"} ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes | Kubernetes: [{#OP_TYPE}] Operations, p90 | 90 percentile of operation latency distribution in seconds for each verb. |
CALCULATED | kube.kublet.runtimeopsdurationsecondsp90["{#OP_TYPE}"] Expression: bucket_percentile(//kube.kublet.runtime_ops_duration_seconds_bucket[*,"{#OP_TYPE}"],5m,90) |
Kubernetes | Kubernetes: [{#OP_TYPE}] Operations, p95 | 95 percentile of operation latency distribution in seconds for each verb. |
CALCULATED | kube.kublet.runtimeopsdurationsecondsp95["{#OP_TYPE}"] Expression: bucket_percentile(//kube.kublet.runtime_ops_duration_seconds_bucket[*,"{#OP_TYPE}"],5m,95) |
Kubernetes | Kubernetes: [{#OP_TYPE}] Operations, p99 | 99 percentile of operation latency distribution in seconds for each verb. |
CALCULATED | kube.kublet.runtimeopsdurationsecondsp99["{#OP_TYPE}"] Expression: bucket_percentile(//kube.kublet.runtime_ops_duration_seconds_bucket[*,"{#OP_TYPE}"],5m,99) |
Kubernetes | Kubernetes: [{#OP_TYPE}] Operations, p50 | 50 percentile of operation latency distribution in seconds for each verb. |
CALCULATED | kube.kublet.runtimeopsdurationsecondsp50["{#OP_TYPE}"] Expression: bucket_percentile(//kube.kublet.runtime_ops_duration_seconds_bucket[*,"{#OP_TYPE}"],5m,50) |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}] CPU: Load average, 10s | Pod's CPU load average over the last 10 seconds. |
DEPENDENT | kube.pod.containercpuloadaverage10s[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}] CPU: System seconds, total | The number of cores used for system time. |
DEPENDENT | kube.pod.containercpusystemsecondstotal[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#NAME}] CPU: User seconds, total | The number of cores used for user time. |
DEPENDENT | kube.pod.containercpuusersecondstotal[{#NAMESPACE}/{#NAME}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes | Kubernetes: Host [{#HOST}] Request method [{#METHOD}] Code:[{#CODE}] | Number of HTTP requests, partitioned by status code, method, and host. |
DEPENDENT | kube.kubelet.rest.requests["{#CODE}", "{#HOST}", "{#METHOD}"] Preprocessing: - PROMETHEUSPATTERN: - DISCARDUNCHANGED_HEARTBEAT: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#POD}] Container [{#CONTAINER}]: Memory page cache | Number of bytes of page cache memory. |
DEPENDENT | kube.kubelet.container.memory.cache["{#CONTAINER}", "{#NAMESPACE}", "{#POD}"] Preprocessing: - PROMETHEUSPATTERN: - DISCARDUNCHANGED_HEARTBEAT: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#POD}] Container [{#CONTAINER}]: Memory max usage | Maximum memory usage recorded in bytes. |
DEPENDENT | kube.kubelet.container.memory.maxusage["{#CONTAINER}", "{#NAMESPACE}", "{#POD}"] Preprocessing: - PROMETHEUS PATTERN:container_memory_max_usage_bytes{container="{#CONTAINER}", namespace="{#NAMESPACE}", pod="{#POD}"} - DISCARDUNCHANGEDHEARTBEAT: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#POD}] Container [{#CONTAINER}]: RSS | Size of RSS in bytes. |
DEPENDENT | kube.kubelet.container.memory.rss["{#CONTAINER}", "{#NAMESPACE}", "{#POD}"] Preprocessing: - PROMETHEUSPATTERN: - DISCARDUNCHANGED_HEARTBEAT: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#POD}] Container [{#CONTAINER}]: Swap | Container swap usage in bytes. |
DEPENDENT | kube.kubelet.container.memory.swap["{#CONTAINER}", "{#NAMESPACE}", "{#POD}"] Preprocessing: - PROMETHEUSPATTERN: - DISCARDUNCHANGED_HEARTBEAT: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#POD}] Container [{#CONTAINER}]: Usage | Current memory usage in bytes, including all memory regardless of when it was accessed. |
DEPENDENT | kube.kubelet.container.memory.usage["{#CONTAINER}", "{#NAMESPACE}", "{#POD}"] Preprocessing: - PROMETHEUSPATTERN: - DISCARDUNCHANGED_HEARTBEAT: |
Kubernetes | Kubernetes: Namespace [{#NAMESPACE}] Pod [{#POD}] Container [{#CONTAINER}]: Working set | Current working set in bytes. |
DEPENDENT | kube.kubelet.container.memory.workingset["{#CONTAINER}", "{#NAMESPACE}", "{#POD}"] Preprocessing: - PROMETHEUS PATTERN:container_memory_working_set_bytes{container="{#CONTAINER}", namespace="{#NAMESPACE}", pod="{#POD}"} - DISCARDUNCHANGEDHEARTBEAT: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|
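Several of the rate items in this template (for example, the per-operation-type "Runtime operations total, rate") derive their value by applying the CHANGE_PER_SECOND step to a monotonically growing counter. A minimal sketch of that calculation over two consecutive samples (Zabbix computes this internally, and its handling of counter resets may differ):

    // CHANGE_PER_SECOND over a monotonic counter: the delta of the value
    // divided by the delta of the collection timestamps, in seconds.
    function changePerSecond(prev, curr) {
        var dt = curr.clock - prev.clock;
        if (dt <= 0 || curr.value < prev.value) {
            return null; // counter reset or bad timestamps: skip this cycle
        }
        return (curr.value - prev.value) / dt;
    }

    // kubelet_runtime_operations_total sampled 60 seconds apart:
    changePerSecond({clock: 1700000000, value: 15000},
                    {clock: 1700000060, value: 15360}); // -> 6 operations/s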
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor Kubernetes Controller manager by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template Kubernetes Controller manager by HTTP
— collects metrics by HTTP agent from Controller manager /metrics endpoint.
This template was tested on:
See Zabbix template operation for basic instructions.
Internal service metrics are collected from the /metrics endpoint. The template requires authorization via an API token.
Don't forget to change the macros {$KUBE.CONTROLLER.SERVER.URL} and {$KUBE.API.TOKEN}. Also, see the Macros section for a list of macros used to set trigger values. NOTE: some metrics may not be collected depending on your Kubernetes Controller manager instance version and configuration.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$KUBE.API.TOKEN} | API Authorization Token |
`` |
{$KUBE.CONTROLLER.HTTP.CLIENT.ERROR} | Maximum number of HTTP client requests failures used for trigger |
2 |
{$KUBE.CONTROLLER.SERVER.URL} | Instance URL |
http://localhost:10252/metrics |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Workqueue metrics discovery | DEPENDENT | kubernetes.controller.workqueue.discovery Preprocessing: - PROMETHEUSTOJSON: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: Overrides: bucket item total item |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Kubernetes Controller | Kubernetes Controller Manager: Leader election status | Gauge indicating whether the reporting system is the master of the relevant lease: 0 indicates backup, 1 indicates master. |
DEPENDENT | kubernetes.controller.leaderelectionmasterstatus Preprocessing: - PROMETHEUS PATTERN:leader_election_master_status ⛔️ON_FAIL: |
Kubernetes Controller | Kubernetes Controller Manager: Virtual memory, bytes | Virtual memory size in bytes. |
DEPENDENT | kubernetes.controller.processvirtualmemorybytes Preprocessing: - PROMETHEUS PATTERN:process_virtual_memory_bytes ⛔️ON_FAIL: |
Kubernetes Controller | Kubernetes Controller Manager: Resident memory, bytes | Resident memory size in bytes. |
DEPENDENT | kubernetes.controller.processresidentmemorybytes Preprocessing: - PROMETHEUS PATTERN:process_resident_memory_bytes ⛔️ON_FAIL: |
Kubernetes Controller | Kubernetes Controller Manager: CPU | Total user and system CPU usage ratio. |
DEPENDENT | kubernetes.controller.cpu.util Preprocessing: - PROMETHEUSPATTERN: - CHANGEPER_SECOND - MULTIPLIER: |
Kubernetes Controller | Kubernetes Controller Manager: Goroutines | Number of goroutines that currently exist. |
DEPENDENT | kubernetes.controller.gogoroutines Preprocessing: - PROMETHEUS PATTERN:go_goroutines : function : sum ⛔️ON_FAIL: |
Kubernetes Controller | Kubernetes Controller Manager: Go threads | Number of OS threads created. |
DEPENDENT | kubernetes.controller.gothreads Preprocessing: - PROMETHEUS PATTERN:go_threads ⛔️ON_FAIL: |
Kubernetes Controller | Kubernetes Controller Manager: Fds open | Number of open file descriptors. |
DEPENDENT | kubernetes.controller.openfds Preprocessing: - PROMETHEUS PATTERN:process_open_fds ⛔️ON_FAIL: |
Kubernetes Controller | Kubernetes Controller Manager: Fds max | Maximum allowed open file descriptors. |
DEPENDENT | kubernetes.controller.maxfds Preprocessing: - PROMETHEUS PATTERN:process_max_fds ⛔️ON_FAIL: |
Kubernetes Controller | Kubernetes Controller Manager: REST Client requests: 2xx, rate | Number of HTTP requests with 2xx status code per second. |
DEPENDENT | kubernetes.controller.clienthttprequests200.rate Preprocessing: - PROMETHEUS PATTERN:rest_client_requests_total{code =~ "2.."} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes Controller | Kubernetes Controller Manager: REST Client requests: 3xx, rate | Number of HTTP requests with 3xx status code per second. |
DEPENDENT | kubernetes.controller.clienthttprequests300.rate Preprocessing: - PROMETHEUS PATTERN:rest_client_requests_total{code =~ "3.."} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes Controller | Kubernetes Controller Manager: REST Client requests: 4xx, rate | Number of HTTP requests with 4xx status code per second. |
DEPENDENT | kubernetes.controller.clienthttprequests400.rate Preprocessing: - PROMETHEUS PATTERN:rest_client_requests_total{code =~ "4.."} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes Controller | Kubernetes Controller Manager: REST Client requests: 5xx, rate | Number of HTTP requests with 5xx status code per second. |
DEPENDENT | kubernetes.controller.clienthttprequests500.rate Preprocessing: - PROMETHEUS PATTERN:rest_client_requests_total{code =~ "5.."} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue adds total, rate | Total number of adds handled by workqueue per second. |
DEPENDENT | kubernetes.controller.workqueueaddstotal["{#NAME}"] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue depth | Current depth of workqueue. |
DEPENDENT | kubernetes.controller.workqueuedepth["{#NAME}"] Preprocessing: - PROMETHEUS PATTERN:workqueue_depth{name = "{#NAME}"} ⛔️ON_FAIL: |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue unfinished work, sec | How many seconds of work has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases. |
DEPENDENT | kubernetes.controller.workqueueunfinishedworkseconds["{#NAME}"] Preprocessing: - PROMETHEUS PATTERN:workqueue_unfinished_work_seconds{name = "{#NAME}"} ⛔️ON_FAIL: |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue retries, rate | Total number of retries handled by workqueue per second. |
DEPENDENT | kubernetes.controller.workqueueretriestotal["{#NAME}"] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue longest running processor, sec | How many seconds has the longest running processor for workqueue been running. |
DEPENDENT | kubernetes.controller.workqueuelongestrunningprocessorseconds["{#NAME}"] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue work duration, p90 | 90 percentile of how long in seconds processing an item from workqueue takes, by queue. |
CALCULATED | kubernetes.controller.workqueueworkdurationsecondsp90["{#NAME}"] Expression: bucket_percentile(//kubernetes.controller.duration_seconds_bucket[*,"{#NAME}"],5m,90) |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue work duration, p95 | 95 percentile of how long in seconds processing an item from workqueue takes, by queue. |
CALCULATED | kubernetes.controller.workqueueworkdurationsecondsp95["{#NAME}"] Expression: bucket_percentile(//kubernetes.controller.duration_seconds_bucket[*,"{#NAME}"],5m,95) |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue work duration, p99 | 99 percentile of how long in seconds processing an item from workqueue takes, by queue. |
CALCULATED | kubernetes.controller.workqueueworkdurationsecondsp99["{#NAME}"] Expression: bucket_percentile(//kubernetes.controller.duration_seconds_bucket[*,"{#NAME}"],5m,99) |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue work duration, 50p | 50 percentiles of how long in seconds processing an item from workqueue takes, by queue. |
CALCULATED | kubernetes.controller.workqueueworkdurationsecondsp50["{#NAME}"] Expression: bucket_percentile(//kubernetes.controller.duration_seconds_bucket[*,"{#NAME}"],5m,50) |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue queue duration, p90 | 90 percentile of how long in seconds an item stays in workqueue before being requested, by queue. |
CALCULATED | kubernetes.controller.workqueuequeuedurationsecondsp90["{#NAME}"] Expression: bucket_percentile(//kubernetes.controller.queue_duration_seconds_bucket[*,"{#NAME}"],5m,90) |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue queue duration, p95 | 95 percentile of how long in seconds an item stays in workqueue before being requested, by queue. |
CALCULATED | kubernetes.controller.workqueuequeuedurationsecondsp95["{#NAME}"] Expression: bucket_percentile(//kubernetes.controller.queue_duration_seconds_bucket[*,"{#NAME}"],5m,95) |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue queue duration, p99 | 99 percentile of how long in seconds an item stays in workqueue before being requested, by queue. |
CALCULATED | kubernetes.controller.workqueuequeuedurationsecondsp99["{#NAME}"] Expression: bucket_percentile(//kubernetes.controller.queue_duration_seconds_bucket[*,"{#NAME}"],5m,99) |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue queue duration, 50p | 50 percentile of how long in seconds an item stays in workqueue before being requested. If there are no requests for 5 minute, item value will be discarded. |
CALCULATED | kubernetes.controller.workqueuequeuedurationsecondsp50["{#NAME}"] Preprocessing: - CHECKNOTSUPPORTED ⛔️ON_FAIL: Expression: bucket_percentile(//kubernetes.controller.queue_duration_seconds_bucket[*,"{#NAME}"],5m,50) |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Workqueue duration seconds bucket, {#LE} | How long in seconds processing an item from workqueue takes. |
DEPENDENT | kubernetes.controller.durationsecondsbucket[{#LE},"{#NAME}"] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes Controller | Kubernetes Controller Manager: ["{#NAME}"]: Queue duration seconds bucket, {#LE} | How long in seconds an item stays in workqueue before being requested. |
DEPENDENT | kubernetes.controller.queuedurationsecondsbucket[{#LE},"{#NAME}"] Preprocessing: - PROMETHEUS PATTERN:workqueue_queue_duration_seconds_bucket{name = "{#NAME}",le = "{#LE}"} ⛔️ON_FAIL: |
Zabbix raw items | Kubernetes Controller: Get Controller metrics | Get raw metrics from Controller instance /metrics endpoint. |
HTTP_AGENT | kubernetes.controller.getmetrics Preprocessing: - CHECK NOTSUPPORTED⛔️ON FAIL:DISCARD_VALUE -> |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Kubernetes Controller Manager: Too many HTTP client errors | Kubernetes Controller Manager is experiencing a high error rate (with 5xx HTTP code). |
min(/Kubernetes Controller manager by HTTP/kubernetes.controller.client_http_requests_500.rate,5m)>{$KUBE.CONTROLLER.HTTP.CLIENT.ERROR} |
WARNING |
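The workqueue duration percentile items above are calculated from the raw `*_bucket` items with `bucket_percentile()`. A rough sketch of the underlying math (linear interpolation across cumulative histogram buckets, in the spirit of Prometheus histogram quantiles; the sample buckets are invented):

```python
def bucket_percentile(buckets, q):
    """buckets: [(le_upper_bound, cumulative_count), ...] sorted by bound; q in 0..100."""
    total = buckets[-1][1]
    rank = q / 100.0 * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if bound == float("inf") or count == prev_count:
                return prev_bound  # fall back to the highest finite boundary
            # Linear interpolation inside the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return prev_bound

# Invented cumulative counts for le = 0.1, 0.5, 1.0, +Inf:
print(bucket_percentile([(0.1, 80), (0.5, 95), (1.0, 99), (float("inf"), 100)], 90))
# -> ~0.367: p90 falls inside the (0.1, 0.5] bucket
```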
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor Kubernetes API server by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template Kubernetes API server by HTTP
— collects metrics by HTTP agent from API server /metrics endpoint.
This template was tested on:
See Zabbix template operation for basic instructions.
Internal service metrics are collected from the /metrics endpoint. The template requires authorization via an API token.
Don't forget to change the macros {$KUBE.API.SERVER.URL} and {$KUBE.API.TOKEN}. Also, see the Macros section for a list of macros used to set trigger values. NOTE: Some metrics may not be collected depending on your Kubernetes API server version and configuration.
No specific Zabbix configuration is required.
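Many of the rate items below aggregate several Prometheus series, e.g. summing rest_client_requests_total over every series whose code label matches "5..". A rough Python equivalent of that Prometheus-pattern-plus-sum preprocessing step (the exposition lines are invented samples):

```python
import re

TEXT = """\
rest_client_requests_total{code="200",method="GET"} 1024
rest_client_requests_total{code="201",method="POST"} 42
rest_client_requests_total{code="500",method="GET"} 3
rest_client_requests_total{code="503",method="PUT"} 1
"""

def sum_by_code(text, name, code_re):
    """Sum all samples of `name` whose code label matches the regex `code_re`."""
    pattern = re.compile(rf'^{name}\{{[^}}]*code="{code_re}"[^}}]*\}} (\S+)', re.M)
    return sum(float(m.group(1)) for m in pattern.finditer(text))

print(sum_by_code(TEXT, "rest_client_requests_total", "5.."))  # -> 4.0
```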
Name | Description | Default |
---|---|---|
{$KUBE.API.CERT.EXPIRATION} | Number of days until client certificate expiration used for the trigger |
7 |
{$KUBE.API.HTTP.CLIENT.ERROR} | Maximum number of HTTP client request failures used for the trigger |
2 |
{$KUBE.API.HTTP.SERVER.ERROR} | Maximum number of HTTP server request failures used for the trigger |
2 |
{$KUBE.API.SERVER.URL} | Instance URL |
https://localhost:6443/metrics |
{$KUBE.API.TOKEN} | API Authorization Token |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Authentication attempts discovery | Discovery of authentication attempts by result. |
DEPENDENT | kubernetes.api.authenticationattempts.discovery Preprocessing: - PROMETHEUS TOJSON:authentication_attempts{result =~ ".*"} - JAVASCRIPT: - DISCARD UNCHANGED_HEARTBEAT:3h |
Authentication requests discovery | Discovery of authenticated user requests by name. |
DEPENDENT | kubernetes.api.authenticateduserrequests.discovery Preprocessing: - PROMETHEUSTOJSON: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
Client certificate expiration histogram | Discovery of raw data of client certificate expiration. |
DEPENDENT | kubernetes.api.certificateexpiration.discovery Preprocessing: - PROMETHEUS TOJSON:{__name__=~ "apiserver_client_certificate_expiration_seconds_*"} - JAVASCRIPT: - DISCARD UNCHANGEDHEARTBEAT:3h Overrides: bucket item buckets - ITEMPROTOTYPE LIKE bucket - DISCOVERtotal item totals - ITEMPROTOTYPE NOTLIKE bucket - DISCOVER |
Etcd objects metrics discovery | Discovery of etcd objects by resource. |
DEPENDENT | kubernetes.api.etcdobjectcounts.discovery Preprocessing: - PROMETHEUSTOJSON: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
gRPC completed requests discovery | Discovery of gRPC completed requests by gRPC code. |
DEPENDENT | kubernetes.api.grpcclienthandled.discovery Preprocessing: - PROMETHEUSTOJSON: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
Long-running requests | Discovery of long-running requests by verb, resource and scope. |
DEPENDENT | kubernetes.api.longrunninggauge.discovery Preprocessing: - PROMETHEUS TOJSON:apiserver_longrunning_gauge{resource =~ ".*", scope =~ ".*", verb =~ ".*"} - JAVASCRIPT: - DISCARD UNCHANGED_HEARTBEAT:3h |
Request duration histogram | Discovery of raw data and percentile items of request duration. |
DEPENDENT | kubernetes.api.requestsbucket.discovery Preprocessing: - PROMETHEUS TOJSON:{__name__=~ "apiserver_request_duration_*", verb =~ ".*"} - JAVASCRIPT: - DISCARD UNCHANGEDHEARTBEAT:3h Overrides: bucket item buckets - ITEMPROTOTYPE LIKE bucket - DISCOVERtotal item totals - ITEMPROTOTYPE NOTLIKE bucket - DISCOVER |
Requests inflight discovery | Discovery of inflight requests by kind. |
DEPENDENT | kubernetes.api.inflightrequests.discovery Preprocessing: - PROMETHEUS TOJSON:apiserver_current_inflight_requests{request_kind =~ ".*"} - JAVASCRIPT: - DISCARD UNCHANGED_HEARTBEAT:3h |
Watchers metrics discovery | Discovery of watchers by kind. |
DEPENDENT | kubernetes.api.apiserverregisteredwatchers.discovery Preprocessing: - PROMETHEUSTOJSON: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
Workqueue metrics discovery | Discovery of workqueue metrics by name. |
DEPENDENT | kubernetes.api.workqueue.discovery Preprocessing: - PROMETHEUSTOJSON: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Kubernetes API | Kubernetes API: Audit events, total | Accumulated number of audit events generated and sent to the audit backend. |
DEPENDENT | kubernetes.api.auditeventtotal Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes API | Kubernetes API: Virtual memory, bytes | Virtual memory size in bytes. |
DEPENDENT | kubernetes.api.processvirtualmemorybytes Preprocessing: - PROMETHEUS PATTERN:process_virtual_memory_bytes ⛔️ON_FAIL: |
Kubernetes API | Kubernetes API: Resident memory, bytes | Resident memory size in bytes. |
DEPENDENT | kubernetes.api.processresidentmemorybytes Preprocessing: - PROMETHEUS PATTERN:process_resident_memory_bytes ⛔️ON_FAIL: |
Kubernetes API | Kubernetes API: CPU | Total user and system CPU usage ratio. |
DEPENDENT | kubernetes.api.cpu.util Preprocessing: - PROMETHEUSPATTERN: - CHANGEPER_SECOND - MULTIPLIER: |
Kubernetes API | Kubernetes API: Goroutines | Number of goroutines that currently exist. |
DEPENDENT | kubernetes.api.gogoroutines Preprocessing: - PROMETHEUS PATTERN:go_goroutines : function : sum ⛔️ON_FAIL: |
Kubernetes API | Kubernetes API: Go threads | Number of OS threads created. |
DEPENDENT | kubernetes.api.gothreads Preprocessing: - PROMETHEUS PATTERN:go_threads ⛔️ON_FAIL: |
Kubernetes API | Kubernetes API: Fds open | Number of open file descriptors. |
DEPENDENT | kubernetes.api.openfds Preprocessing: - PROMETHEUS PATTERN:process_open_fds ⛔️ON_FAIL: |
Kubernetes API | Kubernetes API: Fds max | Maximum allowed open file descriptors. |
DEPENDENT | kubernetes.api.maxfds Preprocessing: - PROMETHEUS PATTERN:process_max_fds ⛔️ON_FAIL: |
Kubernetes API | Kubernetes API: gRPCs client started, rate | Total number of RPCs started per second. |
DEPENDENT | kubernetes.api.grpcclientstarted.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes API | Kubernetes API: gRPCs messages received, rate | Total number of gRPC stream messages received per second. |
DEPENDENT | kubernetes.api.grpcclientmsgreceived.rate Preprocessing: - PROMETHEUS PATTERN:grpc_client_msg_received_total : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes API | Kubernetes API: gRPCs messages sent, rate | Total number of gRPC stream messages sent per second. |
DEPENDENT | kubernetes.api.grpcclientmsgsent.rate Preprocessing: - PROMETHEUS PATTERN:grpc_client_msg_sent_total : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes API | Kubernetes API: Request terminations, rate | Number of requests which apiserver terminated in self-defense per second. |
DEPENDENT | kubernetes.api.apiserverrequestterminations Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes API | Kubernetes API: TLS handshake errors, rate | Number of requests dropped with 'TLS handshake error from' error per second. |
DEPENDENT | kubernetes.api.apiservertlshandshakeerrorstotal.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes API | Kubernetes API: API server requests: 5xx, rate | Counter of apiserver requests broken out for each HTTP response code. |
DEPENDENT | kubernetes.api.apiserverrequesttotal500.rate Preprocessing: - PROMETHEUS PATTERN:apiserver_request_total{code =~ "5.."} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes API | Kubernetes API: API server requests: 4xx, rate | Counter of apiserver requests broken out for each HTTP response code. |
DEPENDENT | kubernetes.api.apiserverrequesttotal400.rate Preprocessing: - PROMETHEUS PATTERN:apiserver_request_total{code =~ "4.."} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes API | Kubernetes API: API server requests: 3xx, rate | Counter of apiserver requests broken out for each HTTP response code. |
DEPENDENT | kubernetes.api.apiserverrequesttotal300.rate Preprocessing: - PROMETHEUS PATTERN:apiserver_request_total{code =~ "3.."} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes API | Kubernetes API: API server requests: 0, rate | Counter of apiserver requests broken out for each HTTP response code. |
DEPENDENT | kubernetes.api.apiserverrequesttotal0.rate Preprocessing: - PROMETHEUS PATTERN:apiserver_request_total{code = "0"} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes API | Kubernetes API: API server requests: 2xx, rate | Counter of apiserver requests broken out for each HTTP response code. |
DEPENDENT | kubernetes.api.apiserverrequesttotal200.rate Preprocessing: - PROMETHEUS PATTERN:apiserver_request_total{code =~ "2.."} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes API | Kubernetes API: HTTP requests: 5xx, rate | Number of HTTP requests with 5xx status code per second. |
DEPENDENT | kubernetes.api.restclientrequeststotal500.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes API | Kubernetes API: HTTP requests: 4xx, rate | Number of HTTP requests with 4xx status code per second. |
DEPENDENT | kubernetes.api.restclientrequeststotal400.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes API | Kubernetes API: HTTP requests: 3xx, rate | Number of HTTP requests with 3xx status code per second. |
DEPENDENT | kubernetes.api.restclientrequeststotal300.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes API | Kubernetes API: HTTP requests: 2xx, rate | Number of HTTP requests with 2xx status code per second. |
DEPENDENT | kubernetes.api.restclientrequeststotal200.rate Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes API | Kubernetes API: Long-running ["{#VERB}"] requests ["{#RESOURCE}"]: {#SCOPE} | Gauge of all active long-running apiserver requests broken out by verb, resource and scope. Not all requests are tracked this way. |
DEPENDENT | kubernetes.api.longrunninggauge["{#RESOURCE}","{#SCOPE}","{#VERB}"] Preprocessing: - PROMETHEUS PATTERN:apiserver_longrunning_gauge{resource = "{#RESOURCE}", scope = "{#SCOPE}", verb = "{#VERB}"} : function : sum ⛔️ON_FAIL: |
Kubernetes API | Kubernetes API: ["{#VERB}"] Requests bucket: {#LE} | Response latency distribution in seconds for each verb. |
DEPENDENT | kubernetes.api.requestdurationsecondsbucket[{#LE},"{#VERB}"] Preprocessing: - PROMETHEUS PATTERN:apiserver_request_duration_seconds_bucket{le="{#LE}",verb="{#VERB}"} : function : sum |
Kubernetes API | Kubernetes API: ["{#VERB}"] Requests, p90 | 90 percentile of response latency distribution in seconds for each verb. |
CALCULATED | kubernetes.api.requestdurationseconds_p90["{#VERB}"] Expression: bucket_percentile(//kubernetes.api.request_duration_seconds_bucket[*,"{#VERB}"],5m,90) |
Kubernetes API | Kubernetes API: ["{#VERB}"] Requests, p95 | 95 percentile of response latency distribution in seconds for each verb. |
CALCULATED | kubernetes.api.requestdurationseconds_p95["{#VERB}"] Expression: bucket_percentile(//kubernetes.api.request_duration_seconds_bucket[*,"{#VERB}"],5m,95) |
Kubernetes API | Kubernetes API: ["{#VERB}"] Requests, p99 | 99 percentile of response latency distribution in seconds for each verb. |
CALCULATED | kubernetes.api.requestdurationseconds_p99["{#VERB}"] Expression: bucket_percentile(//kubernetes.api.request_duration_seconds_bucket[*,"{#VERB}"],5m,99) |
Kubernetes API | Kubernetes API: ["{#VERB}"] Requests, p50 | 50 percentile of response latency distribution in seconds for each verb. |
CALCULATED | kubernetes.api.requestdurationseconds_p50["{#VERB}"] Expression: bucket_percentile(//kubernetes.api.request_duration_seconds_bucket[*,"{#VERB}"],5m,50) |
Kubernetes API | Kubernetes API: Requests current: {#KIND} | Maximal number of currently used inflight request limit of this apiserver per request kind in last second. |
DEPENDENT | kubernetes.api.currentinflightrequests["{#KIND}"] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes API | Kubernetes API: gRPCs completed: {#GRPC_CODE}, rate | Total number of RPCs completed by the client regardless of success or failure per second. |
DEPENDENT | kubernetes.api.grpcclienthandledtotal.rate["{#GRPCCODE}"] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes API | Kubernetes API: Authentication attempts: {#RESULT}, rate | Authentication attempts by result per second. |
DEPENDENT | kubernetes.api.authenticationattempts.rate["{#RESULT}"] Preprocessing: - PROMETHEUS PATTERN:authentication_attempts{result = "{#RESULT}"} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Kubernetes API | Kubernetes API: Authenticated requests: {#NAME}, rate | Counter of authenticated requests broken out by username per second. |
DEPENDENT | kubernetes.api.authenticateduserrequests.rate["{#NAME}"] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes API | Kubernetes API: Watchers: {#KIND} | Number of currently registered watchers for a given resource. |
DEPENDENT | kubernetes.api.apiserverregisteredwatchers["{#KIND}"] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes API | Kubernetes API: etcd objects: {#RESOURCE} | Number of stored objects at the time of last check split by kind. |
DEPENDENT | kubernetes.api.etcdobjectcounts["{#RESOURCE}"] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes API | Kubernetes API: ["{#NAME}"] Workqueue depth | Current depth of workqueue. |
DEPENDENT | kubernetes.api.workqueuedepth["{#NAME}"] Preprocessing: - PROMETHEUS PATTERN:workqueue_depth{name = "{#NAME}"} ⛔️ON_FAIL: |
Kubernetes API | Kubernetes API: ["{#NAME}"] Workqueue adds total, rate | Total number of adds handled by workqueue per second. |
DEPENDENT | kubernetes.api.workqueueaddstotal.rate["{#NAME}"] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Kubernetes API | Kubernetes API: Certificate expiration seconds bucket, {#LE} | Distribution of the remaining lifetime on the certificate used to authenticate a request. |
DEPENDENT | kubernetes.api.clientcertificateexpirationsecondsbucket[{#LE}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Kubernetes API | Kubernetes API: Client certificate expiration, p1 | 1st percentile of the remaining lifetime on the certificate used to authenticate a request. |
CALCULATED | kubernetes.api.clientcertificateexpiration_p1[{#SINGLETON}] Expression: bucket_percentile(//kubernetes.api.client_certificate_expiration_seconds_bucket[*],5m,1) |
Zabbix raw items | Kubernetes API: Get API instance metrics | Get raw metrics from API instance /metrics endpoint. |
HTTP_AGENT | kubernetes.api.getmetrics Preprocessing: - CHECK NOTSUPPORTED⛔️ON FAIL:DISCARD_VALUE -> |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Kubernetes API: Too many server errors | Kubernetes API server is experiencing a high error rate (with 5xx HTTP code). |
min(/Kubernetes API server by HTTP/kubernetes.api.apiserver_request_total_500.rate,5m)>{$KUBE.API.HTTP.SERVER.ERROR} |
WARNING | |
Kubernetes API: Too many client errors | Kubernetes API client is experiencing a high error rate (with 5xx HTTP code). |
min(/Kubernetes API server by HTTP/kubernetes.api.rest_client_requests_total_500.rate,5m)>{$KUBE.API.HTTP.CLIENT.ERROR} |
WARNING | |
Kubernetes API: Kubernetes client certificate is expiring | A client certificate used to authenticate to the apiserver is expiring in less than {$KUBE.API.CERT.EXPIRATION} days (see the sketch after this table). |
last(/Kubernetes API server by HTTP/kubernetes.api.client_certificate_expiration_p1[{#SINGLETON}]) > 0 and last(/Kubernetes API server by HTTP/kubernetes.api.client_certificate_expiration_p1[{#SINGLETON}]) < {$KUBE.API.CERT.EXPIRATION}*24*60*60 |
WARNING | Depends on: - Kubernetes API: Kubernetes client certificate expires soon |
Kubernetes API: Kubernetes client certificate expires soon | A client certificate used to authenticate to the apiserver is expiring in less than 24 hours. |
last(/Kubernetes API server by HTTP/kubernetes.api.client_certificate_expiration_p1[{#SINGLETON}]) > 0 and last(/Kubernetes API server by HTTP/kubernetes.api.client_certificate_expiration_p1[{#SINGLETON}]) < 24*60*60 |
WARNING |
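For clarity, the two certificate triggers above compare the p1 item (the 1st percentile of remaining certificate lifetime, in seconds) against hour- and day-sized thresholds, with the longer-horizon trigger suppressed while the 24-hour one is active. A hypothetical restatement of that logic:

```python
def cert_alert(p1_seconds, warn_days=7.0):
    """warn_days stands in for {$KUBE.API.CERT.EXPIRATION}; returns the trigger that fires."""
    if 0 < p1_seconds < 24 * 60 * 60:
        return "client certificate expires soon (less than 24 hours)"
    if 0 < p1_seconds < warn_days * 24 * 60 * 60:  # suppressed if the check above fired
        return f"client certificate is expiring (less than {warn_days:g} days)"
    return None

print(cert_alert(3 * 24 * 60 * 60))  # -> expiring in less than 7 days
```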
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
Official JMX Template for Apache Kafka.
This template was tested on:
See Zabbix template operation for basic instructions.
Metrics are collected by JMX.
No specific Zabbix configuration is required.
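Most of the counters in this template (bytes in/out, messages in, ISR expands/shrinks, and so on) are read as monotonically growing JMX Count attributes and converted to rates with the change-per-second preprocessing step. The arithmetic behind that step, roughly (sample numbers invented):

```python
def change_per_second(prev_value, prev_clock, value, clock):
    """Per-second rate between two successive samples of a growing counter."""
    if clock <= prev_clock or value < prev_value:
        return None  # no elapsed time, or a counter reset: no rate for this poll
    return (value - prev_value) / (clock - prev_clock)

# BytesInPerSec "Count" sampled one minute apart (numbers invented):
print(change_per_second(1_500_000, 0, 1_800_000, 60))  # -> 5000.0 bytes/s
```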
Name | Description | Default |
---|---|---|
{$KAFKA.NETPROCAVG_IDLE.MIN.WARN} | The minimum Network processor average idle percent for trigger expression. |
30 |
{$KAFKA.PASSWORD} | - |
zabbix |
{$KAFKA.REQUESTHANDLERAVG_IDLE.MIN.WARN} | The minimum Request handler average idle percent for trigger expression. |
30 |
{$KAFKA.TOPIC.MATCHES} | Filter of discoverable topics |
.* |
{$KAFKA.TOPIC.NOT_MATCHES} | Filter to exclude discovered topics |
__consumer_offsets |
{$KAFKA.USER} | - |
zabbix |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Topic Metrics (errors) | - |
JMX | jmx.discovery[beans,"kafka.server:type=BrokerTopicMetrics,name=BytesRejectedPerSec,topic=*"] Filter: AND- {#JMXTOPIC} MATCHESREGEX - {#JMXTOPIC} NOTMATCHES_REGEX |
Topic Metrics (read) | - |
JMX | jmx.discovery[beans,"kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic=*"] Filter: AND- {#JMXTOPIC} MATCHESREGEX - {#JMXTOPIC} NOTMATCHES_REGEX |
Topic Metrics (write) | - |
JMX | jmx.discovery[beans,"kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=*"] Filter: AND- {#JMXTOPIC} MATCHESREGEX - {#JMXTOPIC} NOTMATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Kafka | Kafka: Leader election per second | Number of leader elections per second. |
JMX | jmx["kafka.controller:type=ControllerStats,name=LeaderElectionRateAndTimeMs","Count"] |
Kafka | Kafka: Unclean leader election per second | Number of “unclean” elections per second. |
JMX | jmx["kafka.controller:type=ControllerStats,name=UncleanLeaderElectionsPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka: Controller state on broker | One indicates that the broker is the controller for the cluster. |
JMX | jmx["kafka.controller:type=KafkaController,name=ActiveControllerCount","Value"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Kafka | Kafka: Ineligible pending replica deletes | The number of ineligible pending replica deletes. |
JMX | jmx["kafka.controller:type=KafkaController,name=ReplicasIneligibleToDeleteCount","Value"] |
Kafka | Kafka: Pending replica deletes | The number of pending replica deletes. |
JMX | jmx["kafka.controller:type=KafkaController,name=ReplicasToDeleteCount","Value"] |
Kafka | Kafka: Ineligible pending topic deletes | The number of ineligible pending topic deletes. |
JMX | jmx["kafka.controller:type=KafkaController,name=TopicsIneligibleToDeleteCount","Value"] |
Kafka | Kafka: Pending topic deletes | The number of pending topic deletes. |
JMX | jmx["kafka.controller:type=KafkaController,name=TopicsToDeleteCount","Value"] |
Kafka | Kafka: Offline log directory count | The number of offline log directories (for example, after a hardware failure). |
JMX | jmx["kafka.log:type=LogManager,name=OfflineLogDirectoryCount","Value"] |
Kafka | Kafka: Offline partitions count | Number of partitions that don't have an active leader. |
JMX | jmx["kafka.controller:type=KafkaController,name=OfflinePartitionsCount","Value"] |
Kafka | Kafka: Bytes out per second | The rate at which data is fetched and read from the broker by consumers. |
JMX | jmx["kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka: Bytes in per second | The rate at which data sent from producers is consumed by the broker. |
JMX | jmx["kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka: Messages in per second | The rate at which individual messages are consumed by the broker. |
JMX | jmx["kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka: Bytes rejected per second | The rate at which bytes are rejected by the broker. |
JMX | jmx["kafka.server:type=BrokerTopicMetrics,name=BytesRejectedPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka: Client fetch request failed per second | Number of client fetch request failures per second. |
JMX | jmx["kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka: Produce requests failed per second | Number of failed produce requests per second. |
JMX | jmx["kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka: Request handler average idle percent | Indicates the percentage of time that the request handler (IO) threads are not in use. |
JMX | jmx["kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent","OneMinuteRate"] Preprocessing: - MULTIPLIER: |
Kafka | Kafka: Fetch-Consumer response send time, mean | Average time taken, in milliseconds, to send the response. |
JMX | jmx["kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=FetchConsumer","Mean"] |
Kafka | Kafka: Fetch-Consumer response send time, p95 | The time taken, in milliseconds, to send the response for 95th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=FetchConsumer","95thPercentile"] |
Kafka | Kafka: Fetch-Consumer response send time, p99 | The time taken, in milliseconds, to send the response for 99th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=FetchConsumer","99thPercentile"] |
Kafka | Kafka: Fetch-Follower response send time, mean | Average time taken, in milliseconds, to send the response. |
JMX | jmx["kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=FetchFollower","Mean"] |
Kafka | Kafka: Fetch-Follower response send time, p95 | The time taken, in milliseconds, to send the response for 95th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=FetchFollower","95thPercentile"] |
Kafka | Kafka: Fetch-Follower response send time, p99 | The time taken, in milliseconds, to send the response for 99th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=FetchFollower","99thPercentile"] |
Kafka | Kafka: Produce response send time, mean | Average time taken, in milliseconds, to send the response. |
JMX | jmx["kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=Produce","Mean"] |
Kafka | Kafka: Produce response send time, p95 | The time taken, in milliseconds, to send the response for 95th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=Produce","95thPercentile"] |
Kafka | Kafka: Produce response send time, p99 | The time taken, in milliseconds, to send the response for 99th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=Produce","99thPercentile"] |
Kafka | Kafka: Fetch-Consumer request total time, mean | Average time in ms to serve the Fetch-Consumer request. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchConsumer","Mean"] |
Kafka | Kafka: Fetch-Consumer request total time, p95 | Time in ms to serve the Fetch-Consumer request for 95th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchConsumer","95thPercentile"] |
Kafka | Kafka: Fetch-Consumer request total time, p99 | Time in ms to serve the Fetch-Consumer request for 99th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchConsumer","99thPercentile"] |
Kafka | Kafka: Fetch-Follower request total time, mean | Average time in ms to serve the Fetch-Follower request. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchFollower","Mean"] |
Kafka | Kafka: Fetch-Follower request total time, p95 | Time in ms to serve the Fetch-Follower request for 95th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchFollower","95thPercentile"] |
Kafka | Kafka: Fetch-Follower request total time, p99 | Time in ms to serve the Fetch-Follower request for 99th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchFollower","99thPercentile"] |
Kafka | Kafka: Produce request total time, mean | Average time in ms to serve the Produce request. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TotalTimeMs,request=Produce","Mean"] |
Kafka | Kafka: Produce request total time, p95 | Time in ms to serve the Produce requests for 95th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TotalTimeMs,request=Produce","95thPercentile"] |
Kafka | Kafka: Produce request total time, p99 | Time in ms to serve the Produce requests for 99th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TotalTimeMs,request=Produce","99thPercentile"] |
Kafka | Kafka: UpdateMetadata request total time, mean | Average time for a request to update metadata. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TotalTimeMs,request=UpdateMetadata","Mean"] |
Kafka | Kafka: UpdateMetadata request total time, p95 | Time for update metadata requests for 95th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TotalTimeMs,request=UpdateMetadata","95thPercentile"] |
Kafka | Kafka: UpdateMetadata request total time, p99 | Time for update metadata requests for 99th percentile. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TotalTimeMs,request=UpdateMetadata","99thPercentile"] |
Kafka | Kafka: Temporary memory size in bytes (Fetch), max | The maximum of temporary memory used for converting message formats and decompressing messages. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TemporaryMemoryBytes,request=Fetch","Max"] |
Kafka | Kafka: Temporary memory size in bytes (Fetch), min | The minimum of temporary memory used for converting message formats and decompressing messages. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TemporaryMemoryBytes,request=Fetch","Mean"] |
Kafka | Kafka: Temporary memory size in bytes (Produce), max | The maximum of temporary memory used for converting message formats and decompressing messages. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TemporaryMemoryBytes,request=Produce","Max"] |
Kafka | Kafka: Temporary memory size in bytes (Produce), avg | The amount of temporary memory used for converting message formats and decompressing messages. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TemporaryMemoryBytes,request=Produce","Mean"] |
Kafka | Kafka: Temporary memory size in bytes (Produce), min | The minimum of temporary memory used for converting message formats and decompressing messages. |
JMX | jmx["kafka.network:type=RequestMetrics,name=TemporaryMemoryBytes,request=Produce","Min"] |
Kafka | Kafka: Network processor average idle percent | The average percentage of time that the network processors are idle. |
JMX | jmx["kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent","Value"] Preprocessing: - MULTIPLIER: |
Kafka | Kafka: Requests in producer purgatory | Number of requests waiting in producer purgatory. |
JMX | jmx["kafka.server:type=DelayedOperationPurgatory,name=PurgatorySize,delayedOperation=Fetch","Value"] |
Kafka | Kafka: Requests in fetch purgatory | Number of requests waiting in fetch purgatory. |
JMX | jmx["kafka.server:type=DelayedOperationPurgatory,name=PurgatorySize,delayedOperation=Produce","Value"] |
Kafka | Kafka: Replication maximum lag | The maximum lag between the time that messages are received by the leader replica and by the follower replicas. |
JMX | jmx["kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica","Value"] |
Kafka | Kafka: Under minimum ISR partition count | The number of partitions under the minimum In-Sync Replica (ISR) count. |
JMX | jmx["kafka.server:type=ReplicaManager,name=UnderMinIsrPartitionCount","Value"] |
Kafka | Kafka: Under replicated partitions | The number of partitions that have not been fully replicated in the follower replicas (the number of non-reassigning replicas minus the number of ISRs is greater than 0). |
JMX | jmx["kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions","Value"] |
Kafka | Kafka: ISR expands per second | The rate at which the number of ISRs in the broker increases. |
JMX | jmx["kafka.server:type=ReplicaManager,name=IsrExpandsPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka: ISR shrink per second | Rate of replicas leaving the ISR pool. |
JMX | jmx["kafka.server:type=ReplicaManager,name=IsrShrinksPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka: Leader count | The number of replicas for which this broker is the leader. |
JMX | jmx["kafka.server:type=ReplicaManager,name=LeaderCount","Value"] |
Kafka | Kafka: Partition count | The number of partitions in the broker. |
JMX | jmx["kafka.server:type=ReplicaManager,name=PartitionCount","Value"] |
Kafka | Kafka: Number of reassigning partitions | The number of reassigning leader partitions on a broker. |
JMX | jmx["kafka.server:type=ReplicaManager,name=ReassigningPartitions","Value"] |
Kafka | Kafka: Request queue size | The size of the delay queue. |
JMX | jmx["kafka.server:type=Request","queue-size"] |
Kafka | Kafka: Version | Current version of broker. |
JMX | jmx["kafka.server:type=app-info","version"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Kafka | Kafka: Uptime | Service uptime in seconds (see the conversion sketch after the triggers table). |
JMX | jmx["kafka.server:type=app-info","start-time-ms"] Preprocessing: - JAVASCRIPT: |
Kafka | Kafka: ZooKeeper client request latency | Latency in milliseconds for ZooKeeper requests from broker. |
JMX | jmx["kafka.server:type=ZooKeeperClientMetrics,name=ZooKeeperRequestLatencyMs","Count"] |
Kafka | Kafka: ZooKeeper connection status | Connection status of broker's ZooKeeper session. |
JMX | jmx["kafka.server:type=SessionExpireListener,name=SessionState","Value"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Kafka | Kafka: ZooKeeper disconnect rate | ZooKeeper client disconnect per second. |
JMX | jmx["kafka.server:type=SessionExpireListener,name=ZooKeeperDisconnectsPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka: ZooKeeper session expiration rate | ZooKeeper client session expiration per second. |
JMX | jmx["kafka.server:type=SessionExpireListener,name=ZooKeeperExpiresPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka: ZooKeeper readonly rate | ZooKeeper client readonly per second. |
JMX | jmx["kafka.server:type=SessionExpireListener,name=ZooKeeperReadOnlyConnectsPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka: ZooKeeper sync rate | ZooKeeper client sync per second. |
JMX | jmx["kafka.server:type=SessionExpireListener,name=ZooKeeperSyncConnectsPerSec","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka {#JMXTOPIC}: Messages in per second | The rate at which individual messages are consumed by topic. |
JMX | jmx["kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic={#JMXTOPIC}","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka {#JMXTOPIC}: Bytes in per second | The rate at which data sent from producers is consumed by topic. |
JMX | jmx["kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic={#JMXTOPIC}","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka {#JMXTOPIC}: Bytes out per second | The rate at which data is fetched and read from the broker by consumers (by topic). |
JMX | jmx["kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic={#JMXTOPIC}","Count"] Preprocessing: - CHANGEPERSECOND |
Kafka | Kafka {#JMXTOPIC}: Bytes rejected per second | Rejected bytes rate by topic. |
JMX | jmx["kafka.server:type=BrokerTopicMetrics,name=BytesRejectedPerSec,topic={#JMXTOPIC}","Count"] Preprocessing: - CHANGEPERSECOND |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Kafka: Unclean leader election detected | Unclean leader elections occur when there is no qualified partition leader among Kafka brokers. If Kafka is configured to allow an unclean leader election, a leader is chosen from the out-of-sync replicas, and any messages that were not synced prior to the loss of the former leader are lost forever. Essentially, unclean leader elections sacrifice consistency for availability. |
last(/Apache Kafka by JMX/jmx["kafka.controller:type=ControllerStats,name=UncleanLeaderElectionsPerSec","Count"])>0 |
AVERAGE | |
Kafka: There are offline log directories | The offline log directory count metric indicates the number of log directories which are offline (for example, due to a hardware failure) so that the broker cannot store incoming messages anymore. |
last(/Apache Kafka by JMX/jmx["kafka.log:type=LogManager,name=OfflineLogDirectoryCount","Value"]) > 0 |
WARNING | |
Kafka: One or more partitions have no leader | Any partition without an active leader will be completely inaccessible, and both consumers and producers of that partition will be blocked until a leader becomes available. |
last(/Apache Kafka by JMX/jmx["kafka.controller:type=KafkaController,name=OfflinePartitionsCount","Value"]) > 0 |
WARNING | |
Kafka: Request handler average idle percent is too low | The request handler idle ratio metric indicates the percentage of time the request handlers are not in use. The lower this number, the more loaded the broker is. |
max(/Apache Kafka by JMX/jmx["kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent","OneMinuteRate"],15m)<{$KAFKA.REQUEST_HANDLER_AVG_IDLE.MIN.WARN} |
AVERAGE | |
Kafka: Network processor average idle percent is too low | The network processor idle ratio metric indicates the percentage of time the network processors are not in use. The lower this number, the more loaded the broker is. |
max(/Apache Kafka by JMX/jmx["kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent","Value"],15m)<{$KAFKA.NET_PROC_AVG_IDLE.MIN.WARN} |
AVERAGE | |
Kafka: Failed to fetch info data | Zabbix has not received data for the items for the last 15 minutes. |
nodata(/Apache Kafka by JMX/jmx["kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent","Value"],15m)=1 |
WARNING | |
Kafka: There are partitions under the min ISR | The Under min ISR partitions metric displays the number of partitions, where the number of In-Sync Replicas (ISR) is less than the minimum number of in-sync replicas specified. The two most common causes of under-min ISR partitions are that one or more brokers is unresponsive, or the cluster is experiencing performance issues and one or more brokers are falling behind. |
last(/Apache Kafka by JMX/jmx["kafka.server:type=ReplicaManager,name=UnderMinIsrPartitionCount","Value"])>0 |
AVERAGE | |
Kafka: There are under replicated partitions | The Under replicated partitions metric displays the number of partitions that do not have enough replicas to meet the desired replication factor. A partition will also be considered under-replicated if the correct number of replicas exist, but one or more of the replicas have fallen significantly behind the partition leader. The two most common causes of under-replicated partitions are that one or more brokers is unresponsive, or the cluster is experiencing performance issues and one or more brokers have fallen behind. |
last(/Apache Kafka by JMX/jmx["kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions","Value"])>0 |
AVERAGE | |
Kafka: Version has changed | Kafka version has changed. Ack to close. |
last(/Apache Kafka by JMX/jmx["kafka.server:type=app-info","version"],#1)<>last(/Apache Kafka by JMX/jmx["kafka.server:type=app-info","version"],#2) and length(last(/Apache Kafka by JMX/jmx["kafka.server:type=app-info","version"]))>0 |
INFO | Manual close: YES |
Kafka: has been restarted | Uptime is less than 10 minutes. |
last(/Apache Kafka by JMX/jmx["kafka.server:type=app-info","start-time-ms"])<10m |
INFO | Manual close: YES |
Kafka: Broker is not connected to ZooKeeper | - |
find(/Apache Kafka by JMX/jmx["kafka.server:type=SessionExpireListener,name=SessionState","Value"],,"regexp","CONNECTED")=0 |
AVERAGE |
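As an aside, the Uptime item above derives seconds of uptime from the broker's start-time-ms attribute with a JavaScript preprocessing step; the conversion amounts to the following (sketched here in Python, sample value invented):

```python
import time

def uptime_seconds(start_time_ms):
    """Convert the app-info start-time-ms attribute into service uptime in seconds."""
    return time.time() - start_time_ms / 1000.0

# A broker started ten minutes ago reports roughly 600 seconds:
print(uptime_seconds((time.time() - 600) * 1000))
```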
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher.
The template to monitor Jenkins by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
This template was tested on:
See Zabbix template operation for basic instructions.
Metrics are collected by requests to the Metrics API. For common metrics: install and configure the Metrics plugin according to the official documentation. Do not forget to configure access to the Metrics Servlet by issuing an API key and changing the macro {$JENKINS.API.KEY}.
For monitoring computers and builds: create an API token for the monitoring user according to the official documentation and set the macros {$JENKINS.USER} and {$JENKINS.API.TOKEN}. Don't forget to change the macro {$JENKINS.URL}.
No specific Zabbix configuration is required.
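As an illustration, a minimal sketch of the two kinds of requests the template makes, assuming a Metrics API key and a user API token have been issued (the URL paths follow the Metrics plugin's servlet layout and the gauge name is an example; adjust to your instance):

```python
import base64
import json
import urllib.request

JENKINS_URL = "http://localhost:8080"        # {$JENKINS.URL}
API_KEY = "metrics-api-key"                  # {$JENKINS.API.KEY}
USER, TOKEN = "zabbix", "user-api-token"     # {$JENKINS.USER}, {$JENKINS.API.TOKEN}

# 1) Metrics Servlet: ping should answer "pong"; metrics come back as one JSON document.
ping = urllib.request.urlopen(f"{JENKINS_URL}/metrics/{API_KEY}/ping").read().decode().strip()
metrics = json.load(urllib.request.urlopen(f"{JENKINS_URL}/metrics/{API_KEY}/metrics"))
print(ping, metrics["gauges"]["jenkins.node.count.value"]["value"])

# 2) Computers and builds: HTTP BASIC authentication with the user's API token.
auth = base64.b64encode(f"{USER}:{TOKEN}".encode()).decode()
req = urllib.request.Request(f"{JENKINS_URL}/computer/api/json",
                             headers={"Authorization": f"Basic {auth}"})
computers = json.load(urllib.request.urlopen(req))
print(len(computers["computer"]))
```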
Name | Description | Default |
---|---|---|
{$JENKINS.API.KEY} | API key to access Metrics Servlet |
`` |
{$JENKINS.API.TOKEN} | API token for HTTP BASIC authentication. |
`` |
{$JENKINS.FILE_DESCRIPTORS.MAX.WARN} | Maximum percentage of file descriptors usage alert threshold (for trigger expression). |
85 |
{$JENKINS.JOB.HEALTH.SCORE.MIN.WARN} | Minimum job health score (for trigger expression). |
50 |
{$JENKINS.PING.REPLY} | Expected reply to the ping. |
pong |
{$JENKINS.URL} | Jenkins URL in the format |
`` |
{$JENKINS.USER} | Username for HTTP BASIC authentication |
zabbix |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Computers discovery | - |
HTTP_AGENT | jenkins.computers Preprocessing: - JSONPATH: |
Jobs discovery | - |
HTTP_AGENT | jenkins.jobs Preprocessing: - JSONPATH: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Jenkins | Jenkins: Disk space check message | The message will reference the first node which fails this check. There may be other nodes that fail the check, but this health check is designed to fail fast. |
DEPENDENT | jenkins.diskspace.message Preprocessing: - JSONPATH: ⛔️ON FAIL:CUSTOM_VALUE -> - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Temporary space check message | The message will reference the first node which fails this check. There may be other nodes that fail the check, but this health check is designed to fail fast. |
DEPENDENT | jenkins.temporaryspace.message Preprocessing: - JSONPATH: ⛔️ON FAIL:CUSTOM_VALUE -> - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Plugins check message | The message of plugins health check. |
DEPENDENT | jenkins.plugins.message Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Jenkins | Jenkins: Thread deadlock check message | The message of thread deadlock health check. |
DEPENDENT | jenkins.threaddeadlock.message Preprocessing: - JSONPATH: ⛔️ON FAIL:CUSTOM_VALUE -> - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Disk space check | Returns FAIL if any of the Jenkins disk space monitors are reporting the disk space as less than the configured threshold. |
DEPENDENT | jenkins.diskspace Preprocessing: - JSONPATH: - BOOL TODECIMAL- DISCARD UNCHANGED_HEARTBEAT:1h |
Jenkins | Jenkins: Plugins check | Returns FAIL if any of the Jenkins plugins failed to start. |
DEPENDENT | jenkins.plugins Preprocessing: - JSONPATH: - BOOLTODECIMAL - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Temporary space check | Returns FAIL if any of the Jenkins temporary space monitors are reporting the temporary space as less than the configured threshold. |
DEPENDENT | jenkins.temporaryspace Preprocessing: - JSONPATH: - BOOL TODECIMAL- DISCARD UNCHANGED_HEARTBEAT:1h |
Jenkins | Jenkins: Thread deadlock check | Returns FAIL if there are any deadlocked threads in the Jenkins master JVM. |
DEPENDENT | jenkins.threaddeadlock Preprocessing: - JSONPATH: - BOOL TODECIMAL- DISCARD UNCHANGED_HEARTBEAT:1h |
Jenkins | Jenkins: Executors count | The number of executors available to Jenkins. This corresponds to the sum of the executors of all the on-line nodes. |
DEPENDENT | jenkins.executor.count Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Executors free | The number of executors available to Jenkins that are not currently in use. |
DEPENDENT | jenkins.executor.free Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Executors in use | The number of executors available to Jenkins that are currently in use. |
DEPENDENT | jenkins.executor.in_use Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Nodes count | The number of build nodes available to Jenkins, both on-line and off-line. |
DEPENDENT | jenkins.node.count Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Nodes offline | The number of build nodes available to Jenkins but currently off-line. |
DEPENDENT | jenkins.node.offline Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Nodes online | The number of build nodes available to Jenkins and currently on-line. |
DEPENDENT | jenkins.node.online Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Plugins active | The number of plugins in the Jenkins instance that started successfully. |
DEPENDENT | jenkins.plugins.active Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Plugins failed | The number of plugins in the Jenkins instance that failed to start. A value other than 0 is typically indicative of a potential issue within the Jenkins installation that will either be solved by explicitly disabling the plugin(s) or by resolving the plugin dependency issues. |
DEPENDENT | jenkins.plugins.failed Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Plugins inactive | The number of plugins in the Jenkins instance that are not currently enabled. |
DEPENDENT | jenkins.plugins.inactive Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Plugins with update | The number of plugins in the Jenkins instance that have a newer version reported as available in the current Jenkins update center metadata held by Jenkins. This value is not indicative of an issue with Jenkins, but high values can be used as a prompt to review the available plugin updates and see whether they contain fixes for issues that could be affecting your Jenkins instance. |
DEPENDENT | jenkins.plugins.withupdate Preprocessing: - JSONPATH: - DISCARD UNCHANGED_HEARTBEAT:1h |
Jenkins | Jenkins: Projects count | The number of projects. |
DEPENDENT | jenkins.project.count Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Jobs count | The number of jobs in Jenkins. |
DEPENDENT | jenkins.job.count.value Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Jenkins | Jenkins: Job scheduled, m1 rate | The rate at which jobs are scheduled. If a job is already in the queue and an identical request for scheduling the job is received then Jenkins will coalesce the two requests. This metric gives a reasonably pure measure of the load requirements of the Jenkins master as it is unaffected by the number of executors available to the system. |
DEPENDENT | jenkins.job.scheduled.m1.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Jobs scheduled, m5 rate | The rate at which jobs are scheduled. If a job is already in the queue and an identical request for scheduling the job is received then Jenkins will coalesce the two requests. This metric gives a reasonably pure measure of the load requirements of the Jenkins master as it is unaffected by the number of executors available to the system. |
DEPENDENT | jenkins.job.scheduled.m5.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job blocked, m1 rate | The rate at which jobs in the build queue enter the blocked state. |
DEPENDENT | jenkins.job.blocked.m1.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job blocked, m5 rate | The rate at which jobs in the build queue enter the blocked state. |
DEPENDENT | jenkins.job.blocked.m5.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job blocked duration, p95 | The amount of time which jobs spend in the blocked state. |
DEPENDENT | jenkins.job.blocked.duration.p95 Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job blocked duration, median | The amount of time which jobs spend in the blocked state. |
DEPENDENT | jenkins.job.blocked.duration.p50 Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job building, m1 rate | The rate at which jobs are built. |
DEPENDENT | jenkins.job.building.m1.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job building, m5 rate | The rate at which jobs are built. |
DEPENDENT | jenkins.job.building.m5.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job building duration, p95 | The amount of time which jobs spend building. |
DEPENDENT | jenkins.job.building.duration.p95 Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job building duration, median | The amount of time which jobs spend building. |
DEPENDENT | jenkins.job.building.duration.p50 Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job buildable, m1 rate | The rate at which jobs in the build queue enter the buildable state. |
DEPENDENT | jenkins.job.buildable.m1.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job buildable, m5 rate | The rate at which jobs in the build queue enter the buildable state. |
DEPENDENT | jenkins.job.buildable.m5.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job buildable duration, p95 | The amount of time which jobs spend in the buildable state. |
DEPENDENT | jenkins.job.buildable.duration.p95 Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job buildable duration, median | The amount of time which jobs spend in the buildable state. |
DEPENDENT | jenkins.job.buildable.duration.p50 Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job queuing, m1 rate | The rate at which jobs are queued. |
DEPENDENT | jenkins.job.queuing.m1.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job queuing, m5 rate | The rate at which jobs are queued. |
DEPENDENT | jenkins.job.queuing.m5.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job queuing duration, p95 | The total time which jobs spend in the build queue. |
DEPENDENT | jenkins.job.queuing.duration.p95 Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job queuing duration, median | The total time which jobs spend in the build queue. |
DEPENDENT | jenkins.job.queuing.duration.p50 Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job total, m1 rate | The rate at which jobs are queued. |
DEPENDENT | jenkins.job.total.m1.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job total, m5 rate | The rate at which jobs are queued. |
DEPENDENT | jenkins.job.total.m5.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job total duration, p95 | The total time which jobs spend from entering the build queue to completing building. |
DEPENDENT | jenkins.job.total.duration.p95 Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job total duration, median | The total time which jobs spend from entering the build queue to completing building. |
DEPENDENT | jenkins.job.total.duration.p50 Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job waiting, m1 rate | The rate at which jobs enter the quiet period. |
DEPENDENT | jenkins.job.waiting.m1.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job waiting, m5 rate | The rate at which jobs enter the quiet period. |
DEPENDENT | jenkins.job.waiting.m5.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job waiting duration, p95 | The total amount of time that jobs spend in their quiet period. |
DEPENDENT | jenkins.job.waiting.duration.p95 Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Job waiting duration, median | The total amount of time that jobs spend in their quiet period. |
DEPENDENT | jenkins.job.waiting.duration.p50 Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Build queue, blocked | The number of jobs that are in the Jenkins build queue and currently in the blocked state. |
DEPENDENT | jenkins.queue.blocked Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Build queue, size | The number of jobs that are in the Jenkins build queue. |
DEPENDENT | jenkins.queue.size Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Build queue, buildable | The number of jobs that are in the Jenkins build queue and currently in the buildable state. |
DEPENDENT | jenkins.queue.buildable Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Build queue, pending | The number of jobs that are in the Jenkins build queue and currently in the pending state. |
DEPENDENT | jenkins.queue.pending Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Build queue, stuck | The number of jobs that are in the Jenkins build queue and currently stuck. |
DEPENDENT | jenkins.queue.stuck Preprocessing: - JSONPATH: |
Jenkins | Jenkins: HTTP active requests, rate | The number of currently active requests against the Jenkins master Web UI. |
DEPENDENT | jenkins.http.active_requests.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Jenkins | Jenkins: HTTP response 400, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/400 status code. |
DEPENDENT | jenkins.http.bad_request.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Jenkins | Jenkins: HTTP response 500, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/500 status code. |
DEPENDENT | jenkins.http.server_error.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Jenkins | Jenkins: HTTP response 503, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/503 status code. |
DEPENDENT | jenkins.http.service_unavailable.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Jenkins | Jenkins: HTTP response 200, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/200 status code. |
DEPENDENT | jenkins.http.ok.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Jenkins | Jenkins: HTTP response other, rate | The rate at which the Jenkins master Web UI is responding to requests with a non-informational status code that is not in the list: HTTP/200, HTTP/201, HTTP/204, HTTP/304, HTTP/400, HTTP/403, HTTP/404, HTTP/500, or HTTP/503. |
DEPENDENT | jenkins.http.other.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Jenkins | Jenkins: HTTP response 201, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/201 status code. |
DEPENDENT | jenkins.http.created.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Jenkins | Jenkins: HTTP response 204, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/204 status code. |
DEPENDENT | jenkins.http.no_content.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Jenkins | Jenkins: HTTP response 404, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/404 status code. |
DEPENDENT | jenkins.http.not_found.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Jenkins | Jenkins: HTTP response 304, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/304 status code. |
DEPENDENT | jenkins.http.not_modified.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Jenkins | Jenkins: HTTP response 403, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/403 status code. |
DEPENDENT | jenkins.http.forbidden.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Jenkins | Jenkins: HTTP requests, rate | The rate at which the Jenkins master Web UI is receiving requests. |
DEPENDENT | jenkins.http.requests.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Jenkins | Jenkins: HTTP requests, p95 | The time spent generating the corresponding responses. |
DEPENDENT | jenkins.http.requests_p95.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: HTTP requests, median | The time spent generating the corresponding responses. |
DEPENDENT | jenkins.http.requests_p50.rate Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Version | Version of Jenkins server. |
DEPENDENT | jenkins.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins: CPU Load | The system load on the Jenkins master as reported by the JVM's Operating System JMX bean. The calculation of system load is operating system dependent. Typically this is the sum of the number of processes that are currently running plus the number that are waiting to run. This is typically comparable against the number of CPU cores. |
DEPENDENT | jenkins.system.cpu.load Preprocessing: - JSONPATH: |
Jenkins | Jenkins: Uptime | The number of seconds since the Jenkins master JVM started. |
DEPENDENT | jenkins.system.uptime Preprocessing: - JSONPATH: - MULTIPLIER: |
Jenkins | Jenkins: File descriptor ratio | The ratio of used to total file descriptors. |
DEPENDENT | jenkins.descriptor.ratio Preprocessing: - JSONPATH: - MULTIPLIER: |
Jenkins | Jenkins: Service ping | HTTP_AGENT | jenkins.ping Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: - REGEX: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
|
Jenkins | Jenkins job [{#NAME}]: Get job | Raw data for a job. |
DEPENDENT | jenkins.job.get[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins job [{#NAME}]: Health score | Represents health of project. A number between 0-100. Job Description: {#DESCRIPTION} Job Url: {#URL} |
DEPENDENT | jenkins.build.health[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins job [{#NAME}]: Last Build number | Details: {#URL}/lastBuild/ |
DEPENDENT | jenkins.last_build.number[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins job [{#NAME}]: Last Build duration | Build duration (in seconds). |
DEPENDENT | jenkins.last_build.duration[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins job [{#NAME}]: Last Build timestamp | DEPENDENT | jenkins.last_build.timestamp[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: |
|
Jenkins | Jenkins job [{#NAME}]: Last Build result | DEPENDENT | jenkins.last_build.result[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - DISCARD_UNCHANGED_HEARTBEAT: |
|
Jenkins | Jenkins job [{#NAME}]: Last Failed Build number | Details: {#URL}/lastFailedBuild/ |
DEPENDENT | jenkins.last_failed_build.number[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins job [{#NAME}]: Last Failed Build duration | Build duration (in seconds). |
DEPENDENT | jenkins.last_failed_build.duration[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins job [{#NAME}]: Last Failed Build timestamp | - |
DEPENDENT | jenkins.last_failed_build.timestamp[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins job [{#NAME}]: Last Successful Build number | Details: {#URL}/lastSuccessfulBuild/ |
DEPENDENT | jenkins.last_successful_build.number[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins job [{#NAME}]: Last Successful Build duration | Build duration (in seconds). |
DEPENDENT | jenkins.last_successful_build.duration[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins job [{#NAME}]: Last Successful Build timestamp | - |
DEPENDENT | jenkins.last_successful_build.timestamp[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Get computer | Raw data for a computer. |
DEPENDENT | jenkins.computer.get[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Executors | The maximum number of concurrent builds that Jenkins may perform on this node. |
DEPENDENT | jenkins.computer.numExecutors[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: State | Represents the actual online/offline state. Node description: {#DESCRIPTION} |
DEPENDENT | jenkins.computer.state[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Offline cause reason | If the computer was offline (either temporarily or not), will return the cause as a string (without user info). Empty string if the system was put offline without given a cause. |
DEPENDENT | jenkins.computer.offline.reason[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Idle | Returns true if all the executors of this computer are idle. |
DEPENDENT | jenkins.computer.idle[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Temporarily offline | Returns true if this node is marked temporarily offline. |
DEPENDENT | jenkins.computer.temp_offline[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Available disk space | The available disk space of $JENKINS_HOME on agent. |
DEPENDENT | jenkins.computer.disk_space[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Available temp space | The available disk space of the temporary directory. Java tools and tests/builds often create files in the temporary directory, and may not function properly if there's no available space. |
DEPENDENT | jenkins.computer.temp_space[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Response time average | The round trip network response time from the master to the agent. |
DEPENDENT | jenkins.computer.response_time[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - MULTIPLIER: |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Available physical memory | The total physical memory of the system, available bytes. |
DEPENDENT | jenkins.computer.available_physical_memory[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Available swap space | Available swap space in bytes. |
DEPENDENT | jenkins.computer.available_swap_space[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Total physical memory | Total physical memory of the system, in bytes. |
DEPENDENT | jenkins.computer.total_physical_memory[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Total swap space | Total swap space in bytes. |
DEPENDENT | jenkins.computer.total_swap_space[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
Jenkins | Jenkins: Computer [{#DISPLAY_NAME}]: Clock difference | The clock difference between the master and nodes. |
DEPENDENT | jenkins.computer.clock_difference[{#DISPLAY_NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - MULTIPLIER: |
Zabbix raw items | Jenkins: Get service metrics | - |
HTTP_AGENT | jenkins.get_metrics Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: DISCARD_VALUE -> |
Zabbix raw items | Jenkins: Get healthcheck | HTTP_AGENT | jenkins.healthcheck Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: |
|
Zabbix raw items | Jenkins: Get jobs info | - |
HTTP_AGENT | jenkins.job_info Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: DISCARD_VALUE -> |
Zabbix raw items | Jenkins: Get computer info | - |
HTTP_AGENT | jenkins.computer_info Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: DISCARD_VALUE -> |
Zabbix raw items | Jenkins: Get gauges | Raw items for gauges metrics. |
DEPENDENT | jenkins.gauges.raw Preprocessing: - JSONPATH: |
Zabbix raw items | Jenkins: Get meters | Raw items for meters metrics. |
DEPENDENT | jenkins.meters.raw Preprocessing: - JSONPATH: |
Zabbix raw items | Jenkins: Get timers | Raw items for timers metrics. |
DEPENDENT | jenkins.timers.raw Preprocessing: - JSONPATH: |
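For reference, a minimal sketch of what the "Get service metrics" raw item retrieves: the Jenkins Metrics plugin serves one JSON document containing the gauges, meters and timers that the dependent items above extract. The host URL and API key below are placeholders, not values from the template.

```python
# Minimal sketch of the bulk collection behind "Jenkins: Get service metrics",
# assuming the Jenkins Metrics plugin is installed and METRICS_KEY is an API
# key configured in Jenkins (both values here are hypothetical).
import requests

JENKINS_URL = "http://jenkins.example.com:8080"  # placeholder host
METRICS_KEY = "changeme"                         # placeholder API key

resp = requests.get(f"{JENKINS_URL}/metrics/{METRICS_KEY}/metrics", timeout=10)
resp.raise_for_status()
metrics = resp.json()

# The template splits this one payload into the "Get gauges/meters/timers"
# dependent items and then picks single values out of each sub-object.
print(metrics["gauges"]["jenkins.queue.size"]["value"])
print(metrics["timers"]["jenkins.job.total.duration"]["p95"])
```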
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Jenkins: Disk space is too low | Jenkins disk space monitors are reporting the disk space as less than the configured threshold. The message will reference the first node which fails this check. Health check message: {{ITEM.LASTVALUE2}.regsub("(.*)",\1)} |
last(/Jenkins by HTTP/jenkins.disk_space)=0 and length(last(/Jenkins by HTTP/jenkins.disk_space.message))>0 |
WARNING | |
Jenkins: One or more Jenkins plugins failed to start | A failure is typically indicative of a potential issue within the Jenkins installation that will either be solved by explicitly disabling the failing plugin(s) or by resolving the corresponding plugin dependency issues. Health check message: {{ITEM.LASTVALUE2}.regsub("(.*)",\1)} |
last(/Jenkins by HTTP/jenkins.plugins)=0 and length(last(/Jenkins by HTTP/jenkins.plugins.message))>0 |
INFO | Manual close: YES |
Jenkins: Temporary space is too low | Jenkins temporary space monitors are reporting the temporary space as less than the configured threshold. The message will reference the first node which fails this check. Health check message: {{ITEM.LASTVALUE2}.regsub("(.*)",\1)} |
last(/Jenkins by HTTP/jenkins.temporary_space)=0 and length(last(/Jenkins by HTTP/jenkins.temporary_space.message))>0 |
WARNING | |
Jenkins: There are deadlocked threads in Jenkins master JVM | There are deadlocked threads in the Jenkins master JVM. Health check message: {{ITEM.LASTVALUE2}.regsub("(.*)",\1)} |
last(/Jenkins by HTTP/jenkins.thread_deadlock)=0 and length(last(/Jenkins by HTTP/jenkins.thread_deadlock.message))>0 |
WARNING | |
Jenkins: Service has no online nodes | - |
last(/Jenkins by HTTP/jenkins.node.online)=0 |
AVERAGE | |
Jenkins: Version has changed | The Jenkins version has changed. Perform Ack to close. |
last(/Jenkins by HTTP/jenkins.version,#1)<>last(/Jenkins by HTTP/jenkins.version,#2) and length(last(/Jenkins by HTTP/jenkins.version))>0 |
INFO | Manual close: YES |
Jenkins: Host has been restarted | Uptime is less than 10 minutes. |
last(/Jenkins by HTTP/jenkins.system.uptime)<10m |
INFO | Manual close: YES |
Jenkins: Current number of used files is too high | - |
min(/Jenkins by HTTP/jenkins.descriptor.ratio,5m)>{$JENKINS.FILE_DESCRIPTORS.MAX.WARN} |
WARNING | |
Jenkins: Service is down | - |
last(/Jenkins by HTTP/jenkins.ping)=0 |
AVERAGE | Manual close: YES |
Jenkins job [{#NAME}]: Job is unhealthy | - |
last(/Jenkins by HTTP/jenkins.build.health[{#NAME}])<{$JENKINS.JOB.HEALTH.SCORE.MIN.WARN} |
WARNING | Manual close: YES |
Jenkins: Computer [{#DISPLAY_NAME}]: Node is down | Node down with reason: {{ITEM.LASTVALUE2}.regsub("(.*)",\1)} |
last(/Jenkins by HTTP/jenkins.computer.state[{#DISPLAY_NAME}])=1 and length(last(/Jenkins by HTTP/jenkins.computer.offline.reason[{#DISPLAY_NAME}]))>0 |
AVERAGE | Depends on: - Jenkins: Computer [{#DISPLAY_NAME}]: Node is temporarily offline - Jenkins: Service has no online nodes |
Jenkins: Computer [{#DISPLAY_NAME}]: Node is temporarily offline | Node is temporarily offline with reason: {{ITEM.LASTVALUE2}.regsub("(.*)",\1)} |
last(/Jenkins by HTTP/jenkins.computer.temp_offline[{#DISPLAY_NAME}])=1 and length(last(/Jenkins by HTTP/jenkins.computer.offline.reason[{#DISPLAY_NAME}]))>0 |
INFO | Manual close: YES |
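The computer-related items and the two node triggers above map directly onto fields of the Jenkins computer API that the "Get computer info" item polls. A hedged sketch of the same logic, with placeholder host and credentials:

```python
# Sketch of the node-state logic behind "Node is down" and "Node is
# temporarily offline", using the standard Jenkins computer API.
# Host and credentials are placeholders.
import requests

resp = requests.get(
    "http://jenkins.example.com:8080/computer/api/json",
    auth=("user", "api-token"),  # hypothetical credentials
    timeout=10,
)
resp.raise_for_status()

for node in resp.json()["computer"]:
    # temporarilyOffline maps to jenkins.computer.temp_offline,
    # offline to jenkins.computer.state, and offlineCauseReason to
    # jenkins.computer.offline.reason in the items above.
    if node["temporarilyOffline"]:
        print(f'{node["displayName"]}: temporarily offline: {node["offlineCauseReason"]}')
    elif node["offline"]:
        print(f'{node["displayName"]}: down: {node["offlineCauseReason"]}')
```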
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Services | IMAP service is running | - |
SIMPLE | net.tcp.service[imap] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
IMAP service is down on {HOST.NAME} | - |
max(/IMAP Service/net.tcp.service[imap],#3)=0 |
AVERAGE |
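The simple check net.tcp.service[imap] only verifies that the port accepts a TCP connection and that the service answers like an IMAP server. A rough Python equivalent, with a placeholder hostname:

```python
# Rough equivalent of net.tcp.service[imap]: connect to the IMAP port and
# look for the "* OK" greeting. The hostname is a placeholder.
import socket

def imap_service_up(host: str, port: int = 143, timeout: float = 5.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            greeting = sock.recv(128)  # e.g. b"* OK Dovecot ready."
            return greeting.startswith(b"* OK")
    except OSError:
        return False

print(imap_service_up("mail.example.com"))
```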
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
The template to monitor IIS (Internet Information Services) by Zabbix that works without any external scripts.
This template was tested on:
See Zabbix template operation for basic instructions.
You have to enable the following Windows Features (Control Panel > Programs and Features > Turn Windows features on or off) on your server
Web Server (IIS)
Web Server (IIS)\Management Tools\IIS Management Scripts and Tools
Optionally, it is possible to customize the template:
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$AGENT.TIMEOUT} | Timeout after which agent is considered unavailable. |
5m |
{$IIS.APPPOOL.MATCHES} | This macro is used in application pools discovery. Can be overridden on the host or linked template level. |
.+ |
{$IIS.APPPOOL.MONITORED} | Monitoring status for discovered application pools. Use context to avoid trigger firing for specific application pools. "1" - enabled, "0" - disabled. |
1 |
{$IIS.APPPOOL.NOT_MATCHES} | This macro is used in application pools discovery. Can be overridden on the host or linked template level. |
<CHANGE_IF_NEEDED> |
{$IIS.PORT} | Listening port. |
80 |
{$IIS.QUEUE.MAX.TIME} | The time during which the queue length may exceed the threshold. |
5m |
{$IIS.QUEUE.MAX.WARN} | Maximum application pool's request queue length for trigger expression. |
`` |
{$IIS.SERVICE} | The service (http/https/etc) for port check. See "net.tcp.service" documentation page for more information: https://www.zabbix.com/documentation/6.2/manual/config/items/itemtypes/simple_checks |
http |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Application pools discovery | - |
ZABBIX_ACTIVE | wmi.getall[root\webAdministration, select Name from ApplicationPool] Filter: AND - {#APPPOOL} NOT_MATCHES_REGEX - {#APPPOOL} MATCHES_REGEX |
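Conceptually, this discovery rule runs the WQL query above on the IIS host and then filters the returned pool names with {$IIS.APPPOOL.MATCHES} and {$IIS.APPPOOL.NOT_MATCHES}. A hedged sketch of the same flow, assuming a Windows host with the third-party wmi package (pip install wmi) available:

```python
# Hedged sketch of the application pools discovery: query the
# WebAdministration WMI namespace for pool names, then apply the
# MATCHES/NOT_MATCHES style filters. Requires Windows plus the
# third-party "wmi" package; all regexes mirror the macro defaults.
import re
import wmi

MATCHES = re.compile(r".+")                      # {$IIS.APPPOOL.MATCHES}
NOT_MATCHES = re.compile(r"<CHANGE_IF_NEEDED>")  # {$IIS.APPPOOL.NOT_MATCHES}

conn = wmi.WMI(namespace=r"root\webAdministration")
for pool in conn.ApplicationPool():              # same data as the WQL query
    name = pool.Name
    if MATCHES.search(name) and not NOT_MATCHES.search(name):
        print({"{#APPPOOL}": name})              # one LLD row per pool
```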
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
IIS | IIS: World Wide Web Publishing Service (W3SVC) state | The World Wide Web Publishing Service (W3SVC) provides web connectivity and administration of websites through the IIS snap-in. If the World Wide Web Publishing Service stops, the operating system cannot serve any form of web request. This service depends on the Windows Process Activation Service. |
ZABBIX_ACTIVE | service.info[W3SVC] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: Windows Process Activation Service (WAS) state | Windows Process Activation Service (WAS) is a tool for managing worker processes that contain applications that host Windows Communication Foundation (WCF) services. Worker processes handle requests that are sent to a Web Server for specific application pools. Each application pool sets boundaries for the applications it contains. |
ZABBIX_ACTIVE | service.info[WAS] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: {$IIS.PORT} port ping | - |
SIMPLE | net.tcp.service[{$IIS.SERVICE},,{$IIS.PORT}] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: Active agent availability | Availability of active checks on the host. The value of this item corresponds to availability icons in the host list. Possible value: 0 - unknown 1 - available 2 - not available |
INTERNAL | zabbix[host,active_agent,available] |
IIS | IIS: Uptime | Service uptime in seconds. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Service Uptime"] |
IIS | IIS: Bytes Received per second | The average rate per minute at which data bytes are received by the service at the Application Layer. Does not include protocol headers or control bytes. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Bytes Received/sec", 60] |
IIS | IIS: Bytes Sent per second | The average rate per minute at which data bytes are sent by the service. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Bytes Sent/sec", 60] |
IIS | IIS: Bytes Total per second | The average rate per minute of total bytes/sec transferred by the Web service (sum of bytes sent/sec and bytes received/sec). |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Bytes Total/Sec", 60] |
IIS | IIS: Current connections | The number of active connections. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Current Connections"] |
IIS | IIS: Total connection attempts | The total number of connections to the Web or FTP service that have been attempted since service startup. The count is the total for all Web sites or FTP sites combined. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Total Connection Attempts (all instances)"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Connection attempts per second | The average rate per minute that connections using the Web service are being attempted. The count is the average for all Web sites combined. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Connection Attempts/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Anonymous users per second | The number of requests from users over an anonymous connection per second. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Anonymous Users/sec", 60] |
IIS | IIS: NonAnonymous users per second | The number of requests from users over a non-anonymous connection per second. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\NonAnonymous Users/sec", 60] |
IIS | IIS: Method GET requests per second | The rate of HTTP requests made using the GET method. GET requests are generally used for basic file retrievals or image maps, though they can be used with forms. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Get Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method COPY requests per second | The rate of HTTP requests made using the COPY method. Copy requests are used for copying files and directories. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Copy Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method CGI requests per second | The rate of CGI requests that are simultaneously being processed by the Web service. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\CGI Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method DELETE requests per second | The rate of HTTP requests made using the DELETE method. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Delete Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method HEAD requests per second | The rate of HTTP requests made using the HEAD method. HEAD requests generally indicate a client is querying the state of a document they already have, to see if it needs to be refreshed. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Head Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method ISAPI requests per second | The rate of ISAPI Extension requests that are simultaneously being processed by the Web service. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\ISAPI Extension Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method LOCK requests per second | The rate of HTTP requests made using the LOCK method. Lock requests are used to lock a file for one user so that only that user can modify the file. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Lock Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method MKCOL requests per second | The rate of HTTP requests made using the MKCOL method. Mkcol requests are used to create directories on the server. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Mkcol Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method MOVE requests per second | The rate of HTTP requests made using the MOVE method. Move requests are used for moving files and directories. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Move Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method OPTIONS requests per second | The rate of HTTP requests made using the OPTIONS method. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Options Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method POST requests per second | The rate of HTTP requests made using the POST method. Generally used for forms or gateway requests. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Post Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method PROPFIND requests per second | The rate of HTTP requests made using the PROPFIND method. Propfind requests retrieve property values on files and directories. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Propfind Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method PROPPATCH requests per second | The rate of HTTP requests made using the PROPPATCH method. Proppatch requests set property values on files and directories. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Proppatch Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method PUT requests per second | The rate of HTTP requests made using the PUT method. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Put Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method MS-SEARCH requests per second | The rate of HTTP requests made using the MS-SEARCH method. Search requests are used to query the server to find resources that match a set of conditions provided by the client. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Search Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method TRACE requests per second | The rate of HTTP requests made using the TRACE method. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Trace Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method UNLOCK requests per second | The rate of HTTP requests made using the UNLOCK method. Unlock requests are used to remove locks from files. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Unlock Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method Total requests per second | The rate of all HTTP requests received. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Total Method Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method Total Other requests per second | Total Other Request Methods is the number of HTTP requests that are not OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, MOVE, COPY, MKCOL, PROPFIND, PROPPATCH, SEARCH, LOCK or UNLOCK methods (since service startup). Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Other Request Methods/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Locked errors per second | The rate of errors due to requests that couldn't be satisfied by the server because the requested document was locked. These are generally reported as an HTTP 423 error code to the client. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Locked Errors/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Not Found errors per second | The rate of errors due to requests that couldn't be satisfied by the server because the requested document could not be found. These are generally reported to the client with HTTP error code 404. Average per minute. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service(_Total)\Not Found Errors/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Files cache hits percentage | The ratio of user-mode file cache hits to total cache requests (since service startup). Note: This value might be low if the Kernel URI cache hits percentage is high. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service Cache\File Cache Hits %"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: URIs cache hits percentage | The ratio of user-mode URI Cache Hits to total cache requests (since service startup). |
ZABBIX_ACTIVE | perf_counter_en["\Web Service Cache\URI Cache Hits %"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: File cache misses | The total number of unsuccessful lookups in the user-mode file cache since service startup. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service Cache\File Cache Misses"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: URI cache misses | The total number of unsuccessful lookups in the user-mode URI cache since service startup. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service Cache\URI Cache Misses"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: {#APPPOOL} Uptime | The web application uptime period since the last restart. |
ZABBIX_ACTIVE | perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Current Application Pool Uptime"] |
IIS | IIS: AppPool {#APPPOOL} state | The state of the application pool. |
ZABBIX_ACTIVE | perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Current Application Pool State"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: AppPool {#APPPOOL} recycles | The number of times the application pool has been recycled since Windows Process Activation Service (WAS) started. |
ZABBIX_ACTIVE | perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Total Application Pool Recycles"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: AppPool {#APPPOOL} current queue size | The number of requests in the queue. |
ZABBIX_ACTIVE | perf_counter_en["\HTTP Service Request Queues({#APPPOOL})\CurrentQueueSize"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
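Many of the perf_counter_en keys above pass 60 as the second parameter, which tells the agent to report a value averaged over the last 60 seconds of samples rather than an instantaneous reading. A conceptual sketch of that averaging (not the agent's actual implementation):

```python
# Conceptual model of perf_counter_en["...", 60]: the agent keeps sampling
# the counter every second and reports the mean of the last 60 samples.
from collections import deque

class AveragedCounter:
    def __init__(self, window_seconds: int = 60):
        self.samples = deque(maxlen=window_seconds)  # rolling window

    def add_sample(self, value: float) -> None:
        self.samples.append(value)

    def value(self) -> float:
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)

counter = AveragedCounter(60)
for raw in (120.0, 130.0, 125.0):  # e.g. "\Web Service(_Total)\Get Requests/Sec"
    counter.add_sample(raw)
print(counter.value())             # 125.0 -- the smoothed per-minute average
```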
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
IIS: The World Wide Web Publishing Service (W3SVC) is not running | The World Wide Web Publishing Service (W3SVC) is not in the running state. IIS cannot start. |
last(/IIS by Zabbix agent active/service.info[W3SVC])<>0 |
HIGH | Depends on: - IIS: Windows Process Activation Service (WAS) is not running |
IIS: Windows Process Activation Service (WAS) is not running | Windows Process Activation Service (WAS) is not in the running state. IIS cannot start. |
last(/IIS by Zabbix agent active/service.info[WAS])<>0 |
HIGH | |
IIS: Port {$IIS.PORT} is down | - |
last(/IIS by Zabbix agent active/net.tcp.service[{$IIS.SERVICE},,{$IIS.PORT}])=0 |
AVERAGE | Manual close: YES Depends on: - IIS: The World Wide Web Publishing Service (W3SVC) is not running |
IIS: Zabbix agent: active checks are not available | Active checks are considered unavailable. The agent has not sent a heartbeat for a prolonged time. |
min(/IIS by Zabbix agent active/zabbix[host,active_agent,available],{$AGENT.TIMEOUT})=2 |
HIGH | |
IIS: has been restarted | Uptime is less than 10 minutes. |
last(/IIS by Zabbix agent active/perf_counter_en["\Web Service(_Total)\Service Uptime"])<10m |
INFO | Manual close: YES |
IIS: {#APPPOOL} has been restarted | Uptime is less than 10 minutes. |
last(/IIS by Zabbix agent active/perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Current Application Pool Uptime"])<10m |
INFO | Manual close: YES |
IIS: Application pool {#APPPOOL} is not in Running state | - |
last(/IIS by Zabbix agent active/perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Current Application Pool State"])<>3 and {$IIS.APPPOOL.MONITORED:"{#APPPOOL}"}=1 |
HIGH | Depends on: - IIS: The World Wide Web Publishing Service (W3SVC) is not running |
IIS: Application pool {#APPPOOL} has been recycled | - |
last(/IIS by Zabbix agent active/perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Total Application Pool Recycles"],#1)<>last(/IIS by Zabbix agent active/perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Total Application Pool Recycles"],#2) and {$IIS.APPPOOL.MONITORED:"{#APPPOOL}"}=1 |
INFO | |
IIS: Request queue of {#APPPOOL} is too large | - |
min(/IIS by Zabbix agent active/perf_counter_en["\HTTP Service Request Queues({#APPPOOL})\CurrentQueueSize"],{$IIS.QUEUE.MAX.TIME})>{$IIS.QUEUE.MAX.WARN} |
WARNING | Depends on: - IIS: Application pool {#APPPOOL} is not in Running state |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor IIS (Internet Information Services) by Zabbix that works without any external scripts.
This template was tested on:
See Zabbix template operation for basic instructions.
You have to enable the following Windows Features (Control Panel > Programs and Features > Turn Windows features on or off) on your server
Web Server (IIS)
Web Server (IIS)\Management Tools\IIS Management Scripts and Tools
Optionally, it is possible to customize the template:
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$IIS.APPPOOL.MATCHES} | This macro is used in application pools discovery. Can be overridden on the host or linked template level. |
.+ |
{$IIS.APPPOOL.MONITORED} | Monitoring status for discovered application pools. Use context to avoid trigger firing for specific application pools. "1" - enabled, "0" - disabled. |
1 |
{$IIS.APPPOOL.NOT_MATCHES} | This macro is used in application pools discovery. Can be overridden on the host or linked template level. |
<CHANGE_IF_NEEDED> |
{$IIS.PORT} | Listening port. |
80 |
{$IIS.QUEUE.MAX.TIME} | The time during which the queue length may exceed the threshold. |
5m |
{$IIS.QUEUE.MAX.WARN} | Maximum application pool's request queue length for trigger expression. |
`` |
{$IIS.SERVICE} | The service (http/https/etc) for port check. See "net.tcp.service" documentation page for more information: https://www.zabbix.com/documentation/6.2/manual/config/items/itemtypes/simple_checks |
http |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Application pools discovery | - |
ZABBIX_PASSIVE | wmi.getall[root\webAdministration, select Name from ApplicationPool] Filter: AND - {#APPPOOL} NOT_MATCHES_REGEX - {#APPPOOL} MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
IIS | IIS: World Wide Web Publishing Service (W3SVC) state | The World Wide Web Publishing Service (W3SVC) provides web connectivity and administration of websites through the IIS snap-in. If the World Wide Web Publishing Service stops, the operating system cannot serve any form of web request. This service depends on the Windows Process Activation Service. |
ZABBIX_PASSIVE | service.info[W3SVC] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: Windows Process Activation Service (WAS) state | Windows Process Activation Service (WAS) is a tool for managing worker processes that contain applications that host Windows Communication Foundation (WCF) services. Worker processes handle requests that are sent to a Web Server for specific application pools. Each application pool sets boundaries for the applications it contains. |
ZABBIX_PASSIVE | service.info[WAS] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: {$IIS.PORT} port ping | - |
SIMPLE | net.tcp.service[{$IIS.SERVICE},,{$IIS.PORT}] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: Uptime | Service uptime in seconds. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Service Uptime"] |
IIS | IIS: Bytes Received per second | The average rate per minute at which data bytes are received by the service at the Application Layer. Does not include protocol headers or control bytes. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Bytes Received/sec", 60] |
IIS | IIS: Bytes Sent per second | The average rate per minute at which data bytes are sent by the service. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Bytes Sent/sec", 60] |
IIS | IIS: Bytes Total per second | The average rate per minute of total bytes/sec transferred by the Web service (sum of bytes sent/sec and bytes received/sec). |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Bytes Total/Sec", 60] |
IIS | IIS: Current connections | The number of active connections. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Current Connections"] |
IIS | IIS: Total connection attempts | The total number of connections to the Web or FTP service that have been attempted since service startup. The count is the total for all Web sites or FTP sites combined. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Total Connection Attempts (all instances)"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Connection attempts per second | The average rate per minute that connections using the Web service are being attempted. The count is the average for all Web sites combined. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Connection Attempts/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Anonymous users per second | The number of requests from users over an anonymous connection per second. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Anonymous Users/sec", 60] |
IIS | IIS: NonAnonymous users per second | The number of requests from users over a non-anonymous connection per second. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\NonAnonymous Users/sec", 60] |
IIS | IIS: Method GET requests per second | The rate of HTTP requests made using the GET method. GET requests are generally used for basic file retrievals or image maps, though they can be used with forms. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Get Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method COPY requests per second | The rate of HTTP requests made using the COPY method. Copy requests are used for copying files and directories. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Copy Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method CGI requests per second | The rate of CGI requests that are simultaneously being processed by the Web service. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\CGI Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method DELETE requests per second | The rate of HTTP requests made using the DELETE method. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Delete Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method HEAD requests per second | The rate of HTTP requests made using the HEAD method. HEAD requests generally indicate a client is querying the state of a document they already have, to see if it needs to be refreshed. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Head Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method ISAPI requests per second | The rate of ISAPI Extension requests that are simultaneously being processed by the Web service. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\ISAPI Extension Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method LOCK requests per second | The rate of HTTP requests made using the LOCK method. Lock requests are used to lock a file for one user so that only that user can modify the file. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Lock Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method MKCOL requests per second | The rate of HTTP requests made using the MKCOL method. Mkcol requests are used to create directories on the server. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Mkcol Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method MOVE requests per second | The rate of HTTP requests made using the MOVE method. Move requests are used for moving files and directories. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Move Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method OPTIONS requests per second | The rate of HTTP requests made using the OPTIONS method. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Options Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method POST requests per second | The rate of HTTP requests made using the POST method. Generally used for forms or gateway requests. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Post Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method PROPFIND requests per second | The rate of HTTP requests made using the PROPFIND method. Propfind requests retrieve property values on files and directories. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Propfind Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method PROPPATCH requests per second | The rate of HTTP requests made using the PROPPATCH method. Proppatch requests set property values on files and directories. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Proppatch Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method PUT requests per second | The rate of HTTP requests made using the PUT method. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Put Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method MS-SEARCH requests per second | The rate of HTTP requests made using the MS-SEARCH method. Search requests are used to query the server to find resources that match a set of conditions provided by the client. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Search Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method TRACE requests per second | The rate of HTTP requests made using the TRACE method. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Trace Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method UNLOCK requests per second | The rate of HTTP requests made using the UNLOCK method. Unlock requests are used to remove locks from files. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Unlock Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method Total requests per second | The rate of all HTTP requests received. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Total Method Requests/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Method Total Other requests per second | Total Other Request Methods is the number of HTTP requests that are not OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, MOVE, COPY, MKCOL, PROPFIND, PROPPATCH, SEARCH, LOCK or UNLOCK methods (since service startup). Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Other Request Methods/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Locked errors per second | The rate of errors due to requests that couldn't be satisfied by the server because the requested document was locked. These are generally reported as an HTTP 423 error code to the client. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Locked Errors/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Not Found errors per second | The rate of errors due to requests that couldn't be satisfied by the server because the requested document could not be found. These are generally reported to the client with HTTP error code 404. Average per minute. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service(_Total)\Not Found Errors/Sec", 60] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
IIS | IIS: Files cache hits percentage | The ratio of user-mode file cache hits to total cache requests (since service startup). Note: This value might be low if the Kernel URI cache hits percentage is high. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service Cache\File Cache Hits %"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: URIs cache hits percentage | The ratio of user-mode URI Cache Hits to total cache requests (since service startup). |
ZABBIX_PASSIVE | perf_counter_en["\Web Service Cache\URI Cache Hits %"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: File cache misses | The total number of unsuccessful lookups in the user-mode file cache since service startup. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service Cache\File Cache Misses"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: URI cache misses | The total number of unsuccessful lookups in the user-mode URI cache since service startup. |
ZABBIX_PASSIVE | perf_counter_en["\Web Service Cache\URI Cache Misses"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: {#APPPOOL} Uptime | The web application uptime period since the last restart. |
ZABBIX_PASSIVE | perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Current Application Pool Uptime"] |
IIS | IIS: AppPool {#APPPOOL} state | The state of the application pool. |
ZABBIX_PASSIVE | perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Current Application Pool State"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: AppPool {#APPPOOL} recycles | The number of times the application pool has been recycled since Windows Process Activation Service (WAS) started. |
ZABBIX_PASSIVE | perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Total Application Pool Recycles"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
IIS | IIS: AppPool {#APPPOOL} current queue size | The number of requests in the queue. |
ZABBIX_PASSIVE | perf_counter_en["\HTTP Service Request Queues({#APPPOOL})\CurrentQueueSize"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
IIS: The World Wide Web Publishing Service (W3SVC) is not running | The World Wide Web Publishing Service (W3SVC) is not in the running state. IIS cannot start. |
last(/IIS by Zabbix agent/service.info[W3SVC])<>0 |
HIGH | Depends on: - IIS: Windows Process Activation Service (WAS) is not running |
IIS: Windows Process Activation Service (WAS) is not running | Windows Process Activation Service (WAS) is not in the running state. IIS cannot start. |
last(/IIS by Zabbix agent/service.info[WAS])<>0 |
HIGH | |
IIS: Port {$IIS.PORT} is down | - |
last(/IIS by Zabbix agent/net.tcp.service[{$IIS.SERVICE},,{$IIS.PORT}])=0 |
AVERAGE | Manual close: YES Depends on: - IIS: The World Wide Web Publishing Service (W3SVC) is not running |
IIS: has been restarted | Uptime is less than 10 minutes. |
last(/IIS by Zabbix agent/perf_counter_en["\Web Service(_Total)\Service Uptime"])<10m |
INFO | Manual close: YES |
IIS: {#APPPOOL} has been restarted | Uptime is less than 10 minutes. |
last(/IIS by Zabbix agent/perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Current Application Pool Uptime"])<10m |
INFO | Manual close: YES |
IIS: Application pool {#APPPOOL} is not in Running state | - |
last(/IIS by Zabbix agent/perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Current Application Pool State"])<>3 and {$IIS.APPPOOL.MONITORED:"{#APPPOOL}"}=1 |
HIGH | Depends on: - IIS: The World Wide Web Publishing Service (W3SVC) is not running |
IIS: Application pool {#APPPOOL} has been recycled | - |
last(/IIS by Zabbix agent/perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Total Application Pool Recycles"],#1)<>last(/IIS by Zabbix agent/perf_counter_en["\APP_POOL_WAS({#APPPOOL})\Total Application Pool Recycles"],#2) and {$IIS.APPPOOL.MONITORED:"{#APPPOOL}"}=1 |
INFO | |
IIS: Request queue of {#APPPOOL} is too large | - |
min(/IIS by Zabbix agent/perf_counter_en["\HTTP Service Request Queues({#APPPOOL})\CurrentQueueSize"],{$IIS.QUEUE.MAX.TIME})>{$IIS.QUEUE.MAX.WARN} |
WARNING | Depends on: - IIS: Application pool {#APPPOOL} is not in Running state |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Services | HTTPS service is running | - |
SIMPLE | net.tcp.service[https] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
HTTPS service is down on {HOST.NAME} | - |
max(/HTTPS Service/net.tcp.service[https],#3)=0 |
AVERAGE |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Services | HTTP service is running | - |
SIMPLE | net.tcp.service[http] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
HTTP service is down on {HOST.NAME} | - |
max(/HTTP Service/net.tcp.service[http],#3)=0 |
AVERAGE |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
The template to monitor HAProxy by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template HAProxy by HTTP collects metrics by polling the HAProxy Stats Page with the HTTP agent remotely.
Note that this solution supports https and redirects.
This template was tested on:
See Zabbix template operation for basic instructions.
Set up the HAProxy Stats Page.
Example configuration of HAProxy:
frontend stats
bind *:8404
stats enable
stats uri /stats
stats refresh 10s
#stats auth Username:Password # Authentication credentials
If you use another location, don't forget to change the macros {$HAPROXY.STATS.SCHEME}, {HOST.CONN}, {$HAPROXY.STATS.PORT}, {$HAPROXY.STATS.PATH}.
If you want to use authentication, set the username and password in the "stats auth" option of the configuration file and in the macros {$HAPROXY.USERNAME},{$HAPROXY.PASSWORD}.
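For reference, the stats page also exports the same data as CSV (append ";csv" to the stats URI), which is an easy way to verify what the template will collect. A hedged sketch with placeholder host and credentials; the scur/slim fields are the ones behind the frontend session utilization item below:

```python
# Hedged sketch: fetch the HAProxy stats CSV export and compute the frontend
# session utilization (scur / slim * 100) the way the calculated item does.
# URL and credentials are placeholders.
import csv
import io
import requests

resp = requests.get(
    "http://haproxy.example.com:8404/stats;csv",
    auth=("Username", "Password"),  # only if "stats auth" is enabled
    timeout=10,
)
resp.raise_for_status()

# The header line starts with "# pxname,svname,..."; strip the comment marker.
reader = csv.DictReader(io.StringIO(resp.text.lstrip("# ")))
for row in reader:
    if row["svname"] == "FRONTEND" and row["slim"]:
        sutil = int(row["scur"]) / int(row["slim"]) * 100
        print(f'{row["pxname"]}: session utilization {sutil:.1f}%')
```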
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$HAPROXY.BACK_ERESP.MAX.WARN} | Maximum of responses with error on Backend for trigger expression. |
10 |
{$HAPROXY.BACK_QCUR.MAX.WARN} | Maximum number of requests on Backend unassigned in queue for trigger expression. |
10 |
{$HAPROXY.BACK_QTIME.MAX.WARN} | Maximum of average time spent in queue on Backend for trigger expression. |
10s |
{$HAPROXY.BACK_RTIME.MAX.WARN} | Maximum of average Backend response time for trigger expression. |
10s |
{$HAPROXY.FRONT_DREQ.MAX.WARN} | The HAProxy maximum denied requests for trigger expression. |
10 |
{$HAPROXY.FRONT_EREQ.MAX.WARN} | The HAProxy maximum number of request errors for trigger expression. |
10 |
{$HAPROXY.FRONT_SUTIL.MAX.WARN} | Maximum of session usage percentage on frontend for trigger expression. |
80 |
{$HAPROXY.PASSWORD} | The password of the HAProxy stats page. |
`` |
{$HAPROXY.RESPONSE_TIME.MAX.WARN} | The HAProxy stats page maximum response time in seconds for trigger expression. |
10s |
{$HAPROXY.SERVER_ERESP.MAX.WARN} | Maximum of responses with error on server for trigger expression. |
10 |
{$HAPROXY.SERVER_QCUR.MAX.WARN} | Maximum number of requests on server unassigned in queue for trigger expression. |
10 |
{$HAPROXY.SERVER_QTIME.MAX.WARN} | Maximum of average time spent in queue on server for trigger expression. |
10s |
{$HAPROXY.SERVER_RTIME.MAX.WARN} | Maximum of average server response time for trigger expression. |
10s |
{$HAPROXY.STATS.PATH} | The path of the HAProxy stats page. |
stats |
{$HAPROXY.STATS.PORT} | The port of the HAProxy stats host or container. |
8404 |
{$HAPROXY.STATS.SCHEME} | The scheme of the HAProxy stats page (http/https). |
http |
{$HAPROXY.USERNAME} | The username of the HAProxy stats page. |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info | ||
---|---|---|---|---|---|
Backend discovery | Discovery backends |
DEPENDENT | haproxy.backend.discovery Filter: AND - {#SVNAME} MATCHES_REGEX - {#MODE} MATCHES_REGEX http|tcp Overrides: Discard HTTP status codes - {#MODE} MATCHES_REGEX tcp - ITEM_PROTOTYPE LIKE Number of responses with codes - NO_DISCOVER |
Frontend discovery | Discovery frontends |
DEPENDENT | haproxy.frontend.discovery Filter: AND - {#SVNAME} MATCHES_REGEX - {#MODE} MATCHES_REGEX http|tcp Overrides: Discard HTTP status codes - {#MODE} MATCHES_REGEX tcp - ITEM_PROTOTYPE LIKE Number of responses with codes - NO_DISCOVER |
Server discovery | Discovery servers |
DEPENDENT | haproxy.server.discovery Filter: AND - {#SVNAME} NOT_MATCHES_REGEX FRONTEND|BACKEND - {#MODE} MATCHES_REGEX http|tcp Overrides: Discard HTTP status codes - {#MODE} MATCHES_REGEX tcp - ITEM_PROTOTYPE LIKE Number of responses with codes - NO_DISCOVER |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
HAProxy | HAProxy: Version | - |
DEPENDENT | haproxy.version Preprocessing: - REGEX: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy: Uptime | - |
DEPENDENT | haproxy.uptime Preprocessing: - JAVASCRIPT: |
HAProxy | HAProxy: Service status | - |
SIMPLE | net.tcp.service["{$HAPROXY.STATS.SCHEME}","{HOST.CONN}","{$HAPROXY.STATS.PORT}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy: Service response time | - |
SIMPLE | net.tcp.service.perf["{$HAPROXY.STATS.SCHEME}","{HOST.CONN}","{$HAPROXY.STATS.PORT}"] |
HAProxy | HAProxy Backend {#PXNAME}: Status | Possible values: UP - The server is reporting as healthy. DOWN - The server is reporting as unhealthy and unable to receive requests. NOLB - You've added http-check disable-on-404 to the backend and the health checked URL has returned an HTTP 404 response. MAINT - The server has been disabled or put into maintenance mode. DRAIN - The server has been put into drain mode. no check - Health checks are not enabled for this server. |
DEPENDENT | haproxy.backend.status[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
HAProxy | HAProxy Backend {#PXNAME}: Responses time | Average backend response time (in ms) for the last 1,024 requests |
DEPENDENT | haproxy.backend.rtime[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: |
HAProxy | HAProxy Backend {#PXNAME}: Errors connection per second | Number of requests that encountered an error attempting to connect to a backend server. |
DEPENDENT | haproxy.backend.econ.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
HAProxy | HAProxy Backend {#PXNAME}: Responses denied per second | Responses denied due to security concerns (ACL-restricted). |
DEPENDENT | haproxy.backend.dresp.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
HAProxy | HAProxy Backend {#PXNAME}: Response errors per second | Number of requests whose responses yielded an error |
DEPENDENT | haproxy.backend.eresp.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGEPERSECOND |
HAProxy | HAProxy Backend {#PXNAME}: Unassigned requests | Current number of requests unassigned in queue. |
DEPENDENT | haproxy.backend.qcur[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: |
HAProxy | HAProxy Backend {#PXNAME}: Time in queue | Average time spent in queue (in ms) for the last 1,024 requests |
DEPENDENT | haproxy.backend.qtime[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: |
HAProxy | HAProxy Backend {#PXNAME}: Redispatched requests per second | Number of times a request was redispatched to a different backend. |
DEPENDENT | haproxy.backend.wredis.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Retried connections per second | Number of times a connection was retried. |
DEPENDENT | haproxy.backend.wretr.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Number of responses with codes 1xx per second | Number of informational HTTP responses per second. |
DEPENDENT | haproxy.backend.hrsp_1xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Number of responses with codes 2xx per second | Number of successful HTTP responses per second. |
DEPENDENT | haproxy.backend.hrsp_2xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Number of responses with codes 3xx per second | Number of HTTP redirections per second. |
DEPENDENT | haproxy.backend.hrsp_3xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Number of responses with codes 4xx per second | Number of HTTP client errors per second. |
DEPENDENT | haproxy.backend.hrsp_4xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Number of responses with codes 5xx per second | Number of HTTP server errors per second. |
DEPENDENT | haproxy.backend.hrsp_5xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Incoming traffic | Number of bits received by the backend |
DEPENDENT | haproxy.backend.bin.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Outgoing traffic | Number of bits sent by the backend |
DEPENDENT | haproxy.backend.bout.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Number of active servers | Number of active servers. |
DEPENDENT | haproxy.backend.act[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy Backend {#PXNAME}: Number of backup servers | Number of backup servers. |
DEPENDENT | haproxy.backend.bck[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy Backend {#PXNAME}: Sessions per second | Cumulative number of sessions (end-to-end connections) per second. |
DEPENDENT | haproxy.backend.stot.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Weight | Total effective weight. |
DEPENDENT | haproxy.backend.weight[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy Frontend {#PXNAME}: Status | Possible values: OPEN, STOP. When Status is OPEN, the frontend is operating normally and ready to receive traffic. |
DEPENDENT | haproxy.frontend.status[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy Frontend {#PXNAME}: Requests rate | HTTP requests per second |
DEPENDENT | haproxy.frontend.req_rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: |
HAProxy | HAProxy Frontend {#PXNAME}: Sessions rate | Number of sessions created per second |
DEPENDENT | haproxy.frontend.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: |
HAProxy | HAProxy Frontend {#PXNAME}: Established sessions | The current number of established sessions. |
DEPENDENT | haproxy.frontend.scur[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: |
HAProxy | HAProxy Frontend {#PXNAME}: Session limits | The most simultaneous sessions that are allowed, as defined by the maxconn setting in the frontend. |
DEPENDENT | haproxy.frontend.slim[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy Frontend {#PXNAME}: Session utilization | Percentage of sessions used (scur / slim * 100). |
CALCULATED | haproxy.frontend.sutil[{#PXNAME},{#SVNAME}] Expression: last(//haproxy.frontend.scur[{#PXNAME},{#SVNAME}]) / last(//haproxy.frontend.slim[{#PXNAME},{#SVNAME}]) * 100 |
HAProxy | HAProxy Frontend {#PXNAME}: Request errors per second | Number of request errors per second. |
DEPENDENT | haproxy.frontend.ereq.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Denied requests per second | Requests denied due to security concerns (ACL-restricted) per second. |
DEPENDENT | haproxy.frontend.dreq.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Number of responses with codes 1xx per second | Number of informational HTTP responses per second. |
DEPENDENT | haproxy.frontend.hrsp_1xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Number of responses with codes 2xx per second | Number of successful HTTP responses per second. |
DEPENDENT | haproxy.frontend.hrsp_2xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Number of responses with codes 3xx per second | Number of HTTP redirections per second. |
DEPENDENT | haproxy.frontend.hrsp_3xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Number of responses with codes 4xx per second | Number of HTTP client errors per second. |
DEPENDENT | haproxy.frontend.hrsp_4xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Number of responses with codes 5xx per second | Number of HTTP server errors per second. |
DEPENDENT | haproxy.frontend.hrsp_5xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Incoming traffic | Number of bits received by the frontend |
DEPENDENT | haproxy.frontend.bin.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Outgoing traffic | Number of bits sent by the frontend |
DEPENDENT | haproxy.frontend.bout.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Status | DEPENDENT | haproxy.server.status[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
|
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Responses time | Average server response time (in ms) for the last 1,024 requests. |
DEPENDENT | haproxy.server.rtime[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Errors connection per second | Number of requests that encountered an error attempting to connect to a backend server. |
DEPENDENT | haproxy.server.econ.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Responses denied per second | Responses denied due to security concerns (ACL-restricted). |
DEPENDENT | haproxy.server.dresp.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Response errors per second | Number of requests whose responses yielded an error. |
DEPENDENT | haproxy.server.eresp.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Unassigned requests | Current number of requests unassigned in queue. |
DEPENDENT | haproxy.server.qcur[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Time in queue | Average time spent in queue (in ms) for the last 1,024 requests. |
DEPENDENT | haproxy.server.qtime[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Redispatched requests per second | Number of times a request was redispatched to a different backend. |
DEPENDENT | haproxy.server.wredis.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Retried connections per second | Number of times a connection was retried. |
DEPENDENT | haproxy.server.wretr.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Number of responses with codes 1xx per second | Number of informational HTTP responses per second. |
DEPENDENT | haproxy.server.hrsp_1xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Number of responses with codes 2xx per second | Number of successful HTTP responses per second. |
DEPENDENT | haproxy.server.hrsp_2xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Number of responses with codes 3xx per second | Number of HTTP redirections per second. |
DEPENDENT | haproxy.server.hrsp_3xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Number of responses with codes 4xx per second | Number of HTTP client errors per second. |
DEPENDENT | haproxy.server.hrsp_4xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Number of responses with codes 5xx per second | Number of HTTP server errors per second. |
DEPENDENT | haproxy.server.hrsp_5xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Incoming traffic | Number of bits received by the backend |
DEPENDENT | haproxy.server.bin.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Outgoing traffic | Number of bits sent by the backend |
DEPENDENT | haproxy.server.bout.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Server is active | Shows whether the server is active (marked with a Y) or a backup (marked with a -). |
DEPENDENT | haproxy.server.act[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Server is backup | Shows whether the server is a backup (marked with a Y) or active (marked with a -). |
DEPENDENT | haproxy.server.bck[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Sessions per second | Cumulative number of sessions (end-to-end connections) per second. |
DEPENDENT | haproxy.server.stot.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Weight | Effective weight. |
DEPENDENT | haproxy.server.weight[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Configured maxqueue | Configured maxqueue for the server, or nothing if the value is 0 (default, meaning no limit). |
DEPENDENT | haproxy.server.qlimit[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: - MATCHES_REGEX: ⛔️ON_FAIL: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Server was selected per second | Number of times that server was selected. |
DEPENDENT | haproxy.server.lbtot.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Status of last health check | Status of last health check, one of: UNK -> unknown INI -> initializing SOCKERR -> socket error L4OK -> check passed on layer 4, no upper layers testing enabled L4TOUT -> layer 1-4 timeout L4CON -> layer 1-4 connection problem, for example "Connection refused" (tcp rst) or "No route to host" (icmp) L6OK -> check passed on layer 6 L6TOUT -> layer 6 (SSL) timeout L6RSP -> layer 6 invalid response - protocol error L7OK -> check passed on layer 7 L7OKC -> check conditionally passed on layer 7, for example 404 with disable-on-404 L7TOUT -> layer 7 (HTTP/SMTP) timeout L7RSP -> layer 7 invalid response - protocol error L7STS -> layer 7 response error, for example HTTP 5xx Notice: If a check is currently running, the last known status will be reported, prefixed with "* ", e.g. "* L7OK". |
DEPENDENT | haproxy.server.check_status[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
Zabbix raw items | HAProxy: Get stats | HAProxy Statistics Report in CSV format |
HTTP_AGENT | haproxy.get Preprocessing: - REGEX: - CSV_TO_JSON: |
Zabbix raw items | HAProxy: Get nodes | Array for LLD rules. |
DEPENDENT | haproxy.get.nodes Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Zabbix raw items | HAProxy: Get stats page | HAProxy Statistics Report HTML |
HTTP_AGENT | haproxy.get_html |
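The raw items above drive everything else: the template fetches the stats page once in CSV form, and the REGEX plus CSV_TO_JSON preprocessing steps turn it into the JSON array the dependent items read. As a rough illustration of that pipeline outside Zabbix, here is a minimal Python sketch; the URL is an assumption standing in for the {$HAPROXY.STATS.SCHEME} and {$HAPROXY.STATS.PORT} macro values.

```python
import csv
import io
import urllib.request

# Assumed endpoint; substitute your {$HAPROXY.STATS.SCHEME}, host,
# {$HAPROXY.STATS.PORT} and stats path.
URL = "http://haproxy.example.com:8404/stats;csv"

raw = urllib.request.urlopen(URL, timeout=10).read().decode()

# The CSV export starts with "# pxname,svname,..."; strip the leading
# "# " the way the template's REGEX step does before CSV_TO_JSON.
text = raw[2:] if raw.startswith("# ") else raw

rows = list(csv.DictReader(io.StringIO(text)))
for row in rows:
    # Each dict mirrors the JSON objects the dependent items consume,
    # e.g. row["pxname"], row["svname"], row["status"], row["hrsp_5xx"].
    print(row["pxname"], row["svname"], row.get("status"))
```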
Name | Description | Expression | Severity | Dependencies and additional info | |
---|---|---|---|---|---|
HAProxy: Version has changed | HAProxy version has changed. Ack to close. |
last(/HAProxy by HTTP/haproxy.version,#1)<>last(/HAProxy by HTTP/haproxy.version,#2) and length(last(/HAProxy by HTTP/haproxy.version))>0 |
INFO | Manual close: YES |
|
HAProxy: has been restarted | Uptime is less than 10 minutes |
last(/HAProxy by HTTP/haproxy.uptime)<10m |
INFO | Manual close: YES |
|
HAProxy: Service is down | - |
last(/HAProxy by HTTP/net.tcp.service["{$HAPROXY.STATS.SCHEME}","{HOST.CONN}","{$HAPROXY.STATS.PORT}"])=0 |
AVERAGE | Manual close: YES |
|
HAProxy: Service response time is too high | - |
min(/HAProxy by HTTP/net.tcp.service.perf["{$HAPROXY.STATS.SCHEME}","{HOST.CONN}","{$HAPROXY.STATS.PORT}"],5m)>{$HAPROXY.RESPONSE_TIME.MAX.WARN} |
WARNING | Manual close: YES Depends on: - HAProxy: Service is down |
|
HAProxy backend {#PXNAME}: Server is DOWN | Backend is not available. |
count(/HAProxy by HTTP/haproxy.backend.status[{#PXNAME},{#SVNAME}],#5,"eq","DOWN")=5 |
AVERAGE | ||
HAProxy backend {#PXNAME}: Average response time is high | Average backend response time (in ms) for the last 1,024 requests is more than {$HAPROXY.BACK_RTIME.MAX.WARN}. |
min(/HAProxy by HTTP/haproxy.backend.rtime[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.BACK_RTIME.MAX.WARN} |
WARNING | ||
HAProxy backend {#PXNAME}: Number of responses with error is high | Number of requests on backend, whose responses yielded an error, is more than {$HAPROXY.BACK_ERESP.MAX.WARN}. |
min(/HAProxy by HTTP/haproxy.backend.eresp.rate[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.BACK_ERESP.MAX.WARN} |
WARNING | ||
HAProxy backend {#PXNAME}: Current number of requests unassigned in queue is high | Current number of requests on backend unassigned in queue is more than {$HAPROXY.BACK_QCUR.MAX.WARN}. |
min(/HAProxy by HTTP/haproxy.backend.qcur[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.BACK_QCUR.MAX.WARN} |
WARNING | ||
HAProxy backend {#PXNAME}: Average time spent in queue is high | Average time spent in queue (in ms) for the last 1,024 requests is more than {$HAPROXY.BACK_QTIME.MAX.WARN}. |
min(/HAProxy by HTTP/haproxy.backend.qtime[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.BACK_QTIME.MAX.WARN} |
WARNING | ||
HAProxy frontend {#PXNAME}: Session utilization is high | Alerting on this metric is essential to ensure your server has sufficient capacity to handle all concurrent sessions. Unlike requests, upon reaching the session limit HAProxy will deny additional clients until resource consumption drops. Furthermore, if you find your session usage percentage to be hovering above 80%, it could be time to either modify HAProxy's configuration to allow more sessions, or migrate your HAProxy server to a bigger box. |
min(/HAProxy by HTTP/haproxy.frontend.sutil[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.FRONT_SUTIL.MAX.WARN} |
WARNING | ||
HAProxy frontend {#PXNAME}: Number of request errors is high | Number of request errors is more than {$HAPROXY.FRONT_EREQ.MAX.WARN}. |
min(/HAProxy by HTTP/haproxy.frontend.ereq.rate[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.FRONT_EREQ.MAX.WARN} |
WARNING | ||
HAProxy frontend {#PXNAME}: Number of requests denied is high | Number of requests denied due to security concerns (ACL-restricted) is more than {$HAPROXY.FRONT_DREQ.MAX.WARN}. |
min(/HAProxy by HTTP/haproxy.frontend.dreq.rate[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.FRONT_DREQ.MAX.WARN} |
WARNING | ||
HAProxy {#PXNAME} {#SVNAME}: Server is DOWN | Server is not available. |
count(/HAProxy by HTTP/haproxy.server.status[{#PXNAME},{#SVNAME}],#5,"eq","DOWN")=5 |
WARNING | ||
HAProxy {#PXNAME} {#SVNAME}: Average response time is high | Average server response time (in ms) for the last 1,024 requests is more than {$HAPROXY.SERVER_RTIME.MAX.WARN}. |
min(/HAProxy by HTTP/haproxy.server.rtime[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.SERVER_RTIME.MAX.WARN} |
WARNING | ||
HAProxy {#PXNAME} {#SVNAME}: Number of responses with error is high | Number of requests on server, whose responses yielded an error, is more than {$HAPROXY.SERVER_ERESP.MAX.WARN}. |
min(/HAProxy by HTTP/haproxy.server.eresp.rate[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.SERVER_ERESP.MAX.WARN} |
WARNING | ||
HAProxy {#PXNAME} {#SVNAME}: Current number of requests unassigned in queue is high | Current number of requests unassigned in queue is more than {$HAPROXY.SERVER_QCUR.MAX.WARN}. |
min(/HAProxy by HTTP/haproxy.server.qcur[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.SERVER_QCUR.MAX.WARN} |
WARNING | ||
HAProxy {#PXNAME} {#SVNAME}: Average time spent in queue is high | Average time spent in queue (in ms) for the last 1,024 requests is more than {$HAPROXY.SERVER_QTIME.MAX.WARN}. |
min(/HAProxy by HTTP/haproxy.server.qtime[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.SERVER_QTIME.MAX.WARN} |
WARNING | ||
HAProxy {#PXNAME} {#SVNAME}: Health check error | Please check the server for faults. |
`find(/HAProxy by HTTP/haproxy.server.check_status[{#PXNAME},{#SVNAME}],#3,"regexp","(?:L[4-7]OK|^$)")=0` | WARNING | Depends on: - HAProxy {#PXNAME} {#SVNAME}: Server is DOWN |
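The health-check trigger above counts a value as healthy only if it matches the regular expression (?:L[4-7]OK|^$), i.e. an L4-L7 "OK" status or an empty string (checks disabled). A small sketch of that matching logic, with illustrative sample values:

```python
import re

# The trigger's pattern: healthy when the status is L4OK..L7OK or empty.
CHECK_OK = re.compile(r"(?:L[4-7]OK|^$)")

# Illustrative sample values; "* L7OK" is a check still in progress.
for status in ["L7OK", "* L7OK", "L4TOUT", "L7STS", ""]:
    healthy = bool(CHECK_OK.search(status))
    # find(...)=0 in the trigger means "no match" -> counts as a problem.
    print(f"{status!r}: {'ok' if healthy else 'problem'}")
```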
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor HAProxy by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template HAProxy by Zabbix agent collects metrics by polling the HAProxy Stats Page with Zabbix agent.
Note that this solution supports HTTPS and redirects.
This template was tested on:
See Zabbix template operation for basic instructions.
Set up the HAProxy Stats Page.
Example configuration of HAProxy:
frontend stats
bind *:8404
stats enable
stats uri /stats
stats refresh 10s
If you use another location, don't forget to change the macros {$HAPROXY.STATS.SCHEME}, {$HAPROXY.STATS.PORT}, {$HAPROXY.STATS.PATH}.
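Before linking the template, it can be worth confirming that both the HTML report and its CSV export respond at the configured location. A minimal sketch, assuming the example bind address above (adjust the values to your macros):

```python
import urllib.request

# Assumed values; adjust to your {$HAPROXY.STATS.SCHEME},
# {$HAPROXY.STATS.PORT} and {$HAPROXY.STATS.PATH} macros.
scheme, host, port, path = "http", "127.0.0.1", 8404, "stats"

# Check both the HTML report and the CSV export the items rely on.
for suffix in ("", ";csv"):
    url = f"{scheme}://{host}:{port}/{path}{suffix}"
    with urllib.request.urlopen(url, timeout=5) as resp:
        print(url, "->", resp.status)
```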
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$HAPROXY.BACK_ERESP.MAX.WARN} | Maximum of responses with error on BACKEND for trigger expression. |
10 |
{$HAPROXY.BACK_QCUR.MAX.WARN} | Maximum number of requests on BACKEND unassigned in queue for trigger expression. |
10 |
{$HAPROXY.BACK_QTIME.MAX.WARN} | Maximum of average time spent in queue on BACKEND for trigger expression. |
10s |
{$HAPROXY.BACK_RTIME.MAX.WARN} | Maximum of average BACKEND response time for trigger expression. |
10s |
{$HAPROXY.FRONT_DREQ.MAX.WARN} | The HAProxy maximum denied requests for trigger expression. |
10 |
{$HAPROXY.FRONT_EREQ.MAX.WARN} | The HAProxy maximum number of request errors for trigger expression. |
10 |
{$HAPROXY.FRONT_SUTIL.MAX.WARN} | Maximum of session usage percentage on frontend for trigger expression. |
80 |
{$HAPROXY.RESPONSE_TIME.MAX.WARN} | The HAProxy stats page maximum response time in seconds for trigger expression. |
10s |
{$HAPROXY.SERVER_ERESP.MAX.WARN} | Maximum of responses with error on server for trigger expression. |
10 |
{$HAPROXY.SERVER_QCUR.MAX.WARN} | Maximum number of requests on server unassigned in queue for trigger expression. |
10 |
{$HAPROXY.SERVER_QTIME.MAX.WARN} | Maximum of average time spent in queue on server for trigger expression. |
10s |
{$HAPROXY.SERVER_RTIME.MAX.WARN} | Maximum of average server response time for trigger expression. |
10s |
{$HAPROXY.STATS.PATH} | The path of HAProxy stats page. |
stats |
{$HAPROXY.STATS.PORT} | The port of the HAProxy stats host or container. |
8404 |
{$HAPROXY.STATS.SCHEME} | The scheme of the HAProxy stats page (http/https). |
http |
There are no template links in this template.
Name | Description | Type | Key and additional info | ||
---|---|---|---|---|---|
Backend discovery | Discovery of backends. |
DEPENDENT | haproxy.backend.discovery Filter: AND - {#SVNAME} MATCHES_REGEX - {#MODE} MATCHES_REGEX `http|tcp` Overrides: Discard HTTP status codes - {#MODE} MATCHES_REGEX `tcp` - ITEM_PROTOTYPE LIKE `Number of responses with codes` - NO_DISCOVER |
Frontend discovery | Discovery of frontends. |
DEPENDENT | haproxy.frontend.discovery Filter: AND - {#SVNAME} MATCHES_REGEX - {#MODE} MATCHES_REGEX `http|tcp` Overrides: Discard HTTP status codes - {#MODE} MATCHES_REGEX `tcp` - ITEM_PROTOTYPE LIKE `Number of responses with codes` - NO_DISCOVER |
Server discovery | Discovery of servers. |
DEPENDENT | haproxy.server.discovery Filter: AND - {#SVNAME} NOT_MATCHES_REGEX `FRONTEND|BACKEND` - {#MODE} MATCHES_REGEX `http|tcp` Overrides: Discard HTTP status codes - {#MODE} MATCHES_REGEX `tcp` - ITEM_PROTOTYPE LIKE `Number of responses with codes` - NO_DISCOVER |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
HAProxy | HAProxy: Version | - |
DEPENDENT | haproxy.version Preprocessing: - REGEX: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy: Uptime | - |
DEPENDENT | haproxy.uptime Preprocessing: - JAVASCRIPT: |
HAProxy | HAProxy: Service status | - |
ZABBIX_PASSIVE | net.tcp.service["{$HAPROXY.STATS.SCHEME}","{HOST.CONN}","{$HAPROXY.STATS.PORT}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy: Service response time | - |
ZABBIX_PASSIVE | net.tcp.service.perf["{$HAPROXY.STATS.SCHEME}","{HOST.CONN}","{$HAPROXY.STATS.PORT}"] |
HAProxy | HAProxy Backend {#PXNAME}: Status | Possible values: UP - The server is reporting as healthy. DOWN - The server is reporting as unhealthy and unable to receive requests. NOLB - You've added http-check disable-on-404 to the backend and the health checked URL has returned an HTTP 404 response. MAINT - The server has been disabled or put into maintenance mode. DRAIN - The server has been put into drain mode. no check - Health checks are not enabled for this server. |
DEPENDENT | haproxy.backend.status[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy Backend {#PXNAME}: Responses time | Average backend response time (in ms) for the last 1,024 requests |
DEPENDENT | haproxy.backend.rtime[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: |
HAProxy | HAProxy Backend {#PXNAME}: Errors connection per second | Number of requests that encountered an error attempting to connect to a backend server. |
DEPENDENT | haproxy.backend.econ.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Responses denied per second | Responses denied due to security concerns (ACL-restricted). |
DEPENDENT | haproxy.backend.dresp.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Response errors per second | Number of requests whose responses yielded an error |
DEPENDENT | haproxy.backend.eresp.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Unassigned requests | Current number of requests unassigned in queue. |
DEPENDENT | haproxy.backend.qcur[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: |
HAProxy | HAProxy Backend {#PXNAME}: Time in queue | Average time spent in queue (in ms) for the last 1,024 requests |
DEPENDENT | haproxy.backend.qtime[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: |
HAProxy | HAProxy Backend {#PXNAME}: Redispatched requests per second | Number of times a request was redispatched to a different backend. |
DEPENDENT | haproxy.backend.wredis.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Retried connections per second | Number of times a connection was retried. |
DEPENDENT | haproxy.backend.wretr.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Number of responses with codes 1xx per second | Number of informational HTTP responses per second. |
DEPENDENT | haproxy.backend.hrsp_1xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Number of responses with codes 2xx per second | Number of successful HTTP responses per second. |
DEPENDENT | haproxy.backend.hrsp_2xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Number of responses with codes 3xx per second | Number of HTTP redirections per second. |
DEPENDENT | haproxy.backend.hrsp_3xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Number of responses with codes 4xx per second | Number of HTTP client errors per second. |
DEPENDENT | haproxy.backend.hrsp_4xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Number of responses with codes 5xx per second | Number of HTTP server errors per second. |
DEPENDENT | haproxy.backend.hrsp_5xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Incoming traffic | Number of bits received by the backend |
DEPENDENT | haproxy.backend.bin.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Outgoing traffic | Number of bits sent by the backend |
DEPENDENT | haproxy.backend.bout.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Number of active servers | Number of active servers. |
DEPENDENT | haproxy.backend.act[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy Backend {#PXNAME}: Number of backup servers | Number of backup servers. |
DEPENDENT | haproxy.backend.bck[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy Backend {#PXNAME}: Sessions per second | Cumulative number of sessions (end-to-end connections) per second. |
DEPENDENT | haproxy.backend.stot.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Backend {#PXNAME}: Weight | Total effective weight. |
DEPENDENT | haproxy.backend.weight[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy Frontend {#PXNAME}: Status | Possible values: OPEN, STOP. When Status is OPEN, the frontend is operating normally and ready to receive traffic. |
DEPENDENT | haproxy.frontend.status[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy Frontend {#PXNAME}: Requests rate | HTTP requests per second |
DEPENDENT | haproxy.frontend.req_rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: |
HAProxy | HAProxy Frontend {#PXNAME}: Sessions rate | Number of sessions created per second |
DEPENDENT | haproxy.frontend.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: |
HAProxy | HAProxy Frontend {#PXNAME}: Established sessions | The current number of established sessions. |
DEPENDENT | haproxy.frontend.scur[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: |
HAProxy | HAProxy Frontend {#PXNAME}: Session limits | The most simultaneous sessions that are allowed, as defined by the maxconn setting in the frontend. |
DEPENDENT | haproxy.frontend.slim[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy Frontend {#PXNAME}: Session utilization | Percentage of sessions used (scur / slim * 100). |
CALCULATED | haproxy.frontend.sutil[{#PXNAME},{#SVNAME}] Expression: last(//haproxy.frontend.scur[{#PXNAME},{#SVNAME}]) / last(//haproxy.frontend.slim[{#PXNAME},{#SVNAME}]) * 100 |
HAProxy | HAProxy Frontend {#PXNAME}: Request errors per second | Number of request errors per second. |
DEPENDENT | haproxy.frontend.ereq.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Denied requests per second | Requests denied due to security concerns (ACL-restricted) per second. |
DEPENDENT | haproxy.frontend.dreq.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Number of responses with codes 1xx per second | Number of informational HTTP responses per second. |
DEPENDENT | haproxy.frontend.hrsp_1xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Number of responses with codes 2xx per second | Number of successful HTTP responses per second. |
DEPENDENT | haproxy.frontend.hrsp_2xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Number of responses with codes 3xx per second | Number of HTTP redirections per second. |
DEPENDENT | haproxy.frontend.hrsp_3xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Number of responses with codes 4xx per second | Number of HTTP client errors per second. |
DEPENDENT | haproxy.frontend.hrsp_4xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Number of responses with codes 5xx per second | Number of HTTP server errors per second. |
DEPENDENT | haproxy.frontend.hrsp_5xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Incoming traffic | Number of bits received by the frontend |
DEPENDENT | haproxy.frontend.bin.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
HAProxy | HAProxy Frontend {#PXNAME}: Outgoing traffic | Number of bits sent by the frontend |
DEPENDENT | haproxy.frontend.bout.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Status | DEPENDENT | haproxy.server.status[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
|
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Responses time | Average server response time (in ms) for the last 1,024 requests. |
DEPENDENT | haproxy.server.rtime[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Errors connection per second | Number of requests that encountered an error attempting to connect to a backend server. |
DEPENDENT | haproxy.server.econ.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Responses denied per second | Responses denied due to security concerns (ACL-restricted). |
DEPENDENT | haproxy.server.dresp.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Response errors per second | Number of requests whose responses yielded an error. |
DEPENDENT | haproxy.server.eresp.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Unassigned requests | Current number of requests unassigned in queue. |
DEPENDENT | haproxy.server.qcur[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Time in queue | Average time spent in queue (in ms) for the last 1,024 requests. |
DEPENDENT | haproxy.server.qtime[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Redispatched requests per second | Number of times a request was redispatched to a different backend. |
DEPENDENT | haproxy.server.wredis.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Retried connections per second | Number of times a connection was retried. |
DEPENDENT | haproxy.server.wretr.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Number of responses with codes 1xx per second | Number of informational HTTP responses per second. |
DEPENDENT | haproxy.server.hrsp_1xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Number of responses with codes 2xx per second | Number of successful HTTP responses per second. |
DEPENDENT | haproxy.server.hrsp_2xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Number of responses with codes 3xx per second | Number of HTTP redirections per second. |
DEPENDENT | haproxy.server.hrsp_3xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Number of responses with codes 4xx per second | Number of HTTP client errors per second. |
DEPENDENT | haproxy.server.hrsp_4xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Number of responses with codes 5xx per second | Number of HTTP server errors per second. |
DEPENDENT | haproxy.server.hrsp_5xx.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Incoming traffic | Number of bits received by the backend |
DEPENDENT | haproxy.server.bin.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Outgoing traffic | Number of bits sent by the backend |
DEPENDENT | haproxy.server.bout.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Server is active | Shows whether the server is active (marked with a Y) or a backup (marked with a -). |
DEPENDENT | haproxy.server.act[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Server is backup | Shows whether the server is a backup (marked with a Y) or active (marked with a -). |
DEPENDENT | haproxy.server.bck[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Sessions per second | Cumulative number of sessions (end-to-end connections) per second. |
DEPENDENT | haproxy.server.stot.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Weight | Effective weight. |
DEPENDENT | haproxy.server.weight[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Configured maxqueue | Configured maxqueue for the server, or nothing if the value is 0 (default, meaning no limit). |
DEPENDENT | haproxy.server.qlimit[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: - MATCHES_REGEX: ⛔️ON_FAIL: |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Server was selected per second | Number of times that server was selected. |
DEPENDENT | haproxy.server.lbtot.rate[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
HAProxy | HAProxy {#PXNAME} {#SVNAME}: Status of last health check | Status of last health check, one of: UNK -> unknown INI -> initializing SOCKERR -> socket error L4OK -> check passed on layer 4, no upper layers testing enabled L4TOUT -> layer 1-4 timeout L4CON -> layer 1-4 connection problem, for example "Connection refused" (tcp rst) or "No route to host" (icmp) L6OK -> check passed on layer 6 L6TOUT -> layer 6 (SSL) timeout L6RSP -> layer 6 invalid response - protocol error L7OK -> check passed on layer 7 L7OKC -> check conditionally passed on layer 7, for example 404 with disable-on-404 L7TOUT -> layer 7 (HTTP/SMTP) timeout L7RSP -> layer 7 invalid response - protocol error L7STS -> layer 7 response error, for example HTTP 5xx Notice: If a check is currently running, the last known status will be reported, prefixed with "* ", e.g. "* L7OK". |
DEPENDENT | haproxy.server.check_status[{#PXNAME},{#SVNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 10m |
Zabbix raw items | HAProxy: Get stats | HAProxy Statistics Report in CSV format |
ZABBIX_PASSIVE | web.page.get["{$HAPROXY.STATS.SCHEME}://{HOST.CONN}:{$HAPROXY.STATS.PORT}/{$HAPROXY.STATS.PATH};csv"] Preprocessing: - REGEX: - CSV_TO_JSON: |
Zabbix raw items | HAProxy: Get nodes | Array for LLD rules. |
DEPENDENT | haproxy.get.nodes Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Zabbix raw items | HAProxy: Get stats page | HAProxy Statistics Report HTML |
ZABBIX_PASSIVE | web.page.get["{$HAPROXY.STATS.SCHEME}://{HOST.CONN}:{$HAPROXY.STATS.PORT}/{$HAPROXY.STATS.PATH}"] |
Name | Description | Expression | Severity | Dependencies and additional info | |
---|---|---|---|---|---|
HAProxy: Version has changed | HAProxy version has changed. Ack to close. |
last(/HAProxy by Zabbix agent/haproxy.version,#1)<>last(/HAProxy by Zabbix agent/haproxy.version,#2) and length(last(/HAProxy by Zabbix agent/haproxy.version))>0 |
INFO | Manual close: YES |
|
HAProxy: has been restarted | Uptime is less than 10 minutes |
last(/HAProxy by Zabbix agent/haproxy.uptime)<10m |
INFO | Manual close: YES |
|
HAProxy: Service is down | - |
last(/HAProxy by Zabbix agent/net.tcp.service["{$HAPROXY.STATS.SCHEME}","{HOST.CONN}","{$HAPROXY.STATS.PORT}"])=0 |
AVERAGE | Manual close: YES |
|
HAProxy: Service response time is too high | - |
min(/HAProxy by Zabbix agent/net.tcp.service.perf["{$HAPROXY.STATS.SCHEME}","{HOST.CONN}","{$HAPROXY.STATS.PORT}"],5m)>{$HAPROXY.RESPONSE_TIME.MAX.WARN} |
WARNING | Manual close: YES Depends on: - HAProxy: Service is down |
|
HAProxy backend {#PXNAME}: Server is DOWN | Backend is not available. |
count(/HAProxy by Zabbix agent/haproxy.backend.status[{#PXNAME},{#SVNAME}],#5,"eq","DOWN")=5 |
AVERAGE | ||
HAProxy backend {#PXNAME}: Average response time is high | Average backend response time (in ms) for the last 1,024 requests is more than {$HAPROXY.BACK_RTIME.MAX.WARN}. |
min(/HAProxy by Zabbix agent/haproxy.backend.rtime[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.BACK_RTIME.MAX.WARN} |
WARNING | ||
HAProxy backend {#PXNAME}: Number of responses with error is high | Number of requests on backend, whose responses yielded an error, is more than {$HAPROXY.BACK_ERESP.MAX.WARN}. |
min(/HAProxy by Zabbix agent/haproxy.backend.eresp.rate[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.BACK_ERESP.MAX.WARN} |
WARNING | ||
HAProxy backend {#PXNAME}: Current number of requests unassigned in queue is high | Current number of requests on backend unassigned in queue is more than {$HAPROXY.BACK_QCUR.MAX.WARN}. |
min(/HAProxy by Zabbix agent/haproxy.backend.qcur[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.BACK_QCUR.MAX.WARN} |
WARNING | ||
HAProxy backend {#PXNAME}: Average time spent in queue is high | Average time spent in queue (in ms) for the last 1,024 requests is more than {$HAPROXY.BACK_QTIME.MAX.WARN}. |
min(/HAProxy by Zabbix agent/haproxy.backend.qtime[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.BACK_QTIME.MAX.WARN} |
WARNING | ||
HAProxy frontend {#PXNAME}: Session utilization is high | Alerting on this metric is essential to ensure your server has sufficient capacity to handle all concurrent sessions. Unlike requests, upon reaching the session limit HAProxy will deny additional clients until resource consumption drops. Furthermore, if you find your session usage percentage to be hovering above 80%, it could be time to either modify HAProxy's configuration to allow more sessions, or migrate your HAProxy server to a bigger box. |
min(/HAProxy by Zabbix agent/haproxy.frontend.sutil[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.FRONT_SUTIL.MAX.WARN} |
WARNING | ||
HAProxy frontend {#PXNAME}: Number of request errors is high | Number of request errors is more than {$HAPROXY.FRONT_EREQ.MAX.WARN}. |
min(/HAProxy by Zabbix agent/haproxy.frontend.ereq.rate[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.FRONT_EREQ.MAX.WARN} |
WARNING | ||
HAProxy frontend {#PXNAME}: Number of requests denied is high | Number of requests denied due to security concerns (ACL-restricted) is more than {$HAPROXY.FRONT_DREQ.MAX.WARN}. |
min(/HAProxy by Zabbix agent/haproxy.frontend.dreq.rate[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.FRONT_DREQ.MAX.WARN} |
WARNING | ||
HAProxy {#PXNAME} {#SVNAME}: Server is DOWN | Server is not available. |
count(/HAProxy by Zabbix agent/haproxy.server.status[{#PXNAME},{#SVNAME}],#5,"eq","DOWN")=5 |
WARNING | ||
HAProxy {#PXNAME} {#SVNAME}: Average response time is high | Average server response time (in ms) for the last 1,024 requests is more than {$HAPROXY.SERVER_RTIME.MAX.WARN}. |
min(/HAProxy by Zabbix agent/haproxy.server.rtime[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.SERVER_RTIME.MAX.WARN} |
WARNING | ||
HAProxy {#PXNAME} {#SVNAME}: Number of responses with error is high | Number of requests on server, whose responses yielded an error, is more than {$HAPROXY.SERVER_ERESP.MAX.WARN}. |
min(/HAProxy by Zabbix agent/haproxy.server.eresp.rate[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.SERVER_ERESP.MAX.WARN} |
WARNING | ||
HAProxy {#PXNAME} {#SVNAME}: Current number of requests unassigned in queue is high | Current number of requests unassigned in queue is more than {$HAPROXY.SERVER_QCUR.MAX.WARN}. |
min(/HAProxy by Zabbix agent/haproxy.server.qcur[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.SERVER_QCUR.MAX.WARN} |
WARNING | ||
HAProxy {#PXNAME} {#SVNAME}: Average time spent in queue is high | Average time spent in queue (in ms) for the last 1,024 requests is more than {$HAPROXY.SERVER_QTIME.MAX.WARN}. |
min(/HAProxy by Zabbix agent/haproxy.server.qtime[{#PXNAME},{#SVNAME}],5m)>{$HAPROXY.SERVER_QTIME.MAX.WARN} |
WARNING | ||
HAProxy {#PXNAME} {#SVNAME}: Health check error | Please check the server for faults. |
`find(/HAProxy by Zabbix agent/haproxy.server.check_status[{#PXNAME},{#SVNAME}],#3,"regexp","(?:L[4-7]OK|^$)")=0` | WARNING | Depends on: - HAProxy {#PXNAME} {#SVNAME}: Server is DOWN |
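The session-utilization trigger above is plain arithmetic over the calculated item: scur / slim * 100 compared against {$HAPROXY.FRONT_SUTIL.MAX.WARN} (80 by default). A worked sketch of that comparison, with illustrative numbers:

```python
def session_utilization(scur: int, slim: int) -> float:
    """Mirrors the calculated item: last(scur) / last(slim) * 100."""
    return scur / slim * 100

FRONT_SUTIL_MAX_WARN = 80  # template default

# Illustrative values: 4100 established sessions against maxconn 5000.
util = session_utilization(4100, 5000)
print(f"{util:.1f}%", "triggers" if util > FRONT_SUTIL_MAX_WARN else "ok")
# -> 82.0% triggers
```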
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template for monitoring Hadoop over HTTP that works without any external scripts.
It collects metrics by polling the Hadoop API remotely using an HTTP agent and JSONPath preprocessing.
Zabbix server (or proxy) executes direct requests to the ResourceManager, NodeManager, NameNode, and DataNode APIs.
All metrics are collected at once, thanks to the Zabbix bulk data collection.
This template was tested on:
See Zabbix template operation for basic instructions.
You should define the IP address (or FQDN) and Web-UI port for the ResourceManager in {$HADOOP.RESOURCEMANAGER.HOST} and {$HADOOP.RESOURCEMANAGER.PORT} macros and for the NameNode in {$HADOOP.NAMENODE.HOST} and {$HADOOP.NAMENODE.PORT} macros respectively. Macros can be set in the template or overridden at the host level.
No specific Zabbix configuration is required.
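The raw items of this template poll the Hadoop web UIs directly. As a quick way to verify reachability before setting the macros, here is a minimal sketch querying the NameNode /jmx endpoint; the host and port mirror the {$HADOOP.NAMENODE.HOST} and {$HADOOP.NAMENODE.PORT} defaults listed below, and the FSNamesystem bean is one commonly exposed attribute set, shown for illustration:

```python
import json
import urllib.request

# Macro defaults from the table below; override for your cluster.
NAMENODE_HOST, NAMENODE_PORT = "NameNode", 9870

url = f"http://{NAMENODE_HOST}:{NAMENODE_PORT}/jmx"
beans = json.load(urllib.request.urlopen(url, timeout=10))["beans"]

# The FSNamesystem bean carries capacity and block counters similar to
# those the template's JSONPath preprocessing extracts.
fs = next(b for b in beans if b["name"].endswith("name=FSNamesystem"))
print("CapacityRemaining:", fs.get("CapacityRemaining"))
print("MissingBlocks:", fs.get("MissingBlocks"))
```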
Name | Description | Default |
---|---|---|
{$HADOOP.CAPACITY_REMAINING.MIN.WARN} | The Hadoop cluster capacity remaining percent for trigger expression. |
20 |
{$HADOOP.NAMENODE.HOST} | The Hadoop NameNode host IP address or FQDN. |
NameNode |
{$HADOOP.NAMENODE.PORT} | The Hadoop NameNode Web-UI port. |
9870 |
{$HADOOP.NAMENODE.RESPONSE_TIME.MAX.WARN} | The Hadoop NameNode API page maximum response time in seconds for trigger expression. |
10s |
{$HADOOP.RESOURCEMANAGER.HOST} | The Hadoop ResourceManager host IP address or FQDN. |
ResourceManager |
{$HADOOP.RESOURCEMANAGER.PORT} | The Hadoop ResourceManager Web-UI port. |
8088 |
{$HADOOP.RESOURCEMANAGER.RESPONSE_TIME.MAX.WARN} | The Hadoop ResourceManager API page maximum response time in seconds for trigger expression. |
10s |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Data node discovery | - |
HTTP_AGENT | hadoop.datanode.discovery Preprocessing: - JAVASCRIPT: |
Node manager discovery | - |
HTTP_AGENT | hadoop.nodemanager.discovery Preprocessing: - JAVASCRIPT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Hadoop | ResourceManager: Service status | Hadoop ResourceManager API port availability. |
SIMPLE | net.tcp.service["tcp","{$HADOOP.RESOURCEMANAGER.HOST}","{$HADOOP.RESOURCEMANAGER.PORT}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Hadoop | ResourceManager: Service response time | Hadoop ResourceManager API performance. |
SIMPLE | net.tcp.service.perf["tcp","{$HADOOP.RESOURCEMANAGER.HOST}","{$HADOOP.RESOURCEMANAGER.PORT}"] |
Hadoop | ResourceManager: Uptime | - |
DEPENDENT | hadoop.resourcemanager.uptime Preprocessing: - JSONPATH: - MULTIPLIER: |
Hadoop | ResourceManager: RPC queue & processing time | Average time spent on processing RPC requests. |
DEPENDENT | hadoop.resourcemanager.rpc_processing_time_avg Preprocessing: - JSONPATH: |
Hadoop | ResourceManager: Active NMs | Number of Active NodeManagers. |
DEPENDENT | hadoop.resourcemanager.num_active_nm Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Hadoop | ResourceManager: Decommissioning NMs | Number of Decommissioning NodeManagers. |
DEPENDENT | hadoop.resourcemanager.num_decommissioning_nm Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Hadoop | ResourceManager: Decommissioned NMs | Number of Decommissioned NodeManagers. |
DEPENDENT | hadoop.resourcemanager.num_decommissioned_nm Preprocessing: - JSONPATH: |
Hadoop | ResourceManager: Lost NMs | Number of Lost NodeManagers. |
DEPENDENT | hadoop.resourcemanager.num_lost_nm Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Hadoop | ResourceManager: Unhealthy NMs | Number of Unhealthy NodeManagers. |
DEPENDENT | hadoop.resourcemanager.num_unhealthy_nm Preprocessing: - JSONPATH: |
Hadoop | ResourceManager: Rebooted NMs | Number of Rebooted NodeManagers. |
DEPENDENT | hadoop.resourcemanager.num_rebooted_nm Preprocessing: - JSONPATH: |
Hadoop | ResourceManager: Shutdown NMs | Number of Shutdown NodeManagers. |
DEPENDENT | hadoop.resourcemanager.num_shutdown_nm Preprocessing: - JSONPATH: |
Hadoop | NameNode: Service status | Hadoop NameNode API port availability. |
SIMPLE | net.tcp.service["tcp","{$HADOOP.NAMENODE.HOST}","{$HADOOP.NAMENODE.PORT}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Hadoop | NameNode: Service response time | Hadoop NameNode API performance. |
SIMPLE | net.tcp.service.perf["tcp","{$HADOOP.NAMENODE.HOST}","{$HADOOP.NAMENODE.PORT}"] |
Hadoop | NameNode: Uptime | - |
DEPENDENT | hadoop.namenode.uptime Preprocessing: - JSONPATH: - MULTIPLIER: |
Hadoop | NameNode: RPC queue & processing time | Average time spent on processing RPC requests. |
DEPENDENT | hadoop.namenode.rpc_processing_time_avg Preprocessing: - JSONPATH: |
Hadoop | NameNode: Block Pool Renaming | - |
DEPENDENT | hadoop.namenode.percent_block_pool_used Preprocessing: - JSONPATH: |
Hadoop | NameNode: Transactions since last checkpoint | Total number of transactions since last checkpoint. |
DEPENDENT | hadoop.namenode.transactions_since_last_checkpoint Preprocessing: - JSONPATH: |
Hadoop | NameNode: Percent capacity remaining | Available capacity in percent. |
DEPENDENT | hadoop.namenode.percent_remaining Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
Hadoop | NameNode: Capacity remaining | Available capacity. |
DEPENDENT | hadoop.namenode.capacity_remaining Preprocessing: - JSONPATH: |
Hadoop | NameNode: Corrupt blocks | Number of corrupt blocks. |
DEPENDENT | hadoop.namenode.corrupt_blocks Preprocessing: - JSONPATH: |
Hadoop | NameNode: Missing blocks | Number of missing blocks. |
DEPENDENT | hadoop.namenode.missing_blocks Preprocessing: - JSONPATH: |
Hadoop | NameNode: Failed volumes | Number of failed volumes. |
DEPENDENT | hadoop.namenode.volume_failures_total Preprocessing: - JSONPATH: |
Hadoop | NameNode: Alive DataNodes | Count of alive DataNodes. |
DEPENDENT | hadoop.namenode.num_live_data_nodes Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
Hadoop | NameNode: Dead DataNodes | Count of dead DataNodes. |
DEPENDENT | hadoop.namenode.num_dead_data_nodes Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
Hadoop | NameNode: Stale DataNodes | DataNodes that do not send a heartbeat within 30 seconds are marked as "stale". |
DEPENDENT | hadoop.namenode.num_stale_data_nodes Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
Hadoop | NameNode: Total files | Total count of files tracked by the NameNode. |
DEPENDENT | hadoop.namenode.files_total Preprocessing: - JSONPATH: |
Hadoop | NameNode: Total load | The current number of concurrent file accesses (read/write) across all DataNodes. |
DEPENDENT | hadoop.namenode.total_load Preprocessing: - JSONPATH: |
Hadoop | NameNode: Blocks allocable | Maximum number of blocks allocable. |
DEPENDENT | hadoop.namenode.block_capacity Preprocessing: - JSONPATH: |
Hadoop | NameNode: Total blocks | Count of blocks tracked by NameNode. |
DEPENDENT | hadoop.namenode.blocks_total Preprocessing: - JSONPATH: |
Hadoop | NameNode: Under-replicated blocks | The number of blocks with insufficient replication. |
DEPENDENT | hadoop.namenode.under_replicated_blocks Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: RPC queue & processing time | Average time spent on processing RPC requests. |
DEPENDENT | hadoop.nodemanager.rpc_processing_time_avg[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: Container launch avg duration | - |
DEPENDENT | hadoop.nodemanager.container_launch_duration_avg[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: JVM Threads | The number of JVM threads. |
DEPENDENT | hadoop.nodemanager.jvm.threads[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: JVM Garbage collection time | The JVM garbage collection time in milliseconds. |
DEPENDENT | hadoop.nodemanager.jvm.gc_time[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: JVM Heap usage | The JVM heap usage in MBytes. |
DEPENDENT | hadoop.nodemanager.jvm.mem_heap_used[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: Uptime | - |
DEPENDENT | hadoop.nodemanager.uptime[{#HOSTNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: |
Hadoop | {#HOSTNAME}: State | State of the node - valid values are: NEW, RUNNING, UNHEALTHY, DECOMMISSIONING, DECOMMISSIONED, LOST, REBOOTED, SHUTDOWN. |
DEPENDENT | hadoop.nodemanager.state[{#HOSTNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Hadoop | {#HOSTNAME}: Version | - |
DEPENDENT | hadoop.nodemanager.version[{#HOSTNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Hadoop | {#HOSTNAME}: Number of containers | - |
DEPENDENT | hadoop.nodemanager.numcontainers[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: Used memory | - |
DEPENDENT | hadoop.nodemanager.usedmemory[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: Available memory | - |
DEPENDENT | hadoop.nodemanager.availablememory[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: Remaining | Remaining disk space. |
DEPENDENT | hadoop.datanode.remaining[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: Used | Used disk space. |
DEPENDENT | hadoop.datanode.dfs_used[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: Number of failed volumes | Number of failed storage volumes. |
DEPENDENT | hadoop.datanode.numfailedvolumes[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: JVM Threads | The number of JVM threads. |
DEPENDENT | hadoop.datanode.jvm.threads[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: JVM Garbage collection time | The JVM garbage collection time in milliseconds. |
DEPENDENT | hadoop.datanode.jvm.gc_time[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: JVM Heap usage | The JVM heap usage in MBytes. |
DEPENDENT | hadoop.datanode.jvm.mem_heap_used[{#HOSTNAME}] Preprocessing: - JSONPATH: |
Hadoop | {#HOSTNAME}: Uptime | - |
DEPENDENT | hadoop.datanode.uptime[{#HOSTNAME}] Preprocessing: - JSONPATH: - MULTIPLIER: |
Hadoop | {#HOSTNAME}: Version | DataNode software version. |
DEPENDENT | hadoop.datanode.version[{#HOSTNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Hadoop | {#HOSTNAME}: Admin state | Administrative state. |
DEPENDENT | hadoop.datanode.admin_state[{#HOSTNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
Hadoop | {#HOSTNAME}: Oper state | Operational state. |
DEPENDENT | hadoop.datanode.oper_state[{#HOSTNAME}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 6h |
Zabbix raw items | Get ResourceManager stats | - |
HTTP_AGENT | hadoop.resourcemanager.get |
Zabbix raw items | Get NameNode stats | - |
HTTP_AGENT | hadoop.namenode.get |
Zabbix raw items | Get NodeManagers states | - |
HTTP_AGENT | hadoop.nodemanagers.get Preprocessing: - JAVASCRIPT: |
Zabbix raw items | Get DataNodes states | - |
HTTP_AGENT | hadoop.datanodes.get Preprocessing: - JAVASCRIPT: |
Zabbix raw items | Hadoop NodeManager {#HOSTNAME}: Get stats | - |
HTTP_AGENT | hadoop.nodemanager.get[{#HOSTNAME}] |
Zabbix raw items | Hadoop DataNode {#HOSTNAME}: Get stats | - |
HTTP_AGENT | hadoop.datanode.get[{#HOSTNAME}] |
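All the DEPENDENT items above follow one pattern: a single "Get ... stats" master item returns the full JMX payload, and each item extracts one attribute with a JSONPath step. A hand-rolled sketch of that fan-out against the ResourceManager (the bean and attribute names are illustrative stand-ins for the elided JSONPath parameters in the table):

```python
import json
import urllib.request

# Macro defaults ({$HADOOP.RESOURCEMANAGER.HOST}:{$HADOOP.RESOURCEMANAGER.PORT}).
RM_URL = "http://ResourceManager:8088/jmx"

beans = json.load(urllib.request.urlopen(RM_URL, timeout=10))["beans"]

def attr(bean_suffix: str, name: str):
    """Tiny stand-in for a $.beans[?(@.name=='...')].<attr> JSONPath step."""
    for bean in beans:
        if bean["name"].endswith(bean_suffix):
            return bean.get(name)
    return None

# One master fetch, many dependent values - the same fan-out the
# "Get ResourceManager stats" item feeds.
print("NumActiveNMs:", attr("name=ClusterMetrics", "NumActiveNMs"))
print("NumUnhealthyNMs:", attr("name=ClusterMetrics", "NumUnhealthyNMs"))
```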
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
ResourceManager: Service is unavailable | - |
last(/Hadoop by HTTP/net.tcp.service["tcp","{$HADOOP.RESOURCEMANAGER.HOST}","{$HADOOP.RESOURCEMANAGER.PORT}"])=0 |
AVERAGE | Manual close: YES |
ResourceManager: Service response time is too high | - |
min(/Hadoop by HTTP/net.tcp.service.perf["tcp","{$HADOOP.RESOURCEMANAGER.HOST}","{$HADOOP.RESOURCEMANAGER.PORT}"],5m)>{$HADOOP.RESOURCEMANAGER.RESPONSE_TIME.MAX.WARN} |
WARNING | Manual close: YES Depends on: - ResourceManager: Service is unavailable |
ResourceManager: Service has been restarted | Uptime is less than 10 minutes. |
last(/Hadoop by HTTP/hadoop.resourcemanager.uptime)<10m |
INFO | Manual close: YES |
ResourceManager: Failed to fetch ResourceManager API page | Zabbix has not received data for items for the last 30 minutes. |
nodata(/Hadoop by HTTP/hadoop.resourcemanager.uptime,30m)=1 |
WARNING | Manual close: YES Depends on: - ResourceManager: Service is unavailable |
ResourceManager: Cluster has no active NodeManagers | Cluster is unable to execute any jobs without at least one NodeManager. |
max(/Hadoop by HTTP/hadoop.resourcemanager.num_active_nm,5m)=0 |
HIGH | |
ResourceManager: Cluster has unhealthy NodeManagers | YARN considers any node with disk utilization exceeding the value specified under the property yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage (in yarn-site.xml) to be unhealthy. Ample disk space is critical to ensure uninterrupted operation of a Hadoop cluster, and large numbers of unhealthyNodes (the number to alert on depends on the size of your cluster) should be quickly investigated and resolved. |
min(/Hadoop by HTTP/hadoop.resourcemanager.num_unhealthy_nm,15m)>0 |
AVERAGE | |
NameNode: Service is unavailable | - |
last(/Hadoop by HTTP/net.tcp.service["tcp","{$HADOOP.NAMENODE.HOST}","{$HADOOP.NAMENODE.PORT}"])=0 |
AVERAGE | Manual close: YES |
NameNode: Service response time is too high | - |
min(/Hadoop by HTTP/net.tcp.service.perf["tcp","{$HADOOP.NAMENODE.HOST}","{$HADOOP.NAMENODE.PORT}"],5m)>{$HADOOP.NAMENODE.RESPONSE_TIME.MAX.WARN} |
WARNING | Manual close: YES Depends on: - NameNode: Service is unavailable |
NameNode: Service has been restarted | Uptime is less than 10 minutes. |
last(/Hadoop by HTTP/hadoop.namenode.uptime)<10m |
INFO | Manual close: YES |
NameNode: Failed to fetch NameNode API page | Zabbix has not received data for items for the last 30 minutes. |
nodata(/Hadoop by HTTP/hadoop.namenode.uptime,30m)=1 |
WARNING | Manual close: YES Depends on: - NameNode: Service is unavailable |
NameNode: Cluster capacity remaining is low | A good practice is to ensure that disk use never exceeds 80 percent capacity. |
max(/Hadoop by HTTP/hadoop.namenode.percent_remaining,15m)<{$HADOOP.CAPACITY_REMAINING.MIN.WARN} |
WARNING | |
NameNode: Cluster has missing blocks | A missing block is far worse than a corrupt block, because a missing block cannot be recovered by copying a replica. |
min(/Hadoop by HTTP/hadoop.namenode.missing_blocks,15m)>0 |
AVERAGE | |
NameNode: Cluster has volume failures | HDFS now allows for disks to fail in place, without affecting DataNode operations, until a threshold value is reached. This is set on each DataNode via the dfs.datanode.failed.volumes.tolerated property; it defaults to 0, meaning that any volume failure will shut down the DataNode; on a production cluster where DataNodes typically have 6, 8, or 12 disks, setting this parameter to 1 or 2 is typically the best practice. |
min(/Hadoop by HTTP/hadoop.namenode.volume_failures_total,15m)>0 |
AVERAGE | |
NameNode: Cluster has DataNodes in Dead state | The death of a DataNode causes a flurry of network activity, as the NameNode initiates replication of blocks lost on the dead nodes. |
min(/Hadoop by HTTP/hadoop.namenode.num_dead_data_nodes,5m)>0 |
AVERAGE | |
{#HOSTNAME}: Service has been restarted | Uptime is less than 10 minutes. |
last(/Hadoop by HTTP/hadoop.nodemanager.uptime[{#HOSTNAME}])<10m |
INFO | Manual close: YES |
{#HOSTNAME}: Failed to fetch NodeManager API page | Zabbix has not received data for items for the last 30 minutes. |
nodata(/Hadoop by HTTP/hadoop.nodemanager.uptime[{#HOSTNAME}],30m)=1 |
WARNING | Manual close: YES Depends on: - {#HOSTNAME}: NodeManager has state {ITEM.VALUE}. |
{#HOSTNAME}: NodeManager has state {ITEM.VALUE}. | The state is different from normal. |
last(/Hadoop by HTTP/hadoop.nodemanager.state[{#HOSTNAME}])<>"RUNNING" |
AVERAGE | |
{#HOSTNAME}: Service has been restarted | Uptime is less than 10 minutes. |
last(/Hadoop by HTTP/hadoop.datanode.uptime[{#HOSTNAME}])<10m |
INFO | Manual close: YES |
{#HOSTNAME}: Failed to fetch DataNode API page | Zabbix has not received data for items for the last 30 minutes. |
nodata(/Hadoop by HTTP/hadoop.datanode.uptime[{#HOSTNAME}],30m)=1 |
WARNING | Manual close: YES Depends on: - {#HOSTNAME}: DataNode has state {ITEM.VALUE}. |
{#HOSTNAME}: DataNode has state {ITEM.VALUE}. | The state is different from normal. |
last(/Hadoop by HTTP/hadoop.datanode.oper_state[{#HOSTNAME}])<>"Live" |
AVERAGE |
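For reference outside of Zabbix, the volume-failure counter behind the trigger above can be read from the NameNode's JMX servlet, the same HTTP endpoint this template's items query. A minimal Python sketch, assuming a hypothetical NameNode whose web UI is reachable at namenode.example.com:9870 (the Hadoop 3 default port):

```python
import json
from urllib.request import urlopen

# Hypothetical NameNode address; substitute your own host and port.
NAMENODE = "http://namenode.example.com:9870"

# The NameNode exposes metrics as JSON via the /jmx servlet; the FSNamesystem
# bean carries the VolumeFailuresTotal counter behind this trigger.
url = NAMENODE + "/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem"
with urlopen(url, timeout=10) as resp:
    beans = json.load(resp)["beans"]

volume_failures = beans[0]["VolumeFailuresTotal"]
if volume_failures > 0:
    print(f"WARNING: {volume_failures} failed volume(s) reported by the NameNode")
```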
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
https://hadoop.apache.org/docs/current/
For Zabbix version: 6.2 and higher.
This template is designed to monitor GitLab by Zabbix and works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template GitLab by HTTP collects metrics with an HTTP agent from the GitLab /-/metrics endpoint.
See https://docs.gitlab.com/ee/administration/monitoring/prometheus/gitlab_metrics.html.
This template was tested on:
See Zabbix template operation for basic instructions.
This template works with self-hosted GitLab instances. Internal service metrics are collected from the GitLab /-/metrics endpoint.
Two methods are available to access the metrics:
1. Explicitly allow access to the monitoring endpoints from the Zabbix server or proxy IP address (the GitLab monitoring IP allowlist).
2. Get a token from the Admin -> Monitoring -> Health check page (http://your.gitlab.address/admin/health_check) and use it in the macro {$GITLAB.HEALTH.TOKEN} as a variable path, like: ?token=your_token. A request sketch using this token follows below.
Remember to change the macro {$GITLAB.URL}.
Also, see the Macros section for a list of macros used to set trigger values. NOTE: some metrics may not be collected depending on your GitLab instance version and configuration. See GitLab's documentation for further information about its metric collection.
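The following is a minimal Python sketch of the request the template's HTTP agent items perform against /-/metrics, assuming the requests and prometheus_client packages are available; the URL, token, and metric family names are placeholders and vary by GitLab version:

```python
import requests  # assumed available
from prometheus_client.parser import text_string_to_metric_families

# Placeholders: substitute your instance URL and health-check token.
GITLAB_URL = "http://your.gitlab.address"
HEALTH_TOKEN = "your_token"

# Same request the template's HTTP agent items perform against /-/metrics.
resp = requests.get(f"{GITLAB_URL}/-/metrics",
                    params={"token": HEALTH_TOKEN}, timeout=10)
resp.raise_for_status()

# Walk the Prometheus exposition format and print a few families of interest
# (family names are examples; the exact set depends on your GitLab version).
for family in text_string_to_metric_families(resp.text):
    if family.name in ("user_session_logins", "http_requests"):
        for sample in family.samples:
            print(sample.name, sample.labels, sample.value)
```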
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$GITLAB.HEALTH.TOKEN} | The token path for the GitLab health check, for example: ?token=your_token |
`` |
{$GITLAB.HTTP.FAIL.MAX.WARN} | The maximum number of HTTP request failures for a trigger expression. |
2 |
{$GITLAB.OPEN.FDS.MAX.WARN} | The maximum percentage of used file descriptors for a trigger expression. |
90 |
{$GITLAB.PUMA.QUEUE.MAX.WARN} | The maximum number of Puma queued requests for a trigger expression. |
1 |
{$GITLAB.PUMA.UTILIZATION.MAX.WARN} | The maximum percentage of Puma thread utilization for a trigger expression. |
90 |
{$GITLAB.REDIS.FAIL.MAX.WARN} | The maximum number of Redis client exceptions for a trigger expression. |
2 |
{$GITLAB.UNICORN.QUEUE.MAX.WARN} | The maximum number of Unicorn queued requests for a trigger expression. |
1 |
{$GITLAB.UNICORN.UTILIZATION.MAX.WARN} | The maximum percentage of Unicorn workers utilization for a trigger expression. |
90 |
{$GITLAB.URL} | URL of a GitLab instance. |
http://localhost |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Puma metrics discovery | Discovery of Puma specific metrics when Puma is used. |
HTTP_AGENT | gitlab.puma.discovery Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: |
Unicorn metrics discovery | Discovery of Unicorn specific metrics, when Unicorn is used. |
HTTP_AGENT | gitlab.unicorn.discovery Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
GitLab | GitLab: Instance readiness check | The readiness probe checks whether the GitLab instance is ready to accept traffic via Rails Controllers. A probe sketch follows this table. |
HTTP_AGENT | gitlab.readiness Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: - JSONPATH: - BOOL_TO_DECIMAL ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
GitLab | GitLab: Application server status | Checks whether the application server is running. This probe is used to know if Rails Controllers are not deadlocked due to multi-threading issues. |
HTTP_AGENT | gitlab.liveness Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: - JSONPATH: - BOOL_TO_DECIMAL ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
GitLab | GitLab: Version | Version of the GitLab instance. |
DEPENDENT | gitlab.deployments.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
GitLab | GitLab: Ruby: First process start time | Minimum UNIX timestamp of ruby processes start time. |
DEPENDENT | gitlab.ruby.process_start_time_seconds.first Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
GitLab | GitLab: Ruby: Last process start time | Maximum UNIX timestamp of ruby processes start time. |
DEPENDENT | gitlab.ruby.process_start_time_seconds.last Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
GitLab | GitLab: User logins, total | Counter of how many users have logged in since GitLab was started or restarted. |
DEPENDENT | gitlab.user_session_logins_total Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
GitLab | GitLab: User CAPTCHA logins failed, total | Counter of failed CAPTCHA attempts during login. |
DEPENDENT | gitlab.failed_login_captcha_total Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
GitLab | GitLab: User CAPTCHA logins, total | Counter of successful CAPTCHA attempts during login. |
DEPENDENT | gitlab.successful_login_captcha_total Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
GitLab | GitLab: Upload file does not exist | Number of times an upload record could not find its file. |
DEPENDENT | gitlab.upload_file_does_not_exist Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
GitLab | GitLab: Pipelines: Processing events, total | Total amount of pipeline processing events. |
DEPENDENT | gitlab.pipeline.processing_events_total Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
GitLab | GitLab: Pipelines: Created, total | Counter of pipelines created. |
DEPENDENT | gitlab.pipeline.created_total Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
GitLab | GitLab: Pipelines: Auto DevOps pipelines, total | Counter of completed Auto DevOps pipelines. |
DEPENDENT | gitlab.pipeline.auto_devops_completed.total Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
GitLab | GitLab: Pipelines: Auto DevOps pipelines, failed | Counter of completed Auto DevOps pipelines with status "failed". |
DEPENDENT | gitlab.pipeline.auto_devops_completed_total.failed Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
GitLab | GitLab: Pipelines: CI/CD creation duration | The sum of the time in seconds it takes to create a CI/CD pipeline. |
DEPENDENT | gitlab.pipeline.pipeline_creation Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
GitLab | GitLab: Pipelines: CI/CD creation count | The count of CI/CD pipeline creation time measurements. |
DEPENDENT | gitlab.pipeline.pipeline_creation.count Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> |
GitLab | GitLab: Database: Connection pool, busy | Connections to the main database in use where the owner is still alive. |
DEPENDENT | gitlab.database.connection_pool_busy Preprocessing: - JSONPATH: |
GitLab | GitLab: Database: Connection pool, current | Current connections to the main database in the pool. |
DEPENDENT | gitlab.database.connection_pool_connections Preprocessing: - JSONPATH: |
GitLab | GitLab: Database: Connection pool, dead | Connections to the main database in use where the owner is not alive. |
DEPENDENT | gitlab.database.connection_pool_dead Preprocessing: - JSONPATH: |
GitLab | GitLab: Database: Connection pool, idle | Connections to the main database not in use. |
DEPENDENT | gitlab.database.connection_pool_idle Preprocessing: - JSONPATH: |
GitLab | GitLab: Database: Connection pool, size | Total capacity of the main database connection pool. |
DEPENDENT | gitlab.database.connection_pool_size Preprocessing: - JSONPATH: |
GitLab | GitLab: Database: Connection pool, waiting | Threads currently waiting on this queue. |
DEPENDENT | gitlab.database.connection_pool_waiting Preprocessing: - JSONPATH: |
GitLab | GitLab: Redis: Client requests rate, queues | Number of Redis client requests per second. (Instance: queues) |
DEPENDENT | gitlab.redis.client_requests.queues.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
GitLab | GitLab: Redis: Client requests rate, cache | Number of Redis client requests per second. (Instance: cache) |
DEPENDENT | gitlab.redis.client_requests.cache.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
GitLab | GitLab: Redis: Client requests rate, shared_state | Number of Redis client requests per second. (Instance: shared_state) |
DEPENDENT | gitlab.redis.client_requests.shared_state.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
GitLab | GitLab: Redis: Client exceptions rate, queues | Number of Redis client exceptions per second. (Instance: queues) |
DEPENDENT | gitlab.redis.client_exceptions.queues.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
GitLab | GitLab: Redis: Client exceptions rate, cache | Number of Redis client exceptions per second. (Instance: cache) |
DEPENDENT | gitlab.redis.client_exceptions.cache.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: DISCARD_VALUE -> - CHANGE_PER_SECOND |
GitLab | GitLab: Redis: Client exceptions rate, shared_state | Number of Redis client exceptions per second. (Instance: shared_state) |
DEPENDENT | gitlab.redis.client_exceptions.shared_state.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
GitLab | GitLab: Cache: Misses rate, total | The cache read miss count. |
DEPENDENT | gitlab.cache.misses_total.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
GitLab | GitLab: Cache: Operations rate, total | The count of cache operations. |
DEPENDENT | gitlab.cache.operations_total.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
GitLab | GitLab: Ruby: CPU usage per second | Average CPU time used per second. |
DEPENDENT | gitlab.ruby.process_cpu_seconds.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
GitLab | GitLab: Ruby: Running threads | Number of running Ruby threads. |
DEPENDENT | gitlab.ruby.threads_running Preprocessing: - JSONPATH: |
GitLab | GitLab: Ruby: File descriptors opened, avg | Average number of opened file descriptors. |
DEPENDENT | gitlab.ruby.file_descriptors.avg Preprocessing: - JSONPATH: |
GitLab | GitLab: Ruby: File descriptors opened, max | Maximum number of opened file descriptors. |
DEPENDENT | gitlab.ruby.file_descriptors.max Preprocessing: - JSONPATH: |
GitLab | GitLab: Ruby: File descriptors opened, min | Minimum number of opened file descriptors. |
DEPENDENT | gitlab.ruby.file_descriptors.min Preprocessing: - JSONPATH: |
GitLab | GitLab: Ruby: File descriptors, max | Maximum number of open file descriptors per process. |
DEPENDENT | gitlab.ruby.process_max_fds Preprocessing: - JSONPATH: |
GitLab | GitLab: Ruby: RSS memory, avg | Average RSS Memory usage in bytes. |
DEPENDENT | gitlab.ruby.process_resident_memory_bytes.avg Preprocessing: - JSONPATH: |
GitLab | GitLab: Ruby: RSS memory, min | Minimum RSS Memory usage in bytes. |
DEPENDENT | gitlab.ruby.process_resident_memory_bytes.min Preprocessing: - JSONPATH: |
GitLab | GitLab: Ruby: RSS memory, max | Maximum RSS Memory usage in bytes. |
DEPENDENT | gitlab.ruby.process_resident_memory_bytes.max Preprocessing: - JSONPATH: |
GitLab | GitLab: HTTP requests rate, total | Number of requests received into the system. |
DEPENDENT | gitlab.http.requests.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
GitLab | GitLab: HTTP requests rate, 5xx | Number of requests that failed with a 5xx HTTP code. |
DEPENDENT | gitlab.http.requests.5xx.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
GitLab | GitLab: HTTP requests rate, 4xx | Number of requests that failed with a 4xx HTTP code. |
DEPENDENT | gitlab.http.requests.4xx.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
GitLab | GitLab: Transactions per second | Transactions per second (gitlab_transaction_* metrics). |
DEPENDENT | gitlab.transactions.rate Preprocessing: - JSONPATH: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
GitLab: Puma stats | GitLab: Active connections | Number of puma threads processing a request. |
DEPENDENT | gitlab.puma.active_connections[{#SINGLETON}] Preprocessing: - JSONPATH: |
GitLab: Puma stats | GitLab: Workers | Total number of puma workers. |
DEPENDENT | gitlab.puma.workers[{#SINGLETON}] Preprocessing: - JSONPATH: |
GitLab: Puma stats | GitLab: Running workers | The number of booted puma workers. |
DEPENDENT | gitlab.puma.running_workers[{#SINGLETON}] Preprocessing: - JSONPATH: |
GitLab: Puma stats | GitLab: Stale workers | The number of old puma workers. |
DEPENDENT | gitlab.puma.stale_workers[{#SINGLETON}] Preprocessing: - JSONPATH: |
GitLab: Puma stats | GitLab: Running threads | The number of running puma threads. |
DEPENDENT | gitlab.puma.running[{#SINGLETON}] Preprocessing: - JSONPATH: |
GitLab: Puma stats | GitLab: Queued connections | The number of connections in that puma worker's "todo" set waiting for a worker thread. |
DEPENDENT | gitlab.puma.queued_connections[{#SINGLETON}] Preprocessing: - JSONPATH: |
GitLab: Puma stats | GitLab: Pool capacity | The number of requests the puma worker is capable of taking right now. |
DEPENDENT | gitlab.puma.pool_capacity[{#SINGLETON}] Preprocessing: - JSONPATH: |
GitLab: Puma stats | GitLab: Max threads | The maximum number of puma worker threads. |
DEPENDENT | gitlab.puma.max_threads[{#SINGLETON}] Preprocessing: - JSONPATH: |
GitLab: Puma stats | GitLab: Idle threads | The number of spawned puma threads which are not processing a request. |
DEPENDENT | gitlab.puma.idle_threads[{#SINGLETON}] Preprocessing: - JSONPATH: |
GitLab: Puma stats | GitLab: Killer terminations, total | The number of workers terminated by PumaWorkerKiller. |
DEPENDENT | gitlab.puma.killer_terminations_total[{#SINGLETON}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
GitLab: Unicorn stats | GitLab: Unicorn: Workers | The number of Unicorn workers |
DEPENDENT | gitlab.unicorn.unicorn_workers[{#SINGLETON}] Preprocessing: - JSONPATH: |
GitLab: Unicorn stats | GitLab: Unicorn: Active connections | The number of active Unicorn connections. |
DEPENDENT | gitlab.unicorn.active_connections[{#SINGLETON}] Preprocessing: - JSONPATH: |
GitLab: Unicorn stats | GitLab: Unicorn: Queued connections | The number of queued Unicorn connections. |
DEPENDENT | gitlab.unicorn.queued_connections[{#SINGLETON}] Preprocessing: - JSONPATH: |
Zabbix raw items | GitLab: Get instance metrics | - |
HTTP_AGENT | gitlab.get_metrics Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: DISCARD_VALUE -> - PROMETHEUS_TO_JSON |
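For a quick out-of-band check of the readiness and liveness items above, the underlying GitLab health endpoints can be probed directly. A minimal sketch with placeholder URL and token; both endpoints answer with a JSON status document when healthy:

```python
import json
from urllib.request import urlopen

GITLAB_URL = "http://your.gitlab.address"  # placeholder

# GitLab's health probes; the readiness/liveness items map onto these endpoints.
for probe in ("readiness", "liveness"):
    with urlopen(f"{GITLAB_URL}/-/{probe}?token=your_token", timeout=10) as resp:
        body = json.load(resp)
    # A healthy instance reports {"status": "ok", ...}.
    print(probe, body.get("status"))
```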
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
GitLab: GitLab instance is not able to accept traffic | - |
last(/GitLab by HTTP/gitlab.readiness)=0 |
HIGH | Depends on: - GitLab: Liveness check failed |
GitLab: Liveness check failed | The application server is not running or Rails Controllers are deadlocked. |
last(/GitLab by HTTP/gitlab.liveness)=0 |
HIGH | |
GitLab: Version has changed | The GitLab version has changed. Perform Ack to close. |
last(/GitLab by HTTP/gitlab.deployments.version,#1)<>last(/GitLab by HTTP/gitlab.deployments.version,#2) and length(last(/GitLab by HTTP/gitlab.deployments.version))>0 |
INFO | Manual close: YES |
GitLab: Too many Redis queues client exceptions | Too many Redis client exceptions during requests to the Redis instance queues. |
min(/GitLab by HTTP/gitlab.redis.client_exceptions.queues.rate,5m)>{$GITLAB.REDIS.FAIL.MAX.WARN} |
WARNING | |
GitLab: Too many Redis cache client exceptions | Too many Redis client exceptions during requests to the Redis instance cache. |
min(/GitLab by HTTP/gitlab.redis.client_exceptions.cache.rate,5m)>{$GITLAB.REDIS.FAIL.MAX.WARN} |
WARNING | |
GitLab: Too many Redis shared_state client exceptions | Too many Redis client exceptions during requests to the Redis instance shared_state. |
min(/GitLab by HTTP/gitlab.redis.client_exceptions.shared_state.rate,5m)>{$GITLAB.REDIS.FAIL.MAX.WARN} |
WARNING | |
GitLab: Failed to fetch info data | Zabbix has not received metrics data for the last 30 minutes. |
nodata(/GitLab by HTTP/gitlab.ruby.threads_running,30m)=1 |
WARNING | Manual close: YES Depends on: - GitLab: Liveness check failed |
GitLab: Current number of open files is too high | - |
min(/GitLab by HTTP/gitlab.ruby.file_descriptors.max,5m)/last(/GitLab by HTTP/gitlab.ruby.process_max_fds)*100>{$GITLAB.OPEN.FDS.MAX.WARN} |
WARNING | |
GitLab: Too many HTTP requests failures | Too many requests failed on the GitLab instance with a 5xx HTTP code. |
min(/GitLab by HTTP/gitlab.http.requests.5xx.rate,5m)>{$GITLAB.HTTP.FAIL.MAX.WARN} |
WARNING | |
GitLab: Puma instance thread utilization is too high | - |
min(/GitLab by HTTP/gitlab.puma.active_connections[{#SINGLETON}],5m)/last(/GitLab by HTTP/gitlab.puma.max_threads[{#SINGLETON}])*100>{$GITLAB.PUMA.UTILIZATION.MAX.WARN} |
WARNING | |
GitLab: Puma is queueing requests | - |
min(/GitLab by HTTP/gitlab.puma.queued_connections[{#SINGLETON}],15m)>{$GITLAB.PUMA.QUEUE.MAX.WARN} |
WARNING | |
GitLab: Unicorn worker utilization is too high | - |
min(/GitLab by HTTP/gitlab.unicorn.active_connections[{#SINGLETON}],5m)/last(/GitLab by HTTP/gitlab.unicorn.unicorn_workers[{#SINGLETON}])*100>{$GITLAB.UNICORN.UTILIZATION.MAX.WARN} |
WARNING | |
GitLab: Unicorn is queueing requests | - |
min(/GitLab by HTTP/gitlab.unicorn.queued_connections[{#SINGLETON}],5m)>{$GITLAB.UNICORN.QUEUE.MAX.WARN} |
WARNING |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
Official JMX template from the Zabbix distribution. It can be useful for many Java applications (JMX).
Refer to the vendor documentation.
No specific Zabbix configuration is required.
Name | Description | Default | ||||
---|---|---|---|---|---|---|
{$JMX.CPU.LOAD.MAX} | A threshold in percent for CPU utilization trigger. |
85 |
||||
{$JMX.CPU.LOAD.TIME} | The time during which the CPU utilization may exceed the threshold. |
5m |
||||
{$JMX.FILE.DESCRIPTORS.MAX} | A threshold in percent for file descriptors count trigger. |
85 |
||||
{$JMX.FILE.DESCRIPTORS.TIME} | The time during which the file descriptors count may exceed the threshold. |
3m |
||||
{$JMX.HEAP.MEM.USAGE.MAX} | A threshold in percent for Heap memory utilization trigger. |
85 |
||||
{$JMX.HEAP.MEM.USAGE.TIME} | The time during which the Heap memory utilization may exceed the threshold. |
10m |
||||
{$JMX.MEM.POOL.NAME.MATCHES} | This macro is used in memory pool discovery as a filter. |
`Old Gen|G1|Perm Gen|Code Cache|Tenured Gen` |
{$JMX.MP.USAGE.MAX} | A threshold in percent for memory pools utilization trigger. Use a context to change the threshold for a specific pool. |
85 |
||||
{$JMX.MP.USAGE.TIME} | The time during which the memory pools utilization may exceed the threshold. |
10m |
||||
{$JMX.NONHEAP.MEM.USAGE.MAX} | A threshold in percent for Non-heap memory utilization trigger. |
85 |
||||
{$JMX.NONHEAP.MEM.USAGE.TIME} | The time during which the Non-heap memory utilization may exceed the threshold. |
10m |
||||
{$JMX.PASSWORD} | JMX password. |
`` | ||||
{$JMX.USER} | JMX username. |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Garbage collector discovery | Garbage collectors metrics discovery. |
JMX | jmx.discovery["beans","java.lang:name=*,type=GarbageCollector"] |
Memory pool discovery | Memory pools metrics discovery. |
JMX | jmx.discovery["beans","java.lang:name=*,type=MemoryPool"] Filter: - {#JMXNAME} MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
JMX | ClassLoading: Loaded class count | Displays the number of classes that are currently loaded in the Java virtual machine. |
JMX | jmx["java.lang:type=ClassLoading","LoadedClassCount"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | ClassLoading: Total loaded class count | Displays the total number of classes that have been loaded since the Java virtual machine has started execution. |
JMX | jmx["java.lang:type=ClassLoading","TotalLoadedClassCount"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | ClassLoading: Unloaded class count | Displays the total number of classes unloaded since the Java virtual machine has started execution. |
JMX | jmx["java.lang:type=ClassLoading","UnloadedClassCount"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Compilation: Name of the current JIT compiler | Displays the name of the current Just-in-time (JIT) compiler. |
JMX | jmx["java.lang:type=Compilation","Name"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Compilation: Accumulated time spent | Displays the approximate accumulated elapsed time spent in compilation, in seconds. |
JMX | jmx["java.lang:type=Compilation","TotalCompilationTime"] Preprocessing: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Memory: Heap memory committed | Current heap memory allocated. This amount of memory is guaranteed for the Java virtual machine to use. |
JMX | jmx["java.lang:type=Memory","HeapMemoryUsage.committed"] |
JMX | Memory: Heap memory maximum size | Maximum amount of heap that can be used for memory management. This amount of memory is not guaranteed to be available if it is greater than the amount of committed memory. The Java virtual machine may fail to allocate memory even if the amount of used memory does not exceed this maximum size. |
JMX | jmx["java.lang:type=Memory","HeapMemoryUsage.max"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Memory: Heap memory used | The amount of heap memory currently in use. |
JMX | jmx["java.lang:type=Memory","HeapMemoryUsage.used"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Memory: Non-Heap memory committed | Current memory allocated outside the heap. This amount of memory is guaranteed for the Java virtual machine to use. |
JMX | jmx["java.lang:type=Memory","NonHeapMemoryUsage.committed"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Memory: Non-Heap memory maximum size | Maximum amount of non-heap memory that can be used for memory management. This amount of memory is not guaranteed to be available if it is greater than the amount of committed memory. The Java virtual machine may fail to allocate memory even if the amount of used memory does not exceed this maximum size. |
JMX | jmx["java.lang:type=Memory","NonHeapMemoryUsage.max"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Memory: Non-Heap memory used | Current memory usage outside the heap. |
JMX | jmx["java.lang:type=Memory","NonHeapMemoryUsage.used"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Memory: Object pending finalization count | The approximate number of objects for which finalization is pending. |
JMX | jmx["java.lang:type=Memory","ObjectPendingFinalizationCount"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | OperatingSystem: File descriptors maximum count | This is the number of file descriptors we can have opened in the same process, as determined by the operating system. You can never have more file descriptors than this number. |
JMX | jmx["java.lang:type=OperatingSystem","MaxFileDescriptorCount"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | OperatingSystem: File descriptors opened | This is the number of opened file descriptors at the moment. If this reaches the MaxFileDescriptorCount, the application will throw an IOException: Too many open files. This could mean you are opening file descriptors and never closing them. |
JMX | jmx["java.lang:type=OperatingSystem","OpenFileDescriptorCount"] |
JMX | OperatingSystem: Process CPU Load | ProcessCpuLoad represents the CPU load in this process. |
JMX | jmx["java.lang:type=OperatingSystem","ProcessCpuLoad"] Preprocessing: - MULTIPLIER: |
JMX | Runtime: JVM uptime | - |
JMX | jmx["java.lang:type=Runtime","Uptime"] Preprocessing: - MULTIPLIER: |
JMX | Runtime: JVM name | - |
JMX | jmx["java.lang:type=Runtime","VmName"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Runtime: JVM version | - |
JMX | jmx["java.lang:type=Runtime","VmVersion"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Threading: Daemon thread count | Number of daemon threads running. |
JMX | jmx["java.lang:type=Threading","DaemonThreadCount"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Threading: Peak thread count | Maximum number of threads being executed at the same time since the JVM was started or the peak was reset. |
JMX | jmx["java.lang:type=Threading","PeakThreadCount"] |
JMX | Threading: Thread count | The number of threads running at the current moment. |
JMX | jmx["java.lang:type=Threading","ThreadCount"] |
JMX | Threading: Total started thread count | The number of threads started since the JVM was launched. |
JMX | jmx["java.lang:type=Threading","TotalStartedThreadCount"] |
JMX | GarbageCollector: {#JMXNAME} number of collections per second | Displays the total number of collections that have occurred per second. |
JMX | jmx["java.lang:name={#JMXNAME},type=GarbageCollector","CollectionCount"] Preprocessing: - CHANGE_PER_SECOND |
JMX | GarbageCollector: {#JMXNAME} accumulated time spent in collection | Displays the approximate accumulated collection elapsed time, in seconds. |
JMX | jmx["java.lang:name={#JMXNAME},type=GarbageCollector","CollectionTime"] Preprocessing: - MULTIPLIER: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Memory pool: {#JMXNAME} committed | Current memory allocated. |
JMX | jmx["java.lang:name={#JMXNAME},type=MemoryPool","Usage.committed"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Memory pool: {#JMXNAME} maximum size | Maximum amount of memory that can be used for memory management. This amount of memory is not guaranteed to be available if it is greater than the amount of committed memory. The Java virtual machine may fail to allocate memory even if the amount of used memory does not exceed this maximum size. |
JMX | jmx["java.lang:name={#JMXNAME},type=MemoryPool","Usage.max"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
JMX | Memory pool: {#JMXNAME} used | Current memory usage. |
JMX | jmx["java.lang:name={#JMXNAME},type=MemoryPool","Usage.used"] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Compilation: {HOST.NAME} uses suboptimal JIT compiler | - |
find(/Generic Java JMX/jmx["java.lang:type=Compilation","Name"],,"like","Client")=1 |
INFO | Manual close: YES |
Memory: Heap memory usage is high | A sketch of how this condition evaluates follows this table. |
min(/Generic Java JMX/jmx["java.lang:type=Memory","HeapMemoryUsage.used"],{$JMX.HEAP.MEM.USAGE.TIME})>(last(/Generic Java JMX/jmx["java.lang:type=Memory","HeapMemoryUsage.max"])*{$JMX.HEAP.MEM.USAGE.MAX}/100) and last(/Generic Java JMX/jmx["java.lang:type=Memory","HeapMemoryUsage.max"])>0 |
WARNING | |
Memory: Non-Heap memory usage is high | - |
min(/Generic Java JMX/jmx["java.lang:type=Memory","NonHeapMemoryUsage.used"],{$JMX.NONHEAP.MEM.USAGE.TIME})>(last(/Generic Java JMX/jmx["java.lang:type=Memory","NonHeapMemoryUsage.max"])*{$JMX.NONHEAP.MEM.USAGE.MAX}/100) and last(/Generic Java JMX/jmx["java.lang:type=Memory","NonHeapMemoryUsage.max"])>0 |
WARNING | |
OperatingSystem: Opened file descriptor count is high | - |
min(/Generic Java JMX/jmx["java.lang:type=OperatingSystem","OpenFileDescriptorCount"],{$JMX.FILE.DESCRIPTORS.TIME})>(last(/Generic Java JMX/jmx["java.lang:type=OperatingSystem","MaxFileDescriptorCount"])*{$JMX.FILE.DESCRIPTORS.MAX}/100) |
WARNING | |
OperatingSystem: Process CPU Load is high | - |
min(/Generic Java JMX/jmx["java.lang:type=OperatingSystem","ProcessCpuLoad"],{$JMX.CPU.LOAD.TIME})>{$JMX.CPU.LOAD.MAX} |
AVERAGE | |
Runtime: JVM is not reachable | - |
nodata(/Generic Java JMX/jmx["java.lang:type=Runtime","Uptime"],5m)=1 |
AVERAGE | Manual close: YES |
Runtime: {HOST.NAME} runs suboptimal VM type | - |
find(/Generic Java JMX/jmx["java.lang:type=Runtime","VmName"],,"like","Server")<>1 |
INFO | Manual close: YES |
Memory pool: {#JMXNAME} memory usage is high | - |
min(/Generic Java JMX/jmx["java.lang:name={#JMXNAME},type=MemoryPool","Usage.used"],{$JMX.MP.USAGE.TIME:"{#JMXNAME}"})>(last(/Generic Java JMX/jmx["java.lang:name={#JMXNAME},type=MemoryPool","Usage.max"])*{$JMX.MP.USAGE.MAX:"{#JMXNAME}"}/100) and last(/Generic Java JMX/jmx["java.lang:name={#JMXNAME},type=MemoryPool","Usage.max"])>0 |
WARNING |
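These threshold triggers share one pattern: the worst (minimum) sample over a time window is compared against a percentage of a capacity item, so a single short dip does not fire the alert. A plain-Python illustration of how the heap-memory condition above evaluates (an explanatory sketch, not Zabbix code):

```python
# Mirrors the trigger expression: min(used over window) > max * pct / 100,
# guarded by max > 0, since HeapMemoryUsage.max can be undefined (-1).

def heap_usage_is_high(used_samples, heap_max, max_pct=85):
    """used_samples: HeapMemoryUsage.used values collected over
    {$JMX.HEAP.MEM.USAGE.TIME}; heap_max: last HeapMemoryUsage.max."""
    if heap_max <= 0:  # the trigger's "and ...max>0" guard
        return False
    # min() over the window mirrors Zabbix's min(): every sample must exceed
    # the threshold share of the maximum for the trigger to fire.
    return min(used_samples) > heap_max * max_pct / 100

# Example: used-heap samples from a window, against a 4 GiB maximum.
samples = [3.7e9, 3.8e9, 3.75e9]
print(heap_usage_is_high(samples, heap_max=4 * 1024**3))  # True: >85% throughout
```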
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
Refer to the vendor documentation.
No specific Zabbix configuration is required.
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Services | FTP service is running | A minimal Python equivalent of this check follows the trigger table below. |
SIMPLE | net.tcp.service[ftp] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
FTP service is down on {HOST.NAME} | - |
max(/FTP Service/net.tcp.service[ftp],#3)=0 |
AVERAGE |
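For reference, a minimal Python sketch of what this simple check amounts to, with a placeholder host; net.tcp.service[ftp] likewise connects to the port and expects the FTP service-ready greeting:

```python
import socket

# Rough equivalent of net.tcp.service[ftp]: connect to the FTP port and
# check for the "220" service-ready greeting. The host is a placeholder.
def ftp_is_up(host="ftp.example.com", port=21, timeout=5):
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            banner = sock.recv(128).decode(errors="replace")
        return banner.startswith("220")
    except OSError:
        return False

print(ftp_is_up())
```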
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
Official Template for Microsoft Exchange Server 2016.
This template was tested on:
See Zabbix template operation for basic instructions.
Metrics are collected by Zabbix agent active.
1. Import the template into Zabbix.
2. Link the imported template to a host with MS Exchange.
Note that the template doesn't provide information about Windows services state. It is recommended to use it together with the "OS Windows by Zabbix agent active" template.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$AGENT.TIMEOUT} | Timeout after which agent is considered unavailable. |
5m |
{$MS.EXCHANGE.DB.ACTIVE.READ.TIME} | The time during which the active database read operations latency may exceed the threshold. |
5m |
{$MS.EXCHANGE.DB.ACTIVE.READ.WARN} | Threshold for active database read operations latency trigger. |
0.02 |
{$MS.EXCHANGE.DB.ACTIVE.WRITE.TIME} | The time during which the active database write operations latency may exceed the threshold. |
10m |
{$MS.EXCHANGE.DB.ACTIVE.WRITE.WARN} | Threshold for active database write operations latency trigger. |
0.05 |
{$MS.EXCHANGE.DB.FAULTS.TIME} | The time during which the database page faults may exceed the threshold. |
5m |
{$MS.EXCHANGE.DB.FAULTS.WARN} | Threshold for database page faults trigger. |
0 |
{$MS.EXCHANGE.DB.PASSIVE.READ.TIME} | The time during which the passive database read operations latency may exceed the threshold. |
5m |
{$MS.EXCHANGE.DB.PASSIVE.READ.WARN} | Threshold for passive database read operations latency trigger. |
0.2 |
{$MS.EXCHANGE.DB.PASSIVE.WRITE.TIME} | The time during which the passive database write operations latency may exceed the threshold. |
10m |
{$MS.EXCHANGE.LDAP.TIME} | The time during which the LDAP metrics may exceed the threshold. |
5m |
{$MS.EXCHANGE.LDAP.WARN} | Threshold for LDAP triggers. |
0.05 |
{$MS.EXCHANGE.LOG.STALLS.TIME} | The time during which the log records stalled may exceed the threshold. |
10m |
{$MS.EXCHANGE.LOG.STALLS.WARN} | Threshold for log records stalled trigger. |
100 |
{$MS.EXCHANGE.PERF.INTERVAL} | Update interval for perf_counter_en items. |
60 |
{$MS.EXCHANGE.RPC.COUNT.TIME} | The time during which the RPC total requests may exceed the threshold. |
5m |
{$MS.EXCHANGE.RPC.COUNT.WARN} | Threshold for the RPC requests total trigger. |
70 |
{$MS.EXCHANGE.RPC.TIME} | The time during which the RPC requests latency may exceed the threshold. |
10m |
{$MS.EXCHANGE.RPC.WARN} | Threshold for RPC requests latency trigger. |
0.05 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Databases discovery | Discovery of Exchange databases. |
ZABBIX_ACTIVE | perf_instance.discovery["MSExchange Active Manager"] Preprocessing: - JAVASCRIPT: |
LDAP discovery | Discovery of domain controller. |
ZABBIX_ACTIVE | perf_instance_en.discovery["MSExchange ADAccess Domain Controllers"] |
Web services discovery | Discovery of Exchange web services. |
ZABBIX_ACTIVE | perf_instance_en.discovery["Web Service"] |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
MS Exchange | MS Exchange: Databases total mounted | Shows the number of active database copies on the server. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange Active Manager(_total)\Database Mounted"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
MS Exchange | MS Exchange [Client Access Server]: ActiveSync: ping command pending | Shows the number of ping commands currently pending in the queue. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange ActiveSync\Ping Commands Pending", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: ActiveSync: requests per second | Shows the number of HTTP requests received from the client via ASP.NET per second. Determines the current Exchange ActiveSync request rate. Used only to determine current user load. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange ActiveSync\Requests/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: ActiveSync: sync commands per second | Shows the number of sync commands processed per second. Clients use this command to synchronize items within a folder. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange ActiveSync\Sync Commands/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: Autodiscover: requests per second | Shows the number of Autodiscover service requests processed each second. Determines current user load. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchangeAutodiscover\Requests/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: Availability Service: availability requests per second | Shows the number of requests serviced per second. The request can be only for free/busy information or include suggestions. One request may contain multiple mailboxes. Determines the rate at which Availability service requests are occurring. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange Availability Service\Availability Requests (sec)", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: Outlook Web App: current unique users | Shows the number of unique users currently logged on to Outlook Web App. This value monitors the number of unique active user sessions, so that users are only removed from this counter after they log off or their session times out. Determines current user load. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange OWA\Current Unique Users", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: Outlook Web App: requests per second | Shows the number of requests handled by Outlook Web App per second. Determines current user load. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange OWA\Requests/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: MSExchangeWS: requests per second | Shows the number of requests processed each second. Determines current user load. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchangeWS\Requests/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange: Active agent availability | Availability of active checks on the host. The value of this item corresponds to availability icons in the host list. Possible values: 0 - unknown, 1 - available, 2 - not available. |
INTERNAL | zabbix[host,active_agent,available] |
MS Exchange | Active Manager [{#INSTANCE}]: Database copy role | Database copy active or passive role. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange Active Manager({#INSTANCE})\Database Copy Role Active"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
MS Exchange | Information Store [{#INSTANCE}]: Database state | Database state. Possible values: 0: Database without any copy and dismounted. 1: Database is a primary database and mounted. 2: Database is a passive copy and the state is healthy. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchangeIS Store({#INSTANCE})\Database State"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
MS Exchange | Information Store [{#INSTANCE}]: Active mailboxes count | Number of active mailboxes in this database. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchangeIS Store({#INSTANCE})\Active mailboxes"] |
MS Exchange | Information Store [{#INSTANCE}]: Page faults per second | Indicates the rate of page faults that can't be serviced because there are no pages available for allocation from the database cache. If this counter is above 0, it's an indication that the MSExchange Database\I/O Database Writes (Attached) Average Latency is too high. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange Database({#INF.STORE})\Database Page Fault Stalls/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Information Store [{#INSTANCE}]: Log records stalled | Indicates the number of log records that can't be added to the log buffers per second because the log buffers are full. The average value should be below 10 per second. Spikes (maximum values) shouldn't be higher than 100 per second. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange Database({#INF.STORE})\Log Record Stalls/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Information Store [{#INSTANCE}]: Log threads waiting | Indicates the number of threads waiting to complete an update of the database by writing their data to the log. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange Database({#INF.STORE})\Log Threads Waiting", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Information Store [{#INSTANCE}]: RPC requests per second | Shows the number of RPC operations per second for each database instance. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchangeIS Store({#INSTANCE})\RPC Operations/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Information Store [{#INSTANCE}]: RPC requests latency | RPC Latency average is the average latency of RPC requests per database. Average is calculated over all RPCs since exrpc32 was loaded. Should be less than 50ms at all times, with spikes less than 100ms. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchangeIS Store({#INSTANCE})\RPC Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
MS Exchange | Information Store [{#INSTANCE}]: RPC requests total | Indicates the overall RPC requests currently executing within the information store process. Should be below 70 at all times. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchangeIS Store({#INSTANCE})\RPC requests", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Database Counters [{#INSTANCE}]: Active database read operations per second | Shows the number of database read operations. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Attached)/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Database Counters [{#INSTANCE}]: Active database read operations latency | Shows the average length of time per database read operation. Should be less than 20 ms on average. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Attached) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
MS Exchange | Database Counters [{#INSTANCE}]: Passive database read operations latency | Shows the average length of time per passive database read operation. Should be less than 200ms on average. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Recovery) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
MS Exchange | Database Counters [{#INSTANCE}]: Active database write operations per second | Shows the number of database write operations per second for each attached database instance. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Writes (Attached)/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Database Counters [{#INSTANCE}]: Active database write operations latency | Shows the average length of time per database write operation. Should be less than 50ms on average. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Writes (Attached) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
MS Exchange | Database Counters [{#INSTANCE}]: Passive database write operations latency | Shows the average length of time, in ms, per passive database write operation. Should be less than the read latency for the same instance, as measured by the MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Recovery) Average Latency counter. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Writes (Recovery) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
MS Exchange | Web Service [{#INSTANCE}]: Current connections | Shows the current number of connections established to each Web Service. |
ZABBIX_ACTIVE | perf_counter_en["\Web Service({#INSTANCE})\Current Connections", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Domain Controller [{#INSTANCE}]: Read time | Time that it takes to send an LDAP read request to the domain controller in question and get a response. Should ideally be below 50 ms; spikes below 100 ms are acceptable. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange ADAccess Domain Controllers({#INSTANCE})\LDAP Read Time", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
MS Exchange | Domain Controller [{#INSTANCE}]: Search time | Time that it takes to send an LDAP search request and get a response. Should ideally be below 50 ms; spikes below 100 ms are acceptable. |
ZABBIX_ACTIVE | perf_counter_en["\MSExchange ADAccess Domain Controllers({#INSTANCE})\LDAP Search Time", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
MS Exchange: Zabbix agent: active checks are not available | Active checks are considered unavailable. The agent has not sent a heartbeat for a prolonged time. |
min(/Microsoft Exchange Server 2016 by Zabbix agent active/zabbix[host,active_agent,available],{$AGENT.TIMEOUT})=2 |
HIGH | |
Information Store [{#INSTANCE}]: Page faults are too high | Too many page fault stalls for database "{#INSTANCE}". This counter should be 0 on production servers. |
min(/Microsoft Exchange Server 2016 by Zabbix agent active/perf_counter_en["\MSExchange Database({#INF.STORE})\Database Page Fault Stalls/sec", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.DB.FAULTS.TIME})>{$MS.EXCHANGE.DB.FAULTS.WARN} |
AVERAGE | |
Information Store [{#INSTANCE}]: Log record stalls are too high | Too many stalled log records. The average value should be less than 10 per second. |
avg(/Microsoft Exchange Server 2016 by Zabbix agent active/perf_counter_en["\MSExchange Database({#INF.STORE})\Log Record Stalls/sec", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.LOG.STALLS.TIME})>{$MS.EXCHANGE.LOG.STALLS.WARN} |
AVERAGE | |
Information Store [{#INSTANCE}]: RPC Requests latency is too high | Should be less than 50ms at all times, with spikes less than 100ms. |
min(/Microsoft Exchange Server 2016 by Zabbix agent active/perf_counter_en["\MSExchangeIS Store({#INSTANCE})\RPC Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.RPC.TIME})>{$MS.EXCHANGE.RPC.WARN} |
WARNING | |
Information Store [{#INSTANCE}]: RPC Requests total count is too high | Should be below 70 at all times. |
min(/Microsoft Exchange Server 2016 by Zabbix agent active/perf_counter_en["\MSExchangeIS Store({#INSTANCE})\RPC requests", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.RPC.COUNT.TIME})>{$MS.EXCHANGE.RPC.COUNT.WARN} |
WARNING | |
Database Counters [{#INSTANCE}]: Average read time latency is too high | Should be less than 20ms on average. |
min(/Microsoft Exchange Server 2016 by Zabbix agent active/perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Attached) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.DB.ACTIVE.READ.TIME})>{$MS.EXCHANGE.DB.ACTIVE.READ.WARN} |
WARNING | |
Database Counters [{#INSTANCE}]: Average read time latency is too high | Should be less than 200ms on average. |
min(/Microsoft Exchange Server 2016 by Zabbix agent active/perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Recovery) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.DB.PASSIVE.READ.TIME})>{$MS.EXCHANGE.DB.PASSIVE.READ.WARN} |
WARNING | |
Database Counters [{#INSTANCE}]: Average write time latency is too high for {$MS.EXCHANGE.DB.ACTIVE.WRITE.TIME} | Should be less than 50ms on average. |
min(/Microsoft Exchange Server 2016 by Zabbix agent active/perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Writes (Attached) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.DB.ACTIVE.WRITE.TIME})>{$MS.EXCHANGE.DB.ACTIVE.WRITE.WARN} |
WARNING | |
Database Counters [{#INSTANCE}]: Average write time latency is higher than read time latency for {$MS.EXCHANGE.DB.PASSIVE.WRITE.TIME} | Should be less than the read latency for the same instance, as measured by the MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Recovery) Average Latency counter. |
avg(/Microsoft Exchange Server 2016 by Zabbix agent active/perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Writes (Recovery) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.DB.PASSIVE.WRITE.TIME})>avg(/Microsoft Exchange Server 2016 by Zabbix agent active/perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Recovery) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.DB.PASSIVE.WRITE.TIME}) |
WARNING | |
Domain Controller [{#INSTANCE}]: LDAP read time is too high | Should be less than 50ms at all times, with spikes less than 100ms. |
min(/Microsoft Exchange Server 2016 by Zabbix agent active/perf_counter_en["\MSExchange ADAccess Domain Controllers({#INSTANCE})\LDAP Read Time", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.LDAP.TIME})>{$MS.EXCHANGE.LDAP.WARN} |
AVERAGE | |
Domain Controller [{#INSTANCE}]: LDAP search time is too high | Should be less than 50ms at all times, with spikes less than 100ms. |
min(/Microsoft Exchange Server 2016 by Zabbix agent active/perf_counter_en["\MSExchange ADAccess Domain Controllers({#INSTANCE})\LDAP Search Time", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.LDAP.TIME})>{$MS.EXCHANGE.LDAP.WARN} |
AVERAGE |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
Official Template for Microsoft Exchange Server 2016.
This template was tested on:
See Zabbix template operation for basic instructions.
Metrics are collected by Zabbix agent.
1. Import the template into Zabbix.
2. Link the imported template to a host with MS Exchange.
Note that the template doesn't provide information about Windows services state. It is recommended to use it together with the "OS Windows by Zabbix agent" template.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$MS.EXCHANGE.DB.ACTIVE.READ.TIME} | The time during which the active database read operations latency may exceed the threshold. |
5m |
{$MS.EXCHANGE.DB.ACTIVE.READ.WARN} | Threshold for active database read operations latency trigger. |
0.02 |
{$MS.EXCHANGE.DB.ACTIVE.WRITE.TIME} | The time during which the active database write operations latency may exceed the threshold. |
10m |
{$MS.EXCHANGE.DB.ACTIVE.WRITE.WARN} | Threshold for active database write operations latency trigger. |
0.05 |
{$MS.EXCHANGE.DB.FAULTS.TIME} | The time during which the database page faults may exceed the threshold. |
5m |
{$MS.EXCHANGE.DB.FAULTS.WARN} | Threshold for database page faults trigger. |
0 |
{$MS.EXCHANGE.DB.PASSIVE.READ.TIME} | The time during which the passive database read operations latency may exceed the threshold. |
5m |
{$MS.EXCHANGE.DB.PASSIVE.READ.WARN} | Threshold for passive database read operations latency trigger. |
0.2 |
{$MS.EXCHANGE.DB.PASSIVE.WRITE.TIME} | The time during which the passive database write operations latency may exceed the threshold. |
10m |
{$MS.EXCHANGE.LDAP.TIME} | The time during which the LDAP metrics may exceed the threshold. |
5m |
{$MS.EXCHANGE.LDAP.WARN} | Threshold for LDAP triggers. |
0.05 |
{$MS.EXCHANGE.LOG.STALLS.TIME} | The time during which the log records stalled may exceed the threshold. |
10m |
{$MS.EXCHANGE.LOG.STALLS.WARN} | Threshold for log records stalled trigger. |
100 |
{$MS.EXCHANGE.PERF.INTERVAL} | Update interval for perf_counter_en items. |
60 |
{$MS.EXCHANGE.RPC.COUNT.TIME} | The time during which the RPC total requests may exceed the threshold. |
5m |
{$MS.EXCHANGE.RPC.COUNT.WARN} | Threshold for the RPC requests total trigger. |
70 |
{$MS.EXCHANGE.RPC.TIME} | The time during which the RPC requests latency may exceed the threshold. |
10m |
{$MS.EXCHANGE.RPC.WARN} | Threshold for RPC requests latency trigger. |
0.05 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Databases discovery | Discovery of Exchange databases. |
ZABBIX_PASSIVE | perf_instance.discovery["MSExchange Active Manager"] Preprocessing: - JAVASCRIPT: |
LDAP discovery | Discovery of domain controller. |
ZABBIX_PASSIVE | perf_instance_en.discovery["MSExchange ADAccess Domain Controllers"] |
Web services discovery | Discovery of Exchange web services. |
ZABBIX_PASSIVE | perf_instance_en.discovery["Web Service"] |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
MS Exchange | MS Exchange: Databases total mounted | Shows the number of active database copies on the server. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchange Active Manager(_total)\Database Mounted"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
MS Exchange | MS Exchange [Client Access Server]: ActiveSync: ping command pending | Shows the number of ping commands currently pending in the queue. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchange ActiveSync\Ping Commands Pending", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: ActiveSync: requests per second | Shows the number of HTTP requests received from the client via ASP.NET per second. Determines the current Exchange ActiveSync request rate. Used only to determine current user load. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchange ActiveSync\Requests/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: ActiveSync: sync commands per second | Shows the number of sync commands processed per second. Clients use this command to synchronize items within a folder. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchange ActiveSync\Sync Commands/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: Autodiscover: requests per second | Shows the number of Autodiscover service requests processed each second. Determines current user load. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchangeAutodiscover\Requests/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: Availability Service: availability requests per second | Shows the number of requests serviced per second. The request can be only for free/busy information or include suggestions. One request may contain multiple mailboxes. Determines the rate at which Availability service requests are occurring. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchange Availability Service\Availability Requests (sec)", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: Outlook Web App: current unique users | Shows the number of unique users currently logged on to Outlook Web App. This value monitors the number of unique active user sessions, so that users are only removed from this counter after they log off or their session times out. Determines current user load. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchange OWA\Current Unique Users", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: Outlook Web App: requests per second | Shows the number of requests handled by Outlook Web App per second. Determines current user load. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchange OWA\Requests/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | MS Exchange [Client Access Server]: MSExchangeWS: requests per second | Shows the number of requests processed each second. Determines current user load. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchangeWS\Requests/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Active Manager [{#INSTANCE}]: Database copy role | Database copy active or passive role. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchange Active Manager({#INSTANCE})\Database Copy Role Active"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
MS Exchange | Information Store [{#INSTANCE}]: Database state | Database state. Possible values: 0: Database without any copy and dismounted. 1: Database is a primary database and mounted. 2: Database is a passive copy and the state is healthy. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchangeIS Store({#INSTANCE})\Database State"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
MS Exchange | Information Store [{#INSTANCE}]: Active mailboxes count | Number of active mailboxes in this database. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchangeIS Store({#INSTANCE})\Active mailboxes"] |
MS Exchange | Information Store [{#INSTANCE}]: Page faults per second | Indicates the rate of page faults that can't be serviced because there are no pages available for allocation from the database cache. If this counter is above 0, it's an indication that the MSExchange Database\I/O Database Writes (Attached) Average Latency is too high. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchange Database({#INF.STORE})\Database Page Fault Stalls/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Information Store [{#INSTANCE}]: Log records stalled | Indicates the number of log records that can't be added to the log buffers per second because the log buffers are full. The average value should be below 10 per second. Spikes (maximum values) shouldn't be higher than 100 per second. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchange Database({#INF.STORE})\Log Record Stalls/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Information Store [{#INSTANCE}]: Log threads waiting | Indicates the number of threads waiting to complete an update of the database by writing their data to the log. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchange Database({#INF.STORE})\Log Threads Waiting", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Information Store [{#INSTANCE}]: RPC requests per second | Shows the number of RPC operations per second for each database instance. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchangeIS Store({#INSTANCE})\RPC Operations/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Information Store [{#INSTANCE}]: RPC requests latency | RPC Latency average is the average latency of RPC requests per database. Average is calculated over all RPCs since exrpc32 was loaded. Should be less than 50ms at all times, with spikes less than 100ms. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchangeIS Store({#INSTANCE})\RPC Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
MS Exchange | Information Store [{#INSTANCE}]: RPC requests total | Indicates the overall RPC requests currently executing within the information store process. Should be below 70 at all times. |
ZABBIX_PASSIVE | perf_counter_en["\MSExchangeIS Store({#INSTANCE})\RPC requests", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Database Counters [{#INSTANCE}]: Active database read operations per second | Shows the number of database read operations. |
ZABBIX_PASSIVE | perfcounteren["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Attached)/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Database Counters [{#INSTANCE}]: Active database read operations latency | Shows the average length of time per database read operation. Should be less than 20 ms on average. |
ZABBIX_PASSIVE | perfcounteren["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Attached) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
MS Exchange | Database Counters [{#INSTANCE}]: Passive database read operations latency | Shows the average length of time per passive database read operation. Should be less than 200ms on average. |
ZABBIX_PASSIVE | perfcounteren["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Recovery) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
MS Exchange | Database Counters [{#INSTANCE}]: Active database write operations per second | Shows the number of database write operations per second for each attached database instance. |
ZABBIX_PASSIVE | perfcounteren["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Writes (Attached)/sec", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Database Counters [{#INSTANCE}]: Active database write operations latency | Shows the average length of time per database write operation. Should be less than 50ms on average. |
ZABBIX_PASSIVE | perfcounteren["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Writes (Attached) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
MS Exchange | Database Counters [{#INSTANCE}]: Passive database write operations latency | Shows the average length of time, in ms, per passive database write operation. Should be less than the read latency for the same instance, as measured by the MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Recovery) Average Latency counter. |
ZABBIX_PASSIVE | perfcounteren["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Writes (Recovery) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
MS Exchange | Web Service [{#INSTANCE}]: Current connections | Shows the current number of connections established to the each Web Service. |
ZABBIX_PASSIVE | perfcounteren["\Web Service({#INSTANCE})\Current Connections", {$MS.EXCHANGE.PERF.INTERVAL}] |
MS Exchange | Domain Controller [{#INSTANCE}]: Read time | Time that it takes to send an LDAP read request to the domain controller in question and get a response. Should ideally be below 50 ms; spikes below 100 ms are acceptable. |
ZABBIX_PASSIVE | perfcounteren["\MSExchange ADAccess Domain Controllers({#INSTANCE})\LDAP Read Time", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
MS Exchange | Domain Controller [{#INSTANCE}]: Search time | Time that it takes to send an LDAP search request and get a response. Should ideally be below 50 ms; spikes below 100 ms are acceptable. |
ZABBIX_PASSIVE | perfcounteren["\MSExchange ADAccess Domain Controllers({#INSTANCE})\LDAP Search Time", {$MS.EXCHANGE.PERF.INTERVAL}] Preprocessing: - MULTIPLIER: |
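When tuning the macros above, it can help to read one of these counters directly through the agent before relying on the triggers below; a minimal sketch, assuming the Exchange host is registered in Zabbix as exchange-host (a hypothetical name), a discovered instance is substituted for the LLD macro, and 60 is used as an illustrative collection interval in place of {$MS.EXCHANGE.PERF.INTERVAL}:

```
zabbix_get -s exchange-host -k 'perf_counter_en["\MSExchange Database(<instance>)\Log Threads Waiting", 60]'
```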
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Information Store [{#INSTANCE}]: Page faults is too high | Too many page fault stalls for database "{#INSTANCE}". This counter should be 0 on production servers. |
min(/Microsoft Exchange Server 2016 by Zabbix agent/perf_counter_en["\MSExchange Database({#INF.STORE})\Database Page Fault Stalls/sec", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.DB.FAULTS.TIME})>{$MS.EXCHANGE.DB.FAULTS.WARN} |
AVERAGE | |
Information Store [{#INSTANCE}]: Log records stalls is too high | The number of stalled log records is too high. The average value should be less than 10 per second. |
avg(/Microsoft Exchange Server 2016 by Zabbix agent/perf_counter_en["\MSExchange Database({#INF.STORE})\Log Record Stalls/sec", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.LOG.STALLS.TIME})>{$MS.EXCHANGE.LOG.STALLS.WARN} |
AVERAGE | |
Information Store [{#INSTANCE}]: RPC Requests latency is too high | Should be less than 50ms at all times, with spikes less than 100ms. |
min(/Microsoft Exchange Server 2016 by Zabbix agent/perf_counter_en["\MSExchangeIS Store({#INSTANCE})\RPC Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.RPC.TIME})>{$MS.EXCHANGE.RPC.WARN} |
WARNING | |
Information Store [{#INSTANCE}]: RPC Requests total count is too high | Should be below 70 at all times. |
min(/Microsoft Exchange Server 2016 by Zabbix agent/perf_counter_en["\MSExchangeIS Store({#INSTANCE})\RPC requests", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.RPC.COUNT.TIME})>{$MS.EXCHANGE.RPC.COUNT.WARN} |
WARNING | |
Database Counters [{#INSTANCE}]: Average read time latency is too high | Should be less than 20ms on average. |
min(/Microsoft Exchange Server 2016 by Zabbix agent/perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Attached) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.DB.ACTIVE.READ.TIME})>{$MS.EXCHANGE.DB.ACTIVE.READ.WARN} |
WARNING | |
Database Counters [{#INSTANCE}]: Average read time latency is too high | Should be less than 200ms on average. |
min(/Microsoft Exchange Server 2016 by Zabbix agent/perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Recovery) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.DB.PASSIVE.READ.TIME})>{$MS.EXCHANGE.DB.PASSIVE.READ.WARN} |
WARNING | |
Database Counters [{#INSTANCE}]: Average write time latency is too high for {$MS.EXCHANGE.DB.ACTIVE.WRITE.TIME} | Should be less than 50ms on average. |
min(/Microsoft Exchange Server 2016 by Zabbix agent/perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Writes (Attached) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.DB.ACTIVE.WRITE.TIME})>{$MS.EXCHANGE.DB.ACTIVE.WRITE.WARN} |
WARNING | |
Database Counters [{#INSTANCE}]: Average write time latency is higher than read time latency for {$MS.EXCHANGE.DB.PASSIVE.WRITE.TIME} | Should be less than the read latency for the same instance, as measured by the MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Recovery) Average Latency counter. |
avg(/Microsoft Exchange Server 2016 by Zabbix agent/perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Writes (Recovery) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.DB.PASSIVE.WRITE.TIME})>avg(/Microsoft Exchange Server 2016 by Zabbix agent/perf_counter_en["\MSExchange Database ==> Instances({#INF.STORE}/_Total)\I/O Database Reads (Recovery) Average Latency", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.DB.PASSIVE.WRITE.TIME}) |
WARNING | |
Domain Controller [{#INSTANCE}]: LDAP read time is too high | Should be less than 50ms at all times, with spikes less than 100ms. |
min(/Microsoft Exchange Server 2016 by Zabbix agent/perf_counter_en["\MSExchange ADAccess Domain Controllers({#INSTANCE})\LDAP Read Time", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.LDAP.TIME})>{$MS.EXCHANGE.LDAP.WARN} |
AVERAGE | |
Domain Controller [{#INSTANCE}]: LDAP search time is too high | Should be less than 50ms at all times, with spikes less than 100ms. |
min(/Microsoft Exchange Server 2016 by Zabbix agent/perf_counter_en["\MSExchange ADAccess Domain Controllers({#INSTANCE})\LDAP Search Time", {$MS.EXCHANGE.PERF.INTERVAL}],{$MS.EXCHANGE.LDAP.TIME})>{$MS.EXCHANGE.LDAP.WARN} |
AVERAGE |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher.
This template is designed to monitor etcd by Zabbix; it works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template Etcd by HTTP collects metrics with the help of the HTTP agent from the /metrics endpoint.
Refer to the vendor documentation.
For the users of etcd version <= 3.4:
Note: in etcd v3.5 some metrics have been deprecated. See more details on Upgrade etcd from 3.4 to 3.5. Please upgrade your etcd instance, or use an older Etcd by HTTP template version.
This template has been tested on:
See Zabbix template operation for basic instructions.
Follow these instructions:
1. Make sure that etcd allows the collection of metrics. You can test it by running: curl -L http://localhost:2379/metrics.
2. Check if etcd is accessible from Zabbix proxy or Zabbix server, depending on where you are planning to do the monitoring. To verify it, run curl -L http://<etcd_node_address>:2379/metrics.
3. Add the template to the etcd node. By default, the template uses a client's port. You can configure the metrics endpoint location by adding the --listen-metrics-urls flag (for more details, see the etcd documentation).

Additional points to consider (a quick command sketch follows this list):
- If you have specified a non-standard port or scheme for etcd, don't forget to change the macros {$ETCD.SCHEME} and {$ETCD.PORT}.
- You can set the {$ETCD.USERNAME} and {$ETCD.PASSWORD} macros in the template to use on a host level if necessary.
- You can verify template availability by running: zabbix_get -s etcd-host -k etcd.health.

No specific Zabbix configuration is required.
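The verification steps above can be run in sequence; this is a minimal sketch that assumes etcd runs locally on the default client port 2379 and that the monitored host is registered in Zabbix as etcd-host (both names are assumptions to adapt):

```
# 1. Confirm that the local etcd instance exposes Prometheus-format metrics.
curl -L http://localhost:2379/metrics | head

# 2. Confirm the same endpoint is reachable from the Zabbix server or proxy
#    (replace <etcd_node_address> with the real node address).
curl -L http://<etcd_node_address>:2379/metrics | head

# Optional: serve metrics on a dedicated URL instead of the client port.
etcd --listen-metrics-urls=http://0.0.0.0:2381

# 3. After the template is linked, check the health item end to end.
zabbix_get -s etcd-host -k etcd.health
```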
Name | Description | Default | |
---|---|---|---|
{$ETCD.GRPC.ERRORS.MAX.WARN} | The maximum number of gRPC request failures. |
1 |
|
{$ETCD.GRPC_CODE.MATCHES} | The filter of discoverable gRPC codes. See more details on https://github.com/grpc/grpc/blob/master/doc/statuscodes.md. |
.* |
|
{$ETCD.GRPC_CODE.NOT_MATCHES} | The filter to exclude discovered gRPC codes. See more details on https://github.com/grpc/grpc/blob/master/doc/statuscodes.md. |
CHANGE_IF_NEEDED |
|
{$ETCD.GRPC_CODE.TRIGGER.MATCHES} | The filter of discoverable gRPC codes, which will create triggers. |
`Aborted | Unavailable` |
{$ETCD.HTTP.FAIL.MAX.WARN} | The maximum number of HTTP request failures. |
2 |
|
{$ETCD.LEADER.CHANGES.MAX.WARN} | The maximum number of leader changes. |
5 |
|
{$ETCD.OPEN.FDS.MAX.WARN} | The maximum percentage of used file descriptors. |
90 |
|
{$ETCD.PASSWORD} | - |
`` | |
{$ETCD.PORT} | The port of the etcd API endpoint. |
2379 |
|
{$ETCD.PROPOSAL.FAIL.MAX.WARN} | The maximum number of proposal failures. |
2 |
|
{$ETCD.PROPOSAL.PENDING.MAX.WARN} | The maximum number of proposals in queue. |
5 |
|
{$ETCD.SCHEME} | The request scheme which may be http or https. |
http |
|
{$ETCD.USER} | - |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
gRPC codes discovery | - |
DEPENDENT | etcd.grpc_code.discovery Preprocessing: - PROMETHEUS_TO_JSON: grpc_server_handled_total - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 1h Filter: AND - {#GRPC.CODE} NOT_MATCHES_REGEX {$ETCD.GRPC_CODE.NOT_MATCHES} - {#GRPC.CODE} MATCHES_REGEX {$ETCD.GRPC_CODE.MATCHES} Overrides: trigger |
Peers discovery | - |
DEPENDENT | etcd.peer.discovery Preprocessing: - PROMETHEUS_TO_JSON: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Etcd | Etcd: Service's TCP port state | - |
SIMPLE | net.tcp.service["{$ETCD.SCHEME}","{HOST.CONN}","{$ETCD.PORT}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Etcd | Etcd: Node health | - |
HTTP_AGENT | etcd.health Preprocessing: - JSONPATH: - BOOLTODECIMAL ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Etcd | Etcd: Server is a leader | It defines - whether or not this member is a leader: 1 - it is; 0 - otherwise. |
DEPENDENT | etcd.is.leader Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - DISCARDUNCHANGEDHEARTBEAT: |
Etcd | Etcd: Server has a leader | It defines - whether or not a leader exists: 1 - it exists; 0 - it does not. |
DEPENDENT | etcd.has.leader Preprocessing: - PROMETHEUSPATTERN: - DISCARDUNCHANGED_HEARTBEAT: |
Etcd | Etcd: Leader changes | The number of leader changes the member has seen since its start. |
DEPENDENT | etcd.leader.changes Preprocessing: - PROMETHEUS_PATTERN: |
Etcd | Etcd: Proposals committed per second | The number of consensus proposals committed. |
DEPENDENT | etcd.proposals.committed.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Etcd | Etcd: Proposals applied per second | The number of consensus proposals applied. |
DEPENDENT | etcd.proposals.applied.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Etcd | Etcd: Proposals failed per second | The number of failed proposals seen. |
DEPENDENT | etcd.proposals.failed.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Etcd | Etcd: Proposals pending | The current number of pending proposals to commit. |
DEPENDENT | etcd.proposals.pending Preprocessing: - PROMETHEUS_PATTERN: |
Etcd | Etcd: Reads per second | The number of read actions (e.g., get/getRecursive) by this member per second. |
DEPENDENT | etcd.reads.rate Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: - CHANGE_PER_SECOND |
Etcd | Etcd: Writes per second | The number of writes (e.g., set/compareAndDelete) seen by this member per second. |
DEPENDENT | etcd.writes.rate Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: - CHANGE_PER_SECOND |
Etcd | Etcd: Client gRPC received bytes per second | The number of bytes received from gRPC clients per second. |
DEPENDENT | etcd.network.grpc.received.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Etcd | Etcd: Client gRPC sent bytes per second | The number of bytes sent from gRPC clients per second. |
DEPENDENT | etcd.network.grpc.sent.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Etcd | Etcd: HTTP requests received | The number of requests received into the system (successfully parsed and authenticated). |
DEPENDENT | etcd.http.requests.rate Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: - CHANGE_PER_SECOND |
Etcd | Etcd: HTTP 5XX | The number of handled failures of requests (non-watches), by the method (GET/PUT etc.), with a 5XX code, per second. |
DEPENDENT | etcd.http.requests.5xx.rate Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: - CHANGE_PER_SECOND |
Etcd | Etcd: HTTP 4XX | The number of handled failures of requests (non-watches), by the method (GET/PUT etc.), with a 4XX code, per second. |
DEPENDENT | etcd.http.requests.4xx.rate Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: - CHANGE_PER_SECOND |
Etcd | Etcd: RPCs received per second | The number of RPC stream messages received on the server. |
DEPENDENT | etcd.grpc.received.rate Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: - CHANGE_PER_SECOND |
Etcd | Etcd: RPCs sent per second | The number of gRPC stream messages sent by the server. |
DEPENDENT | etcd.grpc.sent.rate Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: - CHANGE_PER_SECOND |
Etcd | Etcd: RPCs started per second | The number of RPCs started on the server. |
DEPENDENT | etcd.grpc.started.rate Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: - CHANGE_PER_SECOND |
Etcd | Etcd: Server version | The version of the etcd server. |
DEPENDENT | etcd.server.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Etcd | Etcd: Cluster version | The version of the etcd cluster. |
DEPENDENT | etcd.cluster.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Etcd | Etcd: DB size | The total size of the underlying database. |
DEPENDENT | etcd.db.size Preprocessing: - PROMETHEUS_PATTERN: |
Etcd | Etcd: Keys compacted per second | The number of DB keys compacted per second. |
DEPENDENT | etcd.keys.compacted.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
Etcd | Etcd: Keys expired per second | The number of expired keys per second. |
DEPENDENT | etcd.keys.expired.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Etcd | Etcd: Keys total | The total number of keys. |
DEPENDENT | etcd.keys.total Preprocessing: - PROMETHEUS_PATTERN: |
Etcd | Etcd: Uptime | Etcd server uptime. |
DEPENDENT | etcd.uptime Preprocessing: - PROMETHEUS_PATTERN: - JAVASCRIPT: |
Etcd | Etcd: Virtual memory | The size of virtual memory expressed in bytes. |
DEPENDENT | etcd.virtual.bytes Preprocessing: - PROMETHEUS_PATTERN: |
Etcd | Etcd: Resident memory | The size of resident memory expressed in bytes. |
DEPENDENT | etcd.res.bytes Preprocessing: - PROMETHEUS_PATTERN: |
Etcd | Etcd: CPU | The total user and system CPU time spent in seconds. |
DEPENDENT | etcd.cpu.util Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Etcd | Etcd: Open file descriptors | The number of open file descriptors. |
DEPENDENT | etcd.open.fds Preprocessing: - PROMETHEUS_PATTERN: |
Etcd | Etcd: Maximum open file descriptors | The maximum number of open file descriptors. |
DEPENDENT | etcd.max.fds Preprocessing: - PROMETHEUS_PATTERN: |
Etcd | Etcd: Deletes per second | The number of deletes seen by this member per second. |
DEPENDENT | etcd.delete.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Etcd | Etcd: PUT per second | The number of puts seen by this member per second. |
DEPENDENT | etcd.put.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Etcd | Etcd: Range per second | The number of ranges seen by this member per second. |
DEPENDENT | etcd.range.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Etcd | Etcd: Transaction per second | The number of transactions seen by this member per second. |
DEPENDENT | etcd.txn.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Etcd | Etcd: Pending events | The total number of pending events to be sent. |
DEPENDENT | etcd.events.sent.rate Preprocessing: - PROMETHEUS_PATTERN: |
Etcd | Etcd: RPCs completed with code {#GRPC.CODE} | The number of RPCs completed on the server with grpc_code {#GRPC.CODE}. |
DEPENDENT | etcd.grpc.handled.rate[{#GRPC.CODE}] Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: - CHANGE_PER_SECOND |
Etcd | Etcd: Etcd peer {#ETCD.PEER}: Bytes sent | The number of bytes sent to a peer with the ID {#ETCD.PEER}. |
DEPENDENT | etcd.bytes.sent.rate[{#ETCD.PEER}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
Etcd | Etcd: Etcd peer {#ETCD.PEER}: Bytes received | The number of bytes received from a peer with the ID {#ETCD.PEER}. |
DEPENDENT | etcd.bytes.received.rate[{#ETCD.PEER}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
Etcd | Etcd: Etcd peer {#ETCD.PEER}: Send failures | The number of send failures to a peer with the ID {#ETCD.PEER}. |
DEPENDENT | etcd.sent.fail.rate[{#ETCD.PEER}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
Etcd | Etcd: Etcd peer {#ETCD.PEER}: Receive failures | The number of receive failures from a peer with the ID {#ETCD.PEER}. |
DEPENDENT | etcd.received.fail.rate[{#ETCD.PEER}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ ON_FAIL: - CHANGE_PER_SECOND |
Zabbix raw items | Etcd: Get node metrics | - |
HTTP_AGENT | etcd.get_metrics |
Zabbix raw items | Etcd: Get version | - |
HTTP_AGENT | etcd.get_version |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Etcd: Service is unavailable | - |
last(/Etcd by HTTP/net.tcp.service["{$ETCD.SCHEME}","{HOST.CONN}","{$ETCD.PORT}"])=0 |
AVERAGE | Manual close: YES |
Etcd: Node healthcheck failed | See more details on https://etcd.io/docs/v3.5/op-guide/monitoring/#health-check. |
last(/Etcd by HTTP/etcd.health)=0 |
AVERAGE | Depends on: - Etcd: Service is unavailable |
Etcd: Failed to fetch info data | Zabbix has not received data for items for the last 30 minutes. |
nodata(/Etcd by HTTP/etcd.is.leader,30m)=1 |
WARNING | Manual close: YES Depends on: - Etcd: Service is unavailable |
Etcd: Member has no leader | If a member does not have a leader, it is totally unavailable. |
last(/Etcd by HTTP/etcd.has.leader)=0 |
AVERAGE | |
Etcd: Instance has seen too many leader changes | Rapid leadership changes impact the performance of etcd significantly and signal that the leader is unstable, perhaps due to network connectivity issues or excessive load. |
(max(/Etcd by HTTP/etcd.leader.changes,15m)-min(/Etcd by HTTP/etcd.leader.changes,15m))>{$ETCD.LEADER.CHANGES.MAX.WARN} |
WARNING | |
Etcd: Too many proposal failures | Normally related to two issues: temporary failures related to a leader election or longer downtime caused by a loss of quorum in the cluster. |
min(/Etcd by HTTP/etcd.proposals.failed.rate,5m)>{$ETCD.PROPOSAL.FAIL.MAX.WARN} |
WARNING | |
Etcd: Too many proposals are queued to commit | Rising pending proposals suggests there is a high client load, or the member cannot commit proposals. |
min(/Etcd by HTTP/etcd.proposals.pending,5m)>{$ETCD.PROPOSAL.PENDING.MAX.WARN} |
WARNING | |
Etcd: Too many HTTP requests failures | Too many requests failed on the etcd instance with a 5XX HTTP code. |
min(/Etcd by HTTP/etcd.http.requests.5xx.rate,5m)>{$ETCD.HTTP.FAIL.MAX.WARN} |
WARNING | |
Etcd: Server version has changed | The Etcd version has changed. Acknowledge to close manually. |
last(/Etcd by HTTP/etcd.server.version,#1)<>last(/Etcd by HTTP/etcd.server.version,#2) and length(last(/Etcd by HTTP/etcd.server.version))>0 |
INFO | Manual close: YES |
Etcd: Cluster version has changed | The Etcd version has changed. Acknowledge to close manually. |
last(/Etcd by HTTP/etcd.cluster.version,#1)<>last(/Etcd by HTTP/etcd.cluster.version,#2) and length(last(/Etcd by HTTP/etcd.cluster.version))>0 |
INFO | Manual close: YES |
Etcd: Host has been restarted | The host uptime is less than 10 minutes. |
last(/Etcd by HTTP/etcd.uptime)<10m |
INFO | Manual close: YES |
Etcd: Current number of open files is too high | Heavy usage of file descriptors (i.e., near the process's file descriptor limit) indicates a potential file descriptor exhaustion issue. If the file descriptors are exhausted, etcd may panic. |
min(/Etcd by HTTP/etcd.open.fds,5m)/last(/Etcd by HTTP/etcd.max.fds)*100>{$ETCD.OPEN.FDS.MAX.WARN} |
WARNING | |
Etcd: Too many failed gRPC requests with code: {#GRPC.CODE} | - |
min(/Etcd by HTTP/etcd.grpc.handled.rate[{#GRPC.CODE}],5m)>{$ETCD.GRPC.ERRORS.MAX.WARN} |
WARNING |
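To make the file descriptor trigger above concrete: with etcd.max.fds reporting 1024 and etcd.open.fds holding at 950 for five minutes, the expression evaluates 950 / 1024 * 100 ≈ 92.8, which exceeds the default {$ETCD.OPEN.FDS.MAX.WARN} threshold of 90 and raises the warning (the numbers here are illustrative).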
Please report any issues with the template at https://support.zabbix.com.
For Zabbix version: 6.2 and higher
The template to monitor Envoy Proxy by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template Envoy Proxy by HTTP collects metrics by the HTTP agent from the {$ENVOY.METRICS.PATH} endpoint (default: /stats/prometheus).
This template was tested on:
See Zabbix template operation for basic instructions.
Internal service metrics are collected from {$ENVOY.METRICS.PATH} endpoint (default: /stats/prometheus). https://www.envoyproxy.io/docs/envoy/v1.20.0/operations/stats_overview
Don't forget to change macros {$ENVOY.URL}, {$ENVOY.METRICS.PATH}.
Also, see the Macros section for a list of macros used to set trigger values.
NOTE. Some metrics may not be collected depending on your Envoy Proxy instance version and configuration.
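Before linking the template, it is worth confirming that the admin interface actually serves Prometheus-format stats; a minimal check, assuming the default {$ENVOY.URL} of http://localhost:9901 and the default metrics path:

```
curl http://localhost:9901/stats/prometheus | head
```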
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$ENVOY.CERT.MIN} | Minimum number of days before certificate expiration used for trigger expression. |
7 |
{$ENVOY.METRICS.PATH} | The path Zabbix will scrape metrics in prometheus format from. |
/stats/prometheus |
{$ENVOY.URL} | Instance URL. |
http://localhost:9901 |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Cluster metrics discovery | - |
DEPENDENT | envoy.lld.cluster Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
HTTP metrics discovery | - |
DEPENDENT | envoy.lld.http Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Listeners metrics discovery | - |
DEPENDENT | envoy.lld.listeners Preprocessing: - PROMETHEUS_TO_JSON - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Envoy Proxy | Envoy Proxy: Server state | State of the server. Live - (default) Server is live and serving traffic. Draining - Server is draining listeners in response to external health checks failing. Pre initializing - Server has not yet completed cluster manager initialization. Initializing - Server is running the cluster manager initialization callbacks (e.g., RDS). |
DEPENDENT | envoy.server.state Preprocessing: - PROMETHEUS_PATTERN: - DISCARD_UNCHANGED_HEARTBEAT: |
Envoy Proxy | Envoy Proxy: Server live | 1 if the server is not currently draining, 0 otherwise. |
DEPENDENT | envoy.server.live Preprocessing: - PROMETHEUS_PATTERN: - DISCARD_UNCHANGED_HEARTBEAT: |
Envoy Proxy | Envoy Proxy: Uptime | Current server uptime in seconds. |
DEPENDENT | envoy.server.uptime Preprocessing: - PROMETHEUS_PATTERN: ⛔️ ON_FAIL: |
Envoy Proxy | Envoy Proxy: Certificate expiration, day before | Number of days until the next certificate being managed will expire. |
DEPENDENT | envoy.server.days_until_first_cert_expiring Preprocessing: - PROMETHEUS_PATTERN: |
Envoy Proxy | Envoy Proxy: Server concurrency | Number of worker threads. |
DEPENDENT | envoy.server.concurrency Preprocessing: - PROMETHEUS_PATTERN: |
Envoy Proxy | Envoy Proxy: Memory allocated | Current amount of allocated memory in bytes. Total of both new and old Envoy processes on hot restart. |
DEPENDENT | envoy.server.memory_allocated Preprocessing: - PROMETHEUS_PATTERN: envoy_server_memory_allocated |
Envoy Proxy | Envoy Proxy: Memory heap size | Current reserved heap size in bytes. New Envoy process heap size on hot restart. |
DEPENDENT | envoy.server.memory_heap_size Preprocessing: - PROMETHEUS_PATTERN: |
Envoy Proxy | Envoy Proxy: Memory physical size | Current estimate of total bytes of the physical memory. New Envoy process physical memory size on hot restart. |
DEPENDENT | envoy.server.memory_physical_size Preprocessing: - PROMETHEUS_PATTERN: |
Envoy Proxy | Envoy Proxy: Filesystem, flushed by timer rate | Total number of times internal flush buffers are written to a file due to flush timeout per second. |
DEPENDENT | envoy.filesystem.flushed_by_timer.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Filesystem, write completed rate | Total number of times a file was written per second. |
DEPENDENT | envoy.filesystem.write_completed.rate Preprocessing: - PROMETHEUS_PATTERN: envoy_filesystem_write_completed - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Filesystem, write failed rate | Total number of times an error occurred during a file write operation per second. |
DEPENDENT | envoy.filesystem.write_failed.rate Preprocessing: - PROMETHEUS_PATTERN: envoy_filesystem_write_failed - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Filesystem, reopen failed rate | Total number of times a file failed to be opened per second. |
DEPENDENT | envoy.filesystem.reopen_failed.rate Preprocessing: - PROMETHEUS_PATTERN: envoy_filesystem_reopen_failed - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Connections, total | Total connections of both new and old Envoy processes. |
DEPENDENT | envoy.server.total_connections Preprocessing: - PROMETHEUS_PATTERN: envoy_server_total_connections |
Envoy Proxy | Envoy Proxy: Connections, parent | Total connections of the old Envoy process on hot restart. |
DEPENDENT | envoy.server.parent_connections Preprocessing: - PROMETHEUS_PATTERN: envoy_server_parent_connections |
Envoy Proxy | Envoy Proxy: Clusters, warming | Number of currently warming (not active) clusters. |
DEPENDENT | envoy.cluster_manager.warming_clusters Preprocessing: - PROMETHEUS_PATTERN: |
Envoy Proxy | Envoy Proxy: Clusters, active | Number of currently active (warmed) clusters. |
DEPENDENT | envoy.cluster_manager.active_clusters Preprocessing: - PROMETHEUS_PATTERN: |
Envoy Proxy | Envoy Proxy: Clusters, added rate | Total clusters added (either via static config or CDS) per second. |
DEPENDENT | envoy.cluster_manager.cluster_added.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Clusters, modified rate | Total clusters modified (via CDS) per second. |
DEPENDENT | envoy.cluster_manager.cluster_modified.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Clusters, removed rate | Total clusters removed (via CDS) per second. |
DEPENDENT | envoy.cluster_manager.cluster_removed.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Clusters, updates rate | Total cluster updates per second. |
DEPENDENT | envoy.cluster_manager.cluster_updated.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Listeners, active | Number of currently active listeners. |
DEPENDENT | envoy.listener_manager.total_listeners_active Preprocessing: - PROMETHEUS_PATTERN: envoy_listener_manager_total_listeners_active : function : sum |
Envoy Proxy | Envoy Proxy: Listeners, draining | Number of currently draining listeners. |
DEPENDENT | envoy.listener_manager.total_listeners_draining Preprocessing: - PROMETHEUS_PATTERN: envoy_listener_manager_total_listeners_draining : function : sum |
Envoy Proxy | Envoy Proxy: Listener, warming | Number of currently warming listeners. |
DEPENDENT | envoy.listener_manager.total_listeners_warming Preprocessing: - PROMETHEUS_PATTERN: envoy_listener_manager_total_listeners_warming : function : sum |
Envoy Proxy | Envoy Proxy: Listener manager, initialized | A boolean (1 if started and 0 otherwise) that indicates whether listeners have been initialized on workers. |
DEPENDENT | envoy.listener_manager.workers_started Preprocessing: - PROMETHEUS_PATTERN: - DISCARD_UNCHANGED_HEARTBEAT: |
Envoy Proxy | Envoy Proxy: Listeners, create failure | Total failed listener object additions to workers per second. |
DEPENDENT | envoy.listener_manager.listener_create_failure.rate Preprocessing: - PROMETHEUS_PATTERN: envoy_listener_manager_listener_create_failure - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Listeners, create success | Total listener objects successfully added to workers per second. |
DEPENDENT | envoy.listener_manager.listener_create_success.rate Preprocessing: - PROMETHEUS_PATTERN: envoy_listener_manager_listener_create_success - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Listeners, added | Total listeners added (either via static config or LDS) per second. |
DEPENDENT | envoy.listener_manager.listener_added.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Listeners, stopped | Total listeners stopped per second. |
DEPENDENT | envoy.listener_manager.listener_stopped.rate Preprocessing: - PROMETHEUS_PATTERN: - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Membership, total | Current cluster membership total. |
DEPENDENT | envoy.cluster.membership_total["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Membership, healthy | Current cluster healthy total (inclusive of both health checking and outlier detection). |
DEPENDENT | envoy.cluster.membership_healthy["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Membership, unhealthy | Current cluster unhealthy. |
CALCULATED | envoy.cluster.membership_unhealthy["{#CLUSTER_NAME}"] Expression: last(//envoy.cluster.membership_total["{#CLUSTER_NAME}"]) - last(//envoy.cluster.membership_healthy["{#CLUSTER_NAME}"]) |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Membership, degraded | Current cluster degraded total. |
DEPENDENT | envoy.cluster.membership_degraded["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Connections, total | Current cluster total connections. |
DEPENDENT | envoy.cluster.upstream_cx_total["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_cluster_upstream_cx_total{envoy_cluster_name = "{#CLUSTER_NAME}"} : function : sum |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Connections, active | Current cluster total active connections. |
DEPENDENT | envoy.cluster.upstream_cx_active["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_cluster_upstream_cx_active{envoy_cluster_name = "{#CLUSTER_NAME}"} : function : sum |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Requests total, rate | Current cluster request total per second. |
DEPENDENT | envoy.cluster.upstream_rq_total.rate["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_cluster_upstream_rq_total{envoy_cluster_name = "{#CLUSTER_NAME}"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Requests timeout, rate | Current cluster requests that timed out waiting for a response per second. |
DEPENDENT | envoy.cluster.upstream_rq_timeout.rate["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_cluster_upstream_rq_timeout{envoy_cluster_name = "{#CLUSTER_NAME}"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Requests completed, rate | Total upstream requests completed per second. |
DEPENDENT | envoy.cluster.upstream_rq_completed.rate["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_cluster_upstream_rq_completed{envoy_cluster_name = "{#CLUSTER_NAME}"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Requests 2xx, rate | Aggregate HTTP response codes per second. |
DEPENDENT | envoy.cluster.upstream_rq_2x.rate["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_cluster_upstream_rq_xx{envoy_cluster_name = "{#CLUSTER_NAME}", envoy_response_code_class="2"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Requests 3xx, rate | Aggregate HTTP response codes per second. |
DEPENDENT | envoy.cluster.upstream_rq_3x.rate["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_cluster_upstream_rq_xx{envoy_cluster_name = "{#CLUSTER_NAME}", envoy_response_code_class="3"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Requests 4xx, rate | Aggregate HTTP response codes per second. |
DEPENDENT | envoy.cluster.upstream_rq_4x.rate["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_cluster_upstream_rq_xx{envoy_cluster_name = "{#CLUSTER_NAME}", envoy_response_code_class="4"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Requests 5xx, rate | Aggregate HTTP response codes per second. |
DEPENDENT | envoy.cluster.upstream_rq_5x.rate["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_cluster_upstream_rq_xx{envoy_cluster_name = "{#CLUSTER_NAME}", envoy_response_code_class="5"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Requests pending | Total active requests pending a connection pool connection. |
DEPENDENT | envoy.cluster.upstream_rq_pending_active["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Requests active | Total active requests. |
DEPENDENT | envoy.cluster.upstream_rq_active["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_cluster_upstream_rq_active{envoy_cluster_name = "{#CLUSTER_NAME}"} : function : sum |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Upstream bytes out, rate | Total sent connection bytes per second. |
DEPENDENT | envoy.cluster.upstream_cx_tx_bytes_total.rate["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_cluster_upstream_cx_tx_bytes_total{envoy_cluster_name = "{#CLUSTER_NAME}"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Cluster ["{#CLUSTER_NAME}"]: Upstream bytes in, rate | Total received connection bytes per second. |
DEPENDENT | envoy.cluster.upstream_cx_rx_bytes_total.rate["{#CLUSTER_NAME}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_cluster_upstream_cx_rx_bytes_total{envoy_cluster_name = "{#CLUSTER_NAME}"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Listener ["{#LISTENER_ADDRESS}"]: Connections, active | Total active connections. |
DEPENDENT | envoy.listener.downstream_cx_active["{#LISTENER_ADDRESS}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_listener_downstream_cx_active{envoy_listener_address = "{#LISTENER_ADDRESS}"} : function : sum |
Envoy Proxy | Envoy Proxy: Listener ["{#LISTENER_ADDRESS}"]: Connections, rate | Total connections per second. |
DEPENDENT | envoy.listener.downstream_cx_total.rate["{#LISTENER_ADDRESS}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_listener_downstream_cx_total{envoy_listener_address = "{#LISTENER_ADDRESS}"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: Listener ["{#LISTENER_ADDRESS}"]: Sockets, undergoing | Sockets currently undergoing listener filter processing. |
DEPENDENT | envoy.listener.downstream_pre_cx_active["{#LISTENER_ADDRESS}"] Preprocessing: - PROMETHEUS_PATTERN: |
Envoy Proxy | Envoy Proxy: HTTP ["{#CONN_MANAGER}"]: Requests, rate | Total requests per second. |
DEPENDENT | envoy.http.downstream_rq_total.rate["{#CONN_MANAGER}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_http_downstream_rq_total{envoy_http_conn_manager_prefix = "{#CONN_MANAGER}"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: HTTP ["{#CONN_MANAGER}"]: Requests, active | Total active requests. |
DEPENDENT | envoy.http.downstream_rq_active["{#CONN_MANAGER}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_http_downstream_rq_active{envoy_http_conn_manager_prefix = "{#CONN_MANAGER}"} : function : sum |
Envoy Proxy | Envoy Proxy: HTTP ["{#CONN_MANAGER}"]: Requests timeout, rate | Total requests closed due to a timeout on the request path per second. |
DEPENDENT | envoy.http.downstream_rq_timeout["{#CONN_MANAGER}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_http_downstream_rq_timeout{envoy_http_conn_manager_prefix = "{#CONN_MANAGER}"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: HTTP ["{#CONN_MANAGER}"]: Connections, rate | Total connections per second. |
DEPENDENT | envoy.http.downstream_cx_total["{#CONN_MANAGER}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_http_downstream_cx_total{envoy_http_conn_manager_prefix = "{#CONN_MANAGER}"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: HTTP ["{#CONN_MANAGER}"]: Connections, active | Total active connections. |
DEPENDENT | envoy.http.downstream_cx_active["{#CONN_MANAGER}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_http_downstream_cx_active{envoy_http_conn_manager_prefix = "{#CONN_MANAGER}"} : function : sum |
Envoy Proxy | Envoy Proxy: HTTP ["{#CONN_MANAGER}"]: Bytes in, rate | Total bytes received per second. |
DEPENDENT | envoy.http.downstream_cx_rx_bytes_total.rate["{#CONN_MANAGER}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_http_downstream_cx_rx_bytes_total{envoy_http_conn_manager_prefix = "{#CONN_MANAGER}"} : function : sum - CHANGE_PER_SECOND |
Envoy Proxy | Envoy Proxy: HTTP ["{#CONN_MANAGER}"]: Bytes out, rate | Total bytes sent per second. |
DEPENDENT | envoy.http.downstream_cx_tx_bytes_total.rate["{#CONN_MANAGER}"] Preprocessing: - PROMETHEUS_PATTERN: envoy_http_downstream_cx_tx_bytes_total{envoy_http_conn_manager_prefix = "{#CONN_MANAGER}"} : function : sum - CHANGE_PER_SECOND |
Zabbix raw items | Envoy Proxy: Get node metrics | Get server metrics. |
HTTP_AGENT | envoy.get_metrics Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ ON_FAIL: DISCARD_VALUE -> |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Envoy Proxy: Server state is not live | - |
last(/Envoy Proxy by HTTP/envoy.server.state) > 0 |
AVERAGE | |
Envoy Proxy: Service has been restarted | Uptime is less than 10 minutes. |
last(/Envoy Proxy by HTTP/envoy.server.uptime)<10m |
INFO | Manual close: YES |
Envoy Proxy: Failed to fetch metrics data | Zabbix has not received data for items for the last 10 minutes. |
nodata(/Envoy Proxy by HTTP/envoy.server.uptime,10m)=1 |
WARNING | Manual close: YES |
Envoy Proxy: SSL certificate expires soon | Please check certificate. Less than {$ENVOY.CERT.MIN} days left until the next certificate being managed will expire. |
last(/Envoy Proxy by HTTP/envoy.server.days_until_first_cert_expiring)<{$ENVOY.CERT.MIN} |
WARNING | |
Envoy Proxy: There are unhealthy clusters | - |
last(/Envoy Proxy by HTTP/envoy.cluster.membership_unhealthy["{#CLUSTER_NAME}"]) > 0 |
AVERAGE |
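As a worked example of the calculated item behind the last trigger: Membership, unhealthy is simply membership_total minus membership_healthy, so a cluster reporting membership_total = 5 and membership_healthy = 4 yields 1 unhealthy host, and last(...) > 0 fires the trigger (the numbers here are illustrative).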
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor Elasticsearch by Zabbix that works without any external scripts.
It works with both standalone and cluster instances.
The metrics are collected in one pass remotely using an HTTP agent.
The values are obtained from the _cluster/health, _cluster/stats, and _nodes/stats REST API requests.
This template was tested on:
See Zabbix template operation for basic instructions.
You can set the {$ELASTICSEARCH.USERNAME} and {$ELASTICSEARCH.PASSWORD} macros in the template for use on the host level. If you use an atypical location of the ES API, don't forget to change the macros {$ELASTICSEARCH.SCHEME} and {$ELASTICSEARCH.PORT}.
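A quick way to confirm that the REST API endpoints used by the template are reachable is to query them directly; a minimal sketch, assuming the default scheme and port and a hypothetical node name es-node (add -u <username>:<password> only if security is enabled):

```
curl http://es-node:9200/_cluster/health
curl http://es-node:9200/_cluster/stats
curl http://es-node:9200/_nodes/stats
```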
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$ELASTICSEARCH.FETCH_LATENCY.MAX.WARN} | Maximum of fetch latency in milliseconds for trigger expression. |
100 |
{$ELASTICSEARCH.FLUSH_LATENCY.MAX.WARN} | Maximum of flush latency in milliseconds for trigger expression. |
100 |
{$ELASTICSEARCH.HEAP_USED.MAX.CRIT} | The maximum percentage of JVM heap in use for the critical trigger expression. |
95 |
{$ELASTICSEARCH.HEAP_USED.MAX.WARN} | The maximum percentage of JVM heap in use for the warning trigger expression. |
85 |
{$ELASTICSEARCH.INDEXING_LATENCY.MAX.WARN} | Maximum of indexing latency in milliseconds for trigger expression. |
100 |
{$ELASTICSEARCH.PASSWORD} | The password of the Elasticsearch. |
`` |
{$ELASTICSEARCH.PORT} | The port of the Elasticsearch host. |
9200 |
{$ELASTICSEARCH.QUERY_LATENCY.MAX.WARN} | Maximum of query latency in milliseconds for trigger expression. |
100 |
{$ELASTICSEARCH.RESPONSE_TIME.MAX.WARN} | The ES cluster maximum response time in seconds for trigger expression. |
10s |
{$ELASTICSEARCH.SCHEME} | The scheme of the Elasticsearch (http/https). |
http |
{$ELASTICSEARCH.USERNAME} | The username of the Elasticsearch. |
`` |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Cluster nodes discovery | Discovers ES cluster nodes. |
HTTP_AGENT | es.nodes.discovery Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
ES cluster | ES: Service status | Checks if the service is running and accepting TCP connections. |
SIMPLE | net.tcp.service["{$ELASTICSEARCH.SCHEME}","{HOST.CONN}","{$ELASTICSEARCH.PORT}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
ES cluster | ES: Service response time | Checks performance of the TCP service. |
SIMPLE | net.tcp.service.perf["{$ELASTICSEARCH.SCHEME}","{HOST.CONN}","{$ELASTICSEARCH.PORT}"] |
ES cluster | ES: Cluster health status | Health status of the cluster, based on the state of its primary and replica shards. Statuses are: green All shards are assigned. yellow All primary shards are assigned, but one or more replica shards are unassigned. If a node in the cluster fails, some data could be unavailable until that node is repaired. red One or more primary shards are unassigned, so some data is unavailable. This can occur briefly during cluster startup as primary shards are assigned. |
DEPENDENT | es.cluster.status Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
ES cluster | ES: Number of nodes | The number of nodes within the cluster. |
DEPENDENT | es.cluster.numberofnodes Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
ES cluster | ES: Number of data nodes | The number of nodes that are dedicated to data nodes. |
DEPENDENT | es.cluster.numberofdatanodes Preprocessing: - JSONPATH: - DISCARD UNCHANGED_HEARTBEAT:1h |
ES cluster | ES: Number of relocating shards | The number of shards that are under relocation. |
DEPENDENT | es.cluster.relocating_shards Preprocessing: - JSONPATH: |
ES cluster | ES: Number of initializing shards | The number of shards that are under initialization. |
DEPENDENT | es.cluster.initializing_shards Preprocessing: - JSONPATH: |
ES cluster | ES: Number of unassigned shards | The number of shards that are not allocated. |
DEPENDENT | es.cluster.unassigned_shards Preprocessing: - JSONPATH: |
ES cluster | ES: Delayed unassigned shards | The number of shards whose allocation has been delayed by the timeout settings. |
DEPENDENT | es.cluster.delayedunassignedshards Preprocessing: - JSONPATH: |
ES cluster | ES: Number of pending tasks | The number of cluster-level changes that have not yet been executed. |
DEPENDENT | es.cluster.numberofpending_tasks Preprocessing: - JSONPATH: |
ES cluster | ES: Task max waiting in queue | The time expressed in seconds since the earliest initiated task is waiting for being performed. |
DEPENDENT | es.cluster.taskmaxwaitinginqueue Preprocessing: - JSONPATH: - MULTIPLIER: |
ES cluster | ES: Inactive shards percentage | The ratio of inactive shards in the cluster expressed as a percentage. |
DEPENDENT | es.cluster.inactiveshardspercentasnumber Preprocessing: - JSONPATH: - JAVASCRIPT: |
ES cluster | ES: Cluster uptime | Uptime duration in seconds since JVM has last started. |
DEPENDENT | es.nodes.jvm.max_uptime Preprocessing: - JSONPATH: - MULTIPLIER: |
ES cluster | ES: Number of non-deleted documents | The total number of non-deleted documents across all primary shards assigned to the selected nodes. This number is based on the documents in Lucene segments and may include the documents from nested fields. |
DEPENDENT | es.indices.docs.count Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
ES cluster | ES: Indices with shards assigned to nodes | The total number of indices with shards assigned to the selected nodes. |
DEPENDENT | es.indices.count Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
ES cluster | ES: Total size of all file stores | The total size in bytes of all file stores across all selected nodes. |
DEPENDENT | es.nodes.fs.total_in_bytes Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
ES cluster | ES: Total available size to JVM in all file stores | The total number of bytes available to JVM in the file stores across all selected nodes. Depending on OS or process-level restrictions, this number may be less than nodes.fs.free_in_bytes. This is the actual amount of free disk space the selected Elasticsearch nodes can use. |
DEPENDENT | es.nodes.fs.available_in_bytes Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
ES cluster | ES: Nodes with the data role | The number of selected nodes with the data role. |
DEPENDENT | es.nodes.count.data Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
ES cluster | ES: Nodes with the ingest role | The number of selected nodes with the ingest role. |
DEPENDENT | es.nodes.count.ingest Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
ES cluster | ES: Nodes with the master role | The number of selected nodes with the master role. |
DEPENDENT | es.nodes.count.master Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
ES cluster | ES {#ES.NODE}: Total size | Total size (in bytes) of all file stores. |
DEPENDENT | es.node.fs.total.total_in_bytes[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
ES cluster | ES {#ES.NODE}: Total available size | The total number of bytes available to this Java virtual machine on all file stores. Depending on OS or process-level restrictions, this might appear less than fs.total.free_in_bytes. This is the actual amount of free disk space the Elasticsearch node can utilize. |
DEPENDENT | es.node.fs.total.available_in_bytes[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
ES cluster | ES {#ES.NODE}: Node uptime | JVM uptime in seconds. |
DEPENDENT | es.node.jvm.uptime[{#ES.NODE}] Preprocessing: - JSONPATH: - MULTIPLIER: |
ES cluster | ES {#ES.NODE}: Maximum JVM memory available for use | The maximum amount of memory, in bytes, available for use by the heap. |
DEPENDENT | es.node.jvm.mem.heap_max_in_bytes[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
ES cluster | ES {#ES.NODE}: Amount of JVM heap currently in use | The memory, in bytes, currently in use by the heap. |
DEPENDENT | es.node.jvm.mem.heap_used_in_bytes[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
ES cluster | ES {#ES.NODE}: Percent of JVM heap currently in use | The percentage of memory currently in use by the heap. |
DEPENDENT | es.node.jvm.mem.heap_used_percent[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
ES cluster | ES {#ES.NODE}: Amount of JVM heap committed | The amount of memory, in bytes, available for use by the heap. |
DEPENDENT | es.node.jvm.mem.heap_committed_in_bytes[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
ES cluster | ES {#ES.NODE}: Number of open HTTP connections | The number of currently open HTTP connections for the node. |
DEPENDENT | es.node.http.current_open[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
ES cluster | ES {#ES.NODE}: Rate of HTTP connections opened | The number of HTTP connections opened for the node per second. |
DEPENDENT | es.node.http.opened.rate[{#ES.NODE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
ES cluster | ES {#ES.NODE}: Time spent throttling operations | Time in seconds spent throttling operations for the last measuring span. |
DEPENDENT | es.node.indices.indexing.throttle_time[{#ES.NODE}] Preprocessing: - JSONPATH: - MULTIPLIER: - SIMPLE_CHANGE |
ES cluster | ES {#ES.NODE}: Time spent throttling recovery operations | Time in seconds spent throttling recovery operations for the last measuring span. |
DEPENDENT | es.node.indices.recovery.throttle_time[{#ES.NODE}] Preprocessing: - JSONPATH: - MULTIPLIER: - SIMPLE_CHANGE |
ES cluster | ES {#ES.NODE}: Time spent throttling merge operations | Time in seconds spent throttling merge operations for the last measuring span. |
DEPENDENT | es.node.indices.merges.total_throttled_time[{#ES.NODE}] Preprocessing: - JSONPATH: - MULTIPLIER: - SIMPLE_CHANGE |
ES cluster | ES {#ES.NODE}: Rate of queries | The number of query operations per second. |
DEPENDENT | es.node.indices.search.query.rate[{#ES.NODE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
ES cluster | ES {#ES.NODE}: Time spent performing query | Time in seconds spent performing query operations for the last measuring span. |
DEPENDENT | es.node.indices.search.query_time[{#ES.NODE}] Preprocessing: - JSONPATH: - MULTIPLIER: - SIMPLE_CHANGE |
ES cluster | ES {#ES.NODE}: Query latency | The average query latency calculated by sampling the total number of queries and the total elapsed time at regular intervals. |
CALCULATED | es.node.indices.search.query_latency[{#ES.NODE}] Expression: change(//es.node.indices.search.query_time_in_millis[{#ES.NODE}]) / ( change(//es.node.indices.search.query_total[{#ES.NODE}]) + (change(//es.node.indices.search.query_total[{#ES.NODE}]) = 0) ) |
ES cluster | ES {#ES.NODE}: Current query operations | The number of query operations currently running. |
DEPENDENT | es.node.indices.search.query_current[{#ES.NODE}] Preprocessing: - JSONPATH: |
ES cluster | ES {#ES.NODE}: Rate of fetch | The number of fetch operations per second. |
DEPENDENT | es.node.indices.search.fetch.rate[{#ES.NODE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
ES cluster | ES {#ES.NODE}: Time spent performing fetch | Time in seconds spent performing fetch operations for the last measuring span. |
DEPENDENT | es.node.indices.search.fetch_time[{#ES.NODE}] Preprocessing: - JSONPATH: - MULTIPLIER: - SIMPLE_CHANGE |
ES cluster | ES {#ES.NODE}: Fetch latency | The average fetch latency calculated by sampling the total number of fetches and the total elapsed time at regular intervals. |
CALCULATED | es.node.indices.search.fetch_latency[{#ES.NODE}] Expression: change(//es.node.indices.search.fetch_time_in_millis[{#ES.NODE}]) / ( change(//es.node.indices.search.fetch_total[{#ES.NODE}]) + (change(//es.node.indices.search.fetch_total[{#ES.NODE}]) = 0) ) |
ES cluster | ES {#ES.NODE}: Current fetch operations | The number of fetch operations currently running. |
DEPENDENT | es.node.indices.search.fetch_current[{#ES.NODE}] Preprocessing: - JSONPATH: |
ES cluster | ES {#ES.NODE}: Write thread pool executor tasks completed | The number of tasks completed by the write thread pool executor. |
DEPENDENT | es.node.thread_pool.write.completed.rate[{#ES.NODE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
ES cluster | ES {#ES.NODE}: Write thread pool active threads | The number of active threads in the write thread pool. |
DEPENDENT | es.node.thread_pool.write.active[{#ES.NODE}] Preprocessing: - JSONPATH: |
ES cluster | ES {#ES.NODE}: Write thread pool tasks in queue | The number of tasks in queue for the write thread pool. |
DEPENDENT | es.node.thread_pool.write.queue[{#ES.NODE}] Preprocessing: - JSONPATH: |
ES cluster | ES {#ES.NODE}: Write thread pool executor tasks rejected | The number of tasks rejected by the write thread pool executor. |
DEPENDENT | es.node.thread_pool.write.rejected.rate[{#ES.NODE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
ES cluster | ES {#ES.NODE}: Search thread pool executor tasks completed | The number of tasks completed by the search thread pool executor. |
DEPENDENT | es.node.thread_pool.search.completed.rate[{#ES.NODE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
ES cluster | ES {#ES.NODE}: Search thread pool active threads | The number of active threads in the search thread pool. |
DEPENDENT | es.node.thread_pool.search.active[{#ES.NODE}] Preprocessing: - JSONPATH: |
ES cluster | ES {#ES.NODE}: Search thread pool tasks in queue | The number of tasks in queue for the search thread pool. |
DEPENDENT | es.node.thread_pool.search.queue[{#ES.NODE}] Preprocessing: - JSONPATH: |
ES cluster | ES {#ES.NODE}: Search thread pool executor tasks rejected | The number of tasks rejected by the search thread pool executor. |
DEPENDENT | es.node.thread_pool.search.rejected.rate[{#ES.NODE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
ES cluster | ES {#ES.NODE}: Refresh thread pool executor tasks completed | The number of tasks completed by the refresh thread pool executor. |
DEPENDENT | es.node.thread_pool.refresh.completed.rate[{#ES.NODE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
ES cluster | ES {#ES.NODE}: Refresh thread pool active threads | The number of active threads in the refresh thread pool. |
DEPENDENT | es.node.thread_pool.refresh.active[{#ES.NODE}] Preprocessing: - JSONPATH: |
ES cluster | ES {#ES.NODE}: Refresh thread pool tasks in queue | The number of tasks in queue for the refresh thread pool. |
DEPENDENT | es.node.thread_pool.refresh.queue[{#ES.NODE}] Preprocessing: - JSONPATH: |
ES cluster | ES {#ES.NODE}: Refresh thread pool executor tasks rejected | The number of tasks rejected by the refresh thread pool executor. |
DEPENDENT | es.node.thread_pool.refresh.rejected.rate[{#ES.NODE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
ES cluster | ES {#ES.NODE}: Indexing latency | The average indexing latency calculated from the available index_total and index_time_in_millis metrics. |
CALCULATED | es.node.indices.indexing.index_latency[{#ES.NODE}] Expression: change(//es.node.indices.indexing.index_time_in_millis[{#ES.NODE}]) / ( change(//es.node.indices.indexing.index_total[{#ES.NODE}]) + (change(//es.node.indices.indexing.index_total[{#ES.NODE}]) = 0) ) |
ES cluster | ES {#ES.NODE}: Current indexing operations | The number of indexing operations currently running. |
DEPENDENT | es.node.indices.indexing.index_current[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
ES cluster | ES {#ES.NODE}: Flush latency | The average flush latency calculated from the available flush.total and flush.total_time_in_millis metrics. |
CALCULATED | es.node.indices.flush.latency[{#ES.NODE}] Expression: change(//es.node.indices.flush.total_time_in_millis[{#ES.NODE}]) / ( change(//es.node.indices.flush.total[{#ES.NODE}]) + (change(//es.node.indices.flush.total[{#ES.NODE}]) = 0) ) |
ES cluster | ES {#ES.NODE}: Rate of index refreshes | The number of refresh operations per second. |
DEPENDENT | es.node.indices.refresh.rate[{#ES.NODE}] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
ES cluster | ES {#ES.NODE}: Time spent performing refresh | Time in seconds spent performing refresh operations for the last measuring span. |
DEPENDENT | es.node.indices.refresh.time[{#ES.NODE}] Preprocessing: - JSONPATH: - MULTIPLIER: - SIMPLE_CHANGE |
Zabbix raw items | ES: Get cluster health | Returns the health status of a cluster. |
HTTP_AGENT | es.cluster.get_health |
Zabbix raw items | ES: Get cluster stats | Returns cluster statistics. |
HTTP_AGENT | es.cluster.get_stats |
Zabbix raw items | ES: Get nodes stats | Returns cluster nodes statistics. |
HTTP_AGENT | es.nodes.get_stats |
Zabbix raw items | ES {#ES.NODE}: Total number of query | The total number of query operations. |
DEPENDENT | es.node.indices.search.query_total[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Zabbix raw items | ES {#ES.NODE}: Total time spent performing query | Time in milliseconds spent performing query operations. |
DEPENDENT | es.node.indices.search.query_time_in_millis[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Zabbix raw items | ES {#ES.NODE}: Total number of fetch | The total number of fetch operations. |
DEPENDENT | es.node.indices.search.fetch_total[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Zabbix raw items | ES {#ES.NODE}: Total time spent performing fetch | Time in milliseconds spent performing fetch operations. |
DEPENDENT | es.node.indices.search.fetch_time_in_millis[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Zabbix raw items | ES {#ES.NODE}: Total number of indexing | The total number of indexing operations. |
DEPENDENT | es.node.indices.indexing.index_total[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Zabbix raw items | ES {#ES.NODE}: Total time spent performing indexing | Total time in milliseconds spent performing indexing operations. |
DEPENDENT | es.node.indices.indexing.index_time_in_millis[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Zabbix raw items | ES {#ES.NODE}: Total number of index flushes to disk | The total number of flush operations. |
DEPENDENT | es.node.indices.flush.total[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Zabbix raw items | ES {#ES.NODE}: Total time spent on flushing indices to disk | Total time in milliseconds spent performing flush operations. |
DEPENDENT | es.node.indices.flush.total_time_in_millis[{#ES.NODE}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
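The Indexing latency and Flush latency calculated items above derive an average per-operation latency from a pair of monotonically increasing counters. Below is a minimal Python sketch of the same guarded division; the function and variable names are illustrative, not template keys.

```python
def average_latency(time_ms_prev, time_ms_cur, total_prev, total_cur):
    """Average per-operation latency between two counter samples, in ms.

    Mirrors the calculated-item expression: the `change(total) = 0` term
    keeps the divisor non-zero when no operations ran in the interval,
    in which case the numerator is also 0 and the result is 0.
    """
    delta_time = time_ms_cur - time_ms_prev
    delta_total = total_cur - total_prev
    # In Python, (delta_total == 0) is a bool that adds as 0 or 1,
    # just like the boolean subexpression in the Zabbix formula.
    return delta_time / (delta_total + (delta_total == 0))

# Example: 1200 ms spent on 300 indexing operations -> 4 ms per operation.
print(average_latency(5000, 6200, 1000, 1300))  # 4.0
```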
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
ES: Service is down | The service is unavailable or does not accept TCP connections. |
last(/Elasticsearch Cluster by HTTP/net.tcp.service["{$ELASTICSEARCH.SCHEME}","{HOST.CONN}","{$ELASTICSEARCH.PORT}"])=0 |
AVERAGE | Manual close: YES |
ES: Service response time is too high | The performance of the TCP service is very low. |
min(/Elasticsearch Cluster by HTTP/net.tcp.service.perf["{$ELASTICSEARCH.SCHEME}","{HOST.CONN}","{$ELASTICSEARCH.PORT}"],5m)>{$ELASTICSEARCH.RESPONSE_TIME.MAX.WARN} |
WARNING | Manual close: YES Depends on: - ES: Service is down |
ES: Health is YELLOW | All primary shards are assigned, but one or more replica shards are unassigned. If a node in the cluster fails, some data could be unavailable until that node is repaired. |
last(/Elasticsearch Cluster by HTTP/es.cluster.status)=1 |
AVERAGE | |
ES: Health is RED | One or more primary shards are unassigned, so some data is unavailable. This can occur briefly during cluster startup as primary shards are assigned. |
last(/Elasticsearch Cluster by HTTP/es.cluster.status)=2 |
HIGH | |
ES: Health is UNKNOWN | The health status of the cluster is unknown or cannot be obtained. |
last(/Elasticsearch Cluster by HTTP/es.cluster.status)=255 |
HIGH | |
ES: The number of nodes within the cluster has decreased | - |
change(/Elasticsearch Cluster by HTTP/es.cluster.number_of_nodes)<0 |
INFO | Manual close: YES |
ES: The number of nodes within the cluster has increased | - |
change(/Elasticsearch Cluster by HTTP/es.cluster.number_of_nodes)>0 |
INFO | Manual close: YES |
ES: Cluster has initializing shards | The cluster has had initializing shards for longer than 10 minutes. |
min(/Elasticsearch Cluster by HTTP/es.cluster.initializing_shards,10m)>0 |
AVERAGE | |
ES: Cluster has unassigned shards | The cluster has had unassigned shards for longer than 10 minutes. |
min(/Elasticsearch Cluster by HTTP/es.cluster.unassigned_shards,10m)>0 |
AVERAGE | |
ES: Cluster has been restarted | Uptime is less than 10 minutes. |
last(/Elasticsearch Cluster by HTTP/es.nodes.jvm.max_uptime)<10m |
INFO | Manual close: YES |
ES: Cluster does not have enough space for resharding | There is not enough disk space for index resharding. |
(last(/Elasticsearch Cluster by HTTP/es.nodes.fs.total_in_bytes)-last(/Elasticsearch Cluster by HTTP/es.nodes.fs.available_in_bytes))/(last(/Elasticsearch Cluster by HTTP/es.cluster.number_of_data_nodes)-1)>last(/Elasticsearch Cluster by HTTP/es.nodes.fs.available_in_bytes) |
HIGH | |
ES: Cluster has only two master nodes | The cluster has only two nodes with a master role and will be unavailable if one of them breaks. |
last(/Elasticsearch Cluster by HTTP/es.nodes.count.master)=2 |
DISASTER | |
ES {#ES.NODE}: has been restarted | Uptime is less than 10 minutes. |
last(/Elasticsearch Cluster by HTTP/es.node.jvm.uptime[{#ES.NODE}])<10m |
INFO | Manual close: YES |
ES {#ES.NODE}: Percent of JVM heap in use is high | This indicates that the rate of garbage collection isn't keeping up with the rate of garbage creation. To address this problem, you can either increase your heap size (as long as it remains below the recommended guidelines stated above), or scale out the cluster by adding more nodes. |
min(/Elasticsearch Cluster by HTTP/es.node.jvm.mem.heap_used_percent[{#ES.NODE}],1h)>{$ELASTICSEARCH.HEAP_USED.MAX.WARN} |
WARNING | Depends on: - ES {#ES.NODE}: Percent of JVM heap in use is critical |
ES {#ES.NODE}: Percent of JVM heap in use is critical | This indicates that the rate of garbage collection isn't keeping up with the rate of garbage creation. To address this problem, you can either increase your heap size (as long as it remains below the recommended guidelines stated above), or scale out the cluster by adding more nodes. |
min(/Elasticsearch Cluster by HTTP/es.node.jvm.mem.heap_used_percent[{#ES.NODE}],1h)>{$ELASTICSEARCH.HEAP_USED.MAX.CRIT} |
HIGH | |
ES {#ES.NODE}: Query latency is too high | If latency exceeds a threshold, look for potential resource bottlenecks, or investigate whether you need to optimize your queries. |
min(/Elasticsearch Cluster by HTTP/es.node.indices.search.query_latency[{#ES.NODE}],5m)>{$ELASTICSEARCH.QUERY_LATENCY.MAX.WARN} |
WARNING | |
ES {#ES.NODE}: Fetch latency is too high | The fetch phase should typically take much less time than the query phase. If you notice this metric consistently increasing, this could indicate a problem with slow disks, enriching of documents (highlighting the relevant text in search results, etc.), or requesting too many results. |
min(/Elasticsearch Cluster by HTTP/es.node.indices.search.fetch_latency[{#ES.NODE}],5m)>{$ELASTICSEARCH.FETCH_LATENCY.MAX.WARN} |
WARNING | |
ES {#ES.NODE}: Write thread pool executor has rejected tasks | The number of tasks rejected by the write thread pool executor has been above 0 for 5 minutes. |
min(/Elasticsearch Cluster by HTTP/es.node.thread_pool.write.rejected.rate[{#ES.NODE}],5m)>0 |
WARNING | |
ES {#ES.NODE}: Search thread pool executor has rejected tasks | The number of tasks rejected by the search thread pool executor has been above 0 for 5 minutes. |
min(/Elasticsearch Cluster by HTTP/es.node.thread_pool.search.rejected.rate[{#ES.NODE}],5m)>0 |
WARNING | |
ES {#ES.NODE}: Refresh thread pool executor has rejected tasks | The number of tasks rejected by the refresh thread pool executor has been above 0 for 5 minutes. |
min(/Elasticsearch Cluster by HTTP/es.node.thread_pool.refresh.rejected.rate[{#ES.NODE}],5m)>0 |
WARNING | |
ES {#ES.NODE}: Indexing latency is too high | If the latency is increasing, it may indicate that you are indexing too many documents at the same time (Elasticsearch's documentation recommends starting with a bulk indexing size of 5 to 15 megabytes and increasing slowly from there). |
min(/Elasticsearch Cluster by HTTP/es.node.indices.indexing.index_latency[{#ES.NODE}],5m)>{$ELASTICSEARCH.INDEXING_LATENCY.MAX.WARN} |
WARNING | |
ES {#ES.NODE}: Flush latency is too high | If you see this metric increasing steadily, it may indicate a problem with slow disks; this problem may escalate and eventually prevent you from being able to add new information to your index. |
min(/Elasticsearch Cluster by HTTP/es.node.indices.flush.latency[{#ES.NODE}],5m)>{$ELASTICSEARCH.FLUSH_LATENCY.MAX.WARN} |
WARNING |
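The "Cluster does not have enough space for resharding" expression above estimates whether the data currently stored could still be redistributed if the cluster lost one data node. A short Python rendering of that arithmetic (names are illustrative, not template keys):

```python
def resharding_at_risk(total_bytes: int, available_bytes: int, data_nodes: int) -> bool:
    """True when the trigger would fire: the used space, spread over one
    fewer data node, would not fit into the currently available space."""
    used = total_bytes - available_bytes
    return used / (data_nodes - 1) > available_bytes

# Example: 10 TB total, 2 TB free, 4 data nodes:
# 8 TB used / 3 remaining nodes ~= 2.67 TB > 2 TB free -> trigger fires.
print(resharding_at_risk(10 * 2**40, 2 * 2**40, 4))  # True
```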
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
https://www.elastic.co/guide/en/elasticsearch/reference/index.html
For Zabbix version: 6.2 and higher
The template to monitor Docker engine by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template Docker by Zabbix agent 2
— collects metrics by polling zabbix-agent2.
This template was tested on:
See Zabbix template operation for basic instructions.
Set up and configure zabbix-agent2 compiled with the Docker monitoring plugin.
Test availability: zabbix_get -s docker-host -k docker.info
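If zabbix_get is not at hand, the same passive check can be reproduced with a few lines of Python speaking the Zabbix agent protocol. This is a sketch under stated assumptions: agent 2 listens on the default port 10050, and the standard ZBXD header framing (4-byte magic, 1-byte flags, 8-byte little-endian length) is in use.

```python
import socket
import struct

def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise if the agent closes early."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("agent closed the connection early")
        data += chunk
    return data

def zabbix_get(host: str, key: str, port: int = 10050) -> str:
    """Ask a passive Zabbix agent for a single item value (ZBXD framing)."""
    payload = key.encode()
    packet = b"ZBXD\x01" + struct.pack("<Q", len(payload)) + payload
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(packet)
        header = _recv_exact(sock, 13)                 # magic + flags + length
        (length,) = struct.unpack("<Q", header[5:13])
        return _recv_exact(sock, length).decode()

# Equivalent of: zabbix_get -s docker-host -k docker.info
print(zabbix_get("docker-host", "docker.info"))
```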
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$DOCKER.LLD.FILTER.CONTAINER.MATCHES} | Filter of discoverable containers |
.* |
{$DOCKER.LLD.FILTER.CONTAINER.NOT_MATCHES} | Filter to exclude discovered containers |
CHANGE_IF_NEEDED |
{$DOCKER.LLD.FILTER.IMAGE.MATCHES} | Filter of discoverable images |
.* |
{$DOCKER.LLD.FILTER.IMAGE.NOT_MATCHES} | Filter to exclude discovered images |
CHANGE_IF_NEEDED |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Containers discovery | Discovery of container metrics. Parameter: true - returns all containers; false - returns only running containers. |
ZABBIX_PASSIVE | docker.containers.discovery[false] Filter: AND - {#NAME} MATCHES_REGEX - {#NAME} NOT_MATCHES_REGEX |
Images discovery | Discovery of image metrics. |
ZABBIX_PASSIVE | docker.images.discovery Filter: AND - {#NAME} MATCHES_REGEX - {#NAME} NOT_MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Docker | Docker: Ping | - | ZABBIX_PASSIVE | docker.ping Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Docker | Docker: Containers total | Total number of containers on this host |
DEPENDENT | docker.containers.total Preprocessing: - JSONPATH: |
Docker | Docker: Containers running | Total number of containers running on this host |
DEPENDENT | docker.containers.running Preprocessing: - JSONPATH: |
Docker | Docker: Containers stopped | Total number of containers stopped on this host |
DEPENDENT | docker.containers.stopped Preprocessing: - JSONPATH: |
Docker | Docker: Containers paused | Total number of containers paused on this host |
DEPENDENT | docker.containers.paused Preprocessing: - JSONPATH: |
Docker | Docker: Images total | Number of images with intermediate image layers |
DEPENDENT | docker.images.total Preprocessing: - JSONPATH: |
Docker | Docker: Storage driver | Docker storage driver https://docs.docker.com/storage/storagedriver/ |
DEPENDENT | docker.driver Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Docker | Docker: Memory limit enabled | - |
DEPENDENT | docker.mem_limit.enabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: Swap limit enabled | - |
DEPENDENT | docker.swap_limit.enabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: Kernel memory enabled | - |
DEPENDENT | docker.kernel_mem.enabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: Kernel memory TCP enabled | - |
DEPENDENT | docker.kernel_mem_tcp.enabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: |
Docker | Docker: CPU CFS Period enabled | https://docs.docker.com/config/containers/resource_constraints/#configure-the-default-cfs-scheduler |
DEPENDENT | docker.cpu_cfs_period.enabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: |
Docker | Docker: CPU CFS Quota enabled | https://docs.docker.com/config/containers/resource_constraints/#configure-the-default-cfs-scheduler |
DEPENDENT | docker.cpu_cfs_quota.enabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: |
Docker | Docker: CPU Shares enabled | https://docs.docker.com/config/containers/resource_constraints/#configure-the-default-cfs-scheduler |
DEPENDENT | docker.cpu_shares.enabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: CPU Set enabled | https://docs.docker.com/config/containers/resource_constraints/#configure-the-default-cfs-scheduler |
DEPENDENT | docker.cpu_set.enabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: Pids limit enabled | - |
DEPENDENT | docker.pids_limit.enabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: IPv4 Forwarding enabled | - |
DEPENDENT | docker.ipv4_forwarding.enabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: Debug enabled | - |
DEPENDENT | docker.debug.enabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: |
Docker | Docker: Nfd | Number of used File Descriptors |
DEPENDENT | docker.nfd Preprocessing: - JSONPATH: |
Docker | Docker: OomKill disabled | - |
DEPENDENT | docker.oomkill.disabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: |
Docker | Docker: Goroutines | Number of goroutines |
DEPENDENT | docker.goroutines Preprocessing: - JSONPATH: |
Docker | Docker: Logging driver | - |
DEPENDENT | docker.logging_driver Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: Cgroup driver | - |
DEPENDENT | docker.cgroup_driver Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: NEvents listener | - |
DEPENDENT | docker.nevents_listener Preprocessing: - JSONPATH: |
Docker | Docker: Kernel version | - |
DEPENDENT | docker.kernel_version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: Operating system | - |
DEPENDENT | docker.operating_system Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: OS type | - |
DEPENDENT | docker.os_type Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: Architecture | - |
DEPENDENT | docker.architecture Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Docker | Docker: NCPU | - |
DEPENDENT | docker.ncpu Preprocessing: - JSONPATH: |
Docker | Docker: Memory total | - |
DEPENDENT | docker.mem.total Preprocessing: - JSONPATH: |
Docker | Docker: Docker root dir | - |
DEPENDENT | docker.root_dir Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: Name | - |
DEPENDENT | docker.name Preprocessing: - JSONPATH: |
Docker | Docker: Server version | - |
DEPENDENT | docker.server_version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: Default runtime | - |
DEPENDENT | docker.default_runtime Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: Live restore enabled | - |
DEPENDENT | docker.live_restore.enabled Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Docker: Layers size | - |
DEPENDENT | docker.layers_size Preprocessing: - JSONPATH: |
Docker | Docker: Images size | - |
DEPENDENT | docker.images_size Preprocessing: - JSONPATH: |
Docker | Docker: Containers size | - |
DEPENDENT | docker.containers_size Preprocessing: - JSONPATH: |
Docker | Docker: Volumes size | - |
DEPENDENT | docker.volumes_size Preprocessing: - JSONPATH: |
Docker | Docker: Images available | Number of top-level images |
DEPENDENT | docker.images.top_level Preprocessing: - JSONPATH: |
Docker | Image {#NAME}: Created | - |
DEPENDENT | docker.image.created["{#ID}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Docker | Image {#NAME}: Size | - |
DEPENDENT | docker.image.size["{#ID}"] Preprocessing: - JSONPATH: |
Docker | Container {#NAME}: Get stats | Get container stats based on resource usage |
ZABBIX_PASSIVE | docker.container_stats["{#NAME}"] |
Docker | Container {#NAME}: CPU total usage per second | - |
DEPENDENT | docker.container_stats.cpu_usage.total.rate["{#NAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND - MULTIPLIER: |
Docker | Container {#NAME}: CPU percent usage | - |
DEPENDENT | docker.container_stats.cpu_pct_usage["{#NAME}"] Preprocessing: - JSONPATH: |
Docker | Container {#NAME}: CPU kernelmode usage per second | - |
DEPENDENT | docker.container_stats.cpu_usage.kernel.rate["{#NAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND - MULTIPLIER: |
Docker | Container {#NAME}: CPU usermode usage per second | - |
DEPENDENT | docker.container_stats.cpu_usage.user.rate["{#NAME}"] Preprocessing: - JSONPATH: - CHANGE_PER_SECOND - MULTIPLIER: |
Docker | Container {#NAME}: Online CPUs | - |
DEPENDENT | docker.container_stats.online_cpus["{#NAME}"] Preprocessing: - JSONPATH: |
Docker | Container {#NAME}: Throttling periods | Number of periods with throttling active |
DEPENDENT | docker.container_stats.cpu_usage.throttling_periods["{#NAME}"] Preprocessing: - JSONPATH: |
Docker | Container {#NAME}: Throttled periods | Number of periods when the container hits its throttling limit |
DEPENDENT | docker.container_stats.cpu_usage.throttled_periods["{#NAME}"] Preprocessing: - JSONPATH: |
Docker | Container {#NAME}: Throttled time | Aggregate time the container was throttled for in nanoseconds |
DEPENDENT | docker.container_stats.cpu_usage.throttled_time["{#NAME}"] Preprocessing: - JSONPATH: - MULTIPLIER: |
Docker | Container {#NAME}: Memory usage | - |
DEPENDENT | docker.container_stats.memory.usage["{#NAME}"] Preprocessing: - JSONPATH: |
Docker | Container {#NAME}: Memory maximum usage | - |
DEPENDENT | docker.container_stats.memory.max_usage["{#NAME}"] Preprocessing: - JSONPATH: |
Docker | Container {#NAME}: Memory commit bytes | - |
DEPENDENT | docker.container_stats.memory.commit_bytes["{#NAME}"] Preprocessing: - JSONPATH: |
Docker | Container {#NAME}: Memory commit peak bytes | - |
DEPENDENT | docker.container_stats.memory.commit_peak_bytes["{#NAME}"] Preprocessing: - JSONPATH: |
Docker | Container {#NAME}: Memory private working set | - |
DEPENDENT | docker.container_stats.memory.private_working_set["{#NAME}"] Preprocessing: - JSONPATH: |
Docker | Container {#NAME}: Networks bytes received per second | - |
DEPENDENT | docker.networks.rx_bytes["{#NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
Docker | Container {#NAME}: Networks packets received per second | - |
DEPENDENT | docker.networks.rx_packets["{#NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
Docker | Container {#NAME}: Networks errors received per second | - |
DEPENDENT | docker.networks.rx_errors["{#NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
Docker | Container {#NAME}: Networks incoming packets dropped per second | - |
DEPENDENT | docker.networks.rx_dropped["{#NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
Docker | Container {#NAME}: Networks bytes sent per second | - |
DEPENDENT | docker.networks.tx_bytes["{#NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
Docker | Container {#NAME}: Networks packets sent per second | - |
DEPENDENT | docker.networks.tx_packets["{#NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
Docker | Container {#NAME}: Networks errors sent per second | - |
DEPENDENT | docker.networks.tx_errors["{#NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
Docker | Container {#NAME}: Networks outgoing packets dropped per second | - |
DEPENDENT | docker.networks.tx_dropped["{#NAME}"] Preprocessing: - JSONPATH: ⛔️ON_FAIL: CUSTOM_VALUE -> 0 - CHANGE_PER_SECOND |
Docker | Container {#NAME}: Get info | Return low-level information about a container |
ZABBIX_PASSIVE | docker.container_info["{#NAME}"] |
Docker | Container {#NAME}: Created | - |
DEPENDENT | docker.container_info.created["{#NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Container {#NAME}: Image | - |
DEPENDENT | docker.container_info.image["{#NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Container {#NAME}: Restart count | - |
DEPENDENT | docker.container_info.restart_count["{#NAME}"] Preprocessing: - JSONPATH: |
Docker | Container {#NAME}: Status | - |
DEPENDENT | docker.container_info.state.status["{#NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1h |
Docker | Container {#NAME}: Running | - |
DEPENDENT | docker.container_info.state.running["{#NAME}"] Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
Docker | Container {#NAME}: Paused | - |
DEPENDENT | docker.container_info.state.paused["{#NAME}"] Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
Docker | Container {#NAME}: Restarting | - |
DEPENDENT | docker.container_info.state.restarting["{#NAME}"] Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
Docker | Container {#NAME}: OOMKilled | - |
DEPENDENT | docker.container_info.state.oomkilled["{#NAME}"] Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
Docker | Container {#NAME}: Dead | - |
DEPENDENT | docker.container_info.state.dead["{#NAME}"] Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL |
Docker | Container {#NAME}: Pid | - |
DEPENDENT | docker.container_info.state.pid["{#NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Container {#NAME}: Exit code | - |
DEPENDENT | docker.container_info.state.exitcode["{#NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Container {#NAME}: Error | - |
DEPENDENT | docker.container_info.state.error["{#NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Container {#NAME}: Started at | - |
DEPENDENT | docker.container_info.started["{#NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Docker | Container {#NAME}: Finished at | - |
DEPENDENT | docker.container_info.finished["{#NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 1d |
Zabbix raw items | Docker: Get info | ZABBIX_PASSIVE | docker.info | |
Zabbix raw items | Docker: Get containers | ZABBIX_PASSIVE | docker.containers | |
Zabbix raw items | Docker: Get images | ZABBIX_PASSIVE | docker.images | |
Zabbix raw items | Docker: Get data_usage | ZABBIX_PASSIVE | docker.data_usage |
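Most of the Docker items above are dependent items that apply one JSONPath preprocessing step to the output of the raw docker.info master item. A minimal Python illustration of what that preprocessing extracts, assuming the usual field names of the Docker /info payload (Containers, ContainersRunning, ContainersStopped, ContainersPaused, Images):

```python
import json

# A trimmed example of what the `docker.info` master item returns.
raw = json.loads("""
{
  "Containers": 12,
  "ContainersRunning": 9,
  "ContainersStopped": 2,
  "ContainersPaused": 1,
  "Images": 34
}
""")

# Each dependent item is effectively one JSONPath lookup on this document:
metrics = {
    "docker.containers.total":   raw["Containers"],         # $.Containers
    "docker.containers.running": raw["ContainersRunning"],  # $.ContainersRunning
    "docker.containers.stopped": raw["ContainersStopped"],  # $.ContainersStopped
    "docker.containers.paused":  raw["ContainersPaused"],   # $.ContainersPaused
    "docker.images.total":       raw["Images"],             # $.Images
}
print(metrics)
```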
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Docker: Service is down | - |
last(/Docker by Zabbix agent 2/docker.ping)=0 |
AVERAGE | Manual close: YES |
Docker: Failed to fetch info data | Zabbix has not received data for items for the last 30 minutes |
nodata(/Docker by Zabbix agent 2/docker.name,30m)=1 |
WARNING | Manual close: YES Depends on: - Docker: Service is down |
Docker: Version has changed | Docker version has changed. Ack to close. |
last(/Docker by Zabbix agent 2/docker.server_version,#1)<>last(/Docker by Zabbix agent 2/docker.server_version,#2) and length(last(/Docker by Zabbix agent 2/docker.server_version))>0 |
INFO | Manual close: YES |
Container {#NAME}: Container has been stopped with error code | - |
last(/Docker by Zabbix agent 2/docker.container_info.state.exitcode["{#NAME}"])>0 and last(/Docker by Zabbix agent 2/docker.container_info.state.running["{#NAME}"])=0 |
AVERAGE | Manual close: YES |
Container {#NAME}: An error has occurred in the container | Container {#NAME} has an error. Ack to close. |
last(/Docker by Zabbix agent 2/docker.container_info.state.error["{#NAME}"],#1)<>last(/Docker by Zabbix agent 2/docker.container_info.state.error["{#NAME}"],#2) and length(last(/Docker by Zabbix agent 2/docker.container_info.state.error["{#NAME}"]))>0 |
WARNING | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher.
This template is designed to get metrics from the Control-M server using the Control-M Automation API with HTTP agent.
This template monitors server statistics, discovers jobs and agents using Low Level Discovery.
To use this template, the macros {$API.TOKEN}, {$API.URI.ENDPOINT}, and {$SERVER.NAME} need to be set.
See Zabbix template operation for basic instructions.
This template has been tested on:
This template is primarily intended for use in conjunction with the Control-M enterprise manager by HTTP template in order to create host prototypes.
It monitors:
However, if you wish to monitor the Control-M server separately with this template, you must set the following macros: {$API.TOKEN}, {$API.URI.ENDPOINT}, and {$SERVER.NAME}.
To access the {$API.TOKEN} macro value, use one of the following interfaces:
{$API.URI.ENDPOINT} - the Control-M Automation API endpoint for the API requests, including your server IP or DNS address, the Automation API port, and path.
For example, https://monitored.controlm.instance:8443/automation-api.
{$SERVER.NAME} - the name of the Control-M server to be monitored.
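Before linking the template, the endpoint and token can be exercised directly to confirm the macro values. Below is a hedged Python sketch, assuming the Automation API accepts the token in an x-api-key header and exposes a config/servers listing; adjust both to your installation and its documentation.

```python
import json
import urllib.request

API_ENDPOINT = "https://monitored.controlm.instance:8443/automation-api"  # {$API.URI.ENDPOINT}
API_TOKEN = "<PUT YOUR API TOKEN>"                                        # {$API.TOKEN}

# Assumption: the token is passed in the `x-api-key` request header.
# Add an SSL context if your instance uses a self-signed certificate.
req = urllib.request.Request(
    f"{API_ENDPOINT}/config/servers",
    headers={"x-api-key": API_TOKEN},
)
with urllib.request.urlopen(req, timeout=10) as resp:
    servers = json.load(resp)

# Each entry should include the name to use for {$SERVER.NAME}.
for server in servers:
    print(server.get("name"), server.get("state"), server.get("message"))
```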
Name | Description | Default |
---|---|---|
{$SERVER.NAME} | The name of the Control-M server. |
|
{$API.URI.ENDPOINT} | The API endpoint is a URI - for example, https://monitored.controlm.instance:8443/automation-api. |
|
{$API.TOKEN} | A token to use for API connections. |
Name | Description | Type | Key and additional info |
---|---|---|---|
Control-M: Get Control-M server stats | Gets the statistics of the server. | Http Agent | controlm.server.stats Preprocessing
|
Control-M: Get jobs | Gets the status of jobs. | Http Agent | controlm.jobs |
Control-M: Get agents | Gets agents for the server. | Http Agent | controlm.agents |
Control-M: Jobs statistics | Gets the statistics of jobs. | Dependent | controlm.jobs.statistics Preprocessing
|
Control-M: Jobs returned | Gets the count of returned jobs. | Dependent | controlm.jobs.statistics.returned Preprocessing
|
Control-M: Jobs total | Gets the count of total jobs. | Dependent | controlm.jobs.statistics.total Preprocessing
|
Control-M: Server state | Gets the metric of the server state. | Dependent | server.state Preprocessing
|
Control-M: Server message | Gets the metric of the server message. | Dependent | server.message Preprocessing
|
Control-M: Server version | Gets the metric of the server version. | Dependent | server.version Preprocessing
|
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Control-M: Server is down | The server is down. | last(/Control-M server by HTTP/Control-M: Server state)=0 or last(/Control-M server by HTTP/Control-M: Server state)=10 |High |
- | |
Control-M: Server disconnected | The server is disconnected. | last(/Control-M server by HTTP/Control-M: Server message,#1)="Disconnected" |High |
- | |
Control-M: Server error | The server has encountered an error. | last(/Control-M server by HTTP/Control-M: Server message,#1)<>"Connected" and last(/Control-M server by HTTP/Control-M: Server message,#1)<>"Disconnected" and last(/Control-M server by HTTP/Control-M: Server message,#1)<>"" |High |
- | |
Control-M: Server version has changed | The server version has changed. Acknowledge (Ack) to close. | last(/Control-M server by HTTP/Control-M: Server version,#1)<>last(/Control-M server by HTTP/Control-M: Server version,#2) and length(last(/Control-M server by HTTP/Control-M: Server version))>0 |Info |
- |
Name | Description | Type | Key and additional info |
---|---|---|---|
Jobs discovery | Discovers jobs on the server. | Dependent | controlm.jobs.discovery Preprocessing
|
Name | Description | Type | Key and additional info |
---|---|---|---|
Job [{#JOB.ID}]: stats | Gets the statistics of a job. | Dependent | job.stats['{#JOB.ID}'] Preprocessing
|
Job [{#JOB.ID}]: status | Gets the status of a job. | Dependent | job.status['{#JOB.ID}'] Preprocessing
|
Job [{#JOB.ID}]: number of runs | Gets the number of runs for a job. | Dependent | job.numberOfRuns['{#JOB.ID}'] Preprocessing
|
Job [{#JOB.ID}]: type | Gets the job type. | Dependent | job.type['{#JOB.ID}'] Preprocessing
|
Job [{#JOB.ID}]: held status | Gets the held status of a job. | Dependent | job.held['{#JOB.ID}'] Preprocessing
|
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Job [{#JOB.ID}]: status [{ITEM.VALUE}] | The job has encountered an issue. | last(/Control-M server by HTTP/Job [{#JOB.ID}]: status,#1)=1 or last(/Control-M server by HTTP/Job [{#JOB.ID}]: status,#1)=10 |Warning |
- |
Name | Description | Type | Key and additional info |
---|---|---|---|
Agent discovery | Discovers agents on the server. | Dependent | controlm.agent.discovery Preprocessing
|
Name | Description | Type | Key and additional info |
---|---|---|---|
Agent [{#AGENT.NAME}]: stats | Gets the statistics of an agent. | Dependent | agent.stats['{#AGENT.NAME}'] Preprocessing
|
Agent [{#AGENT.NAME}]: status | Gets the status of an agent. | Dependent | agent.status['{#AGENT.NAME}'] Preprocessing
|
Agent [{#AGENT.NAME}]: version | Gets the version number of an agent. | Dependent | agent.version['{#AGENT.NAME}'] Preprocessing
|
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Agent [{#AGENT.NAME}]: status [{ITEM.VALUE}] | The agent has encountered an issue. | last(/Control-M server by HTTP/Agent [{#AGENT.NAME}]: status,#1)=1 or last(/Control-M server by HTTP/Agent [{#AGENT.NAME}]: status,#1)=10 |Average |
- |
Agent [{#AGENT.NAME}]: status disabled | The agent is disabled. | last(/Control-M server by HTTP/Agent [{#AGENT.NAME}]: status,#1)=2 or last(/Control-M server by HTTP/Agent [{#AGENT.NAME}]: status,#1)=3 |Info |
- |
Agent [{#AGENT.NAME}]: version has changed | The agent version has changed. Acknowledge (Ack) to close. | last(/Control-M server by HTTP/Agent [{#AGENT.NAME}]: version,#1)<>last(/Control-M server by HTTP/Agent [{#AGENT.NAME}]: version,#2) |Info |
- |
Agent [{#AGENT.NAME}]: unknown version | The agent version is unknown. | last(/Control-M server by HTTP/Agent [{#AGENT.NAME}]: version,#1)="Unknown" |Warning |
- |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher.
This template is designed to get metrics from the Control-M Enterprise Manager using the Control-M Automation API with HTTP agent.
This template monitors active Service Level Agreement (SLA) services, discovers Control-M servers using Low Level Discovery, and also creates host prototypes for them in conjunction with the Control-M server by HTTP template.
To use this template, the macros {$API.TOKEN} and {$API.URI.ENDPOINT} need to be set.
See Zabbix template operation for basic instructions.
This template has been tested on:
This template is intended to be used on Control-M Enterprise Manager instances.
It monitors:
Control-M server by HTTP template.
To use this template, you must set the macros {$API.TOKEN} and {$API.URI.ENDPOINT}.
To access the API token, use one of the following Control-M interfaces:
{$API.URI.ENDPOINT} - the Control-M Automation API endpoint for the API requests, including your server IP or DNS address, the Automation API port, and path.
For example, https://monitored.controlm.instance:8443/automation-api.
Name | Description | Default |
---|---|---|
{$API.URI.ENDPOINT} | The API endpoint is a URI - for example, https://monitored.controlm.instance:8443/automation-api. |
|
{$API.TOKEN} | A token to use for API connections. |
Name | Description | Type | Key and additional info |
---|---|---|---|
Control-M: Get Control-M servers | Gets a list of servers. | Http Agent | controlm.servers |
Control-M: Get SLA services | Gets all the SLA active services. | Http Agent | controlm.services |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Name | Description | Type | Key and additional info |
---|---|---|---|
Server discovery | Discovers the Control-M servers. | Dependent | controlm.server.discovery Preprocessing
|
Name | Description | Type | Key and additional info |
---|---|---|---|
SLA services discovery | Discovers the SLA services in the Control-M environment. | Dependent | controlm.services.discovery Preprocessing
|
Name | Description | Type | Key and additional info |
---|---|---|---|
Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: stats | Gets the service statistics. | Dependent | service.stats['{#SERVICE.NAME}','{#SERVICE.JOB}'] Preprocessing
|
Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: status | Gets the service status. | Dependent | service.status['{#SERVICE.NAME}','{#SERVICE.JOB}'] Preprocessing
|
Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: jobs 'executed' | Gets the number of jobs in the 'executed' state. |
Dependent | service.jobs.status['{#SERVICE.NAME}','{#SERVICE.JOB}',executed] Preprocessing
|
Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: jobs 'waitCondition' | Gets the number of jobs in the 'waitCondition' state. |
Dependent | service.jobs.status['{#SERVICE.NAME}','{#SERVICE.JOB}',waitCondition] Preprocessing
|
Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: jobs 'waitResource' | Gets the number of jobs in the 'waitResource' state. |
Dependent | service.jobs.status['{#SERVICE.NAME}','{#SERVICE.JOB}',waitResource] Preprocessing
|
Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: jobs 'waitHost' | Gets the number of jobs in the 'waitHost' state. |
Dependent | service.jobs.status['{#SERVICE.NAME}','{#SERVICE.JOB}',waitHost] Preprocessing
|
Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: jobs 'waitWorkload' | Gets the number of jobs in the 'waitWorkload' state. |
Dependent | service.jobs.status['{#SERVICE.NAME}','{#SERVICE.JOB}',waitWorkload] Preprocessing
|
Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: jobs 'completed' | Gets the number of jobs in the 'completed' state. |
Dependent | service.jobs.status['{#SERVICE.NAME}','{#SERVICE.JOB}',completed] Preprocessing
|
Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: jobs 'error' | Gets the number of jobs in the 'error' state. |
Dependent | service.jobs.status['{#SERVICE.NAME}','{#SERVICE.JOB}',error] Preprocessing
|
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: status [{ITEM.VALUE}] | The service has encountered an issue. | last(/Control-M enterprise manager by HTTP/Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: status,#1)=0 or last(/Control-M enterprise manager by HTTP/Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: status,#1)=10 |Average |
- |
Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: status [{ITEM.VALUE}] | The service has finished its job late. | last(/Control-M enterprise manager by HTTP/Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: status,#1)=3 |Warning |
- |
Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: jobs in 'error' state | There are jobs present which are in the 'error' state. |
last(/Control-M enterprise manager by HTTP/Service [{#SERVICE.NAME}, {#SERVICE.JOB}]: jobs 'error',#1)>0 |Average |
- |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor HashiCorp Consul by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template HashiCorp Consul Cluster by HTTP
— collects metrics by HTTP agent from API endpoints.
You can find more information about the metrics in the official documentation.
This template was tested on:
See Zabbix template operation for basic instructions.
The template needs to use authorization via an API token.
Don't forget to change macros {$CONSUL.CLUSTER.URL}, {$CONSUL.TOKEN}.
Also, see the Macros section for a list of macros used to set trigger values.
This template supports Consul namespaces. You can set the macro {$CONSUL.NAMESPACE} if you are interested in only one service namespace. Do not specify this macro to get all services. In the case of the Open Source version, leave this macro empty.
NOTE. Some metrics may not be collected depending on your HashiCorp Consul instance version and configuration.
NOTE. You may also be interested in the Envoy Proxy by HTTP template.
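A quick way to verify the URL and token before applying the template is to call the same HTTP API the items use. Below is a minimal Python sketch; the X-Consul-Token header is Consul's standard token mechanism, and /v1/status/leader and /v1/health/state/any are the documented status and health endpoints.

```python
import json
import urllib.request

CONSUL_URL = "http://localhost:8500"   # {$CONSUL.CLUSTER.URL}
TOKEN = "<PUT YOUR AUTH TOKEN>"        # {$CONSUL.TOKEN}

def consul_get(path: str):
    """GET a Consul HTTP API path with token authorization."""
    req = urllib.request.Request(
        f"{CONSUL_URL}{path}",
        headers={"X-Consul-Token": TOKEN},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

# Current cluster leader address, as used by the "Cluster leader" item.
print(consul_get("/v1/status/leader"))

# Serf health of every cluster member, the source for the node counters.
for check in consul_get("/v1/health/state/any"):
    print(check["Node"], check["CheckID"], check["Status"])
```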
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$CONSUL.API.PORT} | Consul API port. Used in node LLD. |
8500 |
{$CONSUL.API.SCHEME} | Consul API scheme. Used in node LLD. |
http |
{$CONSUL.CLUSTER.URL} | Consul cluster URL. |
http://localhost:8500 |
{$CONSUL.LLD.FILTER.NODE_NAME.MATCHES} | Filter of discoverable nodes. |
.* |
{$CONSUL.LLD.FILTER.NODE_NAME.NOT_MATCHES} | Filter to exclude discovered nodes. |
CHANGE_IF_NEEDED |
{$CONSUL.LLD.FILTER.SERVICE_NAME.MATCHES} | Filter of discoverable services. |
.* |
{$CONSUL.LLD.FILTER.SERVICE_NAME.NOT_MATCHES} | Filter to exclude discovered services. |
CHANGE_IF_NEEDED |
{$CONSUL.NAMESPACE} | Consul service namespace. Enterprise only; in the case of the Open Source version, leave this macro empty. Do not specify this macro to get all services. |
`` |
{$CONSUL.SERVICE_NODES.CRITICAL.MAX.AVG} | Maximum number of service nodes in status 'critical' for trigger expression. Can be used with context. |
0 |
{$CONSUL.TOKEN} | Consul auth token. |
<PUT YOUR AUTH TOKEN> |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Consul cluster nodes discovery | - |
DEPENDENT | consul.lld_nodes Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 3h Filter: - {#NODE_NAME} MATCHES_REGEX {$CONSUL.LLD.FILTER.NODE_NAME.MATCHES} - {#NODE_NAME} NOT_MATCHES_REGEX {$CONSUL.LLD.FILTER.NODE_NAME.NOT_MATCHES} |
Consul cluster services discovery | - |
DEPENDENT | consul.lld_services Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 3h Filter: - {#SERVICE_NAME} MATCHES_REGEX {$CONSUL.LLD.FILTER.SERVICE_NAME.MATCHES} - {#SERVICE_NAME} NOT_MATCHES_REGEX {$CONSUL.LLD.FILTER.SERVICE_NAME.NOT_MATCHES} |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Consul | Consul: Nodes: total | Number of nodes on current dc. |
DEPENDENT | consul.nodes_total Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
Consul | Consul: Nodes: passing | Number of agents on current dc with serf health status 'passing'. |
DEPENDENT | consul.nodes_passing Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
Consul | Consul: Nodes: critical | Number of agents on current dc with serf health status 'critical'. |
DEPENDENT | consul.nodes_critical Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
Consul | Consul: Nodes: warning | Number of agents on current dc with serf health status 'warning'. |
DEPENDENT | consul.nodes_warning Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
Consul | Consul: Services: total | Number of services on current dc. |
DEPENDENT | consul.services_total Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
Consul | Consul: Node ["{#NODE_NAME}"]: Serf Health | Node Serf Health Status. |
DEPENDENT | consul.serf.health["{#NODE_NAME}"] Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
Consul | Consul: Service ["{#SERVICE_NAME}"]: Nodes passing | - |
DEPENDENT | consul.service.nodes_passing["{#SERVICE_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: Service ["{#SERVICE_NAME}"]: Nodes warning | - |
DEPENDENT | consul.service.nodes_warning["{#SERVICE_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: Service ["{#SERVICE_NAME}"]: Nodes critical | - |
DEPENDENT | consul.service.nodes_critical["{#SERVICE_NAME}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul cluster | Consul cluster: Cluster leader | Current leader address. |
HTTP_AGENT | consul.get_leader Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: DISCARD_VALUE -> - TRIM: - DISCARD_UNCHANGED_HEARTBEAT: |
Zabbix raw items | Consul cluster: Nodes: peers | The number of Raft peers for the datacenter in which the agent is running. |
HTTP_AGENT | consul.get_peers Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: DISCARD_VALUE -> - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Zabbix raw items | Consul cluster: Get nodes | Catalog of nodes registered in a given datacenter. |
HTTP_AGENT | consul.get_nodes Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: DISCARD_VALUE -> |
Zabbix raw items | Consul cluster: Get nodes Serf health status | Get Serf Health Status for all agents in cluster. |
HTTP_AGENT | consul.get_cluster_serf Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: |
Zabbix raw items | Consul cluster: Get services | Catalog of services registered in a given datacenter. |
HTTP_AGENT | consul.get_catalog_services Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: |
Zabbix raw items | Consul cluster: ["{#SERVICE_NAME}"]: Get raw service state | Retrieve service instances providing the service indicated on the path. |
HTTP_AGENT | consul.get_service_stats["{#SERVICE_NAME}"] Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: DISCARD_VALUE -> |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Consul: One or more nodes in cluster in 'critical' state | One or more agents on current dc with serf health status 'critical'. |
last(/HashiCorp Consul Cluster by HTTP/consul.nodes_critical)>0 |
AVERAGE | |
Consul: One or more nodes in cluster in 'warning' state | One or more agents on current dc with serf health status 'warning'. |
last(/HashiCorp Consul Cluster by HTTP/consul.nodes_warning)>0 |
WARNING | |
Consul: Service ["{#SERVICE_NAME}"]: Too many nodes with service status 'critical' | ||||
One or more nodes with service status 'critical'. |
last(/HashiCorp Consul Cluster by HTTP/consul.service.nodes_critical["{#SERVICE_NAME}"])>{$CONSUL.CLUSTER.SERVICE_NODES.CRITICAL.MAX.AVG:"{#SERVICE_NAME}"} |
AVERAGE | ||
Consul cluster: Leader has been changed | The leader of the Consul cluster has been changed. Ack to close. |
last(/HashiCorp Consul Cluster by HTTP/consul.get_leader,#1)<>last(/HashiCorp Consul Cluster by HTTP/consul.get_leader,#2) and length(last(/HashiCorp Consul Cluster by HTTP/consul.get_leader))>0 |
INFO | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor HashiCorp Consul by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Do not forget to enable the Prometheus format for exported metrics.
See documentation.
You can find more information about the metrics in the official documentation.
Template HashiCorp Consul Node by HTTP
— collects metrics by HTTP agent from /v1/agent/metrics endpoint.
This template was tested on:
See Zabbix template operation for basic instructions.
Internal service metrics are collected from the /v1/agent/metrics endpoint. Do not forget to enable the Prometheus format for exported metrics. See documentation. The template needs to use authorization via an API token.
Don't forget to change macros {$CONSUL.NODE.API.URL}, {$CONSUL.TOKEN}.
Also, see the Macros section for a list of macros used to set trigger values.
This template supports Consul namespaces. You can set the macros {$CONSUL.LLD.FILTER.SERVICE_NAMESPACE.MATCHES} and {$CONSUL.LLD.FILTER.SERVICE_NAMESPACE.NOT_MATCHES} if you want to filter discovered services by namespace. In the case of the Open Source version, the service namespace will be set to 'None'.
NOTE. Some metrics may not be collected depending on your HashiCorp Consul instance version and configuration.
NOTE. You may also be interested in the Envoy Proxy by HTTP template.
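The sketch below shows the raw collection this template performs: one authorized request to the agent's metrics endpoint in Prometheus format (exposed when telemetry is configured with a Prometheus retention time, as the linked documentation describes).

```python
import urllib.request

NODE_URL = "http://localhost:8500"     # {$CONSUL.NODE.API.URL}
TOKEN = "<PUT YOUR AUTH TOKEN>"        # {$CONSUL.TOKEN}

req = urllib.request.Request(
    f"{NODE_URL}/v1/agent/metrics?format=prometheus",
    headers={"X-Consul-Token": TOKEN},
)
with urllib.request.urlopen(req, timeout=10) as resp:
    text = resp.read().decode()

# Each PROMETHEUS_PATTERN step in the items below matches one such line,
# e.g. `consul_memberlist_health_score 0`.
for line in text.splitlines():
    if line.startswith("consul_memberlist_health_score"):
        print(line)
```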
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$CONSUL.LLD.FILTER.LOCAL_SERVICE_NAME.MATCHES} | Filter of discoverable services on local node. |
.* |
{$CONSUL.LLD.FILTER.LOCAL_SERVICE_NAME.NOT_MATCHES} | Filter to exclude discovered services on local node. |
CHANGE_IF_NEEDED |
{$CONSUL.LLD.FILTER.SERVICE_NAMESPACE.MATCHES} | Filter of discoverable services by namespace on local node. Enterprise only; in the case of the Open Source version, the namespace will be set to 'None'. |
.* |
{$CONSUL.LLD.FILTER.SERVICE_NAMESPACE.NOT_MATCHES} | Filter to exclude discovered services by namespace on local node. Enterprise only; in the case of the Open Source version, the namespace will be set to 'None'. |
CHANGE_IF_NEEDED |
{$CONSUL.NODE.API.URL} | Consul instance URL. |
http://localhost:8500 |
{$CONSUL.NODE.HEALTH_SCORE.MAX.HIGH} | Maximum acceptable value of node's health score for AVERAGE trigger expression. |
4 |
{$CONSUL.NODE.HEALTH_SCORE.MAX.WARN} | Maximum acceptable value of node's health score for WARNING trigger expression. |
2 |
{$CONSUL.OPEN.FDS.MAX.WARN} | Maximum percentage of used file descriptors. |
90 |
{$CONSUL.TOKEN} | Consul auth token. |
<PUT YOUR AUTH TOKEN> |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
HTTP API methods discovery | Discovery of HTTP API method-specific metrics. |
DEPENDENT | consul.http_api_discovery Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Local node services discovery | Discover metrics for services that are registered with the local agent. |
DEPENDENT | consul.node_services_lld Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: - {#SERVICE_NAME} MATCHES_REGEX - {#SERVICE_NAME} NOT_MATCHES_REGEX - {#SERVICE_NAMESPACE} MATCHES_REGEX - {#SERVICE_NAMESPACE} NOT_MATCHES_REGEX Overrides: aggregated status - ITEM_PROTOTYPE LIKE State - DISCOVER checks service_check - ITEM_PROTOTYPE LIKE Check - DISCOVER |
Raft leader metrics discovery | Discover raft metrics for leader nodes. |
DEPENDENT | consul.raft.leader.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Raft server metrics discovery | Discover raft metrics for server nodes. |
DEPENDENT | consul.raft.server.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Consul | Consul: Role | Role of current Consul agent. |
DEPENDENT | consul.role Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: Version | Version of Consul agent. |
DEPENDENT | consul.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: Number of services | Number of services on current node. |
DEPENDENT | consul.services_number Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
Consul | Consul: Number of checks | Number of checks on current node. |
DEPENDENT | consul.checks_number Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: 3h |
Consul | Consul: Number of check monitors | Number of check monitors on current node. |
DEPENDENT | consul.check_monitors_number Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: Process CPU seconds, total | Total user and system CPU time spent in seconds. |
DEPENDENT | consul.cpu_seconds_total.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Virtual memory size | Virtual memory size in bytes. |
DEPENDENT | consul.virtual_memory_bytes Preprocessing: - PROMETHEUS_PATTERN: |
Consul | Consul: RSS memory usage | Resident memory size in bytes. |
DEPENDENT | consul.resident_memory_bytes Preprocessing: - PROMETHEUS_PATTERN: |
Consul | Consul: Goroutine count | The number of Goroutines on Consul instance. |
DEPENDENT | consul.goroutines Preprocessing: - PROMETHEUS_PATTERN: |
Consul | Consul: Open file descriptors | Number of open file descriptors. |
DEPENDENT | consul.process_open_fds Preprocessing: - PROMETHEUS_PATTERN: |
Consul | Consul: Open file descriptors, max | Maximum number of open file descriptors. |
DEPENDENT | consul.process_max_fds Preprocessing: - PROMETHEUS_PATTERN: |
Consul | Consul: Client RPC, per second | Number of times per second whenever a Consul agent in client mode makes an RPC request to a Consul server. This gives a measure of how much a given agent is loading the Consul servers. This is only generated by agents in client mode, not Consul servers. |
DEPENDENT | consul.client_rpc Preprocessing: - PROMETHEUS_PATTERN: consul_client_rpc ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Client RPC failed, per second | Number of times per second whenever a Consul agent in client mode makes an RPC request to a Consul server and fails. |
DEPENDENT | consul.client_rpc_failed Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: TCP connections, accepted per second | This metric counts the number of times a Consul agent has accepted an incoming TCP stream connection per second. |
DEPENDENT | consul.memberlist.tcp_accept Preprocessing: - PROMETHEUS_PATTERN: consul_memberlist_tcp_accept ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: TCP connections, per second | This metric counts the number of times a Consul agent has initiated a push/pull sync with another agent per second. |
DEPENDENT | consul.memberlist.tcp_connect Preprocessing: - PROMETHEUS_PATTERN: consul_memberlist_tcp_connect ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: TCP send bytes, per second | This metric measures the total number of bytes sent by a Consul agent through the TCP protocol per second. |
DEPENDENT | consul.memberlist.tcp_sent Preprocessing: - PROMETHEUS_PATTERN: consul_memberlist_tcp_sent ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: UDP received bytes, per second | This metric measures the total number of bytes received by a Consul agent through the UDP protocol per second. |
DEPENDENT | consul.memberlist.udp_received Preprocessing: - PROMETHEUS_PATTERN: consul_memberlist_udp_received ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: UDP sent bytes, per second | This metric measures the total number of bytes sent by a Consul agent through the UDP protocol per second. |
DEPENDENT | consul.memberlist.udp_sent Preprocessing: - PROMETHEUS_PATTERN: consul_memberlist_udp_sent ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: GC pause, p90 | The 90 percentile for the number of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started, in milliseconds. |
DEPENDENT | consul.gc_pause.p90 Preprocessing: - PROMETHEUS_PATTERN: consul_runtime_gc_pause_ns{quantile="0.9"} ⛔️ON_FAIL: - JAVASCRIPT: - MULTIPLIER: |
Consul | Consul: GC pause, p50 | The 50 percentile (median) for the number of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started, in milliseconds. |
DEPENDENT | consul.gc_pause.p50 Preprocessing: - PROMETHEUS_PATTERN: consul_runtime_gc_pause_ns{quantile="0.5"} ⛔️ON_FAIL: - JAVASCRIPT: - MULTIPLIER: |
Consul | Consul: Memberlist: degraded | This metric counts the number of times the Consul agent has performed failure detection on another agent at a slower probe rate. The agent uses its own health metric as an indicator to perform this action. If its health score is low, it means that the node is healthy, and vice versa. |
DEPENDENT | consul.memberlist.degraded Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: Memberlist: health score | This metric describes a node's perception of its own health based on how well it is meeting the soft real-time requirements of the protocol. This metric ranges from 0 to 8, where 0 indicates "totally healthy". |
DEPENDENT | consul.memberlist.health_score Preprocessing: - PROMETHEUS_PATTERN: consul_memberlist_health_score ⛔️ON_FAIL: |
Consul | Consul: Memberlist: gossip, p90 | The 90 percentile for the number of gossips (messages) broadcasted to a set of randomly selected nodes. |
DEPENDENT | consul.memberlist.gossip.p90 Preprocessing: - PROMETHEUS_PATTERN: consul_memberlist_gossip{quantile="0.9"} ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Memberlist: gossip, p50 | The 50 percentile (median) for the number of gossips (messages) broadcasted to a set of randomly selected nodes. |
DEPENDENT | consul.memberlist.gossip.p50 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Memberlist: msg alive | This metric counts the number of alive Consul agents, that the agent has mapped out so far, based on the message information given by the network layer. |
DEPENDENT | consul.memberlist.msg.alive Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: Memberlist: msg dead | This metric counts the number of times a Consul agent has marked another agent to be a dead node. |
DEPENDENT | consul.memberlist.msg.dead Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: Memberlist: msg suspect | The number of times a Consul agent suspects another as failed while probing during gossip protocol. |
DEPENDENT | consul.memberlist.msg.suspect Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: Memberlist: probe node, p90 | The 90 percentile for the time taken to perform a single round of failure detection on a select Consul agent. |
DEPENDENT | consul.memberlist.probe_node.p90 Preprocessing: - PROMETHEUS_PATTERN: consul_memberlist_probeNode{quantile="0.9"} ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Memberlist: probe node, p50 | The 50 percentile (median) for the time taken to perform a single round of failure detection on a select Consul agent. |
DEPENDENT | consul.memberlist.probe_node.p50 Preprocessing: - PROMETHEUS_PATTERN: consul_memberlist_probeNode{quantile="0.5"} ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Memberlist: push pull node, p90 | The 90 percentile for the number of Consul agents that have exchanged state with this agent. |
DEPENDENT | consul.memberlist.push_pull_node.p90 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Memberlist: push pull node, p50 | The 50 percentile (median) for the number of Consul agents that have exchanged state with this agent. |
DEPENDENT | consul.memberlist.push_pull_node.p50 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: KV store: apply, p90 | The 90 percentile for the time it takes to complete an update to the KV store. |
DEPENDENT | consul.kvs.apply.p90 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: KV store: apply, p50 | The 50 percentile (median) for the time it takes to complete an update to the KV store. |
DEPENDENT | consul.kvs.apply.p50 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: KV store: apply, rate | The number of updates to the KV store per second. |
DEPENDENT | consul.kvs.apply.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Serf member: flap, rate | Increments when an agent is marked dead and then recovers within a short time period. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports. Shown as events per second. |
DEPENDENT | consul.serf.member.flap.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Serf member: failed, rate | Increments when an agent is marked dead. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports. Shown as events per second. |
DEPENDENT | consul.serf.member.failed.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Serf member: join, rate | Increments when an agent joins the cluster. If an agent flapped or failed this counter also increments when it re-joins. Shown as events per second. |
DEPENDENT | consul.serf.member.join.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Serf member: left, rate | Increments when an agent leaves the cluster. Shown as events per second. |
DEPENDENT | consul.serf.member.left.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Serf member: update, rate | Increments when a Consul agent updates. Shown as events per second. |
DEPENDENT | consul.serf.member.update.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: ACL: resolves, rate | The number of ACL resolves per second. |
DEPENDENT | consul.acl.resolves.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Catalog: register, rate | The number of catalog register operations per second. |
DEPENDENT | consul.catalog.register.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Catalog: deregister, rate | The number of catalog deregister operations per second. |
DEPENDENT | consul.catalog.deregister.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Snapshot: append line, p90 | The 90 percentile for the time taken by the Consul agent to append an entry into the existing log. |
DEPENDENT | consul.snapshot.append_line.p90 Preprocessing: - PROMETHEUS_PATTERN: consul_serf_snapshot_appendLine{quantile="0.9"} ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Snapshot: append line, p50 | The 50 percentile (median) for the time taken by the Consul agent to append an entry into the existing log. |
DEPENDENT | consul.snapshot.append_line.p50 Preprocessing: - PROMETHEUS_PATTERN: consul_serf_snapshot_appendLine{quantile="0.5"} ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Snapshot: append line, rate | The number of snapshot appendLine operations per second. |
DEPENDENT | consul.snapshot.append_line.rate Preprocessing: - PROMETHEUS_PATTERN: consul_serf_snapshot_appendLine_count ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Snapshot: compact, p90 | The 90 percentile for the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction. |
DEPENDENT | consul.snapshot.compact.p90 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Snapshot: compact, p50 | The 50 percentile (median) for the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction. |
DEPENDENT | consul.snapshot.compact.p50 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Snapshot: compact, rate | The number of snapshot compact operations per second. |
DEPENDENT | consul.snapshot.compact.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Get local services check | Data collection check. |
DEPENDENT | consul.getlocalservices.check Preprocessing: - JSONPATH: ⛔️ONFAIL: - DISCARDUNCHANGED_HEARTBEAT: |
Consul | Consul: ["{#SERVICE_NAME}"]: Aggregated status | Aggregated values of all health checks for the service instance. |
DEPENDENT | consul.service.aggregatedstate["{#SERVICEID}"] Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
Consul | Consul: ["{#SERVICENAME}"]: Check ["{#SERVICECHECK_NAME}"]: Status | Current state of health check for the service. |
DEPENDENT | consul.service.check.state["{#SERVICEID}/{#SERVICECHECKID}"] Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD UNCHANGED_HEARTBEAT:3h |
Consul | Consul: ["{#SERVICENAME}"]: Check ["{#SERVICECHECK_NAME}"]: Output | Current output of health check for the service. |
DEPENDENT | consul.service.check.output["{#SERVICEID}/{#SERVICECHECKID}"] Preprocessing: - JSONPATH: - DISCARD UNCHANGED_HEARTBEAT:3h |
Consul | Consul: HTTP request: ["{#HTTP_METHOD}"], p90 | The 90 percentile of how long it takes to service the given HTTP request for the given verb. |
DEPENDENT | consul.http.api.p90["{#HTTPMETHOD}"] Preprocessing: - PROMETHEUS PATTERN:consul_api_http{method = "{#HTTP_METHOD}", quantile = "0.9"} : function : sum ⛔️ON_FAIL: |
Consul | Consul: HTTP request: ["{#HTTP_METHOD}"], p50 | The 50 percentile (median) of how long it takes to service the given HTTP request for the given verb. |
DEPENDENT | consul.http.api.p50["{#HTTPMETHOD}"] Preprocessing: - PROMETHEUS PATTERN:consul_api_http{method = "{#HTTP_METHOD}", quantile = "0.5"} : function : sum ⛔️ON_FAIL: |
Consul | Consul: HTTP request: ["{#HTTP_METHOD}"], rate | Thr number of HTTP request for the given verb per second. |
DEPENDENT | consul.http.api.rate["{#HTTPMETHOD}"] Preprocessing: - PROMETHEUS PATTERN:consul_api_http_count{method = "{#HTTP_METHOD}"} : function : sum ⛔️ONFAIL: - CHANGEPER_SECOND |
Consul | Consul: Raft state | Current state of Consul agent. |
DEPENDENT | consul.raft.state[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Consul | Consul: Raft state: leader | Increments when a server becomes a leader. |
DEPENDENT | consul.raft.stateleader[{#SINGLETON}] Preprocessing: - PROMETHEUS PATTERN:consul_raft_state_leader ⛔️ON_FAIL: |
Consul | Consul: Raft state: candidate | The number of initiated leader elections. |
DEPENDENT | consul.raft.statecandidate[{#SINGLETON}] Preprocessing: - PROMETHEUS PATTERN:consul_raft_state_candidate ⛔️ON_FAIL: |
Consul | Consul: Raft: apply, rate | Incremented whenever a leader first passes a message into the Raft commit process (called an Apply operation). This metric describes the arrival rate of new logs into Raft per second. |
DEPENDENT | consul.raft.apply.rate[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - CHANGEPERSECOND |
Consul | Consul: Raft state: leader last contact, p90 | The 90th percentile of the time taken by a leader node to communicate with followers during a leader lease check, in milliseconds. |
DEPENDENT | consul.raft.leaderlastcontact.p90[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - JAVASCRIPT: |
Consul | Consul: Raft state: leader last contact, p50 | The 50th percentile (median) of the time taken by a leader node to communicate with followers during a leader lease check, in milliseconds. |
DEPENDENT | consul.raft.leaderlastcontact.p50[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: - JAVASCRIPT: |
Consul | Consul: Raft state: commit time, p90 | The 90th percentile of the time it takes to commit a new entry to the Raft log on the leader, in milliseconds. |
DEPENDENT | consul.raft.committime.p90[{#SINGLETON}] Preprocessing: - PROMETHEUS PATTERN:consul_raft_commitTime{quantile="0.9"} ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Raft state: commit time, p50 | The 50th percentile (median) of the time it takes to commit a new entry to the Raft log on the leader, in milliseconds. |
DEPENDENT | consul.raft.committime.p50[{#SINGLETON}] Preprocessing: - PROMETHEUS PATTERN:consul_raft_commitTime{quantile="0.5"} ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Raft state: dispatch log, p90 | The 90th percentile of the time it takes for the leader to write log entries to disk, in milliseconds. |
DEPENDENT | consul.raft.dispatchlog.p90[{#SINGLETON}] Preprocessing: - PROMETHEUS PATTERN:consul_raft_leader_dispatchLog{quantile="0.9"} ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Raft state: dispatch log, p50 | The 50th percentile (median) of the time it takes for the leader to write log entries to disk, in milliseconds. |
DEPENDENT | consul.raft.dispatchlog.p50[{#SINGLETON}] Preprocessing: - PROMETHEUS PATTERN:consul_raft_leader_dispatchLog{quantile="0.5"} ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Raft state: dispatch log, rate | The number of times a Raft leader writes a log to disk per second. |
DEPENDENT | consul.raft.dispatchlog.rate[{#SINGLETON}] Preprocessing: - PROMETHEUS PATTERN:consul_raft_leader_dispatchLog_count ⛔️ONFAIL: - CHANGEPER_SECOND |
Consul | Consul: Raft state: commit, rate | The number of new entries committed to the Raft log on the leader per second. |
DEPENDENT | consul.raft.committime.rate[{#SINGLETON}] Preprocessing: - PROMETHEUS PATTERN:consul_raft_commitTime_count ⛔️ONFAIL: - CHANGEPER_SECOND |
Consul | Consul: Autopilot healthy | Tracks the overall health of the local server cluster. 1 if all servers are healthy, 0 if one or more are unhealthy. |
DEPENDENT | consul.autopilot.healthy[{#SINGLETON}] Preprocessing: - PROMETHEUSPATTERN: ⛔️ONFAIL: |
Zabbix raw items | Consul: Get instance metrics | Get raw metrics from Consul instance /metrics endpoint. |
HTTP_AGENT | consul.getmetrics Preprocessing: - CHECK NOTSUPPORTED⛔️ON FAIL:DISCARD_VALUE -> |
Zabbix raw items | Consul: Get node info | Get configuration and member information of the local agent. |
HTTP_AGENT | consul.getnodeinfo Preprocessing: - CHECKNOTSUPPORTED ⛔️ON_FAIL: |
Zabbix raw items | Consul: Get local services | Get all the services that are registered with the local agent and their status. |
SCRIPT | consul.getlocalservices Expression: The text is too long. Please see the template. |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Consul: Version has been changed | Consul version has changed. Ack to close. |
last(/HashiCorp Consul Node by HTTP/consul.version,#1)<>last(/HashiCorp Consul Node by HTTP/consul.version,#2) and length(last(/HashiCorp Consul Node by HTTP/consul.version))>0 |
INFO | Manual close: YES |
Consul: Current number of open files is too high | "Heavy file descriptor usage (i.e., near the process’s file descriptor limit) indicates a potential file descriptor exhaustion issue." |
min(/HashiCorp Consul Node by HTTP/consul.process_open_fds,5m)/last(/HashiCorp Consul Node by HTTP/consul.process_max_fds)*100>{$CONSUL.OPEN.FDS.MAX.WARN} |
WARNING | |
Consul: Node's health score is warning | This metric ranges from 0 to 8, where 0 indicates "totally healthy". This health score is used to scale the time between outgoing probes, and higher scores translate into longer probing intervals. For more details see section IV of the Lifeguard paper: https://arxiv.org/pdf/1707.00788.pdf |
max(/HashiCorp Consul Node by HTTP/consul.memberlist.health_score,#3)>{$CONSUL.NODE.HEALTH_SCORE.MAX.WARN} |
WARNING | Depends on: - Consul: Node's health score is critical |
Consul: Node's health score is critical | This metric ranges from 0 to 8, where 0 indicates "totally healthy". This health score is used to scale the time between outgoing probes, and higher scores translate into longer probing intervals. For more details see section IV of the Lifeguard paper: https://arxiv.org/pdf/1707.00788.pdf |
max(/HashiCorp Consul Node by HTTP/consul.memberlist.health_score,#3)>{$CONSUL.NODE.HEALTH_SCORE.MAX.HIGH} |
AVERAGE | |
Consul: Failed to get local services | Failed to get local services. Check debug log for more information. |
length(last(/HashiCorp Consul Node by HTTP/consul.get_local_services.check))>0 |
WARNING | |
Consul: Aggregated status is 'warning' | Aggregated state of service on the local agent is 'warning'. |
last(/HashiCorp Consul Node by HTTP/consul.service.aggregated_state["{#SERVICE_ID}"]) = 1 |
WARNING | |
Consul: Aggregated status is 'critical' | Aggregated state of service on the local agent is 'critical'. |
last(/HashiCorp Consul Node by HTTP/consul.service.aggregated_state["{#SERVICE_ID}"]) = 2 |
AVERAGE |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher
The template to monitor Cloudflare by Zabbix; it watches your web traffic and DNS metrics.
It works without any external scripts and uses the Script item.
See Zabbix template operation for basic instructions.
1. Create a host, for example mywebsite.com, for a site in your Cloudflare account.
2. Link the template to the host.
3. Customize the values of the {$CLOUDFLARE.API.TOKEN} and {$CLOUDFLARE.ZONE_ID} macros.
Cloudflare API Tokens are available in your Cloudflare account under My Profile > API Tokens.
Zone ID is available in your Cloudflare account under Account Home > Site.
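You can verify both macro values before linking the template with a couple of direct API calls. Below is a minimal Python sketch (a pre-flight check only; the template's Script item performs its own requests, and the placeholder token and Zone ID are assumptions to be replaced):

import requests

API_URL = "https://api.cloudflare.com/client/v4"  # {$CLOUDFLARE.API.URL}
API_TOKEN = "<change>"                            # {$CLOUDFLARE.API.TOKEN}
ZONE_ID = "<change>"                              # {$CLOUDFLARE.ZONE_ID}

headers = {"Authorization": "Bearer " + API_TOKEN}

# Check that the token is valid and active.
resp = requests.get(API_URL + "/user/tokens/verify", headers=headers, timeout=3)
print(resp.json()["result"]["status"])  # expected: "active"

# Check that the token can read the zone to be monitored.
resp = requests.get(API_URL + "/zones/" + ZONE_ID, headers=headers, timeout=3)
print(resp.json()["result"]["name"])  # expected: your site name, e.g. mywebsite.com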
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$CLOUDFLARE.API.TOKEN} | Your Cloudflare API Token. |
<change> |
{$CLOUDFLARE.API.URL} | The URL of Cloudflare API endpoint. |
https://api.cloudflare.com/client/v4 |
{$CLOUDFLARE.CACHED_BANDWIDTH.MIN.WARN} | Minimum cached bandwidth, in %. |
50 |
{$CLOUDFLARE.ERRORS.MAX.WARN} | Maximum share of responses with errors, in %. |
30 |
{$CLOUDFLARE.GET_DATA.TIMEOUT} | Response timeout for Cloudflare API. |
3s |
{$CLOUDFLARE.ZONE_ID} | Your Cloudflare Site Zone ID. |
<change> |
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
General | Cloudflare: Total bandwidth | The volume of all data. |
DEPENDENT | cloudflare.bandwidth.all Preprocessing: - JSONPATH: |
General | Cloudflare: Cached bandwidth | The volume of cached data. |
DEPENDENT | cloudflare.bandwidth.cached Preprocessing: - JSONPATH: |
General | Cloudflare: Uncached bandwidth | The volume of uncached data. |
DEPENDENT | cloudflare.bandwidth.uncached Preprocessing: - JSONPATH: |
General | Cloudflare: Cache hit ratio of bandwidth | The ratio of cached bandwidth to total bandwidth, in percent. |
DEPENDENT | cloudflare.bandwidth.cachehitratio Preprocessing: - JSONPATH: |
General | Cloudflare: SSL encrypted bandwidth | The volume of encrypted data. |
DEPENDENT | cloudflare.bandwidth.ssl.encrypted Preprocessing: - JSONPATH: |
General | Cloudflare: Unencrypted bandwidth | The volume of unencrypted data. |
DEPENDENT | cloudflare.bandwidth.ssl.unencrypted Preprocessing: - JSONPATH: |
General | Cloudflare: DNS queries | The total number of DNS queries. |
DEPENDENT | cloudflare.dns.query.all Preprocessing: - JSONPATH: |
General | Cloudflare: Stale DNS queries | The number of stale DNS queries. |
DEPENDENT | cloudflare.dns.query.stale Preprocessing: - JSONPATH: |
General | Cloudflare: Uncached DNS queries | The number of uncached DNS queries. |
DEPENDENT | cloudflare.dns.query.uncached Preprocessing: - JSONPATH: |
General | Cloudflare: Total page views | The total number of page views. |
DEPENDENT | cloudflare.pageviews.all Preprocessing: - JSONPATH: |
General | Cloudflare: Total requests | The total number of requests. |
DEPENDENT | cloudflare.requests.all Preprocessing: - JSONPATH: |
General | Cloudflare: Cached requests | The number of cached requests. |
DEPENDENT | cloudflare.requests.cached Preprocessing: - JSONPATH: |
General | Cloudflare: Uncached requests | The number of uncached requests. |
DEPENDENT | cloudflare.requests.uncached Preprocessing: - JSONPATH: |
General | Cloudflare: Cache hit ratio % over time | The ratio of cached requests to all requests, in percent. |
DEPENDENT | cloudflare.requests.cachehitratio Preprocessing: - JSONPATH: |
General | Cloudflare: Response codes 1xx | The number of requests with 1xx response codes. |
DEPENDENT | cloudflare.requests.response_100 Preprocessing: - JSONPATH: |
General | Cloudflare: Response codes 2xx | The number of requests with 2xx response codes. |
DEPENDENT | cloudflare.requests.response_200 Preprocessing: - JSONPATH: |
General | Cloudflare: Response codes 3xx | The number of requests with 3xx response codes. |
DEPENDENT | cloudflare.requests.response_300 Preprocessing: - JSONPATH: |
General | Cloudflare: Response codes 4xx | The number of requests with 4xx response codes. |
DEPENDENT | cloudflare.requests.response_400 Preprocessing: - JSONPATH: |
General | Cloudflare: Response codes 5xx | The number of requests with 5xx response codes. |
DEPENDENT | cloudflare.requests.response_500 Preprocessing: - JSONPATH: |
General | Cloudflare: Non-2xx responses ratio | The ratio of requests with non-2xx response codes to all requests, in percent. |
DEPENDENT | cloudflare.requests.others_ratio Preprocessing: - JSONPATH: |
General | Cloudflare: 2xx responses ratio | The ratio of requests with 2xx response codes to all requests, in percent. |
DEPENDENT | cloudflare.requests.success_ratio Preprocessing: - JSONPATH: |
General | Cloudflare: SSL encrypted requests | The number of encrypted requests. |
DEPENDENT | cloudflare.requests.ssl.encrypted Preprocessing: - JSONPATH: |
General | Cloudflare: Unencrypted requests | The number of unencrypted requests. |
DEPENDENT | cloudflare.requests.ssl.unencrypted Preprocessing: - JSONPATH: |
General | Cloudflare: Total threats | The number of all threats. |
DEPENDENT | cloudflare.threats.all Preprocessing: - JSONPATH: |
General | Cloudflare: Unique visitors | The number of unique visitor IPs. |
DEPENDENT | cloudflare.uniques.all Preprocessing: - JSONPATH: |
Zabbix raw items | Cloudflare: Get data | The JSON with result of Cloudflare API request. |
SCRIPT | cloudflare.get Expression: The text is too long. Please see the template. |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Cloudflare: Cached bandwidth is too low | max(/Cloudflare by HTTP/cloudflare.bandwidth.cache_hit_ratio,#3) < {$CLOUDFLARE.CACHED_BANDWIDTH.MIN.WARN} |
WARNING | ||
Cloudflare: Ratio of non-2xx responses is too high | A large number of errors can indicate a malfunction of the site. |
min(/Cloudflare by HTTP/cloudflare.requests.others_ratio,#3) > {$CLOUDFLARE.ERRORS.MAX.WARN} |
AVERAGE |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
The template to monitor the TLS/SSL certificate of a website by Zabbix agent 2 that works without any external scripts.
Zabbix agent 2 with the WebCertificate plugin requests the certificate using the web.certificate.get key and returns JSON with the certificate attributes.
See Zabbix template operation for basic instructions.
1. Set up and configure zabbix-agent2 with the WebCertificate plugin.
2. Test availability: zabbix_get -s <zabbix_agent_addr> -k web.certificate.get[<website_DNS_name>]
3. Create a host for the TLS/SSL certificate with Zabbix agent interface.
4. Link the template to the host.
5. Customize the value of the {$CERT.WEBSITE.HOSTNAME} macro.
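Most of the attributes that the template's cert.* items report can be previewed without Zabbix. The following is a minimal Python sketch using only the standard library (the hostname is an assumption; substitute your {$CERT.WEBSITE.HOSTNAME} value):

import socket
import ssl
import time

host, port = "example.com", 443  # {$CERT.WEBSITE.HOSTNAME}, {$CERT.WEBSITE.PORT}

context = ssl.create_default_context()
with socket.create_connection((host, port), timeout=10) as sock:
    with context.wrap_socket(sock, server_hostname=host) as tls:
        cert = tls.getpeercert()

print("issuer: ", cert["issuer"])        # cf. 'Cert: Issuer'
print("subject:", cert["subject"])       # cf. 'Cert: Subject'
print("serial: ", cert["serialNumber"])  # cf. 'Cert: Serial number'
print("expires:", cert["notAfter"])      # cf. 'Cert: Expires on'

# The same arithmetic as the 'Cert: SSL certificate expires soon' trigger:
days_left = (ssl.cert_time_to_seconds(cert["notAfter"]) - time.time()) / 86400
print("days until expiry:", int(days_left))  # compared against {$CERT.EXPIRY.WARN}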
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$CERT.EXPIRY.WARN} | Number of days until the certificate expires. |
7 |
{$CERT.WEBSITE.HOSTNAME} | The website DNS name for the connection. |
<Put DNS name> |
{$CERT.WEBSITE.IP} | The website IP address for the connection. |
`` |
{$CERT.WEBSITE.PORT} | The TLS/SSL port number of the website. |
443 |
There are no template links in this template.
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
General | Cert: Validation result | The certificate validation result. Possible values: valid/invalid/valid-but-self-signed |
DEPENDENT | cert.validation Preprocessing: - JSONPATH: |
General | Cert: Last validation status | Last check result message. |
DEPENDENT | cert.message Preprocessing: - JSONPATH: |
General | Cert: Version | The version of the encoded certificate. |
DEPENDENT | cert.version Preprocessing: - JSONPATH: |
General | Cert: Serial number | The serial number is a positive integer assigned by the CA to each certificate. It is unique for each certificate issued by a given CA. Non-conforming CAs may issue certificates with serial numbers that are negative or zero. |
DEPENDENT | cert.serial_number Preprocessing: - JSONPATH: |
General | Cert: Signature algorithm | The algorithm identifier for the algorithm used by the CA to sign the certificate. |
DEPENDENT | cert.signature_algorithm Preprocessing: - JSONPATH: |
General | Cert: Issuer | The field identifies the entity that has signed and issued the certificate. |
DEPENDENT | cert.issuer Preprocessing: - JSONPATH: |
General | Cert: Valid from | The date on which the certificate validity period begins. |
DEPENDENT | cert.not_before Preprocessing: - JSONPATH: |
General | Cert: Expires on | The date on which the certificate validity period ends. |
DEPENDENT | cert.not_after Preprocessing: - JSONPATH: |
General | Cert: Subject | The field identifies the entity associated with the public key stored in the subject public key field. |
DEPENDENT | cert.subject Preprocessing: - JSONPATH: |
General | Cert: Subject alternative name | The subject alternative name extension allows identities to be bound to the subject of the certificate. These identities may be included in addition to or in place of the identity in the subject field of the certificate. Defined options include an Internet electronic mail address, a DNS name, an IP address, and a Uniform Resource Identifier (URI). |
DEPENDENT | cert.alternative_names Preprocessing: - JSONPATH: |
General | Cert: Public key algorithm | The digital signature algorithm is used to verify the signature of a certificate. |
DEPENDENT | cert.publickeyalgorithm Preprocessing: - JSONPATH: |
General | Cert: Fingerprint | The Certificate Signature (SHA1 Fingerprint or Thumbprint) is the hash of the entire certificate in DER form. |
DEPENDENT | cert.sha1_fingerprint Preprocessing: - JSONPATH: |
Zabbix raw items | Cert: Get | Returns the JSON with attributes of a certificate of the requested site. |
ZABBIX_PASSIVE | web.certificate.get[{$CERT.WEBSITE.HOSTNAME},{$CERT.WEBSITE.PORT},{$CERT.WEBSITE.IP}] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Cert: SSL certificate is invalid | SSL certificate has expired or it is issued for another domain. |
find(/Website certificate by Zabbix agent 2/cert.validation,,"like","invalid")=1 |
HIGH | |
Cert: SSL certificate expires soon | The SSL certificate should be updated or it will become untrusted. |
(last(/Website certificate by Zabbix agent 2/cert.not_after) - now()) / 86400 < {$CERT.EXPIRY.WARN} |
WARNING | Depends on: - Cert: SSL certificate is invalid |
Cert: Fingerprint has changed | The SSL certificate fingerprint has changed. If you did not update the certificate, it may mean your certificate has been compromised. Ack to close. Some installations may serve multiple valid certificates; in that case, the trigger raises a false positive. You can ignore it or disable the trigger. |
last(/Website certificate by Zabbix agent 2/cert.sha1_fingerprint) <> last(/Website certificate by Zabbix agent 2/cert.sha1_fingerprint,#2) |
INFO | Manual close: YES |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher. The template is designed to monitor Ceph cluster by Zabbix, which works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template Ceph by Zabbix agent 2
— collects metrics by polling zabbix-agent2.
This template was tested on:
See Zabbix template operation for basic instructions.
Test availability: zabbix_get -s ceph-host -k ceph.ping["{$CEPH.CONNSTRING}","{$CEPH.USER}","{$CEPH.API.KEY}"]
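If the check above fails, you can query the ceph-mgr RESTful module directly with the same connection parameters the plugin uses. A minimal Python sketch (the endpoint name follows the RESTful module documentation; the self-signed certificate check is disabled here only for brevity):

import requests

CONNSTRING = "https://localhost:8003"  # {$CEPH.CONNSTRING}
USER = "zabbix"                        # {$CEPH.USER}
API_KEY = "zabbix_pass"                # {$CEPH.API.KEY}

# The RESTful module authenticates with HTTP basic auth: user name + generated key.
resp = requests.get(CONNSTRING + "/mon", auth=(USER, API_KEY),
                    verify=False, timeout=10)
resp.raise_for_status()
print(resp.json())  # the list of monitors known to the cluster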
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$CEPH.API.KEY} | - |
zabbix_pass |
{$CEPH.CONNSTRING} | - |
https://localhost:8003 |
{$CEPH.USER} | - |
zabbix |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
OSD | - |
ZABBIX_PASSIVE | ceph.osd.discovery["{$CEPH.CONNSTRING}","{$CEPH.USER}","{$CEPH.API.KEY}"] |
Pool | - |
ZABBIX_PASSIVE | ceph.pool.discovery["{$CEPH.CONNSTRING}","{$CEPH.USER}","{$CEPH.API.KEY}"] |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Ceph | Ceph: Ping | ZABBIX_PASSIVE | ceph.ping["{$CEPH.CONNSTRING}","{$CEPH.USER}","{$CEPH.API.KEY}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
|
Ceph | Ceph: Number of Monitors | The number of Monitors configured in a Ceph cluster. |
DEPENDENT | ceph.nummon Preprocessing: - JSONPATH: - DISCARD UNCHANGED_HEARTBEAT:30m |
Ceph | Ceph: Overall cluster status | The overall Ceph cluster status, e.g. 0 - HEALTH_OK, 1 - HEALTH_WARN or 2 - HEALTH_ERR. |
DEPENDENT | ceph.overallstatus Preprocessing: - JSONPATH: - DISCARD UNCHANGED_HEARTBEAT:10m |
Ceph | Ceph: Minimum Mon release version | The minimum monitor release version (min_mon_release_name). |
DEPENDENT | ceph.minmonreleasename Preprocessing: - JSONPATH: - DISCARD UNCHANGED_HEARTBEAT:1h |
Ceph | Ceph: Ceph Read bandwidth | The global read bytes per second. |
DEPENDENT | ceph.rdbytes.rate Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Ceph | Ceph: Ceph Write bandwidth | The global write bytes per second. |
DEPENDENT | ceph.wrbytes.rate Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Ceph | Ceph: Ceph Read operations per sec | The global read operations per second. |
DEPENDENT | ceph.rdops.rate Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Ceph | Ceph: Ceph Write operations per sec | The global write operations per second. |
DEPENDENT | ceph.wrops.rate Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Ceph | Ceph: Total bytes available | The total bytes available in a Ceph cluster. |
DEPENDENT | ceph.totalavailbytes Preprocessing: - JSONPATH: |
Ceph | Ceph: Total bytes | The total (RAW) capacity of a Ceph cluster in bytes. |
DEPENDENT | ceph.total_bytes Preprocessing: - JSONPATH: |
Ceph | Ceph: Total bytes used | The total bytes used in a Ceph cluster. |
DEPENDENT | ceph.totalusedbytes Preprocessing: - JSONPATH: |
Ceph | Ceph: Total number of objects | The total number of objects in a Ceph cluster. |
DEPENDENT | ceph.total_objects Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups | The total number of Placement Groups in a Ceph cluster. |
DEPENDENT | ceph.numpg Preprocessing: - JSONPATH: - DISCARD UNCHANGED_HEARTBEAT:10m |
Ceph | Ceph: Number of Placement Groups in Temporary state | The total number of Placement Groups in a pg_temp state. |
DEPENDENT | ceph.numpgtemp Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in Active state | The total number of Placement Groups in an active state. |
DEPENDENT | ceph.pg_states.active Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in Clean state | The total number of Placement Groups in a clean state. |
DEPENDENT | ceph.pg_states.clean Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in Peering state | The total number of Placement Groups in a peering state. |
DEPENDENT | ceph.pg_states.peering Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in Scrubbing state | The total number of Placement Groups in a scrubbing state. |
DEPENDENT | ceph.pg_states.scrubbing Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in Undersized state | The total number of Placement Groups in an undersized state. |
DEPENDENT | ceph.pg_states.undersized Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in Backfilling state | The total number of Placement Groups in a backfill state. |
DEPENDENT | ceph.pg_states.backfilling Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in degraded state | The total number of Placement Groups in a degraded state. |
DEPENDENT | ceph.pg_states.degraded Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in inconsistent state | The total number of Placement Groups in an inconsistent state. |
DEPENDENT | ceph.pg_states.inconsistent Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in Unknown state | The total number of Placement Groups in an unknown state. |
DEPENDENT | ceph.pg_states.unknown Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in remapped state | The total number of Placement Groups in a remapped state. |
DEPENDENT | ceph.pg_states.remapped Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in recovering state | The total number of Placement Groups in a recovering state. |
DEPENDENT | ceph.pg_states.recovering Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in backfill_toofull state | The total number of Placement Groups in a backfill_toofull state. |
DEPENDENT | ceph.pgstates.backfilltoofull Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in backfill_wait state | The total number of Placement Groups in a backfill_wait state. |
DEPENDENT | ceph.pgstates.backfillwait Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Placement Groups in recovery_wait state | The total number of Placement Groups in a recovery_wait state. |
DEPENDENT | ceph.pgstates.recoverywait Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of Pools | The total number of pools in a Ceph cluster. |
DEPENDENT | ceph.num_pools Preprocessing: - JSONPATH: |
Ceph | Ceph: Number of OSDs | The number of the known storage daemons in a Ceph cluster. |
DEPENDENT | ceph.numosd Preprocessing: - JSONPATH: - DISCARD UNCHANGED_HEARTBEAT:10m |
Ceph | Ceph: Number of OSDs in state: UP | The total number of the online storage daemons in a Ceph cluster. |
DEPENDENT | ceph.numosdup Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Ceph | Ceph: Number of OSDs in state: IN | The total number of the participating storage daemons in a Ceph cluster. |
DEPENDENT | ceph.numosdin Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Ceph | Ceph: Ceph OSD avg fill | The average fill of OSDs. |
DEPENDENT | ceph.osd_fill.avg Preprocessing: - JSONPATH: |
Ceph | Ceph: Ceph OSD max fill | The percentage of the most filled OSD. |
DEPENDENT | ceph.osd_fill.max Preprocessing: - JSONPATH: |
Ceph | Ceph: Ceph OSD min fill | The percentage of the least filled OSD. |
DEPENDENT | ceph.osd_fill.min Preprocessing: - JSONPATH: |
Ceph | Ceph: Ceph OSD max PGs | The maximum number of Placement Groups on OSDs. |
DEPENDENT | ceph.osd_pgs.max Preprocessing: - JSONPATH: |
Ceph | Ceph: Ceph OSD min PGs | The minimum number of Placement Groups on OSDs. |
DEPENDENT | ceph.osd_pgs.min Preprocessing: - JSONPATH: |
Ceph | Ceph: Ceph OSD avg PGs | The average number of Placement Groups on OSDs. |
DEPENDENT | ceph.osd_pgs.avg Preprocessing: - JSONPATH: |
Ceph | Ceph: Ceph OSD Apply latency Avg | The average apply latency of OSDs. |
DEPENDENT | ceph.osdlatencyapply.avg Preprocessing: - JSONPATH: |
Ceph | Ceph: Ceph OSD Apply latency Max | The maximum apply latency of OSDs. |
DEPENDENT | ceph.osdlatencyapply.max Preprocessing: - JSONPATH: |
Ceph | Ceph: Ceph OSD Apply latency Min | The minimum apply latency of OSDs. |
DEPENDENT | ceph.osdlatencyapply.min Preprocessing: - JSONPATH: |
Ceph | Ceph: Ceph OSD Commit latency Avg | The average commit latency of OSDs. |
DEPENDENT | ceph.osdlatencycommit.avg Preprocessing: - JSONPATH: |
Ceph | Ceph: Ceph OSD Commit latency Max | The maximum commit latency of OSDs. |
DEPENDENT | ceph.osdlatencycommit.max Preprocessing: - JSONPATH: |
Ceph | Ceph: Ceph OSD Commit latency Min | The minimum commit latency of OSDs. |
DEPENDENT | ceph.osdlatencycommit.min Preprocessing: - JSONPATH: |
Ceph | Ceph: Ceph backfill full ratio | The backfill full ratio setting of the Ceph cluster as configured on OSDMap. |
DEPENDENT | ceph.osdbackfillfullratio Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Ceph | Ceph: Ceph full ratio | The full ratio setting of the Ceph cluster as configured on OSDMap. |
DEPENDENT | ceph.osdfullratio Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Ceph | Ceph: Ceph nearfull ratio | The near full ratio setting of the Ceph cluster as configured on OSDMap. |
DEPENDENT | ceph.osdnearfullratio Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Ceph | Ceph: [osd.{#OSDNAME}] OSD in | DEPENDENT | ceph.osd[{#OSDNAME},in] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
|
Ceph | Ceph: [osd.{#OSDNAME}] OSD up | DEPENDENT | ceph.osd[{#OSDNAME},up] Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
|
Ceph | Ceph: [osd.{#OSDNAME}] OSD PGs | DEPENDENT | ceph.osd[{#OSDNAME},numpgs] Preprocessing: - JSONPATH: ⛔️ON FAIL:DISCARD_VALUE -> |
|
Ceph | Ceph: [osd.{#OSDNAME}] OSD fill | DEPENDENT | ceph.osd[{#OSDNAME},fill] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
|
Ceph | Ceph: [osd.{#OSDNAME}] OSD latency apply | The time taken to flush an update to disks. |
DEPENDENT | ceph.osd[{#OSDNAME},latencyapply] Preprocessing: - JSONPATH: ⛔️ON FAIL:DISCARD_VALUE -> |
Ceph | Ceph: [osd.{#OSDNAME}] OSD latency commit | The time taken to commit an operation to the journal. |
DEPENDENT | ceph.osd[{#OSDNAME},latencycommit] Preprocessing: - JSONPATH: ⛔️ON FAIL:DISCARD_VALUE -> |
Ceph | Ceph: [{#POOLNAME}] Pool Used | The total bytes used in a pool. |
DEPENDENT | ceph.pool["{#POOLNAME}",bytes_used] Preprocessing: - JSONPATH: |
Ceph | Ceph: [{#POOLNAME}] Max available | The maximum available space in the given pool. |
DEPENDENT | ceph.pool["{#POOLNAME}",max_avail] Preprocessing: - JSONPATH: |
Ceph | Ceph: [{#POOLNAME}] Pool RAW Used | Bytes used in pool including the copies made. |
DEPENDENT | ceph.pool["{#POOLNAME}",stored_raw] Preprocessing: - JSONPATH: |
Ceph | Ceph: [{#POOLNAME}] Pool Percent Used | The percentage of the storage used per pool. |
DEPENDENT | ceph.pool["{#POOLNAME}",percent_used] Preprocessing: - JSONPATH: |
Ceph | Ceph: [{#POOLNAME}] Pool objects | The number of objects in the pool. |
DEPENDENT | ceph.pool["{#POOLNAME}",objects] Preprocessing: - JSONPATH: |
Ceph | Ceph: [{#POOLNAME}] Pool Read bandwidth | The read rate per pool (bytes per second). |
DEPENDENT | ceph.pool["{#POOLNAME}",rdbytes.rate] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Ceph | Ceph: [{#POOLNAME}] Pool Write bandwidth | The write rate per pool (bytes per second). |
DEPENDENT | ceph.pool["{#POOLNAME}",wrbytes.rate] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Ceph | Ceph: [{#POOLNAME}] Pool Read operations | The read rate per pool (operations per second). |
DEPENDENT | ceph.pool["{#POOLNAME}",rdops.rate] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Ceph | Ceph: [{#POOLNAME}] Pool Write operations | The write rate per pool (operations per second). |
DEPENDENT | ceph.pool["{#POOLNAME}",wrops.rate] Preprocessing: - JSONPATH: - CHANGE PER_SECOND |
Zabbix raw items | Ceph: Get overall cluster status | ZABBIX_PASSIVE | ceph.status["{$CEPH.CONNSTRING}","{$CEPH.USER}","{$CEPH.API.KEY}"] | |
Zabbix raw items | Ceph: Get OSD stats | ZABBIX_PASSIVE | ceph.osd.stats["{$CEPH.CONNSTRING}","{$CEPH.USER}","{$CEPH.API.KEY}"] | |
Zabbix raw items | Ceph: Get OSD dump | ZABBIX_PASSIVE | ceph.osd.dump["{$CEPH.CONNSTRING}","{$CEPH.USER}","{$CEPH.API.KEY}"] | |
Zabbix raw items | Ceph: Get df | ZABBIX_PASSIVE | ceph.df.details["{$CEPH.CONNSTRING}","{$CEPH.USER}","{$CEPH.API.KEY}"] |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Ceph: Can not connect to cluster | The connection to the Ceph RESTful module is broken (this covers any error, including AUTH and configuration issues). |
last(/Ceph by Zabbix agent 2/ceph.ping["{$CEPH.CONNSTRING}","{$CEPH.USER}","{$CEPH.API.KEY}"])=0 |
AVERAGE | |
Ceph: Cluster in ERROR state | - |
last(/Ceph by Zabbix agent 2/ceph.overall_status)=2 |
AVERAGE | Manual close: YES |
Ceph: Cluster in WARNING state | - |
last(/Ceph by Zabbix agent 2/ceph.overall_status)=1 Recovery expression: last(/Ceph by Zabbix agent 2/ceph.overall_status)=0 |
WARNING | Manual close: YES Depends on: - Ceph: Cluster in ERROR state |
Ceph: Minimum monitor release version has changed | The Ceph version has changed. Ack to close manually. |
last(/Ceph by Zabbix agent 2/ceph.min_mon_release_name,#1)<>last(/Ceph by Zabbix agent 2/ceph.min_mon_release_name,#2) and length(last(/Ceph by Zabbix agent 2/ceph.min_mon_release_name))>0 |
INFO | Manual close: YES |
Ceph: OSD osd.{#OSDNAME} is down | OSD osd.{#OSDNAME} is marked "down" in the osdmap. The OSD daemon may have been stopped, or peer OSDs may be unable to reach the OSD over the network. |
last(/Ceph by Zabbix agent 2/ceph.osd[{#OSDNAME},up]) = 0 |
AVERAGE | |
Ceph: OSD osd.{#OSDNAME} is full | - |
min(/Ceph by Zabbix agent 2/ceph.osd[{#OSDNAME},fill],15m) > last(/Ceph by Zabbix agent 2/ceph.osd_full_ratio)*100 |
AVERAGE | |
Ceph: Ceph OSD osd.{#OSDNAME} is near full | - |
min(/Ceph by Zabbix agent 2/ceph.osd[{#OSDNAME},fill],15m) > last(/Ceph by Zabbix agent 2/ceph.osd_nearfull_ratio)*100 |
WARNING | Depends on: - Ceph: OSD osd.{#OSDNAME} is full |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
See Zabbix template operation for basic instructions.
Refer to the vendor documentation.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$ARANET.API.ENDPOINT} | Aranet Cloud API endpoint. |
https://aranet.cloud/api |
{$ARANET.API.PASSWORD} | Aranet Cloud password. |
<PUT YOUR PASSWORD> |
{$ARANET.API.SPACE_NAME} | Aranet Cloud organization name. |
<PUT YOUR SPACE NAME> |
{$ARANET.API.USERNAME} | Aranet Cloud username. |
<PUT YOUR USERNAME> |
{$ARANET.BATT.VOLTAGE.MIN.CRIT} | Battery voltage critical threshold. |
2 |
{$ARANET.BATT.VOLTAGE.MIN.WARN} | Battery voltage warning threshold. |
1 |
{$ARANET.CO2.MAX.CRIT} | CO2 critical threshold. |
1000 |
{$ARANET.CO2.MAX.WARN} | CO2 warning threshold. |
600 |
{$ARANET.HUMIDITY.MAX.WARN} | Maximum humidity threshold. |
70 |
{$ARANET.HUMIDITY.MIN.WARN} | Minimum humidity threshold. |
20 |
{$ARANET.LAST_UPDATE.MAX.WARN} | Data update delay threshold. |
1h |
{$ARANET.LLD.FILTER.GATEWAY_ID.MATCHES} | Filter of discoverable sensors by gateway id. |
.+ |
{$ARANET.LLD.FILTER.GATEWAY_NAME.MATCHES} | Filter of discoverable sensors by gateway name. |
.+ |
{$ARANET.LLD.FILTER.GATEWAY_NAME.NOT_MATCHES} | Filter to exclude discoverable sensors by gateway name. |
CHANGE_IF_NEEDED |
{$ARANET.LLD.FILTER.SENSOR_ID.MATCHES} | Filter of discoverable sensors by id. |
.+ |
{$ARANET.LLD.FILTER.SENSOR_NAME.MATCHES} | Filter of discoverable sensors by name. |
.+ |
{$ARANET.LLD.FILTER.SENSOR_NAME.NOT_MATCHES} | Filter to exclude discoverable sensors by name. |
CHANGE_IF_NEEDED |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Atmospheric pressure discovery | Discovery for Aranet Cloud atmospheric pressure sensors |
DEPENDENT | aranet.pressure.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Battery voltage discovery | Discovery for Aranet Cloud Battery voltage sensors |
DEPENDENT | aranet.battery.voltage.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
CO2 discovery | Discovery for Aranet Cloud CO2 sensors |
DEPENDENT | aranet.co2.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Current discovery | Discovery for Aranet Cloud Current sensors |
DEPENDENT | aranet.current.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Differential Pressure discovery | Discovery for Aranet Cloud Differential Pressure sensors |
DEPENDENT | aranet.diffpressure.discovery Filter: AND- {#SENSOR NAME} MATCHESREGEX{$ARANET.LLD.FILTER.SENSOR_NAME.MATCHES} - {#SENSOR NAME} NOTMATCHESREGEX{$ARANET.LLD.FILTER.SENSOR_NAME.NOT_MATCHES} - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHESREGEX |
Distance discovery | Discovery for Aranet Cloud Distance sensors |
DEPENDENT | aranet.distance.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Humidity discovery | Discovery for Aranet Cloud humidity sensors |
DEPENDENT | aranet.humidity.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Illuminance discovery | Discovery for Aranet Cloud Illuminance sensors |
DEPENDENT | aranet.illuminance.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Last update discovery | Discovery for Aranet Cloud Last update metric |
DEPENDENT | aranet.lastupdate.discovery Filter: AND- {#SENSOR NAME} MATCHESREGEX{$ARANET.LLD.FILTER.SENSOR_NAME.MATCHES} - {#SENSOR NAME} NOTMATCHESREGEX{$ARANET.LLD.FILTER.SENSOR_NAME.NOT_MATCHES} - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHESREGEX |
pH discovery | Discovery for Aranet Cloud pH sensors |
DEPENDENT | aranet.ph.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Pore Electrical Conductivity discovery | Discovery for Aranet Cloud Pore Electrical Conductivity sensors |
DEPENDENT | aranet.poreelectriccond.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
PPFD discovery | Discovery for Aranet Cloud PPFD sensors |
DEPENDENT | aranet.ppfd.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Pulses Cumulative discovery | Discovery for Aranet Cloud Pulses Cumulative sensors |
DEPENDENT | aranet.pulsescumulative.discovery Filter: AND- {#SENSOR NAME} MATCHESREGEX{$ARANET.LLD.FILTER.SENSOR_NAME.MATCHES} - {#SENSOR NAME} NOTMATCHESREGEX{$ARANET.LLD.FILTER.SENSOR_NAME.NOT_MATCHES} - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHESREGEX |
Pulses discovery | Discovery for Aranet Cloud Pulses sensors |
DEPENDENT | aranet.pulses.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
RSSI discovery | Discovery for Aranet Cloud RSSI sensors |
DEPENDENT | aranet.rssi.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Soil Dielectric Permittivity discovery | Discovery for Aranet Cloud Soil Dielectric Permittivity sensors |
DEPENDENT | aranet.soildielectricperm.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Soil Electrical Conductivity discovery | Discovery for Aranet Cloud Soil Electrical Conductivity sensors |
DEPENDENT | aranet.soilelectriccond.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Temperature discovery | Discovery for Aranet Cloud temperature sensors |
DEPENDENT | aranet.temp.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Voltage discovery | Discovery for Aranet Cloud Voltage sensors |
DEPENDENT | aranet.voltage.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Volumetric Water Content discovery | Discovery for Aranet Cloud Volumetric Water Content sensors |
DEPENDENT | aranet.volumwatercontent.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Weight discovery | Discovery for Aranet Cloud Weight sensors |
DEPENDENT | aranet.weight.discovery Filter: AND- {#SENSORNAME} MATCHESREGEX - {#SENSORNAME} NOTMATCHESREGEX - {#SENSORID} MATCHESREGEX - {#GATEWAYNAME} MATCHESREGEX - {#GATEWAYNAME} NOTMATCHESREGEX - {#GATEWAYID} MATCHESREGEX - {#METRIC} MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.temp["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.humidity["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.rssi["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.battery.voltage["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.co2["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.pressure["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.voltage["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.weight["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.volumetric.water.content["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.ppfd["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.distance["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.illuminance["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.ph["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.current["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.soildielectricperm["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.soilelectriccond["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.poreelectriccond["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.pulses["{#GATEWAYID}", "{#SENSORID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.pulsescumulative["{#GATEWAYID}", "{#SENSOR_ID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.diffpressure["{#GATEWAYID}", "{#SENSOR_ID}"] Preprocessing: - JSONPATH: |
Aranet | {#METRIC}: [{#GATEWAYNAME}] {#SENSORNAME} | - |
DEPENDENT | aranet.lastupdate["{#GATEWAYID}", "{#SENSOR_ID}"] Preprocessing: - JSONPATH: - JAVASCRIPT: |
Zabbix raw items | Aranet: Sensors discovery | Discovery for Aranet Cloud sensors |
DEPENDENT | aranet.sensor.discovery Preprocessing: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
Zabbix raw items | Aranet: Get data | - |
SCRIPT | aranet.get_data Expression: The text is too long. Please see the template. |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
{#METRIC}: Low humidity on "[{#GATEWAYNAME}] {#SENSORNAME}" | max(/Aranet Cloud/aranet.humidity["{#GATEWAY_ID}", "{#SENSOR_ID}"],5m) < {$ARANET.HUMIDITY.MIN.WARN:"{#SENSOR_NAME}"} |
WARNING | Depends on: - {#METRIC}: High humidity on "[{#GATEWAYNAME}] {#SENSORNAME}" |
|
{#METRIC}: High humidity on "[{#GATEWAYNAME}] {#SENSORNAME}" | min(/Aranet Cloud/aranet.humidity["{#GATEWAY_ID}", "{#SENSOR_ID}"],5m) > {$ARANET.HUMIDITY.MAX.WARN:"{#SENSOR_NAME}"} |
HIGH | ||
{#METRIC}: Low battery voltage on "[{#GATEWAYNAME}] {#SENSORNAME}" | - |
max(/Aranet Cloud/aranet.battery.voltage["{#GATEWAY_ID}", "{#SENSOR_ID}"],5m) < {$ARANET.BATT.VOLTAGE.MIN.WARN:"{#SENSOR_NAME}"} |
WARNING | Depends on: - {#METRIC}: Critically low battery voltage on "[{#GATEWAYNAME}] {#SENSORNAME}" |
{#METRIC}: Critically low battery voltage on "[{#GATEWAYNAME}] {#SENSORNAME}" | - |
max(/Aranet Cloud/aranet.battery.voltage["{#GATEWAY_ID}", "{#SENSOR_ID}"],5m) < {$ARANET.BATT.VOLTAGE.MIN.CRIT:"{#SENSOR_NAME}"} |
HIGH | |
{#METRIC}: High CO2 level on "[{#GATEWAYNAME}] {#SENSORNAME}" | - |
min(/Aranet Cloud/aranet.co2["{#GATEWAY_ID}", "{#SENSOR_ID}"],5m) > {$ARANET.CO2.MAX.WARN:"{#SENSOR_NAME}"} |
WARNING | Depends on: - {#METRIC}: Critically high CO2 level on "[{#GATEWAYNAME}] {#SENSORNAME}" |
{#METRIC}: Critically high CO2 level on "[{#GATEWAYNAME}] {#SENSORNAME}" | - |
min(/Aranet Cloud/aranet.co2["{#GATEWAY_ID}", "{#SENSOR_ID}"],5m) > {$ARANET.CO2.MAX.CRIT:"{#SENSOR_NAME}"} |
HIGH | |
{#METRIC}: Sensor data "[{#GATEWAYNAME}] {#SENSORNAME}" is not updated | - |
last(/Aranet Cloud/aranet.last_update["{#GATEWAY_ID}", "{#SENSOR_ID}"]) > {$ARANET.LAST_UPDATE.MAX.WARN:"{#SENSOR_NAME}"} |
WARNING |
Please report any issues with the template at https://support.zabbix.com
For Zabbix version: 6.2 and higher
The template to monitor Apache HTTPD by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template Apache by HTTP
- collects metrics by polling mod_status with HTTP agent remotely:
127.0.0.1
ServerVersion: Apache/2.4.41 (Unix)
ServerMPM: event
Server Built: Aug 14 2019 00:35:10
CurrentTime: Friday, 16-Aug-2019 12:38:40 UTC
RestartTime: Wednesday, 14-Aug-2019 07:58:26 UTC
ParentServerConfigGeneration: 1
ParentServerMPMGeneration: 0
ServerUptimeSeconds: 189613
ServerUptime: 2 days 4 hours 40 minutes 13 seconds
Load1: 4.60
Load5: 1.20
Load15: 0.47
Total Accesses: 27860
Total kBytes: 33011
Total Duration: 54118
CPUUser: 18.02
CPUSystem: 31.76
CPUChildrenUser: 0
CPUChildrenSystem: 0
CPULoad: .0262535
Uptime: 189613
ReqPerSec: .146931
BytesPerSec: 178.275
BytesPerReq: 1213.33
DurationPerReq: 1.9425
BusyWorkers: 7
IdleWorkers: 93
Processes: 4
Stopping: 0
BusyWorkers: 7
IdleWorkers: 93
ConnsTotal: 13
ConnsAsyncWriting: 0
ConnsAsyncKeepAlive: 5
ConnsAsyncClosing: 0
Scoreboard: __________________________________________W_____________W___________________LW_____W______W_W_______............................................................................................................................................................................................................................................................................................................
This template was tested on:
See Zabbix template operation for basic instructions.
Setup mod_status
Check module availability: httpd -M 2>/dev/null | grep status_module
Example configuration of Apache:
<Location "/server-status">
SetHandler server-status
Require host example.com
</Location>
If you use another path, then don't forget to change the {$APACHE.STATUS.PATH} macro.
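The status page polled by the template is plain key-value text (see the sample output above), so it is easy to inspect by hand. A minimal Python sketch of fetching and parsing it (the URL mirrors the default {$APACHE.STATUS.*} macro values; in the template itself this parsing is done by a JAVASCRIPT preprocessing step):

import urllib.request

url = "http://127.0.0.1:80/server-status?auto"  # scheme://host:port/path macros

status = {}
with urllib.request.urlopen(url, timeout=10) as resp:
    for line in resp.read().decode().splitlines():
        if ": " in line:
            key, value = line.split(": ", 1)
            status[key] = value

print(status["ServerVersion"])                       # cf. 'Apache: Version'
print(status["Total Accesses"])                      # counter behind 'Requests per second'
print(status["BusyWorkers"], status["IdleWorkers"])  # cf. the worker items

# Worker states are encoded in the Scoreboard string, one character per slot:
# "_" waiting, "W" sending reply, "." open slot with no current process, etc.
board = status["Scoreboard"]
print("sending reply:", board.count("W"), "waiting:", board.count("_"))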
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$APACHE.RESPONSE_TIME.MAX.WARN} | Maximum Apache response time in seconds for trigger expression |
10 |
{$APACHE.STATUS.PATH} | The URL path |
server-status?auto |
{$APACHE.STATUS.PORT} | The port of Apache status page |
80 |
{$APACHE.STATUS.SCHEME} | Request scheme which may be http or https |
http |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Event MPM discovery | Additional metrics if event MPM is used https://httpd.apache.org/docs/current/mod/event.html |
DEPENDENT | apache.mpm.event.discovery Preprocessing: - JAVASCRIPT: - DISCARDUNCHANGEDHEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Apache | Apache: Service ping | - |
SIMPLE | net.tcp.service[http,"{HOST.CONN}","{$APACHE.STATUS.PORT}"] Preprocessing: - DISCARDUNCHANGEDHEARTBEAT: |
Apache | Apache: Service response time | - |
SIMPLE | net.tcp.service.perf[http,"{HOST.CONN}","{$APACHE.STATUS.PORT}"] |
Apache | Apache: Total bytes | Total bytes served |
DEPENDENT | apache.bytes Preprocessing: - JSONPATH: - MULTIPLIER: |
Apache | Apache: Bytes per second | Calculated as a change rate for the 'Total bytes' stat. BytesPerSec is not used, as it counts the average since the last Apache server start. |
DEPENDENT | apache.bytes.rate Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGEPERSECOND |
Apache | Apache: Requests per second | Calculated as a change rate for the 'Total requests' stat. ReqPerSec is not used, as it counts the average since the last Apache server start. |
DEPENDENT | apache.requests.rate Preprocessing: - JSONPATH: - CHANGEPERSECOND |
Apache | Apache: Total requests | A total number of accesses |
DEPENDENT | apache.requests Preprocessing: - JSONPATH: |
Apache | Apache: Uptime | Service uptime in seconds |
DEPENDENT | apache.uptime Preprocessing: - JSONPATH: |
Apache | Apache: Version | Service version |
DEPENDENT | apache.version Preprocessing: - JSONPATH: - DISCARDUNCHANGEDHEARTBEAT: |
Apache | Apache: Total workers busy | Total number of busy worker threads/processes |
DEPENDENT | apache.workers_total.busy Preprocessing: - JSONPATH: |
Apache | Apache: Total workers idle | Total number of idle worker threads/processes |
DEPENDENT | apache.workers_total.idle Preprocessing: - JSONPATH: |
Apache | Apache: Workers closing connection | Number of workers in closing state |
DEPENDENT | apache.workers.closing Preprocessing: - JSONPATH: |
Apache | Apache: Workers DNS lookup | Number of workers in dnslookup state |
DEPENDENT | apache.workers.dnslookup Preprocessing: - JSONPATH: |
Apache | Apache: Workers finishing | Number of workers in finishing state |
DEPENDENT | apache.workers.finishing Preprocessing: - JSONPATH: |
Apache | Apache: Workers idle cleanup | Number of workers in cleanup state |
DEPENDENT | apache.workers.cleanup Preprocessing: - JSONPATH: |
Apache | Apache: Workers keepalive (read) | Number of workers in keepalive state |
DEPENDENT | apache.workers.keepalive Preprocessing: - JSONPATH: |
Apache | Apache: Workers logging | Number of workers in logging state |
DEPENDENT | apache.workers.logging Preprocessing: - JSONPATH: |
Apache | Apache: Workers reading request | Number of workers in reading state |
DEPENDENT | apache.workers.reading Preprocessing: - JSONPATH: |
Apache | Apache: Workers sending reply | Number of workers in sending state |
DEPENDENT | apache.workers.sending Preprocessing: - JSONPATH: |
Apache | Apache: Workers slot with no current process | Number of slots with no current process |
DEPENDENT | apache.workers.slot Preprocessing: - JSONPATH: |
Apache | Apache: Workers starting up | Number of workers in starting state |
DEPENDENT | apache.workers.starting Preprocessing: - JSONPATH: |
Apache | Apache: Workers waiting for connection | Number of workers in waiting state |
DEPENDENT | apache.workers.waiting Preprocessing: - JSONPATH: |
Apache | Apache: Connections async closing | Number of async connections in closing state (only applicable to event MPM) |
DEPENDENT | apache.connections[async_closing{#SINGLETON}] Preprocessing: - JSONPATH: |
Apache | Apache: Connections async keep alive | Number of async connections in keep-alive state (only applicable to event MPM) |
DEPENDENT | apache.connections[asynckeepalive{#SINGLETON}] Preprocessing: - JSONPATH: |
Apache | Apache: Connections async writing | Number of async connections in writing state (only applicable to event MPM) |
DEPENDENT | apache.connections[async_writing{#SINGLETON}] Preprocessing: - JSONPATH: |
Apache | Apache: Connections total | Number of total connections |
DEPENDENT | apache.connections[total{#SINGLETON}] Preprocessing: - JSONPATH: |
Apache | Apache: Bytes per request | Average number of bytes served per request |
DEPENDENT | apache.bytes[per_request{#SINGLETON}] Preprocessing: - JSONPATH: |
Apache | Apache: Number of async processes | Number of async processes |
DEPENDENT | apache.process[num{#SINGLETON}] Preprocessing: - JSONPATH: |
Zabbix raw items | Apache: Get status | Getting data from a machine-readable version of the Apache status page. https://httpd.apache.org/docs/current/mod/mod_status.html |
HTTP_AGENT | apache.get_status Preprocessing: - JAVASCRIPT: |
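Note how the rate items above are built: mod_status counters such as Total Accesses and Total kBytes only ever grow, so the template derives per-second rates from them with CHANGE_PER_SECOND preprocessing rather than trusting ReqPerSec/BytesPerSec, which are averages since the last server start. A minimal sketch of the same calculation (the sample values and the 60-second interval are made up for illustration):

# Two consecutive samples of the monotonically growing 'Total Accesses' counter.
prev_value, prev_ts = 27860, 1000.0
curr_value, curr_ts = 27995, 1060.0  # 60 seconds later

rate = (curr_value - prev_value) / (curr_ts - prev_ts)
print(round(rate, 3), "requests per second")  # what CHANGE_PER_SECOND yields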
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Apache: Service is down | - |
last(/Apache by HTTP/net.tcp.service[http,"{HOST.CONN}","{$APACHE.STATUS.PORT}"])=0 |
AVERAGE | Manual close: YES |
Apache: Service response time is too high | - |
min(/Apache by HTTP/net.tcp.service.perf[http,"{HOST.CONN}","{$APACHE.STATUS.PORT}"],5m)>{$APACHE.RESPONSE_TIME.MAX.WARN} |
WARNING | Manual close: YES Depends on: - Apache: Service is down |
Apache: has been restarted | Uptime is less than 10 minutes. |
last(/Apache by HTTP/apache.uptime)<10m |
INFO | Manual close: YES |
Apache: Version has changed | Apache version has changed. Ack to close. |
last(/Apache by HTTP/apache.version,#1)<>last(/Apache by HTTP/apache.version,#2) and length(last(/Apache by HTTP/apache.version))>0 |
INFO | Manual close: YES |
Apache: Failed to fetch status page | Zabbix has not received data for items for the last 30 minutes. |
nodata(/Apache by HTTP/apache.get_status,30m)=1 |
WARNING | Manual close: YES Depends on: - Apache: Service is down |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.
For Zabbix version: 6.2 and higher.
This template is developed to monitor Apache HTTPD by Zabbix that works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template Apache by Zabbix agent collects metrics by polling the Apache Status module locally (127.0.0.1) with Zabbix agent:
ServerVersion: Apache/2.4.41 (Unix)
ServerMPM: event
Server Built: Aug 14 2019 00:35:10
CurrentTime: Friday, 16-Aug-2019 12:38:40 UTC
RestartTime: Wednesday, 14-Aug-2019 07:58:26 UTC
ParentServerConfigGeneration: 1
ParentServerMPMGeneration: 0
ServerUptimeSeconds: 189613
ServerUptime: 2 days 4 hours 40 minutes 13 seconds
Load1: 4.60
Load5: 1.20
Load15: 0.47
Total Accesses: 27860
Total kBytes: 33011
Total Duration: 54118
CPUUser: 18.02
CPUSystem: 31.76
CPUChildrenUser: 0
CPUChildrenSystem: 0
CPULoad: .0262535
Uptime: 189613
ReqPerSec: .146931
BytesPerSec: 178.275
BytesPerReq: 1213.33
DurationPerReq: 1.9425
BusyWorkers: 7
IdleWorkers: 93
Processes: 4
Stopping: 0
BusyWorkers: 7
IdleWorkers: 93
ConnsTotal: 13
ConnsAsyncWriting: 0
ConnsAsyncKeepAlive: 5
ConnsAsyncClosing: 0
Scoreboard: __________________________________________W_____________W___________________LW_____W______W_W_______............................................................................................................................................................................................................................................................................................................
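The template converts this machine-readable output into JSON with a JavaScript preprocessing step (the "Apache: Get status" item below). The following Python sketch is illustrative only and mirrors that transformation, including a rough tally of scoreboard states:

```python
import json

def parse_status(raw: str) -> dict:
    """Parse the 'key: value' lines of server-status?auto output.

    Illustrative sketch only: the template ships its own JavaScript
    preprocessing; this just mirrors the idea.
    """
    status = {}
    for line in raw.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that do not have a 'key: value' shape
            # Duplicate keys (BusyWorkers appears twice) keep the last value.
            status[key.strip()] = value.strip()
    # The scoreboard is one character per worker slot, e.g.
    # '_' waiting for connection, 'W' sending reply, '.' open slot.
    board = status.get("Scoreboard", "")
    status["Workers"] = {
        "waiting": board.count("_"),
        "sending": board.count("W"),
        "slot": board.count("."),
    }
    return status

sample = "BusyWorkers: 7\nIdleWorkers: 93\nScoreboard: __W."
print(json.dumps(parse_status(sample), indent=2))
```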
It also uses Zabbix agent to collect Apache Linux process statistics, such as CPU usage, memory usage, and whether the process is running.
This template was tested on:
See Zabbix template operation for basic instructions.
See the setup instructions for the Apache Status module.
Check the availability of the module with this command line: httpd -M 2>/dev/null | grep status_module
This is an example configuration of the Apache web server:
<Location "/server-status">
SetHandler server-status
Require host example.com
</Location>
If you use another path, then do not forget to change the {$APACHE.STATUS.PATH} macro.
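Before linking the template, you can verify that the status page is reachable and returns the machine-readable format. Below is a small Python check (a hypothetical helper; the URL assumes the default {$APACHE.STATUS.*} macro values):

```python
from urllib.request import urlopen

# Assumes the default macro values: scheme http, host 127.0.0.1,
# port 80, path server-status?auto. Adjust to your configuration.
url = "http://127.0.0.1:80/server-status?auto"

with urlopen(url, timeout=5) as resp:
    assert resp.status == 200, "status page is not reachable"
    body = resp.read().decode()

# The ?auto format is line-oriented and always carries a Scoreboard line.
assert "Scoreboard:" in body, "not the machine-readable (?auto) output"
print("OK:", body.splitlines()[0])
```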
Install and set up Zabbix agent.
No specific Zabbix configuration is required.
Name | Description | Default |
---|---|---|
{$APACHE.PROCESS_NAME} | The process name of the Apache web server. |
`(httpd|apache2)` |
{$APACHE.RESPONSE_TIME.MAX.WARN} | The maximum Apache response time expressed in seconds for a trigger expression. |
10 |
{$APACHE.STATUS.HOST} | The hostname or an IP address of the Apache status page. |
127.0.0.1 |
{$APACHE.STATUS.PATH} | The URL path of the Apache status page. |
server-status?auto |
{$APACHE.STATUS.PORT} | The port of the Apache status page. |
80 |
{$APACHE.STATUS.SCHEME} | The request scheme, which may be either HTTP or HTTPS. |
http |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Apache process discovery | The discovery of the Apache process summary. |
DEPENDENT | apache.proc.discovery Filter: AND- {#NAME} MATCHES_REGEX |
Event MPM discovery | The discovery of additional metrics if the event Multi-Processing Module (MPM) is used. For more details see Apache MPM event. |
DEPENDENT | apache.mpm.event.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Apache | Apache: Service ping | - |
ZABBIX_PASSIVE | net.tcp.service[http,"{$APACHE.STATUS.HOST}","{$APACHE.STATUS.PORT}"] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Apache | Apache: Service response time | - |
ZABBIX_PASSIVE | net.tcp.service.perf[http,"{$APACHE.STATUS.HOST}","{$APACHE.STATUS.PORT}"] |
Apache | Apache: Total bytes | The total bytes served. |
DEPENDENT | apache.bytes Preprocessing: - JSONPATH: - MULTIPLIER: |
Apache | Apache: Bytes per second | Calculated as the rate of change of the "Total bytes" statistic (see the worked rate example after this table). |
DEPENDENT | apache.bytes.rate Preprocessing: - JSONPATH: - MULTIPLIER: - CHANGE_PER_SECOND |
Apache | Apache: Requests per second | Calculated as the rate of change of the "Total requests" statistic. |
DEPENDENT | apache.requests.rate Preprocessing: - JSONPATH: - CHANGE_PER_SECOND |
Apache | Apache: Total requests | The total number of the Apache server accesses. |
DEPENDENT | apache.requests Preprocessing: - JSONPATH: |
Apache | Apache: Uptime | The service uptime expressed in seconds. |
DEPENDENT | apache.uptime Preprocessing: - JSONPATH: |
Apache | Apache: Version | The Apache service version. |
DEPENDENT | apache.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Apache | Apache: Total workers busy | The total number of busy worker threads/processes. |
DEPENDENT | apache.workers_total.busy Preprocessing: - JSONPATH: |
Apache | Apache: Total workers idle | The total number of idle worker threads/processes. |
DEPENDENT | apache.workers_total.idle Preprocessing: - JSONPATH: |
Apache | Apache: Workers closing connection | The number of workers in closing state. |
DEPENDENT | apache.workers.closing Preprocessing: - JSONPATH: |
Apache | Apache: Workers DNS lookup | The number of workers in dnslookup state. |
DEPENDENT | apache.workers.dnslookup Preprocessing: - JSONPATH: |
Apache | Apache: Workers finishing | The number of workers in finishing state. |
DEPENDENT | apache.workers.finishing Preprocessing: - JSONPATH: |
Apache | Apache: Workers idle cleanup | The number of workers in cleanup state. |
DEPENDENT | apache.workers.cleanup Preprocessing: - JSONPATH: |
Apache | Apache: Workers keepalive (read) | The number of workers in keepalive state. |
DEPENDENT | apache.workers.keepalive Preprocessing: - JSONPATH: |
Apache | Apache: Workers logging | The number of workers in logging state. |
DEPENDENT | apache.workers.logging Preprocessing: - JSONPATH: |
Apache | Apache: Workers reading request | The number of workers in reading state. |
DEPENDENT | apache.workers.reading Preprocessing: - JSONPATH: |
Apache | Apache: Workers sending reply | The number of workers in sending state. |
DEPENDENT | apache.workers.sending Preprocessing: - JSONPATH: |
Apache | Apache: Workers slot with no current process | The number of slots with no current process. |
DEPENDENT | apache.workers.slot Preprocessing: - JSONPATH: |
Apache | Apache: Workers starting up | The number of workers in starting state. |
DEPENDENT | apache.workers.starting Preprocessing: - JSONPATH: |
Apache | Apache: Workers waiting for connection | The number of workers in waiting state. |
DEPENDENT | apache.workers.waiting Preprocessing: - JSONPATH: |
Apache | Apache: Get processes summary | The aggregated data of summary metrics for all processes. |
ZABBIX_PASSIVE | proc.get[,,,summary] |
Apache | Apache: CPU utilization | The percentage of the CPU utilization by a process {#NAME}. |
ZABBIX_PASSIVE | proc.cpu.util[{#NAME}] |
Apache | Apache: Get process data | The summary metrics aggregated by a process {#NAME}. |
DEPENDENT | apache.proc.get[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Apache | Apache: Memory usage (rss) | The summary of resident set size memory used by a process {#NAME} expressed in bytes. |
DEPENDENT | apache.proc.rss[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Apache | Apache: Memory usage (vsize) | The summary of virtual memory used by a process {#NAME} expressed in bytes. |
DEPENDENT | apache.proc.vmem[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Apache | Apache: Memory usage, % | The percentage of real memory used by a process {#NAME}. |
DEPENDENT | apache.proc.pmem[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: |
Apache | Apache: Number of running processes | The number of running processes {#NAME}. |
DEPENDENT | apache.proc.num[{#NAME}] Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
Apache | Apache: Connections async closing | The number of asynchronous connections in closing state (applicable only to the event MPM). |
DEPENDENT | apache.connections[async_closing{#SINGLETON}] Preprocessing: - JSONPATH: |
Apache | Apache: Connections async keepalive | The number of asynchronous connections in keepalive state (applicable only to the event MPM). |
DEPENDENT | apache.connections[async_keepalive{#SINGLETON}] Preprocessing: - JSONPATH: |
Apache | Apache: Connections async writing | The number of asynchronous connections in writing state (applicable only to the event MPM). |
DEPENDENT | apache.connections[async_writing{#SINGLETON}] Preprocessing: - JSONPATH: |
Apache | Apache: Connections total | The number of total connections. |
DEPENDENT | apache.connections[total{#SINGLETON}] Preprocessing: - JSONPATH: |
Apache | Apache: Bytes per request | The average number of bytes served per request. |
DEPENDENT | apache.bytes[per_request{#SINGLETON}] Preprocessing: - JSONPATH: |
Apache | Apache: Number of async processes | The number of asynchronous processes. |
DEPENDENT | apache.process[num{#SINGLETON}] Preprocessing: - JSONPATH: |
Zabbix raw items | Apache: Get status | Getting data from a machine-readable version of the Apache status page. For more information see Apache Module mod_status. |
ZABBIX_PASSIVE | web.page.get["{$APACHE.STATUS.SCHEME}://{$APACHE.STATUS.HOST}:{$APACHE.STATUS.PORT}/{$APACHE.STATUS.PATH}"] Preprocessing: - JAVASCRIPT: |
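The two rate items above ("Apache: Bytes per second" and "Apache: Requests per second") rely on the CHANGE_PER_SECOND preprocessing step, which divides the delta of a growing counter by the seconds elapsed between polls. A worked sketch of the arithmetic (Zabbix performs this server-side; the first value reuses the Total kBytes figure from the sample output above, the second poll is hypothetical):

```python
def change_per_second(prev_value: float, prev_ts: float,
                      value: float, ts: float) -> float:
    """CHANGE_PER_SECOND: (new value - old value) / seconds elapsed."""
    return (value - prev_value) / (ts - prev_ts)

# Two polls of 'Total kBytes' 60 s apart, first converted to bytes by
# the MULTIPLIER (1024) step, as in 'Apache: Bytes per second':
print(change_per_second(33011 * 1024, 0, 33191 * 1024, 60))  # 3072.0 bytes/s
```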
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Apache: Has been restarted | Uptime is less than 10 minutes. |
last(/Apache by Zabbix agent/apache.uptime)<10m |
INFO | Manual close: YES |
Apache: Version has changed | The Apache version has changed. Acknowledge (Ack) to close manually. |
last(/Apache by Zabbix agent/apache.version,#1)<>last(/Apache by Zabbix agent/apache.version,#2) and length(last(/Apache by Zabbix agent/apache.version))>0 |
INFO | Manual close: YES |
Apache: Process is not running | - |
last(/Apache by Zabbix agent/apache.proc.num[{#NAME}])=0 |
HIGH | |
Apache: Failed to fetch status page | Zabbix has not received any data for items for the last 30 minutes. |
nodata(/Apache by Zabbix agent/web.page.get["{$APACHE.STATUS.SCHEME}://{$APACHE.STATUS.HOST}:{$APACHE.STATUS.PORT}/{$APACHE.STATUS.PATH}"],30m)=1 and last(/Apache by Zabbix agent/apache.proc.num[{#NAME}])>0 |
WARNING | Manual close: YES Depends on: - Apache: Service is down |
Apache: Service is down | - |
last(/Apache by Zabbix agent/net.tcp.service[http,"{$APACHE.STATUS.HOST}","{$APACHE.STATUS.PORT}"])=0 and last(/Apache by Zabbix agent/apache.proc.num[{#NAME}])>0 |
AVERAGE | Manual close: YES |
Apache: Service response time is too high | - |
min(/Apache by Zabbix agent/net.tcp.service.perf[http,"{$APACHE.STATUS.HOST}","{$APACHE.STATUS.PORT}"],5m)>{$APACHE.RESPONSE_TIME.MAX.WARN} and last(/Apache by Zabbix agent/apache.proc.num[{#NAME}])>0 |
WARNING | Manual close: YES Depends on: - Apache: Service is down |
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.
For Zabbix version: 6.2 and higher
Official JMX Template for Apache ActiveMQ.
This template was tested on:
See Zabbix template operation for basic instructions.
Metrics are collected by JMX.
No specific Zabbix configuration is required.
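Before linking the template, it can help to confirm that the broker's JMX port is reachable from the host running the Zabbix Java gateway. The check below is a minimal sketch: it validates TCP connectivity only, not JMX authentication, and the hostname is a placeholder:

```python
import socket

# Assumed broker host (placeholder); 1099 is the {$ACTIVEMQ.PORT} default.
host, port = "activemq.example.com", 1099

try:
    with socket.create_connection((host, port), timeout=5):
        print(f"JMX port {port} on {host} is reachable")
except OSError as exc:
    print(f"cannot reach {host}:{port}: {exc}")
```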
Name | Description | Default |
---|---|---|
{$ACTIVEMQ.BROKER.CONSUMERS.MIN.HIGH} | Minimum amount of consumers for broker. Can be used with broker name as context. |
1 |
{$ACTIVEMQ.BROKER.CONSUMERS.MIN.TIME} | Time during which there may be no consumers on destination. Can be used with broker name as context. |
5m |
{$ACTIVEMQ.BROKER.PRODUCERS.MIN.HIGH} | Minimum amount of producers for broker. Can be used with broker name as context. |
1 |
{$ACTIVEMQ.BROKER.PRODUCERS.MIN.TIME} | Time during which there may be no producers on broker. Can be used with broker name as context. |
5m |
{$ACTIVEMQ.DESTINATION.CONSUMERS.MIN.HIGH} | Minimum amount of consumers for destination. Can be used with destination name as context. |
1 |
{$ACTIVEMQ.DESTINATION.CONSUMERS.MIN.TIME} | Time during which there may be no consumers in destination. Can be used with destination name as context. |
10m |
{$ACTIVEMQ.DESTINATION.PRODUCERS.MIN.HIGH} | Minimum amount of producers for destination. Can be used with destination name as context. |
1 |
{$ACTIVEMQ.DESTINATION.PRODUCERS.MIN.TIME} | Time during which there may be no producers on destination. Can be used with destination name as context. |
10m |
{$ACTIVEMQ.EXPIRED.WARN} | Threshold for expired messages count. Can be used with destination name as context. |
0 |
{$ACTIVEMQ.LLD.FILTER.BROKER.MATCHES} | Filter to include discovered brokers |
.* |
{$ACTIVEMQ.LLD.FILTER.BROKER.NOT_MATCHES} | Filter to exclude discovered brokers |
CHANGE IF NEEDED |
{$ACTIVEMQ.LLD.FILTER.DESTINATION.MATCHES} | Filter to include discovered destinations |
.* |
{$ACTIVEMQ.LLD.FILTER.DESTINATION.NOT_MATCHES} | Filter to exclude discovered destinations |
CHANGE IF NEEDED |
{$ACTIVEMQ.MEM.MAX.HIGH} | Memory threshold for HIGH trigger. Can be used with destination or broker name as context. |
90 |
{$ACTIVEMQ.MEM.MAX.WARN} | Memory threshold for AVERAGE trigger. Can be used with destination or broker name as context. |
75 |
{$ACTIVEMQ.MEM.TIME} | Time during which the metric can be above the threshold. Can be used with destination or broker name as context. |
5m |
{$ACTIVEMQ.MSG.RATE.WARN.TIME} | The time for message enqueue/dequeue rate. Can be used with destination or broker name as context. |
15m |
{$ACTIVEMQ.PASSWORD} | Password for JMX |
activemq |
{$ACTIVEMQ.PORT} | Port for JMX |
1099 |
{$ACTIVEMQ.QUEUE.ENABLED} | Use this to disable alerting for specific destination. 1 = enabled, 0 = disabled. Can be used with destination name as context. |
1 |
{$ACTIVEMQ.QUEUE.TIME} | Time during which the QueueSize can be higher than threshold. Can be used with destination name as context. |
10m |
{$ACTIVEMQ.QUEUE.WARN} | Threshold for QueueSize. Can be used with destination name as context. |
100 |
{$ACTIVEMQ.STORE.MAX.HIGH} | Storage threshold for HIGH trigger. Can be used with broker name as context. |
90 |
{$ACTIVEMQ.STORE.MAX.WARN} | Storage threshold for AVERAGE trigger. Can be used with broker name as context. |
75 |
{$ACTIVEMQ.STORE.TIME} | Time during which the metric can be above the threshold. Can be used with destination or broker name as context. |
5m |
{$ACTIVEMQ.TEMP.MAX.HIGH} | Temp threshold for HIGH trigger. Can be used with broker name as context. |
90 |
{$ACTIVEMQ.TEMP.MAX.WARN} | Temp threshold for AVERAGE trigger. Can be used with broker name as context. |
75 |
{$ACTIVEMQ.TEMP.TIME} | Time during which the metric can be above the threshold. Can be used with destination or broker name as context. |
5m |
{$ACTIVEMQ.TOTAL.CONSUMERS.COUNT} | Attribute for TotalConsumerCount per destination. Used to suppress destination's triggers when the count of consumers on the broker is lower than threshold. |
TotalConsumerCount |
{$ACTIVEMQ.TOTAL.PRODUCERS.COUNT} | Attribute for TotalProducerCount per destination. Used to suppress destination's triggers when the count of producers on the broker is lower than threshold (see the sketch after this table). |
TotalProducerCount |
{$ACTIVEMQ.USER} | User for JMX |
admin |
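The {$ACTIVEMQ.TOTAL.CONSUMERS.COUNT} and {$ACTIVEMQ.TOTAL.PRODUCERS.COUNT} macros feed the suppression logic of the destination-level triggers: a destination alert fires only while the broker as a whole still has consumers (or producers), so a broker-wide outage raises a single broker trigger instead of one alert per destination. A Python sketch of the condition, using the default thresholds (illustrative; the real check is the Zabbix trigger expression shown further below):

```python
def destination_consumers_low(dest_consumers: int,
                              broker_total_consumers: int,
                              dest_min: int = 1,    # {$ACTIVEMQ.DESTINATION.CONSUMERS.MIN.HIGH}
                              broker_min: int = 1,  # {$ACTIVEMQ.BROKER.CONSUMERS.MIN.HIGH}
                              ) -> bool:
    """Mirror of the destination 'Consumers count is too low' condition."""
    return dest_consumers < dest_min and broker_total_consumers > broker_min

print(destination_consumers_low(0, 5))  # True: only this destination lost its consumers
print(destination_consumers_low(0, 0))  # False: broker-wide issue, handled by the broker trigger
```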
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
Brokers discovery | Discovery of brokers |
JMX | jmx.discovery[beans,"org.apache.activemq:type=Broker,brokerName=*"] Filter: FORMULA A and B- {#JMXBROKERNAME} MATCHES_REGEX - {#JMXBROKERNAME} NOT_MATCHES_REGEX |
Destinations discovery | Discovery of destinations |
JMX | jmx.discovery[beans,"org.apache.activemq:type=Broker,brokerName=*,destinationType=*,destinationName=*"] Filter: FORMULA A and B- {#JMXDESTINATIONNAME} MATCHES_REGEX - {#JMXDESTINATIONNAME} NOT_MATCHES_REGEX |
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
ActiveMQ | Broker {#JMXBROKERNAME}: Version | The version of the broker. |
JMX | jmx[{#JMXOBJ},BrokerVersion] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
ActiveMQ | Broker {#JMXBROKERNAME}: Uptime | The uptime of the broker. |
JMX | jmx[{#JMXOBJ},UptimeMillis] Preprocessing: - MULTIPLIER: |
ActiveMQ | Broker {#JMXBROKERNAME}: Memory limit | Memory limit, in bytes, used for holding undelivered messages before paging to temporary storage. |
JMX | jmx[{#JMXOBJ},MemoryLimit] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
ActiveMQ | Broker {#JMXBROKERNAME}: Memory usage in percents | Percent of memory limit used. |
JMX | jmx[{#JMXOBJ}, MemoryPercentUsage] |
ActiveMQ | Broker {#JMXBROKERNAME}: Storage limit | Disk limit, in bytes, used for persistent messages before producers are blocked. |
JMX | jmx[{#JMXOBJ},StoreLimit] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
ActiveMQ | Broker {#JMXBROKERNAME}: Storage usage in percents | Percent of store limit used. |
JMX | jmx[{#JMXOBJ},StorePercentUsage] |
ActiveMQ | Broker {#JMXBROKERNAME}: Temp limit | Disk limit, in bytes, used for non-persistent messages and temporary data before producers are blocked. |
JMX | jmx[{#JMXOBJ},TempLimit] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
ActiveMQ | Broker {#JMXBROKERNAME}: Temp usage in percents | Percent of temp limit used. |
JMX | jmx[{#JMXOBJ},TempPercentUsage] |
ActiveMQ | Broker {#JMXBROKERNAME}: Messages enqueue rate | Rate of messages that have been sent to the broker. |
JMX | jmx[{#JMXOBJ},TotalEnqueueCount] Preprocessing: - CHANGE_PER_SECOND |
ActiveMQ | Broker {#JMXBROKERNAME}: Messages dequeue rate | Rate of messages that have been delivered by the broker and acknowledged by consumers. |
JMX | jmx[{#JMXOBJ},TotalDequeueCount] Preprocessing: - CHANGE_PER_SECOND |
ActiveMQ | Broker {#JMXBROKERNAME}: Consumers count total | Number of consumers attached to this broker. |
JMX | jmx[{#JMXOBJ},TotalConsumerCount] |
ActiveMQ | Broker {#JMXBROKERNAME}: Producers count total | Number of producers attached to this broker. |
JMX | jmx[{#JMXOBJ},TotalProducerCount] |
ActiveMQ | {#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Consumers count | Number of consumers attached to this destination. |
JMX | jmx[{#JMXOBJ},ConsumerCount] |
ActiveMQ | {#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Consumers count total on {#JMXBROKERNAME} | Number of consumers attached to the broker of this destination. Used to suppress destination's triggers when the count of consumers on the broker is lower than threshold. |
JMX | jmx["org.apache.activemq:type=Broker,brokerName={#JMXBROKERNAME}",{$ACTIVEMQ.TOTAL.CONSUMERS.COUNT: "{#JMXDESTINATIONNAME}"}] Preprocessing: - INRANGE: ⛔️ONFAIL: - DISCARDUNCHANGEDHEARTBEAT: |
ActiveMQ | {#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Producers count | Number of producers attached to this destination. |
JMX | jmx[{#JMXOBJ},ProducerCount] |
ActiveMQ | {#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Producers count total on {#JMXBROKERNAME} | Number of producers attached to the broker of this destination. Used to suppress destination's triggers when the count of producers on the broker is lower than threshold. |
JMX | jmx["org.apache.activemq:type=Broker,brokerName={#JMXBROKERNAME}",{$ACTIVEMQ.TOTAL.PRODUCERS.COUNT: "{#JMXDESTINATIONNAME}"}] Preprocessing: - INRANGE: ⛔️ONFAIL: - DISCARDUNCHANGEDHEARTBEAT: |
ActiveMQ | {#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Memory usage in percents | The percentage of the memory limit used. |
JMX | jmx[{#JMXOBJ},MemoryPercentUsage] |
ActiveMQ | {#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Messages enqueue rate | Rate of messages that have been sent to the destination. |
JMX | jmx[{#JMXOBJ},EnqueueCount] Preprocessing: - CHANGE_PER_SECOND |
ActiveMQ | {#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Messages dequeue rate | Rate of messages that have been acknowledged (and removed) from the destination. |
JMX | jmx[{#JMXOBJ},DequeueCount] Preprocessing: - CHANGE_PER_SECOND |
ActiveMQ | {#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Queue size | Number of messages on this destination, including any that have been dispatched but not acknowledged. |
JMX | jmx[{#JMXOBJ},QueueSize] |
ActiveMQ | {#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Expired messages count | Number of messages that have been expired. |
JMX | jmx[{#JMXOBJ},ExpiredCount] Preprocessing: - DISCARD_UNCHANGED_HEARTBEAT: |
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Broker {#JMXBROKERNAME}: Version has changed | Broker {#JMXBROKERNAME} version has changed. Ack to close. |
last(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},BrokerVersion],#1)<>last(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},BrokerVersion],#2) and length(last(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},BrokerVersion]))>0 |
INFO | Manual close: YES |
Broker {#JMXBROKERNAME}: Broker has been restarted | Uptime is less than 10 minutes. |
last(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},UptimeMillis])<10m |
INFO | Manual close: YES |
Broker {#JMXBROKERNAME}: Memory usage is too high | - |
min(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ}, MemoryPercentUsage],{$ACTIVEMQ.MEM.TIME:"{#JMXBROKERNAME}"})>{$ACTIVEMQ.MEM.MAX.WARN:"{#JMXBROKERNAME}"} |
AVERAGE | Depends on: - Broker {#JMXBROKERNAME}: Memory usage is too high |
Broker {#JMXBROKERNAME}: Memory usage is too high | - |
min(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ}, MemoryPercentUsage],{$ACTIVEMQ.MEM.TIME:"{#JMXBROKERNAME}"})>{$ACTIVEMQ.MEM.MAX.HIGH:"{#JMXBROKERNAME}"} |
HIGH | |
Broker {#JMXBROKERNAME}: Storage usage is too high | - |
min(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},StorePercentUsage],{$ACTIVEMQ.STORE.TIME:"{#JMXBROKERNAME}"})>{$ACTIVEMQ.STORE.MAX.WARN:"{#JMXBROKERNAME}"} |
AVERAGE | Depends on: - Broker {#JMXBROKERNAME}: Storage usage is too high |
Broker {#JMXBROKERNAME}: Storage usage is too high | - |
min(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},StorePercentUsage],{$ACTIVEMQ.STORE.TIME:"{#JMXBROKERNAME}"})>{$ACTIVEMQ.STORE.MAX.HIGH:"{#JMXBROKERNAME}"} |
HIGH | |
Broker {#JMXBROKERNAME}: Temp usage is too high | - |
min(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},TempPercentUsage],{$ACTIVEMQ.TEMP.TIME:"{#JMXBROKERNAME}"})>{$ACTIVEMQ.TEMP.MAX.WARN} |
AVERAGE | Depends on: - Broker {#JMXBROKERNAME}: Temp usage is too high |
Broker {#JMXBROKERNAME}: Temp usage is too high | - |
min(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},TempPercentUsage],{$ACTIVEMQ.TEMP.TIME:"{#JMXBROKERNAME}"})>{$ACTIVEMQ.TEMP.MAX.HIGH} |
HIGH | |
Broker {#JMXBROKERNAME}: Message enqueue rate is higher than dequeue rate | Enqueue rate is higher than dequeue rate. It may indicate performance problems. |
avg(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},TotalEnqueueCount],{$ACTIVEMQ.MSG.RATE.WARN.TIME:"{#JMXBROKERNAME}"})>avg(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},TotalDequeueCount],{$ACTIVEMQ.MSG.RATE.WARN.TIME:"{#JMXBROKERNAME}"}) |
AVERAGE | |
Broker {#JMXBROKERNAME}: Consumers count is too low | - |
max(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},TotalConsumerCount],{$ACTIVEMQ.BROKER.CONSUMERS.MIN.TIME:"{#JMXBROKERNAME}"})<{$ACTIVEMQ.BROKER.CONSUMERS.MIN.HIGH:"{#JMXBROKERNAME}"} |
HIGH | |
Broker {#JMXBROKERNAME}: Producers count is too low | - |
max(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},TotalProducerCount],{$ACTIVEMQ.BROKER.PRODUCERS.MIN.TIME:"{#JMXBROKERNAME}"})<{$ACTIVEMQ.BROKER.PRODUCERS.MIN.HIGH:"{#JMXBROKERNAME}"} |
HIGH | |
{#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Consumers count is too low | - |
max(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},ConsumerCount],{$ACTIVEMQ.DESTINATION.CONSUMERS.MIN.TIME:"{#JMXDESTINATIONNAME}"})<{$ACTIVEMQ.DESTINATION.CONSUMERS.MIN.HIGH:"{#JMXDESTINATIONNAME}"} and last(/Apache ActiveMQ by JMX/jmx["org.apache.activemq:type=Broker,brokerName={#JMXBROKERNAME}",{$ACTIVEMQ.TOTAL.CONSUMERS.COUNT: "{#JMXDESTINATIONNAME}"}])>{$ACTIVEMQ.BROKER.CONSUMERS.MIN.HIGH:"{#JMXBROKERNAME}"} Recovery expression: min(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},ConsumerCount],{$ACTIVEMQ.DESTINATION.CONSUMERS.MIN.TIME:"{#JMXDESTINATIONNAME}"})>={$ACTIVEMQ.DESTINATION.CONSUMERS.MIN.HIGH:"{#JMXDESTINATIONNAME}"} |
AVERAGE | Manual close: YES |
{#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Producers count is too low | - |
max(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},ProducerCount],{$ACTIVEMQ.DESTINATION.PRODUCERS.MIN.TIME:"{#JMXDESTINATIONNAME}"})<{$ACTIVEMQ.DESTINATION.PRODUCERS.MIN.HIGH:"{#JMXDESTINATIONNAME}"} and last(/Apache ActiveMQ by JMX/jmx["org.apache.activemq:type=Broker,brokerName={#JMXBROKERNAME}",{$ACTIVEMQ.TOTAL.PRODUCERS.COUNT: "{#JMXDESTINATIONNAME}"}])>{$ACTIVEMQ.BROKER.PRODUCERS.MIN.HIGH:"{#JMXBROKERNAME}"} Recovery expression: min(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},ProducerCount],{$ACTIVEMQ.DESTINATION.PRODUCERS.MIN.TIME:"{#JMXDESTINATIONNAME}"})>={$ACTIVEMQ.DESTINATION.PRODUCERS.MIN.HIGH:"{#JMXDESTINATIONNAME}"} |
AVERAGE | Manual close: YES |
{#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Memory usage is too high | - |
last(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},MemoryPercentUsage])>{$ACTIVEMQ.MEM.MAX.WARN:"{#JMXDESTINATIONNAME}"} |
AVERAGE | |
{#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Memory usage is too high | - |
last(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},MemoryPercentUsage])>{$ACTIVEMQ.MEM.MAX.HIGH:"{#JMXDESTINATIONNAME}"} |
HIGH | |
{#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Message enqueue rate is higher than dequeue rate | Enqueue rate is higher than dequeue rate. It may indicate performance problems. |
avg(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},EnqueueCount],{$ACTIVEMQ.MSG.RATE.WARN.TIME:"{#JMXDESTINATIONNAME}"})>avg(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},DequeueCount],{$ACTIVEMQ.MSG.RATE.WARN.TIME:"{#JMXDESTINATIONNAME}"}) |
AVERAGE | |
{#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Queue size is high | Queue size is higher than threshold. It may indicate performance problems. |
min(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},QueueSize],{$ACTIVEMQ.QUEUE.TIME:"{#JMXDESTINATIONNAME}"})>{$ACTIVEMQ.QUEUE.WARN:"{#JMXDESTINATIONNAME}"} and {$ACTIVEMQ.QUEUE.ENABLED:"{#JMXDESTINATIONNAME}"}=1 |
AVERAGE | |
{#JMXBROKERNAME}: {#JMXDESTINATIONTYPE} {#JMXDESTINATIONNAME}: Expired messages count is high | This metric represents the number of messages that expired before they could be delivered. If you expect all messages to be delivered and acknowledged within a certain amount of time, you can set an expiration for each message, and investigate if your ExpiredCount metric rises above zero. |
last(/Apache ActiveMQ by JMX/jmx[{#JMXOBJ},ExpiredCount])>{$ACTIVEMQ.EXPIRED.WARN:"{#JMXDESTINATIONNAME}"} |
AVERAGE |
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.