1 Environment
1.1 Operating system
[root@host-xxx soft]# lsb_release -a
LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: CentOS
Description: CentOS release 6.6 (Final)
Release: 6.6
Codename: Final
[root@host-xxx soft]#
1.2 Zabbix version: the agent, server, and web frontend are all 2.4.6
[wls81@host-xxxx sbin]$ ./zabbix_agent --version
Zabbix agent v2.4.6 (revision 54796) (10 August 2015)
Compilation time: Nov 2 2015 21:29:13
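1.3 I currently monitor 791 virtual machines with this setup
2 Problem
Note: this problem is not the red "zabbix server is not running" message that appears on the Zabbix web page.
2.1 Web frontend
The page shows that zabbix_server is not running.
The Zabbix server also reports the following error: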
Less than 25% free in the trends cache
2.2 Agent-side log
28079:20161012:121243.196 active check configuration update from [192.168.176.25:10051] started to fail (cannot connect to [[192.168.176.25]:10051]: [4] Interrupted system call)
28079:20161012:122102.894 active check configuration update from [192.168.176.25:10051] is working again
28079:20161012:130105.458 active check configuration update from [192.168.176.25:10051] started to fail (ZBX_TCP_READ() failed: [4] Interrupted system call)
28079:20161012:153008.930 active check configuration update from [192.168.176.25:10051] is working again
28079:20161012:160811.493 active check configuration update from [192.168.176.25:10051] started to fail (ZBX_TCP_READ() failed: [4] Interrupted system call)
28079:20161013:104855.178 active check configuration update from [192.168.176.25:10051] is working again
28079:20161013:112258.667 active check configuration update from [192.168.176.25:10051] started to fail (cannot connect to [[192.168.176.25]:10051]: [4] Interrupted system call)
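To see how long each interruption lasted, the log lines above can be paired up: every "started to fail" entry with the next "is working again" entry. The following is a minimal sketch, assuming the agent log format shown above (PID:YYYYMMDD:HHMMSS.mmm message); parse_line and outage_windows are hypothetical helper names, not part of Zabbix.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the log excerpt: PID:YYYYMMDD:HHMMSS.mmm message
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6})\.(\d{3})\s+(.*)$")


def parse_line(line):
    """Return (timestamp, message) for a log line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    _pid, date, time, _ms, msg = m.groups()
    # Milliseconds are ignored; second resolution is enough for outage lengths.
    return datetime.strptime(date + time, "%Y%m%d%H%M%S"), msg


def outage_windows(lines):
    """Pair each 'started to fail' entry with the next 'is working again' entry."""
    windows = []
    start = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, msg = parsed
        if "started to fail" in msg and start is None:
            start = ts
        elif "is working again" in msg and start is not None:
            windows.append((start, ts))
            start = None
    # A failure without a later recovery line is left out of the result.
    return windows
```

Calling outage_windows on the lines of the agent log file returns a list of (start, end) datetime pairs; subtracting the two gives the length of each outage.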
In addition, a telnet from the agent host to port 10051 on the server fails to connect.
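The same connectivity check can be scripted so that it is easy to repeat from several agents. This is only a sketch: can_connect is a hypothetical helper, and the address and port are the ones that appear in the agent log above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Server address and port as seen in the agent log.
    print(can_connect("192.168.176.25", 10051))
```

A result of False here corresponds to the failed telnet: the TCP connection could not be established within the timeout, even though the server process may still be alive.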
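2.3 Zabbix server
The zabbix_server process is alive, and port 10051 is listening.
3 Troubleshooting
Once again, the answer was in the logs.
The "Less than 25% free in the trends cache" error points at the trend cache. In the end the cause was traced to the following setting in the server configuration, whose default value is too small for this number of hosts.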
### Option: TrendCacheSize
# Size of trend cache, in bytes.
# Shared memory size for storing trends data.
#
# Mandatory: no
# Range: 128K-2G
# Default:
# TrendCacheSize=4M
TrendCacheSize=400M
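As a quick sanity check on a value like the one above, the sketch below converts a size string to bytes and compares it with the 128K-2G range quoted in the config comments. It assumes the K/M/G suffixes are 1024-based multipliers; parse_size and trend_cache_size_ok are hypothetical helper names, not part of Zabbix.

```python
# Assumption: K, M, and G are 1024-based multipliers.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a size string such as '400M' or '131072' to a number of bytes."""
    value = value.strip().upper()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def trend_cache_size_ok(value: str) -> bool:
    """Check the value against the 128K-2G range from the config comments."""
    return parse_size("128K") <= parse_size(value) <= parse_size("2G")


if __name__ == "__main__":
    for candidate in ("4M", "400M", "4G"):
        print(candidate, parse_size(candidate), trend_cache_size_ok(candidate))
```

Under that assumption, 400M is 419430400 bytes, which sits inside the allowed range and is 100 times the 4M default. The new value should only take effect after zabbix_server is restarted.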