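I have recently been reworking an old Nagios monitoring setup. The existing NRPE- and SNMP-based checks stay as they are, while new hosts are monitored through check_mk. Since I had built a similar setup before, adding the new functionality went mostly smoothly. I still ran into three small problems, which are summarized here.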

Problem 1: Parent nodes on the map

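The parents of the newly created KVM virtual hosts are all physical servers, but those physical servers were already being monitored via NRPE. When a new virtual host is added through check_mk WATO, entering its parent triggers a warning: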

Warning: This host has an invalid configuration!
You defined the non-existing host 'xxx.37.194.xx' as a parent.

[Screenshot: check-mk-parent]

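"Scan for parents" does not find a usable parent either. If the host is added without a parent, the Nagios map shows it as a root node ("this is a root host"), which clearly does not match the real network topology. The official documentation covers this:

https://mathias-kettner.de/checkmk_parents.html

Based on it, the following approaches are possible.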

Method 1: Ignore the warning, fill in the WATO fields as usual and click Save & Finish. Despite the warning, the host shows up correctly on the Nagios map.

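Method 2: Edit /usr/local/nagios/etc/hosts/check_mk_objects.cfg (the exact location depends on where your Nagios was built and installed). The file uses the same object syntax as the existing NRPE configuration. Add a parents entry to the relevant define host block and restart Nagios.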

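Note: this method has a serious drawback. The header of that file says "Created by Check_MK. Do not edit.", and every time the configuration is changed in the check_mk web interface and the changes are activated, the file is regenerated and the added entry is lost. This method is therefore not recommended.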

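Method 3: Following the official documentation, edit /etc/check_mk/conf.d/wato/hosts.mk, add the configuration in the format described on the check_mk site, and apply it with cmk -O. This has the same problem as Method 2, because WATO rewrites hosts.mk as well.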

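Method 4: Create a new .mk file with any name in /etc/check_mk/conf.d/wato/, put the configuration there, and apply it with cmk -O. Whatever is changed in the web interface afterwards, the contents of this file are left untouched.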
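As an illustration of Method 4, such a file might look roughly like the sketch below. The file name parents_kvm.mk and the guest names kvm-guest01 and kvm-guest02 are made up; the parent is the masked address from the warning above. The tuple layout follows the legacy parents syntax described on the checkmk_parents page, so verify it against the documentation for your version. The stand-in assignment at the top is only there so the snippet also runs outside check_mk.

```python
# /etc/check_mk/conf.d/wato/parents_kvm.mk  (hypothetical file name)
#
# check_mk evaluates .mk files as Python and already defines `parents`.
# The stand-in below only makes this sketch runnable on its own;
# leave it out of the real file.
parents = []

# Each entry: (parent host, [child hosts]). The guest names are placeholders.
parents += [
    ("xxx.37.194.xx", ["kvm-guest01", "kvm-guest02"]),
]
```

After saving the file, running cmk -O applies it, as described in Method 4.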

Problem 2: The statusmap_image icon

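This is another minor issue on the Nagios map. After adding a host and configuring its parent as described above, the icon shown by default is a question mark, and the web interface offers no place to configure it. Methods 2 and 3 from Problem 1 fail here for the same reason as before. As in Method 4, create a new configuration file with the following content: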

# cat statusmap.mk
extra_host_conf['statusmap_image'] = [
   ( 'linux40.gd2', ['prod', ], ALL_HOSTS ),
]

Here the icon is assigned by host tag; it can also be assigned per host. See the checkmk_parents page above for the syntax. The fix is based on the following thread:

http://www.monitoring-portal.org/wbb/index.php?page=Thread&threadID=30166
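For comparison, a per-host variant might look like the sketch below. It assumes the two-element rule form (value, list of hosts) from the legacy check_mk configuration syntax; the host names are placeholders, and the stand-in dictionary exists only so the snippet runs outside check_mk.

```python
# Stand-in so the sketch runs on its own; in a real .mk file check_mk
# already defines extra_host_conf, so leave this line out there.
extra_host_conf = {}

# (value, [host names]) assigns the icon to explicitly listed hosts
# instead of selecting them by tag. The host names are placeholders.
extra_host_conf.setdefault("statusmap_image", [])
extra_host_conf["statusmap_image"] += [
    ("linux40.gd2", ["kvm-guest01", "kvm-guest02"]),
]
```

As with the tag-based file, the change takes effect after cmk -O.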

Problem 3: Broken graphs for network interface traffic

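I had not seen this problem with earlier check_mk versions. With the current Check_MK 1.2.4 (stable) installation, graphs for the other plugins render normally, but the interface graphs fail with: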

unknown function 'PERCENTNAN' in VDEF outper

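Cause: the pnp4nagios template shipped with the new check_mk is the problem. It uses the VDEF function PERCENTNAN, which the installed RRDtool evidently does not recognize. Using the older template file avoids this.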
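To see which installed templates are affected before replacing anything, a small script along these lines could list every template that mentions PERCENTNAN. This is only a sketch: the function name templates_using is made up, and the directory is the one used in this post.

```python
import os


def templates_using(function_name, template_dir):
    """Return sorted names of .php templates whose text mentions function_name."""
    hits = []
    for name in sorted(os.listdir(template_dir)):
        path = os.path.join(template_dir, name)
        # isfile() follows symlinks, so linked templates are reported too.
        if not name.endswith(".php") or not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8", errors="replace") as handle:
            if function_name in handle.read():
                hits.append(name)
    return hits


if __name__ == "__main__":
    # Path taken from this post; adjust it to your pnp4nagios installation.
    template_dir = "/usr/local/pnp4nagios/share/templates"
    if os.path.isdir(template_dir):
        for name in templates_using("PERCENTNAN", template_dir):
            print(name)
```

Because symlinks are followed, a template and the links pointing at it are listed separately.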

Fix: in the template directory /usr/local/pnp4nagios/share/templates, replace the contents of check_mk-lnx_if.php (which is a symlink to check_mk-if.php) with the following.

<?php
# +------------------------------------------------------------------+
# |             ____ _               _        __  __ _  __           |
# |            / ___| |__   ___  ___| | __   |  \/  | |/ /           |
# |           | |   | '_ \ / _ \/ __| |/ /   | |\/| | ' /            |
# |           | |___| | | |  __/ (__|   <    | |  | | . \            |
# |            \____|_| |_|\___|\___|_|\_\___|_|  |_|_|\_\           |
# |                                                                  |
# | Copyright Mathias Kettner 2013             mk@mathias-kettner.de |
# +------------------------------------------------------------------+
#
# This file is part of Check_MK.
# The official homepage is at http://mathias-kettner.de/check_mk.
#
# check_mk is free software;  you can redistribute it and/or modify it
# under the  terms of the  GNU General Public License  as published by
# the Free Software Foundation in version 2.  check_mk is  distributed
# in the hope that it will be useful, but WITHOUT ANY WARRANTY;  with-
# out even the implied warranty of  MERCHANTABILITY  or  FITNESS FOR A
# PARTICULAR PURPOSE. See the  GNU General Public License for more de-
# tails.  You should have  received  a copy of the  GNU  General Public
# License along with GNU Make; see the file  COPYING.  If  not,  write
# to the Free Software Foundation, Inc., 51 Franklin St,  Fifth Floor,
# Boston, MA 02110-1301 USA.

setlocale(LC_ALL, 'C');

# Performance data from check:
# in=6864.39071505;0.01;0.1;0;125000000.0
# inucast=48.496962273;0.01;0.1;;
# innucast=4.60122981717;0.01;0.1;;
# indisc=0.0;0.01;0.1;;
# inerr=0.0;0.01;0.1;;
# out=12448.259172;0.01;0.1;0;125000000.0
# outucast=54.9846963152;0.01;0.1;;
# outnucast=10.5828285795;0.01;0.1;;
# outdisc=0.0;0.01;0.1;;
# outerr=0.0;0.01;0.1;;
# outqlen=0;;;;10000000

# Graph 1: used bandwidth

# Determine if Bit or Byte.
# Change multiplier and labels
$unit = "B";
$unit_multiplier = 1;
$vertical_label_name = "MByte/sec";
if (strcmp($MIN[11], "0.0") == 0) {
    $unit = "Bit";
    $unit_multiplier = 8;
    $vertical_label_name = "MBit/sec";
}
$bandwidth = $MAX[1]  * $unit_multiplier;
$warn      = $WARN[1] * $unit_multiplier;
$crit      = $CRIT[1] * $unit_multiplier;

# Horizontal lines
$mega        = 1024.0 * 1024.0;
$mBandwidthH = $bandwidth / $mega;
$mWarnH      = $warn      / $mega;
$mCritH      = $crit      / $mega;

# Break down bandwidth, warn and crit
$bwuom = ' ';
$base = 1000;
if ($bandwidth > $base * $base * $base) {
    $warn /= $base * $base * $base;
    $crit /= $base * $base * $base;
    $bandwidth /= $base * $base * $base;
    $bwuom = 'G';
} elseif ($bandwidth > $base * $base) {
    $warn /= $base * $base;
    $crit /= $base * $base;
    $bandwidth /= $base * $base;
    $bwuom = 'M';
} elseif ($bandwidth > $base) {
    $warn /= $base;
    $crit /= $base;
    $bandwidth /= $base;
    $bwuom = 'k';
}

# Vertical range of the graph. NOTE: these lines were lost from the posted
# listing and are reconstructed here as an assumption ($range is needed by
# $opt[1]); compare with your original template file.
if ($mBandwidthH < 10)
    $range = $mBandwidthH;
else
    $range = 10.0;

$bandwidthInfo = "";
if ($bandwidth > 0) {
    $bandwidthInfo = " at bandwidth ${bwuom}${unit}/s";
}
$ds_name[1] = 'Used bandwidth';
$opt[1] = "--vertical-label \"$vertical_label_name\" -l -$range -u $range -X0 -b 1024 --title \"Used bandwidth $hostname / $servicedesc $bandwidthInfo\" ";
$def[1] =
  "HRULE:0#c0c0c0 ";
if ($mBandwidthH)
    $def[1] .= "HRULE:$mBandwidthH#808080:\"Port speed:  " . sprintf("%.1f", $bandwidth) . " ".$bwuom."$unit/s\\n\" ".
               "HRULE:-$mBandwidthH#808080: ";
if ($warn)
    $def[1] .= "HRULE:$mWarnH#ffff00:\"Warning:                " . sprintf("%6.1f", $warn) . " ".$bwuom."$unit/s\\n\" ".
               "HRULE:-$mWarnH#ffff00: ";
if ($crit)
    $def[1] .= "HRULE:$mCritH#ff0000:\"Critical:               " . sprintf("%6.1f", $crit) . " ".$bwuom."$unit/s\\n\" ".
               "HRULE:-$mCritH#ff0000: ";

$def[1] .= "DEF:inbytes=$RRDFILE[1]:$DS[1]:MAX ".
  "DEF:outbytes=$RRDFILE[6]:$DS[6]:MAX ".
  "CDEF:intraffic=inbytes,$unit_multiplier,* ".
  "CDEF:outtraffic=outbytes,$unit_multiplier,* ".
  "CDEF:inmb=intraffic,1048576,/ ".
  "CDEF:outmb=outtraffic,1048576,/ ".
  "CDEF:minusoutmb=0,outmb,- ".
  "AREA:inmb#00e060:\"in                    \" ".
  "GPRINT:intraffic:LAST:\"%6.1lf %s$unit/s last\" ".
  "GPRINT:intraffic:AVERAGE:\"%6.1lf %s$unit/s avg\" ".
  "GPRINT:intraffic:MAX:\"%6.1lf %s$unit/s max\\n\" ".
  "AREA:minusoutmb#0080e0:\"out                   \" ".
  "GPRINT:outtraffic:LAST:\"%6.1lf %s$unit/s last\" ".
  "GPRINT:outtraffic:AVERAGE:\"%6.1lf %s$unit/s avg\" ".
  "GPRINT:outtraffic:MAX:\"%6.1lf %s$unit/s max\\n\" ";

# Averaged values are only present if the check delivers them
if (isset($DS[12])) {
  $def[1] .=
  "DEF:inbytesa=$RRDFILE[12]:$DS[12]:MAX ".
  "DEF:outbytesa=$RRDFILE[13]:$DS[13]:MAX ".
  "CDEF:intraffica=inbytesa,$unit_multiplier,* ".
  "CDEF:outtraffica=outbytesa,$unit_multiplier,* ".
  "CDEF:inmba=intraffica,1048576,/ ".
  "CDEF:outmba=outtraffica,1048576,/ ".
  "CDEF:minusoutmba=0,outmba,- ".
  "LINE:inmba#00a060:\"in (avg)              \" ".
  "GPRINT:intraffica:LAST:\"%6.1lf %s$unit/s last\" ".
  "GPRINT:intraffica:AVERAGE:\"%6.1lf %s$unit/s avg\" ".
  "GPRINT:intraffica:MAX:\"%6.1lf %s$unit/s max\\n\" ".
  "LINE:minusoutmba#0060c0:\"out (avg)             \" ".
  "GPRINT:outtraffica:LAST:\"%6.1lf %s$unit/s last\" ".
  "GPRINT:outtraffica:AVERAGE:\"%6.1lf %s$unit/s avg\" ".
  "GPRINT:outtraffica:MAX:\"%6.1lf %s$unit/s max\\n\" ";
}

# Graph 2: packets
$ds_name[2] = 'Packets';
$opt[2] = "--vertical-label \"packets/sec\" --title \"Packets $hostname / $servicedesc\" ";
$def[2] =
  "HRULE:0#c0c0c0 ".
  "DEF:inu=$RRDFILE[2]:$DS[2]:MAX ".
  "DEF:innu=$RRDFILE[3]:$DS[3]:MAX ".
  "AREA:inu#00ffc0:\"in unicast              \" ".
  "GPRINT:inu:LAST:\"%7.2lf/s last  \" ".
  "GPRINT:inu:AVERAGE:\"%7.2lf/s avg  \" ".
  "GPRINT:inu:MAX:\"%7.2lf/s max\\n\" ".
  "AREA:innu#00c080:\"in broadcast/multicast  \":STACK ".
  "GPRINT:innu:LAST:\"%7.2lf/s last  \" ".
  "GPRINT:innu:AVERAGE:\"%7.2lf/s avg  \" ".
  "GPRINT:innu:MAX:\"%7.2lf/s max\\n\" ".
  "DEF:outu=$RRDFILE[7]:$DS[7]:MAX ".
  "DEF:outnu=$RRDFILE[8]:$DS[8]:MAX ".
  "CDEF:minusoutu=0,outu,- ".
  "CDEF:minusoutnu=0,outnu,- ".
  "AREA:minusoutu#00c0ff:\"out unicast             \" ".
  "GPRINT:outu:LAST:\"%7.2lf/s last  \" ".
  "GPRINT:outu:AVERAGE:\"%7.2lf/s avg  \" ".
  "GPRINT:outu:MAX:\"%7.2lf/s max\\n\" ".
  "AREA:minusoutnu#0080c0:\"out broadcast/multicast \":STACK ".
  "GPRINT:outnu:LAST:\"%7.2lf/s last  \" ".
  "GPRINT:outnu:AVERAGE:\"%7.2lf/s avg  \" ".
  "GPRINT:outnu:MAX:\"%7.2lf/s max\\n\" ";

# Graph 3: errors and discards
$ds_name[3] = 'Errors and discards';
$opt[3] = "--vertical-label \"packets/sec\" -X0 --title \"Problems $hostname / $servicedesc\" ";
$def[3] =
  "HRULE:0#c0c0c0 ".
  "DEF:inerr=$RRDFILE[5]:$DS[5]:MAX ".
  "DEF:indisc=$RRDFILE[4]:$DS[4]:MAX ".
  "AREA:inerr#ff0000:\"in errors               \" ".
  "GPRINT:inerr:LAST:\"%7.2lf/s last  \" ".
  "GPRINT:inerr:AVERAGE:\"%7.2lf/s avg  \" ".
  "GPRINT:inerr:MAX:\"%7.2lf/s max\\n\" ".
  "AREA:indisc#ff8000:\"in discards             \":STACK ".
  "GPRINT:indisc:LAST:\"%7.2lf/s last  \" ".
  "GPRINT:indisc:AVERAGE:\"%7.2lf/s avg  \" ".
  "GPRINT:indisc:MAX:\"%7.2lf/s max\\n\" ".
  "DEF:outerr=$RRDFILE[10]:$DS[10]:MAX ".
  "DEF:outdisc=$RRDFILE[9]:$DS[9]:MAX ".
  "CDEF:minusouterr=0,outerr,- ".
  "CDEF:minusoutdisc=0,outdisc,- ".
  "AREA:minusouterr#ff0080:\"out errors              \" ".
  "GPRINT:outerr:LAST:\"%7.2lf/s last  \" ".
  "GPRINT:outerr:AVERAGE:\"%7.2lf/s avg  \" ".
  "GPRINT:outerr:MAX:\"%7.2lf/s max\\n\" ".
  "AREA:minusoutdisc#ff8080:\"out discards            \":STACK ".
  "GPRINT:outdisc:LAST:\"%7.2lf/s last  \" ".
  "GPRINT:outdisc:AVERAGE:\"%7.2lf/s avg  \" ".
  "GPRINT:outdisc:MAX:\"%7.2lf/s max\\n\" ";
?>
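
The "Break down bandwidth" step near the top of the template is easier to follow in isolation. The sketch below restates it in Python purely for illustration; the function name scale_bandwidth is made up, and the input is the max value from the sample performance data in the template's comments.

```python
def scale_bandwidth(max_bytes_per_sec, unit_multiplier=8, base=1000):
    """Mirror the template's 'Break down bandwidth' step.

    Returns (scaled value, prefix), where prefix is ' ', 'k', 'M' or 'G'.
    """
    bandwidth = max_bytes_per_sec * unit_multiplier
    # Same strict '>' comparisons as the PHP if/elseif chain.
    for prefix, factor in (("G", base ** 3), ("M", base ** 2), ("k", base)):
        if bandwidth > factor:
            return bandwidth / factor, prefix
    return bandwidth, " "


# max=125000000.0 bytes/s from the sample perfdata, reported in bits
value, prefix = scale_bandwidth(125000000.0)
print("%.1f %sBit/s" % (value, prefix))  # prints "1000.0 MBit/s"
```

Because the comparisons are strict, a port of exactly 1 Gbit/s is labelled as 1000.0 MBit/s rather than 1.0 GBit/s.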