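Nagios is so popular not only because of its rich plugin ecosystem and flexible architecture, but also because of its convenient web interface. Over the past two days I have been integrating and debugging pnp4nagios graphing, and the thresholds in the new traffic-monitoring script I wrote to feed the pnp4nagios traffic graphs were not defined very accurately. As a result, Nagios email and SMS alerts kept flooding in during debugging. I don't mind personally, but my manager receives them too and inevitably asks about them, so to avoid the hassle I decided to temporarily disable traffic alert notifications for the host being debugged from the web interface.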


However, the disable action failed with the following error:

Could not open command file '/usr/local/nagios/var/rw/nagios.cmd' for update!
The permissions on the external command file and/or directory may be incorrect. Read the FAQs on how to setup proper permissions.
An error occurred while attempting to commit your command for processing.

The nagios.cfg configuration file sets the location of nagios.cmd as follows:

command_file=/usr/local/nagios/var/rw/nagios.cmd

Just above that setting there is this comment:

# EXTERNAL COMMAND FILE
# This is the file that Nagios checks for external command requests.
# It is also where the command CGI will write commands that are submitted
# by users, so it must be writeable by the user that the web server
# is running as (usually 'nobody').  Permissions should be set at the
# directory level instead of on the file, as the file is deleted every
# time its contents are processed.

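Both the error message and the comment in the configuration file make the cause clear. Nagios and all of its scripts run as the nagios user (this depends on the user chosen at installation time, but nagios is the usual choice), while Apache, as installed by default, runs as the apache user. That user has no write permission on /usr/local/nagios/var/rw/nagios.cmd.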
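To confirm this diagnosis before changing anything, one can check whether the web server user already belongs to the Nagios group. The following is a minimal sketch, not part of the original setup: the helper name `user_in_group` is hypothetical, and the user `apache` and group `nagios` are assumed defaults that may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if user $1 is a member of group $2.
user_in_group() {
    # id -nG prints the user's group names separated by spaces;
    # put one name per line and look for an exact, literal match.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}

# Assumed defaults; override through the environment if yours differ.
WEB_USER="${WEB_USER:-apache}"
NAGIOS_GROUP="${NAGIOS_GROUP:-nagios}"

if user_in_group "$WEB_USER" "$NAGIOS_GROUP"; then
    echo "$WEB_USER is in group $NAGIOS_GROUP"
else
    echo "$WEB_USER is NOT in group $NAGIOS_GROUP"
fi
```

If the script reports that the user is not in the group, the web CGI cannot write to a command file whose mode grants access only to the nagios user and group.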

Solution:

Add the apache user to the nagios group, either by editing /etc/group or with the usermod command. Then restart Apache and Nagios, and the problem is resolved.
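The fix can be sketched as a small script. This is an illustration under stated assumptions rather than the exact commands used here: the service names `httpd` and `nagios`, the `run` wrapper, and the function name are all hypothetical. By default the script only prints the commands (dry run); set `DRY_RUN=0` and run it as root to actually execute them.

```shell
#!/bin/sh
# Assumed defaults; adjust to match your installation.
WEB_USER="${WEB_USER:-apache}"
NAGIOS_GROUP="${NAGIOS_GROUP:-nagios}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

add_web_user_to_nagios_group() {
    # -a -G appends the group without dropping existing supplementary groups.
    run usermod -a -G "$NAGIOS_GROUP" "$WEB_USER"
    # Restart both services so the new group membership takes effect.
    # The service names below are assumptions.
    run service httpd restart
    run service nagios restart
}

add_web_user_to_nagios_group
```

Using `-a` together with `-G` matters: without `-a`, usermod replaces the user's supplementary group list instead of appending to it.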

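Note: some people online suggest changing the permissions of the nagios/var/rw directory and the files under it to 777. In my testing this did not work. Besides, as the comment above makes clear, nagios.cmd is deleted and re-created after every operation in the web interface.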

[root@test rw]# ll
total 0
prw-rw---- 1 nagios nagios 0 11-08 11:49 nagios.cmd

So the file's permissions revert to their original state after every web operation.
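The effect can be reproduced outside Nagios with a throwaway named pipe. This sketch only simulates the behavior described above: it assumes the pipe is re-created with mode 660, as the listing suggests, and the function name `fifo_mode_demo` is hypothetical. It works in a temporary directory and never touches the real command file.

```shell
#!/bin/sh
# Simulate: chmod 777 on a FIFO, then delete and re-create it.
fifo_mode_demo() {
    demo_dir=$(mktemp -d) || return 1
    cmd_file="$demo_dir/nagios.cmd"

    # Create the pipe with the mode seen in the listing (prw-rw----).
    mkfifo -m 660 "$cmd_file"

    # Loosen the permissions by hand, as the 777 advice suggests.
    chmod 777 "$cmd_file"
    echo "after chmod 777: $(ls -l "$cmd_file" | cut -c1-10)"

    # Stand-in for Nagios deleting and re-creating the pipe.
    rm -f "$cmd_file"
    mkfifo -m 660 "$cmd_file"
    echo "after re-create: $(ls -l "$cmd_file" | cut -c1-10)"

    rm -rf "$demo_dir"
}

fifo_mode_demo
```

The second line of output shows the original mode again, which is why a manual chmod on the file does not last and group membership is the more durable fix.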