Nagios: Fixing "error: Could not stat() command file" (on Debian)
Nagios is a network monitoring tool. I use it to track web servers, mail servers, and whatever else I have running on the LAN and on the Internet.
One common configuration issue is getting the Service Commands menu to work correctly. By default, it is visible in the UI but disabled on the server backend, and on Debian not all of the steps to enable it are evident from the docs. Often, you will receive the cryptic error "Could not stat() command file", pointing to /var/lib/nagios3/rw/nagios.cmd. This can be fixed without too much fuss.
- Debian 5.3
- Nagios 3
Make sure you have something like this in nagios.cfg:
    # EXTERNAL COMMAND OPTION
    # Values: 0 = disable commands, 1 = enable commands
    check_external_commands=1

    # EXTERNAL COMMAND CHECK INTERVAL
    # NOTE: Setting this value to -1 causes Nagios to check the external
    # command file as often as possible.
    command_check_interval=15s
    #command_check_interval=-1

    # EXTERNAL COMMAND FILE
    command_file=/var/lib/nagios3/rw/nagios.cmd
You will need to restart Nagios for these settings to be loaded.
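To confirm what the configuration file actually sets, a small shell helper can pull out individual directives. This is only a sketch: `nagios_cfg_value` is a hypothetical function, not something shipped with Nagios, and the path in the example comment is an assumption about where your nagios.cfg lives.

```shell
# Hypothetical helper (not part of Nagios): print the value of a
# "key=value" directive from a Nagios-style config file.
# Commented-out lines are ignored; if the key appears more than once,
# this helper prints the last occurrence.
nagios_cfg_value() {
    cfg_file=$1
    key=$2
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$cfg_file" | tail -n 1
}

# Example (path assumed; adjust to your installation):
#   nagios_cfg_value /etc/nagios3/nagios.cfg check_external_commands
#   nagios_cfg_value /etc/nagios3/nagios.cfg command_file
```

If the first call prints anything other than 1, external commands are still disabled no matter how the file permissions are set.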
Next, edit the /etc/nagios3/cgi.cfg file and check the following:
    # SYSTEM/PROCESS COMMAND ACCESS
    authorized_for_system_commands=nagiosadmin

    # ... Other stuff cut

    # GLOBAL HOST/SERVICE COMMAND ACCESS
    authorized_for_all_service_commands=nagiosadmin
    authorized_for_all_host_commands=nagiosadmin
This enables the user nagiosadmin to submit commands. You can check out the in-file documentation for setting up multiple users.
Restart Apache2 after you have done this.
Add www-data to the Nagios Group
Next, make sure that the user www-data is in the group nagios. You can check this in the system's group file.
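One way to script the membership check is sketched below. `user_listed_in_group` is a hypothetical helper, and it assumes the standard group(5) file format (name:password:gid:member,member,...), defaulting to /etc/group. It only sees members listed in that file, so it will not report a user's primary group.

```shell
# Hypothetical helper: succeed if USER is listed as a member of GROUP
# in a group(5)-format file (third argument, default /etc/group).
# Only listed (supplementary) members are detected.
user_listed_in_group() {
    user=$1
    group=$2
    group_file=${3:-/etc/group}
    awk -F: -v u="$user" -v g="$group" '
        $1 == g {
            n = split($4, members, ",")
            for (i = 1; i <= n; i++) if (members[i] == u) found = 1
        }
        END { exit !found }
    ' "$group_file"
}

# Example:
#   user_listed_in_group www-data nagios || echo "www-data is not in nagios"
```

If the check fails, running `adduser www-data nagios` as root is the usual Debian way to add the user to the group. Apache then needs a restart so that its processes pick up the new group.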
Set the Permissions
This last item is the tricky one. You need to set the correct permissions for the command pipe file (this is the file set in nagios.cfg). Nagios uses this pipe to pass data from the CGI to the backend daemon.
This file is located in /var/lib/nagios3/rw. By default, it should have the following permissions:
    # ls -l /var/lib/nagios3/rw/nagios.cmd
    prw-rw---- 1 nagios nagios 0 2010-01-13 10:17 /var/lib/nagios3/rw/nagios.cmd
Notice that the group ownership is nagios. That's why you added www-data to the nagios group above.
But even with these permissions, you may (will?) still get an error. The reason for this is that the parent directory, rw, does not allow any user but nagios to access its contents:
    # ls -lh /var/lib/nagios3
    total 92K
    -rw-r--r-- 1 nagios nagios    33K 2010-01-04 17:06 objects.precache
    -rw------- 1 nagios www-data  48K 2010-01-13 10:17 retention.dat
    drwx------ 2 nagios www-data 4.0K 2010-01-13 10:17 rw
    drwxr-x--- 3 nagios nagios   4.0K 2010-01-04 16:00 spool
All you need to do to fix this is add the group execute bit (x) to the rw directory:

    # chmod g+x /var/lib/nagios3/rw
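Now any member of the directory's group (www-data in the listing above) should be able to reach the files inside the rw directory, including the command pipe. Access to the pipe itself is still governed by its own nagios group permissions.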
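The permission checks described above can be sketched as a script. The function names here are hypothetical, and the sketch assumes GNU coreutils `stat` (the `-c` option), which is what Debian ships. It inspects mode bits only and does not verify who owns the files or which groups www-data belongs to.

```shell
# Hypothetical helper: succeed if TARGET has character WANT at position
# POS of its symbolic mode string (the same string ls -l prints).
# Position 1 is the file type; positions 5-7 are the group r/w/x bits.
mode_char_is() {
    target=$1
    pos=$2
    want=$3
    mode=$(stat -c '%A' "$target") || return 1
    [ "$(printf '%s' "$mode" | cut -c "$pos")" = "$want" ]
}

# Succeed if the group may traverse the directory.
# "s" in the group execute slot means setgid plus execute.
group_can_traverse() {
    mode_char_is "$1" 7 x || mode_char_is "$1" 7 s
}

# Succeed if the command pipe is a FIFO with group read/write bits and
# its parent directory has the group execute bit.
command_pipe_reachable() {
    pipe=$1
    mode_char_is "$pipe" 1 p &&
    mode_char_is "$pipe" 5 r &&
    mode_char_is "$pipe" 6 w &&
    group_can_traverse "$(dirname "$pipe")"
}

# Example:
#   command_pipe_reachable /var/lib/nagios3/rw/nagios.cmd || echo "check permissions"
```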
At this point, you should be able to execute Nagios Service Commands through the web interface. If not, try stopping and starting Nagios, then try again.