monitoring-microsoft-sql-server.pngThis blog post takes a deep-dive into monitoring Microsoft SQL Server 2008 with Opsview V4. In order to monitor MS SQL 2008 (known hereon as SQL2K8), we will have to leverage Opsview’s ability to support Nagios plugins which can be found on the Exchange.Nagios.org website.

First we will need to monitor a few services. To do this we can use the Opsview guide “Monitoring Windows Services”.
The main 3 SQL services we may wish to monitor are:

  • MSSQLSERVER - “SQL Service server”
  • SQLSERVERAGENT - “SQL Server agent”
  • SQLBrowser - “SQL Server Browser”
Firstly login to Opsview and navigate to “SETTINGS > SERVICE CHECKS” and scroll down to “Database - MS SQL Server” where we can find a few pre-configured checks.

We will need to modify these checks to include the correct service names (as above). For example, the service check for “MS SQL Server Service” should have arguments of:

“ -H $HOSTADDRESS$ -c nsc_checkservicestate -a ‘ShowAll MSSQLSERVER=started’ ”.
Once these services have been created, We should have the following:

Name Arguments
MS SQL Agent Service check_nrpe -H $HOSTADDRESS$ -c nsc_checkservicestate -a ‘ShowAll SQLSERVERAGENT=started’
MS SQL Browser check_nrpe -H $HOSTADDRESS$ -c nsc_checkservicestate -a ‘ShowAll SQLBrowser=started’
MS SQL Server Listener check_tcp -H $HOSTADDRESS$ -p 1433 -w 2 -c 5
MS SQL Server Service check_nrpe -H $HOSTADDRESS$ -c nsc_checkservicestate -a ‘ShowAll MSSQLSERVER=started’

 mssql-database-1.png

Check MSSQL Database Stats”, from the link here. Download the .py script, and copy it to the folder /usr/local/nagios/libexec. Once there, do:

“chown nagios:nagios check_mssql_database.py”,

and then

“chmod 755 check_mssql_database.py”.

This will now show up in our service checks as a plugin. In order to run the plugin on the platform, we will need to install a small plugin called “pymssql”.

To do this, follow the instructions below:

UBUNTU/DEBIAN: sudo apt-get install python-pymssql

RHEL:  yum install pymssql  (Some repos have it python-pymssql)


Once this plugin is installed, we can run a check on our MS SQL server to make sure we can access it. To do this, run a command similar to the one below, but with your information added:

./check_mssql_database.py -H ipaddress -U username -P password -T _Total -p 1433

This command works as follows:

“-H” is the ip address or hostname of the server on which the MS SQL Server 2008 install resides.

Important:
the username and password (-U and -P respectively) are the”SQL SERVER AUTHENTICATION” login details, and not the “Windows authentication” login details.

Generally, the admin username will be “sa” for example. -T specifies the table name; with the operator “_Total” telling the query to look at all tables, and finally, -p specifies the port number - generally 1433.

mssql-database-2.png

If the command works, you should see the following output:

OK:Time to connect is 0.004s|Time_to_connect=0.004s;;;;

If not, you may need to review the login details, if you see something similar to the following:
CRITICAL:Unable to access SQL Server.

As mentioned, the above output was obtained by using the AD credentials and not the SQL server credentials to try and access the box.

Once we have the “OK..” message, we can begin to query the SQL server from within Opsview.

So now we can login to Opsview again (if closed), and begin to create our service checks.

To do this, go to “SETTINGS > SERVICE CHECKS” and click the plus sign in the top left corner.

In this example, we are going to monitor the size of the database. To do this, use the plugin we just installed, and add the arguments, similar to those ran earlier, but with the “-H $HOSTADDRESS$” variable, and the %DATABASE% attribute, for scalability.

We’ll explain these later on in the guide.

mssql-database-3.png With the full arguments field reading:

-H $HOSTADDRESS$ -U sa -P password11! -T %DATABASE% -p 1433 —datasize -w 70000 -c 99000

The key differentiator from hereon are the options after “-p 1433..”.

We are specifying here we want to monitor “—datasize” (so size of DB), and that we want to be warned if we go above 70000KB, (yellow alert in Opsview) and critical alert on 99000KB or more (Red in Opsview). 

For the full list of operators, see the bottom of this guide.

Now we have created our service check (we can do the above for all the operators, where applicable), we now need to add them to our MS SQL host. But before we can do that, we need to add the attribute “%DATABASE%” first.

To do this, go to “SETTINGS > Attributes” and click the plus sign. In the “Name” box, enter “DATABASE”, and click “submit changes”, it is that simple!
mssql-database-4.png Next, we must create this host via “SETTINGS > HOST” and click the “plus” sign at the top left.

Populate all the fields appropriately, using the IP address of the server on which the SQL DB is running on.

Once done, click onto the “MONITORS” tab. Here is where we will assign our new service checks to our host, as below:

mssql-database-5.png Once we have added our monitors, click “ATTRIBUTES”, and select in the dropdown box our new attribute, “DATABASE”.

In the value field, enter the name of your database - or if you want to monitor all, we can use the “_Total” operator.

Finally, click submit, and now have our SQL Server 2008 host created, with the attribute specified, and all our new service checks enabled.

Now this is done, we just need to reload opsview due to the configuration changes using “SETTINGS -> APPLY CHANGES

mssql-database-6.jpeg And Opsview will reload, your new host will be available and will be monitoring your new service checks, as below (with a lot more service checks):

mssql-database-7.jpeg Reference: Below are the options which can be applied to the command, as per the output of the command ./check_mssql_database.py -h I have also added a “arguments” section.

Option Arguments Notes
—activetrans -H $HOSTADDRESS$ -U sa -P password11\! -T %DATABASE% -p 1433 —activetrans -w 5 -c 10 Monitors # of active transactions, warns on over 5, and critical on over 10 transacations.
—datasize -H $HOSTADDRESS$ -U sa -P password11\! -T %DATABASE% -p 1433 —datasize -w 70000 -c 99000 Monitors size of database, warns on over 70000kb, critical on 99000Kb.
—logcachehit -H $HOSTADDRESS$ -U sa -P password11\! -T %DATABASE% -p 1433 —logcachehit -w 75: -c 50: Monitors log cache hit in %. Less than 50% gives us critical, less than 75% gives warning.
—logwait H $HOSTADDRESS$ -U sa -P password11\! -T %DATABASE% -p 1433 —logwait -w 30 -c 50 In .ms, how long it takes to retrieve logs.
—loggrowths -H $HOSTADDRESS$ -U sa -P password11\! -T %DATABASE% -p 1433 —loggrowths -w 5 -c 10 Check Log Growths
—time2connect -H $HOSTADDRESS$ -U sa -P password11\! -T %DATABASE% -p 1433 —time2connect -w 0010 -c 0050 Time in .ms to connect to DB.
—transpsec -H $HOSTADDRESS$ -U sa -P password11\! -T %DATABASE% -p 1433 —transpsec -w 20 -c 30  

 

Enhanced by Zemanta