Monitor the operational status of a Zylinc solution, part 6
In this post (the last in the series), we look at how to monitor the SQL Server that runs the Zylinc solution’s databases.
Background: Many organizations want to be able to monitor how their critical IT and communications solutions perform, so that they can quickly deal with any potential problems before many users become affected. If you work in IT operations or a similar function, and your organization uses Zylinc, this blog series is for you. It covers what you need to know if you’re going to monitor the operational status of a Zylinc solution.
A Zylinc solution has two important SQL Server databases, called ZyDB and ZyStatDB.
When you install the Zylinc solution, you set up a SQL Server service that runs the databases. The service can reside on the server that hosts the rest of the Zylinc backend’s Windows-based software (the Zylinc Windows Application Server), or on a dedicated server, or on a cluster.
We recommend that you monitor the SQL Server service, no matter where it’s located.
One of the most common issues that can cause your Zylinc solution to become unavailable is if a database runs out of available space and becomes read-only. If you want to proactively prevent this, we recommend that you keep an eye on the following:
- Monitor that all disks in your SQL server always have a reasonable amount of free space available, for example at least 5 GB of available space per disk.
- Check recovery models and backup schedules for all your databases
Let’s start by looking at how to check recovery models and backup schedules. You can do that manually, but you can also use a script that automates the procedure:
Check database recovery models and backup schedules
If your backup schedule doesn’t include a regular transaction log backup, and the recovery model for a database is set to Full or Bulk-logged, the database transaction log files (LDF files) will expand until the disk is full. The database will then become read-only, and your Zylinc solution will become unavailable. We don’t want that!
You can use SQL Server Management Studio to go through all your databases and manually check the recovery model for each of them. If you find a database that has its recovery model set to Full or Bulk-logged, you should check if the Last Database Log Backup is not too old, for example older than 24 hours.
The advantage of frequent log backups is that they increase the frequency of log truncation, which results in smaller log files. That’s why, if the Last Database Log Backup is not too old, you can assume, with a high degree of certainty, that a backup schedule includes a regular transaction log backup for that database, which in turn indicates that your configuration is OK.
However, there’s an easier way to do this: You can automatically check your configuration with a SQL script that you can download from Zylinc unified help.
The script will return warnings if backups are too old, or if backups have never been performed. You can automatically run the script at regular intervals, customize the highest acceptable age of a backup, and make the script return an overview of the most recent backups, if you need that.
Defragment indexes in your databases
Databases contain indexes that can get fragmented. Because of that, we recommend that you regularly check if you need to defragment your database indexes. You can read more about this, and about what to do if your database indexes have become too fragmented, on Zylinc unified help.
Alternatively, a database administrator (dba) can set up maintenance plans on the SQL server to automatically schedule the tasks.
Advanced analysis with SQL Server Profiler
If you, or your database administrator (dba), want to use SQL Server Profiler to set up a trace, you may find the following column filters useful:
- Duration with the condition greater than or equal to, for example, 2000ms will warn you if queries on your SQL server take too long to execute
- Reads with the condition greater than or equal to, for example, 65000 pages will warn you if queries on your SQL server have become inefficient
If you identify issues that require changes to, for example, indexes or stored procedures, you need to contact Zylinc support to get assistance to make such changes. That’s because your license agreement with Zylinc doesn’t permit you to change indexes, stored procedures, etc. in databases provided by Zylinc.
Monitor network port of SQL Server service
We recommend that you monitor the default network port for the SQL Server service, which is typically port 1433/tcp.
Monitor SQL user that Zylinc system uses
When you install the Zylinc solution, you typically create a mixed mode security user with the name ZyUser and db_owner database role memberships for the two Zylinc databases, ZyDB and ZyStatDB. The Zylinc solution then uses ZyUser to access the two databases.
We recommend that you monitor that ZyUser can always access the two databases: Both databases contain a table with the name database_info that you can query. This query should always return at least one row.
You can set up two monitoring tasks that uses ZyUser to log in to the two databases ZyDB and ZyStatDB and executes the following query:
select * from database_info
As I mentioned before, the query should return minimum one row.
Monitor that both databases can be updated
Databases can be online and readable, but if the disk that contains the transaction log becomes corrupt or full, or if the database size exceeds the maximum allowed size (which on a SQL Server Express Edition can typically be as little as 10 GB), the database becomes read-only, and the Zylinc solution will no longer work. Again, we don’t want that!
That’s why we recommend that you set up two monitoring tasks that check the state of the updateability database property for both ZyDB and ZyStatDB:
SELECT DATABASEPROPERTYEX('zydb', 'Updateability');
SELECT DATABASEPROPERTYEX('zystatdb', 'Updateability');
Each query should return a row that contains the value READ_WRITE.
For more information, refer to the documentation for Microsoft SQL Server and your monitoring tool.
Earthworms have two ends, or so it seems at least. However, this blog series only has one, and we’re about to reach it: This is the end of our operational status monitoring series. I hope that you’ve been able to use some of the stuff that we’ve covered in the series.