Opsview Development Team
With Opsview, we’re always looking at how to improve performance. Some of our users run really large systems, so getting the best out of Opsview is essential to their experience.

One thing we’ve done is create post-reload helper tables. All status data is stored in a MySQL database, and our status views need to query it, so this is one part that has to be fast and efficient. We tried some initial queries to get summarised status data across 10,000 services, grouped by host group and split into handled and unhandled services. These used to take 30 seconds to run, which is clearly unacceptable. With our helper tables, the query time dropped to 0.4 seconds!

But the helper tables do more than that: they also provide security, ensuring that users see only the hosts and services they should. A table called opsview_contact_services lists exactly which services a user has access to. This is really important if multiple customers use your Opsview system, since you don’t want one customer to see another customer’s information.

So these helper tables are great because they save a lot of time on each individual query, at the expense of a one-time generation cost after each reload. But what do you do if the generation takes too long?

One of our largest users, a major North American health organisation, had this problem. Their post-reload tasks were taking 4 minutes 20 seconds to run, which is a long time, as status information was not updated during this period. With some database analysis and some changes to our application queries, we were able to reduce this to 1 minute 15 seconds. That’s a reduction of about 70%!

And that’s not all: with the latest release of Opsview, we’ve also made this post-reload task multi-threaded. This means that the tables can be generated while high-priority status information is updated concurrently. So now Opsview users will know that the latest information on their screens reflects the data as it comes in.

Now, 1 minute 15 seconds could still be considered a long time, but for a system with over 9,000 hosts and over 24,000 services, that’s not too shabby! The gains made at this one implementation are now available to all Opsview users. Happy upgrading!
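
For readers who want to see the helper-table idea in miniature, here is a rough sketch in Python. It uses an in-memory SQLite database purely so that it runs on its own; Opsview itself stores status data in MySQL, and the real schema is not shown in this post. Apart from opsview_contact_services, every table, column, and function name below is hypothetical, and the columns given to opsview_contact_services are assumptions too. The point is only the pattern: do the expensive join and grouping once after a reload, then let each status view read a small, pre-filtered table.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with made-up status data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical raw status table: one row per service.
        CREATE TABLE service_status (
            service_id INTEGER PRIMARY KEY,
            hostgroup  TEXT    NOT NULL,
            handled    INTEGER NOT NULL  -- 1 = handled, 0 = unhandled
        );
        -- Table name comes from the post; the columns are assumed.
        CREATE TABLE opsview_contact_services (
            contact_id INTEGER NOT NULL,
            service_id INTEGER NOT NULL,
            PRIMARY KEY (contact_id, service_id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO service_status VALUES (?, ?, ?)",
        [(1, "web", 1), (2, "web", 0), (3, "web", 0), (4, "db", 1), (5, "db", 1)],
    )
    conn.executemany(
        "INSERT INTO opsview_contact_services VALUES (?, ?)",
        [(1, 1), (1, 2), (1, 3), (1, 4), (2, 4), (2, 5)],
    )
    conn.commit()
    return conn


def rebuild_summary(conn):
    """Regenerate the (hypothetical) helper table; run once after a reload."""
    conn.executescript(
        """
        DROP TABLE IF EXISTS contact_hostgroup_summary;
        CREATE TABLE contact_hostgroup_summary AS
        SELECT cs.contact_id,
               s.hostgroup,
               SUM(s.handled)     AS handled,
               SUM(1 - s.handled) AS unhandled
        FROM opsview_contact_services AS cs
        JOIN service_status AS s ON s.service_id = cs.service_id
        GROUP BY cs.contact_id, s.hostgroup;
        """
    )


def summary_for(conn, contact_id):
    """Cheap per-request read: only rows this contact is allowed to see."""
    rows = conn.execute(
        "SELECT hostgroup, handled, unhandled "
        "FROM contact_hostgroup_summary "
        "WHERE contact_id = ? ORDER BY hostgroup",
        (contact_id,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    rebuild_summary(conn)
    for contact_id in (1, 2):
        print(contact_id, summary_for(conn, contact_id))
```

Because the summary is keyed by contact, the access check and the aggregation are both paid for at generation time, which is the trade-off described above: slower reloads in exchange for much faster views.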