IT Operations Monitoring Best Practices to Ensure Operational Efficiency

In a recent post, we discussed the concerns and challenges that infrastructure and operations (I&O) teams face when they consider AIOps solutions to address their IT infrastructure management needs. In this entry, we’ll look into IT operations monitoring best practices and related considerations when implementing AIOps and other emerging technologies designed to drive operational efficiencies.

The goal of AIOps is to deliver actionable insights that, when combined with automation, provide continuous availability and superior performance of business services. With that in mind, it’s no surprise that at the heart of all this is the need to effectively monitor your complex infrastructure and operational domains. Committing to IT operations monitoring best practices will help your team deliver the business services that your customers and employees depend on.

The IT Monitoring Landscape

For perspective, let’s look at the IT operations monitoring best practices and operations management tools available to you today. Most IT operations teams already have multiple initiatives in play, for example:

  • IT infrastructure monitoring (ITIM), which focuses on understanding the availability of the elements that make up the infrastructure.
  • Network performance monitoring and diagnostics (NPMD), which provides historical, real-time and predictive views into the availability and performance of the network and the application traffic running on it.
  • Application performance monitoring (APM), which observes the performance and availability of software applications to maintain an expected level of service.
  • Digital experience monitoring (DEM), which quantifies and understands the true nature of the end user experience.

Each of these monitoring domains has different objectives, tools, processes and teams. Each collects and analyzes the datasets relevant to the specific operational efficiency needs it is designed to address.

Identifying Relevant Data

It’s not uncommon to have overlapping operations management tools with similar functionality across teams. Because each monitoring domain operates with its own tools and teams, information tends to be fragmented and siloed across the organization.

As portrayed in the figure below, for an AIOps solution to be effective, there is a need to capture selective information from across the monitoring landscape to achieve the primary objective: gaining insight into the behavior and performance of the infrastructure.

There is also no shortage of information. On the contrary, the volume of data is overwhelming, and not all of it is useful.

So, what exactly constitutes an effective, best-practice monitoring approach in such a situation? Let’s look at it from the top down:

  1. It must facilitate AIOps and span the separate data-gathering needs of IT operations management software focused on different domains, such as IT infrastructure and application performance.
  2. To avoid getting swamped, it must focus on the targeted collection of data.
  3. It must be defined and designed around a data-driven strategy (DDS), which becomes essential in the AIOps era.

Figure: Using AIOps for IT operations monitoring best practices (Gartner: IT initiatives that comprise AIOps).

Adopt a Data-Driven Strategy

Your DDS consists of a data management strategy that is focused on what data to collect and how to enrich, store and disseminate it. It should be developed in parallel with a use case strategy that describes how this data can be used to provide operational efficiency and value to the business.

To leverage DDS as one of your core IT operations monitoring best practices, the focus needs to be not just on the gathering of the data, but also on the actual use of this information. Every piece of information needs to be viewed not as a data type alone, but as a data type paired with the purpose it serves.

Gartner has defined different kinds of data that can provide different kinds of insights, for example:

  • Log/event data builds awareness of local device changes and events
  • Metric data collects performance and usage over time
  • Flow data provides metadata on end-to-end conversations in the network
  • Packet data supplies full conversation content for diagnostics and security audits
  • Configuration data captures device configuration and change management
  • Forwarding, routing and path data examines traffic flow between endpoints

According to Gartner, some of this data is historical and some is collected in real-time. There are also passive and active approaches to sourcing the data.
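
As a rough illustration of treating each piece of information as a data type paired with its purpose, the kinds of data listed above can be written down as a small catalog. This is only a sketch: the class names, the describe helper and the example source name are hypothetical and are not part of any Gartner or Optanix API. The purpose strings are paraphrased from the list above.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    """The kinds of monitoring data named in the list above."""
    LOG_EVENT = "log/event"
    METRIC = "metric"
    FLOW = "flow"
    PACKET = "packet"
    CONFIGURATION = "configuration"
    PATH = "forwarding, routing and path"


# Purpose of each data type, paraphrased from the list above.
PURPOSE = {
    DataType.LOG_EVENT: "awareness of local device changes and events",
    DataType.METRIC: "performance and usage over time",
    DataType.FLOW: "metadata on end-to-end conversations in the network",
    DataType.PACKET: "full conversation content for diagnostics and security audits",
    DataType.CONFIGURATION: "device configuration and change management",
    DataType.PATH: "traffic flow between endpoints",
}


@dataclass(frozen=True)
class DataSource:
    """A monitored source (hypothetical), always tied to a data type."""
    name: str
    data_type: DataType

    @property
    def purpose(self) -> str:
        # The purpose is looked up from the type, so a source can never
        # be registered without a reason for collecting it.
        return PURPOSE[self.data_type]


def describe(source: DataSource) -> str:
    """Render a source as data type plus purpose."""
    return f"{source.name}: {source.data_type.value} data, collected for {source.purpose}"


if __name__ == "__main__":
    # "core-router-syslog" is an invented example name.
    print(describe(DataSource("core-router-syslog", DataType.LOG_EVENT)))
```

Keeping the purpose next to the type makes it harder to add a new feed without first stating what it is for.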

Evolve Your Use Cases

Of course, once the data is collected, there are techniques such as aggregation, enrichment and analysis that help glean insights. But all this is secondary to determining what data to collect in the first place. How can you do this?

This is where the use case component of the DDS comes into play: your first step is to identify business use cases. These use cases define the kinds of monitoring data that need to be collected to drive desired outcomes.
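
One way to picture use cases driving collection is to derive the collection plan from the use cases themselves, then flag any data already being collected that no use case calls for. The sketch below is a minimal illustration under assumed inputs: the use case names, their data type assignments and both function names are invented for this example and would differ in any real environment.

```python
# Hypothetical business use cases mapped to the data types they rely on.
USE_CASES = {
    "detect service degradation": {"metric", "log/event"},
    "troubleshoot slow application traffic": {"flow", "packet"},
}


def collection_plan(use_cases):
    """Return each required data type with the use cases that justify it."""
    plan = {}
    for use_case, data_types in use_cases.items():
        for data_type in data_types:
            plan.setdefault(data_type, []).append(use_case)
    # Sort for stable, readable output.
    return {data_type: sorted(cases) for data_type, cases in sorted(plan.items())}


def unjustified(collected, use_cases):
    """Return collected data types that no use case requires."""
    needed = set().union(*use_cases.values())
    return sorted(set(collected) - needed)


if __name__ == "__main__":
    for data_type, cases in collection_plan(USE_CASES).items():
        print(f"{data_type}: needed for {', '.join(cases)}")
    # Data that is being gathered today without a stated purpose.
    print(unjustified(["metric", "flow", "configuration"], USE_CASES))
```

In this sketch, anything returned by unjustified is a candidate either for a new use case or for dropping from collection.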

AIOps has immense potential to transform IT operations. However, a best practices-based data-driven strategy is absolutely essential to define the monitoring aspects. To achieve operational efficiency, DDS must be incorporated into the overall IT operations management framework.

Without a DDS in place, AIOps initiatives run the risk of being overwhelmed by a whirlwind of data, with no clear purpose for either its collection or its use.

For more information about IT operations management and AIOps, visit the Optanix Platform.

1. Gartner, 19 November 2018, “Use AIOps for a Data-Driven Approach to Improve Insights From IT Operations Monitoring Tools.” ID: G00374388.
2. Gartner, 6 December 2018, “Rethink Network Monitoring for the Cloud Era.” ID: G00371293.
