Application Performance Management - Data Center vs Cloud Computing

How Cloud Eases the Pitfalls of Traditional Data Center APM

Traditional Application Performance Management
Application systems that are built to provide business capabilities for organizations need to satisfy two sets of criteria:

  • Functional Requirements so that the system meets the desired business functionalities
  • Non-functional requirements in terms of Quality of Service (QoS), with the thrust on
    • Responsiveness of the application to normal business scenarios
    • Responsiveness of the application to Abnormal Stress conditions
    • System Availability 24X7

Over the years organizations have invested heavily in homegrown or commercial off-the-shelf (COTS) products to manage the second aspect, i.e., the performance of the application.

Application Performance Management (APM) is the process of using IT tools to detect, diagnose, report and remedy an application's performance problems so that the application meets the non-functional needs of its end users.

For applications deployed in data centers, the so-called 'On Premise' applications, the following strategies have been adopted toward performance management.

Reactive Monitoring
Reactive Monitoring is the analysis and corrective action taken after the event that caused the issue has happened. For example, most databases have tools to monitor the usage of memory, CPU and disk I/O, and countermeasures can be applied if memory usage exceeds a certain percentage of the overall physical memory.

Profilers that track memory leaks, inefficient SQL, CPU-intensive batch jobs and heavy disk I/O operations are typical of the tools used for reactive monitoring.

Performance management in Reactive Monitoring generally happens through automated scripts coupled with manual administrator activities. For example:

  • Space conditions of the database systems are mitigated by DBAs extending the table spaces to a higher space limit
  • Memory conditions can be mitigated either by restarting the application server, or the load balancer routing the requests to a standby server
  • Instance failures can be mitigated with a clustering mechanism such as MSCS (Microsoft Cluster Service)
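
As a rough illustration of the reactive pattern described above, the following sketch checks a memory sample against a threshold and picks a countermeasure only after the threshold has been crossed. The names `MemorySample`, `memory_alert` and `choose_countermeasure`, and the 85% threshold, are assumptions made for this example and do not come from any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class MemorySample:
    used_mb: float   # memory currently in use on the host
    total_mb: float  # overall physical memory on the host


def memory_alert(sample: MemorySample, threshold_pct: float = 85.0) -> bool:
    """Return True when usage has already crossed the threshold (reactive check)."""
    used_pct = 100.0 * sample.used_mb / sample.total_mb
    return used_pct > threshold_pct


def choose_countermeasure(sample: MemorySample,
                          threshold_pct: float = 85.0,
                          standby_available: bool = True) -> str:
    """Pick one of the mitigations mentioned in the text (labels are hypothetical)."""
    if not memory_alert(sample, threshold_pct):
        return "none"
    # Prefer routing traffic to a standby server; otherwise restart the server.
    return "route_to_standby" if standby_available else "restart_app_server"


if __name__ == "__main__":
    print(choose_countermeasure(MemorySample(used_mb=7000, total_mb=8000)))
```

In practice the returned label would be handed to an automated script or raised as an alert for an administrator, which is the manual step that makes this style of monitoring reactive.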

Proactive Monitoring

Proactive monitoring consists of monitoring the areas that could potentially cause problems in the near future and taking corrective action automatically, so that the problem is resolved without manual intervention well before it occurs.

Proactive Monitoring goes hand in hand with the self-tuning capabilities of application components such as databases. These typically include:

  • Automated scripts that look for fragmentation of table spaces and correct it with another automated job that reorganizes the table spaces
  • Automatic storage extension features that grow database storage until the disk runs out of space
  • Automatic garbage collection as part of application servers to free up the memory of unused objects, typically used for Java or .NET-based applications
  • Vendor-specific initiatives toward self-healing databases
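
One simple way to picture the proactive approach is a trend check that estimates when storage will run out and triggers an extension ahead of time. The sketch below uses a plain linear projection over daily usage samples; the function names and the 14-day lead time are assumptions for illustration, and real self-tuning features are considerably more sophisticated.

```python
def days_until_full(daily_used_gb, capacity_gb):
    """Estimate days until capacity is reached, assuming linear growth."""
    if len(daily_used_gb) < 2:
        raise ValueError("need at least two daily samples")
    # Average growth per day between the first and the last sample.
    growth_per_day = (daily_used_gb[-1] - daily_used_gb[0]) / (len(daily_used_gb) - 1)
    if growth_per_day <= 0:
        return None  # usage is flat or shrinking, so no projected exhaustion
    return (capacity_gb - daily_used_gb[-1]) / growth_per_day


def should_extend(daily_used_gb, capacity_gb, lead_time_days=14):
    """True when projected exhaustion falls within the lead time."""
    remaining = days_until_full(daily_used_gb, capacity_gb)
    return remaining is not None and remaining <= lead_time_days


if __name__ == "__main__":
    samples = [100, 110, 120, 130]  # hypothetical daily usage in GB
    print(days_until_full(samples, capacity_gb=200))  # 7.0
    print(should_extend(samples, capacity_gb=200))    # True
```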

End-to-End Monitoring
The performance of complex, multi-tiered applications is impacted by many interconnected factors, including system resources, database and application architecture, the efficiency of application code, and network infrastructure. As a result of these interdependencies, symptoms of performance problems usually appear at one or more tiers. Typical approaches of isolating the root cause involve time-consuming, manual analysis of performance metrics reported by disparate tools often owned by different IT teams. When problems occur, remediation can be a complex, lengthy, and costly process.

The responsiveness of an application is typically the combined result of multiple layers, so an end-to-end tracking tool is important for identifying bottlenecks. These tools baseline the application's performance at each individual layer, so that the impact of any one layer on overall performance can be assessed.
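
The baselining idea can be sketched as a comparison of observed per-layer latency against a recorded baseline. The layer names, the latency figures and the 1.5x tolerance below are hypothetical; commercial end-to-end tools collect these measurements automatically and use richer statistics.

```python
def find_bottlenecks(baseline_ms, observed_ms, tolerance=1.5):
    """Return (layer, ratio) pairs where observed latency exceeds
    tolerance x baseline, ordered from worst to least severe."""
    slow = []
    for layer, base in baseline_ms.items():
        ratio = observed_ms[layer] / base
        if ratio > tolerance:
            slow.append((layer, ratio))
    slow.sort(key=lambda item: item[1], reverse=True)
    return slow


if __name__ == "__main__":
    # Hypothetical per-layer latencies in milliseconds.
    baseline = {"web": 20, "app": 50, "database": 30, "network": 10}
    observed = {"web": 22, "app": 60, "database": 120, "network": 11}
    print(find_bottlenecks(baseline, observed))  # [('database', 4.0)]
```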

SOA and Business Activity Monitoring
Service-oriented architecture (SOA) is an architectural framework for software design that views an application as a collection of services that are meaningful to the underlying business and can be invoked over the network in a technology-independent manner. It has brought in a new form of performance monitoring termed Business Activity Monitoring (BAM).

Business Activity Monitoring (BAM) is a collection of tools that allow you to manage aggregations, alerts, and profiles to monitor relevant business metrics (called Key Performance Indicators or KPIs). It gives you end-to-end visibility into your business processes, providing accurate information about the status and results of various operations, processes, and transactions so you can address problem areas and resolve issues within your business. It provides visibility into existing SOA, BPM and EDA investments, and third-party infrastructure such as databases, JMS servers, and web services.

Performance Monitoring & Management Tools

  • Oracle Enterprise Manager
  • IBM Tivoli Monitoring and Performance Management Tools
  • HP BTO Software
  • BMC ProactiveNet Performance Management Tools
  • Java Profilers like OptimizeIT, JProbe
  • Microsoft monitoring tools for Windows Server, such as SCOM (System Center Operations Manager)
  • Compuware's suite of tools for APM
  • APM Tools from Opnet

Issues with Traditional APM & How Cloud Platform Mitigates Them
While traditional data center or 'On Premise' application performance management has been proven over the years with vendor tools and custom-written scripts, it adapts poorly to bursts in system usage. Organizations must therefore either invest heavily in fixed-capacity hardware to handle peak volumes or risk losing business because they cannot scale up. The following sections expand on this point and on how Cloud adoption mitigates the risk.

Applications are typically subject to variable load patterns. Even with proactive monitoring, an application can always meet unpredictable loads that the current infrastructure cannot handle. In traditional data center scenarios the application is then either shut down or its performance degrades, resulting in lost business or reduced customer satisfaction.

The following are standard application load patterns.

  • Periodicity-based loads, where applications have heavy traffic during work hours and reduced traffic during non-business hours
  • Seasonal predicted bursting, whereby specific seasons bring a high load, such as the Christmas holiday season for retail websites
  • Unpredictable bursting, where the application load suddenly increases for reasons that are beyond control or cannot be foreseen

While traditional APM is reasonably good at handling the first two patterns, current APM mechanisms cannot gracefully handle the third (unpredictable bursting), and even the first two are handled at an increased cost.
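
The cost point can be made concrete with a small calculation that compares provisioning for the daily peak around the clock against paying only for the servers each hour actually needs. All figures below (the hourly load profile and the 100 requests per second handled by one server) are invented for illustration.

```python
import math


def servers_needed(load_rps, capacity_rps_per_server):
    """Servers required for a given load, keeping at least one running."""
    return max(1, math.ceil(load_rps / capacity_rps_per_server))


def fixed_capacity_hours(hourly_load, capacity_rps_per_server):
    """Server-hours when hardware is sized for the peak and always on."""
    peak = max(servers_needed(l, capacity_rps_per_server) for l in hourly_load)
    return peak * len(hourly_load)


def elastic_hours(hourly_load, capacity_rps_per_server):
    """Server-hours when capacity follows the load hour by hour."""
    return sum(servers_needed(l, capacity_rps_per_server) for l in hourly_load)


if __name__ == "__main__":
    # Hypothetical periodic day: quiet night, busy work hours, quiet evening.
    load = [100] * 8 + [900] * 10 + [100] * 6
    print(fixed_capacity_hours(load, 100))  # 216
    print(elastic_hours(load, 100))         # 104
```

Under these assumed numbers the fixed-capacity approach pays for roughly twice the server-hours, and it still offers no headroom for an unpredictable burst above the planned peak.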

Enterprises have also moved toward platform and storage virtualization, which brings additional challenges to traditional APM because of the way the virtualization architecture is laid out. Virtual machines can be architected in multiple ways, and the choice affects how application performance is monitored and managed.

  • Type 1 Hypervisor: Type 1 (or native, bare-metal) hypervisors run directly on the host's hardware to control the hardware and to monitor guest operating systems. A guest operating system thus runs at another level above the hypervisor.
  • Type 2 Hypervisor: Type 2 hypervisors run within a conventional operating system. With the hypervisor layer as a distinct second software level, guest operating systems run at the third level above the hardware.

Cloud & On Demand Infrastructure
Cloud Computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction.

The major attributes are:

  • Dynamic Computing Infrastructure
  • Elasticity & Pay Per Use
  • Self Managed Platform
  • Automation

These attributes make a Cloud Platform well suited to addressing the current issues in implementing an APM solution for data center or 'On Premise' applications.

APM on Windows Azure
The Windows Azure platform is poised to radically change the way Microsoft architects and developers think about building and managing applications. The Windows Azure platform provides an Internet-based cloud computing environment for running applications and storing data in Microsoft data centers around the world. In many ways, you can think of it as Windows in the cloud.

The Windows Azure Platform supports the concept of elastic scale through a pricing model that is based on hourly compute increments. By changing the service configuration, either by editing it in the portal or through the Management API, customers can adjust the amount of capacity they are running on the fly.

Windows Azure Platform provides APIs as Web Services, to perform APM activities like:

  • Gathering metrics
  • Setting and evaluating business rules
  • Taking action to dynamically scale the application

In Windows Azure, Auto Scaling is achieved by changing the instance count in the service configuration. Increasing the instance count causes Windows Azure to start new instances; decreasing it causes Windows Azure to shut instances down.

The scaling action itself is exposed through the Management API. A call to this API changes the instance count in the service configuration and, in doing so, the number of running instances.
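
The rule-evaluation step that sits between gathering metrics and taking action might look like the sketch below. It only computes the instance count that a scaling rule would request; the thresholds, the limits and the function name `target_instance_count` are assumptions for this example, and the actual call to the Management API is deliberately left out.

```python
def target_instance_count(current, avg_cpu_pct,
                          scale_out_at=75.0, scale_in_at=30.0,
                          min_instances=2, max_instances=10):
    """Evaluate a simple CPU rule and return the instance count to request."""
    if avg_cpu_pct > scale_out_at:
        desired = current + 1      # add capacity under sustained high load
    elif avg_cpu_pct < scale_in_at:
        desired = current - 1      # release capacity when load is low
    else:
        desired = current          # within the comfortable band, no change
    # Never go below the minimum or above the maximum allowed count.
    return max(min_instances, min(max_instances, desired))


if __name__ == "__main__":
    print(target_instance_count(current=3, avg_cpu_pct=90))  # 4
    print(target_instance_count(current=3, avg_cpu_pct=10))  # 2
```

The returned number is what a scaling component would then write into the instance count of the service configuration.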

APM On Amazon EC2
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers.

Amazon EC2 provides out of the box support for Elasticity and Auto Scaling with multiple components:

  • Amazon CloudWatch
    Amazon CloudWatch is a web service that provides monitoring for AWS cloud resources, starting with Amazon EC2. It provides you with visibility into resource utilization, operational performance, and overall demand patterns, including metrics such as CPU utilization, disk reads and writes, and network traffic.
  • Auto Scaling
    Auto Scaling allows you to automatically scale your Amazon EC2 capacity up or down according to conditions you define. With Auto Scaling, you can ensure that the number of Amazon EC2 instances you're using scales up seamlessly during demand spikes to maintain performance, and scales down automatically during demand lulls to minimize costs. Auto Scaling is particularly well suited for applications that experience hourly, daily, or weekly variability in usage. Auto Scaling is enabled by Amazon CloudWatch.
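
The interplay between monitoring and scaling conditions can be illustrated with a small alarm check that fires only when a metric stays above a threshold for several consecutive periods, which avoids reacting to a single spike. This is a simplified model written for this article's discussion, not the CloudWatch or Auto Scaling API; the function name and parameters are assumptions.

```python
def alarm_fires(datapoints, threshold, evaluation_periods):
    """True when the most recent `evaluation_periods` datapoints
    all exceed the threshold."""
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    if len(datapoints) < evaluation_periods:
        return False  # not enough data to decide yet
    recent = datapoints[-evaluation_periods:]
    return all(value > threshold for value in recent)


if __name__ == "__main__":
    cpu = [40, 85, 90, 95]  # hypothetical CPU utilization per period, in percent
    print(alarm_fires(cpu, threshold=80, evaluation_periods=3))  # True
    print(alarm_fires([40, 95, 60, 95], threshold=80, evaluation_periods=2))  # False
```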

Future of APM in cloud & Demand for Skills
Auto Scaling is evidently the foremost Cloud Platform feature addressing the need to reduce costs for large enterprises, which have long spent heavily on Application Performance Management with the sole aim of running the business without interruption.

While major Cloud platforms provide support for APM, there is great demand in this space for more automation and for graphical tools that monitor performance and dynamically scale applications. Developers who master Cloud provider APIs such as those from Amazon EC2, Windows Azure and Google App Engine can apply that skill to writing new APM applications for the Cloud.

Enterprises should concentrate on utilizing the potential of Cloud for improving the end user experience without

More Stories By Srinivasan Sundara Rajan

Highly passionate about utilizing Digital Technologies to enable next generation enterprise. Believes in enterprise transformation through the Natives (Cloud Native & Mobile Native).