Ibidunmoye, O., Hernández-Rodriguez, F., Elmroth, E. (2015). Performance anomaly detection and bottleneck identification. ACM Comput. Surv. 48, 1, Article 4 (July 2015), 35 pages.
Introduction
The article chosen is “Performance Anomaly Detection and Bottleneck Identification,” in which Ibidunmoye, Hernández-Rodriguez, and Elmroth survey the various methods being developed to detect anomalies and identify the bottlenecks that cause performance issues within operating systems. The article highlights how bottlenecks and anomalies often create problems within distributed systems and how prior PADBI solutions can be built upon to improve detection methods.
Article Summary
As identified within the article, some of the challenges that can be faced are: Dynamic Dependency, in which cascading bottlenecks within large data centers can be hard to detect [Wang et al. 2013b]; Dynamic Anomaly Characteristics, in which defining a priori all possible behaviors (normal or anomalous) of an application is technically unrealistic [Lan et al. 2010]; and Nature of Data, in which the data is difficult to consume in a uniform manner [Lan et al. 2010]. Thus the objective of the research in the article is to understand how systems can address both performance anomaly detection and bottleneck identification (PADBI). According to the authors, the literature presents methods and strategies from different fields that are commonly used in approaching PADBI systems. Most of the proposed methodology relies on models of the correlation between workload changes and performance, which can be used to predict performance problems. In the early 2000s, as applications slowly emerged and server infrastructures started to grow, techniques were being proposed to improve the detection of bottlenecks and anomalies. By the mid-2000s, analytical techniques were being applied and were considered a suitable method for detecting performance abnormalities [Chung et al. 2008; Agarwala et al. 2007]. Research is still being performed and expanded today because of the use of the cloud. Even though existing research contributions are dominated by reactive solutions, there is an increasing shift toward proactive approaches. Predictive anomaly and bottleneck detection offers better system reliability by raising just-in-time alerts in advance and detecting potential bottlenecks before a performance issue occurs [O. Ibidunmoye et