High Productivity Computing Systems Toolkit

A framework and toolkit that automates the detection of bottlenecks in application performance.

Date Posted: November 6, 2008

Update: December 11, 2008. A new version is available for more platforms: AIX/POWER (aix_pwr), Linux/ppc970 (perfctr_ppc970), Linux/POWER5 (perfctr_pwr5), and Linux/POWER5+ (perfctr_pwr5p).

What is High Productivity Computing Systems Toolkit?

High Productivity Computing Systems (HPCS) Toolkit automates performance analysis and tuning of applications. The toolkit includes a Bottleneck Detection Engine (BDE); a Solution Determination Engine (SDE) is being developed for inclusion in a future version.

The BDE analyzes how execution time is distributed across the application and discovers performance bottlenecks by matching that distribution against a given set of bottleneck definitions. The user can also query the performance of an execution to identify problems by specifying customized bottleneck rules. An example of such a bottleneck is a code segment whose execution time exceeds some threshold.
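The threshold example can be illustrated with a minimal Python sketch. It is not the toolkit's interface or rule grammar: the `Segment` type, the `find_time_bottlenecks` function, the segment names, and the timings are all hypothetical, chosen only to show what a rule of the form "a code segment's share of execution time exceeds a threshold" might look like.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical record: time attributed to one source-level code segment."""
    name: str        # source-level label, such as a function or loop
    seconds: float   # execution time attributed to the segment


def find_time_bottlenecks(segments, threshold_fraction):
    """Return (name, fraction) pairs for segments whose share of the total
    time exceeds threshold_fraction, largest share first."""
    total = sum(s.seconds for s in segments)
    if total <= 0:
        return []
    flagged = [
        (s.name, s.seconds / total)
        for s in segments
        if s.seconds / total > threshold_fraction
    ]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Made-up timings for three segments of an imaginary application.
profile = [
    Segment("main", 2.0),
    Segment("solve_loop", 14.0),
    Segment("write_output", 4.0),
]

# Rule: flag any segment that accounts for more than half of the run time.
hot = find_time_bottlenecks(profile, threshold_fraction=0.5)
for name, fraction in hot:
    print(f"{name} takes {fraction:.0%} of execution time")
```

With these timings only `solve_loop` is flagged, because it accounts for 14 of the 20 seconds recorded.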

The framework includes a simple, programmable interface through which application developers can collect and query rich performance data from an execution, analyze performance by evaluating that data against bottleneck definitions, and plan improvements. The framework collects the performance data and interprets it for the user. All user-facing interactions are expressed in terms of the source program; the details of binary instrumentation are hidden. The framework reports the nature of each bottleneck in clear text so that the user can address it in the application.

The design of the framework is flexible and extensible so that it can be tailored to the actual application execution environment and performance tuning requirements. This alpha release includes tools for detecting bottlenecks in the CPU performance, MPI messaging, OpenMP threading, and file input/output of applications written in C or Fortran. The tools are provided as binaries along with documentation and examples of how to use them.

How does it work?

The BDE is a rule-based analysis framework that depends on being able to describe a bottleneck. Bottleneck "signatures" are stored in a database that the BDE consults. These signatures are fully programmable in an open grammar developed at IBM Research. The database can be used statically, where the user relies on pre-existing bottleneck signatures developed by expert users or the community. It can also be used dynamically, where the user develops, in real time, the "right questions" about the behavior of the application. This capability opens up a new way of analyzing and interrogating the performance of large-scale applications.
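The difference between static and dynamic use of a signature database can be sketched in Python. This is an illustration under stated assumptions, not the toolkit's database or the IBM Research grammar: the `SignatureDatabase` class, the signature names, the metric keys, and the numbers are all hypothetical, and signatures are modeled as plain Python predicates.

```python
class SignatureDatabase:
    """Hypothetical store of bottleneck signatures, each a named predicate
    over a dictionary of collected performance metrics."""

    def __init__(self):
        self._signatures = {}

    def add(self, name, predicate, description):
        """Register (or replace) a signature under the given name."""
        self._signatures[name] = (predicate, description)

    def evaluate(self, metrics):
        """Return a textual finding for every signature that matches."""
        findings = []
        for name, (predicate, description) in self._signatures.items():
            if predicate(metrics):
                findings.append(f"{name}: {description}")
        return findings


db = SignatureDatabase()

# Static use: signatures prepared in advance, for example by expert users.
db.add(
    "mpi_wait_heavy",
    lambda m: m["mpi_wait_seconds"] / m["total_seconds"] > 0.30,
    "more than 30% of run time is spent waiting in MPI calls",
)
db.add(
    "io_heavy",
    lambda m: m["io_seconds"] / m["total_seconds"] > 0.25,
    "more than 25% of run time is spent in file input/output",
)

# Made-up metrics for one execution.
metrics = {
    "total_seconds": 100.0,
    "mpi_wait_seconds": 40.0,
    "io_seconds": 10.0,
    "omp_idle_seconds": 5.0,
}

static_findings = db.evaluate(metrics)

# Dynamic use: the user adds a new question while analyzing the same run.
db.add(
    "omp_idle",
    lambda m: m["omp_idle_seconds"] > 2.0,
    "OpenMP threads are idle for more than 2 seconds",
)
dynamic_findings = db.evaluate(metrics)

for finding in dynamic_findings:
    print(finding)
```

Keeping each signature as data separate from the evaluation loop is what lets new questions be asked of the same collected metrics without rerunning the application.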

A rule-based methodology is essential to the operation of the entire framework: without a preconceived specification of what to look for, it is virtually impossible to find anything deterministically.

About the technology author(s)

David Klepacki, Ph.D., is a senior staff member of IBM Research and has more than 20 years of experience in high-performance computing. He has worked in a variety of technical areas within IBM, including high-performance processor design, numerically intensive computation, computational physics, parallel computing, and performance modeling. Dr. Klepacki currently manages the Advanced Computing Technology department at the IBM T. J. Watson Research Center.

I-Hsin Chung, Ph.D., is a research staff member at the IBM T. J. Watson Research Center. His experience includes designing and developing performance tools on IBM platforms such as the Blue Gene/L and Blue Gene/P systems and IBM® Power Systems™ on AIX® and Linux®.

Guojing Cong, Ph.D., is a research staff member at the IBM T. J. Watson Research Center. He is an expert in solving irregular combinatorial problems on parallel systems.

Seetharami R. Seelam, Ph.D., is a post-doctoral research staff member at the IBM T. J. Watson Research Center. While in the IBM Advanced Computing Technology department, he worked on performance analysis tools and technologies for AIX on the IBM System p™ platform and Blue Gene systems, as well as on next-generation technologies for automatic performance analysis, bottleneck detection, and solution determination.

Hui-Fang Wen is an advisory software engineer at the IBM T. J. Watson Research Center. She is a member of the IBM Advanced Computing Technology department, where she works on performance tools and GUI design.

Simone Sbaraglia, Ph.D., is a tenured professor in the Department of Mathematics and Computer Science at the University of Cagliari, Italy. While working with the Advanced Computing Technology department at the IBM T. J. Watson Research Center, he developed various technologies in memory modeling as well as automated bottleneck detection mechanisms for application analysis.
