Monday, December 11, 2017

IBM POWER9 breaks barriers that hamper AI solutions

By Bill Moran and Rich Ptak

On December 5, IBM announced POWER9, its newest Power System. The POWER9 name might be taken to imply that it is just another POWER8 iteration with a performance boost and a few new features thrown in. Not so. POWER9 is a significant generational advance, providing much more than a minor turn-of-the-crank. These next-generation Power Systems embed leading-edge technologies such as PCI-Express 4.0, next-gen NVIDIA NVLink 2.0 and OpenCAPI (more about these later). The new server, the AC922, is the base platform for CORAL, which is expected to be the world’s most powerful supercomputer.
With this announcement, IBM marks a major change of direction as it targets the compute-intensive supercomputing and AI workloads used for modeling, research, credit risk analysis, etc. These workloads require LOTS of memory, extremely high processing speeds and the ability to analyze vast amounts of data. We comment on this and its implications.

POWER9 enhancements

First, a description of the improvements over prior iterations. Skipping the “speeds and feeds”, here are a few important points.
· POWER9 chips use 14nm technology, a significant advance over the last generation’s 22nm. IBM no longer controls a chip foundry, having sold it to GlobalFoundries. However, the GlobalFoundries-IBM alliance is clearly working effectively and delivering products in a timely manner.
· POWER9 architectural changes yield many improvements. These include a new implementation of OpenCAPI 2.0 that delivers a major improvement in I/O capacity, increasing bandwidth by a factor of 4 over CAPI[1] in POWER8. Implementation of PCI-Express 4.0 and next-gen NVIDIA NVLink 2.0 means that data flows in and out of the system more quickly. Complex data analysis, simulations and model building/evaluations complete faster. Programming is simplified.
· POWER9 enhances the links between system CPUs and GPUs. Experience has proven that pairing GPU devices with the CPU can yield dramatic operational improvements. POWER9’s new links increase bandwidth by a factor of 7 to 10, which benefits data manipulation and analysis.
Connections between the GPUs and system memory are simplified and improved. Thus, AI models run faster and programming is simpler, which permits quicker creation and evaluation of larger, more complex models for AI, data analytics, research, etc. Learning times are also dramatically reduced.
These features and the new architecture make POWER9 a strong competitor, as it delivers performance improvements that are much needed in the AI and supercomputing market segments.

A new server

But chip-level and design-specification improvements tend to be of limited interest to many potential customers. They tend to evaluate a new chip or processor in the context of the product they will purchase. They want to know how they or their projects benefit from the AC922 POWER9 processor-based server.
Detailed specs for the AC922 server appear in IBM’s material. But IBM also provided some benchmark runs comparing the AC922 to an Intel x86 server. Two AI workloads, Caffe and Chainer[2], were run. For both workloads, the AC922 out-performed the x86 system by approximately 3.7 times. The x86 system is a standard environment. We expect Intel will be enhancing x86 with AI capabilities at some point.
We like the benchmarks that IBM ran. They effectively demonstrate the impact of system improvements in actual applications. Paper-and-pencil comparisons are fine, but nothing equals the actual performance a system delivers with a real workload. The initial air-cooled server will be followed in 2018 by a faster, water-cooled version. The air-cooled servers have a maximum of 4 GPUs; the follow-on water-cooled versions allow up to 6 GPUs.

Supercomputer Heaven

CORAL is a supercomputer that is being built for the US DOE with the Oak Ridge, Argonne and Livermore national labs. CORAL will be the most powerful supercomputer in the world when deployed in 2018. It is expected to deliver 10X the power of Titan, today’s supercomputer leader. Very impressively, the building block for CORAL is a standard AC922, now available for purchase. This gives ordinary customers the ability (if not the resources) to build their own version of a CORAL-type supercomputer around multiple AC922s. We believe many customers, e.g. weather bureaus, modeling researchers, etc., will be interested in constructing such systems. These deployments will verify the AC922’s increased operational and programming simplicity, versatility, robustness and scalability.

IBM’s new direction 

IBM has changed the direction of its Power System marketing. In the past, Power Systems were promoted as general-purpose Linux servers in direct competition with Intel servers. Intel dominated the distributed server market (albeit with Windows) for decades. Customers with Windows-based systems would have to convert from Windows to Linux to use Power. As such conversions are generally viewed as risky, customers were far more likely to just continue upgrading to the latest generation of Intel servers. Even customers installing Linux were more likely to do so on an Intel platform, supplied by HP or Dell. Thus, IBM Power Systems, despite significant advantages in processing performance, capacity and I/O handling, faced powerful resistance to change, which worked against achieving significant market penetration.
Now, with POWER9, IBM sees in the new area of AI an opportunity that plays directly to its architectural and performance advantages. POWER9 systems were designed to deliver maximum performance with AI workloads and models. They will still compete with Intel, but on a more level playing field in a rapidly growing and diverse market. Both companies will have to compete with very attractive Cloud offerings. IBM believes there is sufficient demand for on-premise computing to support a profitable business. Although many, if not most, Cloud servers are x86-based, IBM believes it can deliver a sufficient performance edge to justify keeping AI projects on-premise. Initial benchmarks suggest that it may be right, although maintaining that edge will remain a challenge.
We think that IBM has delivered a powerful new answer for anyone searching for a production AI platform. It has the right combination of hardware and software technology to succeed. It has other strengths, including OpenPOWER Foundation support and enhanced CAPI and GPU interfaces. This support has been critical in the creation of CORAL, as IBM acknowledges. IBM is covering key basics very well.
Finally, a significant messaging advantage that IBM has neglected to mention is the powerful boost that this new architecture and system provide to Watson. The Watson Marketing group is rightly and understandably focused on marketing segment-specific benefits and features.
IBM Watson lays claim to being the best AI solution system in the marketplace. Today, competition in the AI platform space is growing rapidly. Vendors large and small, many of them x86-based, are effectively competing against IBM. It appears to us that there is a powerful message in how the POWER9 System’s bespoke (for AI) IBM infrastructure meshes with, enables and drives IBM’s showcase AI application.
The POWER9 architecture represents a significant advance in its offering of AI-specific features, capabilities and performance enhancements. Combined with a solid existing ecosystem, it should increase the market penetration of Power Systems.

[1] CAPI itself was a very significant improvement; see “With Redis Labs, CAPI goes Mainstream, Big Time!”
[2] Caffe and Chainer are both open source deep learning frameworks.