
Copyright © National Academy of Sciences. All rights reserved.
The Future of Computing Performance:   Game Over or Next Level?
The key observation motivating a CMP design is that to increase
performance when the overall design is power-limited, each instruction
needs to be executed with less energy. The power consumed is the energy
per instruction times the performance (instructions per second). Examination
of Intel microprocessor-design data from the i486 to the Pentium 4,
for example, showed that power dissipation scales as performance raised
to the 1.73 power after technology improvements are factored out. If the
energy per instruction were constant, the relationship would be linear.
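The scaling relationship described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not a model taken from the Intel data: it assumes power is exactly proportional to performance raised to the 1.73 power, normalizes the baseline design to 1.0, and uses hypothetical function names (`relative_power`, `relative_energy_per_instruction`) chosen only for illustration.

```python
# Illustrative sketch: assumes power = k * performance**1.73 with the
# baseline design normalized to performance 1.0 and power 1.0.
# Function names are hypothetical, not from the source.

SCALING_EXPONENT = 1.73  # exponent reported for i486 through Pentium 4


def relative_power(speedup, exponent=SCALING_EXPONENT):
    """Power relative to the baseline for a given performance ratio."""
    return speedup ** exponent


def relative_energy_per_instruction(speedup, exponent=SCALING_EXPONENT):
    """Energy per instruction relative to the baseline.

    Power = (energy per instruction) * (instructions per second),
    so energy per instruction = power / performance.
    """
    return relative_power(speedup, exponent) / speedup


if __name__ == "__main__":
    for speedup in (1.0, 2.0, 6.0):
        power = relative_power(speedup)
        epi = relative_energy_per_instruction(speedup)
        print(f"speedup {speedup:4.1f}x: power {power:6.2f}x, "
              f"energy/instruction {epi:5.2f}x")
```

With an exponent of 1.0 the energy per instruction stays at 1.0 for any speedup, which is the linear case; any exponent above 1.0 means each instruction costs more energy as single-core performance rises.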
Thus, the Intel Pentium 4 is about 6 times faster than the i486 in the same 
Germany, June 19-23, 2004, pp. 2-13; Jung Ho Ahn, William J. Dally, Brucek Khailany, Ujval 
J. Kapasi, and Abhishek Das, 2004, Evaluating the Imagine stream architecture, Proceedings
of the 31st Annual International Symposium on Computer Architecture, Munich, Germany, 
June 19-23, 2004, pp. 14-25; Brucek Khailany, Ted Williams, Jim Lin, Eileen Peters Long,
Mark Rygh, Deforest W. Tovey, and William Dally, 2008, A programmable 512 GOPS stream
processor for signal, image, and video processing, IEEE Journal of Solid-State Circuits
43(1): 202-213; Christoforos Kozyrakis and David Patterson, 2002, Vector vs superscalar and 
VLIW architectures for embedded multimedia benchmarks, Proceedings of the 35th Annual 
ACM/IEEE International Symposium on Microarchitecture, Istanbul, Turkey, November
18-22, 2002, pp. 283-293; Luiz André Barroso, Kourosh Gharachorloo, Robert McNamara, 
Andreas Nowatzyk, Shaz Qadeer, Barton Sano, Scott Smith, Robert Stets, and Ben Verghese, 
2000, Piranha: A scalable architecture based on single-chip multiprocessing, Proceedings
of the 27th Annual International Symposium on Computer Architecture, Vancouver, British
Columbia, Canada, June 10-14, 2000, pp. 282-293; Poonacha Kongetira, Kathirgamar
Aingaran, and Kunle Olukotun, 2005, Niagara: A 32-way multithreaded SPARC processor,
IEEE Micro 25(2): 21-29; Dac C. Pham, Shigehiro Asano, Mark D. Bolliger, Michael N. Day, 
H. Peter Hofstee, Charles Johns, James A. Kahle, Atsushi Kameyama, John Keaty, Yoshio 
Masubuchi, Mack W. Riley, David Shippy, Daniel Stasiak, Masakazu Suzuoki, Michael F.
Wang, James Warnock, Stephen Weitzel, Dieter F. Wendel, Takeshi Yamazaki, and Kazuaki 
Yazawa, 2005, The design and implementation of a first-generation CELL processor, IEEE 
International Solid-State Circuits Conference Digest of Technical Papers, San Francisco, Cal., 
February 10, 2005, pp. 184-185; R. Kalla, B. Sinharoy, and J.M. Tendler, 2004, IBM POWER5 
chip: A dual-core multithreaded processor, IEEE Micro Magazine 24(2): 40-47; Toshinari
Takayanagi, Jinuk Luke Shin, Bruce Petrick, Jeffrey Su, and Ana Sonia Leon, 2004, A dual-
core 64b UltraSPARC microprocessor for dense server applications, IEEE International Solid-
State Circuits Conference Digest of Technical Papers, San Francisco, Cal., February 15-19, 
2004, pp. 58-59; Nabeel Sakran, Marcelo Uffe, Moty Mehelel, Jack Dowweck, Ernest Knoll, 
and Avi Kovacks, 2007, The implementation of the 65nm dual-core 64b Merom processor, 
IEEE International Solid-State Circuits Conference Digest of Technical Papers, San Francisco, 
Cal., February 11-15, 2007, pp. 106-107; Marc Tremblay and Shailender Chaudhry, 2008, A
third-generation 65nm 16-core 32-thread plus 32-scout-thread CMT SPARC processor, IEEE
International Solid-State Circuits Conference Digest of Technical Papers, San Francisco,
Cal., February 3-7, 2008, pp. 82-83; Larry Seiler, Doug Carmean, Eric Sprangle, Tom Forsyth,
Michael Abrash, Pradeep Dubey, Stephen Junkins, Adam Lake, Jeremy Sugerman, Robert 
Cavin, Roger Espasa, Ed Grochowski, Toni Juan, and Pat Hanrahan, 2008, Larrabee: A
many-core x86 architecture for visual computing, ACM Transactions on Graphics 27(3):
1-15; Doug Carmean, 2008, Larrabee: A many-core x86 architecture for visual computing, 
Hot Chips 20: A Symposium on High Performance Chips, Stanford, Cal., August 24-26, 2008.