
Internet applications) has been important in the steady increase in available parallelism for these sorts of applications.

A large and growing collection of applications lies between those extremes. In these applications, there is parallelism to be exploited, but it is not easy to extract: it is less regular and less structured in its spatial and temporal control and in its data-access and communication patterns. One might argue that these applications have been the focus of the high-performance computing (HPC) research community for many decades and thus are well understood with respect to those aspects that are amenable to parallel decomposition. The research community also knows that algorithms best suited for a serial machine (for example, quicksort, simplex, and gaston) differ from their counterparts that are best suited for parallel machines
BOX 5.1
React, But Don’t Overreact, to Parallelism 
As this report makes clear, software and hardware researchers and practitioners should address important concerns regarding parallelism. At such critical junctures, enthusiasm seems to dictate that all talents and resources be applied to the crisis at hand. Taking the longer view, however, a prudent research portfolio must include concomitant efforts to advance all systems aspects, lest they become tomorrow’s bottlenecks or crises.

For example, in the rush to innovate on chip multiprocessors (CMPs), it is tempting to ignore sequential core performance and to deploy many simple cores. That approach may prevail, but history and Amdahl’s law suggest caution. Three decades ago, a hot technology was vectors. Pioneering vector machines, such as the Control Data STAR-100 and Texas Instruments ASC, advanced vector technology without great concern for improving other aspects of computation. Seymour Cray, in contrast, designed the Cray-1¹ to have great vector performance as well as to be the world’s fastest scalar computer. Ultimately, his approach prevailed, and the early machines faded away.
Moreover, Amdahl’s law raises concern.² Amdahl’s limit argument assumed that a fraction, P, of software execution is infinitely parallelizable without overhead, whereas the remaining fraction, 1 - P, is totally sequential. From that assumption, it follows that the speedup with N cores (execution time on one core divided by execution time on N cores) is governed by 1/[(1 - P) + P/N]. Many learn that equation, but it is still instructive to consider its harsh numerical consequences. For N = 256 cores and a parallel fraction P = 99%, for example, the speedup is bounded by 72. Moreover, Gustafson³ made good “weak scaling” arguments for why some software will fare much better. Nevertheless, the committee is skeptical that most future software will avoid sequential bottlenecks. Even such a very parallel approach as MapReduce⁴ has near-sequential activity as the reduce phase draws to a close.
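The bound of 72 cited above can be checked directly; the display below is simply a worked instance of the formula in the box, using the same P and N:

\[
\frac{1}{(1 - P) + P/N} \;=\; \frac{1}{0.01 + 0.99/256} \;\approx\; \frac{1}{0.0139} \;\approx\; 72
\qquad \text{for } P = 0.99,\; N = 256 .
\]

Letting N grow without bound drives the P/N term to zero, so no number of cores can push the speedup past 1/(1 - P), or 100 for P = 99%.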
For those reasons, it is prudent to continue work on faster sequential cores,