(1) P(t) ~ (t0-t)^-nand
(2) P(t) ~ exp(exp((t-t0)/n))Function (1) is a time-honored model of faster than exponential growth. See the seminal 1960 Science paper by Heinz von Foerster, Patricia Mora, and Larry Amiot (vol. 132, pp. 1291-1295) in which they apply their socalled coalition growth model to world population growth. Double-exponential growth is proposed by Ray Kurzweil in his book and on his web site (The Law of Accelerating Returns). The main difference between hyperbolic and double-exponential growth is that, as t approaches t0, P(t) of function (1) grows beyond all bounds and a mathematical singularity is reached at t=t0. Function (2) also steepens faster than exponential, with decreasing doubling times, but continues to do so forever.
The von Foerster et al. model is based on the assumption that the population growth rate, dP/Pdt, is a function of the already achieved population size, dP/Pdt ~ f(P), and there obviously are a number of reasons one can think of, why this might be the case. Similarly, applying this argument to Moore's Law, the assumption is made that the growth rate of computing power is a function of the already available computing power. Different growth models arise, depending on the choice of function f: A power-law dependence, dP/Pdt ~ P^m, has the solution P(t) ~ (t0-t)^-n with n=1/m, i.e., function (1), above. In the special case, m=0, dP/Pdt ~ 1 leads to exponential growth, P(t) ~ exp(t). Assuming a logarithmic dependence, dP/Pdt ~ log(P), instead of a power law, the resulting time dependence is given by function (2), P(t) ~ exp(exp((t-t0)/n). An important distinction between model (1) and model (2) is that the growth law of model (1) is scale-invariant or self-similar, i.e., it has no inherent time scale. Note, that parameter n of model (1) is dimensionless, as opposed to the e-folding time, n, of model (2) which therefore is not scale-invariant.
From dlog(P)/dt = log(2)/T2, the doubling times, T2, may be calculated. In the case of model (1), T2 is proportional to the time left until the singularity is reached, while for model (2) the doubling time decreases exponentially with e-folding time n, resulting in a straight line in a plot of logarithmic doubling time over linear time. If n of model (2) approaches infinity, the limiting case of exponential growth with constant doubling times is reached. See Table 1, below, for an overview of the different functions.
Table 1
model differential equation time dependence doubling time ----------------------------------------------------------------------------------------------------- original Moore's Law dP/Pdt = constant P(t) ~ exp(t) T2 = constant (1) dP/Pdt ~ P^m P(t) ~ (t0-t)^-n; n=1/m T2 = log(2)/n * (t0-t) (2) dP/Pdt ~ log(P) P(t) ~ exp(exp((t-t0)/n)) T2 = n*log(2) * exp((t0-t)/n)
(1) P(t) = C * (t0-t)^-n (2) P(t) = C * exp(exp((t-t0)/n)with the free parameters C, t0, and n. Results are summarizes in Table 2 (best-fit values of scaling parameter C are omitted).
Table 2
model best-fit degrees of probability best-fit parameters doubling time
chi-square freedom (a) t0 [yrs] n [yrs] (b)
-------------------------------------------------------------------------------------
(1) 143.82 133 0.246 2041 26.5 (2041-t)/38.2
(2) 143.21 133 0.257 1747 68.2 [yrs] exp((2010-t)/68.2)
(a) probability of the model fitting the data
(b) calculated from best-fit parameters
Since most of the statistical weight is provided by the data points
after 1950, qualitatively similar results are obtained if one
restricts the analysis to electronic computers (vacuum-tube,
discrete transitor, and silicon chip based), neglecting the early
data points of mechanical and electro-mechanical computing devices.
An attempt to fit the data with a simple exponential, i.e., the classical Moore's Law with constant doubling times, resulted in a best-fit chi-square of 231.21 with 134 degrees of freedom, corresponding to a probability of 3.7e-7, thus excluding this model with very high statistical significance. Limiting the analysis to post-1950 data, constant doubling times are excluded with a statistical significance of 99.92 % (chi-square=161.62 with 109 degrees of freedom).
Statistical errors for the best-fit parameters t0 and n were determined by calculating the chi-square values for a grid of t0 and n values (fitting parameter C at each grid point) and taking the minimum chi-square + 4.61 and minimum chi-square + 9.21 (chi-square for 2 degrees of freedom, corresponding to two free parameters of interest) contours as delimiting the 90 % and 99 % confidence areas. The 90 % and 99 % confidence ranges for t0 and n are listed in Table 3.
Table 3
model t0 confidence range n confidence range
----------------------------------------------
(1) 90%: 2032-2052 90%: 22.9-31.3
99%: 2030-2058 99%: 21.8-34.0
(2) 90%: 1704-1782 90%: 60-78
99%: 1680-1794 99%: 57-83
Fig. 1 (model 1) and
Fig. 2 (model 2) display the range of
model curves corresponding to the 99 % confidence area of the model
parameters, plotted on top of the data
points and error bars. The corresponding doubling times are shown in
Fig. 3 (model 1) and
Fig. 4 (model 2).
Fig. 5 shows the 90 % and 99 % error
contours for model (1). Since parameter t0 and n are strongly
correlated, resulting in elongated (banana shaped) error contours, n
is expressed in terms of the largely model-independent 1980 doubling
time, T2_1980 = log(2)/n * (t0-1980), instead.
Are computing speeds really on track towards a singularity in the near future ? Since one can construct arbitrarily steep functions, which do not have a singularity (e.g. by going from exp(exp(t)) to exp(exp(exp(t))) and so on, adding an arbitrary number of additional exponential functions), it will not be possible to decide this question through model-fitting. However, it would still be possible that, as time proceeds, function (1) will turn out to be the only reasonably simple function fitting the data, thus favoring model (1) on the ground of invoking Occam's Razor. The self-similar growth of model (1), as well as scale-invariant power-laws in general, arises under a wide range of conditions in nature, thus perhaps also favoring model (1) over model (2). World population closely followed the von Foerster et al. model for millennia. It stopped to do so around 1975, presumably because population density had reached levels where the simple model assumptions no longer applied. Computing speeds may repeat this pattern, closely following model (1) until final physical limits or economic constraints force a break of the trend. Close inspection of the data shows a slight flattening of the curve after about 1998. It remains to be seen whether this is a temporary deviation from the long-term trend or a first indication of the end of faster than exponential growth.
If due to less steeply rising computing speeds, model (1) will cease to be an acceptable fit to the data, with the data perhaps following model (2), we would be saved from the embarrassment of near infinite computing speeds within mere decades. Figs. 1, 2, 3, and 4 suggest that it should be possible to distinguish between model (1) and (2) within the next several years, probably not later than 2010. I intend to update the above model fits as additional data become available in coming years. So, come back to this page to see how the story unfolds ...