SUMMARY

An accumulation of seismic moment data gathered over the previous decade justifies a new attempt at a comprehensive statistical analysis of these data: herein, more rigorous statistical techniques are introduced, their properties are investigated, and these methods are employed in the analysis of large modern data sets. Several theoretical distributions of earthquake size (seismic moment–frequency relations) are described and compared. We discuss the requirements for such distributions and introduce an upper bound, or 'corner moment', needed for a distribution to have a finite energy or moment flux. We derive expressions for the probability density functions and statistical moments of the distributions. We also describe parameter evaluation, in particular how to estimate the seismic moment distribution for the largest earthquakes. Simulating earthquake size distributions allows for a more rigorous evaluation of distribution parameters and points to the limitations of the classical statistical analysis of earthquake data. Simulations suggest that several earthquakes approaching or exceeding the corner magnitude (mc) limit need to be registered to evaluate mc with reasonable accuracy. Using the Harvard catalogue data, we compare moment distribution parameters for various temporal spans of the catalogue, for different tectonic provinces and depth ranges, and for earthquakes with various focal mechanisms. The statistical analysis suggests that the exponent β is universal (β = 0.60–0.65) for all moderate earthquakes. The corner moment (Mc) value, determined by the maximum-likelihood method, both in subduction zones and globally, is about 10^21 N m, corresponding to the corner moment magnitude mc ≈ 8.0. For mid-oceanic earthquakes, mc is apparently smaller: for spreading ridges it is about 5.8, and for strike-slip earthquakes on transform faults it decreases from 7.2 to 6.5 as the relative slip velocity of the faults increases.
We investigate the seismic moment errors, both random and systematic, and their dependence on earthquake size. The relative errors appear to decrease for larger events. The influence of moment uncertainties on the parameter estimates is studied. Whereas the β values do not appear to be significantly influenced by the errors, large errors can lead to substantially biased estimates of the corner moment. We compare the Harvard catalogue results with the earthquake data from instrumental catalogues covering the first three-quarters of the 20th century. Several very large earthquakes (m ≥ 9) occurred around the middle of the century. Their magnitude values cannot be fitted by a modified Gutenberg–Richter law with mc = 8.0–8.5. Among other factors, this discrepancy can be explained either by substantially higher errors in the earlier magnitude values, or by mc being higher for some subduction zones. It is unlikely that the data available now or in the near future will be sufficient to determine a corner magnitude of 9 and above, with reasonable precision, using purely statistical methods.

1 Introduction

In this paper we consider the earthquake size distribution, and statistically analyse the distribution of the scalar seismic moment for earthquakes in different tectonic environments and temporal periods. In a previous paper (Kagan 1991) we proposed several theoretical probability formulae that can be used to model the seismic moment distribution. Empirical data available at that time were fitted to these distributions and parameter values were estimated. Now, 10 years later, modern earthquake catalogues of seismic moment solutions have expanded greatly; thus we can substantially extend our study of the earthquake size distribution. The seismic moment distribution has recently been the subject of investigation by many researchers (Utsu 1999; Godano & Pingue 2000; Main 2000; Leonard et al. 2001). These studies confirm the universality of the scaling parameter (β) for the moment–frequency relation.

One major reason to study the seismic moment distribution is to compare it with estimates of tectonic deformation and to draw appropriate conclusions. The development of space geodesy and quantitative plate tectonics has provided new data for such a comparison. Kagan (1997, 1999) inferred the parameter values of the earthquake size distribution from this relation and suggested that the maximum earthquake size also has a universal distribution within the continents and their boundaries.

The size distribution for small and moderate earthquakes is known: their sizes follow the Gutenberg–Richter (G-R) relation. A similar conclusion is valid for moderate and large earthquakes distributed over large seismogenic regions (Utsu 1999). However, the major part of tectonic strain is released by the largest earthquakes, the distribution parameters for which are still little known.

We use the notation M for the scalar seismic moment and m for the moment magnitude:

m = (2/3)(lg M − 9.0),    (1)

where lg=log10 and M is measured in newton metre (N m). The magnitude calculated using eq. (1) is used here only for the purposes of illustration; all pertinent computations are carried out with the moment M-values.
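For illustration, the conversion of eq. (1) can be sketched in a few lines of Python; the additive constant 9.0 is the convention adopted here and should be treated as an assumption of this sketch, since some agencies use 9.1 instead:

```python
import math

def moment_magnitude(M, c=9.0):
    """Moment magnitude m from scalar seismic moment M in N m, eq. (1)."""
    return (2.0 / 3.0) * (math.log10(M) - c)

def scalar_moment(m, c=9.0):
    """Scalar seismic moment in N m from moment magnitude m (inverse of eq. 1)."""
    return 10.0 ** (1.5 * m + c)
```

For example, a corner moment Mc of about 10^21 N m corresponds to mc = (2/3)(21 − 9.0) = 8.0, the value quoted in the Summary for subduction zones.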

This paper describes theoretical statistical distributions of scalar seismic moment and applies these distributions in order to estimate distribution parameters from earthquake data. In a companion paper (Kagan 2002) we discuss the seismic moment conservation principle and draw conclusions by comparing the tectonic moment rate and seismicity parameters.

In Section 2 we summarize expressions for the theoretical statistical distributions that are either most often used in practical work, or are important as the distributions best fitting the available data. We also discuss the general requirements for such distributions (Section 2.2.1). In Sections 3 and 4 statistical techniques for parameter evaluation are considered. Since the standard statistical methods are not effective in estimating the upper bound for the seismic moment, we employ simulations (Section 4) to infer the properties of the estimates. The statistical techniques described in the previous sections are applied in Section 5 to the analysis of seismic data, primarily the Harvard centroid moment tensor (CMT) catalogue (Dziewonski et al. 2001). We investigate the behaviour of the distribution parameters by comparing different tectonic regions, time spans and earthquake focal mechanisms. We also analyse errors in earthquake size determination and the influence of these errors on the parameter estimates. Finally, in Section 5 we discuss possible reasons for the discrepancy between the Harvard data and earlier instrumental catalogues. In a short Discussion section we present arguments for the universal nature of the proposed moment–frequency relation and offer possible explanations for the many results, based on magnitude distribution analysis, that seem to contradict this conclusion.

2 Theoretical distributions

2.1 Gutenberg–Richter relation

The earthquake size distribution is usually described using the G-R (Gutenberg & Richter 1941, 1944) magnitude–frequency relation

lg N(m) = a − bm,    (2)

where N(m) is the number of earthquakes with magnitude ≥ m, and a and b are parameters: a characterizes the seismic activity or earthquake productivity of a region, and the b parameter describes the relation between small and large earthquake numbers, b ≈ 1. Expression (2) was proposed in the above functional form by Vilfredo Pareto (1897, p. 305, his eq. 1) for the financial income distribution.

Given the limited sensitivity of seismographic networks, small earthquakes are not completely sampled in earthquake catalogues. This makes it necessary to introduce a catalogue completeness threshold (observational cut-off) mt and truncate the distribution from the left:

lg N(m) = at − b(m − mt)  for  m ≥ mt,    (3)

where at is the logarithm of the number of earthquakes with m ≥ mt. If the catalogue completeness threshold is constant in time, we can separately analyse two distributions: the temporal sequence of earthquake numbers and the earthquake size distribution. The temporal distribution of event numbers is usually assumed to follow the Poisson law (Molchan & Podgaetskaya 1973; Kijko & Graham 1998) or the negative binomial law (Kagan & Jackson 2000). In this work we study the earthquake size distribution.

The original G-R distribution (3) can be transformed into the Pareto distribution for scalar seismic moment M

F̄(M) = (Mt/M)^β,  Mt ≤ M < ∞,    (4)

where β is the index parameter of the distribution, β = (2/3)b (see eq. 1).

Clark et al. (1999) and Ogata & Katsura (1993) suggest using a formula that incorporates the non-observation of small faults or earthquakes into the size distribution. This solution is problematic. Although catalogue compilers try to ensure completeness above the threshold, the selection of smaller (below-threshold) earthquakes undergoes many significant unreported changes and variations. Thus, the distribution of these smaller earthquakes may lack statistical stability, and standard statistical methods may therefore yield doubtful interpretations. In particular, such estimates, if carelessly used, may distort the conclusions (estimates or inferences) concerning not only the incomplete part but also the distribution as a whole.

2.2 Seismic moment upper bound

Simple considerations of the finiteness of the seismic moment flux or of the deformational energy available for earthquake generation (Wyss 1973; Knopoff & Kagan 1977; Sornette & Sornette 1999) require that the Pareto relation (4) be modified at the large size end of the moment scale. The distribution density tail must have a decay stronger than M^(−1−β) with β > 1. This problem is generally solved by introducing into the distribution an additional parameter called the maximum or corner moment (Mx or Mc).

Anderson & Luco (1983) propose truncation from above at mx of either the cumulative distribution, or its density (the first derivative), or the second derivative of the distribution. Since the first two of the above laws are still widely used in earthquake hazard studies, we briefly consider their properties as applied to the seismic moment distribution. For convenience we refer to these relations as the characteristic and the truncated Pareto distributions.

The distributions considered later in this section have one point in common: they incorporate a parameter that in some way accommodates a change from a power law to a more rapid decay in the tail. Although, as the references above (see also Utsu 1999) show, there are many ways in which this can be accomplished, we discuss below four distributions: two of them are selected because they are used extensively in seismic hazard studies, and the remaining two because, as we show in Section 5, they provide the best fit to the available empirical data. For convenience, the most relevant properties of the selected alternatives are listed in the rest of the section.

2.2.1 Requirements for seismic moment distribution

Several requirements can be formulated for the statistical distribution of the seismic moment.

(A) The distribution should have an extended scale-invariant part, describing small and moderate earthquakes. Many attempts to find a breakdown of the scaling relation for moderate, small and microearthquakes have not yielded reliable results (see the discussion of the potential β-value change in the last paragraph of Section 5.2.2). The scale invariance extends to acoustic emission events (see Lei et al. 2000 and references therein), i.e. to scale lengths of the order of millimetres. Molecular dynamics simulations (Omeltchenko et al. 1997) and laboratory experiments (Fineberg & Marder 1999) suggest that fracture self-similarity extends to the molecular scale, i.e. to a scale length of the order of 10^−6 mm. Although we cannot yet estimate the seismic moment for small earthquakes, acoustic emission events and dynamics simulations, nor compare their power-law exponent directly with large-earthquake results, the scale invariance of these processes suggests that the power-law behaviour may be universal.

(B) The distribution should have a small number of free parameters: if we assume (see the previous item) that below the upper bound earthquakes follow a power-law distribution with a universal exponent β, available data on the seismic moment distribution do not seem to require, and do not support, more than two degrees of freedom (Utsu 1999, his Table 2).

(C) A sharp cut-off at the large-event distribution tail is not warranted, since it contradicts the known behaviour of dissipative physical dynamic systems, which require a smooth transition. Moreover, as with any physical quantity, the seismic moment is determined with an error; hence, its distribution cannot have a sharp upper truncation limit. To avoid this criticism, we may replace the abrupt truncation by a soft Gaussian-like roll-off, but difficulties persist: such a taper either introduces an additional degree of freedom into the distribution description, or, if the width of the roll-off filter is assumed on a priori grounds, that assumption cannot usually be rigorously justified.

Below we consider four theoretical distributions used to describe the seismic moment–frequency relation:

(a) the characteristic distribution;
(b) the truncated Pareto or G-R distribution;
(c) the tapered G-R (Pareto) distribution (TGR); and
(d) the gamma distribution.

These distributions have a scale-invariant, power-law part for small and moderate earthquakes, whereas the right-hand tail of the distributions is controlled by an Mx or Mc value. We use Mxc, Mxp, Mcm and Mcg for the upper bound seismic moment of each distribution in items (a)–(d), respectively. In the last two cases the limit at the tail of the distribution is not a 'hard' cut-off as in items (a) and (b), but a 'soft' taper, so it is more appropriate to refer to Mcm and Mcg as 'corner' moments.

In his review, Utsu (1999) considers many theoretical magnitude–frequency relations proposed by various researchers. Most of these equations can easily be transformed into distributions of seismic moment or energy; however, many relations would violate conditions (A)–(C) outlined above. Utsu (1999, p. 518) calls two of our distributions, (b) and (d), the 'truncated Gutenberg–Richter' and 'generalized Saito' equations [his cases (b) and (g)].

2.2.2 Characteristic distribution

The characteristic distribution has the cumulative complementary (survivor) function truncated on both ends:

F̄(M) = (Mt/M)^β  for  Mt ≤ M < Mxc,

and

F̄(M) = 0  for  M ≥ Mxc.    (5)

The probability density function (PDF) is

f(M) = β Mt^β M^(−1−β)  for  Mt ≤ M < Mxc,

and

f(M) = (Mt/Mxc)^β δ(M − Mxc)  at  M = Mxc,    (6)

where δ is the Dirac delta function; the point mass at Mxc carries the probability (Mt/Mxc)^β remaining above the truncation point.

The statistical moments of kth order about the origin are

E(M^k) = Mt^β [k Mxc^(k−β) − β Mt^(k−β)] / (k − β).    (7)

2.2.3 Truncated G-R (Pareto) distribution

For the Pareto distribution with the PDF truncated at both ends, the probability density is

f(M) = β Mt^β M^(−1−β) / [1 − (Mt/Mxp)^β]  for  Mt ≤ M ≤ Mxp,    (8)

and

F̄(M) = [(Mt/M)^β − (Mt/Mxp)^β] / [1 − (Mt/Mxp)^β],    (9)

and the statistical moment of kth order about the origin is

E(M^k) = β Mt^β [Mxp^(k−β) − Mt^(k−β)] / {(k − β)[1 − (Mt/Mxp)^β]}.    (10)

2.2.4 Tapered G-R (Pareto) distribution

The tapered G-R (TGR) relation has an exponential taper applied to the cumulative number of events with seismic moment larger than M (Vere-Jones, private communication 1999; Vere-Jones et al. 2001)

F̄(M) = (Mt/M)^β exp[(Mt − M)/Mcm],  Mt ≤ M < ∞;    (11)

here Mcm is the parameter that controls the distribution in the upper ranges of M (‘the corner moment’). The distribution in a survivor function (11) and in the PDF form (see below) was proposed by Pareto (1897, pp. 305–306, his eqs 2, 2bis and 5). The corresponding probability density function is

f(M) = [β/M + 1/Mcm] (Mt/M)^β exp[(Mt − M)/Mcm],  Mt ≤ M < ∞,    (12)

and the kth-order statistical moment about the origin is

E(M^k) = Mt^β Mcm^(k−β) exp(Mt/Mcm) [β Γ(k − β, Mt/Mcm) + Γ(k − β + 1, Mt/Mcm)].    (13)

Here Γ(a, x) is the incomplete gamma function (Abramowitz & Stegun 1972, p. 260):

Γ(a, x) = ∫_x^∞ t^(a−1) e^(−t) dt,    (14)

where Γ(a) is the gamma function, and

Γ(a, x) = Γ(a) − x^a [1/a − x/(1!(a + 1)) + x²/(2!(a + 2)) − …].    (15)

2.2.5 Gamma distribution

Previously, we used the gamma distribution (with a negative index parameter, β, and truncated from below) to describe the earthquake size distribution (Kagan 1991, 1993, 1999). In this distribution, the exponential taper is applied to the probability density, not to the cumulative function.

The gamma distribution has the PDF

f(M) = C β Mt^β M^(−1−β) exp(−M/Mcg),  Mt ≤ M < ∞,    (16)

where C is a normalizing coefficient and Mcg is a corner moment parameter.

C = {β (Mt/Mcg)^β Γ(−β, Mt/Mcg)}^(−1),    (17)

or, using eq. (15)

C = {1 − (Mt/Mcg)^β [Γ(2 − β)/(1 − β) − β (Mt/Mcg)^(1−β)/(1 − β) + …]}^(−1).    (18)

For Mcg ≫ Mt the coefficient C ≈ 1. For the next-order approximation, we may keep only the first term in the square brackets. The gamma function Γ(2 − β) changes slowly in the range of β-values encountered in moment–frequency relations: for values of β equal to 2/3 or to 1/2, the gamma function is Γ(2 − 2/3) = 0.893 or Γ(2 − 1/2) = 0.886. The complementary function is
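These Γ(2 − β) values are easy to verify numerically with Python's standard-library gamma function; a minimal check:

```python
import math

# Gamma(2 - beta) for the two beta values quoted in the text
print(math.gamma(2 - 2 / 3))   # ~0.893 for beta = 2/3
print(math.gamma(2 - 1 / 2))   # ~0.886 for beta = 1/2
```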

F̄(M) = C [(Mt/M)^β exp(−M/Mcg) − (Mt/Mcg)^β Γ(1 − β, M/Mcg)],    (19)

and the kth-order statistical moment about the origin is

E(M^k) = C β Mt^β Mcg^(k−β) Γ(k − β, Mt/Mcg).    (20)

We display all four theoretical curves and the G-R law in Figs 1(a) and (b), where they are used to approximate the empirical seismic moment–frequency relation (survivor function) for the Harvard catalogue.

3 Estimation of seismic moment distribution parameters

We apply the maximum-likelihood method to estimate the parameters Mx, Mc and β from catalogue data. Since in many cases only one parameter needs to be evaluated and this problem is usually easier, we review this case first.

3.1 One-parameter estimation

3.1.1 Estimation of the β-value

For the unrestricted G-R relation (4) the log-likelihood function ℓ0 for n observations Mi of the seismic moment is

ℓ0 = n ln β + n β ln Mt − (1 + β) Σ ln Mi,    (21)

where the sum runs over i = 1, …, n.

By equating its derivative to zero (∂ℓ0/∂β = 0) and solving the resulting equation, we obtain the maximum-likelihood estimate (MLE) of β (cf. Deemer & Votaw 1955; Aki 1965):

β̂ = n / Σ ln(Mi/Mt),    (22)
Figure 1. Earthquake cumulative number versus log seismic moment for the global shallow earthquake distribution in the 1977/1/1–2000/12/31 Harvard catalogue. The curves show the numbers of events with moment larger than or equal to M. The total number of earthquakes with M ≥ 10^17.7 N m (m ≥ 5.8) is 3942. (a) Moment range 10^17–10^22 N m; (b) moment range 10^20–10^22 N m. We also show the approximation of the empirical distribution (thick solid line) by five theoretical curves: the characteristic distribution (the complementary cumulative function truncated at the maximum moment), dotted line; the truncated Pareto distribution (the probability density function truncated at the maximum moment), dash-dotted line; the tapered G-R law (the exponential taper is applied to the complementary cumulative function), solid line; the gamma distribution (the exponential taper is applied to the probability density), dashed line; the original G-R law (without the upper bound), dotted line.

with a standard error

σβ = β̂/√n.    (23)

For further discussion on the MLE and other estimates of the Pareto distribution index see Clark et al. (1999).
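A minimal sketch of eqs (22) and (23), with our own function naming, assuming a catalogue complete above the threshold Mt; the synthetic Pareto sample used as a check is generated by the inverse-transform method:

```python
import math
import random

def beta_mle(moments, M_t):
    """MLE of the Pareto index beta, eq. (22), and its standard error, eq. (23)."""
    logs = [math.log(M / M_t) for M in moments if M >= M_t]
    n = len(logs)
    beta_hat = n / sum(logs)
    return beta_hat, beta_hat / math.sqrt(n)

# Check on a synthetic Pareto sample with beta = 0.65 and M_t = 1
rng = random.Random(0)
sample = [(1.0 - rng.random()) ** (-1.0 / 0.65) for _ in range(20000)]
beta_hat, sigma = beta_mle(sample, 1.0)
```

The recovered β̂ is close to the true value 0.65, within the standard error of eq. (23).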

For distributions incorporating the maximum or corner moment, we can estimate β by truncating the distribution at Mu (well above the assumed or known Mx or Mc values) and using the MLE solution of Deemer & Votaw (1955) to obtain the β-value. Page (1968) proposed a similar solution for magnitude data. The MLE equation for β̂,

1/β̂ = (1/n) Σ ln(Mi/Mt) + ln(Mu/Mt) (Mt/Mu)^β̂ / [1 − (Mt/Mu)^β̂],    (24)

can be solved by iteration. The standard error of estimate (24) is

σβ = (β̂/√n) [1 − β̂² ln²(Mu/Mt) (Mt/Mu)^β̂ / B²]^(−1/2),    (25)

where B = 1 − (Mt/Mu)^β̂.

In some of our β estimation procedures we truncate the distribution in this way, instead of applying a two-parameter approximation, so that the estimates are independent of any assumptions about the moment upper bound. These results can be used as a test of the evaluation techniques. Empirical results (Section 5) show that the loss of estimation accuracy owing to the truncation is minor.
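Eq. (24) has no closed-form solution, but a simple fixed-point iteration converges quickly; the sketch below (our naming, checked on a synthetic doubly truncated Pareto sample) starts from the unrestricted MLE:

```python
import math
import random

def beta_mle_truncated(moments, M_t, M_u, tol=1e-12, max_iter=200):
    """Solve eq. (24) for beta by fixed-point iteration (truncated Pareto)."""
    logs = [math.log(M / M_t) for M in moments if M_t <= M <= M_u]
    mean_log = sum(logs) / len(logs)
    r = math.log(M_u / M_t)
    beta = 1.0 / mean_log                    # unrestricted MLE, eq. (22), as a start
    for _ in range(max_iter):
        q = (M_t / M_u) ** beta
        beta_next = 1.0 / (mean_log + r * q / (1.0 - q))
        converged = abs(beta_next - beta) < tol
        beta = beta_next
        if converged:
            break
    return beta

# Check on a synthetic truncated Pareto sample (M_t = 1, M_u = 1000, beta = 0.65)
rng = random.Random(0)
q_true = 1000.0 ** -0.65
sample = [(1.0 - rng.random() * (1.0 - q_true)) ** (-1.0 / 0.65)
          for _ in range(20000)]
beta_hat = beta_mle_truncated(sample, 1.0, 1000.0)
```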

3.1.2 Estimation of the maximum or corner moment

As we argue (Kagan 1997, 1999), the moment–frequency distribution is universal: the β-value is the same for all earthquakes. If this is the case, for β = constant, a simpler estimate of the corner moment can be derived.

For the truncated Pareto distribution the log-likelihood is

ℓ(Mxp) = n ln β + n β ln Mt − (1 + β) Σ ln Mi − n ln[1 − (Mt/Mxp)^β].    (26)

The function reaches the maximum at the value of Mxp = Mn, where Mn is the maximum observed earthquake moment in a sample of n earthquakes (Pisarenko 1991; Kijko & Graham 1998).

The MLE of Mxp is biased. Pisarenko (1991) and Kijko & Graham (1998) propose to correct for the bias by using the estimate

(27)

We define the reciprocal of the corner moment as

η = 1/Mc.    (28)

For the gamma distribution (in the case Mcg ≫ Mt)

(29)

and the variance of η can be found as a reciprocal of the second derivative of the log-likelihood:

(30)

For the TGR distribution the MLE of ηm is the solution of the following non-linear equation:

Σ Mi/(β + ηm Mi) = Σ (Mi − Mt);    (31)

the root of the equation can be found by iteration. The standard deviation is

ση = [Σ Mi²/(β + η̂m Mi)²]^(−1/2).    (32)

Approximate standard errors for the corner moment can be calculated, using eqs (1), (30) and (32), as

σMc ≈ ση/η̂²,    (33)

and for magnitude

σmc ≈ 2 ση / (3 η̂ ln 10).    (34)

These linearized equations perform reasonably well for large catalogues, whereas for smaller data sets, the standard errors, which are based on asymptotic formulae, fail to provide a good description of the variation of the estimate.
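One simple way to find the root of eq. (31) numerically, assuming β is fixed (function naming is ours): the score, i.e. the derivative of the log-likelihood with respect to η, decreases monotonically in η, so bisection is a robust alternative to Newton-type iteration:

```python
import math
import random

def eta_mle_tgr(moments, M_t, beta, iters=60):
    """MLE of eta = 1/Mcm for the TGR law with beta fixed, eq. (31)."""
    excess = sum(M - M_t for M in moments)

    def score(eta):
        # derivative of the log-likelihood with respect to eta;
        # monotone decreasing, so a sign change brackets the root
        return sum(M / (beta + eta * M) for M in moments) - excess

    lo, hi = 0.0, 1.0
    while score(hi) > 0.0:          # widen the bracket if necessary
        hi *= 2.0
    for _ in range(iters):          # bisection on the monotone score
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Synthetic TGR catalogue (reduced moments, M_t = 1, beta = 0.65, eta = 0.001),
# simulated with the min construction of eqs (41)-(43)
rng = random.Random(1)
catalog = [min((1.0 - rng.random()) ** (-1.0 / 0.65),
               1.0 - 1000.0 * math.log(1.0 - rng.random()))
           for _ in range(30000)]
eta_hat = eta_mle_tgr(catalog, 1.0, 0.65)
```

For this synthetic catalogue with corner moment 1000 times the threshold, the estimate lands close to the true reciprocal η = 0.001.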

Kagan & Schoenberg (2001) propose an estimate of the corner moment for the TGR law based on statistical moments of the size distribution.

(35)

where M̄ is the average seismic moment. The simplicity and closed analytic form of eq. (35) make calculating the estimate especially easy, since the problem of poor convergence for eq. (31), discussed in Sections 4.2.1 and 4.2.2, is avoided. Kagan & Schoenberg (2001) study the properties of this estimate and compare it with other estimates of the upper bound of the TGR distribution. They also propose a formula for the bias correction in eq. (35).

The average (mean) likelihood estimate (ALE) (Jenkins & Watts 1968, eq. 4.4.15) of η is defined as

η̃ = ∫ η L(η) dη / ∫ L(η) dη,    (36)

where L(η) is the likelihood function. Kagan & Schoenberg (2001) discuss properties of this estimate.
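Eq. (36) can be approximated by averaging over a grid of η values; this sketch fixes β, uses the TGR log-likelihood of eq. (37), and assumes an upper integration limit eta_max that acts as a flat prior bound (both the naming and the bound are our choices):

```python
import math
import random

def log_likelihood_eta(moments, M_t, beta, eta):
    """TGR log-likelihood as a function of eta with beta fixed (cf. eq. 37)."""
    return sum(math.log(beta / M + eta) + beta * math.log(M_t / M)
               + eta * (M_t - M) for M in moments)

def eta_ale(moments, M_t, beta, eta_max, n_grid=400):
    """Average-likelihood estimate of eta, eq. (36), on a uniform grid."""
    etas = [eta_max * i / n_grid for i in range(n_grid + 1)]
    logL = [log_likelihood_eta(moments, M_t, beta, e) for e in etas]
    ref = max(logL)                 # subtract the maximum to avoid overflow
    w = [math.exp(l - ref) for l in logL]
    return sum(e * wi for e, wi in zip(etas, w)) / sum(w)

# Synthetic TGR catalogue with a pronounced corner (eta = 0.01, M_t = 1)
rng = random.Random(2)
catalog = [min((1.0 - rng.random()) ** (-1.0 / 0.65),
               1.0 - 100.0 * math.log(1.0 - rng.random()))
           for _ in range(2000)]
eta_tilde = eta_ale(catalog, 1.0, 0.65, eta_max=0.05)
```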

The uncertainty estimates (σ) obtained in this section and below are based on an asymptotic theory of statistical inference generally valid when the number of random variables is greater than 30–50 (Jenkins & Watts 1968). Although the number of earthquakes involved in the β determination regularly exceeds 30, so that standard errors give an appropriate measure of uncertainty, the situation differs in determining the corner moment. Only for a few of the largest earthquakes in a catalogue is the pure G-R relation (4) usually inadequate; thus, only these events influence the evaluation of Mc. Hence, the asymptotic formulae obtained above and hereafter only approximately measure the corner moment estimate uncertainty. We discuss this problem further below [see, in particular, the paragraphs around eq. (47)].

3.2 Two-parameter estimation

The logarithm of the likelihood function for the TGR distribution eqs (11) and (12) is

ℓ(β, ηm) = Σ ln(β/Mi + ηm) + β Σ ln(Mt/Mi) + ηm Σ (Mt − Mi).    (37)

In Appendix A we obtain appropriate expressions for determining MLEs for β and Mcm as well as their errors.

The logarithm of the likelihood function for the gamma distribution eqs (16) and (17) is

ℓ(β, ηg) = n ln(C β) + n β ln Mt − (1 + β) Σ ln Mi − ηg Σ Mi.    (38)

From this expression (see also eq. 6 in Kagan 1991) an advantage of the TGR relationship over the gamma distribution is clear. For the gamma distribution the normalizing coefficient involves the incomplete gamma function, which complicates the formula and its computation.

In Figs 2(a) and (b), we display maps of the log-likelihood function for the two parameters of the gamma and of the TGR distribution. In both plots, the 95 per cent confidence contours are approximately elliptical, and the axes of the ellipses are only slightly inclined, suggesting that the correlation between the parameter estimates (Jenkins & Watts 1968, Ch. 4.3.4) is small. The correlation between the β and mc estimates for the gamma distribution appears to be stronger than that for the TGR relation. Isolines for higher confidence levels extend towards large moment values; in similar maps for smaller earthquake catalogues, even the 95 per cent isoline reaches M → ∞. This asymmetry of the likelihood function on the moment scale signifies that the standard errors of eqs (32)–(34) measure the uncertainty only approximately. In general, upper and lower limits need to be evaluated. In the next section we discuss several other indicators of the mc uncertainty.

When these plots are compared with similar maps obtained for a smaller magnitude threshold (mt = 5.6, see Fig. 2 in Kagan & Jackson 2000), the comparison demonstrates that with a lower mt value we can better estimate β. However, to ensure catalogue completeness with mt = 5.6, a shorter time span must be used. As a result, the estimate of the corner magnitude shows better accuracy in the present plots.

Figure 2. Log-likelihood maps for the distribution of the scalar seismic moment of earthquakes: the Harvard catalogue time span is 1977 January 1–2000 December 31, the seismic moment cut-off is 10^17.7 N m (mt = 5.8), and the number of events is 3942. (a) Approximation using the gamma distribution; (b) approximation using the tapered Gutenberg–Richter distribution.

4 Simulation of seismic moment distributions

4.1 Simulation technique

Three of the four distributions described in Section 2 can be easily simulated; only for the gamma distribution does the simulation involve solving a transcendental equation with an incomplete gamma function, which may present a problem for certain applications. For the characteristic distribution, synthetic values of the seismic moment are obtained as

M = Mt R^(−1/β)  for  R > (Mt/Mxc)^β,

and

M = Mxc  for  R ≤ (Mt/Mxc)^β,    (39)

where R is a random variable uniformly distributed in the interval (0, 1].

For the truncated Pareto distribution

M = Mt [1 − R(1 − (Mt/Mxp)^β)]^(−1/β).    (40)

And, finally, for the tapered G-R relation we follow the suggestion by Vere-Jones et al. (2001). We generate two independent uniform random variables, R1 and R2, in (0, 1], and compute

M1 = Mt R1^(−1/β)    (41)

and

M2 = Mt − Mcm ln R2,    (42)

and then obtain the simulated seismic moment as

M = min(M1, M2).    (43)

Fig. 3 shows the seismic moment distributions for three catalogues simulated according to eqs (39), (40) and (43), as well as their theoretical distributions. For display purposes the curves for the truncated Pareto and the tapered G-R distribution are multiplied by 0.1 and 0.01, respectively. These curves demonstrate that, even for catalogues of 1000 events, random fluctuations are significant at the distribution tail, and that the synthetic distributions behave similarly to the observed curves (Figs 1a and b).
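The three recipes of eqs (39), (40) and (41)–(43) can be sketched as follows (function names are ours); each uses uniform variates on (0, 1]:

```python
import math
import random

def sim_characteristic(beta, M_t, M_xc, rng):
    """Eq. (39): Pareto body plus a point mass at the maximum moment M_xc."""
    R = 1.0 - rng.random()                 # uniform on (0, 1]
    if R <= (M_t / M_xc) ** beta:          # survivor mass placed at M_xc
        return M_xc
    return M_t * R ** (-1.0 / beta)

def sim_truncated_pareto(beta, M_t, M_xp, rng):
    """Eq. (40): inverse-CDF simulation of the doubly truncated Pareto."""
    R = 1.0 - rng.random()
    q = (M_t / M_xp) ** beta
    return M_t * (1.0 - R * (1.0 - q)) ** (-1.0 / beta)

def sim_tgr(beta, M_t, M_cm, rng):
    """Eqs (41)-(43): minimum of a Pareto and an exponential variate."""
    M1 = M_t * (1.0 - rng.random()) ** (-1.0 / beta)   # eq. (41)
    M2 = M_t - M_cm * math.log(1.0 - rng.random())     # eq. (42)
    return min(M1, M2)                                 # eq. (43)

# Quick check: empirical survivor fraction at M = 10 for the TGR law,
# to be compared with 10**-0.65 * exp(-9/1000), roughly 0.22
rng = random.Random(3)
tgr_sample = [sim_tgr(0.65, 1.0, 1000.0, rng) for _ in range(50000)]
frac_above_10 = sum(1 for M in tgr_sample if M >= 10.0) / 50000.0
```

The min construction in sim_tgr works because the TGR survivor function (11) is the product of a Pareto survivor function and an exponential one.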

4.2 Simulation results for TGR distribution

For the TGR distribution Kagan & Schoenberg (2001) describe the results of simulating earthquake catalogues and evaluating the corner moment. In this paper, we expand their results for the one-parameter estimation by evaluating the standard deviation, using eq. (34), by estimating the reciprocal of the corner moment, and the probability of rejecting the G-R relation (4) for a catalogue. We also apply two-parameter evaluation formulae to synthetic catalogues and derive the properties of these estimates. Finally, in the next section, we apply these statistical methods to model empirical earthquake distributions.

Figure 3. Earthquake cumulative number versus log seismic moment for three simulated earthquake catalogues. The simulated moment (M) follows the characteristic, truncated Pareto and tapered G-R distributions. The curves (solid lines) show the numbers of events with moment greater than or equal to M. Each catalogue has 1000 events. For better visibility, the curves for the truncated Pareto and the tapered G-R distribution are multiplied by 0.1 and 0.01, respectively. We also show the approximation of the curves by three theoretical curves (dashed lines). In addition to the relations listed above, the unrestricted G-R law, eq. (4), is shown by a dotted line.

To simplify the expressions, in this section we use a reduced seismic moment

S = M/Mt,    (44)

where Mt is the observational moment threshold, i.e. setting, in effect, Mt = 1. For the corner moment of the TGR we use three expressions:

Sc = Mcm/Mt,  η = 1/Sc  and  sc = (2/3) lg Sc.    (45)

Following eq. (44), the reduced moment magnitude can be expressed as sc = mc − mt.

4.2.1 One-parameter results

In Table 1 we simulate from 10^4 to 10^6 catalogues, using the TGR distribution with the reduced corner moment Sc = 1000. The number of events in a synthetic catalogue and the corner moment value are selected to reproduce the properties of earthquakes in the Harvard catalogue (see Section 5). To solve eqs (24) and (31) we use the drtmi subroutine of the IMSL (Fortran) package. For very small catalogues, with the number of events below 50, the iterative solution of eq. (31) for sc does not always converge.

We obtain MLEs of the relevant parameters and estimate their uncertainties, first by calculating the scatter of the MLEs over the simulations and then by computing the theoretical standard errors as in eqs (32)–(34). The aim is to compare the actual simulation scatter with the theoretical error estimates.

We obtained estimates of all the parameters, i.e. β and the upper bound parameters in eq. (45). The individual MLEs of Sc, η and sc are related by eq. (45): the maximum of the log-likelihood function does not depend on which of these variables is used as its argument. However, the average estimate values and average standard errors for these variables, displayed in Tables 1 and 3, are different: we sum the values in a linear, reciprocal or logarithmic space, respectively.

As with the estimates of the upper bound for the truncated G-R law (see eq. 27), the MLE of the corner magnitude is biased (see Kagan & Schoenberg 2001). This bias increases as the catalogue size diminishes (the true sc-value in Table 1 is 2.0). The estimates of the standard deviation are close to the observed sc variations; for large catalogues (n = 5000) the difference is imperceptible, whereas for n = 50 the two measures diverge appreciably. Hence eq. (34) can be used to evaluate estimation errors for large data sets.

Other estimates of the upper bound, i.e. those of the corner moment Sc and of its reciprocal η, do not behave as well as the MLE of the corner magnitude sc. Although the corner moment estimate is not substantially biased, and its two standard error measures are similar for n = 5000 though not for n = 50, its coefficient of variation is large and increases as the catalogue size decreases from n = 5000 to n = 50. Kagan & Schoenberg (2001) show that the distribution of the corner magnitude MLE can be well approximated by a Gaussian law. This means that the MLE of the corner moment should follow a log-normal distribution. Godano & Pingue (2000) also found that empirical estimates of the maximum moment are distributed according to the log-normal law. The log-normal distribution has a heavy tail; thus, some of the estimates are very large, increasing the standard deviation of the MLE distribution significantly.

The estimates of the corner moment reciprocal (η) exhibit a relatively large bias (Table 1): for n = 50 the multiplicative bias reaches 0.15. Estimates of the standard error (eq. 32) also underestimate the uncertainty, especially for n = 50.

The probability values in the last column of Table 1 show the ratio of catalogue numbers for which the pure G-R relation (the null hypothesis) could be rejected at the 95 per cent confidence level in favour of the TGR distribution. The test involves calculating the difference between the log-likelihood functions (eqs 21 and 37) for these two distributions. Since the pure G-R relation is a special case of the TGR law (with η→0), twice the difference of the two log-likelihood maxima, 2Δℓ, should be distributed asymptotically as a χ²(1) variable (the chi-squared distribution with one degree of freedom). Therefore,

2Δℓ = 3.84

(46)
Table 1. The β-, sc-values and their errors in simulated catalogues; one-parameter estimation.

n, the number of events; the simulated average reduced corner magnitude and its standard deviation, using eq. (31); the average standard deviation, using eq. (34); the β-value and its confidence limits, using eq. (24), with β measured between s = 0 and su = 1.5; the reciprocal of the reduced corner moment and its standard deviation; ‘G-R Prob’, the probability that the pure G-R law is rejected at the 95 per cent confidence level.

The value 3.84 corresponds to the 95 per cent confidence limit (Abramowitz & Stegun 1972).

For small catalogues (n = 50) the probability in the table is close to 5 per cent. The 95 per cent confidence limit signifies that 5 per cent of the data sets, even those satisfying the G-R relation, would be rejected by the test owing to random fluctuations. Thus, such small catalogues yield almost no useful statistical information on whether the upper bound of earthquake size is finite. For a catalogue with n > 750 events, the probability of rejection (the correct answer) is high; almost all catalogues would be recognized as having a finite value of the corner moment. Simulations also show that the null hypothesis would be rejected in 95 per cent of all catalogues with about 667 events; for n = 340, it would be rejected for 50 per cent of catalogues.

Further analysis indicates that in terms of hypothesis testing, the number of events exceeding sc is important. Those catalogues having a different total number of events, but on average the same number of large events (νc), behave similarly in the test. To calculate νc we use eq. (11):

νc = n Sc^−β exp[(1 − Sc)/Sc],

(47)

where n is the number of earthquakes in a catalogue. For the examples above, the νc-values are 2.46 and 1.26, respectively (see also the last column in Table 3). For the truncated G-R law (eq. 8), Pisarenko (1991) suggests that a catalogue must contain at least two to three earthquakes just below the upper bound (within about 0.3 magnitude units) to obtain a reasonably good statistical estimate of the bound.
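As a check on eq. (47), evaluating the TGR survivor function at S = Sc reproduces the νc-values quoted above for β = 2/3 and Sc = 1000 (a minimal Python sketch; the helper name is ours):

```python
import math

def nu_c(n, beta, s_c):
    # Expected number of events with reduced moment S exceeding the
    # reduced corner moment S_c, from the TGR survivor function
    # Phi(S) = S**(-beta) * exp((1 - S)/S_c) evaluated at S = S_c.
    return n * s_c ** (-beta) * math.exp((1.0 - s_c) / s_c)

nu_c(667, 2.0 / 3.0, 1000.0)   # ≈ 2.46
nu_c(340, 2.0 / 3.0, 1000.0)
```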

Another estimate of the probability of the G-R model rejection can be obtained by analysing the η estimate and its standard error (column 6 in Table 1). If η→0, this signifies that the data can be approximated by a G-R relation. The log-likelihood function of η is approximately quadratic near its maximum (Kagan & Schoenberg 2001, their Fig. 3). Thus, if the coefficient of variation of the η estimate,

υ = ση/η,

(48)

is close to 0.5, the zero value of η is within the 97.5 per cent confidence interval (a one-sided limit is applied in this case). From the table, υ < 0.5 only for the catalogues with n > 750. Therefore, based on this test, the G-R hypothesis would be rejected for these data sets in almost all (more than 97.5 per cent) cases.
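The likelihood-ratio test described above can be sketched numerically. The Python fragment below is our illustration, not the paper's procedure: it fits the pure G-R (Pareto) law in closed form, maximizes the TGR log-likelihood numerically with scipy, and compares twice the log-likelihood difference against the 3.84 threshold of a χ²(1) variable.

```python
import numpy as np
from scipy.optimize import minimize

def loglik_gr(s):
    # Pareto (pure G-R) log-likelihood at the closed-form MLE
    # beta_hat = n / sum(ln S); density f(S) = beta * S**(-beta - 1), S >= 1.
    n, sum_log = len(s), np.sum(np.log(s))
    beta = n / sum_log
    return n * np.log(beta) - (beta + 1.0) * sum_log

def loglik_tgr(params, s):
    # TGR density f(S) = (beta/S + eta) * S**(-beta) * exp(eta * (1 - S)),
    # with eta = 1/S_c the reciprocal of the reduced corner moment.
    beta, eta = params
    return np.sum(np.log(beta / s + eta) - beta * np.log(s) + eta * (1.0 - s))

def lr_statistic(s):
    # Maximize the TGR log-likelihood numerically over (beta, eta)
    res = minimize(lambda p: -loglik_tgr(p, s), x0=[0.6, 1e-3],
                   bounds=[(1e-3, 3.0), (1e-9, 1.0)], method="L-BFGS-B")
    return 2.0 * (-res.fun - loglik_gr(s))

# Simulate a TGR catalogue as the minimum of Pareto and exponential variates
rng = np.random.default_rng(2)
n, beta, s_c = 5000, 2.0 / 3.0, 1000.0
s = np.minimum(rng.uniform(size=n) ** (-1.0 / beta),
               1.0 + s_c * rng.exponential(size=n))
stat = lr_statistic(s)  # reject the pure G-R law when stat exceeds 3.84
```

For a catalogue of this size (νc ≈ 18 events beyond the corner value) the statistic comfortably exceeds the threshold, consistent with the near-certain rejection reported in Table 1 for n = 5000.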

When we use the one-parameter method to evaluate the corner moment, the β-value is assumed to be known. To evaluate the influence of β variation on the MLE and the moment-based estimate (35) of sc, we performed additional simulations; the results are summarized in Table 2. In these simulations we created synthetic catalogues using varying β-values, but in determining the corner magnitude β = 2/3 is assumed. The table shows that about a 6 per cent variation of β changes the moment-based estimate of sc by about 0.03–0.06, and the MLE of sc by 0.04–0.07. If we use β = 2/3 in the simulations, but assume varying β in the inversion procedure, the changes in the sc estimates are about the same as those shown in Table 2.

As the next section testifies, the range of the corner magnitude variation in nature greatly exceeds that of β. Since the bias in β causes approximately the same error in mc in absolute terms, this error can be neglected in most circumstances.

Table 2. Influence of β, used in the simulation, on the corner magnitude estimate.

β, simulation parameter; n, the number of events in a synthetic catalogue; the MLE of sc and its standard deviation, with β = 2/3 assumed in the estimation; the estimate of sc based on statistical moments of the distribution, as defined by eq. (35).

4.2.2 Two-parameter estimation results

Table 3 displays the results of the catalogue simulations and parameter evaluation, using the maximum-likelihood estimators described in the Appendix. In principle, the two-parameter inversion is more advanced than the one-parameter case, since it evaluates both parameters in eq. (11) simultaneously as well as the correlation between the estimates (Jenkins & Watts 1968, Ch. 4.3.4). However, the inversion is more unstable than in the one-parameter case: for n < 100 and Sc = 1000 some iterations fail to converge. A similar effect is seen for n = 1000 when the reduced corner moment is small. No such lack of convergence is observed in the one-parameter case; only for n ≤ 25 do some MLEs of β (eq. 24) fail. Moreover, for Sc we have a one-parameter estimate (eq. 35), which has certain computational advantages compared with the MLE.

Comparing Tables 1 and 3 shows that the solutions are close: estimates of the corner magnitude have a slightly smaller standard deviation in the one-parameter case. As Tables 1 and 3 show, for νc smaller than 1.0 the corner magnitude estimates are significantly biased: for νc = 0.2 the estimate is lower by about 0.3 magnitude units than the real sc-value. The β-estimates and their standard errors determined using eqs (22) and (23) are slightly inferior to those of the two-parameter case, because the largest events have been removed (truncated) in the former case. There is good agreement between the estimates of the standard errors for the corner magnitude and β (columns 4 and 5) and the calculations based on inverting the second derivative matrix (eqs A16–A18, see columns 6 and 7).

As one may expect, the MLEs of the two parameters are positively correlated, and increasing the slope in the scale-invariant part of the distribution (see Figs 1a and b) is compensated by an increase in the corner magnitude estimate. The value of the correlation coefficient (ρ) for Sc = 1000 is small, of the order of 0.25. The likelihood map (Fig. 2b) confirms this observation. However, if the reduced corner moment in the simulations is small (see the second part of Table 3), the correlation coefficient increases. If the scale-invariant part of the moment–frequency relation is not extensive, the distribution can be approximated by both the power-law and exponential functions in eq. (11). Therefore, we see a significant correlation between the two parameter estimates.

Similar to the one-parameter case, the other estimates of the distribution upper bound (Sc and η) do not behave as well as the corner magnitude, sc. The corner moment estimate and its uncertainties differ only slightly from those of the previous subsection, the standard deviations, of course, being larger in the two-parameter case. The behaviour of the η estimate is almost identical to that of the one-parameter case.

Table 3. The β-, sc-values and their errors in simulated catalogues; two-parameter estimation.

n, the number of events; Sc, the input corner moment used in the simulations; the simulated average reduced corner magnitude and its standard deviation; the same for the β-value; the average values of the correlation matrix components (see the Appendix); νc, the number of events larger than sc, eq. (47).

We also obtained the estimates of the probability of the G-R model rejection for the two-parameter case (cf. Table 1, last two columns). These estimates are also similar to those discussed in the previous subsection.

4.2.3 Simulated and real earthquake catalogues

Two considerations are important when applying simulation results to estimate the corner moment for real earthquakes:

  (a)

     simulations assume the statistical independence of events, whereas real earthquake catalogues show clear dependence, such as aftershock sequences and, more generally, short- and long-term clustering (Kagan & Jackson 1991); the influence of such clustering is difficult to evaluate;

  (b)

     the simulation conclusions are fully valid only if earthquakes follow the TGR law. Even when individual earthquake sets obey the law, our catalogues may contain a mix of populations with dissimilar parameter values. Such a composite distribution would follow a law different from that of the original populations (Vere-Jones et al. 2001). Therefore, the estimates of σm and similar measures of the uncertainty of the corner moment determination, quoted above, should be considered only as lower bounds; the real errors should be higher.

5 Empirical distributions

5.1 Fit of theoretical distributions to global seismicity

Figs 1(a) and (b) display how the theoretical curves fit an empirical distribution of shallow earthquake scalar moments in the Harvard catalogue (Dziewonski et al. 2001). The TGR distribution provides a better visual fit to the data, although on closer inspection a good approximation by the TGR model appears in both plots to be limited to a region around 10^20–10^21 N m. The gamma model appears to fit the extreme tail more closely. Section 5.4.2 discusses this problem again.

We determined the slope and corner moment for the tapered G-R relation, using the maximum-likelihood method (Section 3). The slope of the linear part of the curve corresponds to the β-value 0.675±0.014, and the corner moment is 1.43×10^21 N m (magnitude mcm = 8.10±0.13). Using the same method to determine β and mcg for the gamma distribution, the respective parameter values are β = 0.660±0.016 and Mcg = 3.71×10^21 N m (mcg = 8.38±0.19).
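The moment-to-magnitude conversion implied by the paired values quoted above appears to be m = (2/3)(log10 M − 9.0), with M in N m; the constant 9.0 is inferred from those pairs rather than stated explicitly here. A quick check in Python:

```python
import math

def moment_magnitude(moment_nm):
    # m = (2/3) * (log10 M - 9.0), with M in newton-metres; the constant
    # 9.0 is inferred from the moment/magnitude pairs quoted in the text.
    return (2.0 / 3.0) * (math.log10(moment_nm) - 9.0)

round(moment_magnitude(1.43e21), 2)  # 8.1  (TGR corner magnitude)
round(moment_magnitude(3.71e21), 2)  # 8.38 (gamma corner magnitude)
```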

The value of the log-likelihood maximum is slightly higher for the TGR distribution compared with the gamma law, indicating that the former distribution provides a slightly better fit to the data, but not by a statistically significant margin. For the global distribution maps displayed in Figs 2(a) and (b), the log-likelihood difference is 0.89.

The corner magnitude values indicate that the rupture length for the largest earthquakes is of the order of a few hundred (500–700) km. For shallow earthquakes the spatial scale invariance breaks down for hypocentral distances of 2000–3000 km (Kagan & Knopoff 1980); this distance corresponds to the average size of major continents or to the total thickness of the mantle. Does this mean that continent formation is caused by full mantle convection, whereas the stress that causes earthquakes accumulates only in the upper mantle?

The difference of the log-likelihood functions, ℓ(η) − ℓ(0) (see eq. 46), is around 8.0 for the gamma and TGR distributions. This difference means that the probability of obtaining such an empirical distribution is extremely low if earthquakes follow the pure G-R law: less than 0.000 06 (Abramowitz & Stegun 1972). As the previous section testifies, a reliable evaluation of the corner moment requires that a catalogue include a few earthquakes close in magnitude to the corner magnitude. This requirement means that for available earthquake catalogues the upper moment bound can be estimated effectively only for global seismicity or for large subsets of the catalogue. For smaller data sets the only useful information we can obtain is a lower limit for the corner magnitude/moment. However, such a limit usually lacks practical or theoretical value.

An erroneous selection of the threshold (Mt) value influences our estimate of β, since missing earthquakes would bias the β-estimate towards lower values. The estimates of Mc are determined almost exclusively by the largest events in a catalogue; hence they are largely immune to mistakes in the Mt selection. However, as Table 3 shows, if the Mc/Mt ratio is not large, the estimates of the parameters β and Mc are strongly correlated. In that case we can evaluate the corner moment only by assuming that the β-value is universal and known.

All theoretical laws approximate the power-law part of the empirical distribution well, but at the tail of the distribution the tapered G-R law yields a visually better fit. This conclusion is confirmed by the results of the maximum-likelihood determination of the distribution parameters: the parameter estimates of the tapered G-R law have smaller standard errors and a smaller correlation than those of the gamma distribution. Inspecting the plot also suggests that the distributions with a hard cut-off, i.e. cases (a) and (b) in Section 2.2.1, fit less well.

As discussed above, the TGR distribution has several advantages even compared with the gamma law: it is simpler computationally, it is easier to simulate, its parameter estimates are less correlated and it yields a better fit to empirical distributions. Thus, in the rest of this work we will mostly use this distribution.

5.2 Universality of moment–frequency distributions

5.2.1 Moment–frequency relation and earthquake depth

Table 4 displays the results of a parameter evaluation for global earthquakes of three depth intervals: shallow (depth 0–70 km), intermediate (70–300 km) and deep (300–700 km). The parameter values are determined for three subcatalogues of the Harvard list: 1977–1999, 1982–1999 and 1987–1999, with threshold magnitudes of 5.8, 5.6 and 5.4, respectively.

Table 4. The β- and mc-values in the Harvard CMT catalogue.

n, the number of shallow (depth limit 0–70 km, m ≥ mt) events; the first β-estimate is obtained according to eq. (22); the second is measured (eq. 24) between mt and mu = 7.8; the corner magnitude and its standard deviation are measured according to eqs (29), (30) and (34), with β taken from eq. (22), or according to eqs (31), (32) and (34), with β taken from eq. (24); a further estimate is defined by eq. (35), with β = 2/3 assumed; the gamma-distribution parameters and their standard errors are calculated according to the formulae in the Appendix; ρ is calculated according to eq. (A22).

The β-estimates are similar for the catalogue subsets. The β-estimate of eq. (22), which assumes that earthquakes follow the pure G-R law, increases with an increasing threshold level: when Mt is close to Mc, the relative scarcity of large earthquakes has a greater influence on the determination of β (see Figs 1a and b). The difference between the β-values for shallow, intermediate and deep events is most probably caused by aftershocks in shallow seismicity (Kagan 1991, 1999); these generally smaller earthquakes increase the β-value (Frohlich & Davis 1993). There is no significant difference between the two β-estimates (eqs 22 and 24), although the estimate of eq. (24) has a slightly larger standard error; this uncertainty increase is caused by the removal of the largest earthquakes from the sample in this case (see the discussion above eq. 24).

All estimates of the corner moment (magnitude) have a similar behaviour. The estimates for the gamma distribution are larger than those for the TGR approximation by about 0.3 magnitude units, whereas distinct TGR estimates for the same data set do not differ by more than a few hundredths of magnitude units. This similarity in TGR estimates testifies to the robustness of the evaluation procedures. Moreover, in this table as well as in Tables 6 and 7 below, there is no consistent pattern of change in the mc-values when a different threshold mt is used: the corner magnitude estimates are the same within a few hundredths of units. This demonstrates the general validity of modelling the earthquake size distribution by the TGR law.

The corner magnitude uncertainty estimates are based on an approximation (eq. 34) valid for relatively small errors only. We also calculated η and ση, which in general better represent the errors. As we discussed in the previous section, the coefficient of variation υ (eq. 48) of this estimate shows whether η→0 (or, equivalently, Mc→∞) is within a confidence interval. We obtain υ-values between 0.25 and 0.4 for all earthquakes, except deep ones, where υ varies from 1.0 to 1.3. These values for deep earthquakes signify that η = 0 is within the 67 per cent confidence interval. This conclusion is confirmed by log-likelihood maps for Mc (similar to those in Figs 2a and b), where the 95 per cent confidence area extends to infinity (see Figs 2a–c in Kagan 1999). Thus, the Mc-values for deep earthquakes are not constrained from above. This effect can be anticipated from our simulation results: according to eq. (47), the νc-value for deep events is of the order of 0.2–0.5, i.e. the pure G-R model cannot be rejected for these data.

The correlation coefficient ρ between the β- and corner magnitude estimates is small in all cases; a slightly larger value for intermediate events is caused by the lack of large events in the present catalogue (see Section 4.2.2).

5.2.2 Focal mechanisms and the moment–frequency relation

As in Table 4, the β-values in Table 5 are determined for three subcatalogues (1977–1999, 1982–1999 and 1987–1999). Following Kagan (1997), the data set is subdivided into five categories—subduction, collision, intracontinental, mid-ocean and others. This subdivision assigns all earthquakes in a particular Flinn–Engdahl seismic region (Young et al. 1996) to one of the categories.

Table 5. The β-values in the Harvard CMT catalogue for earthquakes with various focal mechanisms.

n, the number of shallow (depth limit 0–70 km, m ≥ mt) events; β is measured (eq. 24) between mt and mu = 7.8.

We also categorize the catalogue according to the prevalent earthquake focal mechanism: thrust, strike-slip and normal. For example, an earthquake is considered to have a normal focal mechanism if the most compressive principal axis of its moment tensor (the P-axis) is more vertical than either principal axis B or T (Frohlich 1992, 2001). Similarly, we define earthquakes with the thrust or strike-slip mechanism when the T-axis or B-axis, respectively, is more vertical than the other axes.

It is obvious from a comparison of the earthquake numbers that different tectonic regimes are characterized by varying ratios of focal mechanisms (Frohlich 2001). Whereas in subduction regions earthquakes are prevalently of the thrust type, the plurality of collision zone earthquakes are strike-slip, and intracontinental events most often have the normal focal mechanism. The mid-ocean earthquakes have a strong predominance of strike-slip-type events (see below).

First, we consider the β-values in subduction zones. Frohlich & Davis (1993, their Table 3) found that in a global Harvard catalogue the b-values depend on the earthquake focal mechanism. They obtained b = 0.86, 0.77 and 1.06 for thrust, strike-slip and normal earthquakes, respectively. Since global seismicity values may be influenced by a mixture of different earthquake populations, we compare the β-values for a set of zones which have a more homogeneous population (Kagan 1997, 1999). Subduction earthquakes display the behaviour suggested by Frohlich & Davis (1993),

β(normal) > β(thrust) > β(strike-slip),

(49)

at least for mt = 5.4 or 5.6; for these observational thresholds β is determined with better accuracy. However, after calculating the z-ratio (Kagan 1997, eq. 5)

z = (β1 − β2)/(σ1² + σ2²)^1/2,

(50)

the differences are not statistically significant.
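The z-ratio comparison of two β-estimates can be sketched as below; it assumes the standard two-sample form with independent, approximately Gaussian errors, and the numerical values are hypothetical rather than taken from Table 5:

```python
import math

def z_ratio(beta1, sigma1, beta2, sigma2):
    # Two-sample z-ratio for the difference of two beta estimates,
    # assuming independent, approximately Gaussian errors.
    return (beta1 - beta2) / math.sqrt(sigma1 ** 2 + sigma2 ** 2)

# Hypothetical values: the difference is not significant at the 95 per
# cent level (|z| < 1.96), mirroring the conclusion in the text.
z = z_ratio(0.66, 0.02, 0.61, 0.03)
```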

There is no statistically significant difference among all other entries, except for strike-slip and normal earthquakes in mid-ocean zones, and the difference in the β-values between these data sets increases with a decrease of the threshold. This increase is caused by decreasing β for strike-slip events, showing that this population does not follow a power law. Hence it probably includes a mix of different subpopulations. The normal and strike-slip earthquakes in mid-ocean zones clearly belong to dissimilar populations and therefore cannot be combined.

The β-values for normal mid-ocean earthquakes exceed 1.0; such a value would imply that the expression for the average moment (or moment flux) diverges for M→0 (Knopoff & Kagan 1977). This divergence in turn signifies that most tectonic deformation is released by the smallest earthquakes, and necessitates the introduction of a ‘minimum moment’ (see Section 2.2). It would also imply that mid-ocean normal earthquakes have a physical mechanism different from that of all other earthquakes. Thus, in the absence of strong confirming evidence, values of β ≥ 1.0, in this data set and generally, should be regarded sceptically, as being due to some as yet unknown systematic effects.

If we use the global β-values in Table 5, the difference between focal mechanisms (eq. 49) becomes statistically significant. This obviously results from mixing mid-ocean and other earthquakes in one data set. This conclusion again highlights the necessity of using a homogeneous earthquake population in a statistical analysis.

In Table 6 we display the results for earthquakes with different focal mechanisms in subduction zones: it is only for these zones that the number of events is sufficient to accurately infer the value of the corner moment (see Section 4.2). The table entries confirm the results of Table 5: the average β-value 0.63 is consistent with all focal mechanisms. The behaviour of various estimates of the corner magnitude is also similar to that in Table 4: the correlation, ρ, between the β- and mc-estimates is low.

Table 6. The β- and mc-values in the Harvard CMT catalogue for earthquakes of different focal mechanisms in subduction zones.

n, the number of shallow (depth limit 0–70 km, m ≥ mt) events; the β- and mc-estimates and their standard errors are calculated according to the formulae in the Appendix; ρ is calculated according to eq. (A22); the corner magnitude and its standard deviation are measured according to eqs (29) and (30), with β taken from eq. (22), or according to eqs (31) and (32), with β taken from eq. (24); a further estimate is defined by eq. (35), with β = 2/3 assumed; υ is the coefficient of variation, eq. (48).

The values of the coefficient of variation (see eq. 48) for strike-slip and normal earthquakes approach or exceed 0.5, indicating that mc→∞ cannot be ruled out for these events by the available data. For strike-slip and normal earthquakes the catalogue size is relatively small, and νc (eq. 47) is of the order of 0.5–1.4. As Table 3 shows, the mcm estimates need to be increased by 0.05–0.15 to correct for bias. Taking this into account, a comparison of the magnitude entries indicates that the corner magnitude value is similar for all focal mechanisms and thresholds; at least, no consistent pattern of mc differences is observable in these tables. The similarity of the parameters of the moment–frequency relation (the mc- and β-values) for events with disparate focal mechanisms in subduction zones suggests that the factors that control the earthquake size distribution are largely invariant with respect to the mode of earthquake rupture. This confirms the idea of the universality of the earthquake size distribution.

In approximating the earthquake size distribution, we did not try to model a possible change of the β-value (a break in the scaling relation) owing to the finite thickness of seismogenic layers (Pacheco et al. 1992; Triep & Sykes 1997). This break is expected to occur between magnitudes 6.0 and 7.5, depending on the crustal thickness and the mode of earthquake rupture. In our parameter evaluations, such a break should manifest itself as a consistent increase of the β-values when the threshold magnitude increases, and as a dependence of the parameter values on the focal mechanism of an earthquake. No such pattern is obvious in Tables 4–6. This confirms the results of Kagan (1994), Sornette et al. (1996), Main (2000) and Leonard et al. (2001): there is no observable scaling break in the moment–frequency relation in the mid-size magnitude range. The exponential taper at the corner magnitude explains the properties of the relation.

5.2.3 Mid-ocean earthquakes

As we see from Table 5, mid-ocean earthquakes cannot be analysed statistically as a uniform population: the behaviour of earthquakes with a normal focal mechanism that generally occur at spreading ridges differs from strike-slip events in ocean transform faults. Our analysis, reported elsewhere (Bird et al. 2000, 2001), suggests that further subdivision of the mid-ocean earthquake catalogues is necessary to obtain homogeneous samples. Most probably, the reason for such an inhomogeneity is the rapid change of the seismogenic lithosphere thickness.

Bird et al. (2000, 2001) subdivided the mid-ocean earthquakes available from the Harvard catalogue according to their focal mechanism and the velocity of either ridge spreading (for normal events) or transform-fault deformation. One drawback of this analysis is the retrospective character of the earthquake assignment (Kagan 1997, 1999): Flinn–Engdahl zones and their boundaries were proposed before the Harvard data were collected, whereas in the study by Bird et al. (2000, 2001) the earthquake data themselves can potentially be used in delineating zone boundaries. However, in the absence of other detailed tectonic classifications of mid-ocean earthquakes, we use this regionalization to study their corner moment. We subdivide the two groups of earthquakes (strike-slip and normal) into three approximately equal subpopulations according to the deformation velocity of their causal tectonic features.

Fig. 4 displays the moment–frequency relation for three groups of strike-slip earthquakes on the transform faults. The curves suggest that in the moment range 10^17.1–10^18.3 N m (m = 5.4–6.2) the earthquakes follow a scale-invariant pattern with the same value of β. The difference between the curves is due mainly to different corner moments.

The results of the β and mc evaluation are collected in Table 7. Owing to the relatively small number of events in each subset, the parameter determination accuracy is low, so our conclusions are tentative. Moreover, in contrast to earthquakes in continental and continent-boundary areas, the mid-ocean events have low mc-values; hence the seismic moment range is restricted. Consequently, as in the simulated catalogues in the second half of Table 3, the MLEs of β and of the corner moment are strongly correlated: instead of ρ-values of the order of 0.1–0.3 (Tables 4 and 6), the correlation coefficient is of the order of 0.4–0.6 for transform earthquakes and 0.7–0.9 for normal events. Hence, the parameter estimates listed in Table 7 may have even less accuracy than their individual standard errors indicate.

Figure 4. Earthquake cumulative number versus log seismic moment for global shallow earthquakes with M ≥ 10^17.1 N m (m ≥ 5.4) in the 1987/1/1–1998/12/31 Harvard catalogue, on ocean transform faults. The curves show the numbers of events with moment greater than or equal to M for groups of faults classified according to their velocity. We also show the approximation of the curves by TGR distributions. The parameter values are: velocity 0–36 mm yr^−1, β = 0.600, Mc = 45×10^18 N m (mc = 7.1); velocity 36–67 mm yr^−1, β = 0.650, Mc = 11×10^18 N m (mc = 6.7); velocity >67 mm yr^−1, β = 0.533, Mc = 3.2×10^18 N m (mc = 6.3).

The β-values for strike-slip earthquakes decrease (though to a lesser degree than in Table 5) as the threshold limit (mt) decreases. This implies that, even after subdividing the earthquake set according to fault velocity, earthquake sizes do not exactly follow a scale-invariant pattern, suggesting that we are still analysing a heterogeneous sample. A consistent decrease of the corner magnitude with an increase of the fault deformation velocity confirms this conjecture.

The β-values for strike-slip earthquakes are consistent with the value β = 0.63 obtained for subduction zones (Table 6). The large β errors are a consequence of the correlation between the β and mc estimates and of the small number of events in the catalogues. The β-values for earthquakes in the spreading ridges have very large uncertainties, possibly for the same reason; however, they do not contradict the universal β-value. Although the values of both the coefficient of variation υ (eq. 48) and νc (eq. 47) suggest that the corner moment should be determined with a low uncertainty, the mc-values for these earthquakes do not provide any consistent picture (cf. Bird et al. 2000, 2001). Owing to the limited moment range for these events, the actual accuracy of the mc determination is too low.

These results confirm the conclusion reached by Bird et al. (2000, 2001) that the distributions of oceanic events have β-values similar to those of subduction and continental earthquakes, and that the apparent increase of β-values with increasing mt is probably caused by a mix of dissimilar earthquake populations.

5.3 Influence of seismic moment inversion errors on the estimate of parameters

In this section we discuss random and systematic errors in determining the seismic moment, and their influence on evaluating the parameters of the moment–frequency law. Molchan & Podgaetskaya (1973, p. 47), Tinti & Mulargia (1985, p. 1690) and Pisarenko et al. (1996, eq. 32) discuss how magnitude errors affect the determination of the earthquake productivity, at, in eq. (3). For practical purposes, they point out, the estimate of b (or β) is not influenced by the errors. However, Rhoades (1996) and Rhoades & Dowrick (2000) indicated that, if magnitude errors vary with magnitude, the β estimate may be biased. Simulations by Tinti & Mulargia (1985) and by Rhoades & Dowrick (2000) suggest that the estimate of the corner magnitude would also be increased (biased) by random errors.

5.3.1 Errors in earthquake size determination

We use three methods to evaluate random and systematic errors in seismic moment determination:

  (a)

     we study inversion errors reported in the Harvard catalogue;

  (b)

     we use as a proxy for total errors the value of the CLVD component (or Γ index, see Kagan 2000) in each of the CMT solutions;

  (c)

     we compare the MT solutions in two catalogues—Harvard and USGS (Sipkin et al. 2000, and references therein).

Table 7. The β- and mc-values in the Harvard CMT catalogue for mid-ocean earthquakes.

n, the number of shallow (depth limit 0–70 km, m ≥ mt) events; the β and mc estimates and their standard errors are calculated according to the formulae in the Appendix; ρ is calculated according to eq. (A22); the fixed-β corner estimate is defined using eq. (35), with β = 2/3 assumed; υ is the coefficient of variation (eq. 48); νc is the number of events with m ≥ mc (eq. 47). Transform strike-slip earthquakes are earthquakes with strike-slip focal mechanisms occurring at transform mid-ocean faults; the velocity of these faults is shown in the first column. Spreading normal earthquakes, similarly, are earthquakes with normal focal mechanisms occurring on spreading ridges.

A relative error, ε, i.e. the ratio of the error tensor norm to the moment tensor norm, is (corrected eq. 3 in Kagan 2000)

ε = (Σij E²ij / Σij M²ij)^{1/2},   (51)

where Eij and Mij are error and moment tensor components, respectively. The distribution of ε is shown in Fig. 5; since the influence of the threshold value is insignificant for the relative error determination, we use mt = 5.4 in this plot. The coefficient of correlation between ε and the magnitude is −0.67, indicating that relative errors decrease with increasing m. We calculate two regression lines approximating the dependence of the errors on the magnitude—linear and quadratic curves are shown in the plot. Residual regression errors are close for both cases: σ = 0.222 and σ = 0.215, respectively. From the diagram (Fig. 5) it is clear that errors for earthquakes with m>6.5 deviate significantly from a linear trend. However, since the number of large events is small, the linear and quadratic cases differ relatively little in residuals.

Figure 5. Relative error ε versus moment magnitude for shallow earthquakes (mt = 5.4) in the 1982/1/1–1999/12/31 Harvard catalogue. The curves show two approximations: linear and quadratic fits.
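The regression comparison just described can be sketched on synthetic data. The slope, intercept and scatter below are assumed for illustration only, not taken from the Harvard catalogue; the point is the mechanics of comparing linear and quadratic fits of log ε on m and their residual standard deviations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the Fig. 5 data: log relative errors that decrease
# roughly linearly with magnitude (all numbers assumed for illustration).
m = rng.uniform(5.4, 8.0, size=2000)
log_eps = -0.35 * (m - 6.0) - 1.1 + rng.normal(0.0, 0.22, size=m.size)

def fit_and_sigma(deg):
    """Least-squares polynomial fit of log(eps) on m; returns the
    coefficients and the standard deviation of the residuals."""
    coef = np.polyfit(m, log_eps, deg)
    resid = log_eps - np.polyval(coef, m)
    return coef, resid.std(ddof=deg + 1)

coef1, sigma1 = fit_and_sigma(1)   # linear trend
coef2, sigma2 = fit_and_sigma(2)   # quadratic trend
r = np.corrcoef(m, log_eps)[0, 1]  # negative: errors shrink with size
print(f"corr = {r:.2f}, sigma_lin = {sigma1:.3f}, sigma_quad = {sigma2:.3f}")
```

As in the text, when the underlying trend is essentially linear, the residual standard deviations of the two fits are nearly equal.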

Table 8 displays the parameter values for the quadratic fit in the earthquake sets of three depth ranges:

lg ε = a0 + a1 m* + a2 (m*)²,   (52)

where m* = m − 6. The a-parameter values in Table 8 suggest that the distribution of relative errors is similar in all depth ranges, the only difference being the height, a0, of the approximating curve in the log–log plot (see Fig. 5).

We also analyse the Huang et al. (1997) catalogue of deep earthquakes. Again parameter values for this catalogue display a similar pattern, only the value of a0 is different from that of the Harvard list.

Thus, the ε-values for earthquakes in the magnitude range 5.5–6.5 can be represented as

(53)

with A0 = 0.07 for shallow events and A0 = 0.05 for intermediate and deep shocks in the Harvard catalogue. In this preliminary study we assume a constant relative error for m > 6.5 events: ε = 0.04 for shallow and ε = 0.03 for deeper earthquakes.

Table 8. Parameter values for relative errors.

n, the number of m ≥ mt events; for a0, a1 and a2 see eq. (52).

The relative error, ε, apparently represents only part of the total seismic moment error, possibly 1/3 to 1/2 of the total (Kagan 2000). The CLVD (Γ) index can be used as another measure of the magnitude uncertainty; the index can be determined for any MT solution. As Kagan (2000) argued, the true value of Γ should be close to zero for almost all earthquakes, whereas in earthquake catalogues the Γ-value is regularly of the order of ±0.1–0.5 (theoretically the CLVD index may vary from −1 to +1). The empirical Γ-distribution is usually symmetric around zero, so its mean is close to zero. However, the standard deviation (σΓ) of the CLVD index can indicate the total moment error.

The relation between the magnitude error (σm) and σΓ is not one-to-one; it varies slightly for different focal mechanisms (see Table 1 in Kagan 2000). For an approximate estimate of the ratio σM/M we use the simulation results shown in that table; they suggest that σM/M ≈ σΓ/2.5. Then (cf. eq. 34)

(54)

Fig. 6 shows an example of how the standard deviation of the CLVD index (Γ) depends on the magnitude, together with a linear regression curve approximating this dependence. To obtain this plot, we calculated the CLVD index for earthquakes binned into 0.1-magnitude intervals and determined σΓ for bins containing five or more events. As the number of earthquakes decreases with increasing magnitude, the scatter of the σΓ estimates increases, as is obvious from the plot. We obtained similar plots for several subdivisions of the Harvard catalogue (such as those in Table 8), and all show almost the same pattern. The correlation coefficient between σΓ and m varies from −0.5 to −0.74 among the diagrams (it is −0.74 in Fig. 6), and the values of the regression parameters are also close.

Figure 6. Dependence of the Γ index on moment magnitude for shallow earthquakes (mt = 5.4) in the 1982/1/1–1999/12/31 Harvard catalogue. The linear regression curve is also shown.
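The binning procedure described above can be sketched as follows. The magnitude sample and the Γ scatter model are invented for illustration (a symmetric Γ whose spread shrinks with magnitude); only the 0.1-magnitude binning with a five-event minimum follows the text:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the CLVD (Gamma) indices: G-R-like magnitudes
# (b = 1 assumed) and a symmetric Gamma whose scatter shrinks with
# magnitude (scatter model assumed for illustration only).
n = 4000
m = 5.4 + rng.exponential(1.0 / np.log(10.0), size=n)
gamma = rng.normal(0.0, np.clip(0.25 - 0.08 * (m - 5.4), 0.05, None))

# Bin into 0.1-magnitude intervals and keep bins with >= 5 events,
# as in the text; then estimate sigma_Gamma per bin.
edges = np.arange(5.4, m.max() + 0.1, 0.1)
centers, sigmas = [], []
for lo, hi in zip(edges[:-1], edges[1:]):
    sel = (m >= lo) & (m < hi)
    if sel.sum() >= 5:
        centers.append(0.5 * (lo + hi))
        sigmas.append(gamma[sel].std(ddof=1))

r = np.corrcoef(centers, sigmas)[0, 1]
print(f"{len(centers)} bins, corr(m, sigma_Gamma) = {r:.2f}")
```

The negative correlation between the binned σΓ and m mirrors the −0.5 to −0.74 values reported for the catalogue subdivisions.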

As a final method of assessing possible moment/magnitude errors, we compare the data for identical earthquakes listed in two catalogues (Harvard and USGS). The difference in the seismic moment or magnitude values provides some insight into the uncertainties of earthquake size determination. One should expect random errors and, to some degree, systematic errors to differ between catalogues that use different methods of seismogram interpretation.

Fig. 7 displays one example of such a comparison. We binned the differences Δ lg M = lg M1 − lg M2 for earthquakes paired between the Harvard and USGS catalogues, where M1 is the scalar seismic moment in the Harvard catalogue and M2 that in the USGS catalogue. If a bin contained 10 or more events, we determined the average difference 〈Δ lg M〉 and its standard deviation σΔ. In the diagram, we subtract and add σΔ to 〈Δ lg M〉 to obtain the limits of the variability of the difference.

Figure 7. Difference in the log ratio of scalar seismic moments for two catalogues of moment tensor solutions, for shallow earthquakes (mt = 5.6) in the 1982/1/1–1999/12/31 Harvard and USGS catalogues. The central solid curve shows the average difference in the log ratio; the two dashed curves add and subtract the standard error of the difference.

Table 9. Difference in log ratio for Harvard and USGS catalogues.

n, the number of m ≥ mt events; Δ lg M, the average difference in log moments for the two catalogues; σΔ, the standard deviation of that difference.

We obtained analogous plots for several subdivisions of the catalogues in time and depth limits. All the diagrams display a similar pattern.

(a) For most of the magnitude range (5.4 ≤ m ≤ 7), the average difference is small (close to zero).

(b) The mean difference for large earthquakes (m > 7) is usually positive, i.e. the Harvard magnitude value is higher than that of the USGS by 0.05–0.2. Sipkin (1994) notes that the filter employed by the USGS inversion technique '… can lead to a bias in the estimated scalar moments larger than approximately 10^20 N m [m > 7.33]'.

(c) In contrast to the previous two figures (5 and 6), there is no clear dependence of σΔ on the magnitude.

In Table 9, we list the average differences Δ lg M and their standard deviations for paired earthquakes in the two catalogues, Harvard and USGS. The value of σΔ is similar to that obtained by Helffrich (1997), who quotes the following standard deviations: 0.15, 0.06 and 0.07 for shallow, intermediate and deep seismicity, respectively.

Rhoades (1996) and Rhoades & Dowrick (2000) argue that magnitude errors increase as the magnitude itself increases. As the discussion above shows, the evidence from MT solutions suggests the opposite: relative errors depend strongly on earthquake size and decrease with increasing magnitude. One would expect such a dependence, since large earthquakes are registered by many stations and have a higher signal-to-noise ratio; hence, the relative error of the seismic moment should be smaller.

We performed similar computations for the seismic moment and the local magnitude (ML) data in Uhrhammer et al. (1996). Both data sets show a decrease of relative errors with magnitude. A comparison of the magnitudes in disparate catalogues (Harte & Vere-Jones 1999; Peresan et al. 2000) shows no significant increase of the magnitude difference variation with magnitude either.

5.3.2 The β error correction

Rhoades & Dowrick (2000, eq. 10) suggest the following correction for magnitude estimates perturbed by random errors:

(55)

where mcorr is the corrected magnitude, mmeas is the measured magnitude and σm is the standard magnitude error.

To apply this formula to the b or β correction, we estimate σm at two magnitude values (5.5 and 6.5) and use eq. (55) to compute δm = mcorr − mmeas. Performing such calculations for the relative error ε, eq. (53), we obtain a correction for β of shallow earthquakes of 0.0012, i.e. β is decreased by about 0.18 per cent. A similar calculation using the Γ dependence on the magnitude (Fig. 6) yields a decrease of β by about 0.08 per cent. These estimates imply that, unless the dependence of total magnitude errors on the magnitude is much stronger than suggested by the evidence collected in the previous subsection, the β-bias is small and can be ignored.

5.3.3 The corner magnitude error correction

Since the authors cited above use the magnitude in their analysis of error correction, we convert the distribution formulae to magnitude. We use the reduced moment variable of eq. (44), μ = log S, 0 ≤ μ < ∞, and obtain for the complementary distribution function and the probability density

(56)
(57)

To apply expression (55) to the corner magnitude correction, we evaluate the apparent β-value at the log corner moment, μ = log Sc = −log η,

(58)

and

(59)

and for the logarithmic derivative of Φ

(60)

where Φη denotes Φ(μ = −log η), and similarly for φη. To calculate the corner magnitude correction, we employ eq. (55), replacing β by 1 + β. Such an estimate is approximate, since the logarithmic derivative of Φ is not constant near mc; simulations show, however, that the correction term of eq. (55), modified as described above, is accurate to within about 10 per cent.

We calculate the value of δc assuming (see above) that the relative error at the corner magnitude is 0.03. Using eqs (54), (55) and (60), we find that the average correction to the mc-value is −0.003, i.e. it can be neglected. If we presume that the difference between the Harvard and USGS moment estimates reflects the total errors in both catalogues, and that the errors are equal, then graphic, and the mean magnitude bias is −0.014.

The values quoted above are averages. Since the catalogues we work with contain very few earthquakes exceeding mc, the magnitude errors of these individual events would play a major role in evaluating the corner moment. Thus, the mc estimate would be largely controlled by the moment values of these events. For such catalogues the above-mentioned bias values are to be interpreted as related to the averages one would obtain if many such catalogues were processed.

5.4 Seismic moment distribution for pre-1977 data

5.4.1 Catalogues and results

Here we analyse the earthquake size distribution inferred from pre-digital seismological data, i.e. the 1900–1976 instrumental earthquake catalogues. The quality of these data is significantly lower than that of the modern data analysed above, and it deteriorates for older data sets. As a result, the data may lack the stability necessary for meaningful statistical analysis.

Kanamori (1977) performed seismic moment calculations for earthquakes occurring before 1977. Using his data for 42 earthquakes m ≥ 8.0 in 1921–1976, we obtain the following estimates of the TGR distribution parameters: graphic, graphic=9.6 (in this subsection we use the same notation as in Table 4). The 95 per cent lower limit for the corner magnitude is at about 9.1 and the upper limit is ∞.

We obtain similar results for the catalogue of Pacheco & Sykes (1992): for the period 1905–1976 the number of earthquakes with m ≥ 8.0 is 56, and the MLEs are graphic, graphic=9.7, the 95 per cent lower limit for mcm is at 9.0. If the gamma distribution is used to fit both data sets, mcg = 10.0 for both catalogues. However, the lower 95 per cent limit is approximately the same as for the TGR distribution (Kagan 1994). For both of these catalogues (Kanamori 1977; Pacheco & Sykes 1992) the null hypothesis (the G-R relation) cannot be rejected.

We analyse the Centennial (1900–1999) catalogue by Engdahl & Villaseñor (2001). The catalogue is complete to magnitude 6.5 (Ms/mB or their equivalent) during the period 1900–1963 and 5.5 (Ms/mB or their equivalent) during the 1964–1999 period. Almost all earthquakes have been relocated using an algorithm by Engdahl et al. (1998). Up to eight different magnitudes for each earthquake are listed in the catalogue.

We use the maximum of the available magnitudes as a substitute for the moment magnitude and construct a moment–frequency histogram. There are 1623 shallow earthquakes in the catalogue with m ≥ 7.0; of these, 30 events have m ≥ 8.5. The 95 per cent confidence area in the likelihood map for shallow seismicity is not well constrained and extends to infinity. The MLEs for the parameters are β = 0.76±0.02 (mu = 8.5) and mcm = 9.30±0.27, with a minimum of the 95 per cent confidence area at mcm = 8.9.

Results of the seismicity analysis for the pre-1977 data imply that the β-values are consistent with the results of the modern catalogue analysis. Minor differences may be explained by the inferior quality of early 20th century data. However, the values of the corner magnitude inferred from the Harvard data for shallow seismicity seem to be inconsistent with the estimates obtained in this section as well as with the magnitudes of a few earthquakes in the mid-20th century. Pérez & Scholz (1997) find that at least four earthquakes in the 20th century exceeded the moment magnitude 9.0.

The 95 per cent upper limit for mcm in the Harvard data is 8.4 (Fig. 2b). Even if we accept this value then, according to eq. (11), the probability of an earthquake with m ≥ 8.84 is 100 times below the level that a pure G-R law would yield. This means, in turn, that after extrapolating the Harvard results (Figs 1a and b), taking β = 2/3 and the number of earthquakes with m ≥ 5.8 to be 3765 in 23 years, the probability of an m ≥ 8.84 earthquake in 100 years according to the TGR distribution is about 0.15. The centennial probability decreases to less than 0.0037 for an m ≥ 9.0 event. However, if we approximate the moment–frequency relation by the gamma distribution, the upper 95 per cent limit is 8.84 (Fig. 2a); then the probability of an m ≥ 9 earthquake in 100 years is about 0.36 (see Fig. 1b).
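The extrapolation above can be checked with a short script. The sketch assumes the standard TGR survival function Φ(M) = (Mt/M)^β exp[(Mt − M)/Mc] and the moment–magnitude relation lg M = 1.5m + 9.05; the rate (3765 earthquakes with m ≥ 5.8 in 23 years) and the parameter values are those quoted in the text:

```python
import math

def moment(m):
    """Scalar seismic moment (N m) from moment magnitude."""
    return 10.0 ** (1.5 * m + 9.05)

def tgr_survival(M, Mt, beta, Mc):
    """Tapered G-R survival function: Pr(moment >= M | moment >= Mt)."""
    return (Mt / M) ** beta * math.exp((Mt - M) / Mc)

beta, m_t, m_c = 2.0 / 3.0, 5.8, 8.4        # values used in the text
rate = 3765.0 / 23.0                        # m >= 5.8 events per year
Mt, Mc = moment(m_t), moment(m_c)

# Expected numbers of events per century and Poisson occurrence probabilities.
lam884 = 100.0 * rate * tgr_survival(moment(8.84), Mt, beta, Mc)
lam9 = 100.0 * rate * tgr_survival(moment(9.0), Mt, beta, Mc)
print(f"m >= 8.84: P = {1.0 - math.exp(-lam884):.3f}")   # close to the 0.15 in the text
print(f"m >= 9.00: P = {1.0 - math.exp(-lam9):.4f}")     # below the 0.0037 bound
```

The computed values (expected numbers of about 0.15 and 0.0037 per century) reproduce the probabilities quoted above.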

5.4.2 Discrepancy between modern and old data analysis

Why do the analysis results for the modern (1977–1999) and the old (1900–1976) data disagree? Let us first consider the influence of magnitude/moment errors on the moment–frequency distribution. As expression (55) demonstrates, the bias in the corner magnitude estimate increases quadratically with the magnitude error; if we assume, for example, σm = 0.5, the correction term would be about −0.7, i.e. enough to explain the different mcm-values for pre-digital and post-digital seismological data. Thus, the discrepancy may be within normal statistical variability, i.e. the 95 per cent confidence limits for the two data sets may actually intersect.
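The quadratic growth of the bias can be checked numerically. Eq. (55) is not reproduced here; the closed form below is our reading of a Rhoades & Dowrick-type correction, δm = −(ln 10/2) b σm², with the local magnitude–frequency slope b = 1.5(1 + β) near the corner (an assumption, not the paper's own formula):

```python
import math

def corner_bias(beta, sigma_m):
    """Corner-magnitude bias from Gaussian magnitude errors, assuming the
    quadratic form delta_m = -(ln 10 / 2) * b * sigma_m**2 with the local
    magnitude-frequency slope b = 1.5 * (1 + beta) near the corner
    (our reading of the eq. 55 correction, not the paper's exact formula)."""
    b = 1.5 * (1.0 + beta)
    return -0.5 * math.log(10.0) * b * sigma_m ** 2

# With beta = 2/3 and sigma_m = 0.5 the bias is about -0.72, consistent
# with the "about -0.7" correction quoted in the text.
print(f"{corner_bias(2.0 / 3.0, 0.5):.2f}")
```

Under this reading, halving σm to 0.25 shrinks the bias by a factor of four, which is why the modern, lower-error data are far less affected.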

Moreover, individual earthquake magnitude values in the early part of the 20th century are subject to significant variations depending on the evaluation method used as well as other factors. For example, the seismic moment values of Pacheco & Sykes (1992) often differ by a factor of 2–10 among the various publications cited by them. Similarly, in the catalogue of Engdahl & Villaseñor (2001), many magnitude values vary, even when the same magnitude scale is used, by 0.2–0.5 units and more.

There are relatively few large earthquakes in the modern data set. In the Harvard catalogue we find only 12 shallow events with m ≥ 8.0 and none with m ≥ 8.4. As indicated earlier (end of Section 3.1.2), statistical conclusions based on asymptotic theory may not be valid: if two or three earthquakes with m > 8.5 were to occur in the next few years, the Mc estimate could increase drastically. Two large deep earthquakes in 1994 strongly modified the Mc estimate for deep seismicity (Figs 2a and b in Kagan 1999).

The right-hand tail of the earthquake size distribution may be heavier than the exponential taper of the TGR. As we show above, the gamma distribution yields a significantly higher probability of occurrence for a magnitude 9 event. A power law with a stretched-exponential tail (cf. eq. 11)

(61)

where c is a stretching exponent (c≤1.0), can be used to fit data. Stauffer & Aharony (1994, their eq. 42) propose the following relation for the number ns of percolation clusters of size s:

(62)

where d is the dimensionality of space. Taking d = 3 and using the example at the end of Section 5.4.1, we obtain a probability of 0.19 per century for an m ≥ 9 earthquake. This value is easier to accept as an explanation for the discrepancy.
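The 0.19-per-century figure can be approximately reproduced. Eq. (61) is not shown here; the sketch below assumes one plausible form, a power law tapered by a stretched exponential with exponent c = 1 − 1/d = 2/3 for d = 3, normalized to unity at M = Mt (both the form and normalization are assumptions), with the same rate and parameters as in the Section 5.4.1 example:

```python
import math

def moment(m):
    """Scalar seismic moment (N m) from moment magnitude."""
    return 10.0 ** (1.5 * m + 9.05)

def stretched_survival(M, Mt, beta, Mc, c):
    """Power law tapered by a stretched exponential (one plausible reading
    of eq. 61; the normalization at M = Mt is an assumption)."""
    return (Mt / M) ** beta * math.exp((Mt / Mc) ** c - (M / Mc) ** c)

beta, m_t, m_c = 2.0 / 3.0, 5.8, 8.4   # as in the Section 5.4.1 example
c = 1.0 - 1.0 / 3.0                    # stretching exponent for d = 3
rate = 3765.0 / 23.0                   # m >= 5.8 events per year
lam = 100.0 * rate * stretched_survival(moment(9.0), moment(m_t),
                                        beta, moment(m_c), c)
print(f"expected m >= 9 events per century: {lam:.2f}")
```

Under these assumptions the expected number comes out close to the 0.19 quoted in the text, roughly fifty times the pure TGR value.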

As another model for the moment–frequency relation, we can use two power laws (Sornette et al. 1996). This and eq. (61) would yield a more gradual decline of the earthquake probability at very large moment values. However, as mentioned earlier [item (B) in Section 2.2.1], in general, these distributions require three adjustable parameters, and the data presently available do not seem to support more than two degrees of freedom (Utsu 1999).

Another possibility is to use physical or statistical insight in selecting the proper distribution form for the seismic moment. Kagan (1991), Sornette & Sornette (1999), Main et al. (2000) and Vere-Jones et al. (2001) discuss various implications of the choice of distribution (see also Section 2.2.1). However, at present these considerations offer little help in selecting the right-hand tail of the distribution: our main line of evidence is still empirical.

The corner moment may vary in continental areas, but these variations are, on the one hand, too small to be noticeable with the present data and, on the other hand, sufficient to explain a few very large m > 9.0 events. If, for instance, we assume that mcm = 9.0, the calculation of eq. (47) shows that νc would be less than 0.9 for the 1977–1999 Harvard data. This means that even for a global data set the G-R hypothesis could not be rejected (see Section 4.2.1). Therefore, any attempt to determine the corner magnitude for finer subdivisions of the catalogue would be inconclusive in this case.

Still another explanation for the discrepancy is global seismicity fluctuation: the last 25 years have been unusually quiet, with a low release of seismic energy. Finally, a combination of the above factors may explain the discrepancy.

The best way to resolve this problem with available data is to systematically process pre-digital seismograms of large shallow earthquakes, as Huang et al. (1997) and Chen et al. (2001) did, using the Harvard technique. In this way, a catalogue of these events complete to a reasonably low threshold can be compiled. In contrast to the results for great individual earthquakes, which are obtained using various assumptions and techniques, such a catalogue would provide an appropriate set with which to test whether the Harvard data are sufficient to evaluate the upper bound for the seismic moment.

6 Discussion

The results of the statistical analysis of earthquake catalogues present, in our opinion, convincing evidence that the standard G-R relation needs to be re-evaluated as a tool for the description of the earthquake size distribution. Physical and geometrical considerations, namely conservation of energy and the finiteness of seismogenic regions, require that an upper seismic moment/magnitude bound be introduced into the distribution. The empirical evidence reported above supports this modification of the G-R law.

Introduction of the distribution upper bound parameter makes it necessary to re-interpret the moment–frequency or magnitude–frequency relations. From the statistical analysis presented above, it is clear that in almost all cases the earthquake size distribution can be adequately fitted by a theoretical relation with a universal value of β. Available data for earthquakes occurring at plate boundaries and in continental interiors do not show significant variation of the corner moment. However, mid-ocean earthquakes have a significantly smaller upper moment bound than events in continents and at their boundaries. If a variable corner moment is introduced in the distribution, mid-ocean events appear to have the same β-value as the rest of the global earthquake set.

How can we explain the many reports of large b- and β-value fluctuations? First, before any conclusion concerning b or β variation can be proposed, the systematic and random errors in the determination of these parameters should be thoroughly and systematically investigated. As Kagan (2000) indicated, even for seismic moment determination, systematic errors appear to dominate the total error balance. For magnitude data, both kinds of errors should be higher. For example, the b-value for the body-wave magnitude, mb, is usually higher by a factor of 1.5–2.0 than the b-value for the surface-wave magnitude, MS (Section 4.4 in Kagan 1999). This fact should serve as a strong indicator of significant systematic effects. Thus, any apparent variation of the b-value needs to be carefully examined, and the influence of systematic bias should especially be investigated. Standard statistical tests are appropriate only if errors are random and independent; in the presence of strong systematic effects in the data, the test results are meaningless.

Secondly, if the possibility of upper bound variations is accepted, one should inspect an earthquake catalogue very carefully for data uniformity. As we explained above, a mix of earthquake populations with varying corner moment/magnitude may have the appearance of a more-or-less linear G-R relation with a substantially higher b-slope.

In some cases, simple physical considerations are sufficient to expect that the corner moment will differ significantly from that of other tectonic earthquakes. For example, in volcanic regions the rock volume subjected to high stress is limited, hence the maximum earthquake cannot be very large. Since the geometry of the volcanic volume is usually complex, it is difficult to select a uniform set of earthquakes for statistical analysis. Similarly, close to the boundaries of the seismogenic zone, such as the free surface, the proportion of large earthquakes should be smaller than in the interior of the layer (see Section 1 in Kagan 1999). Therefore, if the earthquake size distribution is approximated by a pure G-R law, the estimated b parameter will be biased upwards.

7 Conclusions

The following conclusions can be drawn from our work.

(1) We discuss various theoretical distributions that can be used to approximate seismic moment data.

(2) We consider various statistical procedures for the estimation of parameters; using simulation, we estimate the accuracy of these procedures.

An analysis of the Harvard catalogue data undertaken in this paper suggests the following conclusions.

(1) The β-value for moderate earthquakes has a universal value of about 0.63.

(2) Since β < 1.0, the earthquake size distribution must have an upper bound at a high seismic moment or magnitude value. Hence, the distribution has a heavy (even super-heavy) 'body' and a 'light' tail. An appropriate functional form for the upper bound is an exponentially tapered (not truncated) Gutenberg–Richter relation.

(3) The corner magnitude mc for earthquakes in all continental regions and at their boundaries is similar and falls in the range 8.0–8.5. The earthquake size distribution is influenced by the geometry of seismogenic zones: the corner moment for mid-oceanic ridge earthquakes appears to be 1–3 units of magnitude smaller (mc = 5.8–7.3) than that for plate boundary regions and continental areas.

(4) We propose a new paradigm for the earthquake size distribution: most reported variations of the b-value (in the magnitude–frequency relation) are artefacts arising from magnitude deficiencies and defects of analysis. Some apparent changes in the β-value (e.g. in volcanic areas, for very shallow earthquakes, etc.) may be genuine effects caused by the mixing of earthquake populations with different mc.

Acknowledgments

I appreciate partial support from the National Science Foundation through grant EAR 00-01128 and from the Southern California Earthquake Center (SCEC). SCEC is funded by NSF Cooperative Agreement EAR-8920136 and USGS Cooperative Agreements 14-08-0001-A0899 and 1434-HQ-97AG01718. This work has greatly benefited from stimulating discussions with D. D. Jackson, P. Bird, F. Schoenberg and D. Sornette (UCLA), D. Vere-Jones (Wellington University), V. F. Pisarenko (Russian Academy of Science) and I. Main (University of Edinburgh). Reviews by I. Main and D. Vere-Jones have been very helpful in revising the manuscript. The SCEC contribution number is 597.

Appendix A: Likelihood function for the TGR distribution

In this Appendix we review the likelihood function for the tapered Gutenberg–Richter distribution and the evaluation of both its parameters, β and Mc. As in Section 4.2, we use the reduced seismic moment S, eq. (44). For n observations, the log-likelihood function is (see eq. 37)

(A1)

or

(A2)

or

(A3)

where G = 10^{3/2}.

Taking the derivative of the log-likelihood and setting it to zero, we obtain, for instance, two equations for MLEs of β and η in eq. (A2)

(A4)

and

(A5)

In these equations

(A6)

and

(A7)

Hence, the MLEs of two parameters should satisfy (Vere-Jones et al. 2001)

(A8)

and we estimate η̂ by solving a single equation

(A9)

The solution η̂ to this last equation must satisfy 0 < η̂ < 1/(B̄ − Ā min_i{S_i}), and an iteration starting within this interval converges to η̂ (Vere-Jones et al. 2001). The MLE for β is obtained from eq. (A8).
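The profile-likelihood solution of eqs (A8)–(A9) can be sketched numerically. The grid limits, sample size and seed below are our own choices, and instead of the fixed-point iteration we bracket the root of the η derivative by bisection (the log-likelihood is concave in η); the simulation exploits the fact that a tapered Pareto variate is the minimum of a pure Pareto variate and a shifted exponential variate:

```python
import math
import random

def tgr_loglik(beta, eta, s):
    """Log-likelihood of the tapered G-R law for reduced moments s_i >= 1:
    density f(s) = (beta/s + eta) * s**(-beta) * exp(-eta*(s - 1))."""
    return sum(math.log(beta / x + eta) - beta * math.log(x) - eta * (x - 1.0)
               for x in s)

def tgr_mle(s):
    """Profile MLE: for each beta on a grid, solve d(loglik)/d(eta) = 0 by
    bisection, then keep the (beta, eta) pair with the largest likelihood."""
    B = sum(x - 1.0 for x in s)

    def dl_deta(beta, eta):
        return sum(1.0 / (beta / x + eta) for x in s) - B

    def eta_hat(beta):
        lo, hi = 1e-12, 1.0
        while dl_deta(beta, hi) > 0.0:      # expand until the root is bracketed
            hi *= 2.0
        for _ in range(40):
            mid = 0.5 * (lo + hi)
            if dl_deta(beta, mid) > 0.0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    grid = [0.45 + 0.01 * i for i in range(46)]      # beta in [0.45, 0.90]
    return max(((b, eta_hat(b)) for b in grid),
               key=lambda p: tgr_loglik(p[0], p[1], s))

# Simulated catalogue with known parameters (eta = M_t / M_c).
random.seed(3)
beta_true, eta_true = 0.65, 0.01
s = [min(random.random() ** (-1.0 / beta_true),      # pure Pareto tail
         1.0 + random.expovariate(eta_true))         # exponential taper
     for _ in range(3000)]
beta_est, eta_est = tgr_mle(s)
print(f"beta_hat = {beta_est:.2f}, eta_hat = {eta_est:.4f}")
```

As the simulations in the text suggest, β is recovered tightly even for moderate samples, whereas η (and hence the corner moment) is controlled by the few events near the corner and carries a much larger relative uncertainty.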

The MLEs of all parameters for the distribution upper bound in eqs (A1)–(A3) are related by eq. (45). However, their standard errors differ. To estimate uncertainties we obtain a matrix of the log-likelihood second derivatives and invert it (Jenkins & Watts 1968, Ch. 4). For eq. (A1) we obtain

(A10)
(A11)
(A12)

where Šc is the MLE of the reduced corner moment.

Similarly for eq. (A2)

(A13)
(A14)
(A15)

For eq. (A3)

(A16)
(A17)
(A18)

where, to simplify the equations, we replaced graphic by sc.

Standard errors of the estimate are

(A19)

where D is the determinant of the second derivatives matrix

(A20)
(A21)

and the coefficient of correlation between two MLEs is

(A22)

For eqs (A1) and (A3) the standard errors are calculated similarly to eqs (A19)–(A22).

References

1

Abramowitz
M.
Stegun
I.A.
1972
Handbook of Mathematical Functions
pp.
1046
Dover
NY

2

Aki
K.
1965
Maximum likelihood estimate of b in the formula log N = abM and its confidence limits
Bull. Earthquake Res. Inst. Tokyo Univ
43
237
239

3

Anderson
J.G.
Luco
J.E.
1983
Consequences of slip rate constraints on earthquake occurrence relation
Bull. seism. Soc. Am
73
471
496

4

Bird
P.
Kagan
Y.Y.
Jackson
D.D.
2000
Frequency–magnitude distribution, effective lithosphere thickness, and seismic efficiency of oceanic transforms and spreading ridges
Eos Trans. AGU
81
3
(abstract), p.
WP147
.

5

Bird
P.
Kagan
Y.Y.
Jackson
D.D.
2001
Plate tectonics and earthquake potential of spreading ridges and oceanic transform faults
in
Plate Boundary Zones, AGU Monograph
eds.
Stein
S.
Freymueller
J.T.
in press

6

Boettcher
N.
Jordan
T.H.
2001
Seismic behavior of oceanic transform faults
EOS Trans. AGU
82
(
47
) Fall AGU Meet. Suppl., (abstract), S32E-07 p.
F882

7

Chen
P.-f.
Nettles
M.
Okal
E.A.
Ekström
G.
2001
Centroid moment tensor solutions for intermediate-depth earthquakes of the WWSSN-HGLP era (1962–1975)
Phys. Earth planet. Inter
124
1
7

8

Clark
R.M.
Cox
S.J.D.
Laslett
G.M.
1999
Generalizations of power-law distributions applicable to sampled fault-trace lengths: model choice, parameter estimation and caveats
Geophys. J. Int
136
357
372

9

Deemer
W.L.
Votaw
D.F.
1955
Estimation of parameters of truncated or censored exponential distributions
Ann. Math. Stat
26
498
504

10

Dziewonski
A.M.
Ekström
G.
Maternovskaya
N.N.
2001
Centroid-moment tensor solutions for April–June 2000
Phys. Earth planet. Inter
123
1
14

11

Engdahl
E.R.
Van der Hilst
R.D.
Buland
R.P.
1998
Global teleseismic earthquake relocation with improved travel times and procedures for depth determination
Bull. seism. Soc. Am
88
722
743

12

Engdahl
E.R.
Villaseñor
A.
2001
Global seismicity
Chapter 38 in
IASPEI Handbook of Earthquake and Engineering Seismology
eds
Jennings, P., Kanamori, H. & Lee, W., in press, ftp://ciei.colorado.edu/pub/user/engdahl/~Handbook/

13. Fineberg, J. & Marder, M., 1999. Instability in dynamic fracture, Phys. Rep., 313, 1–108.
14. Frohlich, C., 1992. Triangle diagrams: ternary graphs to display similarity and diversity of earthquake focal mechanisms, Phys. Earth planet. Inter., 75, 193–198.
15. Frohlich, C., 2001. Display and quantitative assessment of distributions of earthquake focal mechanisms, Geophys. J. Int., 144, 300–308.
16. Frohlich, C. & Davis, S.D., 1993. Teleseismic b values; or, much ado about 1.0, J. geophys. Res., 98, 631–644.
17. Godano, C. & Pingue, F., 2000. Is the seismic moment–frequency relation universal?, Geophys. J. Int., 142, 193–198.
18. Gutenberg, B. & Richter, C.F., 1941. Seismicity of the Earth, Geol. Soc. Am. Bull., Special Papers, 34, pp. 1–131.
19. Gutenberg, B. & Richter, C.F., 1944. Frequency of earthquakes in California, Bull. seism. Soc. Am., 34, 185–188.
20. Harte, D. & Vere-Jones, D., 1999. Differences in coverage between the PDE and New Zealand local earthquake catalogues, New Zealand J. Geol. Geophys., 42, 237–253.
21. Helffrich, G.R., 1997. How good are routinely determined focal mechanisms? Empirical statistics based on a comparison of Harvard, USGS and ERI moment tensors, Geophys. J. Int., 131, 741–750.
22. Huang, W.-C., Okal, E.A., Ekström, G. & Salganik, M.P., 1997. Centroid moment tensor solutions for deep earthquakes predating the digital era: the World-Wide Standardized Seismograph Network dataset (1962–1976), Phys. Earth planet. Inter., 99, 121–129.
23. Jenkins, G.M. & Watts, D.G., 1968. Spectral Analysis and its Applications, pp. 525, Holden-Day, Oakland, CA.
24. Kagan, Y.Y., 1991. Seismic moment distribution, Geophys. J. Int., 106, 123–134.
25. Kagan, Y.Y., 1993. Statistics of characteristic earthquakes, Bull. seism. Soc. Am., 83, 7–24.
26. Kagan, Y.Y., 1994. Observational evidence for earthquakes as a nonlinear dynamic process, Physica D, 77, 160–192.
27. Kagan, Y.Y., 1997. Seismic moment–frequency relation for shallow earthquakes: regional comparison, J. geophys. Res., 102, 2835–2852.
28. Kagan, Y.Y., 1999. Universality of the seismic moment–frequency relation, Pure appl. Geophys., 155, 537–573.
29. Kagan, Y.Y., 2000. Temporal correlations of earthquake focal mechanisms, Geophys. J. Int., 143, 881–897; see correction at http://moho.ess.ucla.edu/~kagan/d2ct_corr.pdf
30. Kagan, Y.Y., 2002. Seismic moment distribution revisited: II. Moment conservation principle, Geophys. J. Int., accepted; http://scec.ess.ucla.edu/~ykagan/momc_index.html
31. Kagan, Y.Y. & Jackson, D.D., 1991. Long-term earthquake clustering, Geophys. J. Int., 104, 117–133.
32. Kagan, Y.Y. & Jackson, D.D., 2000. Probabilistic forecasting of earthquakes, Geophys. J. Int., 143, 438–453.
33. Kagan, Y.Y. & Knopoff, L., 1980. Spatial distribution of earthquakes: the two-point correlation function, Geophys. J. R. astr. Soc., 62, 303–320.
34. Kagan, Y.Y. & Schoenberg, F., 2001. Estimation of the upper cutoff parameter for the tapered Pareto distribution, J. appl. Probab., 38A, 168–185.
35. Kanamori, H., 1977. The energy release in great earthquakes, J. geophys. Res., 82, 2981–2987.
36. Kijko, A. & Graham, G., 1998. Parametric-historic procedure for probabilistic seismic hazard analysis—Part I: estimation of maximum regional magnitude mmax, Pure appl. Geophys., 152, 413–442.
37. Knopoff, L. & Kagan, Y.Y., 1977. Analysis of the theory of extremes as applied to earthquake problems, J. geophys. Res., 82, 5647–5657.
38. Lei, X., Kusunose, K., Rao, M.V.M.S., Nishizawa, O. & Sato, T., 2000. Quasi-static fault growth and cracking in homogeneous brittle rock under triaxial compression using acoustic emission monitoring, J. geophys. Res., 105, 6127–6139.
39. Leonard, T., Papsouliotis, O. & Main, I.G., 2001. A Poisson model for identifying characteristic size effects in frequency data: application to frequency-size distributions for global earthquakes, ‘starquakes’ and fault lengths, J. geophys. Res., 106, 13 473–13 484.
40. Main, I., 2000. Apparent breaks in scaling in the earthquake cumulative frequency–magnitude distribution: fact or artifact?, Bull. seism. Soc. Am., 90, 86–97.
41. Main, I.G., O'Brien, G. & Henderson, J.R., 2000. Statistical physics of earthquakes: comparison of distribution exponents for source area and potential energy and the dynamic emergence of log-periodic energy quanta, J. geophys. Res., 105, 6105–6126.
42. Molchan, G.M. & Podgaetskaya, V.M., 1973. Parameters of global seismicity, I, in Computational Seismology, Vol. 6, pp. 44–66, ed. Keilis-Borok, V.I., Nauka, Moscow (in Russian).
43. Ogata, Y. & Katsura, K., 1993. Analysis of temporal and spatial heterogeneity of magnitude frequency distribution inferred from earthquake catalogues, Geophys. J. Int., 113, 727–738.
44. Omeltchenko, A., Jin, Yu., Kalia, R.K. & Vashishta, P., 1997. Crack front propagation and fracture in a graphite sheet: a molecular-dynamics study on parallel computers, Phys. Rev. Lett., 78, 2148–2151.
45. Pacheco, J.F., Scholz, C.H. & Sykes, L.R., 1992. Changes in frequency-size relationship from small to large earthquakes, Nature, 355, 71–73.
46. Pacheco, J.F. & Sykes, L.R., 1992. Seismic moment catalog of large, shallow earthquakes, 1900–1989, Bull. seism. Soc. Am., 82, 1306–1349.
47. Page, R., 1968. Aftershocks and microaftershocks of the great Alaska earthquake of 1964, Bull. seism. Soc. Am., 58, 1131–1168.
48. Pareto, V., 1897. Cours d'Économie Politique, Tome Second, F. Rouge, Lausanne; quoted from Pareto, V., 1964. Œuvres Complètes, Vol. II, publ. de Giovanni Busino, Droz, Genève.
49. Peresan, A., Panza, G.F. & Costa, G., 2000. CN algorithm and long-lasting changes in reported magnitudes: the case of Italy, Geophys. J. Int., 141, 425–437.
50. Pérez, O.J. & Scholz, C.H., 1997. Long-term seismic behavior of the focal and adjacent regions of great earthquakes during the time between two successive shocks, J. geophys. Res., 102, 8203–8216.
51. Pisarenko, V.F., 1991. Statistical evaluation of maximum possible earthquakes, Phys. Solid Earth, 27(3), 757–763.
52. Pisarenko, V.F., Lyubushin, A.A., Lysenko, V.B. & Golubeva, T.V., 1996. Statistical estimation of seismic hazard parameters—maximum possible magnitude and related parameters, Bull. seism. Soc. Am., 86, 691–700.
53. Rhoades, D.A., 1996. Estimation of the Gutenberg–Richter relation allowing for individual earthquake magnitude uncertainties, Tectonophysics, 258, 71–83.
54. Rhoades, D.A. & Dowrick, D.J., 2000. Effects of magnitude uncertainties on seismic hazard estimates, in Proc. 12th World Conf. on Earthquake Engineering, Auckland, New Zealand, 30 January–4 February 2000, Paper No. 1179, New Zealand Society for Earthquake Engineering, Upper Hutt, New Zealand; http://nisee.berkeley.edu/cgi-bin/texhtml?form=eea.all controln=1202 746
55. Sipkin, S.A., 1994. Rapid determination of global moment-tensor solutions, Geophys. Res. Lett., 21, 1667–1670.
56. Sipkin, S.A., Bufe, C.G. & Zirbes, M.D., 2000. Moment-tensor solutions estimated using optimal filter theory: global seismicity 1999, Phys. Earth planet. Inter., 122, 147–159.
57. Sornette, D., Knopoff, L., Kagan, Y.Y. & Vanneste, C., 1996. Rank-ordering statistics of extreme events: application to the distribution of large earthquakes, J. geophys. Res., 101, 13 883–13 893.
58. Sornette, D. & Sornette, A., 1999. General theory of the modified Gutenberg–Richter law for large seismic moments, Bull. seism. Soc. Am., 89, 1121–1130.
59. Stauffer, D. & Aharony, A., 1994. Introduction to Percolation Theory, 2nd rev. edn, Taylor & Francis, London.
60. Tinti, S. & Mulargia, F., 1985. Effects of magnitude uncertainties on estimating the parameters in the Gutenberg–Richter frequency–magnitude law, Bull. seism. Soc. Am., 75, 1681–1697.
61. Triep, E.G. & Sykes, L.R., 1997. Frequency of occurrence of moderate to great earthquakes in intracontinental regions: implications for changes in stress, earthquake prediction, and hazards assessments, J. geophys. Res., 102, 9923–9948.
62. Uhrhammer, R.A., Loper, S.J. & Romanowicz, B., 1996. Determination of local magnitude using BDSN broadband records, Bull. seism. Soc. Am., 86, 1314–1330.
63. Utsu, T., 1999. Representation and analysis of the earthquake size distribution: a historical review and some new approaches, Pure appl. Geophys., 155, 509–535.
64. Vere-Jones, D., Robinson, R. & Yang, W.Z., 2001. Remarks on the accelerated moment release model: problems of model formulation, simulation and estimation, Geophys. J. Int., 144, 517–531.
65. Wyss, M., 1973. Towards a physical understanding of the earthquake frequency distribution, Geophys. J. R. astr. Soc., 31, 341–359.
66. Young, J.B., Presgrave, B.W., Aichele, H., Wiens, D.A. & Flinn, E.A., 1996. The Flinn–Engdahl regionalisation scheme: the 1995 revision, Phys. Earth planet. Inter., 96, 223–297.