Nocturnal awakenings due to aircraft noise. Do wake-up reactions begin at sound level 60 dB(A)?

Nighttime wake-up thresholds at noise levels of 60 dB(A) are frequently employed in Germany to establish "noise polluted areas". The criterion is, however, based on an incorrect processing of statistical data gathered from an evaluation of literature performed by Griefahn et al. (1976). This finding has emerged from an extensive revision of the study. Using appropriate statistical methods, maximum levels of under 48 dB(A) are assessed as waking-up thresholds at ear level in sleeping persons, in contrast to the maximum level of 60 dB(A) calculated by Griefahn et al. in 1976. The linear dose-response relationship, which could be derived from the early publications in the course of the revision, agrees with the results of more recent literature evaluations. The present contribution is not intended to give rise to the question whether in the interest of medical prevention it is reasonable to develop nighttime protective policies merely founded on noise levels marking the "statistical" onset of nocturnal wake-up reactions. In this context, emphasis is laid on the deformation of the biological rhythm of sleep.
Nocturnal aircraft noise and taking preventive action

Nocturnal aircraft noise induces activations of the vegetative nervous system, as is confirmed by several independent investigations of the last 35 years. Noise-induced activations may interfere with normal sleep stages and cause awakenings. Undisturbed sleep brings about health, well-being, efficiency, and optimism. Long-term sleep disturbances, however, may induce adverse health effects of various kinds and intensities. The necessity of taking preventive action is generally acknowledged in Germany (e.g. (SVRU 1999)). But there is still disagreement regarding guideline immission values below which adverse sleep disturbances are most probably avoided. Notwithstanding numerous investigations, the question of adequate health-related immission regulations to avoid unrestful sleep can still not be answered conclusively. This may have serious consequences with regard to establishing "noise polluted areas", as has become evident from calculations instigated by the mediation procedures in connection with the Frankfurt/Main Airport area extensions (MVFrankfurt/Main 1999). In [Figure 1], the perimeters of several nighttime noise exposed areas around Frankfurt/Main Airport are outlined according to calculations by the Hessian Office for the Environment (Hessische Landesanstalt für Umwelt), based on the critical thresholds determined by Jansen, Griefahn, and Maschke respectively. There is no substantial difference between the contours according to (2) Griefahn and (3) Maschke. Outlined by contour (1), however, the nighttime noise polluted area according to Jansen appears several times smaller. The reason for this remarkable discrepancy is Jansen's demand ((Jansen et al. 1995) pp. 92 and 105, (Jansen 2000a) p. 106) that noise protection measures ought to be founded on the "proven" results of German noise effect research of the earlier years.
The main outcome of German research in earlier years is the nighttime noise protection criterion 6 x 60 dB(A), based on investigations by Jansen (e.g. (Jansen 1970)) and Griefahn (Griefahn et al. 1976). This criterion implies that nocturnal noise exposures are a risk to health if six or more flight events per night with maximum levels of 60 dB(A) or above occur in a person's bedroom. According to the criterion, aircraft noise exposures below a maximum noise level of 60 dB(A) in the bedroom are regarded as tolerable and not adverse to health. As Jansen points out, noise-induced awakenings begin at a mean group value of 60 dB(A) at ear level of a sleeping person: "The wake-up reaction indicates a disturbance of the normal balance regulation. Consequently, the dose-response relationship between awakenings and sound levels constitutes a criterion of health hazards and serves as a reliable indicator (...). The zero point of this dose-response relationship is (according to Griefahn et al. 1976) established at a maximum sound level value of 60 dB(A) (...). With a standard deviation of ±7, individual awakenings at sound level values between 53 and 67 dB(A) have to be accounted for" ((Jansen 2000b), p. 47/48, underlinings by Maschke et al.). Jansen (e.g. in (Jansen et al. 1995)) defines the awakening threshold at 60 dB(A) as a "statistical" threshold, as opposed to individual wake-up thresholds. This was supported by an extensive evaluation of investigations in the field of "sleep and noise" carried out by Griefahn, Jansen and Klosterkötter in 1976 (Griefahn et al. 1976). In the framework of the project approval proceedings (Planfeststellungsverfahren) for Berlin/Brandenburg International Airport, this study has been submitted to a critical review (Maschke et al. 2000). The results are summarized in the present contribution. To improve readability, quotations are printed in italics. Formatting used in the originals was disregarded.
Insertions and omissions within the quotations are marked by (square brackets). Furthermore, it was unavoidable to develop statistical proofs, which are available from the authors for the critical reader.

Maximum noise levels and awakening frequencies according to Griefahn et al. 1976

In 1976, Griefahn et al. published a linear dose-response relationship between nighttime maximum noise levels and awakening frequencies (Griefahn et al. 1976), given by the straight line in [Figure 2]. According to this dose-response relationship, the onset threshold of noise-induced awakenings is determined at a maximum level of 60 dB(A). This dose-response relationship ensued from a literature evaluation of 10 out of 19 publications dated before 1976. A number of these publications contained substudies, each dealing with noise-induced awakenings and varying factors like gender and age of test persons or the number of noise events. The strong variability in awakening frequencies in the 24 substudies, which in many cases included repeated measurements (see data points in [Figure 3]), was fitted by a straight line. The research results of Steinicke and Gädeke et al. (Steinicke 1957; Gädeke 1969) were disregarded by both Griefahn et al. and the present revision study because of reservations regarding their design. Both investigations were carried out with longer lasting noises. Steinicke (1957) exposed his probands in the early morning hours to a steadily increasing noise from a woodcutting machine until they signaled awakening. Gädeke et al. (1969) exposed infants aged 3 to 63 months to noises comparable to factory noises. One cannot, with certainty, infer the baseline data set from the straight line depicted in [Figure 2]. It is good scientific practice to give the baseline data set along with the trend, in this case the straight line. In [Figure 3], the studies which Griefahn et al. referred to in their literature evaluation ((Griefahn et al.
1976), attachment 4) are presented as a data point cloud. Children up to the age of 18 and elderly persons aged 69 and over are depicted separately. The distribution of the data values over the graph does indeed suggest that children and elderly persons should not be evaluated together with adults, but ought to be considered separately as was proposed, among others, by Hecht and Maschke (e.g. (Hecht et al. 1999)). Yet considerable differences in awakening frequencies at identical maximum noise levels were equally found among the adult groups of the substudies. Griefahn et al. further reduced the altogether heterogeneous data set in [Figure 3] on formal grounds: all investigations that did not contain repeated measurements were excluded. Also, "the very old of the study published by Lukas et al. (…) in 1969 were not included in further calculations", since the results were based on considerably changing numbers of overflights. "Also, the awakening frequency in children taken from the investigations of Lukas et al. (1969, 1971) were not further included because they amounted to Zero within the (whole) tested intensity range" ((Griefahn et al. 1976), p. 68). The awakening frequencies of the remaining substudies are shown in [Figure 4], together with the straight line according to Griefahn et al. For the well-trained observer, a discrepancy between the data points and the straight line according to Griefahn is discernible in this graph. The discrepancy becomes evident for the reader who visually averages the awakening frequencies within smaller maximum level ranges around 60, 70, 80 and 100 dB(A), and then draws an imaginary straight line through the mean values. Such an estimated best fit line (dotted line in [Figure 4]) runs much lower than the straight line according to Griefahn et al., and does not cut the x-axis at 60 dB(A).
As regards the discrepancy between the data point cloud and the straight line according to Griefahn, it must be taken into account that each data point represents a mean value of different subordinate data sets (e.g. identical groups of individuals tested at different maximum sound levels). Such dependencies in the data sets as well as differing numbers of probands in each test series may be responsible for the observed discrepancy.

Assessment of the best fit line according to Griefahn

The problem of jointly calculating dependent and independent mean values in one regression analysis was addressed by Griefahn et al. (1976) by restricting the analysis to the larger number of dependent data sets and calculating separate regression lines for each of the dependent data series. In a second step, each individual regression line was weighted by the number of persons and then all regression lines were averaged. As can be seen in [Figure 5], the resulting individual regression lines are found in quite different level ranges. In order to adjust the lines, Griefahn et al. extrapolated each individual regression line across the level range 20 to 120 dB(A), and then averaged the values of the regression lines within this level range. For averaging the regression lines, the dependent data series were weighted according to the differing numbers of test persons. In this context, Griefahn et al. state as follows: "Based on each single trend, the expected relative reaction frequency was then calculated for the intensity range 20 dB(A) to 120 dB(A). Weighting all individual values by the number of probands, each according to their respective test series, a mean intensity value was calculated from the relevant data. The trend of wakeup frequencies derived from these mean values is given (as fit line)" ((Griefahn et al. 1976), pp. 68-69).
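The averaging procedure quoted above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the original computation: the function names `fit_line` and `averaged_line` and the two test series are hypothetical, and the numbers are invented for the example and are not taken from Griefahn et al. (1976).

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def averaged_line(series, lo=20, hi=120):
    """Average person-weighted regression lines over lo..hi dB(A).

    series: list of (levels, awakening_percent, n_persons) tuples, one per
    test series with repeated measurements (at least two distinct levels).
    """
    grid = list(range(lo, hi + 1))
    total_persons = sum(n for _, _, n in series)
    mean_y = [0.0] * len(grid)
    for levels, freqs, n in series:
        slope, intercept = fit_line(levels, freqs)
        for i, x in enumerate(grid):
            # Extrapolate this series' own line to every grid level and
            # add it with a weight proportional to its number of persons.
            mean_y[i] += n / total_persons * (slope * x + intercept)
    # The averaged values lie on one straight line; recover its equation.
    return fit_line(grid, mean_y)


# Hypothetical test series, NOT data from Griefahn et al. (1976):
# (maximum levels in dB(A), awakening frequency in %, number of persons)
SERIES = [
    ([50, 60, 70], [5, 10, 15], 10),
    ([80, 90, 100], [20, 40, 60], 30),
]

slope, intercept = averaged_line(SERIES)
print(f"averaged line: slope={slope:.3f}, intercept={intercept:.3f}")
```

In this sketch the averaged line depends only on each series' slope, intercept and number of persons. The level range in which a series was actually measured does not enter the result, because every line is extrapolated over the same 20 to 120 dB(A) grid.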
When the regression lines in [Figure 5] were weighted by the number of probands and then averaged, the revision arrived at the straight line y = 1.315x - 78.827 (see [Figure 6]). This agrees with the equation y = 1.30x - 78.41 as calculated by Griefahn et al. in 1976. The small differences may be ascribed to rounded data which in the process of the revision were taken from attachment 4 of (Griefahn et al. 1976), and also to slightly differing numbers of probands. Whereas Griefahn et al. stated a total of 94 test persons ((Griefahn et al. 1976); [Figure 9]), the revision study comprised 98 test persons.

The reliability of the best fit line according to Griefahn

A best fit line is supposed to represent the total of all single data points. Intuitively, one might think that an appropriate best fit line ought to be drawn in such a way that the sum of prediction errors (deviations from each single data point) is kept as small as possible. Proceeding that way, however, values with greater distances from the main tendency exert only little influence on the best fit line. In this case, the best fit line cannot be representative of all data points. Therefore, instead of the sum of deviations, the sum of squared errors has to be minimized. A straight line which meets the least-squares criterion is obtained by a linear regression analysis yielding a regression line. It is internationally agreed that linear dose-response relationships are obtained by linear regression, provided that the independent variable can be considered (nearly) free of error. This requirement is met by sound level measurements. The best fit line calculated by Griefahn et al. (1976) does not meet the requirement of least-squares error minimization and therefore cannot be interpreted as a usual dose-response relationship, as was repeatedly published by the authors.
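For reference, the least-squares criterion named above can be written out as follows. The symbols are notation introduced here, not the article's: x_i is the maximum level of data point i, y_i its awakening frequency, a the intercept and b the slope. The last two lines are simple arithmetic on the two equations quoted in the text, showing where each straight line meets the x-axis.

```latex
% Least-squares criterion: choose a and b so that the sum of squared
% residuals is minimal.
S(a, b) = \sum_{i=1}^{n} \left( y_i - a - b\,x_i \right)^2 \;\to\; \min

% Closed-form solution of the minimization:
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}

% Zero crossing (0 % awakening frequency) of a line y = a + b x:
x_0 = -\frac{a}{b}

% Griefahn et al. (1976): y = 1.30x - 78.41
x_0 = \frac{78.41}{1.30} \approx 60.3\ \mathrm{dB(A)}

% Averaged line of the revision: y = 1.315x - 78.827
x_0 = \frac{78.827}{1.315} \approx 59.9\ \mathrm{dB(A)}
```

Both averaged lines therefore cross the x-axis at about 60 dB(A), which is where the 60 dB(A) criterion comes from.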
The method of averaging individual regression lines does in principle result in a straight line. But a regression line claiming to be representative of an entire data set can only be derived under the following conditions (statistical proof is available from the authors):

1. The number of data points of all data sets must be in agreement ($n_1 = n_2 = \dots = n_i$), otherwise a weighted averaging has to be carried out.
2. The variance of the tested maximum levels (of the independent values) must be in agreement ($\mathrm{var}(x_1) = \mathrm{var}(x_2) = \dots = \mathrm{var}(x_i)$).
3. The mean values of the tested maximum sound levels must be in agreement ($\bar{x}_1 = \bar{x}_2 = \dots = \bar{x}_i$).

As is shown in [Table 1], requirements 2 and 3 are not met by the given data set. The variance of the maximum levels (independent data series) fluctuates between 16 and 100 dB(A)². The mean values of the maximum levels fluctuate between 55 and 110 dB(A). In the case of the given data set, containing extremely varying mean values and variances, the method of averaging the individual regression lines employed by the authors does lead to a straight line, but not to a regression line representing the complete data set. In all probability, such severe violation of the above prerequisites has led to a straight line which considerably diverges from the regression line based on the entire data set.

Calculation of the regression line based on the complete data set

The regression line based on the entire data set (each data point representing a mean value; the individual parameters being unknown) can be calculated by customary linear regression, provided it is possible to convert the dependent mean values into virtually independent data points. Such a procedure, as was indirectly also adopted by Griefahn et al. (1976), consists in distributing the number of probands, who were repeatedly measured within one test series, equally over the number of repeated measurements.
If, for example, the same 12 persons were tested at three sound level intensities, then each averaged data point is weighted by 12/3 = 4 (statistical background available from the authors). This procedure ensures that a dependent data series is only weighted by the number of test persons and not by the total number of measurements. The procedure corresponds to the weighting by persons which was used by Griefahn et al. in 1976 for averaging the individual regression lines. The commonly used regression over all "person-weighted" data points leads to results identical with the averaged individual regression lines, if the individual regression lines fulfill all statistical requirements to allow averaging. The splitting of test persons within the repeated measurements is, however, not tied to identical variances or mean values of maximum levels, as was the case when averaging the regression lines. If a linear regression is calculated from data points "weighted by persons", a significant regression line ensues, as is demonstrated in [Figure 7]. The number of substudies and the number of test persons are greater than those according to Griefahn et al. (1976). At this point it was possible to include in the regression those substudies from attachment 4 which did not contain repeated measurements and which therefore did not allow calculation of individual regression lines. These substudies had been excluded from the analyses by Griefahn et al. in 1976. For comparison, the regression line based on the same data set, but ignoring the dependency of the substudies, has been added in [Figure 7].

The onset threshold of noise-induced awakenings

The straight line developed and published by Griefahn, Jansen and Klosterkötter in 1976 (Griefahn et al. 1976) diverges considerably from the regression line based on the same data set [Figure 8] and can therefore not be employed for limit value regulations.
The difference becomes all the more evident when examining the onset thresholds of noise-induced awakenings. According to the straight line of Griefahn et al. (1976), noise-induced awakenings begin "statistically" at a maximum noise level of 60 dB(A). Taking into account the regression line calculated from the data points "weighted by persons", a wake-up frequency of already more than 10% is reached at the maximum noise level 60 dB(A). A "statistical" noise-induced awakening threshold (as is frequently used in Germany) cannot directly be derived from the regression line based on "person-weighted" data points, because the point of intersection with the x-axis (0% wake-up frequency) can only be determined by extrapolation of the regression line. An extrapolation, however, is statistically unreliable. The one question which can statistically be answered is whether there exists a level range in which, for the given data set, a "statistical" awakening threshold of sufficient reliability may be expected. The answer is given by the confidence interval of the regression line. [Figure 9] shows the regression line across the data points "weighted by persons", displaying a 95% confidence interval. The 95% confidence interval in this scatter plot demonstrates that the "statistical" onset threshold of noise-induced awakenings is to be expected in the level range 0 to 48 dB(A), according to the given data set. The maximum noise level marking the "statistical" onset of noise-induced awakenings according to the given data set is thus clearly to be assessed (with a reliability of 95%) at about 48 dB(A), i.e. the intersection point of the lower confidence limit and the x-axis (0% wake-up frequency), but not at 60 dB(A). As the revision has shown, the statement that noise-induced awakenings statistically begin at a maximum noise level of 60 dB(A) cannot be verified by the literature evaluation by Griefahn et al. (1976).
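The two steps described above, regression over "person-weighted" data points and reading the onset threshold from the lower confidence limit, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' computation: the source does not say how its confidence interval was calculated, so the sketch assumes the textbook band for the mean response of a weighted least-squares fit (variance of a point inversely proportional to its weight). All function names and all data values are hypothetical, and the critical t value is supplied by the caller from a t-table.

```python
import math


def person_weighted_points(series):
    """Turn test series into (level, awakening_percent, weight) points.

    A series with n persons measured at k levels gives each of its points
    the weight n / k, e.g. 12 persons at 3 levels -> weight 4 per point.
    """
    points = []
    for levels, freqs, n_persons in series:
        weight = n_persons / len(levels)
        points.extend((x, y, weight) for x, y in zip(levels, freqs))
    return points


def weighted_fit(points):
    """Weighted least-squares line through (x, y, w) points."""
    w_sum = sum(w for _, _, w in points)
    x_mean = sum(w * x for x, _, w in points) / w_sum
    y_mean = sum(w * y for _, y, w in points) / w_sum
    sxx = sum(w * (x - x_mean) ** 2 for x, _, w in points)
    sxy = sum(w * (x - x_mean) * (y - y_mean) for x, y, w in points)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual variance per unit weight, n - 2 degrees of freedom.
    rss = sum(w * (y - intercept - slope * x) ** 2 for x, y, w in points)
    return {"slope": slope, "intercept": intercept, "w_sum": w_sum,
            "x_mean": x_mean, "sxx": sxx, "s2": rss / (len(points) - 2)}


def confidence_band(x0, fit, t_crit):
    """(lower, upper) confidence limits of the fitted line at level x0."""
    spread = 1 / fit["w_sum"] + (x0 - fit["x_mean"]) ** 2 / fit["sxx"]
    half_width = t_crit * math.sqrt(fit["s2"] * spread)
    y0 = fit["intercept"] + fit["slope"] * x0
    return y0 - half_width, y0 + half_width


def lower_band_zero(fit, t_crit, lo=0.0, hi=120.0):
    """Level at which the lower confidence limit crosses 0 % (bisection)."""
    if confidence_band(lo, fit, t_crit)[0] > 0 or confidence_band(hi, fit, t_crit)[0] < 0:
        raise ValueError("lower limit does not cross 0 % within [lo, hi]")
    for _ in range(100):
        mid = (lo + hi) / 2
        if confidence_band(mid, fit, t_crit)[0] < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical test series, NOT data from Griefahn et al. (1976):
# (maximum levels in dB(A), awakening frequency in %, number of persons)
SERIES = [
    ([45, 55, 65], [4, 9, 14], 12),   # repeated measurements, weight 4 each
    ([60, 70, 80], [12, 14, 20], 9),  # repeated measurements, weight 3 each
    ([75], [15], 6),                  # single measurement, weight 6
]
T_CRIT = 2.571  # two-sided 95 % t value for 7 points - 2 = 5 degrees of freedom

fit = weighted_fit(person_weighted_points(SERIES))
print("x-axis crossing of the line:       ", -fit["intercept"] / fit["slope"])
print("x-axis crossing of the lower limit:", lower_band_zero(fit, T_CRIT))
```

Because the lower confidence limit lies below the regression line, it reaches 0% at a higher level than the line itself. That crossing is the upper end of the level range in which the "statistical" threshold can be expected. Unlike the averaging of individual lines, a series without repeated measurements (the third one here) can be included.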
Evaluations of recent literature

Large-scale evaluations of literature concerned with noise-induced sleep disturbances were published in recent years by Hofman (Hofman 1994), Maschke (Maschke 1997) and the Health Council of the Netherlands (Gezondheidsraad 1999), among others. Hofman has only taken into consideration studies which refer to realistic environmental noises (flight noises, road traffic noises). From her study, Hofman concludes that under realistic noise exposure conditions, aircraft noise of moderate maximum levels results in a greater probability of awakenings than was stated by Griefahn et al. in 1976. In particular, road traffic noise and flight noise ought to be investigated separately. The results are given in [Figure 10]. For comparison, Griefahn's straight line is added as a dotted line. Regarding aircraft noise, the study of Hofman does not yield a reliable dose-response relationship with noise-induced waking-up reactions. The reliability of the regression line calculated by Hofman must by itself be rated as insignificant (R² = 0.074). On the other hand, the regression line calculated from real flight noises (y = 0.4362x - 9.1415) agrees in tendency with the revised regression line based on the data set of Griefahn et al. 1976 (y = 0.39x - 12.869). Both regression lines as well as the fitting line according to Griefahn are depicted in [Figure 11]. According to Griefahn et al. (1976), the statistical value of wake-up frequencies is 0% at the maximum sound level 60 dB(A). The revision of this literature evaluation yielded a statistical value of awakening frequencies of 10%, and the study of Hofman (1994) states a statistical value of about 17% in awakening frequencies. Commissioned by the German Federal Environmental Agency, Maschke et al. (1997) critically reviewed 28 experimental primary studies published since 1980. They equally conclude that quasi-continuous noises and intermittent (flight) noises must be separated.
For intermittent road traffic noises they found an effect threshold of stimulated awakening reactions of Lmax = 45 dB(A). Under the term stimulated awakening reactions, Maschke et al. (1997) registered all reactions observed within a short time window following a noise event. The time windows varied from 30 to 90 seconds in various experiments. The studies also demonstrate that memorized awakening reactions (often documented in field studies) were only assessed at levels higher than "physiologically" detectable wake-up reactions, as for instance by electroencephalogram. In the publication "Public health impact of large airports" of the Netherlands Health Council (Gezondheidsraad 1999), based on data from an evaluation of literature, a sound exposure level (SEL) of 50 dB(A) at the ear of a sleeping person is determined as the onset point of awakenings. This value corresponds to a maximum noise level of Lmax ≈ 43 dB(A) for a 10 dB down-time of 10 seconds.

Discussion

Nocturnal flight noise induces unwanted activations in sleeping persons. An activation may end in an awakening, and a higher activation is of course indicated by increased awakening reactions (e.g. (SVRU 1999; Jansen et al. 1995; Maschke 1997; Griefahn 1990)). But also below the awakening threshold, considerable interferences with sleep stages including vegetative reactions (arousals) may occur (e.g. (Raschke et al. 1997)). Arousals are a necessary function to counteract hazardous stimuli or events through an activation of compensatory mechanisms, in other words to maintain the physiological balance. Frequent arousals during sleep, caused by internal or external receptor stimuli, may lead to deformations of the biological rhythm. A deformation becomes evident in fragmented sleep processes and in changed hormone responses (e.g. cortisol nadir). A fragmented sleep process with increasing nocturnal arousals may induce a rise in the tone of the sympathetic nervous system.
The result is a deterioration of sleep quality, which may lead to reduced daytime performance, sleepiness and tiredness. Long-term interferences with the circadian rhythm have negative effects on health (Hanley et al. 1989; Zuberi-Khokar et al. 1996; Biberdorf et al. 1993; Bonnet et al. 1995; Mercia et al. 1991). Nighttime protective policies should therefore not be confined to awakening reactions alone, but ought to take all aspects of noise-induced sleep disturbances into consideration, as frequent interferences with physiologically programmed sleep structures may cause adverse health effects (Raschke et al. 1997). The presented revision of the study of Griefahn et al. (1976) shows, in agreement with evaluations of recent literature, that increasing awakening reactions induced by (intermittent) flight noise are statistically to be expected at maximum noise levels of 48 dB(A) at the latest, in contrast to the threshold of 60 dB(A) calculated by Griefahn et al. in 1976. A comprehensive protection of individuals against nocturnal awakenings is, however, not even given at a limit value of a maximum level of 48 dB(A), because individual sensitivities towards noise immissions vary considerably. The statement of Jansen (Jansen 2000b) that, with regard to the studies of Griefahn et al. (1976, 1990) and the standard deviation of ±7 dB(A), individual noise-induced awakenings have to be expected between 53 and 67 dB(A) cannot be verified by the revision of the study of Griefahn et al. (1976), because the individual parameters are unknown in the literature evaluation by Griefahn et al. Decision-making towards comprehensive protective policies, based on the developments in present noise effect research, is overdue. There is an urgent need (for noise effect research) to develop future-orientated protective concepts (e.g. (SVRU 1999), Tz 507). Such protective concepts should in the future not be focussed on statistical awakening thresholds alone.
The use of the whole dose-response curve between maximum sound levels and awakening probability can be seen as a promising approach.[35]

References


