Year: 2004 | Volume: 7 | Issue: 25 | Page: 11-21
Effects of reverberation time on the cognitive load in speech communication: Theoretical considerations
University of Gävle, Centre for Built Environment, S-801 76 Gävle, Sweden
The paper presents a theoretical analysis of possible effects of reverberation time on the cognitive load in speech communication. Speech comprehension requires not only phonological processing of the spoken words; simultaneously, this information must be further processed and stored. All this processing takes place in the working memory, which has a limited processing capacity. The more resources that are allocated to word identification, the fewer resources are left for the further processing and storing of the information. Reverberation conditions that allow the identification of almost all words may therefore still interfere with speech comprehension and memory storage. These problems are likely to be especially serious in situations where speech has to be followed continuously for a long time. An unfavorable reverberation time (RT) could then contribute to the development of cognitive fatigue, which means that working memory resources are gradually depleted. RT may also affect the cognitive load in two other ways: by changing the distracting effects of a sound and by changing a person's mood. Both effects could influence the cognitive load of a listener. It is argued that studies of RT effects in realistic, long-lasting listening situations are needed to better understand the effect of RT on speech communication. Furthermore, the effects of RT on distraction and mood need to be better understood.
|How to cite this article:|
Kjellberg A. Effects of reverberation time on the cognitive load in speech communication: Theoretical considerations. Noise Health 2004;7:11-21
|How to cite this URL:|
Kjellberg A. Effects of reverberation time on the cognitive load in speech communication: Theoretical considerations. Noise Health [serial online] 2004 [cited 2021 Dec 6];7:11-21
Available from: https://www.noiseandhealth.org/text.asp?2004/7/25/11/31651
Reverberation time (RT) is the acoustical characteristic, in addition to noise level, that is most often stressed in acoustic guidelines for classrooms and other rooms where speech communication is of critical importance for successful performance. RT is generally defined as the time taken for the sound level to decay by 60 dB after the sound source is switched off. The decay time is frequency dependent, since it depends on the absorption characteristics of the surfaces in the room, but often a single value is given, representing the mean or maximum RT in the 0.5-2 kHz or 0.25-4 kHz octave bands. Unfavorable reverberation conditions are assumed to impair speech comprehension, which in some contexts may impair performance and learning.
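The 60 dB decay definition above can be made concrete with a short sketch. In measurement practice, the decay curve is commonly obtained from a room impulse response by Schroeder backward integration, and RT60 is extrapolated from a straight-line fit to part of the decay. This is a minimal illustration, not a method from the paper; the sampling rate, the fit range (-5 to -35 dB, a "T30"-style estimate) and all names are assumptions.

```python
import numpy as np

def rt60_from_impulse_response(ir, fs):
    """Estimate reverberation time (RT60) from a room impulse response
    via Schroeder backward integration, fitting a line to the -5 to
    -35 dB portion of the decay curve and extrapolating to -60 dB."""
    energy = np.asarray(ir, dtype=float) ** 2
    # Schroeder integral: sound energy remaining after each instant, in dB
    edc = np.cumsum(energy[::-1])[::-1]
    edc_db = 10.0 * np.log10(edc / edc[0])
    t = np.arange(len(energy)) / fs
    # Fit the -5..-35 dB range of the decay and extrapolate to -60 dB
    mask = (edc_db <= -5.0) & (edc_db >= -35.0)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # dB per second
    return -60.0 / slope
```

For a synthetic exponential decay with a known RT60 of 0.5 s, the estimate recovers the nominal value closely, since the Schroeder curve of a pure exponential is an exact straight line in dB.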
Much research has been devoted to the effects of RT on speech intelligibility, whereas rather little research has treated the cognitive consequences of this effect. In the present paper the effects on speech perception are first briefly summarized. The second part of the paper discusses the cognitive processes involved in speech perception, how reverberation may increase these cognitive processing demands and how these effects could be assessed. An expected consequence of these raised demands is an accelerated growth of cognitive fatigue that entails a decreasing ability for sustained attention to and processing of the speech. This is discussed in the third section of the paper. The next section deals with possible effects of RT on distraction, that is, the automatic, stimulus-driven attentional responses. A final section deals with psychological effects of reverberation time that are not only mediated by its effect on speech perception but are also related to sound quality.
Reverberation and speech intelligibility
Speech intelligibility in a room is determined by the signal-to-noise (S/N) ratio and the reverberation time (Bradley, 1986). Provided that the S/N ratio is constant, shorter RTs mean improved clarity of the acoustic signal and thereby better intelligibility, as measured with, e.g., the percentage of correctly identified monosyllables (Finitzo-Hiber and Tillman, 1978). The detrimental effect of reverberation is an effect of the reflected sound that arrives later than 35-40 ms after the direct sound (Rossing, 1990). One negative effect of reverberation is that it leads to a buildup of the sound level and a lower S/N ratio. Long RTs also mean that one speech sound has not decayed before the next sound arrives, so that the decaying sound may mask the sound that follows. The most critical masking effect is that a vowel sound may mask the following consonant sound, which carries more information than vowel sounds but typically has a higher frequency and a 10-15 dB lower level. The reflected sound also makes the pauses between words less distinct.
However, the reflected sound may also have a positive effect by reinforcing the direct sound and thus improving the S/N ratio. This effect is explained by the fact that reflected sound that arrives within 35-40 ms of the direct sound is not perceived as a separate sound or a reverberation but adds to the loudness of the direct sound. Such a strengthening of the speech signal by reflected sound is often necessary to attain a sound level high enough for speech comprehension. This is especially true for listeners far away from the speaker, since the level of the direct speech sound decays rather rapidly with increased distance from the speaker. The level of the direct sound, and thereby the importance of the reflected sound, depends not only on the distance but also on the direction from the speaker. Speech is directional, and the direct speech sounds primarily reach listeners directly in front of the speaker. The addition of reflected speech is therefore often of great importance for those not sitting in front of the speaker, provided that it is not too delayed.
In line with this, it has been found that the ratio between the part of the sound that is useful for speech intelligibility (the direct plus early reflected speech sounds) and the detrimental part of the sound (the later reflected sound plus noise) correlates well with speech intelligibility (Bradley, 1986).
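Bradley's useful-to-detrimental ratio described above can be sketched from a measured impulse response by splitting its energy at an early-reflection boundary. This is a minimal sketch, assuming a 35 ms boundary (the lower end of the range cited above) and a steady noise power expressed on the same energy scale as the impulse response; the function name and arguments are illustrative, not Bradley's exact formulation.

```python
import numpy as np

def useful_detrimental_ratio(ir, fs, noise_power, early_ms=35.0):
    """Split the energy of a room impulse response at an early-reflection
    boundary: energy arriving within `early_ms` of the direct sound counts
    as useful; later reflections plus background noise count as
    detrimental (cf. Bradley's useful-to-detrimental ratio). Returns dB."""
    ir = np.asarray(ir, dtype=float)
    onset = int(np.argmax(np.abs(ir)))           # direct-sound arrival
    split = onset + int(early_ms * 1e-3 * fs)    # end of the early window
    energy = ir ** 2
    useful = energy[:split].sum()
    detrimental = energy[split:].sum() + noise_power
    return 10.0 * np.log10(useful / detrimental)
```

For an idealized response with a direct sound and a single late reflection, adding noise power lowers the ratio exactly as the formula predicts, which mirrors the paper's point that both late reflections and noise are detrimental.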
This means that RT should not necessarily be minimized but optimized in classrooms and other rooms where speech intelligibility is critical. If the sound absorption of the room surfaces is next to total, and the RT thus is close to zero, an unrealistically high voice level is likely to be required to keep the speech at a sufficient level in the more distant parts of the room. The optimum RT varies primarily with room volume and S/N ratio: larger rooms require longer RTs, and a better S/N ratio means that the RT should be shorter. This means that the optimum RT for speech intelligibility may vary within a large range. Typical recommended values for classrooms are in the range 0.3-0.8 s (American National Standards Institute, 2002, Crandell and Smaldino, 2000, Finitzo-Hiber and Tillman, 1978, Swedish Standards Institute, 2001, Wilson, et al., 2002). Open plan offices require shorter RTs than smaller offices (0.4 and 0.6 s, respectively, according to the Swedish Standards Institute, 2001), whereas the preferred RT is in the range 1.0-1.8 s in rooms used for orchestral music (Rossing, 1990).
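The dependence of RT on room volume mentioned above can be illustrated with the classic Sabine relation, RT60 ≈ 0.161·V/A, where V is the room volume in m³ and A is the total absorption area (surface area times absorption coefficient, summed over surfaces). The relation is standard room acoustics, but the room dimensions and absorption coefficients below are hypothetical, chosen only to show how a treated ceiling can bring a classroom into the 0.3-0.8 s range cited above.

```python
def sabine_rt(volume_m3, surfaces):
    """Sabine estimate of reverberation time: RT60 ~= 0.161 * V / A,
    where A = sum of (surface area in m^2) * (absorption coefficient)
    over the (area, alpha) pairs in `surfaces`."""
    absorption_area = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / absorption_area

# Hypothetical 200 m^3 classroom: 60 m^2 of absorbent ceiling (alpha = 0.7)
# plus 140 m^2 of harder surfaces (alpha = 0.1)
rt = sabine_rt(200.0, [(60.0, 0.7), (140.0, 0.1)])  # ~0.57 s, inside 0.3-0.8 s
```

Doubling the volume with the same absorption doubles the estimate, which is one reason larger rooms tend toward longer RTs.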
Hodgson and Nosal (2002) proposed a model for predicting the optimal RT in different situations. A critical aspect of the listening situation according to this model, not considered by other models, is the position of the noise source relative to the speaker and listener. An implication of their model is that the optimal RT exceeds zero only when the noise source is closer to the listener than to the speaker.
The effect of an overly long RT on speech intelligibility is pronounced if speech perception is rendered more difficult for other reasons, owing to signal or listener characteristics. Thus, the detrimental effect of a prolonged RT on intelligibility is larger when the speech is not clearly articulated (Payton, et al., 1994), as well as for persons with a hearing impairment (Finitzo-Hiber and Tillman, 1978, Helfer and Wilber, 1990). The effect also depends on the listener's information processing capacity and linguistic competence. Thus, the effects seem to be especially pronounced for older persons and children (Nabelek and Robinson, 1992). The vulnerability of the elderly is partly an effect of age-related sensorineural hearing impairment, but it also depends on the deterioration of working memory capacity, especially with respect to processing speed (Pichora-Fuller, 2003, Pichora-Fuller and Souza, 2003). Comprehension of a foreign language is also more vulnerable to an overly long RT than comprehension of the native language (Nabelek and Donahue, 1984, Takata and Nabelek, 1990). The same is true for understanding a speaker using a foreign language (Nabelek and Robinson, 1992). For similar reasons, it is likely that the complexity of the message and the listener's knowledge of the subject of the speech are of importance for the effect of RT on understanding. A more complex and less well-known subject means less redundant information, so that the loss of parts of the speech signal becomes more critical. However, to my knowledge there are no studies that have treated this problem.
The reasoning above implicitly assumes that good speech intelligibility should be striven for, which, of course, is not always the case. In open plan offices and many other contexts a major problem is speech privacy: preventing speech from being overheard unintentionally. This constitutes not only a privacy problem for the speakers but also a disturbance problem for the involuntary listener, since it has been shown that irrelevant speech is a very powerful distractor, especially during work with verbal tasks (Kjellberg and Skoldstrom, 1991, Oswald, et al., 2000). The conflict between speech intelligibility and privacy may be partly insoluble, since the desirable listening conditions may vary. However, in most contexts the best solution seems to be to minimize the reverberation time. A general prolongation of the RT may not be necessary to ensure acceptable speech intelligibility for the listeners farthest from the speaker: reflectors may instead be used to improve the signal-to-noise ratio in positions where this is needed, and this could be done without substantially increasing the RT in the room.
Reverberation and cognitive load in speech perception tasks
Although much research has been devoted to the effects of reverberation time on the speech signal and on speech comprehension, no attempts seem to have been made to analyze in more detail how the cognitive processing of the speech signal is affected by reverberation conditions. However, it is likely that these effects are similar to the ones found in studies of other factors that may interfere with speech understanding, such as noise and hearing impairment. Like other impoverished listening conditions, an overly long RT means missed or distorted, and thereby less redundant, information, which makes speech understanding a more demanding task.
Speech (as well as texts and other incoming stimuli) is processed in the working memory, which is a general cognitive resource with a limited processing capacity for both processing and temporary storage of current information (Baddeley, 1986, 2001). Speech understanding requires the listener to integrate the current information with information that has recently been processed and with stored information. Obviously, processing the speech in unfavorable reverberation conditions puts higher demands on the working memory; the impoverished stimulus means that there are more alternative interpretations of the speech stimuli and that speech understanding has to rely more on stored information for successful performance than in good listening conditions. The perceptual (phonological) coding of the speech signal, which in good listening conditions is largely an automatic and bottom up process, becomes more of a top-down and resource demanding process.
From this line of reasoning it follows that a good working memory capacity should be of importance for speech comprehension, especially in impoverished stimulus conditions, which put higher demands on the cognitive processing resources. Several studies confirm this prediction (Andersson and Lyxell, 1999, Lunner, 2003, Lyxell, et al., 1998, Ronnberg, 1995). It should be noted that these studies used text stimuli for the measurement of working memory capacity; the processing capacity is thus assumed not to be modality specific. This means that these studies only indirectly show how the cognitive processing of speech is affected by impaired listening conditions. Another important feature of their assessment of working memory was that they used a reading span task instead of the conventional test of memory span, in which, e.g., a series of digits has to be remembered. In the reading span task, subjects have both to decide whether a sentence is absurd or normal and to remember the first and last word of each sentence. This test requires not only storing but also parallel processing of information. The traditional digit span test thus only tests the capacity of the phonological store function of the working memory, whereas reading span performance also depends on the capacity of the central executive function (and probably also on processes not belonging to the working memory). This should make it a more valid test of working memory capacity.
A Swedish text-based test battery that has been used in many of these studies is the Text Information Processing System (TIPS). The battery includes measures of working memory capacity, phonological processing and verbal information processing (Hallgren, et al., 2001). Some models of working memory (Richardson, et al., 1996) assume that these cognitive skills are modality specific, that is, specific for visual and auditory tasks. This was one motive for developing an auditory (and an auditory-visual) version of the test battery, the Speech and Visual Information Processing System, SVIPS (Hallgren, et al., 2001). SVIPS includes three tests of verbal information processing: semantic decision making (decide whether a word belongs to a certain semantic category), lexical decision making (discriminate real words from nonsense words) and name matching (decide whether two presented letters are the same or not). Two rhyme tests are used as indicators of phonological processing (the other tests also require phonological processing, but in addition the use of its result for retrieving information from long-term memory). Both speed and accuracy are assessed. The reading span test of working memory is not included in the auditory battery. The first study with SVIPS included four groups: young and older subjects, with and without a hearing impairment. The study also contained a direct test of how cognitive speech processing is affected by impaired listening conditions, by testing both under quiet conditions and in noise (signal-to-noise ratio of +10 dB).
Generally, the results indicated few differences between the normal-hearing and the hearing-impaired groups in the text version. In the auditory version the hearing-impaired persons performed worse primarily on both speed and accuracy in the lexical decision test; their main difficulty was identifying the nonsense words. Noise impaired the accuracy of the lexical and name matching tests, especially among the elderly, and prolonged reaction times in the lexical, semantic and rhyme tests.
The performance of conventional tests of speech comprehension, like the identification of monosyllabic words, of course requires the use of the same cognitive processes as those tapped by the TIPS and SVIPS batteries. However, the traditional word identification tests reflect the total effect of the impaired listening situation on word identification, and do not specify the effects on different stages of processing. Furthermore, only the accuracy of performance, but not the speed, is assessed in these tests, which makes them rather insensitive indicators of the cognitive demands imposed by the task. The fact that the number of correctly identified words is unaffected in spite of an unfavorable listening condition does not necessarily mean that this condition did not make the task cognitively more demanding; the effect of the listening condition may have been that a more complex and time consuming processing of the speech signal was required to find the correct answer. By measuring response times it is thus possible to tap impairments of speech processing before the level where words can no longer be identified. This was illustrated in some of the tests of the SVIPS battery, where reaction times differentiated groups and conditions that performed equally well as judged from the accuracy measure (Hallgren, et al., 2001).
Correct interpretations of speech signals may not only take longer under impaired listening conditions; they are also likely to be made with less confidence. Therefore, another possible way to increase sensitivity would be to let subjects judge their confidence in the answers they give. It has been shown that noise may affect confidence judgments (e.g. Smith, 1989), but none of these studies dealt with speech comprehension.
In conclusion, we know that speech comprehension may be impaired by the reverberating sounds. Much research has been done on how persons with a hearing impairment process speech, and tests of critical cognitive processes have been developed. These tests have been shown to be sensitive to the effects of noise on speech processing (Hallgren, et al., 2001), and could also be used in studies of the cognitive load of speech processing in unfavorable reverberation conditions.
The identification of the elements of the speech signal, the phonological coding, thus requires more effort in demanding listening situations than in normal ones. This means not only that the risk of phonological processing failures, like a misinterpreted word, is raised. Another consequence is that fewer resources are available for the further processing of the speech information. This is discussed further in the next section.
Reverberation, cognitive fatigue and sustained attention
Listening to speech means that we make a controlled selection from the information that reaches us. Like all controlled cognitive processes (as opposed to the automatic ones) this selection makes demands on the available processing resources. When selection of the critical information requires discriminations between very similar acoustic signals more resources must be allocated to the task. In unfavorable listening conditions such a discrimination of signals from distractors and background is more difficult, which makes it necessary to exert more effort to succeed in this selection.
Only two studies have, to my knowledge, been reported on the effects of RT on speech discrimination (Culling, et al., 2003, Darwin and Hukin, 2000). Darwin and Hukin (2000) pointed out that reverberation might affect the difficulty of selective attention by affecting the saliency of two major cues used in selective auditory attention, viz. location and pitch. Interaural intensity and time differences provide important information for the localization of sounds, and reverberation may affect both, thus making localization more difficult. The pitch effect of reverberation means that the harmonic structure of a sound becomes less clear in connection with fast changes of the fundamental frequency (F0), which often occur in speech. This effect does not only make discrimination between voices more difficult; it may also affect speech intelligibility. They report a series of simulation experiments where the subjects' task was to attend to one of two sentences that were presented simultaneously. The words "dog" and "bird" were heard simultaneously, and the subject had to decide which of the two words belonged to the attended sentence. This was done with two RTs (0.14 and 0.4 s), with varying distance between the F0 of the two voices, and with varying interaural time differences (ITDs) for the two messages. Their first experiment showed that reverberation made it more difficult to use ITDs and pitch as discriminatory cues. In a second experiment with a similar design, the relative strength of spatial (ITDs) and prosodic (sentence intonation) discriminatory cues and the effect of RT on their effectiveness were studied. The results indicated that reverberation has a greater effect on the effectiveness of ITDs than on the prosodic cues. Their last experiment showed that a difference in apparent vocal tract size (spectral envelope) is very effective as a discriminatory cue, and one that is also highly resistant to reverberation.
Culling et al. (2003) investigated the effect of RT on listeners' ability to segregate two simultaneously presented voices, which were either spatially separated or coming from the same direction. The results from these experiments also showed that the RT may make it more difficult to use pitch and location as discriminatory cues.
Thus, one way in which RT may make speech comprehension more demanding is by making it more difficult to pick out the relevant information in the auditory stream. But even if the listener succeeds in this discrimination task, another problem caused by the reverberating sound may remain: the signal may not unequivocally specify which words were uttered. The speech signal then has to be supplemented with stored information to be correctly identified, which also requires processing resources.
The risk of attentional and processing failures increases when the task is prolonged, that is, when there are demands on sustained attention to the speech. Processing demands that can be handled perfectly for a shorter period may be too high in such a prolonged continuous task. A prolonged RT thus may leave the performance of a short-lasting speech processing task unimpaired. Eventually, a prolongation of demands will lead to fatigue, a depletion of processing resources. For many applications, for example in school situations, it would therefore be more important to know how the RT affects the ability to uphold a good performance level for prolonged periods than how the initial or maximum ability is affected.
The more resources required for the sheer identification of the verbal stimuli, the fewer resources are left for the understanding and further processing of the speech. If so, unfavorable reverberation conditions and other factors that affect speech recognition should lead to a larger impairment of performance of auditory verbal tasks the deeper the processing of the speech that is required (that is, the more cognitive resources the task demands). For example, unfavorable reverberation conditions may be of little importance if the task is just to indicate each time that an easily discriminable word is uttered. In contrast, performance may be impaired if it is necessary to integrate temporally separated information in a complex way to manage the task.
Since processing resources are likely to dissipate with time on task, these effects should become more pronounced as the task is prolonged, and more so the more resources are required by the task. This deterioration should therefore be made more pronounced by unfavorable reverberation conditions.
A further prediction is that this performance deterioration is likely to be accelerated if the task is made more difficult. If the reverberating sound makes the phonological processing of the task more difficult, fewer resources will be left for the further processing of the information. The effect of the higher discrimination demands should therefore be more apparent when the task requires more complex processing of the speech information.
Another important question is to what extent cognitive fatigue caused by the processing of auditory information affects the capacity to perform nonauditory tasks. The processing of auditory and of visual signals seems in both cases to make demands on general processing resources, as indicated by the relatively high correlations usually found between working memory performance during reading and listening (Daneman and Carpenter, 1980), although contradictory results have been reported (Hallgren, et al., 2001).
Other possible effects of RT on attention: Stimulus driven attention and distraction
The effects of RT discussed above all refer to situations where the listener tries to take in certain spoken information and ignore other information. This kind of attention has been called goal driven, active or top-down controlled attention. Another type of attention phenomenon is the stimulus driven or passive attentional responses, also labeled bottom-up controlled attentional processes.
Attention is stimulus driven when it is caught by the stimulus, without any intention on the part of the person. A sudden loud sound may, for example, catch a person's attention although he makes an effort to keep his attention on something else. The response elicited by sudden changes in our surroundings is called the orienting response (OR); it is an automatic, passive attentional response that increases the likelihood that we attend to changes that are of importance for us. If the stimulus is very intense and aversive, the physiological response pattern is somewhat different (the defensive response). A related response is the startle reflex, which is triggered by sounds with a very sudden onset. The negative side effect of these responses on cognitive functions is, of course, that a sound may interrupt an ongoing cognitive activity by directing attention to an irrelevant noise source. In many tasks the effect is an inconsequential brief interruption of work; in other tasks the effect is to interrupt a line of thought or a behavioral sequence that is difficult to resume.
An important aspect of stimulus driven attention is that it, like other automatic cognitive processes, makes no or very small demands on the available processing resources. Thus, it is not necessary for us to allocate processing resources for the detection of events that elicit an orientation response.
Stimulus driven attention to auditory input depends on its intrusiveness. Important, and partly interdependent, stimulus characteristics for such responses are intensity (S/N ratio), novelty, onset characteristics and signal value. The importance of novelty means that when a stimulus is repeated at a high enough rate the orienting response will habituate: one learns not to attend to the stimulus.
Is it likely that RT affects the attention catching capacity of a sound? The reverberating sound might possibly lower the probability of such attentional responses by making the onset and offset less distinct, thus making an auditory stimulus stand out less clearly against the background. On the other hand, a shortened RT may also mean a lowered signal-to-noise level, which should make the signal less intrusive. To my knowledge, no direct test of such effects has been reported. Thus, we do not know whether psychophysiological or behavioral indicators of attention responses are affected by reverberation conditions. Berg's (2001) study of arousal responses to twelve different types of sounds during sleep has some relevance for this question. He found that a mean shortening of the RT of 0.12 s in the 200-5000 Hz frequency range reduced the number of arousal responses to various sounds. A problem in the interpretation of these results is that the night with a shortened RT followed the control night for all subjects. Therefore, it cannot be excluded that differences in habituation may have contributed to the observed difference, although it seems unlikely: efforts were made in Berg's study to avoid such an effect, and previous studies of responses to noise during sleep indicate that no habituation of EEG responses occurs (Carter, et al., 2002). In his discussion Berg suggests that the reduction of arousal responses was an effect of a smoothing of the sharpness of the sound stimuli by the shortened RT. No supporting psychoacoustic measurement was presented, and the hypothesis cannot be evaluated without listening to sounds in the treated and untreated room. An obvious alternative explanation would be that the shortened RT also lowered the sound level, but this cannot have been the case, since the effects on sound levels were extremely small, which is also to be expected from the rather moderate reduction of the RT.
The interpretation of the results would also have been helped by separating the effects of the shortened RT on responses to impulse sounds and to continuous sounds.
As mentioned above, the orienting response is automatic and does not require us to allocate resources to it. However, inhibiting such responses, for example when trying to concentrate on one's task in spite of noisy surroundings, is resource consuming, and the more so the more intrusive the noise is. This means that the cognitive load depends on the OR-eliciting potency of the potential distractors. To the extent that RT affects the intrusiveness of distracting sounds, it may have consequences for the cognitive load imposed by a task.
Irrelevant sounds thus may increase cognitive demands in speech perception in two ways: by impairing the information in the speech signal and by drawing attention from the speech to irrelevant sounds.
Everyday experiences tell us that we sometimes are so absorbed by an activity that we are shut off from surrounding events like telephone signals that normally would draw our attention. Thus the intrusiveness of the distractor is not the only factor that determines the possibility to inhibit orienting responses. Motivational factors are obviously also of importance, but systematic research seems to be lacking on such effects.
Two other ways in which unfavorable reverberation conditions may interfere with the ability to inhibit attentional responses to irrelevant stimuli have already been touched upon above. One case is when a stimulus competing for attention is very similar to the target stimulus. This is the general problem of discrimination between stimuli in selective attention discussed above. Another case is when fatigue develops as a consequence of long lasting cognitive demands. Fatigue means that the capacity for focused attention is reduced and thus, that attention is likely to wander away from the task to other stimuli or from the whole external situation to inner thoughts.
Reverberation, sound quality, mood and cognition
The main cognitive problem created by an unfavorable reverberation time is that it may impair the information in the speech signal. However, as evident in the extreme case of anechoic rooms, a room with extremely short RT feels strange and uncomfortable, and the absence of acoustic feedback makes it more difficult for the speakers to monitor their voices. The early arriving reflected sound thus not only may improve the signal-to-noise ratio; it can also modify the sound quality in a pleasant way.
The subjective importance of reverberation has also been demonstrated in studies of virtual environments with and without simulation of the reverberation characteristics of the virtual environment. Reverberation added to the feeling of presence in the virtual environment (Larsson, Vastfjall and Kleiner, 2001, as cited by Vastfjall, et al., 2002). A related effect of the reflected sound is that it affects the perceived direction of the sound source. If the reflected sound arrives less than 35 ms after the direct sound, the direct sound determines the perceived direction of the sound (the precedence effect). Reflected sound that arrives later makes the sense of direction less precise (Rossing, 1990).
It thus seems likely that reverberation and other aspects of the acoustical environment may be of importance for our comfort and affective state, and not only through their effects on speech comprehension. This is of interest in itself, but it may also have implications for the cognitive effects of reverberation time, since many studies have demonstrated that cognitive processes are affected by a person's mood. Mood effects may thus be another way for reverberation time to influence cognitive processes, beside those mediated directly by its effects on speech processing. However, systematic studies of the effects of RT on mood and other affective responses seem to be scarce. Vastfjall, et al. (2002) showed that RT affected the emotional reaction to music, with the longest reverberation time inducing the least pleasant and least aroused state.
The effect of mood on cognitive processing may be conceptualized within the same theoretical framework as was used above in the discussion of the effects of RT and noise, that is, in terms of how the information processing resources available for the task are affected (Ellis and Ashbrook, 1991; Seibert, 1991). According to this hypothesis, both positive and negative mood states could increase the risk of task-irrelevant processing, which competes for the available processing resources. Other authors have proposed specific effects of a positive mood on how reasoning tasks are processed, which should result in better performance on at least some types of tasks (Isen, 1987, 2002, 2003). Neither hypothesis receives unequivocal support from the empirical studies, but it is clear that even a positive mood may have disruptive cognitive effects. However, the effects may vary between different types of cognitive tasks, probably depending on which processes are critical for the task (Oaksford, et al., 1996).
It is easy to envisage situations where reverberating sound leads to strong negative mood responses, at least when it makes speech communication difficult. The task-irrelevant cognitive processing elicited by this mood might then add to the cognitive load resulting from the impaired speech signal. It seems much less likely that the acoustic conditions in a speech communication situation would have positive mood effects strong enough to interfere with information processing.
An impressive amount of research has been reported on the cognitive effects of noise, and we therefore know a fair amount about how attention, memory processes and learning are affected by noise level and type of noise (e.g. traffic noise and irrelevant speech). In contrast, there are virtually no studies of the cognitive effects of RT beside those on speech intelligibility. Speech intelligibility has in these studies most often been defined as the number of correctly identified words in otherwise good listening conditions. The results therefore have limited relevance for understanding how RT affects the reception and storing of information under less optimal and more realistic conditions. Results from these studies show that unfavorable reverberation conditions may make word identification more difficult. However, no attempts have been made to elucidate how different aspects of speech processing are affected by RT, and there are no studies of the implications of these effects in normal speech communication situations.

These situations differ from those generally used in speech intelligibility studies in two critical ways. The first major difference is that the task in normal communication is not to identify a series of words but to extract the information from a long series of words. This means that the content of the spoken message must be stored and processed in a way that is not necessary when the task is to identify single words. Given the limited capacity of the working memory, if more resources must be allocated to the phonological coding of the speech, less will be left for its further processing. The second difference is that speech communication situations are often extended in time. In schools, demands for continuous attention to speech for an hour are not unusual. In such situations cognitive fatigue is likely to set in, and increased cognitive demands should accelerate this development.
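The limited-capacity argument can be made concrete with a toy calculation. The sketch below is purely illustrative and not from the paper; the capacity value and the identification costs are arbitrary units chosen only to show the trade-off:

```python
# Toy illustration of the limited-capacity working-memory argument:
# total resources are fixed, so any extra cost of word identification
# is subtracted from what remains for comprehension and storage.
# All numbers are arbitrary, hypothetical units.
TOTAL_CAPACITY = 100.0

def resources_left_for_comprehension(identification_cost):
    """Resources remaining after phonological coding of the speech."""
    return max(TOTAL_CAPACITY - identification_cost, 0.0)

# Favorable acoustics: word identification is cheap.
good_room = resources_left_for_comprehension(20.0)   # 80.0 units remain
# Long RT: words are still identified, but at a higher cost.
poor_room = resources_left_for_comprehension(60.0)   # 40.0 units remain
```

The point of the sketch is that both rooms may yield near-perfect word identification scores, while leaving very different amounts of capacity for the further processing and storing of the message.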
The general argument thus is that unfavorable reverberation conditions reduce the cognitive resources available for speech processing beyond the identification of speech elements. In this respect the argument is akin to the information overload approach to the effects of noise and other stressors (Cohen, et al., 1986). According to that model, however, the overload is an effect of attentional resources being allocated to the irrelevant noise, and the effects are therefore not limited to tasks that require auditory information processing. In an auditory task, the critical effect of both noise and a too long RT is that they make identification of the auditory signal more difficult. However, noise may also reduce the resources available for further processing in two other ways. As Cohen (1978) suggested, the noise is monitored and therefore requires some allocation of attention. In addition, the effort of not becoming distracted by the noise may take a toll on processing resources. Thus, in a task that requires speech comprehension, the effects of noise and a too long RT are both probably explained by their effect on available processing resources. This reasoning leads to a testable hypothesis about the relative effects of noise and RT on the processing of spoken information: if a noise level and a RT are chosen that disrupt speech intelligibility to the same degree, the noise should interfere more than the RT with further information processing, since the noise also reduces the available processing resources in the two other ways mentioned.
The studies of how RT affects speech intelligibility thus tell us rather little about the practical implications of unfavorable reverberation conditions in speech communication situations. To further the understanding of these implications two basic questions should be studied: How does RT affect different aspects of speech processing by the working memory? How does RT affect the stimulus-driven attentional responses to distracting sounds?
Studies are also needed of RT effects in realistic, long-lasting speech communication situations. The interesting question in such studies is primarily under which conditions the reverberating sound is likely to impair the understanding and storing of the spoken message. From a practical point of view, it is especially important to study these effects in vulnerable groups: children in schools, the hearing impaired, persons whose mother tongue differs from the speaker's, and the elderly. This would provide a better basis for design requirements for environments in which these groups are less handicapped in speech communication situations.
The writing of this paper was made possible by financial support from Ecophon and the Centre for Built Environment at the University of Gavle.
|1||American National Standards Institute (2002). ANSI/ASA S12.60-2002. Acoustical Performance Criteria, Design Requirements, and Guidelines for Schools. Washington: American National Standards Institute|
|2||Andersson, U. and Lyxell, B. (1999). Phonological deterioration in adults with an acquired severe hearing impairment. Scandinavian Audiology 27 Suppl. 49: 93-100|
|3||Baddeley, A. D. (1986). Working memory. Oxford: Clarendon Press|
|4||Baddeley, A. D. (2001). Is working memory still working? American Psychologist 56: 851-864|
|5||Berg, S. (2001). Impact of reduced reverberation time on sound-induced arousals during sleep. Sleep 24: 289-292|
|6||Bradley, J. S. (1986). Predictors of speech intelligibility in rooms. Journal of the Acoustical Society of America 80: 837-845|
|7||Bradley, J. S. (1986). Speech intelligibility studies in classrooms. Journal of the Acoustical Society of America 80: 846-854|
|8||Carter, N., Henderson, R., Lal, S., Hart, M., Booth, S. and Hunyor, S. (2002). Cardiovascular and autonomic response to environmental noise during sleep in night shift workers. Sleep 25: 457-464|
|9||Cohen, S. (1978). Environmental load and the allocation of attention. In Advances in environmental psychology. Baum, A., Singer, J. E. and Valins, S., eds. Erlbaum, Hillsdale, NJ. pp 1-29|
|10||Cohen, S., Evans, G. W., Stokols, D. and Krantz, D. S. (1986). Behavior, health, and environmental stress. New York: Plenum|
|11||Crandell, C. C. and Smaldino, J. J. (2000). Acoustical modifications for the classroom. Volta Review 101: 33-46|
|12||Culling, J. F., Hodder, K. I. and Toh, C. Y. (2003). Effects of reverberation on perceptual segregation of competing voices. Journal of the Acoustical Society of America 114: 2871-2876|
|13||Daneman, M. and Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior 19: 450-466|
|14||Darwin, C. J. and Hukin, R. W. (2000). Effects of reverberation on spatial, prosodic, and vocal-tract size cues to selective attention. Journal of the Acoustical Society of America 108: 335-342|
|15||Ellis, H. C. and Ashbrook, P. W. (1991). The "state" of mood and memory research: a selective review. In Mood and memory. Kuiken, D., ed. Sage, Newbury Park. pp 1-21|
|16||Finitzo-Hiber, T. and Tillman, T. W. (1978). Room acoustic effects on monosyllabic word discrimination ability for normal hearing and hearing impaired children. Journal of Speech and Hearing Research 21: 440-458|
|17||Helfer, K. S. and Wilber, L. A. (1990). Hearing loss, aging, and speech perception in reverberation and noise. Journal of Speech and Hearing Research 33: 149-155|
|18||Hodgson, M. and Nosal, E. M. (2002). Effect of noise and occupancy on optimal reverberation times for speech intelligibility in classrooms. Journal of the Acoustical Society of America 111: 931-939|
|19||Hallgren, M., Larsby, B., Lyxell, B. and Arlinger, S. (2001). Evaluation of a cognitive test battery in young and elderly normal-hearing and hearing-impaired persons. Journal of the American Academy of Audiology 12: 357-370|
|20||Isen, A. M. (1987). Positive affect, cognitive processes and social behaviour. In Advances in experimental social psychology. Berkowitz, L., ed. Academic, New York. pp 203-253|
|21||Isen, A. M. (2002). Missing in action in the AIM: Positive affect's facilitation of cognitive flexibility, innovation, and problem solving. Psychological Inquiry 13: 57-65|
|22||Isen, A. M. (2003). Positive affect as a source of human strength. In A psychology of human strengths: Fundamental questions and future directions for a positive psychology. Aspinwall, L. G. and Staudinger, U. M., eds. American Psychological Association, Washington, DC. pp 179-195|
|23||Kjellberg, A. and Skoldstrom, B. (1991). Noise annoyance during the performance of different non-auditory tasks. Perceptual and Motor Skills 73: 39-49|
|24||Lunner, T. (2003). Cognitive function in relation to hearing aid use. International Journal of Audiology 42, Supplement 1: 49-58|
|25||Lyxell, B., Andersson, U., Arlinger, S. and Harder, H. (1998). Phonological representation and speech understanding with cochlear implants in deafened adults. Scandinavian Journal of Psychology 39: 175-179|
|26||Nabelek, A. K. and Donahue, D. M. (1984). Perception of consonants in reverberation by native and non-native listeners. Journal of the Acoustical Society of America 71:|
|27||Nabelek, A. K. and Robinson, P. (1992). Monaural and binaural speech perception in reverberation for listeners of various ages. Journal of the Acoustical Society of America 56: 628-639|
|28||Oaksford, M., Morris, F., Grainger, B. and Williams, J. M. G. (1996). Mood, reasoning, and central executive processes. Journal of Experimental Psychology: Human Learning, Memory and Cognition 22: 476-492|
|29||Oswald, C. J. P., Tremblay, S. and Jones, D. M. (2000). Disruption of comprehension by the meaning of irrelevant sound. Memory 8: 345-350|
|30||Payton, K. L., Uchanski, R. M. and Braida, L. D. (1994). Intelligibility of conversational and clear speech in noise and reverberation for listeners with normal and impaired hearing. Journal of the Acoustical Society of America 95: 1581-1592|
|31||Pichora-Fuller, M. K. (2003). Cognitive aging and auditory information processing. International Journal of Audiology 42 (Supplement 2): S26-S32|
|32||Pichora-Fuller, M. K. and Souza, P. A. (2003). Effects of aging on auditory processing of speech. International Journal of Audiology 42 (Supplement 2): S11-S16|
|33||Rossing, T. D. (1990). The science of sound. Reading, Mass.: Addison-Wesley|
|34||Ronnberg, J. (1995). What makes a skilled speechreader? In Profound deafness and speech communication. Plant, G. and Spens, K.-E., eds. Whurr, London. pp 393-416|
|35||Seibert, P. S. and Ellis, H. C. (1991). Irrelevant thoughts, emotional mood states, and cognitive task. Cognition and Emotion 19: 507-513|
|36||Smith, A. P. (1989). Noise, confidence ratings and recognition memory. In Contemporary ergonomics 1989. Megaw, E. D., ed. Taylor and Francis, London. pp 494-499|
|37||Takata, Y. and Nabelek, A. K. (1990). English consonant recognition in noise and in reverberation by Japanese and American listeners. Journal of the Acoustical Society of America 88: 663-666|
|38||Vastfjall, D., Larsson, P. and Kleiner, M. (2002). Emotion and auditory virtual environments: affect-based judgments of music reproduced with virtual reverberation times. CyberPsychology and Behavior 5: 19-32.|