Speech Enhancement using Microphone Array in Moving Vehicle Environment
Jaeyoun Cho
Department of Electrical Engineering
The Ohio State University
Columbus, Ohio, USA
choj@ee.eng.ohio-state.edu

Ashok Krishnamurthy
Department of Electrical Engineering
The Ohio State University
Columbus, Ohio, USA
akk@ee.eng.ohio-state.edu

Abstract

This paper proposes a robust speech enhancement method combining spectral subtraction and beamforming, which can be used as a preprocessor for a speech recognition system. Spectral subtraction is an effective method to reduce stationary additive noise from a single microphone signal. However, it has a major drawback, in that it introduces musical noise. In this paper, it is demonstrated that the proposed method improves on existing spectral subtraction methods and reduces their residual noise using a microphone array.

1 Introduction

Speech recognition can be a valuable addition in many applications of vehicle automation and mobile communication. For example, vehicle devices such as cellphones, PDAs, or computers can be controlled by the driver's voice. However, the engine sounds and ambient noise around the driver can seriously degrade the quality of speech received by control systems or mobile phones. Since safety is one of the critical issues motivating control of vehicles by voice, it is necessary to provide adequate speech recognition performance.

Beamforming is one possible method of speech enhancement that can be used inside a vehicle. Beamforming is a temporal and spatial filtering process using an array of sensors, which emphasizes signals from a particular direction while attenuating noise or interference from the other directions [IS]. If the beamformer steers the main beam toward the driver's mouth, there may be no need to put on a headset microphone to talk to the car control system or to phone someone. Beamforming by itself, however, does not appear to enhance the speech enough to significantly improve speech recognition performance. Further, the performance of beamforming becomes worse if noise arrives from many directions or the speech has strong reverberation [4][6].

Beamforming has recently been combined with blind source separation (BSS) techniques, but this requires much longer computation time [IS]. Spectral subtraction, on the other hand, is an effective method to reduce additive noise from a single microphone signal. It can outperform other techniques in enhancing low-SNR signals, and is simple to implement. However, spectral subtraction introduces an unusual residual noise called musical noise, which is very annoying to human ears [2]. It is known that the musical noise can be attenuated by smoothing spectral variance or applying a masking threshold [7][19].
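As a rough illustration of the single-channel technique described above, the following Python sketch performs a basic magnitude spectral subtraction with overlap-add resynthesis. It is not the method proposed in this paper. The function names, frame length, hop size, and spectral floor are illustrative assumptions, and the noise spectrum is assumed to be estimated from a separate speech-free segment.

```python
import numpy as np

def frames(x, frame_len, hop):
    """Split x into overlapping frames (a trailing partial frame is dropped)."""
    n = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n)[:, None]
    return x[idx]

def spectral_subtract(noisy, noise_only, frame_len=256, hop=128, floor=0.02):
    """Basic magnitude spectral subtraction (illustrative parameter values)."""
    # Periodic Hann window: copies overlapped at a 50% hop sum to one,
    # so unmodified frames overlap-add back to the input.
    win = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame_len) / frame_len)

    # Average noise magnitude spectrum, estimated from a speech-free segment.
    noise_spec = np.fft.rfft(frames(noise_only, frame_len, hop) * win, axis=1)
    noise_mag = np.abs(noise_spec).mean(axis=0)

    spec = np.fft.rfft(frames(noisy, frame_len, hop) * win, axis=1)
    mag = np.abs(spec)

    # Subtract the noise estimate and clamp to a small spectral floor.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)

    # Reuse the noisy phase and resynthesize by overlap-add.
    clean = np.fft.irfft(clean_mag * np.exp(1j * np.angle(spec)),
                         n=frame_len, axis=1)
    out = np.zeros(len(noisy))
    for i, frame in enumerate(clean):
        out[i * hop : i * hop + frame_len] += frame
    return out
```

Bins whose magnitude falls below the noise estimate are clamped to the floor, while isolated bins that happen to exceed it survive from frame to frame; these randomly placed spectral peaks are what is perceived as musical noise.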
This paper proposes a new method that combines the advantages of beamforming and spectral subtraction. Even though both spectral subtraction and beamforming can enhance speech, it is not desirable to apply the single-channel algorithm independently to the microphone array signals, as these signals are strongly correlated with each other. This paper endeavors to develop a novel speech enhancement method based on psychoacoustic concepts and proposes a method of combining spectral subtraction with beamforming. The important synergy here is that the number of microphones needed for beamforming is reduced and the musical noise of spectral subtraction is attenuated, with better SNR improvement.
2 Algorithms

2.1 Proposed Method

A speech enhancement method using a microphone array is proposed here. A speaker or a speech source is located in the near-field of the microphone array. Since the speech wavefront arrives at each microphone at a different time, as shown in Figure 1, the time differences between microphones must be known beforehand so that the signals can be aligned. For example, if the kth microphone is the farthest from the source, the signal received at the mth microphone should be delayed by the difference between the propagation times to the kth and mth microphones. Fractional delay filters (FDs) are used to align the arrival time of the speech wavefront [17].
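The alignment step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the source-to-microphone distances are assumed known, the speed of sound is taken as 343 m/s, the fractional delay is realized with a Hamming-windowed sinc FIR filter (one common design; the design actually used is not given in this section), and all function names are hypothetical. The final averaging is a plain delay-and-sum, shown only to make the effect of alignment visible.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def alignment_delays(distances_m, fs):
    """Delay (in samples) that aligns each microphone with the farthest one."""
    d = np.asarray(distances_m, dtype=float)
    return (d.max() - d) / SPEED_OF_SOUND * fs

def delay_signal(x, delay, n_taps=31):
    """Delay x by a non-negative, possibly fractional, number of samples."""
    whole = int(np.floor(delay))
    frac = delay - whole
    # Hamming-windowed sinc shifted by the fractional part of the delay.
    center = (n_taps - 1) // 2
    n = np.arange(n_taps)
    h = np.sinc(n - center - frac) * np.hamming(n_taps)
    h /= h.sum()  # unity gain at DC
    # Drop the filter's own bulk delay of `center` samples.
    y = np.convolve(x, h)[center : center + len(x)]
    # Apply the integer part of the delay by prepending zeros.
    return np.concatenate([np.zeros(whole), y])[: len(x)]

def align_and_sum(signals, distances_m, fs):
    """Time-align the microphone signals, then average them (delay-and-sum)."""
    delays = alignment_delays(distances_m, fs)
    aligned = [delay_signal(x, d) for x, d in zip(signals, delays)]
    return np.mean(aligned, axis=0)
```

With this convention the farthest microphone receives zero delay and every other microphone is delayed by a positive amount, so the filters remain causal.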
0-7803-7848-2/03/$17.00 ©2003 IEEE