End to End Deep Neural Network Frequency Demodulation of Speech Signals

Dan Elbaz, M.Sc. Thesis Seminar
Wednesday, 21.2.2018, 16:30
Taub 601
Adj. Prof. M. Zibulevsky

Frequency modulation (FM) is a modulation technique that has been in wide use for almost a century. Its most common application is radio broadcasting of audio signals, typically carrying voice. Distortion, noise and other impairments imposed on the transmitted signal severely degrade detection reliability, and as a result the intelligibility and quality of the detected speech drop significantly. In this work we present an end-to-end learning approach to FM detection, a novel application for a software-defined radio (SDR) receiver. By learning end to end, the system exploits prior information about the transmitted speech message in the demodulation process. The receiver uses a multi-layered bidirectional Long Short-Term Memory (BLSTM) architecture to capture the long-range dependencies and nonlinear dynamics of the speech signal. It then uses the learned speech structure to detect and enhance speech from the in-phase and quadrature components of the baseband signal. We compared the performance of the new system with that of the conventional method using several speech quality measures: signal-to-noise ratio (SNR), segmental SNR and the perceptual evaluation of speech quality (PESQ) score. The new system achieves high detection performance under both acoustic disturbances and communication channel noise, and is expected to outperform established methods in low-SNR conditions.
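For readers unfamiliar with detection from in-phase and quadrature (I/Q) components, the following is a minimal sketch of a textbook phase-difference FM discriminator, together with a plain SNR measure. It is an illustration only and is not taken from the thesis: the abstract does not say which conventional method was used, and the function names, frequency deviation and sample rate here are all assumptions.

```python
import cmath
import math


def fm_modulate(message, freq_dev_hz, sample_rate_hz):
    """Return complex baseband samples (I + jQ) of an FM signal.

    `message` is a sequence of floats, assumed to lie in [-1, 1].
    """
    phase = 0.0
    iq = []
    for m in message:
        # The instantaneous phase is the running integral of the message.
        phase += 2.0 * math.pi * freq_dev_hz * m / sample_rate_hz
        iq.append(cmath.exp(1j * phase))
    return iq


def fm_demodulate(iq, freq_dev_hz, sample_rate_hz):
    """Recover the message from the phase step between consecutive samples.

    The output is one sample shorter than the input and lines up with
    message[1:]. It is only valid while each phase step stays below pi,
    i.e. freq_dev_hz * |m| < sample_rate_hz / 2.
    """
    scale = sample_rate_hz / (2.0 * math.pi * freq_dev_hz)
    out = []
    for prev, cur in zip(iq, iq[1:]):
        # The angle of cur * conj(prev) is the phase increment.
        out.append(cmath.phase(cur * prev.conjugate()) * scale)
    return out


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB of `estimate` against `reference`."""
    signal = sum(r * r for r in reference)
    noise = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal / noise)


# Hypothetical parameters chosen for illustration only.
SAMPLE_RATE_HZ = 48000.0
FREQ_DEV_HZ = 5000.0

tone = [math.sin(2.0 * math.pi * 440.0 * n / SAMPLE_RATE_HZ) for n in range(480)]
iq_samples = fm_modulate(tone, FREQ_DEV_HZ, SAMPLE_RATE_HZ)
recovered = fm_demodulate(iq_samples, FREQ_DEV_HZ, SAMPLE_RATE_HZ)
```

On a noise-free channel this discriminator returns the message up to floating-point error. It uses no knowledge of the speech signal itself, which is the prior information the learned receiver described above is designed to exploit.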
