This book, published in 2015, is one of the best and most recent books on audio and speech processing with a deep learning approach. Its preface is reproduced here (number of pages: 329).
Automatic Speech Recognition: A Deep Learning Approach
by Dong Yu and Li Deng
Automatic Speech Recognition (ASR), which aims to enable natural human–machine interaction, has been an intensive research area for decades. Many core technologies, such as Gaussian mixture models (GMMs), hidden Markov models (HMMs), mel-frequency cepstral coefficients (MFCCs) and their derivatives, n-gram language models (LMs), discriminative training, and various adaptation techniques, have been developed along the way, mostly prior to the new millennium. These techniques greatly advanced the state of the art in ASR and in its related fields. Compared to these earlier achievements, the advancement in the research and application of ASR in the decade before 2010 was relatively slow and less exciting, although important techniques such as GMM–HMM sequence discriminative training were made to work well in practical systems during this period. In the past several years, however, we have observed a new surge of interest in ASR. In our opinion, this change was led by the increased demands on ASR in mobile devices and the success of new speech applications in the mobile world such as voice search (VS), short message dictation (SMD), and virtual speech assistants (e.g., Apple’s Siri, Google Now, and Microsoft’s Cortana). Equally important is the development of deep learning techniques in large vocabulary continuous speech recognition (LVCSR), powered by big data and significantly increased computing ability. A combination of deep learning techniques has led to an error rate reduction of more than one third over the conventional state-of-the-art GMM–HMM framework on many real-world LVCSR tasks and helped to pass the adoption threshold for many real-world users. For example, the word accuracy in English or the character accuracy in Chinese now exceeds 90% in most SMD systems, and even 95% in some.
Given the recent surge of interest in ASR in both industry and academia, we, as researchers who have actively participated in and closely witnessed many of the recent exciting developments in deep learning technology, believe the time is ripe to write a book to summarize the advancements in the ASR field, especially those during the past several years.
Click the image to view the book cover. File size: 4.54 MB
Direct download link: Automatic Speech Recognition with a Deep Learning Approach (book)