Speech processing with deep learning for voice-based respiratory diagnosis : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Albany, New Zealand

Ma, Zhizhong

dc.confidential	Embargo : No	en_US
dc.contributor.advisor	Wang, Ruili
dc.contributor.author	Ma, Zhizhong
dc.date.accessioned	2022-07-13T05:29:37Z
dc.date.accessioned	2022-11-13T21:36:40Z
dc.date.available	2022-07-13T05:29:37Z
dc.date.available	2022-11-13T21:36:40Z
dc.date.issued	2022
dc.description.abstract	Voice-based respiratory diagnosis research aims to automatically screen for and diagnose respiratory-related symptoms (e.g., smoking status, COVID-19 infection) from human-generated sounds (e.g., breath, cough, speech). It has the potential to serve as an objective, simple, and reliable method that is less time-consuming than traditional biomedical diagnosis methods. In this thesis, we conduct one comprehensive literature review and propose three novel deep learning methods to enrich voice-based respiratory diagnosis research and improve its performance. Firstly, we conduct a comprehensive investigation of the effects of voice features on the detection of smoking status. Secondly, we propose a novel method that combines high-level and low-level acoustic features with deep neural networks for smoking status identification. Thirdly, we investigate various feature extraction/representation methods and propose a SincNet-based CNN method for feature representation to further improve the performance of smoking status identification. To the best of our knowledge, this is the first systematic study that applies speech processing with deep learning to voice-based smoking status identification. Moreover, we propose a novel transfer learning scheme and a task-driven feature representation method for diagnosing respiratory diseases (e.g., COVID-19) from human-generated sounds. We find that the transfer learning methods using VGGish, wav2vec 2.0, and PASE+, as well as our proposed task-driven method Sinc-ResNet, achieve competitive performance compared with other work. The findings of this study provide a new perspective and insights for voice-based respiratory disease diagnosis. The experimental results demonstrate the effectiveness of our proposed methods and show that they achieve better performance than other existing methods.	en_US
dc.identifier.uri	http://hdl.handle.net/10179/17677
dc.publisher	Massey University	en_US
dc.rights	The Author	en_US
dc.subject	Respiratory organs	en
dc.subject	Diseases	en
dc.subject	Diagnosis	en
dc.subject	Data processing	en
dc.subject	Speech processing systems	en
dc.subject	Deep learning (Machine learning)	en
dc.subject	Voice	en
dc.subject.anzsrc	460212 Speech recognition	en
dc.title	Speech processing with deep learning for voice-based respiratory diagnosis : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Albany, New Zealand	en_US
dc.type	Thesis	en_US
massey.contributor.author	Ma, Zhizhong	en_US
thesis.degree.discipline	Computer Science	en_US
thesis.degree.grantor	Massey University	en_US
thesis.degree.level	Doctoral	en_US
thesis.degree.name	Doctor of Philosophy	en_US

Files

Original bundle

Name: MaPhDThesis.pdf
Size: 1.36 MB
Format: Adobe Portable Document Format

Collections

Theses and Dissertations