Wavelet-based birdsong recognition for conservation : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Palmerston North, New Zealand

Massey University
The Author
According to the International Union for the Conservation of Nature Red Data List, nearly a quarter of the world's bird species are either threatened or at risk of extinction. To be able to protect endangered species, we need accurate survey methods that reliably estimate numbers and hence population trends. Acoustic monitoring is the most commonly used method to survey birds, particularly cryptic and nocturnal species, not least because it is non-invasive, unbiased, and relatively time-effective. Unfortunately, the resulting data still have to be analysed manually. The current practice, manual spectrogram reading, is tedious, prone to bias due to observer variations, and not reproducible. While there is a large literature on automatic recognition of targeted recordings of small numbers of species, automatic analysis of long field recordings has not been well studied to date. This thesis considers this problem in detail, presenting experiments demonstrating the true efficacy of recorders in natural environments under different conditions, and then working to reduce the noise present in the recording, as well as to segment and recognise a range of New Zealand native bird species. The primary issues with field recordings are that the birds are at variable distances from the recorder, that the recordings are corrupted by many different forms of noise, that the environment affects the quality of the recorded sound, and that birdsong is often relatively rare within a recording. Thus, methods of dealing with faint calls, denoising, and effective segmentation are all needed before individual species can be recognised reliably. Experiments presented in this thesis demonstrate clearly the effects of distance and environment on recorded calls. Some of these results are unsurprising; for example, an inverse square relationship with distance is largely true. Perhaps more surprising is that the height from which a call is transmitted has a significant effect on the recorded sound.
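The inverse square relationship mentioned above can be illustrated with a small calculation. This is a minimal sketch of idealised free-field spreading only, not the statistical model fitted in the thesis; the function names `relative_intensity` and `spreading_loss_db` are hypothetical, and effects of vegetation, terrain, and call height are ignored.

```python
import math

def relative_intensity(d_ref, d):
    """Intensity at distance d relative to intensity at d_ref,
    assuming ideal inverse-square spreading (an idealisation)."""
    if d_ref <= 0 or d <= 0:
        raise ValueError("distances must be positive")
    return (d_ref / d) ** 2

def spreading_loss_db(d_ref, d):
    """Drop in sound level (dB) when moving from d_ref to d.
    Equivalent to -10 * log10(relative_intensity(d_ref, d))."""
    if d_ref <= 0 or d <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(d / d_ref)

if __name__ == "__main__":
    # Each doubling of distance costs about 6 dB under this idealisation.
    for d in (10, 20, 40, 80):
        print(d, "m:", round(spreading_loss_db(10, d), 2), "dB below the 10 m level")
```

Under this idealisation, a call recorded at 80 m is roughly 18 dB quieter than the same call at 10 m, which is one reason faint calls need explicit handling.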
Statistical analyses of the experiments, which demonstrate many significant environmental and sound factors, are presented. Regardless of these factors, the recordings have noise present, and removing this noise is helpful for reliable recognition. A method for denoising based on the wavelet packet decomposition is presented and demonstrated to significantly improve the quality of recordings. Following this, wavelets were also used to implement a call detection algorithm that identifies regions of the recording with calls from a target bird species. This algorithm is validated using four New Zealand native species, namely Australasian bittern (Botaurus poiciloptilus), brown kiwi (Apteryx mantelli), morepork (Ninox novaeseelandiae), and kakapo (Strigops habroptilus), but could be used for any species. The results demonstrate high recall rates and tolerable false positive rates when compared to human experts.
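To give a feel for denoising with a wavelet packet decomposition, the sketch below splits a signal into a full packet tree, soft-thresholds the coefficients, and reconstructs it. It is an illustration under stated assumptions, not the method developed in the thesis: the Haar wavelet, the fixed threshold, the choice to leave the lowest-frequency node untouched, and all function names are assumptions made here for brevity.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_split(x):
    """One Haar step: return (approximation, detail) coefficients."""
    a = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    d = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return a, d

def haar_merge(a, d):
    """Inverse of haar_split."""
    x = []
    for ai, di in zip(a, d):
        x.append((ai + di) / SQRT2)
        x.append((ai - di) / SQRT2)
    return x

def packet_decompose(x, levels):
    """Full wavelet packet tree: split every node at every level.
    Returns the 2**levels leaf nodes."""
    nodes = [list(x)]
    for _ in range(levels):
        nxt = []
        for node in nodes:
            nxt.extend(haar_split(node))
        nodes = nxt
    return nodes

def packet_reconstruct(nodes):
    """Merge sibling leaves pairwise until one signal remains."""
    nodes = [list(n) for n in nodes]
    while len(nodes) > 1:
        nodes = [haar_merge(nodes[i], nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
    return nodes[0]

def soft_threshold(c, t):
    """Shrink a coefficient towards zero by t."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def denoise(x, levels=3, threshold=0.5):
    """Soft-threshold every leaf except the first (lowest-frequency) one.
    The threshold value is an arbitrary illustrative choice."""
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    nodes = packet_decompose(x, levels)
    kept = [nodes[0]] + [[soft_threshold(c, threshold) for c in n]
                         for n in nodes[1:]]
    return packet_reconstruct(kept)

if __name__ == "__main__":
    noisy = [1.1, 0.9] * 4  # constant level with small alternating noise
    print([round(v, 3) for v in denoise(noisy)])
```

In practice the threshold is usually estimated from the noise level in each node rather than fixed, and a smoother wavelet than Haar is typical for audio; both choices are left open here.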
Listed in 2017 Dean's List of Exceptional Theses
Birdsongs, Data processing, Wavelets (Mathematics), Computer sound processing, Pattern recognition systems, Research Subject Categories::TECHNOLOGY::Information technology::Computer science, Dean's List of Exceptional Theses