Many New Zealand native bird species are under threat, and as such conservationists
are interested in obtaining accurate estimates of population density in order to closely
monitor the changes in abundance of these species over time. One method of estimating
the presence and abundance of birdlife in an area is to use acoustic recorders; currently,
omnidirectional microphones are used, which provide no estimate of the direction of
arrival of the call. An estimate of the direction from which each sound came would
help to distinguish one individual calling multiple times from multiple birds calling in
succession, thus providing more accurate information to models of population density.
The estimation of this direction-of-arrival (or DOA) for each source is known as acoustic
source localisation, and is the subject of this work.
This thesis contains a discussion and application of two families of algorithm for
acoustic source localisation: those based on the Generalised Cross-Correlation (GCC)
algorithm, which applies weightings to the calculation of the cross-correlation of two
signals; and those based on the Multiple Signal Classification (MUSIC) algorithm,
which provides an estimate of source direction based on subspaces generated by the
covariance matrix of the data. As the MUSIC algorithm was originally described for
narrowband signals, an assumption that does not hold for birdsong, we discuss several
adaptations of MUSIC to the broadband scenario; one such adaptation requires the
use of polynomial matrices, which are described herein.
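To make the GCC family concrete, the following is a minimal sketch of delay estimation for a single microphone pair using the phase transform (PHAT) weighting, one common choice of GCC weighting. It is an illustration only and is not the implementation produced for this thesis: the function names gcc_phat_delay and delay_to_angle, the far-field (plane-wave) model, and the speed of sound of 343 m/s are all assumptions made for the example.

```python
import numpy as np


def gcc_phat_delay(x, y, fs, max_delay=None):
    """Estimate the delay of y relative to x, in seconds, using GCC-PHAT.

    A positive result means the signal reaches channel y after channel x.
    Hypothetical helper written for illustration only.
    """
    n = len(x) + len(y)  # zero-pad so the circular correlation does not wrap
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)

    # Cross-power spectrum; the PHAT weighting keeps only its phase.
    cross = Y * np.conj(X)
    cross /= np.maximum(np.abs(cross), 1e-12)

    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if a bound is given.
    max_shift = n // 2
    if max_delay is not None:
        max_shift = max(1, min(int(round(max_delay * fs)), max_shift))

    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs


def delay_to_angle(tau, spacing, c=343.0):
    """Convert a delay (s) to a DOA in degrees from broadside.

    Assumes a far-field source and two microphones `spacing` metres apart;
    c is an assumed speed of sound in m/s.
    """
    ratio = np.clip(c * tau / spacing, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))
```

Given two synchronised channels sampled at fs Hz, delay_to_angle(gcc_phat_delay(x, y, fs), spacing) yields a single DOA estimate for the pair. Under this model the largest possible delay is spacing / c, which can be passed as max_delay to exclude implausible peaks.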
An experiment was conducted during this work to determine the effect that the
distance between the microphones in a microphone array has on the ability of that
array to localise various acoustic signals, including calls of the New Zealand native North Island
Brown Kiwi, Apteryx mantelli. It was found that both GCC and MUSIC benefit
from larger inter-microphone spacings, and that a variant of the MUSIC algorithm known as
autofocusing MUSIC (or AF-MUSIC) provided the most precise DOA estimates.
Though native birdlife was the motivation for this research, none of the methods
described in this thesis is restricted to recordings of birdsong;
indeed, any multichannel audio which satisfies the necessary assumptions of each
algorithm would be suitable.
In addition to a description of the algorithms, this work produced an implementation of GCC,
MUSIC, and AF-MUSIC in the Python 3 programming language, which is available at