MATLAB Project – Modelling an Auditory System
This project aims to familiarise you with MATLAB by working on a real-world model: an auditory system that simulates the components associated with hearing. Note that this project is an individual task.
An auditory system mimics the behaviour of the biological cochlea found in humans and other mammals. The system converts a 1D discrete-time audio signal into a 2D time-frequency signal called an auditory spectrogram. From this spectrogram, audio information can be extracted, as shown in Table 1. Its applications include hearing aids, speech and musical information retrieval, audio multimedia systems, and brain modelling.
Index | Audio Information | Responsible For
1 | Intensity | Sound loudness.
2 | Direction | Location of the origin of a sound.
3 | Pitch | Difference between musical notes, and between male and female voices.
4 | Timbre | Sound colour and shape, which indicate the sound source, e.g. a specific person speaking or a specific musical instrument playing.
Table 1: Information extractable from an auditory spectrogram.
To convert a one-dimensional (1D) sound signal into a two-dimensional (2D) time-frequency representation, a cochlear filterbank is used. A cochlear filterbank comprises multiple gammatone filters arranged either in parallel or in cascade. The bandwidth of each gammatone filter increases with its centre frequency, so a filter with a high centre frequency has a wider bandwidth than a filter with a low centre frequency, as shown in Figure 1(a).
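The growth of bandwidth with centre frequency can be illustrated numerically. The sketch below assumes the Glasberg and Moore equivalent rectangular bandwidth (ERB) approximation, which is a common choice for gammatone filterbanks but is not stated in this handout; the variable names are illustrative only.

```matlab
% Illustrative only: the ERB formula (Glasberg & Moore) is an assumption,
% not taken from this handout or from gammatonegram.tgz.
fc  = [100 500 1000 4000];            % example centre frequencies in Hz
erb = 24.7 * (4.37 * fc / 1000 + 1);  % bandwidth estimate in Hz for each fc

for k = 1:numel(fc)
    fprintf('fc = %4d Hz -> bandwidth of about %.1f Hz\n', fc(k), erb(k));
end
```

Under this assumption the bandwidth grows roughly linearly with centre frequency, which is consistent with the widening curves in Figure 1(a).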
A gammatone filter generally behaves like a bandpass filter, but differs in ways that reflect the mechanics of the cochlea. Each gammatone filter is tuned to a specific centre frequency and responds mainly to frequencies near it, which corresponds to the mechanics of one specific location on the cochlea. So, when the input signal contains energy close to the centre frequency of the filter, the filter outputs a signal that resonates at its centre frequency. To model an entire cochlea, a gammatone filterbank is therefore used: a set of gammatone filters whose centre frequencies are tuned from low to high to cover the entire spectrum of a sound signal.
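The bandpass behaviour of a single filter can be checked with a short experiment. This is a minimal sketch that assumes the textbook gammatone impulse response, t^(n-1) exp(-2 pi b t) cos(2 pi fc t), and an ERB-based bandwidth b; the definition used in this project is the one in the lecture slides, and the sampling rate, order and centre frequency below are example values only.

```matlab
% Illustrative sketch: the textbook gammatone form and the bandwidth
% constant below are assumptions, not this handout's own definition.
fs = 16000;                                   % sampling rate in Hz (assumed)
fc = 1000;                                    % centre frequency in Hz (example)
n  = 4;                                       % filter order (example)
b  = 1.019 * 24.7 * (4.37 * fc / 1000 + 1);   % bandwidth in Hz (assumed, ERB-based)

t = (0:round(0.05 * fs) - 1) / fs;            % 50 ms of time samples
g = t.^(n - 1) .* exp(-2 * pi * b * t) .* cos(2 * pi * fc * t);  % impulse response

nfft = fs;                                    % gives 1 Hz frequency resolution
G    = abs(fft(g, nfft));                     % magnitude (gain) response
f    = (0:nfft/2 - 1) * fs / nfft;            % frequency axis below Nyquist
[~, iPeak] = max(G(1:nfft/2));                % bin where the gain is largest
fPeak = f(iPeak);                             % frequency of the gain peak in Hz
fprintf('Gain peaks at %.0f Hz for fc = %d Hz\n', fPeak, fc);
```

Plotting 20*log10(G(1:nfft/2)) against f would show a single passband around fc, similar in shape to one curve in Figure 1.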
Figure 1: Increasing bandwidth with increasing centre frequency in the gain response of gammatone filters. (a) x-axis is linearly scaled where intervals between frequencies are the same; (b) x-axis is logarithmically-scaled where intervals between frequencies are nonlinear.
Ideally, the differently tuned filters react to the different frequencies in the input signal and output multiple signals, one per filter. These signals are then half-wave rectified: all negative values are set to 0 and only positive values are kept. The rectified signals can be visualised as a 2D image known as an auditory spectrogram, as shown in Figure 2.
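Half-wave rectification is a single element-wise operation. The values below are hypothetical filter output samples chosen only to show the effect.

```matlab
x = [0.5 -0.2 0.0 -1.3 0.8];  % hypothetical filter output samples
y = max(x, 0);                % negative values become 0, positive values are kept
disp(y)
```

The same call works unchanged on a matrix, so a whole bank of filter outputs can be rectified at once.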
An alternative way to obtain a spectrogram is to calculate the short-time Fourier transform (STFT) of the sound signal.
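As a rough illustration of the STFT route, the sketch below applies the spectrogram function from the Signal Processing Toolbox to a synthetic tone. The window length, step size and FFT length are assumed example values, not the settings used in gammatonegram.tgz, and the test tone stands in for sa2.wav.

```matlab
% Requires the Signal Processing Toolbox (spectrogram, hamming).
fs = 16000;                          % sampling rate in Hz (assumed)
t  = (0:fs - 1)' / fs;               % 1 s of time samples
x  = sin(2 * pi * 1000 * t);         % synthetic 1 kHz tone, not sa2.wav

winLen = round(0.025 * fs);          % 25 ms analysis window (assumed)
hop    = round(0.010 * fs);          % 10 ms step between frames (assumed)
nfft   = 512;                        % FFT length per frame (assumed)
[S, F, T] = spectrogram(x, hamming(winLen), winLen - hop, nfft, fs);

P = abs(S).^2;                       % power in each time-frequency cell
[~, iMax] = max(mean(P, 2));         % frequency bin with the most energy
fprintf('%d bins x %d frames, strongest bin at %.1f Hz\n', ...
        size(P, 1), size(P, 2), F(iMax));
```

Unlike the gammatone filterbank, every STFT bin has the same bandwidth, which is one reason the two spectrograms look different for the same sound.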
Figure 2: Block diagram of auditory system
MATLAB Model Tasks
Implement an auditory system in MATLAB using the following steps:
1. Modify the sample code in gammatonegram.tgz according to your specifications from Table 2, which depend on the right-most digit of your student number. After making your changes, ensure the following:
- The heights of the two spectrograms generated by default are the same as the number of channels in your settings.
- The lowest centre frequency in your gain response display should be within ±8 Hz of your lowest centre frequency setting.
Right-most digit of your student number | Gammatone filter with lowest centre frequency | Number of channels, ?? (gammatone filters) | Gammatone filter order, ??
0 | 60 Hz | 90 | 2
1 | 70 Hz | 92 | 3
2 | 80 Hz | 94 | 4
3 | 90 Hz | 96 | 5
4 | 100 Hz | 98 | 6
5 | 110 Hz | 100 | 2
6 | 120 Hz | 102 | 3
7 | 130 Hz | 104 | 4
8 | 140 Hz | 106 | 5
9 | 150 Hz | 108 | 6
Table 2: Cochlear filterbank specification
2. In the same script file, generate a time vector (a vector is also known as an array) ??1 that contains numbers from [0 to (??-1)]/????. Ensure the division by ???? is done after generating the vector 0 to ?? -1. Note that ?? is the length (in number of samples, not time duration) of the sound signal stored in sa2.wav that is found in gammatonegram.tgz and ???? is the sampling rate of the sound signal.
3. Use the sound signal in sa2.wav, provided in gammatonegram.tgz, as the input to your model. In MATLAB, display the waveform of the sound signal against the time vector ??1 in figure 1, and label the x-axis as time in seconds and the y-axis as amplitude (unitless). Add an appropriate title to the graph.
4. In the original figure 1 from gammatonegram.tgz, two spectrograms are shown. Move these spectrograms to figure 2. Display the spectrogram generated by the gammatone filterbank in the top half of the figure and change the graph title to “Gammatone Spectrogram”. Display the spectrogram generated from the short-time Fourier transform (STFT) in the bottom half of the figure and change the graph title to “STFT Spectrogram”. Label all axes.
5. Generate a time vector ??2 that contains ?? number of samples in the range from 0 to the time duration of the sound signal in sa2.wav. Here, ?? is a fixed number dependent on the length (number of samples) of the auditory spectrogram.
6. Calculate and plot the average power (in Watts) of the STFT and auditory spectrograms in the top half and bottom half, respectively, of MATLAB figure 3. Here, average power is to be computed independently for each of the two spectrograms. Label the axes and title the graphs. Hint: see the online MathWorks help page on the bandpower command. It is also recommended to use the ??2 time vector for the display of both graphs.
7. In Figure 1 displayed above, the gain response of only every 5th channel of the gammatone filterbank is displayed. Generate and display the gain response ??1 (the equation has already been implemented for you in the second argument of the plot line in demo_gammatone.m) of all the channels in the gammatone filterbank on a linearly scaled x-axis, and the same response on a logarithmically scaled x-axis, in MATLAB figure 4. Plot the linearly scaled gain response in the top half of MATLAB figure 4 and the log-scaled gain response below it. In the graph, your settings from Table 2 can be checked by inspecting the peak of the first filter (left-most curve). This value should be within ±8 Hz of your setting from Table 2. The peak of the last filter (right-most curve) should be close to but less than the Nyquist frequency of 8 kHz.
8. Display the centre frequencies of every 5th channel from the gain response in the command window using fprintf with the help of a loop. The centre frequency of a channel is the frequency at which that channel's gain response reaches its maximum. The following string should be displayed on a new line in the command window for every 5th channel: “Centre frequency of channel ??: ???? Hz” where ?? is the channel number and ???? is the centre frequency.
9. Generate two temporal profiles – one from the auditory spectrogram and another from the STFT spectrogram. A temporal profile can be generated by summing all the rows of a spectrogram.
10. Generate two spectral profiles – one from the auditory spectrogram and another from the STFT spectrogram. A spectral profile can be generated by summing all the columns of a spectrogram.
11. Display the two temporal profiles and the two spectral profiles in figure 5. The x-axis of each temporal profile should be displayed with respect to ??2 (in seconds). The x-axis of each spectral profile should be displayed with respect to the ?? and ??2 vectors (in Hertz) that correspond to the two spectrograms; these vectors have been automatically generated for you in demo_gammatone.m. The amplitude (y-axis) of all four graphs is unitless. Display:
a. The spectral profile from the STFT spectrogram in the top-left corner of MATLAB figure 5;
b. The spectral profile from the auditory spectrogram in the top-right corner of MATLAB figure 5;
c. The temporal profile from the STFT spectrogram in the bottom-left corner of MATLAB figure 5;
d. The temporal profile from the auditory spectrogram in the bottom-right corner of MATLAB figure 5.
12. Use the 2D correlation coefficient (CC) to quantify the difference between the following pairs of signals (note that only one CC should be generated per comparison).
Use fprintf to display the result of each comparison below, one line at a time, in the command window.
a. Auditory spectrogram versus STFT spectrogram.
b. Auditory spectrogram bandpower versus STFT spectrogram bandpower.
c. Auditory spectrogram temporal profile versus STFT spectrogram temporal profile.
d. Auditory spectrogram spectral profile versus STFT spectrogram spectral profile.
13. Use symbolic variables to display the impulse response of a ??-order gammatone filter for channel 10, where ?? is the filter order given in Table 2 for the right-most digit of your student number. Also substitute the numeric centre frequency of channel 10 into the ???? variable. Display the equation to 6 significant figures. The impulse response equation is defined by ??[??] in the Auditory Signal Processing.pdf slides.
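Several of the tasks above (the time vectors in tasks 2 and 5, the average power in task 6, the profiles in tasks 9 and 10, and the correlation coefficient in task 12) reduce to a few array operations. The following is a minimal sketch on synthetic data rather than sa2.wav. All variable names (fs, x, t1, t2, A, B) are hypothetical, "summing all the rows" is read here as adding the rows together so that one value remains per time frame, and corrcoef on the flattened matrices is used as a base-MATLAB stand-in for a 2D correlation coefficient.

```matlab
% Synthetic stand-ins; none of these names come from gammatonegram.tgz.
fs = 16000;                                 % hypothetical sampling rate in Hz
x  = sin(2 * pi * 440 * (0:fs - 1)' / fs);  % 1 s test tone instead of sa2.wav
N  = length(x);                             % signal length in samples
t1 = (0:N - 1) / fs;                        % build 0..N-1 first, then divide by fs

A = [1 2 3; 4 5 6];                         % toy "spectrogram": 2 channels x 3 frames
B = 2 * A + 1;                              % second toy spectrogram of the same size
t2 = linspace(0, N / fs, size(A, 2));       % one time value per frame, 0 to duration

temporalProfile = sum(A, 1);                % rows added together: one value per frame
spectralProfile = sum(A, 2);                % columns added together: one value per channel
avgPower        = mean(A.^2, 1);            % mean-square value per frame (assumed reading
                                            % of "average power"; bandpower is the hinted
                                            % toolbox command)

R  = corrcoef(A(:), B(:));                  % 2x2 correlation matrix of the flattened data
cc = R(1, 2);                               % single correlation coefficient for the pair
fprintf('CC between A and B: %.4f\n', cc);
```

The two matrices passed to a 2D correlation must have the same size, so spectrograms or profiles of different lengths would need to be brought to a common size first. If the Image Processing Toolbox is installed, corr2(A, B) computes the same quantity directly.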
Add comments to the code you have modified or introduced in MATLAB. Submit only the MATLAB script files that you have modified, using the vUWS submission link.
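For the gain response display and the centre frequency printout described in tasks 7 and 8, the general pattern is a two-panel plot followed by a loop that steps through the channels five at a time. The sketch below builds a small synthetic filterbank to demonstrate that pattern; the centre frequencies, filter order, bandwidth formula and all variable names are illustrative assumptions, not the settings from Table 2 or the code in demo_gammatone.m.

```matlab
% Synthetic filterbank; centre frequencies, order and bandwidths are
% illustrative assumptions, not the settings from Table 2.
fs   = 16000;                                   % sampling rate in Hz
nCh  = 20;                                      % number of channels
fcs  = logspace(log10(100), log10(4000), nCh);  % centre frequencies in Hz
n    = 4;                                       % filter order
nfft = fs;                                      % 1 Hz frequency resolution
t    = (0:round(0.1 * fs) - 1) / fs;            % 100 ms of time samples
f    = (1:nfft/2 - 1) * fs / nfft;              % positive frequencies below Nyquist

H = zeros(nCh, numel(f));                       % gain response, one row per channel
for ch = 1:nCh
    b = 1.019 * 24.7 * (4.37 * fcs(ch) / 1000 + 1);   % assumed ERB-based bandwidth
    g = t.^(n - 1) .* exp(-2 * pi * b * t) .* cos(2 * pi * fcs(ch) * t);
    G = abs(fft(g, nfft));
    H(ch, :) = G(2:nfft/2) / max(G);            % scale by the largest FFT magnitude
end

figure(4);
subplot(2, 1, 1); plot(f, 20 * log10(H' + eps));      % linear frequency axis
xlabel('Frequency (Hz)'); ylabel('Gain (dB)'); title('Gain response, linear x-axis');
subplot(2, 1, 2); semilogx(f, 20 * log10(H' + eps));  % logarithmic frequency axis
xlabel('Frequency (Hz)'); ylabel('Gain (dB)'); title('Gain response, log x-axis');

peakHz = zeros(1, nCh);                         % stays 0 for channels that are skipped
for ch = 1:5:nCh                                % channels 1, 6, 11, 16
    [~, idx]   = max(H(ch, :));                 % bin where this channel's gain peaks
    peakHz(ch) = f(idx);                        % convert the bin index to Hz
    fprintf('Centre frequency of channel %d: %.0f Hz\n', ch, peakHz(ch));
end
```

Reading the centre frequency from the peak of the gain response, rather than from the design value, is what allows the ±8 Hz check in task 7 to confirm that the settings were applied.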
Progress Report (25%)
You are expected to complete up to task 4 from the MATLAB model section. Prepare a 2000-word progress report on the tasks, following the guidelines given in the learning guide where relevant. Describe what you have done to complete the tasks. If you are unable to complete any task, explain the difficulties you are experiencing. Use any online English grammar and vocabulary checking application (e.g. Grammarly) to ensure that your report is coherent and clear; marks will be given if you convey your ideas clearly and concisely.
Submit your progress report and your MATLAB script files that you have modified using the Turnitin link in vUWS under “Assessment 1”.
Final Report (50%)
Prepare a 4000-word final report with the structure outlined below. The final report should include the images as well as the correlation coefficient results and the gammatone filter impulse response (screen capture – do not use your phone to capture any images). Use any online English grammar and vocabulary checking application to ensure that your report is coherent and clear, e.g. Grammarly – marks will be given if you are able to convey your ideas clearly and concisely.
The sections to be included in the final report are:
1. Introduction. Objectives – alternatively, you can include a motivation statement on why this project is important.
2. Components of the auditory model.
3. Modelling the auditory system in MATLAB, with the specification from Table 2 clearly described. Also mention the filter order from Table 2 required to show the gammatone impulse response, and address the following questions in your report for additional marks:
a. The inner hair cell response has all its negative values set to 0. How can these values be converted from negative to positive instead? Suggest a computational method to distinguish the original positive values from the positive values converted from negative values.
b. If individual samples in a spectrogram each have values larger than one, the temporal and spectral profiles will have varying dynamic ranges (y-axis ranges) for different sound loudness. This effect makes it difficult to compare two profiles with different dynamic ranges. One remedy is to lock the dynamic range of the signals to be analysed between 0 and 1. Suggest a computational method to obtain temporal and spectral profiles with amplitudes between 0 and 1.
4. Results and discussion (screen capture of all the figures; screen capture of the MATLAB command window showing the correlation coefficients (CC) and the symbolic equation). Discuss the CC results to indicate the degree of difference between the pairs of vectors and matrices in task 12.
a. State which CC result is highest, and thus which pair is most similar.
b. Conversely, state which CC result is lowest, and thus which pair is least similar.
c. How is the temporal profile signal related to the original sound signal? Use your observation and any key terminologies to describe their association.
5. Conclusion (discuss your experience in using MATLAB for modelling of the auditory model, its usefulness, and difficulties).
6. References. Either Harvard-style or IEEE-style referencing is acceptable – See the last slide in Auditory Signal Processing.pdf as an example.
Submit your final report and your MATLAB script files that you have modified using the Turnitin link in vUWS under “Assessment 3”.
• Signal Processing Toolbox.
• Audio Toolbox.
• Auditory Filterbank Sample Code.
• Auditory Filterbank Documentation.