MEMS speakers for HRTF measurement

HRTFs and their relation to directional hearing

Head-Related Transfer Functions (HRTFs) describe how an individual’s anatomy, primarily the head, ears, and torso, affects the way sound waves arrive at the eardrum from various directions. Because anatomy differs from person to person, these effects are unique to each listener. The HRTF captures the filtering that occurs when sound is diffracted, reflected, and absorbed by the listener’s body before reaching the eardrum.

In other words, an HRTF relates the free-field sound (unimpeded sound from a source) to the sound pressure measured at the eardrum. In the time domain it is often represented as a set of head-related impulse responses (HRIRs), one per ear for each direction of the sound source relative to the listener.
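Written as a formula, this relation is commonly expressed as a ratio of two pressure spectra. The symbols below are notation introduced for this illustration, not taken from the project: P_L and P_R are the sound pressures at the left and right eardrum, P_0 is the free-field pressure at the position of the head centre with the listener absent, f is frequency, and (θ, φ) are the azimuth and elevation of the source.

```latex
H_{L}(f,\theta,\varphi) = \frac{P_{L}(f,\theta,\varphi)}{P_{0}(f)},
\qquad
H_{R}(f,\theta,\varphi) = \frac{P_{R}(f,\theta,\varphi)}{P_{0}(f)}
```

The impulse responses mentioned above are the inverse Fourier transforms of H_L and H_R.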

HRTFs are fundamental to spatial audio because they carry the cues the brain uses to localize sound sources in 3D space, that is, to determine the azimuth, elevation, and distance of a sound. Sound localization relies heavily on interaural time differences (ITD) and interaural level differences (ILD). Additionally, the unique shape of the outer ear (pinna) introduces direction-dependent spectral changes that further refine localization, particularly in the vertical plane.
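To make the two interaural cues concrete, the following Python sketch estimates ITD and ILD from a single left/right impulse-response pair. It assumes NumPy is available. The function name estimate_itd_ild and the sign conventions are choices made for this example, and broadband cross-correlation is only one of several common ways to estimate ITD.

```python
import numpy as np

def estimate_itd_ild(hrir_left, hrir_right, fs):
    """Return (itd_seconds, ild_db) for one left/right impulse-response pair.

    Conventions chosen for this sketch:
    ITD > 0 means the sound reaches the left ear first.
    ILD > 0 means the left ear receives more energy.
    """
    left = np.asarray(hrir_left, dtype=float)
    right = np.asarray(hrir_right, dtype=float)

    # Full cross-correlation: the lag of the peak is the delay of the
    # right-ear response relative to the left-ear response, in samples.
    xcorr = np.correlate(right, left, mode="full")
    lags = np.arange(-(len(left) - 1), len(right))
    itd = lags[np.argmax(np.abs(xcorr))] / fs

    # Broadband level difference from the energy ratio of the two responses.
    ild = 10.0 * np.log10(np.sum(left**2) / np.sum(right**2))
    return itd, ild
```

For a source on the listener's left, this sketch would typically return a positive ITD and a positive ILD. In practice, ITD is often estimated on low-pass-filtered responses and ILD per frequency band; both refinements are omitted here.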

Modern applications of HRTFs include virtual reality, augmented reality, and binaural audio processing. Personalized HRTFs offer more accurate spatial audio experiences by taking into account the listener’s specific anatomical features.
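At its core, binaural audio processing convolves a mono signal with the left and right impulse responses for the desired source direction. The sketch below shows this under simple assumptions: NumPy is available, both impulse responses have the same length, and the source direction is fixed. The name render_binaural is hypothetical.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Filter a mono signal with one HRIR pair and return a stereo array.

    Assumes hrir_left and hrir_right have the same length, so that both
    output channels end up with the same number of samples.
    """
    left = np.convolve(mono, hrir_left)    # signal for the left ear
    right = np.convolve(mono, hrir_right)  # signal for the right ear
    # Shape (n_samples, 2): column 0 is left, column 1 is right.
    return np.stack([left, right], axis=1)
```

Real-time renderers usually use block-wise FFT convolution and interpolate between directions as the head or source moves; this sketch leaves both out.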

Methods of HRTF acquisition

The most common method for acquiring personalized HRTFs involves placing microphones inside a listener’s ear canals while playing test signals from multiple directions in an anechoic chamber. These signals, typically swept sine waves or noise bursts, are recorded at multiple speaker positions around the head to capture the 3D spatial information required for a full HRTF dataset. This approach is time-consuming and requires specialized equipment, making it impractical for measuring large numbers of listeners.
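The swept-sine step can be sketched in Python as follows. The example generates an exponential sine sweep and recovers an impulse response from a recording by regularized spectral division. This is one common technique, not necessarily the procedure used in any particular laboratory. It assumes NumPy; the function names and the regularization constant are illustrative choices.

```python
import numpy as np

def exponential_sweep(f_start, f_end, duration, fs):
    """Exponential sine sweep from f_start to f_end (Hz) over duration (s)."""
    t = np.arange(int(round(duration * fs))) / fs
    rate = np.log(f_end / f_start)
    # Phase whose derivative is 2*pi*f_start*exp(t/duration*rate).
    phase = 2 * np.pi * f_start * duration / rate * (np.exp(t / duration * rate) - 1.0)
    return np.sin(phase)

def deconvolve(recorded, excitation, ir_length, reg=1e-6):
    """Estimate an impulse response from a recording of a known excitation."""
    # Zero-pad so that the circular convolution implied by the FFT
    # matches the linear convolution of the real measurement.
    n = len(recorded) + len(excitation)
    X = np.fft.rfft(excitation, n)
    Y = np.fft.rfft(recorded, n)
    # Regularized division avoids blowing up where the sweep has little energy.
    power = np.abs(X) ** 2
    H = Y * np.conj(X) / (power + reg * np.max(power))
    return np.fft.irfft(H, n)[:ir_length]
```

In a measurement, recorded would be the in-ear microphone signal for one speaker position, and the procedure would be repeated for every direction and for both ears.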

Numerical simulations have gained traction as an alternative to physical measurements. Using finite element methods (FEM) or boundary element methods (BEM), HRTFs can be computed by simulating the propagation of sound waves around digital models of the listener’s head, torso, and pinnae. This method, though computationally intensive, provides a way to generate personalized HRTFs without acoustic measurements on the listener and is often used in applications where such measurements are not feasible. The downside is that it requires a digital model of the listener’s anatomy, which can be obtained by 3D scanning the subject’s head and torso.

In addition to these approaches, data-driven methods have been explored for personalizing HRTFs without requiring full measurements. By correlating anthropometric data (such as head and ear shape) with pre-existing HRTF databases, machine learning algorithms can estimate HRTFs with reasonable accuracy for individual users. This enables the fast and scalable generation of personal HRTFs; the remaining challenge is to match the accuracy of direct measurements.
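The simplest form of this idea is to pick the database subject whose anthropometric measurements are closest to those of the new listener and reuse that subject's HRTF set. The sketch below shows this nearest-neighbour selection, which is far simpler than the machine learning models referred to above. It assumes NumPy; the function name and the choice and number of features are hypothetical.

```python
import numpy as np

def select_nearest_subject(target_features, database_features):
    """Return (index, distances) of the closest database subject.

    target_features: 1-D array of anthropometric measurements for the listener.
    database_features: 2-D array, one row per database subject, same columns.
    """
    db = np.asarray(database_features, dtype=float)
    target = np.asarray(target_features, dtype=float)

    # Standardize each feature so that large-valued measurements
    # (e.g. head width) do not dominate small ones (e.g. concha depth).
    mean = db.mean(axis=0)
    std = db.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns

    distances = np.linalg.norm((db - mean) / std - (target - mean) / std, axis=1)
    return int(np.argmin(distances)), distances
```

The returned index would then be used to look up the matching subject's HRTFs in the database.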

Using an array of 20 MEMS speakers for HRTF acquisition

As part of the joint Sonicom project, the Acoustic Research Institute and USound are exploring a novel approach to obtaining HRTFs, with a specific focus on the influence of pinna parameters. To validate this approach, we designed and built a prototype headphone earcup. Due to the small size of the USound speakers, it was possible to equip the earcup with an array of 20 USound MEMS speakers and 24 MEMS microphones. A custom-built amplifier powers the speakers, while a preamplifier manages the microphones. Each speaker and microphone can be individually addressed, with signals transmitted and recorded via 24-channel DAC (Digital-to-Analog Converter) and ADC (Analog-to-Digital Converter) units, respectively.

In the testing process, the headphone is placed on a subject’s head, and transfer functions are recorded between each speaker and each microphone. These measurements enable an analysis of how the pinna affects the frequency response, and they can be compared to traditional HRTF recordings for the same subject. The goal is to determine whether the system is suitable for extracting pinna-related HRTF details and whether specific speaker-microphone pairs are particularly well-suited for this task.
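A possible way to organize these measurements is as a matrix of transfer functions, with one entry per speaker-microphone pair (20 × 24 for the earcup described above). The Python sketch below assumes NumPy, and it assumes that the speakers are driven one at a time with the same known excitation while all microphones record. That measurement sequence, the function name, and the regularization constant are assumptions for illustration, not a description of the actual prototype software.

```python
import numpy as np

def transfer_matrix(excitation, recordings, n_fft, reg=1e-8):
    """Estimate transfer functions for every speaker-microphone pair.

    excitation: 1-D test signal played through each speaker in turn.
    recordings: array of shape (n_speakers, n_mics, n_samples), where
                recordings[s, m] is microphone m recorded while only
                speaker s was playing.
    n_fft:      FFT length; should be at least n_samples so nothing is truncated.
    Returns a complex array of shape (n_speakers, n_mics, n_fft // 2 + 1).
    """
    X = np.fft.rfft(excitation, n_fft)
    Y = np.fft.rfft(np.asarray(recordings, dtype=float), n_fft, axis=-1)
    # Regularized spectral division; X broadcasts over speakers and microphones.
    power = np.abs(X) ** 2
    return Y * np.conj(X) / (power + reg * np.max(power))
```

Comparing the magnitude of individual entries across subjects, for example with 20 * np.log10(np.abs(H[s, m])), is one way to look for speaker-microphone pairs that are sensitive to pinna shape.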

This method shows the potential to provide a more practical alternative for measuring HRTFs and pinna parameters. The key advantage is that this approach eliminates the need for anechoic chambers and complex speaker setups typically required in conventional HRTF measurements. Additionally, the smaller form factor of the measurement system could make it suitable for mobile or on-the-go HRTF recordings.


Michele Lucchi

Michele earned his degree in electronics and sound engineering in Graz in 2021. During his studies, he worked in the HiFi speaker sector, gaining hands-on experience in audio technology. Since 2016, he has been a member of the USound team, where he specializes in developing 3D sound headphones, evaluation boards, and prototypes, with a focus on electronic hardware design.