Identification and Matching of Room Acoustics With Moving Head-Worn Microphone Arrays

Companion page with binaural audio examples for the manuscript "Identification and Matching of Room Acoustics With Moving Head-Worn Microphone Arrays."

View the project on GitHub: thomasdeppisch/room-acoustic-matching

This page presents binaural audio examples for the manuscript

T. Deppisch, S. Amengual Garí, P. Calamia, and J. Ahrens, “Identification and Matching of Room Acoustics With Moving Head-Worn Microphone Arrays”, 2026.

Abstract

Head-worn devices such as smartglasses and headsets are the predominant form factor for augmented reality and telepresence applications, where real-world environments are augmented with virtual sound sources. For these sources to appear perceptually convincing, the acoustics of their virtual environment must closely match the room acoustics of the physical space. Estimating room impulse responses (RIRs) in this setting is challenging because practical scenarios require small arrays that continuously move with the user’s head. This study presents a method for the blind identification of RIRs from speech signals captured with a moving head-worn microphone array and the subsequent rendering of virtual sound sources based on perceptually relevant acoustic parameter estimates. A motion-aware signal model that estimates spatial RIRs as sound field coefficients and incorporates position tracking data is compared against an omnidirectional model and a baseline. Numerical results show that the motion-aware model provides the most accurate acoustic parameter estimates when used in an informed setting with the true reference signal. In the blind setting, however, its advantage largely diminishes. In a listening experiment, renderings based on the omnidirectional model are rated as most similar to the reference condition and are most often associated with the correct room significantly above chance. The findings highlight the practical relevance of the proposed framework, with the omnidirectional model offering robust and perceptually convincing performance, while the motion-aware model remains promising for parameter estimation.

Examples

Below are examples from both parts of the listening experiment described in the paper.

MUSHRA

The MUSHRA-like experiment compared a Reference to a Hidden Reference, three different Estimates, and an Anchor. The Reference is spatialized at 30° azimuth, and the test conditions at −30° azimuth.

Example 1

Example 2

Example 3

Example 4

Example 5

Example 6

Example 7

Example 8

2AFC

In the 2AFC experiment, participants judged examples like the ones below in 48 trials. There are two types of trials: comparing the Reference to a Hidden Reference and a Measurement from another room, and comparing the Reference to an Estimate and a Measurement from another room. Below are 10 examples; more are available here.

Example 1: Office vs. Storage

Example 2: Office vs. Kitchen

Example 3: Storage vs. Hallway

Example 4: Kitchen vs. Office

Example 5: Hallway vs. Kitchen

Example 6: Office vs. Storage

Example 7: Storage vs. Hallway

Example 8: Kitchen vs. Office

Example 9: Hallway vs. Storage

Example 10: Hallway vs. Office