A Quality of Experience and Visual Attention Evaluation for 360° Videos with Non-spatial and Spatial Audio

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

This article presents the results of an empirical study that investigated the influence of different types of audio (spatial and non-spatial) on users' quality of experience (QoE) and visual attention in 360° videos. The study compared the head pose, eye gaze, pupil dilation, heart rate, and subjective responses of 73 users who watched ten 360° videos with different sound configurations. The configurations evaluated were no sound; non-spatial (stereo) audio; and two spatial sound conditions (first- and third-order ambisonics). The videos covered various categories and presented both indoor and outdoor scenarios. The subjective responses were analyzed using an ANOVA (Analysis of Variance) to assess mean differences between sound conditions. Data visualization was also employed to enhance the interpretability of the results. The findings reveal diverse viewing patterns, physiological responses, and subjective experiences among users watching 360° videos with different sound conditions. Spatial audio, in particular third-order ambisonics, garnered heightened attention, as evidenced by increased pupil dilation and heart rate. Furthermore, the presence of spatial audio led to more diverse head poses when sound sources were distributed across the scene. These findings have important implications for the development of effective techniques for optimizing the processing, encoding, distribution, and rendering of content in virtual reality (VR) and 360° videos with spatialized audio. These insights are also relevant to content design and enhancement: they provide guidance on how spatial audio influences user attention, physiological responses, and overall subjective experience. Understanding these dynamics can help content creators and designers craft immersive experiences that use spatialized audio to captivate users, enhance engagement, and optimize the overall quality of VR and 360° video content.
The dataset, scripts used for data collection, ffmpeg commands used for processing the videos, and the subjective questionnaire and its statistical analysis are publicly available.

Original language: English
Article number: 271
Journal: ACM Transactions on Multimedia Computing, Communications, and Applications
Volume: 20
Issue number: 9
Publication status: Published - 20 Aug 2024

Keywords

  • 360° videos
  • ambisonics
  • quality of experience
  • visual attention
