Improved Performance of Multi-Modal Audio-Visual Segmentation with Noise

Jisu Han, Jung Uk Kim

http://doi.org/10.5626/JOK.2025.52.2.101

Multi-modal object segmentation using audio and visual information is an actively studied topic in computer vision. Audio-Visual Segmentation (AVS) is a multi-modal object segmentation method that uses audio as an additional cue, so that only the objects producing sound in the visual input are segmented at the pixel level. This technology is important for applications that require accurate object recognition, such as robot recognition and autonomous driving. However, data collected from the real world can include unwanted information, and noise can also arise from mechanical defects; both can significantly degrade the performance of an AVS model. This paper confirms that adding noise to the audio and visual inputs reduces performance, which establishes the need for research on robust AVS. To address this, the study adds a network that removes noise, mitigating the performance degradation even when noise is present.


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr