441 / 2022-10-03 10:51:02
MSARN: A multi-scale attention residual network for end-to-end environmental sound classification
Environmental sound classification, Convolutional neural network, Multi-scale, End-to-end
Abstract under review
Fucai Hu / Wuhan University of Technology
Peng Song / Wuhan University of Technology
Yongsheng Yu / Wuhan University of Technology
In current end-to-end environmental sound classification models, fixed-size filters struggle to balance time and frequency resolution, and when multi-scale filters are used, the weight assigned to each scale rarely reflects its importance. This paper therefore proposes an end-to-end environmental sound classification method based on a Multi-Scale Attention Residual Network (MSARN), which combines an attention mechanism, multi-scale fusion, and a residual network structure. The attention mechanism performs a weighted fusion of features at different scales to obtain a better feature representation. Meanwhile, residual structures replace ordinary one-dimensional convolution layers, which alleviates exploding and vanishing gradients and accelerates model training. Experiments on the environmental sound datasets ESC-10, ESC-50, and UrbanSound8K show that the MSARN model achieves classification accuracies of 91.9%, 79.4%, and 95.4%, respectively, outperforming other mainstream end-to-end methods. Furthermore, the proposed method has fewer parameters than other methods in the literature, which reduces the computational cost of training.
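As a rough illustration of the idea described in the abstract, the following pure-Python sketch fuses the outputs of several one-dimensional filters of different sizes with softmax attention weights and adds a residual (skip) connection. It is not the authors' architecture: the function names (`conv1d_same`, `multi_scale_attention_block`), the kernel values, and the per-scale scores passed in as `scale_scores` are all hypothetical. In a trained network the scores would be learned or computed from the features rather than supplied by hand.

```python
import math


def conv1d_same(signal, kernel):
    """1-D cross-correlation with zero padding; output length equals input length."""
    k = len(kernel)
    pad_left = (k - 1) // 2
    padded = [0.0] * pad_left + list(signal) + [0.0] * (k - 1 - pad_left)
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_scale_attention_block(signal, kernels, scale_scores):
    """Fuse multi-scale filter outputs with attention weights, then add a skip connection.

    kernels      -- one filter per scale (hypothetical values, not from the paper)
    scale_scores -- one unnormalized attention score per scale (assumed given here)
    """
    # One branch per filter size, all applied to the same input.
    branches = [conv1d_same(signal, kernel) for kernel in kernels]
    # Attention weights: non-negative and summing to 1 across scales.
    weights = softmax(scale_scores)
    # Weighted sum of the branches at every time step.
    fused = [
        sum(w * branch[i] for w, branch in zip(weights, branches))
        for i in range(len(signal))
    ]
    # Residual connection followed by ReLU.
    output = [max(0.0, f + x) for f, x in zip(fused, signal)]
    return output, weights


# Toy waveform and three filter sizes (1, 3, and 5 taps).
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
kernels = [[1.0], [1.0 / 3.0] * 3, [0.2] * 5]
output, weights = multi_scale_attention_block(signal, kernels, [0.0, 0.0, 0.0])
```

With equal scores, every scale receives the same weight. Raising one score shifts the fusion toward that filter size, which is the kind of per-scale importance the abstract says fixed weights fail to capture.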
Important dates
  • Conference dates: November 1, 2022 to November 3, 2022
  • October 30, 2022: Draft submission deadline
  • November 9, 2022: Registration deadline

Organizer
Qingdao University of Technology