TY - JOUR
T1 - Semantic guidance incremental network for efficiency video super-resolution
AU - He, Xiaonan
AU - Xia, Yukun
AU - Qiao, Yuansong
AU - Ye, Yuhang
AU - Lee, Brian
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/7
Y1 - 2024/7
N2 - In video streaming, bandwidth constraints significantly degrade client-side video quality. Deep neural networks offer a promising way to address this by performing video super-resolution (VSR) at the user end, leveraging advances in modern hardware, including mobile devices. The principal challenge in VSR is the computational cost of processing temporal and spatial video data. Conventional methods process entire scenes uniformly, which allocates resources inefficiently: simple regions are over-processed while complex regions receive insufficient attention, producing edge artifacts where regions are merged. Our approach uses semantic segmentation and spatial-frequency-based categorization to divide each video frame into regions of varying complexity: simple, medium, and complex. These are then processed by an efficient incremental model that allocates computation according to regional complexity. A key innovation is a sparse temporal/spatial feature transformation layer, which mitigates edge artifacts and integrates regional features seamlessly, improving the naturalness of the super-resolved output. Experimental results demonstrate that our method substantially improves VSR efficiency while maintaining effectiveness. By combining semantic segmentation, spatial frequency analysis, and an incremental network structure, this approach addresses the core challenges of efficiency and quality in high-resolution video streaming.
KW - Convolutional neural network
KW - Efficiency
KW - Semantic guidance
KW - Video super-resolution
UR - http://www.scopus.com/inward/record.url?scp=85197448208&partnerID=8YFLogxK
U2 - 10.1007/s00371-024-03488-y
DO - 10.1007/s00371-024-03488-y
M3 - Article
AN - SCOPUS:85197448208
SN - 0178-2789
VL - 40
SP - 4899
EP - 4911
JO - Visual Computer
JF - Visual Computer
IS - 7
ER -