ICTACT Journals

CONTENT-AWARE NEURAL VIDEO COMPRESSION WITH SPATIALLY ADAPTIVE RATE–DISTORTION OPTIMIZATION FOR EFFICIENT HIGH-QUALITY VIDEO TRANSMISSION

ICTACT Journal on Image and Video Processing (Volume: 16, Issue: 3)

Abstract

The rapid growth of multimedia communication has sharply increased the demand for efficient video compression. Conventional video coding standards often rely on fixed or globally optimized rate–distortion strategies that adapt poorly to spatial content variations within video frames. As a result, regions with complex textures or motion frequently suffer quality degradation, while smoother areas consume more coding resources than they need. This imbalance makes it difficult to achieve high compression efficiency without sacrificing perceptual quality, so a mechanism that allocates coding resources adaptively across spatial regions remains an open research requirement. To address this limitation, this study proposes a neural compression framework termed Spatially Variable Rate–Distortion Neural Coding (SVRD-NC). The framework uses a deep neural encoder–decoder architecture that integrates spatial attention modules and adaptive rate–distortion optimization strategies. Within the architecture, a content-aware feature extractor analyzes spatial characteristics of video frames, including texture density, motion intensity, and structural complexity. These features guide a spatial weighting module that dynamically adjusts the rate–distortion trade-off for different regions of each frame. The optimization mechanism employs a learning-based distortion estimator that predicts perceptual reconstruction errors across spatial segments. This prediction enables selective bitrate allocation to visually important regions while maintaining efficient compression in smoother areas. A neural entropy model incorporated within the framework further enhances coding efficiency by modeling spatial probability distributions of the latent representations.
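The abstract does not state the loss function used by SVRD-NC. As a minimal sketch of how a spatial weighting module could adjust the rate–distortion trade-off per region, the Python below assumes the common Lagrangian form L = Σ_i (R_i + λ_i · D_i), where each region i gets its own multiplier λ_i obtained by linearly mapping an importance score in [0, 1]. The function names, the linear mapping, and the λ range are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def region_lambdas(importance: Sequence[float],
                   lam_min: float = 0.01,
                   lam_max: float = 0.1) -> List[float]:
    """Map per-region importance scores in [0, 1] to Lagrange multipliers.

    Hypothetical mapping: a higher score gives a larger lambda, so distortion
    in that region is penalized more and the optimizer spends more bits there.
    """
    lams = []
    for score in importance:
        score = min(max(score, 0.0), 1.0)  # clamp out-of-range scores
        lams.append(lam_min + score * (lam_max - lam_min))
    return lams


def spatial_rd_loss(rates: Sequence[float],
                    distortions: Sequence[float],
                    importance: Sequence[float],
                    lam_min: float = 0.01,
                    lam_max: float = 0.1) -> float:
    """Sum of R_i + lambda_i * D_i over all spatial regions (assumed form)."""
    if not (len(rates) == len(distortions) == len(importance)):
        raise ValueError("rates, distortions and importance must have equal length")
    lams = region_lambdas(importance, lam_min, lam_max)
    return sum(r + lam * d for r, d, lam in zip(rates, distortions, lams))


if __name__ == "__main__":
    # Two regions: a smooth one (importance 0.0) and a textured one (importance 1.0).
    print(spatial_rd_loss([1.0, 2.0], [10.0, 20.0], [0.0, 1.0]))
```

In a trained codec the rates and distortions would come from the entropy model and the distortion estimator; here they are plain numbers so that the weighting itself is easy to inspect.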
The framework is evaluated on widely used video datasets that cover diverse motion patterns and scene complexities. SVRD-NC achieves a maximum PSNR of 37.1 dB, compared with 34.2 dB for the Deep Convolutional Autoencoder Compression model under similar complexity conditions. It reaches an SSIM of 0.98, while the attention-based compression method achieves 0.97. The required transmission bitrate falls to 620 kbps, compared with 720 kbps for the convolutional autoencoder model. The compression ratio improves to 25.1, while the existing approaches range from 21.2 to 23.6. Reconstruction accuracy also improves: the Mean Squared Error decreases to 0.006, compared with 0.010 for the baseline compression model. These results indicate that the spatially adaptive rate–distortion mechanism improves compression efficiency while preserving the perceptual quality of reconstructed video frames.
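The reported figures can be turned into relative changes with simple arithmetic. The sketch below uses only the numbers quoted in the abstract. Note that the abstract names different baselines for different metrics, so these are per-metric comparisons rather than a single head-to-head result. The dictionary keys and the function name are illustrative.

```python
def relative_change(new: float, old: float) -> float:
    """Signed percentage change from old to new."""
    return (new - old) / old * 100.0


# (proposed, baseline) pairs exactly as quoted in the abstract.
reported = {
    "bitrate_kbps": (620.0, 720.0),
    "mse": (0.006, 0.010),
    "compression_ratio_vs_high": (25.1, 23.6),  # upper end of existing approaches
    "compression_ratio_vs_low": (25.1, 21.2),   # lower end of existing approaches
}

changes = {name: relative_change(new, old)
           for name, (new, old) in reported.items()}

# PSNR is logarithmic, so the gain is reported as a difference in dB.
psnr_gain_db = 37.1 - 34.2

if __name__ == "__main__":
    for name, pct in changes.items():
        print(f"{name}: {pct:+.1f}%")
    print(f"psnr_gain_db: {psnr_gain_db:.1f}")
```

On these numbers the bitrate is about 13.9% lower than the 720 kbps baseline, the MSE is 40% lower, and the PSNR gain is 2.9 dB.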

Authors

K. Karunambiga¹, M. Ganesha²
Karpagam Institute of Technology, India¹, Sapthagiri NPS University, India²

Keywords

Neural Video Coding, Rate–Distortion Optimization, Spatially Adaptive Compression, Deep Neural Networks, Perceptual Video Quality

Published By

ICTACT

Published In

ICTACT Journal on Image and Video Processing (Volume: 16, Issue: 3)

Date of Publication

February 2026

Pages

3790 - 3800

Doi

10.21917/ijivp.2026.0536
