ICTACT Journals

AN ENHANCED SWARM-BASED DECISION FRAMEWORK IN VISION TRANSFORMERS FOR LARGE-SCALE MULTIMEDIA STREAM PROCESSING ON CLOUD ENVIRONMENTS

ICTACT Journal on Image and Video Processing (Volume: 16, Issue: 1)

Abstract

The exponential growth of multimedia content in cloud environments has created a need for advanced real-time processing techniques. Traditional deep learning models, while powerful, often become bottlenecks when handling high-dimensional streaming data. In particular, conventional vision transformer architectures incur computational overhead and delayed decision-making when processing large-scale multimedia streams in distributed cloud systems, which degrades both latency and accuracy. This study proposes an improved swarm decision mechanism integrated into vision transformers (VT-SwarmNet) for efficient large-scale multimedia stream analysis. The approach combines swarm intelligence for dynamic token selection with transformer-based feature encoding. Data streams are pre-processed in the cloud using distributed computing, partitioned into manageable chunks, and processed in parallel. Swarm agents prioritize salient tokens, improving attention allocation and reducing redundant computation. In experiments on a large-scale multimedia dataset in a simulated cloud environment, VT-SwarmNet achieved 12.4% higher accuracy, 18.7% lower latency, and a 15.3% better F1-score than leading baseline methods. Integrating swarm-based decision-making reduced processing overhead while maintaining superior feature extraction.
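
The swarm-driven token selection described above can be illustrated with a small sketch. This is not the authors' VT-SwarmNet implementation, which the abstract does not specify. It assumes a plain particle swarm update, a saliency proxy based on each token's L2 norm, and a cosine-similarity penalty for redundant tokens. The function name `swarm_select_tokens` and all of its parameters are hypothetical.

```python
import math
import random


def swarm_select_tokens(tokens, k, n_agents=12, n_iters=25,
                        redundancy_weight=0.5, seed=0):
    """Return sorted indices of k tokens chosen by a toy particle swarm.

    Assumptions (not from the paper): saliency is the scaled L2 norm of a
    token, and redundancy is the mean pairwise cosine similarity of the
    selected tokens.
    """
    n = len(tokens)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= number of tokens")
    rng = random.Random(seed)

    norms = [math.sqrt(sum(x * x for x in t)) for t in tokens]
    max_norm = max(norms) or 1.0
    saliency = [v / max_norm for v in norms]
    unit = [[x / (nv or 1.0) for x in t] for t, nv in zip(tokens, norms)]

    def cosine(i, j):
        return sum(a * b for a, b in zip(unit[i], unit[j]))

    def top_k(position):
        # An agent's position scores every token; the k highest are kept.
        return sorted(sorted(range(n), key=position.__getitem__)[-k:])

    def fitness(position):
        idx = top_k(position)
        reward = sum(saliency[i] for i in idx) / k
        pairs = [(i, j) for a, i in enumerate(idx) for j in idx[a + 1:]]
        overlap = sum(cosine(i, j) for i, j in pairs) / len(pairs) if pairs else 0.0
        return reward - redundancy_weight * overlap

    positions = [[rng.random() for _ in range(n)] for _ in range(n_agents)]
    velocities = [[0.0] * n for _ in range(n_agents)]
    best_pos = [p[:] for p in positions]
    best_fit = [fitness(p) for p in positions]
    g = max(range(n_agents), key=best_fit.__getitem__)
    global_pos, global_fit = best_pos[g][:], best_fit[g]

    for _ in range(n_iters):
        for a in range(n_agents):
            for d in range(n):
                # Standard PSO update: inertia + personal pull + global pull.
                velocities[a][d] = (
                    0.7 * velocities[a][d]
                    + 1.5 * rng.random() * (best_pos[a][d] - positions[a][d])
                    + 1.5 * rng.random() * (global_pos[d] - positions[a][d])
                )
                positions[a][d] = min(1.0, max(0.0, positions[a][d] + velocities[a][d]))
            fit = fitness(positions[a])
            if fit > best_fit[a]:
                best_pos[a], best_fit[a] = positions[a][:], fit
                if fit > global_fit:
                    global_pos, global_fit = positions[a][:], fit

    return top_k(global_pos)


# Usage: pick 6 of 24 random 8-dimensional "tokens".
demo_rng = random.Random(1)
demo_tokens = [[demo_rng.gauss(0, 1) for _ in range(8)] for _ in range(24)]
print(swarm_select_tokens(demo_tokens, k=6))
```

In a transformer pipeline, only the returned token indices would be passed to the attention layers. That is one plausible way to read the reduction in redundant computation claimed in the abstract.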

Authors

S. Vimala¹, D.K. Mohanty², Karthikeyan Thangavel³
Prathyusha Engineering College, India¹, Government B.Ed. Training College Kalinga, India², University of Technology and Applied Sciences, The Sultanate of Oman³

Keywords

Vision Transformers, Swarm Intelligence, Multimedia Streaming, Cloud Computing, Deep Learning

Published By

ICTACT

Published In

ICTACT Journal on Image and Video Processing
(Volume: 16, Issue: 1)

Date of Publication

August 2025

Pages

3689 - 3695

DOI

10.21917/ijivp.2025.0522

