ICTACT Journals

DA-MMHAR: DOMAIN-ADAPTIVE MULTIMODAL AI MODELS FOR CROSS-USER AND CROSS-ENVIRONMENT HUMAN ACTIVITY RECOGNITION

ICTACT Journal on Image and Video Processing (Volume: 16, Issue: 4)

Abstract

Human Activity Recognition (HAR) systems perform well in controlled laboratory settings, but their accuracy often degrades significantly when they are applied to different users, devices, and environments. To address this limitation, this paper presents DA-MMHAR, a domain-adaptive multimodal artificial intelligence framework for cross-user and cross-environment HAR. The proposed framework combines visual, motion, and contextual modalities in a single end-to-end architecture. Modality-specific encoders learn complementary spatiotemporal features, which are projected into a shared latent space and fused by an attention-based mechanism that dynamically weights the contribution of each modality. To reduce the distribution shift between source and target domains, adversarial domain adaptation is used to encourage domain-invariant feature representations. In addition, a multimodal data processing pipeline coordinates the heterogeneous inputs, and consistency regularization stabilizes predictions and improves generalization. Experiments are conducted on three widely used HAR benchmarks, NTU RGB+D, UTD-MHAD, and PAMAP2, under cross-user and cross-environment evaluation settings. The results show that DA-MMHAR outperforms state-of-the-art unimodal and conventional multimodal HAR methods in recognition accuracy and robustness while maintaining comparable inference efficiency. These results indicate the potential of the proposed framework for reliable real-world HAR in dynamic and heterogeneous environments.
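The attention-based fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring function, so a scaled dot product against a single hypothetical learned vector (`query`) is assumed here, and the function names, modality keys, and toy embedding values are all invented for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(embeddings, query):
    """Fuse per-modality embeddings that already share one latent space.

    embeddings: dict mapping modality name -> list of floats (equal lengths)
    query: hypothetical learned scoring vector of the same length (assumption)
    Returns (fused_vector, weights_by_modality).
    """
    names = list(embeddings)
    dim = len(query)
    # One scaled dot-product score per modality (assumed scoring rule).
    scores = [
        sum(q * x for q, x in zip(query, embeddings[name])) / math.sqrt(dim)
        for name in names
    ]
    # Softmax turns the scores into weights that sum to 1.
    weights = softmax(scores)
    # The fused vector is the weighted sum of the modality embeddings.
    fused = [
        sum(w * embeddings[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy 4-dimensional embeddings (made-up numbers, not taken from the paper).
embeddings = {
    "visual": [0.9, 0.1, 0.0, 0.2],
    "motion": [0.2, 0.8, 0.1, 0.0],
    "context": [0.0, 0.1, 0.3, 0.1],
}
query = [1.0, 0.5, 0.0, 0.0]
fused, weights = attention_fuse(embeddings, query)
```

In this sketch the weights are recomputed for every input, so a modality whose embedding aligns poorly with `query` contributes less to the fused vector. In a trained model the scoring parameters would be learned jointly with the encoders.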

Authors

Swati Gautam, Ankush Srivastava
Ram Krishna Dharmarth Foundation University, India

Keywords

Human Activity Recognition, Multimodal Learning, Domain Adaptation, Artificial Intelligence, Cross-User Generalization, Cross-Environment Robustness

Published By

ICTACT

Published In

ICTACT Journal on Image and Video Processing (Volume: 16, Issue: 4)

Date of Publication

May 2026

Pages

3928 - 3936

DOI

10.21917/ijivp.2026.0552
