Abstract
Quantization Index Modulation (QIM) steganography in the Linear Predictive Coding (LPC) domain has emerged as an effective approach to speech steganography. It offers high imperceptibility and statistical undetectability, which in turn leads to low steganalysis detection accuracy, especially for short samples at low embedding rates. Global and local features are fundamental to VoIP steganalysis and are a well-established research focus, because they comprehensively characterize the statistical perturbations that embedding induces across four codeword correlation dimensions. However, conventional pipeline architectures neglect cross-scale and in-scale feature interactions, even though the temporal dynamics of VoIP speech exhibit distinct codeword correlation patterns at different time scales. To address these limitations, we propose a multi-scale steganalysis structure that characterizes different codeword correlation features. The framework incorporates novel Global–Local Interaction (GLI) modules, which adaptively fuse cross-scale and in-scale features to achieve multi-scale blending, and a Multi-Predictor Mixing module, which leverages the complementary predictive capabilities of hierarchical feature representations for classification. Our experiments demonstrate that the proposed model outperforms existing methods in detection accuracy, particularly for short samples at low embedding rates, while also meeting real-time processing requirements.
| Original language | English |
|---|---|
| Article number | 440 |
| Journal | Multimedia Systems |
| Volume | 31 |
| Issue | 6 |
| DOI | |
| Publication status | Published - Dec 2025 |
| Title | A multi-scale blending steganalysis model based on interactive feature extraction |