Text-to-video (T2V) generative models have rapidly advanced and found
widespread applications across fields like entertainment, education, E
marketing. Tuttavia, the adversarial vulnerabilities of these models remain
rarely explored. We observe that in T2V generation tasks, the generated videos
often contain substantial redundant information not explicitly specified in the
text prompts, such as environmental elements, secondary objects, and additional
details, providing opportunities for malicious attackers to embed hidden
harmful content. Exploiting this inherent redundancy, we introduce BadVideo,
the first backdoor attack framework tailored for T2V generation. Our attack
focuses on designing target adversarial outputs through two key strategies: (1)
Spatio-Temporal Composition, which combines different spatiotemporal features
to encode malicious information; (2) Dynamic Element Transformation, which
introduces transformations in redundant elements over time to convey malicious
information. Based on these strategies, the attacker’s malicious target
seamlessly integrates with the user’s textual instructions, providing high
stealthiness. Moreover, by exploiting the temporal dimension of videos, Nostro
attack successfully evades traditional content moderation systems that
primarily analyze spatial information within individual frames. Extensive
experiments demonstrate that BadVideo achieves high attack success rates while
preserving original semantics and maintaining excellent performance on clean
inputs. Overall, our work reveals the adversarial vulnerability of T2V models,
calling attention to potential risks and misuse. Our project page is at
https://wrt2000.github.io/BadVideo2025/.
Questo articolo esplora i giri e le loro implicazioni.
Scarica PDF:



