Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation

Embodied agents exhibit immense potential across a multitude of domains,
making the assurance of their behavioral safety a fundamental prerequisite for
their widespread deployment. However, existing research predominantly
concentrates on the security of general large language models, lacking
specialized methodologies for establishing safety benchmarks and input
moderation tailored to embodied agents. To bridge this gap, this paper
introduces a novel input moderation framework, meticulously designed to
safeguard embodied agents. This framework encompasses the entire pipeline,
including taxonomy definition, dataset curation, moderator architecture, model
training, and rigorous evaluation. Notably, we introduce EAsafetyBench, a
meticulously crafted safety benchmark engineered to facilitate both the
training and stringent assessment of moderators specifically designed for
embodied agents. Furthermore, we propose Pinpoint, an innovative
prompt-decoupled input moderation scheme that harnesses a masked attention
mechanism to effectively isolate and mitigate the influence of functional
prompts on moderation tasks. Extensive experiments conducted on diverse
benchmark datasets and models validate the feasibility and efficacy of the
proposed approach. The results demonstrate that our methodologies achieve an
impressive average detection accuracy of 94.58%, surpassing the performance of
existing state-of-the-art techniques, alongside an exceptional moderation
processing time of merely 0.002 seconds per instance.

Este artículo explora los viajes en el tiempo y sus implicaciones.

Descargar PDF:

2504.15699v1

Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation

Plataforma Online

Enlaces

Verbalus Mater

Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation

Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation

Plataforma Online

Enlaces

Verbalus Mater

Signo en

Regístrate

— PRÓXIMO CURSO ONLINE EMPIEZA EL 15 DE ENERO —

La Ciencia Real Detrás de los Viajes Temporales 25% DTO

La Ciencia Real Detrás de
los Viajes Temporales
25% DTO