In human-computer interaction, head pose estimation plays a central role in
application functionality. Facial landmarks are a valuable input for this
task, but existing landmark-based methods prioritize precision over
simplicity and model size, limiting their deployment on edge devices and in
compute-poor environments. To bridge this gap, we propose \textbf{Grouped
Attention Deep Sets (GADS)}, a novel architecture based on the Deep Set
framework. By grouping landmarks into regions and employing small Deep Set
layers, we reduce computational complexity. Our multihead attention mechanism
extracts and combines inter-group information, resulting in a model that is
$7.5\times$ smaller and executes $25\times$ faster than the current lightest
state-of-the-art model. Notably, our method achieves an impressive reduction,
being $4321\times$ smaller than the best-performing model. We introduce vanilla
GADS and Hybrid-GADS (landmarks + RGB) and evaluate our models on three
benchmark datasets — AFLW2000, BIWI, and 300W-LP. We envision our architecture
as a robust baseline for resource-constrained head pose estimation methods.
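To make the architectural pattern concrete, the following is a minimal PyTorch sketch of the grouped-Deep-Set-plus-attention idea the abstract describes. The abstract gives no implementation details, so the five-region grouping of 68 landmarks, the mean pooling, all layer widths, and the Euler-angle pose head are illustrative assumptions, not the authors' actual GADS implementation.

# Minimal sketch of the GADS idea: per-region Deep Set encoders followed by
# multi-head attention across group embeddings. All dimensions, group
# assignments, and layer choices below are illustrative assumptions.
import torch
import torch.nn as nn

class DeepSetBlock(nn.Module):
    """Permutation-invariant encoder for one landmark group: a shared
    per-landmark MLP (phi), mean pooling over the set, then rho."""
    def __init__(self, in_dim=2, hidden=32, out_dim=32):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.rho = nn.Linear(hidden, out_dim)

    def forward(self, x):                   # x: (batch, landmarks_in_group, in_dim)
        pooled = self.phi(x).mean(dim=1)    # pool over the set dimension
        return self.rho(pooled)             # (batch, out_dim)

class GADSSketch(nn.Module):
    """Groups landmarks into facial regions, encodes each region with a
    small Deep Set block, mixes the group embeddings with multi-head
    self-attention, and regresses yaw/pitch/roll (assumed output)."""
    def __init__(self, groups, dim=32, heads=4):
        super().__init__()
        self.groups = groups                # list of index tensors, one per region
        self.blocks = nn.ModuleList(DeepSetBlock(out_dim=dim) for _ in groups)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, 3)       # yaw, pitch, roll

    def forward(self, landmarks):           # landmarks: (batch, 68, 2)
        # Encode each region independently with its own small Deep Set.
        emb = torch.stack([blk(landmarks[:, idx])
                           for blk, idx in zip(self.blocks, self.groups)],
                          dim=1)            # (batch, n_groups, dim)
        # Attention is the only place inter-group information is combined.
        mixed, _ = self.attn(emb, emb, emb)
        return self.head(mixed.mean(dim=1)) # (batch, 3) predicted angles

# Hypothetical grouping of 68-point landmarks into five facial regions.
groups = [torch.arange(0, 17),    # jawline
          torch.arange(17, 27),   # eyebrows
          torch.arange(27, 36),   # nose
          torch.arange(36, 48),   # eyes
          torch.arange(48, 68)]   # mouth
model = GADSSketch(groups)
pose = model(torch.randn(8, 68, 2))        # -> (8, 3)

In this sketch, the permutation-invariant pooling inside each region keeps every per-group encoder small, and the attention step is the single point where information crosses region boundaries, which is one plausible reading of how grouping reduces computational cost relative to encoding all landmarks jointly.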