In swarm robotics, confrontation scenarios such as strategic confrontations
require efficient decision-making that integrates discrete
commands and continuous actions. Traditional task and motion planning methods
separate decision-making into two layers, but their unidirectional structure
fails to capture the interdependence between these layers, limiting
adaptability in dynamic environments. Here, we propose a novel bidirectional
approach based on hierarchical reinforcement learning, enabling dynamic
interaction between the layers. This method effectively maps commands to task
allocation and actions to path planning, while leveraging cross-training
techniques to enhance learning across the hierarchical framework. Furthermore,
we introduce a trajectory prediction model that bridges abstract task
representations with actionable planning goals. In our experiments, the method
achieves a confrontation win rate above 80% and a decision time below 0.01
seconds, outperforming existing approaches. Large-scale tests and real-world
robot experiments further demonstrate the generalization capabilities and
practical applicability of our method.
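
To make the two-layer decision structure described above concrete, the sketch below illustrates one possible decision step: a high-level policy issues discrete commands (task allocation), a trajectory-prediction module turns each command into a spatial goal, and a low-level policy produces a continuous action (path planning). This is an illustrative assumption of how such a pipeline could be wired, not the authors' implementation; all class names, signatures, and the random placeholder policies are hypothetical.

```python
import numpy as np

# Minimal sketch of a bidirectional hierarchical decision step.
# All names are illustrative placeholders; trained networks would
# replace the random/geometric stand-ins used here.

class HighLevelPolicy:
    """Maps a swarm observation to one discrete command per robot (hypothetical)."""
    def __init__(self, n_robots, n_commands):
        self.n_robots, self.n_commands = n_robots, n_commands

    def select_commands(self, observation):
        # Placeholder for a trained task-allocation policy.
        return np.random.randint(self.n_commands, size=self.n_robots)

class TrajectoryPredictor:
    """Bridges an abstract command to a concrete planning goal (hypothetical)."""
    def predict_goal(self, observation, command, robot_idx):
        # Placeholder: predict a short-horizon target position for the robot.
        return observation["positions"][robot_idx] + np.random.uniform(-1.0, 1.0, size=2)

class LowLevelPolicy:
    """Maps (observation, goal) to a continuous action, e.g. a velocity command."""
    def select_action(self, observation, goal, robot_idx):
        direction = goal - observation["positions"][robot_idx]
        norm = np.linalg.norm(direction) + 1e-8
        return direction / norm  # unit velocity toward the predicted goal

def decision_step(obs, high, predictor, low):
    """One hierarchical decision step: commands -> predicted goals -> continuous actions."""
    commands = high.select_commands(obs)
    actions = []
    for i, cmd in enumerate(commands):
        goal = predictor.predict_goal(obs, cmd, i)
        actions.append(low.select_action(obs, goal, i))
    return commands, np.stack(actions)

if __name__ == "__main__":
    n_robots = 5
    obs = {"positions": np.random.uniform(0.0, 10.0, size=(n_robots, 2))}
    cmds, acts = decision_step(obs,
                               HighLevelPolicy(n_robots, n_commands=4),
                               TrajectoryPredictor(),
                               LowLevelPolicy())
    print("commands:", cmds)
    print("actions shape:", acts.shape)
```

In this layout, the trajectory predictor is the interface that couples the layers: the high-level command conditions the predicted goal, and the resulting low-level behavior in turn shapes the observations the high-level policy sees at the next step.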