FailLite: Failure-Resilient Model Serving for Resource-Constrained Edge Environments

Model serving systems have become popular for deploying deep learning models
for various latency-sensitive inference tasks. While traditional
replication-based methods have been used for failure-resilient model serving in
the cloud, such methods are often infeasible in edge environments due to
significant resource constraints that preclude full replication. To address
this problem, this paper presents FailLite, a failure-resilient model serving
system that employs (i) heterogeneous replication, where failover models are
smaller variants of the original model, (ii) an intelligent approach that uses
warm replicas to ensure quick failover for critical applications while using
cold replicas, and (iii) progressive failover to provide low mean time to
recovery (MTTR) for the remaining applications. We implement a full prototype
of our system and demonstrate its efficacy on an experimental edge testbed. Our
results using 27 models show that FailLite can recover all failed applications
with 175.5 ms MTTR and only a 0.6% reduction in accuracy.
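
To make the warm/cold distinction concrete, the following is a minimal sketch of one way such a replica plan could be computed. It is not FailLite's actual algorithm, which the abstract does not specify. The names Variant, App, and plan_replicas are hypothetical, as are the memory and accuracy figures in the usage lines. It assumes a single pool of spare memory, gives critical applications the most accurate variant that still fits as a preloaded (warm) replica, and leaves every other application with a cold replica that would be loaded only after a failure, starting from its smallest variant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A candidate failover model (hypothetical fields)."""
    name: str
    memory_mb: int
    accuracy: float


@dataclass(frozen=True)
class App:
    """An application with its candidate failover variants (hypothetical)."""
    name: str
    critical: bool
    variants: tuple


def plan_replicas(apps, spare_memory_mb):
    """Greedy sketch: warm replicas for critical apps while memory lasts.

    Returns a dict mapping app name to ("warm" | "cold", variant name).
    This is an illustrative heuristic, not the paper's method.
    """
    plan = {}
    remaining = spare_memory_mb
    # Visit critical apps first; sorted() is stable, so input order is
    # preserved within the critical and non-critical groups.
    for app in sorted(apps, key=lambda a: not a.critical):
        fitting = [v for v in app.variants if v.memory_mb <= remaining]
        if app.critical and fitting:
            # Preload the most accurate variant that fits in spare memory.
            best = max(fitting, key=lambda v: v.accuracy)
            remaining -= best.memory_mb
            plan[app.name] = ("warm", best.name)
        else:
            # Cold replica: nothing is preloaded. After a failure, start from
            # the smallest variant so that loading is as short as possible.
            smallest = min(app.variants, key=lambda v: v.memory_mb)
            plan[app.name] = ("cold", smallest.name)
    return plan


# Usage with made-up numbers.
apps = [
    App("camera", True, (Variant("large", 400, 0.80), Variant("small", 180, 0.70))),
    App("logger", False, (Variant("large", 300, 0.90), Variant("small", 100, 0.85))),
]
print(plan_replicas(apps, spare_memory_mb=200))
```

With only 200 MB spare, the critical application gets its smaller variant as a warm replica, which is the kind of accuracy-for-memory trade that heterogeneous replication makes possible when a full copy does not fit.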


Download PDF:

2504.15856v1
