In-context learning (ICL), the ability of transformer-based models to perform new tasks from examples provided at inference time, has emerged as a hallmark of
modern language models. While recent works have investigated the mechanisms
underlying ICL, its feasibility under formal privacy constraints remains
largely unexplored. In this paper, we propose a differentially private
pretraining algorithm for linear attention heads and present the first
theoretical analysis of the privacy-accuracy trade-off for ICL in linear
regression. Our results characterize the fundamental tension between
optimization and privacy-induced noise, formally capturing behaviors observed
in private training via iterative methods. Additionally, we show that our
method is robust to adversarial perturbations of training prompts, unlike
standard ridge regression. All theoretical findings are supported by extensive
simulations across diverse settings.
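To make the setting concrete, the following is a minimal sketch of DP-SGD-style training of a single linear attention head on synthetic linear-regression prompts. The prediction rule y_hat = x_q^T W (X^T y / n), the clipping norm, the noise multiplier, and all hyperparameters are illustrative assumptions, not the paper's actual algorithm or privacy calibration.

```python
# Minimal sketch (not the paper's exact method): DP-SGD-style training of a
# single linear attention head for in-context linear regression. The
# parameterization, clipping norm, and noise scale are assumptions; a real
# deployment would also track the privacy budget (epsilon, delta).
import numpy as np

rng = np.random.default_rng(0)
d, n_ctx, n_prompts = 5, 20, 2000            # feature dim, context length, number of prompts
clip_norm, noise_mult, lr, batch = 1.0, 1.0, 0.1, 50

def sample_prompt():
    """Draw one ICL prompt: a random task w, context pairs (X, y), and a query."""
    w = rng.normal(size=d)
    X = rng.normal(size=(n_ctx, d))
    y = X @ w + 0.1 * rng.normal(size=n_ctx)
    x_q = rng.normal(size=d)
    return X, y, x_q, x_q @ w

def grad_single(W, X, y, x_q, y_q):
    """Gradient of squared loss for one prompt under the linear-attention rule."""
    h = X.T @ y / n_ctx                       # aggregated context statistic
    err = x_q @ W @ h - y_q                   # prediction residual
    return err * np.outer(x_q, h)             # d(loss)/dW up to a constant factor

W = np.zeros((d, d))
prompts = [sample_prompt() for _ in range(n_prompts)]

for step in range(200):
    idx = rng.choice(n_prompts, size=batch, replace=False)
    g_sum = np.zeros_like(W)
    for i in idx:
        g = grad_single(W, *prompts[i])
        g *= min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))   # per-prompt clipping
        g_sum += g
    # Gaussian noise scaled to the clipping norm (DP-SGD style; accounting omitted)
    noise = noise_mult * clip_norm * rng.normal(size=W.shape)
    W -= lr * (g_sum + noise) / batch

# Evaluate on fresh prompts: private training tracks the non-private optimum
# up to noise-induced error, illustrating the privacy-accuracy trade-off.
test = [sample_prompt() for _ in range(500)]
mse = np.mean([(x_q @ W @ (X.T @ y / n_ctx) - y_q) ** 2 for X, y, x_q, y_q in test])
print(f"test MSE after private training: {mse:.3f}")
```

Increasing the (assumed) noise multiplier strengthens the privacy guarantee but inflates the test error, which is the trade-off the abstract describes analytically.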