May 18 – 22, 2026
Virginia Tech
America/New_York timezone

Look-ahead mixed-precision inference of LLMs

May 19, 2026, 2:25 PM
25m
Torgersen Hall 1010

Minisymposium Talk
Numerical Linear Algebra in Machine Learning

Speaker

Stanislav Budzinskiy (University of Vienna)

Description

We address the floating-point computation of compositionally rich functions, concentrating on LLM inference. Based on a rounding error analysis of a composition, we propose an adaptive strategy that selects the components of the inner function that must be recomputed more accurately to improve numerical stability. We explain how this strategy can be applied to different compositions within a transformer neural network and illustrate its overall effect on LLM inference.
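
The following is a minimal illustrative sketch of the general idea of selective recomputation, not the speaker's algorithm. It assumes one particular composition, a softmax applied to a matrix-vector product, and a selection rule chosen for illustration: to first order, a perturbation of the inner component y_i changes the softmax output by 2 p_i (1 - p_i) in the 1-norm, so the components with the largest score p_i (1 - p_i) are recomputed in float64. The function name `lookahead_softmax_matvec`, the budget `k`, and the choice of float16 and float64 are all hypothetical.

```python
import numpy as np


def softmax(y):
    # Numerically stable softmax of a 1-D array.
    z = np.exp(y - np.max(y))
    return z / z.sum()


def lookahead_softmax_matvec(W, x, k):
    """Sketch: softmax(W @ x) with selective high-precision recomputation.

    Assumption (for illustration only): the inner function is g(x) = W @ x,
    the outer function is softmax, and k inner components are recomputed.
    """
    # Cheap pass: evaluate the inner function in float16.
    y_lo = (W.astype(np.float16) @ x.astype(np.float16)).astype(np.float64)

    # Look ahead: estimate how strongly the outer function reacts to an
    # error in each inner component. For softmax, the 1-norm of the i-th
    # Jacobian column is 2 * p_i * (1 - p_i).
    p = softmax(y_lo)
    score = p * (1.0 - p)

    # Recompute only the k most sensitive inner components in float64.
    idx = np.argsort(score)[-k:]
    y_mixed = y_lo.copy()
    y_mixed[idx] = W[idx] @ x

    return softmax(y_mixed), idx
```

With `k` equal to the number of rows of `W`, every component is recomputed and the result agrees with the float64 reference; smaller values of `k` trade accuracy for less high-precision work.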

Author

Stanislav Budzinskiy (University of Vienna)
