Cost-efficient training and checkpointing for large models on preemptible cloud VMs

Published in the 6th Workshop on Machine Learning and Systems (EuroMLSys '26), 2026

Recommended citation: Omkar Desai, Shuyi Pei, Janki Bhimani, and Bryan S. Kim. 2026. Cost-Efficient Training and Checkpointing for Large Models on Preemptible Cloud VMs. In Proceedings of the 6th Workshop on Machine Learning and Systems (EuroMLSys '26). https://doi.org/10.1145/3805621.3807617
