Publications

2025

  1. Gluon: Making Muon & Scion Great Again! (Bridging Theory and Practice of LMO-based Optimizers for LLMs)
    Artem Riabinin, Egor Shulgin, Kaja Gruntkowska, and Peter Richtárik
    arXiv preprint arXiv:2505.13416, 2025
  2. The Ball-Proximal (="Broximal") Point Method: a New Algorithm, Convergence Theory, and Applications
    Kaja Gruntkowska, Hanmin Li, Aadi Rane, and Peter Richtárik
    arXiv preprint arXiv:2502.02002, 2025

2024

  1. Tighter performance theory of FedExProx
    Wojciech Anyszka, Kaja Gruntkowska, Alexander Tyurin, and Peter Richtárik
    arXiv preprint arXiv:2410.15368, 2024
  2. Freya PAGE: First optimal time complexity for large-scale nonconvex finite-sum optimization with heterogeneous asynchronous computations
    Alexander Tyurin, Kaja Gruntkowska, and Peter Richtárik
    Advances in Neural Information Processing Systems, 2024
  3. Improving the worst-case bidirectional communication complexity for nonconvex distributed optimization under function similarity
    Kaja Gruntkowska, Alexander Tyurin, and Peter Richtárik
    Advances in Neural Information Processing Systems, 2024
  4. Communication compression for Byzantine robust learning: New efficient algorithms and improved rates
    Ahmad Rammal, Kaja Gruntkowska, Nikita Fedin, Eduard Gorbunov, and Peter Richtárik
    International Conference on Artificial Intelligence and Statistics, 2024

2023

  1. EF21-P and friends: Improved theoretical communication complexity for distributed optimization with bidirectional compression
    Kaja Gruntkowska, Alexander Tyurin, and Peter Richtárik
    International Conference on Machine Learning, 2023