Generative Modeling by Transport: Mathematical Foundations


Generative Intelligence: Bridging Two Communities

Generative AI is reshaping everything from content creation to structure-based drug design. But beyond the applications, what are the underlying principles? Generating data, whether images, text, or molecules, ultimately means sampling from a complex probability distribution. This series explores this process through the mathematical lens of transport: learning to transform a simple, known prior into a target distribution via a probability path. This single concept unifies diffusion models, flow networks, optimal transport, and Schrödinger bridges.
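The transport idea above can be made concrete with a toy sketch. This is not any particular model from the notes; it simply shows, under assumed toy distributions, how a linear probability path carries samples from a simple prior (a standard Gaussian) to a target (a two-mode mixture) as t runs from 0 to 1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simple prior: standard Gaussian. Toy "data": Gaussian mixture at +/-4.
# (Both distributions are illustrative assumptions, not from the course.)
def sample_prior(n):
    return rng.standard_normal(n)

def sample_target(n):
    centers = rng.choice([-4.0, 4.0], size=n)
    return centers + 0.5 * rng.standard_normal(n)

# A linear probability path: x_t = (1 - t) * x0 + t * x1.
# At t = 0 samples follow the prior; at t = 1 they follow the target.
def interpolate(x0, x1, t):
    return (1.0 - t) * x0 + t * x1

n = 10_000
x0, x1 = sample_prior(n), sample_target(n)
for t in (0.0, 0.5, 1.0):
    xt = interpolate(x0, x1, t)
    print(f"t={t:.1f}  mean={xt.mean():+.2f}  std={xt.std():.2f}")
```

Generative models in the transport view learn a velocity (or drift) field whose flow reproduces such a path without needing paired samples (x0, x1) at generation time; the lectures below develop exactly that machinery.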

The conceptual shift from regression and one-step generation to probability path construction is profound. Creating order from noise and entropy is fundamentally a transport problem that must respect the underlying geometry of the data. Yet critical questions remain: Are our current transport paths optimal? How do we efficiently construct and navigate these paths? And can we model a kind of "collective intelligence" by composing probability paths learned by individual experts, creating a system greater than the sum of its parts?

This material serves as a bridge between two communities. It is an invitation for mathematicians and physicists to apply stochastic processes and statistical physics to generative AI, and a call for AI researchers to embrace deeper mathematical rigor. The mathematical and physical principles underlying these systems are not just theoretical curiosities; they are the key to expanding their practical capabilities.

While implementation and architecture are crucial, this series focuses on the mathematical foundations often missing in existing literature. Mathematics provides: (1) a unifying language connecting seemingly disparate models, (2) a transparent view of the underlying assumptions and limitations, and (3) a principled foundation for designing novel algorithms.

By the end of this material, you will understand today's most advanced generative models not as isolated technologies, but as diverse instantiations of a single, elegant mathematical framework.


Lecture Notes

Introduction and Overview [pdf]
  1. Mathematical Preliminaries, Background, Motivation, and Related Work [pdf]
    Homework 1 [pdf]
  2. Stochastic Interpolant Definitions, Transport Equations, and Quadratic Objectives [pdf]
    Homework 2 [pdf]
  3. Generative Models, Likelihood Control, Density Estimation [pdf]
    Homework 3 [pdf]
  4. Instantiations of Stochastic Interpolants: Diffusive, One-sided, Mirror, and Schrödinger Bridge Interpolants [pdf]
    Homework 4 [pdf]
  5. Spatially Linear Interpolants [pdf]
    Homework 5 [pdf]
  6. Connections, Algorithms, Implementations, and Numerical Experiments [pdf]
    Homework 6 [pdf]
  7. Composing Generative Paths: Feynman-Kac Formula, Correctors, and SMC [pdf]
    Homework 7 [pdf]
  8. ACE: Path Existence, Marginal Path Collapse, Adaptive Exponents, and Generator Composition [pdf]
    Homework 8 [pdf]
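As a rough pointer to the central object of Lectures 2 and 5, one common form of a (spatially linear) stochastic interpolant in the literature is sketched below; the precise definitions, regularity conditions, and notation are those in the notes, and the coefficient names here are only an assumed convention:

$$
x_t = \alpha(t)\,x_0 + \beta(t)\,x_1 + \gamma(t)\,z, \qquad z \sim \mathcal{N}(0, \mathrm{Id}),
$$

with boundary conditions such as $\alpha(0) = \beta(1) = 1$ and $\alpha(1) = \beta(0) = \gamma(0) = \gamma(1) = 0$, so that $x_t$ interpolates between the prior ($x_0 \sim \rho_0$ at $t = 0$) and the data distribution ($x_1 \sim \rho_1$ at $t = 1$).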

References