Diffusion Models: Prerequisites, Theory, and Applications
Abstract
Diffusion models have emerged as a powerful class of generative models, achieving state-of-the-art performance on tasks such as image synthesis, text-to-image generation, and video generation. This study gives an overview of diffusion models, covering their theoretical foundations, practical implementations, and applications across domains.
Study Plan
- Stochastic processes, Brownian motion, Itô calculus
- SDEs and Fokker–Planck equations (forward & backward)
- Girsanov theorem and time reversal
- Score matching (denoising, score estimation)
- DDPM (Ho et al.), score-based models (Song et al.)
- Reverse-time SDEs, ODE formulation
- Latent diffusion, conditional generation
- Consistency models, Flow matching, stochastic interpolants
- Schrödinger bridge & optimal transport formulation
- Diffusion bridges, functional diffusion processes
- Protein folding/design (e.g., AlphaFold 3, RFdiffusion)
- DNA/RNA generation
- Text-to-image (DALL·E, Imagen, Stable Diffusion)
- Text-to-3D, multimodal alignment with LLMs
- Diffusion + LLM
- Diffusion + RL (diffusion policy)
- Research Questions
Implementations
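As a starting point for this section, here is a minimal sketch of the DDPM forward (noising) process and the simplified noise-prediction loss from the study plan. It uses only the Python standard library. The function names, the linear schedule endpoints (`1e-4` to `0.02`), and the toy data are illustrative assumptions, not a reference implementation. No denoising network is included, so a vector of zeros stands in for the network's noise prediction.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (endpoints are illustrative)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def forward_noise(x0, t, alpha_bars, rng):
    """Sample x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I).

    Returns (x_t, eps), where eps is the standard Gaussian noise that was used.
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps


def noise_prediction_loss(eps_pred, eps):
    """Mean squared error between predicted and true noise."""
    return sum((p - e) ** 2 for p, e in zip(eps_pred, eps)) / len(eps)


if __name__ == "__main__":
    rng = random.Random(0)
    betas = linear_beta_schedule(1000)
    alpha_bars = cumulative_alphas(betas)
    x0 = [1.0, -0.5, 0.25]  # toy data point
    xt, eps = forward_noise(x0, 999, alpha_bars, rng)
    # A trained network would predict eps from (xt, t); zeros stand in for it here.
    print(noise_prediction_loss([0.0] * len(eps), eps))
```

The closed-form sample of x_t directly from x_0 means training can draw a random timestep per example without simulating the whole chain. A real implementation would replace the zero vector with the output of a neural network conditioned on `xt` and `t`.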
References
- Understanding Diffusion Models: A Unified Perspective, 2022
- Stochastic Interpolants: A Unifying Framework for Flows and Diffusions, 2023
- Continuous-Time Functional Diffusion Processes, 2023
- Statistical Optimal Transport, 2024
- A Unified Approach to Analysis and Design of Denoising Markov Models, 2025