This course explores the intersection of artificial intelligence and the visual arts, focusing on AI-driven animation and video generation. Students will study key topics including Transformers, diffusion models, Diffusion Transformers (DiT), and autoregressive models. The course covers current state-of-the-art image and video generation models such as DALL-E, Stable Diffusion, Make-A-Video, Sora, Kling, Open-Sora Plan, and VideoPoet. It takes a hands-on, project-based approach in which students implement fundamental components such as LSTMs, attention mechanisms, diffusion processes, and basic transformers. By the end of the course, students will have a solid foundation in both the theoretical concepts and the practical skills needed for AI-driven media creation.