2024-25 Spring - CSIT6000S - Understanding Large Language Models

Course

Description

Large language models (LLMs) have transformed the field of natural language processing (NLP) in the last 3-4 years. The underlying foundation models form the basis of state-of-the-art systems and have become ubiquitous in solving a wide range of generative tasks, for instance in computer vision. Along with their unprecedented capabilities, these models also give rise to new ethical and scalability challenges. This course aims to cover state-of-the-art research topics, starting with technical foundations (BERT, GPT, T5 models, mixture-of-experts models, retrieval-based models), followed by emerging capabilities (knowledge, reasoning, few-shot learning, in-context learning), fine-tuning and adaptation, system design, and security and ethics. We will cover each topic and discuss important papers in depth. Students will be expected to routinely read and present research papers and to complete a research project at the end. This is an advanced graduate course: all students are expected to have taken a machine learning course, and preferably an NLP course, and to be familiar with deep learning models such as Transformers.
Course period: 1/02/25 - 30/06/25
Course level: PG
Course format: Lecture