AI Research Experiences
Harvard CS197
Take your AI skills to the next level
New! The course materials have been compiled into a Course Book.
Dive into cutting-edge development tools like PyTorch, Lightning, and Hugging Face, and streamline your workflow with VSCode, Git, and Conda. You'll learn how to harness the power of the cloud with AWS and Colab to train massive deep learning models with lightning-fast GPU acceleration. Plus, you'll master best practices for managing a large number of experiments with Weights and Biases. And that's just the beginning! This course will also teach you how to systematically read research papers, generate new ideas, and present them in slides or papers. You'll even learn valuable project management and team communication techniques used by top AI researchers. Don't miss out on this opportunity to level up your AI skills.
Instructed by Professor Pranav Rajpurkar.
Lecture Notes
Interact with language models to test their capabilities using zero-shot and few-shot learning.
Learn to build simple apps with GPT-3’s text completion and use Codex’s code generation abilities.
Learn how language models can have a pernicious tendency to reflect societal biases.
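As a rough illustration of the difference between zero-shot and few-shot prompting, the sketch below assembles both kinds of prompt as plain strings. It does not call any model or API; the function name `build_prompt`, the sentiment task, and the example reviews are hypothetical choices made for this sketch and are not taken from the lecture notes.

```python
def build_prompt(instruction, examples, query):
    """Assemble a text-completion prompt.

    An empty `examples` list yields a zero-shot prompt; one or more
    (text, label) pairs yield a few-shot prompt.
    """
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The prompt ends with an unanswered item for the model to complete.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


instruction = "Classify the sentiment of each review as Positive or Negative."
query = "The plot dragged on forever."

# Zero-shot: only the instruction and the query.
zero_shot = build_prompt(instruction, [], query)

# Few-shot: the same query preceded by two labeled demonstrations.
demonstrations = [
    ("I loved every minute.", "Positive"),
    ("A complete waste of time.", "Negative"),
]
few_shot = build_prompt(instruction, demonstrations, query)

print(few_shot)
```

Either string could then be sent to a text-completion model; comparing the completions for the two prompts is one simple way to probe how much the demonstrations help.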
Edit Python codebases effectively using the VSCode editor.
Use git and conda comfortably in your coding workflow.
Debug without print statements using breakpoints and logpoints.
Use linting to find errors and improve Python style.
Conduct a literature search to identify papers relevant to a topic of interest.
Read a machine learning research paper and summarize its contributions.
Summarize previous works in an area.
Load up and process a natural language processing dataset using the datasets library.
Tokenize a text sequence, and understand the steps used in tokenization.
Construct a dataset and training step for causal language modeling.
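The tokenization and causal language modeling objectives above can be sketched in plain Python. This is a toy under stated assumptions: it uses a whitespace tokenizer rather than a subword tokenizer, and the helper names (`tokenize`, `build_vocab`, `make_examples`) are hypothetical. The key idea it shows is that each training example pairs a block of token ids with the same block shifted by one position, so the target at every position is the next token. Some libraries perform this shift inside the model instead of in the dataset.

```python
def tokenize(text):
    """Toy tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_vocab(tokens):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def make_examples(ids, block_size):
    """Split ids into (inputs, labels) pairs where labels are inputs shifted by one."""
    examples = []
    for start in range(0, len(ids) - block_size, block_size):
        chunk = ids[start:start + block_size + 1]
        examples.append((chunk[:-1], chunk[1:]))
    return examples


text = "the cat sat on the mat and the dog sat on the rug"
tokens = tokenize(text)
vocab = build_vocab(tokens)
ids = [vocab[token] for token in tokens]

examples = make_examples(ids, block_size=4)
for inputs, labels in examples:
    print(inputs, "->", labels)
```

A training step would feed `inputs` to the model and compute a cross-entropy loss between the predicted next-token distribution at each position and the corresponding entry of `labels`.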
Interact with code to explore data loading and tokenization of images for Vision Transformers.
Parse code for PyTorch architecture and modules for building a Vision Transformer.
Get acquainted with an example training workflow with PyTorch Lightning.
Perform tensor operations in PyTorch.
Understand the backward and forward passes of a neural network in the context of Autograd.
Detect common issues in PyTorch training code.
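To make the forward and backward passes concrete, here is a minimal pure-Python sketch of reverse-mode automatic differentiation on scalars. It is not PyTorch's implementation and does not use PyTorch; the `Value` class and its attributes are hypothetical names chosen for this illustration. The forward pass computes values while recording how each result was produced, and the backward pass walks that record in reverse, applying the chain rule.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents    # Values this one was computed from
        self._grad_fns = grad_fns  # maps upstream gradient to each parent's share

    def __add__(self, other):
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        # Topologically order the graph so each node is processed
        # only after every node that depends on it.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, grad_fn in zip(node._parents, node._grad_fns):
                parent.grad += grad_fn(node.grad)  # chain rule, accumulated


# Forward pass: loss = (w * x + b) ** 2, written as y * y.
w, x, b = Value(3.0), Value(2.0), Value(1.0)
y = w * x + b
loss = y * y

# Backward pass: fills in .grad for every Value in the graph.
loss.backward()
print(loss.data, w.grad, x.grad, b.grad)
```

Note that gradients are accumulated with `+=`. The same accumulation behavior in PyTorch is why training loops normally reset gradients before each backward pass, and forgetting to do so is one of the common issues in training code.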
Experiment Organization Sparks Joy
Organizing Model Training with Weights & Biases and Hydra
Lectures 8+9 notes
Manage experiment logging and tracking through Weights & Biases.
Perform hyperparameter search with Weights & Biases Sweeps.
Manage complex configurations using Hydra.
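The idea behind a hyperparameter grid search can be sketched without any tracking library. The code below is a plain-Python illustration and does not use the Weights & Biases or Hydra APIs; `grid`, `toy_validation_loss`, and the search space values are hypothetical, and the loss function is a made-up stand-in for a real training run.

```python
import itertools


def grid(search_space):
    """Yield one config dict per combination of hyperparameter values."""
    keys = sorted(search_space)
    for values in itertools.product(*(search_space[key] for key in keys)):
        yield dict(zip(keys, values))


def toy_validation_loss(config):
    """Hypothetical stand-in for training a model and returning validation loss."""
    return (config["lr"] - 0.01) ** 2 + 0.001 * abs(config["batch_size"] - 32)


search_space = {
    "lr": [0.1, 0.01, 0.001],
    "batch_size": [16, 32, 64],
}

# One "run" per configuration; a tracking tool would log each of these.
runs = [(toy_validation_loss(config), config) for config in grid(search_space)]
best_loss, best_config = min(runs, key=lambda run: run[0])
print(best_config, best_loss)
```

A sweep tool automates this loop, typically adding random or Bayesian search strategies and recording every run's configuration and metrics so that results remain comparable.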
Identify gaps in a research paper, including in the research question, experimental setup, and findings.
Generate ideas to build on a research paper, thinking about the elements of the task of interest, evaluation strategy and the proposed method.
Iterate on your ideas to improve their quality.
Deconstruct the elements of a research paper and their sequence.
Take notes on the global and local structure of a research paper's writing.
Deep Learning on Cloud Nine
AWS EC2 for Deep Learning: Setup, Optimization, and Hands-on Training with CheXzero
Lectures 14+15 notes
Understand how to set up and connect to an AWS EC2 instance for deep learning.
Learn how to modify deep learning code for use with GPUs.
Gain hands-on experience running the model training process using a real codebase.
Create and fine-tune Stable Diffusion models using a Dreambooth template notebook.
Use AWS to accelerate the training of Stable Diffusion models with GPUs.
Work with unfamiliar codebases and use new tools, including Dreambooth, Colab, Accelerate, and Gradio, without necessarily needing a deep understanding of them.
Learn how to use update meetings and working sessions to stay aligned and make progress on a project.
Understand how to use various tools and techniques to improve team communication and project organization.
Learn strategies for organizing your efforts on a project, considering the stage of the project and the various tasks involved.
Learn how to make steady progress in research, including managing your relationship with your advisor and identifying skills to develop.
Gain a deeper understanding of how to increase the impact of your work.
Apply key principles of the assertion-evidence approach for creating effective slides for talks.
Identify common pitfalls in typical slide presentations and strategies for avoiding them.
Apply the techniques learned in this lecture to real-world examples of research talk slides to improve their effectiveness.
Understand the different statistical tests that can be used to compare machine learning models, including McNemar's test, the paired t-test, and the bootstrap method.
Be able to implement these statistical tests in Python to evaluate the performance of two models on the same test set.
Be able to select an appropriate test for a given research question, including tests for statistical superiority, non-inferiority, and equivalence.
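Two of the tests named above can be sketched with the Python standard library alone. The code below is a minimal illustration, not the lecture's implementation: it computes an exact two-sided McNemar's test from the discordant pairs and a percentile bootstrap confidence interval for the accuracy difference. The function names and the per-example correctness vectors are hypothetical, made-up data for two models evaluated on the same 12 test examples.

```python
import math
import random


def mcnemar_exact(correct_a, correct_b):
    """Exact two-sided McNemar's test on paired correctness indicators."""
    # b: A right and B wrong; c: A wrong and B right (the discordant pairs).
    b = sum(1 for a, bb in zip(correct_a, correct_b) if a and not bb)
    c = sum(1 for a, bb in zip(correct_a, correct_b) if not a and bb)
    n = b + c
    if n == 0:
        return b, c, 1.0
    # Under the null hypothesis, b follows Binomial(n, 0.5).
    tail = sum(math.comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


def paired_bootstrap_ci(correct_a, correct_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for accuracy(A) - accuracy(B)."""
    rng = random.Random(seed)
    n = len(correct_a)
    diffs = []
    for _ in range(n_resamples):
        # Resample test examples with replacement, keeping pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(sum(correct_a[i] - correct_b[i] for i in idx) / n)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up data: 1 means the model classified that test example correctly.
correct_a = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0]
correct_b = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1]

b, c, p_value = mcnemar_exact(correct_a, correct_b)
observed_diff = (sum(correct_a) - sum(correct_b)) / len(correct_a)
ci_lower, ci_upper = paired_bootstrap_ci(correct_a, correct_b)

print(f"discordant pairs: b={b}, c={c}, exact McNemar p={p_value}")
print(f"accuracy difference: {observed_diff:.3f}, 95% CI: [{ci_lower:.3f}, {ci_upper:.3f}]")
```

The confidence interval is what connects to the choice of test: superiority asks whether the interval excludes zero, while non-inferiority and equivalence compare the interval against a margin chosen in advance.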
Assignments
Assignment 1: The Language of Code
Assignment 2: First Dive in AI
Assignment 3: Torched
Assignment 4: Spark Joy
Assignment 5: Ideation and Organization
Assignment 6: Stable Diffusion and Research Operations
Final Project