![](https://images.squarespace-cdn.com/content/v1/62f94b3e3c9b14080504b6a5/f3e06481-4c04-4ad3-a4ec-c71465d84c97/psr_a_dream_of_climbing_rainbow_stairs_414f7857-dbce-410f-b0f6-79370e178a07.jpg)
AI Research Experiences
Harvard CS197
Learn to do applied deep learning research
In this course, you will learn the practical skills required for applied deep learning work, including hands-on experience with model development. You will learn the technical writing skills required for applied AI research, including experience composing different elements of a full research paper.
Instructed by Professor Pranav Rajpurkar.
Lecture Notes
Interact with language models to test their capabilities using zero-shot and few-shot learning.
Learn to build simple apps with GPT-3’s text completion and use Codex’s code generation abilities.
Learn how language models can have a pernicious tendency to reflect societal biases.
Edit Python codebases effectively using the VSCode editor.
Use git and conda comfortably in your coding workflow.
Debug without print statements using breakpoints and logpoints.
Use linting to find errors and improve Python style.
Conduct a literature search to identify papers relevant to a topic of interest.
Read a machine learning research paper and summarize its contributions.
Summarize previous works in an area.
Load up and process a natural language processing dataset using the datasets library.
Tokenize a text sequence, and understand the steps used in tokenization.
Construct a dataset and training step for causal language modeling.
Interact with code to explore data loading and tokenization of images for Vision Transformers.
Parse code for PyTorch architecture and modules for building a Vision Transformer.
Get acquainted with an example training workflow with PyTorch Lightning.
Perform Tensor operations in PyTorch.
Understand the forward and backward passes of a neural network in the context of Autograd.
Detect common issues in PyTorch training code.
Experiment Organization Sparks Joy
Organizing Model Training with Weights & Biases and Hydra
Lectures 8+9 notes
Manage experiment logging and tracking through Weights & Biases.
Perform hyperparameter search with Weights & Biases Sweeps.
Manage complex configurations using Hydra.
Identify gaps in a research paper, including in the research question, experimental setup, and findings.
Generate ideas to build on a research paper, thinking about the elements of the task of interest, evaluation strategy and the proposed method.
Iterate on your ideas to improve their quality.
Deconstruct the elements of a research paper and their sequence.
Take notes on the global and local structure of research paper writing.
Deep Learning on Cloud Nine
AWS EC2 for Deep Learning: Setup, Optimization, and Hands-on Training with CheXzero
Lectures 14+15 notes
Understand how to set up and connect to an AWS EC2 instance for deep learning.
Learn how to modify deep learning code for use with GPUs.
Gain hands-on experience running the model training process using a real codebase.
Create and fine-tune Stable Diffusion models using a Dreambooth template notebook.
Use AWS to accelerate the training of Stable Diffusion models with GPUs.
Work with unfamiliar codebases and use new tools, including Dreambooth, Colab, Accelerate, and Gradio, without necessarily needing a deep understanding of them.
Learn how to use update meetings and working sessions to stay aligned and make progress on a project.
Understand how to use various tools and techniques to improve team communication and project organization.
Learn strategies for organizing your efforts on a project, considering the stage of the project and the various tasks involved.
Learn how to make steady progress in research, including managing your relationship with your advisor and identifying skills to develop.
Gain a deeper understanding of how to increase the impact of your work.
Apply key principles of the assertion-evidence approach for creating effective slides for talks.
Identify common pitfalls in typical slide presentations and strategies for avoiding them.
Apply the techniques learned in this lecture to real-world examples of research talk slides to improve their effectiveness.
Understand the different statistical tests that can be used to compare machine learning models, including McNemar's test, the paired t-test, and the bootstrap method.
Be able to implement these statistical tests in Python to evaluate the performance of two models on the same test set.
Be able to select an appropriate test for a given research question, including tests for statistical superiority, non-inferiority, and equivalence.
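As a rough illustration of the last lecture's objectives, here is a minimal sketch of two of the named tests for comparing two classifiers on the same test set: an exact McNemar's test and a paired bootstrap confidence interval for the accuracy difference. This is not taken from the course notes. The function names `mcnemar_exact` and `paired_bootstrap_ci`, the defaults (2000 resamples, a 95% percentile interval), and the toy labels are assumptions made for illustration only.

```python
import random
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on paired predictions (hypothetical helper)."""
    # Only discordant pairs matter: cases where exactly one model is correct.
    a_only = sum(a == t and b != t for t, a, b in zip(y_true, pred_a, pred_b))
    b_only = sum(a != t and b == t for t, a, b in zip(y_true, pred_a, pred_b))
    n = a_only + b_only
    if n == 0:
        return 1.0  # the models never disagree on correctness
    # Under the null hypothesis, discordant pairs split like fair coin flips.
    k = min(a_only, b_only)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def paired_bootstrap_ci(y_true, pred_a, pred_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for accuracy(A) - accuracy(B) (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(y_true)
    # Per-example difference in correctness; resampling these keeps the pairing.
    diffs = [int(a == t) - int(b == t) for t, a, b in zip(y_true, pred_a, pred_b)]
    stats = sorted(sum(rng.choices(diffs, k=n)) / n for _ in range(n_resamples))
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    # Toy data, made up for illustration: model A is always right, B is right half the time.
    y_true = [1] * 10
    pred_a = [1] * 10
    pred_b = [0] * 5 + [1] * 5
    print("McNemar p-value:", mcnemar_exact(y_true, pred_a, pred_b))
    print("Bootstrap CI for accuracy difference:", paired_bootstrap_ci(y_true, pred_a, pred_b))
```

Both functions work on per-example correctness rather than on two separate accuracy numbers, because the models are evaluated on the same examples and the pairing carries information that an unpaired comparison would discard.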
![](https://images.squarespace-cdn.com/content/v1/62f94b3e3c9b14080504b6a5/00307e05-264c-4f7a-a971-c301d7d1e7b8/psr_a_robot_steps_forward_walking_down_a_hill_7b0f8050-eae6-46e0-a5fd-ce8395b41546.jpg)
Assignments
Assignment 1: The Language of Code
Assignment 2: First Dive in AI
Assignment 3: Torched
Assignment 4: Spark Joy
Assignment 5: Ideation and Organization
Assignment 6: Stable Diffusion and Research Operations
Final Project
FAQs
-
CS 197 is a course in applied deep learning research. In this course, you will learn the practical skills required for applied deep learning work, including hands-on experience with method development, model training at scale, error analysis, and model deployment. You will learn the technical writing skills required for applied AI research, including experience composing different elements of a full research paper. Through structured assignments, you will tackle a scoped-out research project in a small team from conception to co-authoring a manuscript.
-
After the course, you should be comfortable with the practical skills required for applied deep learning engineering and research work, including hands-on experience with method development, model training at scale, and deploying your model. Some more concrete learning objectives include:
Develop strong conceptual background in deep learning
Read and (mostly) understand new papers in ML research
Understand theory behind important model architectures and algorithms, including Transformers
Know when to use certain evaluation metrics and statistical tests
Gain skills needed to execute deep learning projects
Write your own or build upon modular research code
Use standard tooling for ML projects, including conda environments, remote clusters, Weights & Biases, etc.
Implement and perform data loading, model training, and results logging in Python/PyTorch
Solve common ML problems, e.g., how to preprocess data, handle class imbalance, and fine-tune models for downstream tasks
Present your research effectively
Release clear, accessible, and well-documented code on GitHub
Write a high-quality technical research report, with clear sections and figures
Deliver an engaging research talk
-
Enrollment in CS 197 is by application. Applications are now closed.
Diversity and inclusion
CS 197 welcomes a diversity of thoughts, perspectives, and experiences. The CS 197 teaching staff respects our students’ identities, including but not limited to race, gender, class, sexuality, socioeconomic status, religion, and ability, and we strive to create a learning environment where every student feels welcome and valued. We can only accomplish this goal with your help. If something is said in class (by anyone) or you come across instructional material that makes you uncomfortable, please talk to the instructors about it (even if anonymously).
-
If you’re not at Harvard, you can still follow this course. We will be sharing course materials online. Sign up at the end of the webpage to get updates.
Course Instruction
The course, in its first offering at Harvard and to the public, was designed by Professor Pranav Rajpurkar with Elaine Liu and Xiaoli Yang. Several members and friends of the Rajpurkar Lab contributed to an early draft of the course materials, including Lucy He, Julie Chen, Vish Rao, Jon Williams, Ryan Chi, Nathan Chi, Mark Endo, Chenwei Wu, Kathy Yu, Ryan Han, Oishi Banerjee, Sameer Khanna, Zahra Shakeri, Julian Acosta, Ethan Chi, Vignav Ramesh, and Priya Khandelwal. The Teaching Fellow for the course in Fall 2022 is Katherine Tian.