
CS 182/282A: Deep Neural Networks

Fall 2022

UC Berkeley

An advanced course on deep networks as applied to computer vision, language technology, robotics, and control. It covers the underlying themes of deep learning, families of contemporary models, and real-world applications. A strong mathematical background in calculus, linear algebra, probability, optimization, and statistical learning is required.

Course Page

Overview

Deep Networks have revolutionized computer vision, language technology, robotics, and control. They have a growing impact in many other areas of science and engineering, and increasingly, on commerce and society. They do not, however, follow any currently known compact set of theoretical principles. In Yann LeCun's words, they require "an interplay between intuitive insights, theoretical modeling, practical implementations, empirical studies, and scientific analyses." This is a fancy way of saying “we don’t understand this stuff nearly well enough, but we have no choice but to muddle through anyway.” This course attempts to cover that ground and show you how to muddle through even as we aspire to do more.

Prerequisites

This is a graduate-level/advanced undergraduate course about a particular approach to information processing using (simulated) analog circuits, where the desired circuit behavior is tuned via optimization over data, since we have no idea how to do hand-tuning at scale. Probabilistic framings are useful for understanding what is going on, as well as how we navigate certain design choices. Overall, we expect students to have a strong mathematical background in calculus, linear algebra, probability, optimization, and statistical learning. Berkeley undergraduate courses that can help build this maturity include:

  • Calculus: Math 53 (note: Math 1B or AP Math is not enough)
  • Linear Algebra and Optimization: EECS 16B and EECS 127/227A are ideal, but EECS 16B alone might be enough if students have complete mastery of that material. Math 110 is also helpful. (note: Math 54 or EECS 16A is required at a minimum, but neither is nearly enough.)
  • Probability: EECS 126, Stat 134, or Stat 140 (note: CS 70 is required at a minimum, but might not be enough for everyone)
  • Statistical Learning: CS 189/289A or Stat 154 (note: Data 102 is insufficient, even when combined with Data 100.)

Math 53, EECS 126, EECS 127, and CS 189 together form the recommended background.

Prerequisites are not enforced for enrollment, but we encourage you to consider taking some of the classes listed above and save this course for a future semester if you feel shaky on the fundamentals.

The course assumes familiarity with programming in a high-level language with data structures. Homeworks and projects will typically use Python. We encourage you to check out this tutorial if you haven’t used it before. Students who have taken Berkeley courses like CS 61A and CS 61B are well-prepared for the programming components of the class.

We do not have the staff bandwidth to help students with material that they should have understood before taking this course. If you choose to proceed with this course, you are accepting full responsibility to teach yourself anything in your background that you are missing. We will not be slowing down to accommodate you, and questions pertaining to background material will always have the lowest priority in all course forums.

Learning objectives

The goal is to teach a principled course in Deep Learning that serves the diverse needs of our students while also codifying the present understanding of the field. Topics covered may include, but are not limited to:

  • Underlying themes of deep learning, building on core machine learning concepts such as supervised vs. unsupervised learning, regression and classification, training/validation/testing splits, distribution shift, regularization, and the fundamental underlying tradeoffs;
  • Defining and training neural networks: features, computation graphs, backpropagation, iterative optimization (SGD, Newton's method, momentum, RMSProp, AdaGrad, Adam), strategies for training (explicit and implicit regularization, batch and layer normalization, weight initialization, gradient clipping, ensembles, dropout), and hyperparameter tuning (see the training-loop sketch after this list);
  • Families of contemporary models: fully connected networks, convolutional nets, graph neural nets, recurrent neural nets, and transformers;
  • Problems that utilize neural networks: computer vision, natural language processing, generative models, and others;
  • Conducting experiments in a systematic, repeatable way, and leveraging and presenting data from experiments to reason about network behavior.
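
As a concrete illustration of the "Defining and training neural networks" bullet above, the following is a minimal sketch of a training loop in NumPy: a small fully connected network fit to synthetic data with plain SGD, with the backward pass (backpropagation) written out by hand. This is an assumed example for orientation only; the architecture, learning rate, and data are illustrative choices, not anything prescribed by the course.

    # Minimal illustrative sketch (not course material): a 1 -> 32 -> 1 fully
    # connected network with a ReLU hidden layer, trained by plain SGD with
    # hand-derived backpropagation on synthetic sin(x) regression data.
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic regression data: y = sin(x) plus noise.
    X = rng.uniform(-3, 3, size=(256, 1))
    y = np.sin(X) + 0.1 * rng.normal(size=X.shape)

    # Parameters (illustrative sizes and initialization scale).
    W1 = rng.normal(scale=0.5, size=(1, 32))
    b1 = np.zeros(32)
    W2 = rng.normal(scale=0.5, size=(32, 1))
    b2 = np.zeros(1)

    lr = 1e-2  # SGD learning rate (an assumed value)

    for step in range(2000):
        # Sample a minibatch.
        idx = rng.integers(0, len(X), size=32)
        xb, yb = X[idx], y[idx]

        # Forward pass (the computation graph).
        h_pre = xb @ W1 + b1        # pre-activations
        h = np.maximum(h_pre, 0.0)  # ReLU
        pred = h @ W2 + b2          # network output
        loss = np.mean((pred - yb) ** 2)

        # Backward pass: the chain rule applied layer by layer (backpropagation).
        d_pred = 2.0 * (pred - yb) / yb.size
        dW2 = h.T @ d_pred
        db2 = d_pred.sum(axis=0)
        d_h = d_pred @ W2.T
        d_h_pre = d_h * (h_pre > 0)
        dW1 = xb.T @ d_h_pre
        db1 = d_h_pre.sum(axis=0)

        # SGD update.
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2

        if step % 500 == 0:
            print(f"step {step}: minibatch loss {loss:.4f}")

Frameworks with automatic differentiation perform the backward pass above for you; writing it out once by hand is a useful way to internalize what backpropagation actually computes.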

Textbooks and other notes

None listed.

Other courses in Deep Learning

CS 230 Deep Learning

Fall 2022

Stanford University

CSE 490 G1 / 599 G1 Introduction to Deep Learning

Autumn 2019

University of Washington

CS 330 Deep Multi-Task and Meta Learning

Fall 2022

Stanford University

Courseware availability

Scribe notes available on the course syllabus page.

No videos available.

Discussions with solutions and review sessions available on the course syllabus page.

Homeworks available on the course syllabus page.

No other materials available.

Covered concepts