Spring 2020

Carnegie Mellon University

This course provides a comprehensive introduction to deep learning, starting from foundational concepts and moving towards complex topics such as sequence-to-sequence models. Students gain hands-on experience with PyTorch and can fine-tune models through practical assignments. A basic understanding of calculus, linear algebra, and Python programming is required.

“Deep Learning” systems, typified by deep neural networks, are increasingly taking over all AI tasks, ranging from language understanding, and speech and image recognition, to machine translation, planning, and even game playing and autonomous driving. As a result, expertise in deep learning is fast changing from an esoteric desirable to a mandatory prerequisite in many advanced academic settings, and a large advantage in the industrial job market.

In this course we will learn about the basics of deep neural networks, and their applications to various AI tasks. By the end of the course, it is expected that students will have significant familiarity with the subject, and be able to apply Deep Learning to a variety of tasks. They will also be positioned to understand much of the current literature on the topic and extend their knowledge through further study.

If you are only interested in the lectures, you can watch them on the YouTube channel listed below.

The course is well rounded in terms of concepts. It helps us understand the fundamentals of Deep Learning. The course starts off gradually with MLPs and progresses to more complicated concepts such as attention and sequence-to-sequence models. We get complete hands-on experience with PyTorch, which is very important for implementing Deep Learning models. As a student, you will learn the tools required for building Deep Learning models. The homeworks usually have two components: Autolab and Kaggle. The Kaggle components allow us to explore multiple architectures and understand how to fine-tune and continuously improve models. The tasks for all the homeworks were similar, and it was interesting to learn how the same task can be solved using multiple Deep Learning approaches. Overall, by the end of this course you will be confident enough to build and tune Deep Learning models.
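As an illustration of the kind of model the course starts with, here is a minimal sketch of an MLP forward pass written in plain NumPy (this is not course-provided code; the layer sizes and function names are arbitrary choices for the example):

```python
import numpy as np

def relu(x):
    # Element-wise rectified linear unit
    return np.maximum(0.0, x)

def mlp_forward(x, params):
    """Forward pass through a two-layer MLP: affine -> ReLU -> affine."""
    W1, b1, W2, b2 = params
    h = relu(x @ W1 + b1)   # hidden activations
    return h @ W2 + b2      # output logits

rng = np.random.default_rng(0)
params = (rng.standard_normal((4, 8)), np.zeros(8),   # layer 1: 4 -> 8
          rng.standard_normal((8, 3)), np.zeros(3))   # layer 2: 8 -> 3
x = rng.standard_normal((2, 4))   # batch of 2 inputs with 4 features each
logits = mlp_forward(x, params)
print(logits.shape)  # (2, 3)
```

In PyTorch the same architecture would typically be expressed with `torch.nn.Linear` layers and trained with autograd, which the recitations cover.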

- We will be using one of several toolkits (the primary toolkit for recitations/instruction is PyTorch). The toolkits are largely programmed in Python. You will need to be able to program in at least one of these languages. Alternatively, you will be responsible for finding and learning a toolkit that requires programming in a language you are comfortable with.
- You will need familiarity with basic calculus (differentiation, chain rule), linear algebra and basic probability.
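To gauge the calculus prerequisite: back-propagation is essentially repeated application of the chain rule. A quick self-check, using an arbitrary function chosen for this example, is to verify an analytic derivative against a finite-difference approximation:

```python
import math

# f(x) = sin(x^2); by the chain rule, f'(x) = cos(x^2) * 2x
def f(x):
    return math.sin(x * x)

def f_prime(x):
    return math.cos(x * x) * 2 * x

# Central-difference numerical derivative for comparison
x, eps = 1.3, 1e-6
numeric = (f(x + eps) - f(x - eps)) / (2 * eps)
assert abs(numeric - f_prime(x)) < 1e-6
```

If this kind of derivation feels comfortable, the mathematical background for the course should be sufficient.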



Lecture slides available at Lectures

No videos available

Additional readings available at Lectures

Recitations available at Recitations

Acceleration, AdaGrad, Autoencoders, Back-propagation, Bidirectional RNNs, Boltzmann Machines, Cascade Correlation Filters, Connectionist Machines, Connectionist Temporal Classification (CTC), Convergence, Convolutional neural network (CNN), Empirical risk minimization, Generative adversarial network (GAN), Gradient descent, Hebb's learning rule, Hopfield Networks, Learning Rates, Long Short-Term Memory (LSTM), McCullough and Pitt model, Momentum, Multi-layer Perceptron, Nesterov, Neural network, Normalizing Flows, Optimization, Optimization Algorithms, Overfitting, Perceptron learning rule, RMSProp, Recurrent neural network (RNN), Regularization (mathematics), Reinforcement learning (RL), Representations, Rosenblatt's perceptron, Sequence Prediction, Sequence-to-sequence (Seq2Seq), Stochastic gradient descent (SGD), Translation Invariance, Universal Approximator, Variational autoencoder (VAE)