Spring 2020

UC Berkeley

The course addresses programming parallel computers to solve complex scientific and engineering problems. It covers an array of parallelization strategies for numerical simulation, data analysis, and machine learning, and provides experience with popular parallel programming tools.

CS267 was originally designed to teach students how to program parallel computers to efficiently solve challenging problems in science and engineering, where very fast computers are required either to perform complex simulations or to analyze enormous datasets. CS267 is intended to be useful for students from many departments and with different backgrounds, although we will assume reasonable programming skills in a conventional (non-parallel) language, as well as enough mathematical skills to understand the problems and algorithmic solutions presented. CS267 satisfies part of the course requirements for the Designated Emphasis ("graduate minor") in Computational Science and Engineering.

While this general outline remains, a large shift in the computing world began in the mid-2000s: not only are the fastest computers parallel, but nearly all computers are becoming parallel, because the physics of semiconductor manufacturing no longer lets conventional sequential processors get faster year after year, as they did for so long (roughly doubling in speed every 18 months). So any program that needs to run faster will have to become a parallel program. (It is considered very unlikely that compilers will be able to automatically find enough parallelism in most sequential programs to solve this problem.)
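As a deliberately simple illustration of what that restructuring looks like, the sketch below is our own, not from the course materials: the programmer, not the compiler, splits the work into independent chunks. (Threads are used here for brevity; for CPU-bound Python code, real speedups require processes or a compiled language.)

```python
# Hypothetical sketch: turning a sequential reduction into an explicitly
# parallel one. The decomposition into independent chunks is the
# programmer's job; compilers rarely discover it automatically.
from concurrent.futures import ThreadPoolExecutor

def sum_of_squares_seq(n):
    # Sequential version: one loop over the whole range.
    return sum(i * i for i in range(n))

def _partial(bounds):
    # Worker: sum of squares over one chunk [lo, hi).
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

def sum_of_squares_par(n, workers=4):
    # Parallel version: split [0, n) into chunks, compute each chunk
    # independently, then combine the partial results.
    step = (n + workers - 1) // workers
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return sum(ex.map(_partial, chunks))
```

The key property is that the chunks share no state, so they can run in any order or simultaneously and still combine to the sequential answer.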

Students in CS267 will get an overview of the parallel architecture space, gain experience using some of the most popular parallel programming tools, and be exposed to a number of open research questions. The lectures will also cover a broad set of parallelization strategies for applications, ranging from numerical simulation and data analysis to machine learning.
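Performance modeling is among the topics listed below; as a flavor of the kind of reasoning involved, here is a minimal roofline-style estimate (the machine numbers are invented for illustration, not taken from the course):

```python
def attainable_gflops(peak_gflops, mem_bw_gb_s, arithmetic_intensity):
    # Roofline model: a kernel is limited either by the machine's peak
    # compute rate or by memory bandwidth times its arithmetic intensity
    # (flops performed per byte moved), whichever is smaller.
    return min(peak_gflops, mem_bw_gb_s * arithmetic_intensity)

# A kernel doing 2 flops per 8 bytes moved (intensity 0.25 flop/byte) on a
# hypothetical machine with 1000 GFLOP/s peak and 100 GB/s memory bandwidth
# is bandwidth-bound at 100 * 0.25 = 25 GFLOP/s, far below peak.
```

Such back-of-the-envelope bounds tell you whether optimizing a kernel means moving less data or doing arithmetic faster.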


Lecture slides available at Home

Lecture videos available on YouTube at CS267 Spring 2020

Projects and assignments available at Project

No other materials are available.

Topics covered:

Advanced MPI
Big Data Processing
CUDA (Compute Unified Device Architecture)
Cloud Computing
Collective Communication Algorithms
Communication-avoiding matrix multiplication
Computational Biology
Data Parallel Algorithms
Deep learning
Dense Linear Algebra
Distributed Data Structures (BCL)
Distributed Memory Machines and Programming
Domain Specific Languages (Halide)
Dynamic Load Balancing
Fast Fourier transform
Graph Algorithms
Graph Partitioning
Graphics Processor (GPU)
Hierarchical Methods for the N-Body Problem
High Performance Computing (HPC)
Iterative Solvers
Machine learning
Matrix Multiplication
Memory hierarchy
Parallel Matrix Multiply
Quantum Computing
Roofline and Performance Modeling
Search algorithm
Shared Memory Parallelism
Sorting algorithm
Sources of Parallelism and Locality
Sparse-Matrix-Vector-Multiplication
Structured Grids
Supervised learning
UPC++: Partitioned Global Address Space Languages
Unsupervised learning
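Several of these topics (matrix multiplication, memory hierarchy, communication avoidance) meet in one classic example: blocking a matrix multiply so that the working set of each sub-problem fits in fast memory. A minimal sketch, with the blocking factor chosen arbitrarily for illustration:

```python
def matmul_blocked(A, B, n, b=32):
    # Computes C = A * B for n x n matrices stored as nested lists,
    # iterating over b x b blocks so that the blocks of A, B, and C
    # being touched at any moment fit in cache. The arithmetic is
    # identical to the naive triple loop; only the loop order changes.
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, b):
        for kk in range(0, n, b):
            for jj in range(0, n, b):
                for i in range(ii, min(ii + b, n)):
                    for k in range(kk, min(kk + b, n)):
                        aik = A[i][k]  # hoist the repeated A access
                        for j in range(jj, min(jj + b, n)):
                            C[i][j] += aik * B[k][j]
    return C
```

In a compiled language this blocking reduces cache misses dramatically; in pure Python the interpreter overhead dominates, so the sketch only shows the loop structure, not a realistic speedup.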