Winter 2022

University of Washington

This course dives deep into the role of probability in the realm of computer science, exploring applications such as algorithms, systems, data analysis, machine learning, and more. Prerequisites include CSE 311, MATH 126, and a grasp of calculus, linear algebra, set theory, and basic proof techniques. Concepts covered range from discrete probability to hypothesis testing and bootstrapping.

While the initial foundations of computer science began in the world of discrete mathematics (after all, modern computers are digital in nature), recent years have seen a surge in the use of probability as a tool for the analysis and development of new algorithms and systems. As a result, it is becoming increasingly important for budding computer scientists to understand probability theory, both to provide new perspectives on existing ideas and to help further advance the field in new ways.

Probability is used in a number of contexts, including analyzing the likelihood that various events will happen, better understanding the performance of algorithms (which are increasingly making use of randomness), or modeling the behavior of systems that exist in asynchronous environments ruled by uncertainty (such as requests being made to a web server). Probability provides a rich set of tools for modeling such phenomena and allowing for precise mathematical statements to be made about the performance of an algorithm or a system in such situations.

Furthermore, computers are increasingly often being used as data analysis tools to glean insights from the enormous amounts of data being gathered in a variety of fields; you’ve no doubt heard the phrase “big data” referring to this phenomenon. Probability theory and statistics are the foundational methods used for designing new algorithms to model such data, allowing, for example, a computer to make predictions about new or uncertain events. In fact, many of you have already been the users of such techniques. For example, most email systems now employ automated spam detection and filtering. Methods for being able to automatically infer whether or not an email message is spam are frequently rooted in probabilistic methods. Similarly, if you have ever seen online product recommendation (e.g., “customers who bought X are also likely to buy Y”), you’ve seen yet another application of probability in computer science. Even more subtly, answering detailed questions like how many buckets you should have in your a hash table or how many machines you should deploy in a data center (server farm) for an online application make use of probabilistic techniques to give precise formulations based on testable assumptions.

CSE 311 and MATH 126. Here is a quick rundown of some of the mathematical tools we’ll be using in this class: calculus (integration and differentiation), linear algebra (basic operations on vectors and matrices), an understanding of the basics of set theory (subsets, complements, unions, intersections, cardinality, etc.), and familiarity with basic proof techniques (including induction).

Our goal in this course is to build foundational skills and give you experience in the following areas:

**Understanding the combinatorial nature of problems**: Many real problems are based on understanding the multitude of possible outcomes that may occur, and determining which of those outcomes satisfy some criteria we care about. Such an understanding is important both for determining how ikely an outcome is, but also for understanding what factors may affect the outcome (and which of those may be in our control).**Working knowledge of probability theory and some of the key results in statistics**: Having a solid knowledge of probability theory and statistics is essential for computer scientists today. Such knowledge includes theoretical fundamentals as well as an appreciation for how that theory can be successfully applied in practice. We hope to impart both these concepts in this class.**Appreciation for probabilistic statements**: In the world around us, probabilistic statements are often made, but are easily misunderstood. For example, when a candidate in an election is said to have a 53% likelihood of winning does this mean that the candidate is likely to get 53% of the vote, or that that if 100 elections were held today, the candidate would win 53% of them? Understanding the difference between these statements requires an understanding of the model in the underlying probabilistic analysis.**Applications in machine learning and theoretical computer science**: We are not studying probability theory simply for the joy of drawing summation symbols (okay, maybe some people are, but that’s not what we’re really targeting in this class), but rather because there are a wide variety of applications where probability and statistics allow us to solve problems that might otherwise be out of reach (or would be solved more poorly without the tools that probability and statistics can bring to bear). We’ll look at examples of such applications throughout the class. For example, machine learning is a quickly growing subfield of artificial intelligence which has grown to impact many applications in computing. It focuses on analyzing large quantities of data to build models that can then be harnessed in real problems, such as filtering email, improving web search, understanding computer system performance, predicting financial markets, or analyzing DNA. Probability and statistics form the foundation of these systems. Another example application area is the use of randomized algorithms and probabilistic data structures. These usually have simpler and more elegant implementations than their deterministic counterparts, and have more efficient time and/or space complexity. We will be learning about some of these applications and you will have the opportunity to implement some of these algorithms.

- (Recommended) Alex Tsun, Probability & Statistics with Applications to Computing, 2020. Available free online here.
- (Optional) Sheldon Ross, A First Course in Probability (10th Ed.), Pearson Prentice Hall, 2018.
- (Optional) Larsen and Marx, An Introduction to Mathematical Statistics (5th edition). Prentice-Hall.
- (Optional) Dimitri P. Bertsekas and John N. Tsitsiklis, Introduction to Probability, First Edition, Athena Scientific, 2000. Available free online here.

Beta distributionBootstrappingCentral Limit TheoremChebyshev‘s inequalityConditional distributionConditional probabilityConfidence IntervalsContinuous Random VariablesConvolutionCorrelationCountingCovarianceCredible IntervalsDirichlet DistributionDiscrete ProbabilityDiscrete Time Stochastic ProcessDiscrete random variablesExpectationHypothesis TestingIndependence (probability theory)Joint Continuous DistributionsJoint Discrete DistributionsLimit TheoremsMarkov chain Monte Carlo (MCMC)Markov's inequalityMaximum Likelihood Estimation (MLE)Maximum a Posteriori EstimationMethod of Moments EstimationMoment Generating FunctionsMulti-Armed BanditsMultinomial DistributionMultivariate Normal DistributionOrder StatisticsProbability via SimulationProperties of EstimatorsPythonRandom variableThe Chernoff BoundThe Normal DistributionTransforming Random VariablesVariance