Data stream mining is the process of extracting knowledge from continuous, high-velocity streams of records. It involves predicting the class or value of new instances in the stream based on previously seen instances. Machine learning techniques are often used to automate this prediction, and ideas from incremental learning are applied to handle changes in the stream over time. Challenges in data stream mining include detecting concept drift, dealing with partially labeled data, recovering from concept drift, and handling temporal dependencies. Examples of data streams include network traffic, phone conversations, ATM transactions, web searches, and sensor data.
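As a rough illustration of the incremental-learning idea described above, the sketch below trains a classifier one mini-batch at a time using scikit-learn's `partial_fit`. The synthetic stream, the `SGDClassifier` choice, and the simulated drift point are assumptions made for demonstration only, not part of the original text.

```python
# Minimal sketch (assumes scikit-learn and NumPy are available) of incremental
# learning on a data stream, with an artificial concept drift halfway through.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
model = SGDClassifier()          # linear model that supports partial_fit
classes = np.array([0, 1])       # must be declared up front for incremental fitting

for batch in range(200):
    # Synthetic mini-batch: two Gaussian blobs whose means swap after batch 100,
    # simulating an abrupt concept drift in the stream.
    drifted = batch >= 100
    y = rng.integers(0, 2, size=50)
    centers = np.where(y == 1, 2.0, -2.0) * (-1.0 if drifted else 1.0)
    X = centers[:, None] + rng.normal(size=(50, 2))

    if batch % 50 == 0 and batch > 0:
        # Prequential ("test-then-train") evaluation: score the incoming batch
        # before the model is updated on it.
        print(f"batch {batch:3d}  accuracy on incoming batch: {model.score(X, y):.2f}")

    # Incremental update on the new batch only; earlier data is never revisited.
    model.partial_fit(X, y, classes=classes)
```

A drop in the printed accuracy right after the drift point, followed by recovery as the model keeps updating, is the kind of behavior drift-detection and recovery techniques aim to manage.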
Stanford University
Spring 2023
This course focuses on data mining and machine learning algorithms for large-scale data analysis. The emphasis is on parallel algorithms using tools such as MapReduce and Spark. Topics include frequent itemsets, locality-sensitive hashing, clustering, link analysis, and large-scale supervised machine learning. Familiarity with Java, Python, basic probability theory, linear algebra, and algorithmic analysis is required.
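Since the course description emphasizes the MapReduce-style parallelism behind Spark, a small illustrative sketch may help fix ideas. The PySpark word count below is a generic example of the map/reduce pattern, not course material; the local SparkSession setup and the sample sentences are assumptions.

```python
# Minimal sketch (assumes a local PySpark installation) of the classic
# MapReduce word-count pattern expressed with Spark RDD operations.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master("local[*]")
         .appName("wordcount-sketch")
         .getOrCreate())
sc = spark.sparkContext

lines = sc.parallelize([
    "frequent itemsets and locality sensitive hashing",
    "clustering link analysis and large scale learning",
])

counts = (
    lines.flatMap(lambda line: line.split())   # map: emit one record per word
         .map(lambda word: (word, 1))          # map: key each word with a count of 1
         .reduceByKey(lambda a, b: a + b)      # reduce: sum counts per key in parallel
)

print(counts.collect())
spark.stop()
```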