
| Date: | August 26 - December 13 |
| Location: | Online |
| Hours of Instruction: | 45 |
| Course Fee: | $1500 |
| Registration: | Register Now. Enrollment is extremely limited. |
Think BIG! Web server logs, internet clickstream data, social media activity, equipment sensors of all types… Data is now being collected everywhere at a phenomenal rate. How can such massive quantities of data be sorted, analyzed, and turned into something useful? If your company intends to make the best use of data to gain the competitive edge, this course is a “must have.”
Course topics include:
One example is correlation (or regression) analysis based on calibration: a regression algorithm that processes one stream of data in real time is itself continuously updated by a second stream of “calibration” data.
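As a rough illustration of the calibration idea, the sketch below keeps a simple linear regression up to date from running sums, so each calibration pair can be absorbed and then discarded. The class and method names (`OnlineLinearRegression`, `calibrate`, `predict`) are hypothetical, and ordinary least squares on a single variable is assumed only for brevity; the course may use a different formulation.

```python
class OnlineLinearRegression:
    """Simple linear regression y ~ slope*x + intercept kept as running sums.

    Hypothetical illustration: names and the single-variable model are
    assumptions, not taken from the course material.
    """

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def calibrate(self, x, y):
        """Absorb one calibration pair without storing it."""
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def coefficients(self):
        """Return (slope, intercept) from the calibration data seen so far."""
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two calibration points with distinct x")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return slope, intercept

    def predict(self, x):
        """Process one value from the data stream with the current fit."""
        slope, intercept = self.coefficients()
        return slope * x + intercept


model = OnlineLinearRegression()
for x, y in [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]:  # calibration stream
    model.calibrate(x, y)
estimate = model.predict(10.0)  # data stream value, using the latest calibration
```

Calls to `calibrate` and `predict` can be interleaved in any order, which is what lets the two streams arrive independently.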
Consider the problems:
Standard approaches record a complete signal and then apply a processing algorithm to it. Although this can provide the most accurate processing, it cannot be done in real time, since it requires recording the full signal (or big chunks of it). When time is critical, processing can instead be performed as the data arrives, using a “sliding window.” This approach does not require storing the source signal and provides a feasible balance between processing quality, complexity, and delay.
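A minimal sketch of sliding-window processing, assuming the "processing" is just a moving average: only the last `window` samples are kept, and one result is emitted per incoming sample. The function name `sliding_mean` is hypothetical.

```python
from collections import deque


def sliding_mean(stream, window):
    """Yield the mean of the most recent `window` samples as each one arrives."""
    buf = deque(maxlen=window)  # holds at most `window` samples, never the full signal
    total = 0.0
    for value in stream:
        if len(buf) == window:
            total -= buf[0]     # the oldest sample is about to be pushed out
        buf.append(value)
        total += value
        yield total / len(buf)


smoothed = list(sliding_mean([1, 2, 3, 4, 5], window=3))
# smoothed == [1.0, 1.5, 2.0, 3.0, 4.0]
```

A wider window generally smooths more but responds later to changes, which is the quality-versus-delay trade-off mentioned above.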
Contemporary algorithmic thinking is tightly linked to programming languages. At the same time, almost all algorithmic languages used for applied problems explicitly determine the order of operations. As a result, even if a computer can handle multiple tasks at once, a classical algorithmic language does not allow parallelism to be used implicitly. To overcome that, the programmer must explicitly specify which parts of the code can be executed in parallel. A radical approach to the problem is to move toward algorithmic languages that do not specify the order of operations.
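To make the contrast concrete, here is a small sketch in Python, chosen only for illustration. The sequential version fixes the order of operations through its loop; the second version uses the standard `concurrent.futures` module, where the programmer explicitly declares that the per-chunk computations are independent. The function names and the "energy" computation are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_energy(chunk):
    """Sum of squares of one chunk; independent of every other chunk."""
    return sum(v * v for v in chunk)


def total_energy_sequential(chunks):
    """The loop dictates an order: chunk 0, then chunk 1, and so on."""
    total = 0
    for chunk in chunks:
        total += chunk_energy(chunk)
    return total


def total_energy_parallel(chunks, workers=4):
    """The programmer states explicitly that chunks may be processed concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(chunk_energy, chunks))


chunks = [[1, 2], [3], [4, 5, 6]]
same_result = total_energy_sequential(chunks) == total_energy_parallel(chunks)
```

In standard CPython builds, threads may not speed up CPU-bound work like this; the point of the sketch is only that the parallel structure has to be spelled out by hand rather than inferred from the program.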
For example, consider the information system of a modern aircraft. It has thousands of sensors that continuously provide streams of readings. The system must recognize, in real time, “pathological” patterns in how combinations of readings evolve, identify a malfunctioning device or circuit, and initiate appropriate compensation mechanisms.
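A heavily simplified sketch of the detection step, assuming each sensor is monitored on its own and a reading counts as "pathological" when it lies more than a few standard deviations from that sensor's history. Real systems would look at combinations of sensors over time; this only shows that a baseline can be maintained in one pass (Welford's online algorithm) without storing past readings. The sensor id `"egt"`, the threshold, and the warm-up length are hypothetical.

```python
import math
from collections import defaultdict


class RunningStats:
    """Online mean and variance (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def flag_outliers(readings, threshold=3.0, warmup=5):
    """Yield (sensor_id, value) for readings far from that sensor's baseline."""
    stats = defaultdict(RunningStats)
    for sensor_id, value in readings:
        s = stats[sensor_id]
        spread = s.std()
        if s.n >= warmup and spread > 0 and abs(value - s.mean) > threshold * spread:
            yield sensor_id, value  # flagged readings are kept out of the baseline
        else:
            s.update(value)


values = [500, 501, 499, 500, 502, 498, 650, 500]
flagged = list(flag_outliers(("egt", v) for v in values))
# flagged == [("egt", 650)]
```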
Peter Golubtsov is a Professor in the Division of Mathematics, Physics Department, Lomonosov Moscow State University, Moscow, Russia. His areas of research include information science, decision making theory, game theory, probability and statistics, fuzzy sets and systems, mathematical physics, computer simulation, algebra, and category theory. At the center of his studies is the concept of informativeness for information transformers, which generalizes the notion of sufficiency in statistics.
Big Data Analytics, which covers the basics of big data, will be offered again beginning in January 2014.
Statistical, Dynamical and Computational Modeling will be offered soon. This is an interdisciplinary course on the integration of statistical and dynamical models, with applications to biological problems. Topics include linear and nonlinear model estimation, systems of ordinary differential equations, numerical integration, bootstrapping, and MCMC methods. The course is intended for students of mathematics and the natural sciences.