Probability and Statistics

Probability theory is one of the more recently established fields of mathematics and is currently a very active area of research. It has found numerous applications in mathematical modeling for computer science, finance, statistical mechanics, dynamical systems, bioinformatics, and other fields. Topics in probability theory include martingale theory, stochastic processes, stochastic calculus, ergodicity, and others. Probability theory has drawn on many other areas of mathematics, such as measure theory, integration theory, representation theory, and real and complex analysis, and in return has provided interesting insights into algebraic structures, the geometry of graphs, differential geometry, ergodic theory, PDE theory, and sampling algorithms.

Probability Theory Courses

Machine learning (ML) is a sub-field of artificial intelligence that lies at the intersection of computer science, statistics, and probability theory. In broad terms, it concerns the design, implementation, analysis, and application of algorithms that can ‘learn from experience.’ In the last few decades it has advanced extremely rapidly and found applications in a wide range of problems, including the analysis of language (“natural language processing,” or NLP) and linguistics, the modeling of genetic variation, quantitative modeling in the social sciences, spam filtering, automated detection of identity theft, and many more.

While ML is a complex and wide field, it can be roughly split into two parts: (1) prediction (or, more correctly, “supervised” ML); and (2) modeling (or, more correctly, “unsupervised” ML). Prediction problems require that one predict the value of some variable in the future given many observations of it in the past; forecasting the value of a stock given its value (and market conditions) over the course of a year is an example. One way to think about modeling problems is as prediction problems in which the variable to be predicted has never been observed; forecasting the value of a stock just from market conditions, without ever seeing its previous values, would be an example of a modeling problem. In general, unsupervised problems concern the discovery and exploitation of hidden patterns in data: the canonical example is the problem of grouping similar images in a large database (“clustering”).
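To make the split concrete, here is a minimal sketch in Python with NumPy; the data is synthetic and every variable name is illustrative, so this is a toy example rather than a recipe from any particular course or method. The first half fits a least-squares predictor to “past” observations (supervised); the second half runs a few iterations of k-means clustering, in which no labels are ever observed (unsupervised).

    import numpy as np

    rng = np.random.default_rng(0)

    # Supervised ("prediction"): learn to predict a stock's value from
    # market conditions, using days on which both were observed.
    X = rng.normal(size=(100, 3))                     # 100 days, 3 market indicators
    true_w = np.array([1.5, -2.0, 0.5])               # unknown in practice
    y = X @ true_w + rng.normal(scale=0.1, size=100)  # observed stock values

    w, *_ = np.linalg.lstsq(X, y, rcond=None)         # least-squares fit
    x_new = rng.normal(size=3)                        # a new day's conditions
    print("predicted value:", x_new @ w)

    # Unsupervised ("modeling"): no labels at all; discover hidden structure.
    # Two noisy blobs of 2-D points, clustered by a few k-means iterations.
    pts = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2))
                     for c in ([0.0, 0.0], [3.0, 3.0])])
    centers = np.stack([pts[0], pts[-1]])             # crude initialization
    for _ in range(10):
        labels = ((pts[:, None, :] - centers) ** 2).sum(-1).argmin(1)
        centers = np.array([pts[labels == k].mean(0) for k in range(2)])
    print("cluster centers:\n", centers)

In practice one would reach for a library such as scikit-learn rather than hand-rolled k-means; the point here is only the presence or absence of labels.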

From a math major’s perspective, either or both of the theoretical and applications-oriented sides of ML might be interesting. Theoretical problems can involve the analysis of specific prediction algorithms, from either a computer science or a statistical perspective; the analysis and design of methods of “inference,” the key algorithmic step in modeling that estimates important unobserved quantities from observed data; or the connection of machine learning problems with ideas from information theory, probability theory, and other theoretical (and sometimes applied) domains. Applications-oriented problems, a catch-all category that includes any problem that ultimately concerns the analysis of real data, show a similar variety: the deployment of existing methods on interesting data analysis problems; the design of novel methodologies that allow for new kinds of analyses; the use of unsupervised methods to solve traditionally supervised problems; and a range of other research activities. ML research enjoys a rich and fruitful interplay between theory and practice, so theoretical projects often include an empirical component, and applied projects often require understanding the relevant theory, and sometimes extending it.
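As one toy-scale illustration of “inference” (a sketch under simple assumptions, not a description of any particular research method): suppose we observe noisy measurements of an unobserved quantity and model them as Gaussian with unknown mean and variance. Maximum-likelihood inference then reduces to two lines of NumPy.

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical setting: 1000 noisy readings of an unobserved quantity mu.
    # Model: x_i ~ Normal(mu, sigma^2), with mu and sigma^2 both unknown.
    x = rng.normal(loc=4.2, scale=0.7, size=1000)   # observed data

    # For a Gaussian, the maximum-likelihood estimate of mu is the sample
    # mean, and the MLE of sigma^2 is the (biased) sample variance.
    mu_hat = x.mean()
    sigma2_hat = ((x - mu_hat) ** 2).mean()
    print(f"mu_hat = {mu_hat:.3f}, sigma2_hat = {sigma2_hat:.3f}")

Real modeling problems involve far richer unobserved structure (cluster assignments, topic proportions, hidden states), but the shape of the task is the same: estimate what was never observed from what was.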

Machine Learning at Princeton

Machine Learning Courses

Contacts