Mathematics for Data Science and ML/AI


Data Science, Machine Learning, and Artificial Intelligence rely heavily on modern mathematics. The key mathematical prerequisites for understanding these disciplines are

  • probability theory
  • linear algebra
  • optimisation theory

The first day of the AI @ Oxford School is dedicated specifically to these mathematical prerequisites and can be taken separately from the rest of the School.

More often than not, academics and practitioners are introduced to the key mathematical ideas in the highly complex setting of advanced measure theory and Hilbert spaces.

Your course will introduce all key ideas, including the more advanced ones, in an elementary, intuitive setting, enabling you to proceed to world-class, novel Data Science in as little time as possible.

The material is supported by a wealth of examples and tutorials.


Paul Alexander Bilokon, PhD


The Christ Church (Ædes Christi) college of the University of Oxford.


08:30 – 09:00  Registration and welcome
09:00 – 10:00  Lecture 1: Introduction to data science
10:00 – 10:30  Tutorial 1
10:30 – 11:00  Coffee break
11:00 – 12:00  Lecture 2: Probability theory
12:00 – 12:30  Tutorial 2
12:30 – 13:30  Lunch
13:30 – 14:30  Lecture 3: Linear algebra
14:30 – 15:00  Tutorial 3
15:00 – 15:30  Coffee break
15:30 – 16:30  Lecture 4: Optimisation theory
16:30 – 17:00  Tutorial 4
17:00 – 18:00  Lab
18:00 – 19:30  Tour of Christ Church
19:30 – 21:00  Dinner at the Dining Hall


  • Introduction to data science
    • Data, information, knowledge, understanding, wisdom
    • Analysis and synthesis
    • Data analysis and data science
    • The process of data science
    • Artificial Intelligence and Machine Learning
    • The language of Machine Learning
    • Machine Learning and statistics
  • Probability theory
    • Random experiment and the sample space
    • The classical interpretation of probability
    • The frequentist interpretation of probability
    • The Bayesian interpretation of probability
    • The axiomatic interpretation of probability
    • Kolmogorov’s axiomatisation
    • Conditional probability
    • The law of total probability
    • Bayes’s theorem
    • Random variables
    • Expectations
    • Variances
    • Covariances and correlations
  • Linear algebra
    • Vectors and matrices
    • Matrix multiplication
    • Inverse matrices
    • Independence, basis, and dimension
    • The four fundamental spaces
    • Orthogonal vectors
    • Eigenvalues and eigenvectors
  • Optimisation theory
    • The optimisation problem
    • Optimisation in one dimension
    • Optimisation in multiple dimensions
    • Grid search
    • Gradient-based optimisation
    • Vector calculus
    • Quasi-Newton methods
    • Gradient descent (stochastic, batch)
    • Evolutionary optimisation
    • Optimisation in practice


  • Murray R. Spiegel, John Schiller, R. Alu Srinivasan. Schaum’s Outlines: Probability and Statistics, second edition. McGraw-Hill, 2000.
  • John B. Fraleigh, Raymond A. Beauregard. Linear Algebra, third edition. Addison-Wesley, 1995.
  • Gerard Cornuejols, Reha Tütüncü. Optimization Methods in Finance. Cambridge University Press, 2007.
  • Philip E. Gill, Walter Murray, Margaret H. Wright. Practical Optimization. Emerald Group Publishing Limited, 1982.