This course, given in English, is part of the MMMEF master. It starts with an introduction to the Big Data phenomenon and then focuses on the predictive methods of data science (a.k.a. machine learning methods).

Lecture notes/slides

Please note that the R scripts below have been extracted automatically from the knitr sources of the slides. They must be adapted to run properly: paths to data files must be modified and the opt_chunk related code must be removed. The code is developed under GNU/Linux and frequently uses the doMC package, which is not available under MS Windows. On Windows, doMC should be replaced by the doParallel package (and the code adapted accordingly).
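As a rough illustration of the doMC to doParallel adaptation mentioned above, here is a minimal sketch. It assumes the scripts register doMC with registerDoMC and then run foreach loops with %dopar%; the worker count and the toy loop are hypothetical and are not taken from the course scripts.

```r
# Sketch only: replacing a doMC backend with doParallel.
library(foreach)
library(doParallel)

# What a GNU/Linux-only script might contain (assumed, shown for comparison):
# library(doMC)
# registerDoMC(cores = 2)

# Portable replacement: create an explicit cluster and register it.
n_workers <- 2  # hypothetical value, adjust to the machine
cl <- parallel::makeCluster(n_workers)
registerDoParallel(cl)

# Toy parallel loop standing in for, e.g., a resampling loop.
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}
print(squares)

# Unlike the forking approach of doMC, the cluster should be stopped explicitly.
parallel::stopCluster(cl)
```

One likely source of further adaptation: doMC forks the R session, so workers see the packages already loaded, whereas cluster workers started this way are fresh R processes. Packages used inside the loop body generally have to be listed through the .packages argument of foreach.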


  • lecture notes for the introductory course: in English
  • slides on an introduction to machine learning: in English (R code)
  • a short introduction to computational complexity: in English

Models and tools


  • slides on empirical risk minimization: in English
  • slides on regularization and capacity control: in English
  • a more advanced and more thorough presentation of the same concepts is available in my slides on learning theory: in French and in English


Recommended reading/viewing

General papers

Relational databases (and SQL)

Machine Learning

Full course

Tom Mitchell's and Nina Balcan's machine learning course:

Selected topics