Python for Data Science and ML/AI

Programming

Overview

Python is the de facto lingua franca of data science, machine learning, and artificial intelligence. Familiarity with Python is a must for modern data scientists.

Your course is designed to take you from the very foundations to state-of-the-art use of modern Python libraries.

You will learn the fundamentals of the Python programming language, work with Jupyter notebooks, move on to advanced Python language features and distributed task queues (Celery), learn to work with data using NumPy, SciPy, Matplotlib, and Pandas, examine state-of-the-art machine learning libraries (Scikit-Learn, Keras, TensorFlow, and Theano), and complete a realistic data science lab.

Instructors

  • Paul Alexander Bilokon, PhD

Venue

This training takes place on Level39.

Schedule

Time            Activity
08:30 – 09:30   Registration and welcome, a tour of Level39
09:00 – 10:00   Lecture 1: The fundamentals of the Python programming language and Jupyter notebooks
10:00 – 10:30   Tutorial 1
10:30 – 11:00   Coffee break
11:00 – 12:00   Lecture 2: Advanced Python features; algorithmic complexity; distributed computing; sieve of Eratosthenes; applications to cryptocurrencies and Blockchain
12:00 – 12:30   Tutorial 2
12:30 – 13:30   Lunch
13:30 – 14:30   Lecture 3: Python libraries for working with data: NumPy, SciPy, Matplotlib, and Pandas
14:30 – 15:00   Tutorial 3
15:00 – 15:30   Coffee break
15:30 – 16:30   Lecture 4: Machine Learning with Scikit-Learn; Deep Learning with Keras
16:30 – 17:00   Tutorial 4
17:00 – 18:00   Lab

Syllabus

  • The fundamentals of the Python programming language and Jupyter notebooks
    • Jupyter notebooks
    • The Python syntax
    • Data types, duck typing
    • Data structures: lists, sets, and dictionaries
  • Advanced Python features; distributed task queues with Celery
    • List comprehensions
    • Lambdas
    • Objects
    • The Global Interpreter Lock (GIL)
    • Multithreading and multiprocessing
    • Distributed task queues with Celery
  • Python libraries for working with data: NumPy, SciPy, Matplotlib, and Pandas
    • Multidimensional arrays in NumPy
    • Linear algebra and optimisation with SciPy
    • Data visualisation in Matplotlib
    • Time series data
    • Dealing with Pandas DataFrames
  • Machine Learning with Scikit-Learn; Deep Learning with Keras, TensorFlow, and Theano
    • Overview of machine learning
    • Introduction to Scikit-Learn
    • Keras and TensorFlow
    • Introduction to Theano
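
As a taste of the Lecture 2 material, here is a minimal sketch of the sieve of Eratosthenes that also uses a list comprehension. It is an illustration only, not the course's own code; the function name primes_up_to is a hypothetical choice made for this example.

```python
def primes_up_to(n):
    """Return a list of all primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    # is_prime[i] stays True until i is found to be a multiple of a smaller prime.
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Start at p * p: smaller multiples were already crossed out.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    # List comprehension: collect the indices still marked as prime.
    return [i for i, flag in enumerate(is_prime) if flag]


print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The sieve runs in O(n log log n) time, which makes it a convenient example for discussing algorithmic complexity.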

Bibliography

Your course is designed to be self-contained. However, if you would like to read up on its contents before, during, or after the course, we recommend the following books:

  • David Beazley, Brian K. Jones. Python Cookbook, third edition. O’Reilly, 2013.
  • Yves Hilpisch. Python for Finance: Mastering Data-Driven Finance, second edition. O’Reilly, 2018.
  • Wes McKinney. Python for Data Analysis, second edition. O’Reilly, 2017.
  • Aurélien Géron. Hands-On Machine Learning with Scikit-Learn and TensorFlow. O’Reilly, 2017.
  • François Chollet. Deep Learning with Python. Manning Publications, 2018.