Introduction to Data Analytics and Machine Learning with Python Short Course

Covid-19 update: The learning doesn't have to stop, so join our online community. We will be delivering courses remotely until further notice. Live tutor support and virtual lessons will take place during advertised teaching hours. The classes are taught in small groups, so you'll get lots of support from your tutor.

Allowing us to make sense of big data, Python is the future when it comes to data analytics.

Why choose this course?

The very popular Introduction to Data Analytics and Machine Learning with Python 3 short course has been designed to open the vast world of data analytics and machine learning to non-technical people without prior experience of the field, using the Python programming language.

Python 3 is the latest major version of the Python language, so the tools and techniques we teach in this course are all taught in Python 3.

As this is an introductory data analytics course you are not expected to have any data analytics or machine learning experience. Pre-requisites are successful completion of Introduction to programming with Python or knowledge of topics therein and knowledge of mathematical concepts such as those presented in the website (

The Introduction to Data Analytics and Machine Learning with Python short course is taught over 10 weeks in the evenings, allowing you to continue with full-time employment. Studying one of our short courses is a fantastic way to learn new skills and can be used as a great way to further your career.

Course overview

For students who already have a sound working knowledge of Python

You will learn the state of the art in data analytics and machine learning using the most widely used Python libraries, which are open source and relied on across industry and research.

As the fields of data analytics and machine learning are both vast and fast-expanding, we will focus our efforts on grasping the foundations. These foundations could enable you to get a junior position as a data analyst and/or machine learning engineer.

Libraries that will be taught in this course:

  • Jupyter Notebook
  • NumPy
  • SciPy
  • matplotlib
  • pandas
  • Scikit-learn

What will I learn?

  • Jupyter Notebook: a quick tour of the data engineer's IDE of choice.
  • Introduction to NumPy: N-dimensional arrays, broadcasting functions, linear algebra abstractions and random number generators.
  • Exploratory data analysis with pandas: loading, storing, cleaning, transforming, merging and reshaping data.
  • Visualising and plotting with matplotlib: generating plots, histograms, power spectra, bar charts, error charts and scatter plots to visualise and understand different types of data.
  • Introduction to SciPy with statistics: a quick introduction to the scipy.stats package, covering distributions, fitting distributions and random numbers.
  • Introduction to machine learning concepts with scikit-learn: training and evaluating learning algorithms, including decision trees, perceptrons, support vector machines and neural networks.
  • Delving deeper into scikit-learn: data validation and cross-validation, plus other methods to improve the accuracy of your learning algorithms.
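
As a rough illustration of the last two topics, the sketch below trains a decision tree with scikit-learn and compares a held-out test score with a cross-validated score. The iris dataset, the tree depth, the number of folds and the helper name evaluate_decision_tree are assumptions chosen for this example; they are not taken from the course materials.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier


def evaluate_decision_tree(max_depth=3, folds=5, seed=0):
    """Return (held-out accuracy, mean cross-validated accuracy)."""
    # Illustrative dataset only: 150 iris flowers, 4 features, 3 classes.
    X, y = load_iris(return_X_y=True)

    # Hold out a quarter of the samples to estimate performance on unseen data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )

    model = DecisionTreeClassifier(max_depth=max_depth, random_state=seed)
    model.fit(X_train, y_train)
    test_accuracy = model.score(X_test, y_test)

    # Cross-validation: repeatedly fit on folds-1 parts of the training data
    # and score on the remaining part, then average the scores.
    cv_scores = cross_val_score(
        DecisionTreeClassifier(max_depth=max_depth, random_state=seed),
        X_train,
        y_train,
        cv=folds,
    )
    return test_accuracy, cv_scores.mean()


if __name__ == "__main__":
    test_acc, cv_acc = evaluate_decision_tree()
    print(f"held-out accuracy: {test_acc:.2f}, mean CV accuracy: {cv_acc:.2f}")
```

A large gap between the two numbers is usually a hint that the model is over-fitting or that the test split is unrepresentative.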

Information about the libraries taught in this course:

  • Jupyter Notebook: a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more.
  • NumPy: the fundamental package for scientific computing with Python, which contains useful things like: a powerful N-dimensional array object; sophisticated (broadcasting) functions; useful linear algebra, Fourier transform, and random number capabilities. We will also be using it as an efficient multi-dimensional container of generic data.
  • SciPy: provides many user-friendly and efficient numerical routines such as routines for numerical integration and optimization. This library builds on top of NumPy and makes heavy use of all the features that we will be learning in NumPy.
  • matplotlib: a plotting library which produces publication quality figures and can also be used to do image manipulation. You can generate plots, histograms, power spectra, bar charts, error charts, scatter plots, with just a few lines of code.
  • pandas: an easy-to-use data structuring and data analysis library. It has advanced data manipulation capabilities and lets us work with data objects in much the same way we work with databases. It can also import and export data in a vast number of formats.
  • Scikit-learn: built on top of NumPy, SciPy and matplotlib, this is one of the most widely used machine learning libraries in industry and research. It covers a truly impressive number of machine learning techniques and methods, some of which include: classification, regression, clustering, dimensionality reduction, model selection, data preprocessing, etc.
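
To give a flavour of how two of these libraries are used, here is a minimal sketch of NumPy broadcasting and of cleaning and merging tables with pandas. The function names, column names and sample values are hypothetical and exist only for this illustration.

```python
import numpy as np
import pandas as pd


def standardise_columns(values):
    """Centre each column on 0 and scale it to unit standard deviation."""
    values = np.asarray(values, dtype=float)
    # The column means and standard deviations have shape (n_columns,);
    # broadcasting stretches them across every row of the 2-D array.
    return (values - values.mean(axis=0)) / values.std(axis=0)


def clean_and_merge(orders, customers):
    """Drop orders with a missing amount, then attach customer names."""
    cleaned = orders.dropna(subset=["amount"])
    # A left merge keeps every cleaned order, much like a SQL LEFT JOIN.
    return cleaned.merge(customers, on="customer_id", how="left")


if __name__ == "__main__":
    data = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
    print(standardise_columns(data))

    orders = pd.DataFrame(
        {"customer_id": [1, 2, 2, 3], "amount": [9.5, None, 4.0, 12.0]}
    )
    customers = pd.DataFrame(
        {"customer_id": [1, 2, 3], "name": ["Ann", "Bo", "Cy"]}
    )
    print(clean_and_merge(orders, customers))
```

Note that standardise_columns assumes no column is constant; a column with zero standard deviation would cause a division by zero.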



Prerequisite knowledge

Applicants must have successfully completed the Introduction to programming with Python course or have Python knowledge of a similar standard.

As this is an introductory data analytics course you are not expected to have any data analytics or machine learning experience.

Knowledge of mathematical concepts such as those presented in the website ( is essential.

English requirements

Applicants must be proficient in written and spoken English.

Teaching & assessment

Assessment is informal, through optional weekly assignments that build into a final project. The project will solve a real-world problem using real-world data, applying state-of-the-art techniques taught during the course.

Tutor information

  • Michal Grochmal

    Michal Grochmal is a trained physicist and professional data architect with a passion for data science. His everyday work entails passing data around, making it flow through pipelines, aggregating it into useful sets and ‘munging’ the data together to make something useful. An expert in machine learning, Michal is skilled at writing maths into code, profiling the resulting code and re-implementing the slow parts in a lower-level language (e.g. C).

    With a strong background in programming, most prominently in the Python language, Michal is also a computer security specialist. He is a great fan of applying mathematics to core computer science, notably the use of machine learning algorithms in information security, and of convex optimization. He has a passion for topology, linear algebra and vector calculus.

  • Ana Solaguren-Beascoa

    Ana Solaguren-Beascoa is a skilled data scientist who holds a PhD in theoretical particle physics. She works as a "full-stack data scientist", with experience of the full life cycle of data science: from understanding the data and building the best machine learning models to implementing production code using Python. She has worked on projects across multiple sectors, but has a particular passion for machine learning in biology and health.

    Beyond her solid Python programming skills and machine learning knowledge, Ana has a strong mathematical background which allows her to go deep into the core foundations of machine learning. She also has multiple years of experience teaching postgraduate-level courses in different areas of advanced theoretical physics.

  • Cosmin Stamate

    Cosmin Stamate started programming on a ZX Spectrum clone when he was 8. Self-taught, he has been consulting in several areas of software engineering, from programming to architecture and web development, for over 10 years. He has an MSc in Intelligent Technologies from Birkbeck, University of London, where he took a particular interest in artificial neural networks and evolutionary algorithms.

    Some of Cosmin's industry roles include data analyst (on a consulting basis) for Tesco and Schroders. Currently he is working towards a hybrid PhD which bridges the Department of Computer Science and the Department of Psychological Sciences at Birkbeck; the PhD is focused on developing novel deep learning algorithms that model certain cognitive and behavioural processes.

    Cosmin is also an active member of the Centre for Brain & Cognitive Development and Birkbeck Babylab where he applies state of the art machine learning on electroencephalogram data.