Technical skills required for City’s MSc in Data Science
Our MSc in Data Science welcomes students with a range of backgrounds. We deliberately recruit a diverse range of students with a wide variety of experiences, but expect a certain level of technical competency in both coding and mathematics. We only accept students who we think have a background that is adequately technical. If you are less experienced in these areas you will need to do some preparation before starting the course and be prepared to work hard, particularly in the earlier stages of the programme.
Here, we explain what level you should be at to be successful.
Will I have to write code?
Yes. You should expect to be writing code throughout this course. It is the most flexible and powerful way to manipulate data and apply techniques used in Data Science. If you don’t want to write code, this is not the programme for you.
How much coding will I be doing?
You will not be building software, but you will be using code to manipulate, analyse and visualise data and to run algorithms on the data. You will be doing this in Python and Matlab, amongst others. The amount of coding you will have to do will depend on which elective modules you choose, but it will not be at the level needed for software engineering. This is not a programming degree.
How much coding experience do I need when I start and how do I prepare?
We expect you to already know the basics of programming, which include data types, variables, conditional statements and control flow, the use of functions and parameters, lists, loops, classes and file input/output. If you are familiar with these concepts and their practical use, then you will be able to cope with the level of programming that we expect from you. The amount of preparation will depend on your experience.
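As a rough self-check, the short Python sketch below touches each of the concepts listed above: data types, variables, a conditional, a function with a parameter, a list, a loop, a class and file input/output. It is an illustration only, not course material; the names Measurement and load_measurements and the data values are made up for this example. If you can read it and predict what it prints, you are probably at about the right level.

```python
import os
import tempfile


class Measurement:
    """A simple class holding a label (str) and a numeric value (float)."""

    def __init__(self, label, value):
        self.label = label
        self.value = value

    def is_high(self, threshold=10.0):
        # Conditional logic with a default parameter value
        return self.value > threshold


def load_measurements(path):
    """Read 'label,value' lines from a text file into a list of objects."""
    measurements = []
    with open(path) as handle:              # file input
        for line in handle:                 # loop over the lines
            label, raw_value = line.strip().split(",")
            measurements.append(Measurement(label, float(raw_value)))
    return measurements


# Write a small made-up data file (file output)
path = os.path.join(tempfile.mkdtemp(), "example.csv")
with open(path, "w") as handle:
    handle.write("a,3.5\nb,12.0\nc,20.25\n")

measurements = load_measurements(path)

# Keep only the labels whose value exceeds the default threshold of 10.0
high_labels = [m.label for m in measurements if m.is_high()]
print(high_labels)  # prints ['b', 'c']
```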
Python. Please be familiar with the basics of Python before you start. There are many good online resources, but we recommend LearnPython or DataCamp’s Intro to Python (use the free tutorial until we give you free access to the Premium material, as indicated below). We also recommend the book Python for Data Analysis by Wes McKinney.
Matlab. We will also be using Matlab for two core modules (Machine Learning and Neural Computing), so it will be helpful for you to be familiar with the basics. Matlab is not free, so please download a 30-day trial and follow the "Getting Started with MATLAB" tutorial to familiarise yourself with the Matlab environment.
How much statistical and mathematical knowledge do I need?
Basic statistical and mathematical concepts are required. Your knowledge does not need to be very advanced, but some of the topics will be easier to understand with more advanced mathematical knowledge.
Since we deal with data, we expect you to have a basic understanding of numerical distributions, basic summary statistics, correlations and probability theory. The Khan Academy resources at the end of this page cover some of the concepts you might need to be familiar with.
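To give a sense of what "basic summary statistics" and "correlations" mean in practice, here is a minimal sketch using only Python's standard library. The data values and the helper name pearson are invented for this illustration; it computes a mean, a median, a sample standard deviation and a Pearson correlation coefficient by hand.

```python
import statistics

# Made-up example data: hours studied and a test score for five students
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [2.0, 4.0, 5.0, 4.0, 5.0]

mean_hours = statistics.mean(hours)        # arithmetic mean
median_scores = statistics.median(scores)  # middle value when sorted
stdev_hours = statistics.stdev(hours)      # sample standard deviation


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both spreads."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return covariation / (spread_x * spread_y)


r = pearson(hours, scores)
print(mean_hours, median_scores, round(r, 3))  # prints 3.0 4.0 0.775
```

A value of r close to 1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little linear relationship.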
You will be learning and applying algorithms during the course and an understanding of how the algorithms work will be necessary. A basic understanding of linear algebra, matrix operations, and derivatives will help here. Some recommended resources are at the end of this page.
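As an indication of the level of linear algebra and calculus meant here, the sketch below multiplies a matrix by a vector and estimates a derivative numerically, in plain Python. The function names mat_vec and derivative and the example values are assumptions made for this illustration; in practice you would normally use a numerical library rather than write these by hand.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    # Each output entry is the dot product of one row with the vector
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def derivative(f, x, h=1e-6):
    """Estimate f'(x) with a central difference of step size h."""
    return (f(x + h) - f(x - h)) / (2 * h)


A = [[1.0, 2.0],
     [3.0, 4.0]]
v = [1.0, 1.0]

product = mat_vec(A, v)
print(product)  # prints [3.0, 7.0]

# The derivative of x squared is 2x, so the slope at x = 3 is about 6
slope = derivative(lambda x: x ** 2, 3.0)
print(round(slope, 4))
```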
Will you run preparation sessions?
Yes. We plan to run a couple of preparation sessions that go through the basics of mathematics and programming. These will take place before the start of term, and you will be notified of them nearer the time.
How much help will I get?
We will provide some optional preparatory sessions on coding and mathematical basics in late summer and at the start of the course.
We have two full-time teaching assistants who will provide help and will run scheduled surgery sessions throughout the programme. But we expect you to try things on your own and to form your own study groups. We offer plenty of help, but also expect students to organise their own learning.
We will give you free access to DataCamp when you accept your offer. DataCamp offers great online courses on the programming languages and libraries that are most important for Data Science, which will complement the more theoretical content of the course. We’re not expecting you to do these before starting, but if you want to get stuck in before the course starts, here are some courses we selected for you to look at:
- Introduction to Python
- Introduction to Data Science in Python
- Introduction to Data Visualization with Python
- Linear Classifiers in Python
- Importing Data in Python (Part 1)
- Importing Data in Python (Part 2)
- Cleaning Data in Python
- Preprocessing for Machine Learning in Python
- Big Data
- Python with Anaconda
These resources provide some more in-depth material that you will find helpful.
- Khan Academy: Independent and Dependent Events
- Khan Academy: Probability and Combinatorics
- Khan Academy: Random Variables and Probability Distributions
- Khan Academy: Displaying and Describing data
- Khan Academy: Modeling Distributions of data
- Khan Academy: Describing relationships in quantitative data
- Khan Academy: Confidence Intervals
- Khan Academy: Significance Tests
Please contact firstname.lastname@example.org if you have any other questions.