# Technical skills required for City’s MSc in Data Science

Our MSc in Data Science welcomes students from a range of backgrounds. We deliberately recruit students with a wide variety of experiences, but we expect a certain level of technical competency in both coding and mathematics, and we only accept students whose background we consider sufficiently technical. If you are less experienced in these areas, you will need to do some preparation before starting the course and be prepared to work hard, particularly in the earlier stages of the programme.

Here, we explain what level you should be at to be successful.

## Will I have to write code?

Yes. You should expect to be writing code throughout this course and in more than one programming language. It is the most flexible and powerful way to manipulate data and apply techniques used in Data Science. If you don’t want to write code, this is not the programme for you.

## How much coding will I be doing?

You will not be building software, but you will be using code to manipulate, analyse and visualise data and to run algorithms on the data. You will be doing this mainly in **Python** and **Matlab**, but might also use R, other languages and different software tools. The amount of code you will have to write will depend on which elective modules you choose, but it will not be at the level needed for software engineering. This is not a programming degree.

## How much coding experience do I need when I start and how do I prepare?

We expect you to already know the basics of programming, including data types, variables, conditional statements and control flow, functions and parameters, lists, loops, classes and file input/output. The amount of preparation you need will depend on your experience.
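As a rough self-check of the level described above, here is a minimal Python sketch that touches most of the listed basics: variables and data types, a class, functions with parameters, a loop, a conditional, a list and file input/output. It is an illustration only, not an entry test, and the names `Measurement`, `load_measurements`, `count_high` and `demo`, the file name and the data are all made up for this example.

```python
# Illustrative self-check only; all names and data here are hypothetical.
import csv
import os
import tempfile


class Measurement:
    """A small class holding one labelled numeric reading."""

    def __init__(self, label: str, value: float):
        self.label = label
        self.value = value

    def is_high(self, threshold: float = 10.0) -> bool:
        # A conditional wrapped in a method with a default parameter.
        return self.value > threshold


def load_measurements(path: str) -> list:
    """Read 'label,value' rows from a CSV file into Measurement objects."""
    measurements = []
    with open(path, newline="") as handle:
        for label, raw_value in csv.reader(handle):
            measurements.append(Measurement(label, float(raw_value)))
    return measurements


def count_high(measurements: list, threshold: float = 10.0) -> int:
    """Count the readings above the threshold using a loop and a conditional."""
    count = 0
    for measurement in measurements:
        if measurement.is_high(threshold):
            count += 1
    return count


def demo() -> int:
    """Write a small made-up file, read it back and count the high readings."""
    rows = [("a", 3.5), ("b", 12.0), ("c", 27.25)]
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "readings.csv")
        with open(path, "w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        return count_high(load_measurements(path))


if __name__ == "__main__":
    print(demo())  # prints 2
```

If you can read this and explain what each part does, you are probably at about the right starting level; if not, the introductory resources on this page are a good place to begin.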

### Python

You’ll need to be familiar with the basics of Python before you start, including the Pandas, NumPy, Matplotlib and Seaborn libraries. There are many good online resources, but we recommend LearnPython or DataCamp’s Intro to Python (use the free tutorial until we give you free access to the Premium material, as indicated below). We also recommend the book Python for Data Analysis by Wes McKinney. More specific recommendations are made at the bottom of this page.

### Matlab

We will also be using Matlab for two core modules (Machine Learning and Neural Computing), so you should be familiar with the basics. Matlab is not free, so please download a 30-day Matlab trial and follow these tutorials: “Getting Started with MATLAB” and “Matlab Onramp” to familiarise yourself with the Matlab environment.

We will give you free access to DataCamp when you confirm your place, and we will run some sessions in September; details to follow.

## How much statistical and mathematical knowledge do I need?

Basic statistical and mathematical concepts are required. This does not need to be very advanced, but some of the topics will be easier to understand with more advanced mathematical knowledge.

Since we deal with data, we expect you to have a basic understanding of numerical distributions, basic summary statistics, correlations and probability theory. Some of the concepts you might need to be familiar with are covered in the Probability and Statistics resources listed at the end of this page.
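To give a sense of what "basic summary statistics" and "correlations" mean in practice, the following sketch computes a mean, median, sample standard deviation and Pearson correlation using only the Python standard library. The data and the function name `pearson_correlation` are invented for illustration; on the course you would more likely use Pandas or NumPy for the same calculations.

```python
# Illustrative sketch only; the data and function name are hypothetical.
import math
import statistics


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    # Sum of products of deviations (proportional to the covariance).
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariation / math.sqrt(ss_x * ss_y)


if __name__ == "__main__":
    hours_studied = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up data
    test_scores = [2.0, 4.0, 6.0, 8.0, 10.0]    # made-up data
    print("mean:", statistics.mean(hours_studied))
    print("median:", statistics.median(hours_studied))
    print("sample standard deviation:", statistics.stdev(hours_studied))
    print("correlation:", pearson_correlation(hours_studied, test_scores))
```

You should be comfortable explaining what each of these quantities tells you about the data, and why the correlation here comes out as a perfect positive relationship.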

You will be learning and applying algorithms during the course and an understanding of how the algorithms work will be necessary. A basic understanding of linear algebra, matrix operations, and derivatives will help here. Some recommended resources are at the end of this page.
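As an indication of the kind of linear algebra and calculus this refers to, here is a small sketch of a matrix transpose, a matrix product and a numerical derivative. It is written in plain Python so that it runs without any libraries; the function names `transpose`, `mat_mul` and `numerical_derivative` are chosen for this example, and in practice you would normally use NumPy or Matlab for these operations.

```python
# Illustrative sketch only; function names are hypothetical.


def transpose(matrix):
    """Swap the rows and columns of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def mat_mul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x p), both lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    columns_of_b = list(zip(*b))
    # Entry (i, j) is the dot product of row i of a and column j of b.
    return [
        [sum(x * y for x, y in zip(row, column)) for column in columns_of_b]
        for row in a
    ]


def numerical_derivative(f, x, h=1e-6):
    """Approximate f'(x) with a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(mat_mul(a, b))   # prints [[19, 22], [43, 50]]
    print(transpose([[1, 2, 3], [4, 5, 6]]))   # prints [[1, 4], [2, 5], [3, 6]]
    # The derivative of x**2 is 2x, so the value at x = 3 should be close to 6.
    print(numerical_derivative(lambda x: x ** 2, 3.0))
```

Many of the algorithms covered on the course are built from operations like these, so being able to work through them by hand on small examples will help.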

## Will you run preparation sessions?

Yes. We plan to run a couple of preparation sessions that go through the basics of mathematics and programming.

- **Python:** Monday 7th September, 10:00-13:00 (you’ll need a working version of Python. We recommend Anaconda)
- **Matlab:** Tuesday 8th September, 10:00-13:00 (you’ll need a working copy of Matlab, probably the 30-day free trial)
- **Mathematics:** Wednesday 9th September, 10:00-13:00 (you’ll need a working copy of Matlab, probably the 30-day free trial)

More details will follow.

## How much help will I get?

We will provide some optional preparatory sessions on coding and mathematical basics in late summer and at the start of the course.

We have three full-time teaching assistants who will provide help and will run scheduled surgery sessions throughout the programme. But we expect you to try things on your own and to form your own study groups. We offer plenty of help, but also expect students to organise their own learning.

## DataCamp

We will give you **free access** to DataCamp when you accept your offer. DataCamp offers great online courses on the programming languages and libraries that are most important for Data Science, which will complement the more theoretical content of the course. Suggested courses are listed below.

## Suggested online courses:

These resources provide some more in-depth material that you will find helpful.

### Python

- Introduction to Python
- Introduction to Data Science in Python
- Python with anaconda
- Introduction to Python Programming
- Importing Data in Python (Part 1)
- Importing Data in Python (Part 2)
- Cleaning Data in Python
- Practicing Coding Interview Questions in Python
- Pandas foundations
- Data manipulation with Pandas
- Manipulating data frames with Pandas
- Preprocessing for Machine Learning in Python
- Introduction to Data Visualization with Python
- Data analysis and visualisation
- Statistical Thinking in Python (Part 1)
- Statistical Thinking in Python (Part 2)
- Linear Classifiers in Python
- Big Data
- Spark
- Analyzing marketing campaigns with Pandas

### Matlab

- There are many online courses in the Matlab Academy. Some are free, and you could start them before joining.
- Matlab Central is a hub for “open exchange for the MATLAB and Simulink user community”.

### Probability

- StatQuest statistics explained
- Khan Academy: Independent and Dependent Events
- Khan Academy: Probability and Combinatorics
- Khan Academy: Random Variables and Probability Distributions

### Statistics

- Khan Academy: Displaying and Describing data
- Khan Academy: Modeling Distributions of data
- Khan Academy: Describing relationships in quantitative data
- Khan Academy: Confidence Intervals
- Khan Academy: Significance Tests

### Linear algebra

## Further enquiries

Please contact smcse.msc@city.ac.uk if you have any other questions.