Technical skills required for City’s MSc in Data Science
Our MSc in Data Science welcomes students with a range of backgrounds. We deliberately recruit a diverse range of students with a wide variety of experiences, but expect a certain level of technical competency in both coding and mathematics. We only accept students who we think have a background that is adequately technical. If you are less experienced in these areas you will need to do some preparation before starting the course and be prepared to work hard, particularly in the earlier stages of the programme.
Here, we explain what level you should be at to be successful.
Will I have to write code?
Yes. You should expect to be writing code throughout this course. It is the most flexible and powerful way to manipulate data and apply techniques used in Data Science. If you don’t want to write code, this is not the programme for you.
How much coding will I be doing?
You will not be building software, but you will be using code to manipulate, analyse and visualise data and to run algorithms on that data. You will do this in various languages, including Python and R. MATLAB and Java are used in some of the elective modules. The amount of coding you do will depend on which elective modules you choose, but it will not reach the level needed for software engineering. This is not a programming degree.
How much coding experience do I need when I start?
We expect you to already know the basics of programming: data types, variables, conditional statements and control flow, functions and parameters, lists, loops, classes and file input/output. If you are familiar with these concepts and their practical use, you will be able to cope with the level of programming that we expect from you. If you are not, work through one of the online Python tutorials suggested below.
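As a rough, unofficial gauge of that level, the short Python sketch below touches each of the concepts listed above. The `Reading` class, the `load_readings` function and the sample data are invented for illustration and are not taken from the course material. If you can read it and predict what it prints, you are probably at about the right level.

```python
import csv
import io


class Reading:
    """A small class holding one labelled measurement (hypothetical example)."""

    def __init__(self, label: str, value: float):
        self.label = label
        self.value = value

    def is_high(self, threshold: float = 20.0) -> bool:
        # Conditional logic wrapped in a method with a default parameter
        return self.value > threshold


def load_readings(handle) -> list:
    """Read label,value rows from a file-like object into a list of Reading objects."""
    readings = []
    for row in csv.DictReader(handle):
        # Values arrive as strings, so convert the data type explicitly
        readings.append(Reading(row["label"], float(row["value"])))
    return readings


def mean(values: list) -> float:
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# io.StringIO stands in for a real file; with a file on disk you would
# write: with open("readings.csv") as handle: ...
sample = io.StringIO("label,value\nmon,18.5\ntue,21.0\nwed,23.5\n")
readings = load_readings(sample)

# Loop over the list and keep the labels of the high readings
high_labels = []
for reading in readings:
    if reading.is_high():
        high_labels.append(reading.label)

average = mean([reading.value for reading in readings])
print(high_labels, average)
```

The example uses only the Python standard library, so it runs without installing anything.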
The main language you’ll use is Python. We recommend that you become familiar with Python before you start, as this will make your life easier. There are many good online resources, but there are three that we’d recommend:
- LearnPython: a tutorial-style guide.
- Python course from Codecademy: an online course that you need to sign up for.
- Intro to Python for Data Science by DataCamp: an online interactive platform for learning Python. We also provide our students with six months of full access.
We also recommend the Python for Data Analysis book by Wes McKinney.
You’ll be using various Python libraries such as NumPy, Matplotlib, scikit-learn and statsmodels, but we’ll provide details of these during the course.
How much statistical and mathematical knowledge do I need?
Basic statistical and mathematical concepts are required. This does not need to be very advanced, but some of the topics will be easier to understand with more advanced mathematical knowledge.
Since we deal with data, we expect you to have a basic understanding of numerical distributions, basic summary statistics, correlations and probability theory. Some of the concepts you might need to be familiar with are covered in the resources at the end of this page.
You will be learning and applying algorithms during the course and an understanding of how the algorithms work will be necessary. A basic understanding of linear algebra, matrix operations, and derivatives will help here. Some recommended resources are at the end of this page.
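To give a feel for the level involved, the sketch below works through a summary statistic, a correlation, a matrix operation and a derivative in plain Python. It is an illustration only: the function names and the numbers are made up, and during the course you would normally use a library such as NumPy for this rather than writing it by hand.

```python
import math


def mean(xs):
    """Arithmetic mean of a non-empty list."""
    return sum(xs) / len(xs)


def sample_std(xs):
    """Sample standard deviation, dividing by n - 1."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - my) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Made-up data in which scores rise exactly in step with hours
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [2.0, 4.0, 6.0, 8.0, 10.0]

print(mean(hours), sample_std(hours))       # centre and spread
print(pearson(hours, scores))               # close to 1: perfect linear relationship
print(matvec([[1, 2], [3, 4]], [1, 1]))     # each row summed against the vector
print(derivative(lambda x: x ** 2, 3.0))    # close to 6, since d/dx x^2 = 2x
```

If each function matches a formula you recognise, your mathematical background is likely to be sufficient for the early part of the course.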
How much help will I get?
We will provide some optional preparatory sessions on coding and mathematical basics in late summer and at the start of the course.
We have two full-time teaching assistants who will provide help and will run scheduled surgery sessions throughout the programme. But we expect you to try things on your own and to form your own study groups. We offer plenty of help, but also expect students to organise their own learning.
These resources provide some more in-depth material that you will find helpful.
- Khan Academy: Independent and Dependent Events
- Khan Academy: Probability and Combinatorics
- Khan Academy: Random Variables and Probability Distributions
- Khan Academy: Displaying and Describing data
- Khan Academy: Modeling Distributions of data
- Khan Academy: Describing relationships in quantitative data
- Khan Academy: Confidence Intervals
- Khan Academy: Significance Tests
Please contact firstname.lastname@example.org if you have any other questions.