
Introduction to R for Data Analysis Short Course

Key information

Start date: To be confirmed
Course Code: To be confirmed
Booking Deadline: To be confirmed

Delivery of this module will be online due to government-issued guidelines during the Covid-19 pandemic and may be made available face-to-face subject to Public Health England recommendations. Please submit your application and, once reviewed by the module leader, you will receive more specific information about the online delivery of your module and assessments.

Learn to extract business or statistical information from large data sets with R

Why choose this course?

This course is ideal for data analysts wanting to use R to extract organisationally useful data from large data sets.

It is wide-ranging, covering all aspects of R from the basics through to sophisticated graphics, advanced programming techniques and data mining algorithms. It has a strong business focus, illustrating how analytical findings can be used for organisational planning, and provides an excellent all-round introduction for anybody needing to use R for general statistical purposes.

No prior knowledge is required, but some basic statistical and programming skills would be an advantage (see Prerequisite knowledge below).

Course overview

The course covers all aspects of the R language, focusing on its ability to extract organisationally significant information from databases and other large data sets.

Students will not only acquire a great deal of technical knowledge but will also gain insights into some sophisticated statistical and analytical concepts and the way analysis supports planning and strategic management processes.

By the end of this course, you will be able to:

  • Create and manipulate all of R’s data structures, import data into them from outside sources, and clean and transform your data.
  • Use R’s core statistical functions, deploy them for useful organisational purposes, and test and refine your models.
  • Present your findings using standard R graphics, ggplot2, Tableau and Power BI.
  • Create your own sophisticated functions and appreciate R’s more powerful procedural programming techniques.
  • Apply R techniques to the work of a data analyst and use them to support planning and strategic management processes.

What will I learn?

Data structures:
Vectors, factors, matrices, lists and especially data frames. Manipulation of these using aggregative functions, indexing and other more sophisticated functions including the apply() family. How to use these techniques to best advantage with large organisational datasets.
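
As a rough illustration of the kind of manipulation described above, the following sketch builds a small data frame and then indexes and aggregates it. The `sales` data and its column names are invented for this example and are not course material.

```r
# Hypothetical sales data, invented purely for illustration
sales <- data.frame(
  region = factor(c("North", "South", "North", "East", "South")),
  units  = c(12, 7, 9, 15, 4),
  price  = c(2.5, 3.0, 2.5, 4.0, 3.0)
)

# Indexing: keep only the rows where more than 8 units were sold
big_orders <- sales[sales$units > 8, ]

# Vectorised arithmetic: derive a new column from two existing ones
sales$revenue <- sales$units * sales$price

# apply() family: sapply() computes the mean of each selected column
col_means <- sapply(sales[, c("units", "price", "revenue")], mean)

# apply() family: tapply() sums revenue within each level of the factor
revenue_by_region <- tapply(sales$revenue, sales$region, sum)
```

The same pattern (index, derive, aggregate) carries over to much larger data frames, although dedicated packages may be preferable at scale.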

Graphics:
We learn R’s basic plotting techniques (plot(), hist() etc.), but soon move on to more sophisticated tools (the ggplot2 package, Tableau, Power BI). How to use these to further analyse organisational data and to present your analytic findings to co-workers.

Statistics:
With the emphasis very much on practical applications, not mathematical theory, we learn about descriptives, distribution, regression and correlation (including multiple regression), t-tests, ANOVA and categorical data analysis (including chi-squared). There is a strong emphasis on the applicability of statistical techniques to organisational problems, refining our models and rigorously testing them for reliability.
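
To give a flavour of these techniques, here is a minimal sketch using `mtcars`, a dataset that ships with base R. The choice of dataset and variables is an assumption made for illustration; the course may well use different data.

```r
# mtcars is a built-in dataset (datasets package, loaded by default)

# Descriptives for fuel consumption in miles per gallon
mpg_summary <- summary(mtcars$mpg)

# Correlation between fuel consumption and vehicle weight
mpg_wt_cor <- cor(mtcars$mpg, mtcars$wt)

# Multiple regression: mpg explained by weight and horsepower
fit <- lm(mpg ~ wt + hp, data = mtcars)
r_squared <- summary(fit)$r.squared

# Welch two-sample t-test: mpg for automatic (am = 0) vs manual (am = 1) cars
tt <- t.test(mpg ~ am, data = mtcars)

# One-way ANOVA: does mean mpg differ by number of cylinders?
anova_fit <- aov(mpg ~ factor(cyl), data = mtcars)
```

Calling `summary(fit)` or `summary(anova_fit)` prints the coefficient and significance tables that would normally be inspected when refining a model.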

Programming:
We learn the basics of procedural programming – variables, control structures and writing simple functions – before moving on to building more sophisticated functions geared to manipulating large datasets.
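
The progression from simple control structures to functions better suited to large data might look something like the sketch below. The function names `classify_spend` and `classify_spend_vec`, and the thresholds they use, are hypothetical and chosen only for this example.

```r
# Hypothetical helper: label each amount as low, medium or high.
# Written with an explicit loop and if/else to show basic control structures.
classify_spend <- function(amounts, low = 100, high = 500) {
  labels <- character(length(amounts))
  for (i in seq_along(amounts)) {
    if (is.na(amounts[i])) {
      labels[i] <- NA_character_
    } else if (amounts[i] < low) {
      labels[i] <- "low"
    } else if (amounts[i] < high) {
      labels[i] <- "medium"
    } else {
      labels[i] <- "high"
    }
  }
  labels
}

# Vectorised version using cut(), which avoids the explicit loop
classify_spend_vec <- function(amounts, low = 100, high = 500) {
  as.character(cut(amounts,
                   breaks = c(-Inf, low, high, Inf),
                   labels = c("low", "medium", "high"),
                   right  = FALSE))
}

spend <- c(50, 250, NA, 900)
loop_result <- classify_spend(spend)
vec_result  <- classify_spend_vec(spend)
```

Both versions should agree on finite inputs; the vectorised form is usually the more idiomatic choice in R when working with long vectors.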

Data loading, cleaning and transformation:
Loading data from Excel, SQL, XML and the web, using SQL notation to query R data, cleaning and transforming your data (missing values, recoding and converting variables, creating new variables), merging and sampling data.
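
The cleaning, transformation, merging and sampling steps listed above can be sketched in base R as follows. The `customers` and `orders` data frames are invented for illustration, and median imputation is just one possible way to handle missing values, not necessarily the approach taught on the course.

```r
# Hypothetical data, invented purely for illustration
customers <- data.frame(
  id  = c(1, 2, 3, 4),
  age = c(34, NA, 51, 27),
  sex = c("m", "f", "f", "m"),
  stringsAsFactors = FALSE
)
orders <- data.frame(id = c(1, 1, 3, 5), total = c(20, 35, 12, 50))

# Missing values: replace NA ages with the median of the known ages
customers$age[is.na(customers$age)] <- median(customers$age, na.rm = TRUE)

# Recoding and converting: turn single-letter codes into a labelled factor
customers$sex <- factor(customers$sex,
                        levels = c("f", "m"),
                        labels = c("Female", "Male"))

# Creating a new variable from an existing one
customers$age_band <- ifelse(customers$age >= 40, "40+", "under 40")

# Merging: keep every customer, with NA totals where no order exists
merged <- merge(customers, orders, by = "id", all.x = TRUE)

# Sampling: draw two customers at random (seed fixed for repeatability)
set.seed(42)
sample_rows <- customers[sample(nrow(customers), 2), ]
```

Here `all.x = TRUE` makes `merge()` behave like a SQL left join, so customers without orders are retained rather than dropped.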



Prerequisite knowledge

While no prior knowledge is required, you will find it useful to have a little knowledge of statistics (descriptives, regression and distribution), some basic SQL (up to using GROUP BY and ORDER BY), and the fundamentals of procedural programming (manipulating variables, ifs and whiles, writing simple functions). These fundamentals may have been gained using any other programming language.

You must be IT literate.

English requirements

You must be proficient in written and spoken English.

Teaching & assessment

Teaching is in the form of lectures interspersed with exercises to test and expand your knowledge.

There is also a continuous data analysis project, where you will use R techniques to gain insights into a particular client group and how they might be approached to best advantage by an organisation.

Recommended reading

Robert I. Kabacoff, R in Action. Manning (2015).

Other useful texts will be suggested during the course.

Tutor information

  • Mark Robbins

    Mark Robbins was for many years a Project Manager working for the government, the BBC and the NHS, where he led large teams that designed and implemented many strategic national networking and messaging systems.

    Mark now works as a freelance academic researcher, author, journalist and IT consultant, and teaches a wide range of computer science subjects at London Metropolitan University.