Vectors, factors, matrices, lists and, above all, data frames. Manipulating these with aggregate functions, indexing and more sophisticated functions, including the apply() family. How to use these techniques to best advantage with large organisational datasets.
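As a flavour of the kind of data frame manipulation covered here, the following is a minimal sketch in base R. The `staff` data frame, its column names and its values are all invented for illustration; they are not course data.

```r
# Hypothetical staff data; every name and value here is made up.
staff <- data.frame(
  dept   = factor(c("Sales", "Sales", "IT", "IT", "HR")),
  salary = c(30000, 34000, 41000, 45000, 38000),
  years  = c(2, 5, 3, 8, 6)
)

# Indexing: rows where salary exceeds 35000, keeping two columns.
high_paid <- staff[staff$salary > 35000, c("dept", "salary")]

# Aggregation: mean salary per department.
mean_by_dept <- aggregate(salary ~ dept, data = staff, FUN = mean)

# apply() family: means of the numeric columns, and longest service per department.
col_means <- sapply(staff[, c("salary", "years")], mean)
max_years <- tapply(staff$years, staff$dept, max)

print(mean_by_dept)
print(max_years)
```

The same calls work unchanged on much larger data frames, which is why they are a common first step with organisational data.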
We learn R’s basic plotting functions (plot(), hist() etc.), but soon move on to more sophisticated tools (the ggplot2 package, Tableau, Power BI). How to use these to analyse organisational data further and to present your analytic findings to co-workers.
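A short sketch of the basic plotting functions mentioned above, using only base R. It assumes R's built-in `mtcars` dataset as a stand-in for organisational data, and draws to a null device so that it runs without opening a window; in an interactive session you would drop the `pdf(NULL)` and `dev.off()` lines.

```r
# Null graphics device: an assumption made so the sketch runs non-interactively.
pdf(NULL)

# Histogram of a single numeric variable; hist() also returns the bin counts.
h <- hist(mtcars$mpg, main = "Fuel economy", xlab = "Miles per gallon")

# Scatter plot of two variables with a fitted straight line added.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Fuel economy by weight")
abline(lm(mpg ~ wt, data = mtcars), lty = 2)

invisible(dev.off())
```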
With the emphasis very much on practical applications, not mathematical theory, we learn about descriptives, distributions, regression and correlation (including multiple regression), t-tests, ANOVA and categorical data analysis (including chi-squared). There is a strong emphasis on the applicability of statistical techniques to organisational problems, on refining our models and on rigorously testing them for reliability.
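To give a sense of how little code these techniques need in practice, here is a minimal sketch using base R and the built-in `mtcars` dataset. The choice of dataset and variables is an assumption made purely for illustration.

```r
# Descriptive statistics for one variable.
print(summary(mtcars$mpg))

# Correlation between fuel economy and weight.
r <- cor(mtcars$mpg, mtcars$wt)

# Multiple regression: fuel economy modelled on weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = mtcars)
print(summary(fit))

# Two-sample t-test (Welch by default): automatic vs manual transmission.
tt <- t.test(mpg ~ am, data = mtcars)
print(tt)

# One-way ANOVA: fuel economy across cylinder counts.
av <- aov(mpg ~ factor(cyl), data = mtcars)
print(summary(av))
```

Reading the printed summaries, rather than deriving the formulas behind them, is the practical skill in focus.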
We learn the basics of procedural programming – variables, control structures and writing simple functions – before moving on to building more sophisticated functions geared to manipulating large datasets.
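The sketch below illustrates variables, a control structure and a simple function, followed by a vectorised version better suited to large datasets. The function names `band_score` and `band_score_vec`, and the cutoff of 50, are hypothetical and chosen only for this example.

```r
# Hypothetical helper: label each score "high" or "low" using a loop.
band_score <- function(x, cutoff = 50) {
  if (!is.numeric(x)) stop("x must be numeric")
  result <- character(length(x))
  for (i in seq_along(x)) {
    if (is.na(x[i])) {
      result[i] <- NA_character_   # keep missing values missing
    } else if (x[i] >= cutoff) {
      result[i] <- "high"
    } else {
      result[i] <- "low"
    }
  }
  result
}

# The same logic without an explicit loop; usually faster on long vectors.
band_score_vec <- function(x, cutoff = 50) {
  if (!is.numeric(x)) stop("x must be numeric")
  ifelse(x >= cutoff, "high", "low")
}

scores <- c(10, 60, NA, 50)
print(band_score(scores))
print(band_score_vec(scores))
```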
Data loading, cleaning and transformation:
Loading data from Excel, SQL, XML and the web; using SQL notation to query R data; cleaning and transforming your data (missing values, recoding and converting variables, creating new variables); merging and sampling data.
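A minimal sketch of the cleaning, merging and sampling steps, in base R only. To stay self-contained it reads a small inline CSV rather than an Excel file or database. The column names, the `-99` missing-value code and the `depts` lookup table are all assumptions invented for the example.

```r
# Load: a small inline CSV standing in for an external file (hypothetical data).
raw <- read.csv(text = "id,dept,score
1,sales,72
2,it,-99
3,hr,65
4,sales,", stringsAsFactors = FALSE)

# Clean: treat the assumed sentinel value -99 as missing.
raw$score[raw$score %in% -99] <- NA

# Recode and create: tidy department labels, then derive a new variable.
raw$dept   <- toupper(raw$dept)
raw$passed <- raw$score >= 70

# Merge: attach site information from a hypothetical lookup table.
depts  <- data.frame(dept = c("SALES", "IT", "HR"),
                     site = c("London", "Leeds", "York"),
                     stringsAsFactors = FALSE)
merged <- merge(raw, depts, by = "dept")

# Sample: draw two rows at random, reproducibly.
set.seed(1)
sampled <- merged[sample(nrow(merged), 2), ]

print(merged)
```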