Basic R programming knowledge
Data manipulation is a vital data analysis skill – actually, it is the foundation of data analysis. This course is about the most effective data manipulation tool in R – dplyr!
As a data analyst, you will spend a vast amount of your time preparing or processing your data. The goal of data preparation is to convert your raw data into a high-quality data source, suitable for analysis. More often than not, this process involves a lot of work. The dplyr package contains tools that can make this work much easier.
dplyr has a few important advantages over other data manipulation tools or functions:
For these reasons, dplyr quickly became the most popular data manipulation tool among R data scientists. When you finish this course, you will be able to
It is a short course, but it is focused on the most essential commands and functions of the dplyr package, those commands that you will likely use most often.
So let’s see what you are going to learn in this course.
The first section covers the five core dplyr commands. These commands are: filter, select, mutate, arrange and summarise. You will need these commands practically every time you work with dplyr. They are used to subset data frames, compute new variables, sort data frames, compute statistical indicators and so on. Here are a few real-life scenarios of their use:
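As a quick illustration (not taken from the course materials), here is a minimal sketch of the five core commands applied one at a time. It assumes the dplyr package is installed and uses R's built-in mtcars data set; the variable names are chosen only for this example.

```r
library(dplyr)

# filter: keep only the rows for cars with 4 cylinders
four_cyl <- filter(mtcars, cyl == 4)

# select: keep only the columns we are interested in
slim <- select(four_cyl, mpg, wt, hp)

# mutate: compute a new variable (wt is recorded in thousands of pounds)
slim <- mutate(slim, wt_kg = wt * 1000 * 0.4536)

# arrange: sort the rows by fuel economy, highest mpg first
sorted <- arrange(slim, desc(mpg))

# summarise: collapse the data frame into statistical indicators
stats <- summarise(sorted, mean_mpg = mean(mpg), n_cars = n())
print(stats)
```

Each command takes a data frame as its first argument and returns a new data frame, which is why the result of one step can be passed straight to the next.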
The second section covers other important dplyr commands and functions. In this section you’ll learn:
In the third section you’ll start to take advantage of the true power of dplyr. Here we’ll talk about chaining: creating sequences of dplyr commands that accomplish multiple tasks in a single statement.
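To give an idea of what chaining looks like, here is a minimal sketch that links several dplyr commands with the pipe operator %>%. It is an illustration rather than an excerpt from the course; it assumes dplyr is installed and uses the built-in mtcars data set, and the column name hp_per_wt is made up for this example.

```r
library(dplyr)

# Each step receives the data frame produced by the previous step
powerful_cars <- mtcars %>%
  filter(hp > 100) %>%                # keep cars with more than 100 horsepower
  mutate(hp_per_wt = hp / wt) %>%     # compute a power-to-weight indicator
  select(mpg, hp, wt, hp_per_wt) %>%  # keep only the relevant columns
  arrange(desc(hp_per_wt))            # sort from highest to lowest ratio

head(powerful_cars)
```

The whole sequence reads top to bottom in the order the operations are applied, and no intermediate data frames need to be stored.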
The fourth section is about joining data frames with dplyr. This is a very important topic, because your data will often be spread across several data frames. So you will need to join these data frames into a single one, suitable for your analyses. We are going to look at five join types available in dplyr: inner_join, semi_join, left_join, anti_join and full_join. We are going to examine the output of each join type using a simple example.
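As a rough preview of how the five join types differ, here is a minimal sketch on two tiny data frames. The data frames customers and orders, and their columns, are hypothetical and invented for this illustration; they are not the example used in the course.

```r
library(dplyr)

# Hypothetical data: customers 1-3, orders placed by ids 2-4
customers <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cat"))
orders    <- data.frame(id = c(2, 3, 4), amount = c(50, 75, 20))

# inner_join: only the ids present in both data frames (2 and 3)
inner <- inner_join(customers, orders, by = "id")

# left_join: every customer; amount is NA where no order exists (id 1)
left <- left_join(customers, orders, by = "id")

# full_join: every id from either data frame (1, 2, 3 and 4)
full <- full_join(customers, orders, by = "id")

# semi_join: customers that have a match in orders, customer columns only
semi <- semi_join(customers, orders, by = "id")

# anti_join: customers that have no match in orders (id 1)
anti <- anti_join(customers, orders, by = "id")
```

Note that semi_join and anti_join only filter the first data frame; unlike the other three, they never add columns from the second one.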
In the fifth section we’ll learn how to combine dplyr and ggplot2 commands (using chaining) to build expressive charts and graphs. For example, if you want to represent the income distribution for the subjects with a higher education only, or the relationship between income and education level for the female subjects only, in this section you will learn exactly how to do it.
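The first scenario above could be sketched roughly as follows. This is an illustration under stated assumptions: the survey data frame, its column names and its values are all hypothetical, and both dplyr and ggplot2 are assumed to be installed.

```r
library(dplyr)
library(ggplot2)

# Hypothetical survey data, invented for this example
survey <- data.frame(
  sex       = c("F", "M", "F", "F", "M", "F"),
  education = c("higher", "higher", "secondary", "higher", "secondary", "higher"),
  income    = c(52000, 48000, 31000, 61000, 29000, 57000)
)

# Filter with dplyr, then pass the result straight into ggplot2.
# dplyr steps are linked with %>%, ggplot2 layers are added with +.
income_plot <- survey %>%
  filter(education == "higher") %>%
  ggplot(aes(x = income)) +
  geom_histogram(bins = 5)

# Printing the object draws the chart
# print(income_plot)
```

The key detail is the switch from %>% to + once the data reaches ggplot(): the pipe passes data between functions, while + adds layers to a plot.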
Every command is illustrated with video, with both the syntax and the output explained in detail. At the end of the course, a large number of practical exercises are provided. By doing these exercises you’ll put into practice what you have learned.
Join this course right now and acquire a critical data analysis ability – data manipulation!