Data Cleansing with SQL and R

ABSTRACT

On a given project, data scientists can spend upwards of 80% of their time preparing, cleaning, and correcting data. In this session, we will look at different data cleansing and preparation techniques using both SQL Server and R. We will investigate the concept of tidy data and see how we can use tools in both languages to simplify research and analysis of a small but realistic data set.

ADDITIONAL MEDIA

On August 16, 2017, I gave a version of this talk at NDC Sydney. You can get the recording on the NDC YouTube channel.

DEMO CODE

Click here to access the demo code for this presentation. It includes all of the SQL and R code, the data sources used in the demos, and a notebook for tidyr.

The source code is licensed under the terms offered by the GPL.