R for the SQL Server Developer

Kevin Feasel (@feaselkl)
https://csmore.info/on/r

Who Am I? What Am I Doing Here?

What Is R?

R is a language focused around performing statistical analysis, predictive modeling, and data cleansing.

R is an off-shoot of the S language and is built on top of C.

Why Use R?

As a data analysis Domain Specific Language (DSL):

  • R has a large number of built-in functions for performing operations, including aggregates, statistical analysis, and graphing and plotting.
  • The R ecosystem is vast, with over 17K packages in CRAN.
  • R language constructs make set-based analysis and operation easy.

R helps you go well beyond simple Excel analysis and pivot tables.

Motivation

My goals in this talk:

  • Introduce you to the R ecosystem and language.
  • Introduce you to the value of notebooks.
  • Get you thinking about ways you could use R today.

Note that R is not the only data analysis language you could learn. Julia and Python are also great languages, and there are very good closed-source, commercial tools like SAS.

Reviewing stored procedure calls.
Graphing CPU utilization.
Tracking clustered columnstore improvements going to SQL Server 2016.

Agenda

  1. Installing R and Tools
  2. Learning the Basics
  3. Conntecting to SQL Server
  4. Getting a Taste of R

Getting the Right Version

The concentric circle of R libraries.

There are two major branches of R of interest to us: base R and Microsoft R. "Base" R is managed by the R Consortium and is entirely open-source. Microsoft takes base R and adds additional libraries and support.

Choosing an IDE

There is one big IDE available: RStudio. RStudio is a standalone installation and provides a nice development interface for R.

Microsoft had also made available R Tools for Visual Studio (RTVS), a Visual Studio plug-in. It offers some interesting features like making SQL Server R Services integration easier, and it integrates with other Visual Studio projects. It was built into Visual Studio 2017 but removed from SQL Server 2019.

Jupyter

We will also install Jupyer Notebooks and use it during this talk. Installing Jupyter takes a few steps, but the links for this talk include a step-by-step walkthrough. The easiest way to install Jupyter is to use Anaconda, a data science suite for Python.

Jupyter (which name derives from a combination of the languages Julia, Python, and R) is a great framework because it has support for dozens of languages. Microsoft uses Jupyter Notebooks for its Azure Machine Learning products.

Why Notebooks?

Notebooks are a way of mixing Markdown-enabled text and language snippets to make your thoughts clear to others. You can create and share notebooks, allowing others easily to test your process and follow along. Notebooks are also an excellent teaching mechanism.

Agenda

  1. Installing R and Tools
  2. Learning the Basics
  3. Conntecting to SQL Server
  4. Getting a Taste of R

Demo Time

Agenda

  1. Installing R and Tools
  2. Learning the Basics
  3. Conntecting to SQL Server
  4. Getting a Taste of R

Connecting to SQL Server

Connecting to a SQL Server database (or any other relational database) is easy with R. The first step is to install the RODBC or DBI pacakage to give your R code ODBC support. From there, you can connect to a system data source that you've defined in your ODBC Data Sources.

You could also pass in a connection string if you don't want to set up a DSN.

Demo Time

Agenda

  1. Installing R and Tools
  2. Learning the Basics
  3. Conntecting to SQL Server
  4. Getting a Taste of R

Getting a Taste of R

No single talk will expose the full gamut of what you can do with R, but this next section will try to hit a few of the highlights. If this feels a bit overwhelming, don't fret: you can grab the notebook and try it out yourself.

This notebook will cover the analysis of restaurant data for Wake County, North Carolina over a multi-year period.

Demo Time

Wrapping Up

R is a powerful language for performing analysis. We've seen just a few of the many valuable uses of R.

Wrapping Up

To learn more, go here:
https://csmore.info/on/r


And for help, contact me:
feasel@catallaxyservices.com | @feaselkl


Catallaxy Services consulting:
https://CSmore.info/on/contact