Abstract

PolyBase is Microsoft's data virtualization technology, allowing you to integrate SQL Server with external systems such as Hadoop and Azure Blob Storage. As of SQL Server 2019, PolyBase also allows you to connect to diverse systems such as Oracle, MongoDB, Cosmos DB, Teradata, Apache Spark, and even other SQL Server instances. In this talk, we will walk through the mechanics of PolyBase, showing how we can write T-SQL queries to access data from heterogeneous sources. We will investigate the Dynamic Management Views (DMVs) and other resources needed to tease out what the PolyBase engine is doing behind the scenes and where to go when things break down. Finally, we will look at several interesting use cases for PolyBase in the modern data environment.


Slides

The slides are available as a GitPitch slide deck.

The slides are licensed under Creative Commons Attribution-ShareAlike.


Demo Code

The demonstration code is available in my GitHub repository. This includes all of the SQL queries as well as the relevant data sets.

The source code is licensed under the terms offered by the GPL.


Links And Further Information