In-Memory OLTP

Created by Kevin Feasel

What the Hekaton?

Alternate title

Which Kalen Delaney totally stole

Not really; it was the obvious title

Who Am I? What Am I Doing Here?

Hekawhat?

In-Memory OLTP was released as part of SQL Server 2014. It has had a few marketing names during its short life:

  • Hekaton
  • eXtreme Transaction Processing
  • In-Memory OLTP

The SQL Server developers are aiming for a 100x improvement over standard disk-based OLTP methods. At present, they can get ~30x improvement.

Sidebar: Classic "In-Memory"

SQL Server never operates on data directly on disk. When you want to read a page of data, that page is first read into memory (the buffer pool). If you run out of available memory, the least recently used pages get evicted (dirty pages are written back to disk first). As long as a page is in memory, SQL Server does not need to read it from disk again. Historically, database sizes have been greater than the amount of RAM available to a server.

DBCC PINTABLE was a command telling SQL Server to keep a particular table in memory as long as possible; it was deprecated and has been a no-op since SQL Server 2005.

Hekaton is NOT the same thing!

What's Wrong With SQL Classic?*

* Nothing!

  • Locks and latches: by default, readers block writers and writers block readers. SQL Server uses pessimistic concurrency by default.
  • Reading from disk: even with SSDs, going to disk is expensive. So is maintaining all of that data on disk.
  • SQL is an interpreted language: we write T-SQL which gets turned into execution plans which are run.

What Hekaton Promises

  • No more buffer pool: separate memory space for In-Memory OLTP.
  • "No" locking and latching. Optimistic concurrency with validation at the end of a transaction.
  • Never read from a disk, except at start-up.
  • Native code compilation

Optimistic Concurrency

Optimistic concurrency is nothing new in SQL Server. On-disk OLTP already has a form of it: Read Committed Snapshot Isolation (RCSI).

RCSI uses tempdb to keep "snapshots" of records as they change and automatically performs garbage collection when the snapshots expire.

Tempdb usage tends to be a bottleneck with RCSI: under heavy load, all of that version store activity in tempdb can cause performance problems.

In-Memory OLTP does not use the same mechanisms as RCSI; it does not use tempdb for this.
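
For reference, RCSI is a database-level setting. A minimal sketch of turning it on (the database name is hypothetical; the termination clause kicks out other sessions so the option change can take effect):

    ALTER DATABASE SalesDb
        SET READ_COMMITTED_SNAPSHOT ON
        WITH ROLLBACK IMMEDIATE;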

How Optimistic Concurrency Works: Life of a Row

|-----------|----------------|---------------|---------------|   S1
|------------------|---------------|-----------|-------------|   S2
|-----------|-----------|----------|----|----------|---------|   S3
v0                v1        v2
t0         t1     t2   t3   t4    t5    t6  t7 t8  t9

Time  Action                               Consequences
t1    S1 & S3 read, get v0
t2    S2 updates: v1 created               v0 is GC'd ASAP
t3    S3 reads, gets v1
t4    S1 BEGIN TRAN, updates: v2 created
t5    S2 & S3 read, get v1                 v1 still alive!
t6    S3 tries to update                   ERROR -- open update (write conflict)!
t7    S1 COMMIT TRAN                       v1 GC'd ASAP
t8    S2 reads, gets v2
t9    S3 reads, gets v2

Contrast With Pessimistic Concurrency

|-----------|----------------|---------------|---------------|   S1
|------------------|---------------|-----------|-------------|   S2
|-----------|-----------|----------|----|----------|---------|   S3
v0                v1        v2
t0         t1     t2   t3   t4    t5    t6  t7 t8  t9

Time  Action                               Consequences
t1    S1 & S3 read, get v0
t2    S2 updates: v1 created               v1 overwrites v0
t3    S3 reads, gets v1
t4    S1 BEGIN TRAN, updates: v2 created   v2 overwrites v1
t5    S2 & S3 try to read                  BLOCKED!
t6    S3 still blocked...                  No update run
t7    S1 COMMIT TRAN                       S2 & S3 unblocked, read v2
t8    S2 reads, gets v2
t9    S3 reads, gets v2

Wrapping Up Optimistic Concurrency

  • Writers don't block readers any longer
  • No dirty reads: readers see valid data as of transaction begin time
  • A new error type comes up: write collisions
  • Code should handle write collisions: retry? show an error? let the user decide? (A retry sketch follows this list.)
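
A minimal retry sketch in interpreted T-SQL. The procedure name and its parameters are hypothetical; error 41302 is the write-conflict error SQL Server raises when a session tries to update a row that another open transaction has already updated.

    DECLARE @Retries INT = 0;

    WHILE @Retries < 3
    BEGIN
        BEGIN TRY
            -- Hypothetical natively compiled procedure performing the update.
            EXEC dbo.UpdateInventoryQuantity @ProductID = 42, @Delta = -1;
            BREAK;  -- Success: stop retrying.
        END TRY
        BEGIN CATCH
            IF ERROR_NUMBER() = 41302   -- Write conflict: another session won the race.
                SET @Retries += 1;      -- Retry; surface a real error if retries run out.
            ELSE
                THROW;                  -- Anything else is a genuine failure.
        END CATCH;
    END;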

Native Code Compilation

We have two ways of accessing In-Memory OLTP tables: regular interpreted T-SQL (interop) and natively compiled stored procedures.

Native stored procedures are compiled to C code and built into DLLs. There is one DLL per stored procedure, and the minimum size of a compiled procedure is roughly 72 KB.

Each DLL is loaded on first execution, so the first call will be slower. To improve performance, build warm-up routines that run at database startup and execute the important natively compiled procedures.

DLLs are not part of the database or its backups; SQL Server keeps them on the filesystem and regenerates them as needed (for example, at instance restart).
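
A minimal sketch of the SQL Server 2014 syntax; the memory-optimized table dbo.Customers it reads is hypothetical. NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS, and the ATOMIC block (with isolation level and language) are all required:

    CREATE PROCEDURE dbo.GetCustomer
        @CustomerID INT
    WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
    AS
    BEGIN ATOMIC WITH
    (
        TRANSACTION ISOLATION LEVEL = SNAPSHOT,
        LANGUAGE = N'us_english'
    )
        -- Two-part names are required because of SCHEMABINDING.
        SELECT c.CustomerID, c.CustomerName, c.CreditLimit
        FROM dbo.Customers c
        WHERE c.CustomerID = @CustomerID;
    END;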

Native Code Compilation: Plans

All query engine work is done at compile time rather than runtime. When a stored procedure DLL is created, SQL Server generates an execution plan (using OPTIMIZE FOR UNKNOWN to limit parameter sniffing problems).

There are no automatic query plan updates or recompilations. Recompilation happens under three circumstances:

  • Drop and re-create stored procedure
  • Restart the server
  • Fail over the Always-On Availability Group cluster

Native Code Compilation: Limitations

We cannot view actual execution plans for compiled stored procedures, and plans do not get stored in sys.dm_exec_cached_plans. Getting plans involves some work.

Compiled stored procedures cannot:

  • Access on-disk tables
  • Use subqueries (including Common Table Expressions and APPLY), OR, NOT, IN clauses
  • Use COALESCE or NULLIF; ISNULL is OK
  • Use temp tables; table variables based on memory-optimized table types are OK (see the sketch after this list)
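
A minimal sketch of the table-variable substitute for a temp table (all names are hypothetical, and the database is assumed to already have a MEMORY_OPTIMIZED_DATA filegroup, shown in a later sketch). Memory-optimized table variables require a memory-optimized table type, which in turn requires an index:

    CREATE TYPE dbo.OrderIdList AS TABLE
    (
        OrderID INT NOT NULL
            PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1024)
    )
    WITH (MEMORY_OPTIMIZED = ON);
    GO

    -- Usable in interpreted T-SQL or inside a natively compiled procedure.
    DECLARE @Orders dbo.OrderIdList;
    INSERT INTO @Orders (OrderID) VALUES (1), (2), (3);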

Indexing Changes

Clustered indexes are gone. Clustered indexes (sort of) tell us the physical layout on disk of the data, but we don't care about how it's stored on disk--we never read from disk (after startup)!

A new, in-memory friendly, non-clustered index type now exists: Hash indexes.

Hash Index Considerations

  • A hash index hashes the entire index key, so an index on (C1, C2, C3) produces hash values radically different from an index on just (C1); a seek needs equality predicates on every key column.
  • Prefer a narrower key (fewer columns) if the narrower index is still selective and matches your WHERE clauses.
  • Hash indexes are good for exact matches but bad for ranges. For range queries, use the memory-optimized non-clustered (range) index instead.
  • Hash indexes take a parameter specifying the number of memory buckets to create (BUCKET_COUNT). We want the number of buckets to exceed the expected number of rows; this gives us fewer hash collisions.
  • Rows that hash to the same bucket form a linked list that must be walked on every lookup, which makes collisions expensive. (A table-creation sketch with both index types follows.)
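
A minimal end-to-end sketch; the database name, file path, table, and bucket count are hypothetical. It adds the required MEMORY_OPTIMIZED_DATA filegroup, then creates a durable memory-optimized table with a hash index sized above the expected row count plus a range index for time-based queries:

    -- One-time setup: a memory-optimized filegroup and its container.
    ALTER DATABASE ImOltpDemo
        ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;
    ALTER DATABASE ImOltpDemo
        ADD FILE (NAME = 'imoltp_container', FILENAME = 'C:\Data\imoltp_container')
        TO FILEGROUP imoltp_fg;
    GO

    CREATE TABLE dbo.SensorReading
    (
        SensorReadingID INT IDENTITY(1,1) NOT NULL
            PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 2000000),  -- > expected rows
        StationID INT NOT NULL,
        ReadingTime DATETIME2(0) NOT NULL,
        TemperatureC DECIMAL(5, 2) NOT NULL,
        -- Memory-optimized non-clustered (range) index for time-range queries.
        INDEX IX_SensorReading_ReadingTime NONCLUSTERED (ReadingTime)
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);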

Demo Time

In-Memory OLTP Advantages

  • Data is guaranteed to stay in memory, making performance at least as good as on-disk operations and typically much faster.
  • No bottleneck on TEMPDB (historically a huge bottleneck)

V1 General Limitations

  • Loss of visibility: estimated execution plans only, no I/O metrics
  • Execution is single-threaded: no parallelism
  • No:
    • Constraints (foreign key, check, default)
    • TRUNCATE TABLE
    • Replication
    • Database mirroring
    • Altering objects: table, procedures, indexes (use DROP and CREATE)

V1 Native Code Limitations

  • Loss of visibility: estimated execution plans only, no I/O metrics at all, no stats by default
  • No:
    • On-disk table access
    • Cross-database or linked server queries
    • MERGE
    • DISTINCT or WITH TIES
    • Returning more than 8192 rows with TOP (unless using a variable)
    • Subqueries (CTEs, APPLY), OR, NOT, IN
    • COALESCE or NULLIF (use ISNULL instead)
    • Temp tables (use table variables)

Speculation

Based on other V1 features Microsoft has released recently (especially Columnstore indexes), it is fair to say that In-Memory OLTP will undergo some improvement in the next version. Some likely candidates for improvement are:

  • ALTER PROCEDURE and creating non-primary-key indexes after table creation
  • Additional native code features: subqueries, OR, IN, COALESCE, DISTINCT
  • DEFAULT and CHECK constraints; perhaps FOREIGN KEY constraints
  • MERGE

Recommendations

Current best practice is to limit the in-memory size of all durable tables to 250 GB or less.

You should also plan for roughly 2x the table data size in memory, peaking at up to 4x (older row versions awaiting garbage collection and index structures take additional space). This means that if you have 250 GB of in-memory table data, you should have up to 1 TB of RAM you can dedicate to supporting those in-memory tables.
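
One way to ring-fence that memory is Resource Governor. A minimal sketch (the database and pool names are hypothetical) that dedicates a share of managed memory to the database's memory-optimized tables; the binding takes effect the next time the database comes online:

    CREATE RESOURCE POOL Pool_InMemory
        WITH (MIN_MEMORY_PERCENT = 50, MAX_MEMORY_PERCENT = 50);
    ALTER RESOURCE GOVERNOR RECONFIGURE;

    -- Bind the database to the pool (arguments: database name, pool name).
    EXEC sys.sp_xtp_bind_db_resource_pool N'ImOltpDemo', N'Pool_InMemory';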

OLTP Considerations

  • Write-heavy operations
  • Single-table workloads (few foreign key constraints)
  • Simple operations (basic CRUD) and simple data retrieval queries
  • Short-running transactions and requirement for low latency
  • No "weird" HA/DR scenarios aside from Always-On Availability Groups
  • "Brown field" development is easier than "green field" development: you want to know the table use patterns at table creation time.

Good OLTP Scenarios

  • Comments
  • Data analysis (up to 250 GB), feeding into a system like Hadoop
  • Collecting sensor data (e.g., temperature data from many weather stations)
  • Tracking user clicks
  • ETL tables (especially with non-durable, SCHEMA_ONLY tables; see the sketch after this list)
  • Session state management
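
A minimal sketch of a non-durable staging table for the ETL case (all names are hypothetical). DURABILITY = SCHEMA_ONLY skips logging and checkpoint I/O entirely, at the cost of losing the rows on restart:

    CREATE TABLE dbo.StagingClicks
    (
        ClickID BIGINT IDENTITY(1,1) NOT NULL
            PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 10000000),
        UserID INT NOT NULL,
        ClickTime DATETIME2(3) NOT NULL,
        Url NVARCHAR(400) NOT NULL
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);  -- Schema survives restart; rows do not.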

Bad OLTP Scenarios

  • Orders (FK relationships and integrity constraints)
  • Bookkeeping systems (constraints and data rules)
  • Reporting systems (T-SQL & cross-database limitations)
  • Contentious resource problems (e.g., many threads updating the same record's quantity)

What About OLAP?

This is called "In-Memory OLTP" for a reason!

Given size limitations and V1 limitations, In-Memory OLTP might be useful for small OLAP systems, but the real benefit is in multi-writer scenarios. Good OLAP design means having only one writer, and many warehouses blow past 2TB of data.

Wrapping Up

In-Memory OLTP is equally intriguing and maddening. I want to recommend it now, but you really need to understand your systems in order to benefit from it, and it doesn't always work for every scenario.

 

Slides and code are available at http://www.catallaxyservices.com/presentations/In-Memory-OLTP

E-Mail: feasel@catallaxyservices.com

Twitter: @feaselkl