In-Memory OLTP Programmability - Pro SQL Server Internals

Database Reference

In-Depth Information

Chapter 33

In-Memory OLTP Programmability

This chapter focuses on the programmability aspects of the in-memory OLTP engine in SQL Server. It describes the

process of native compilation, and it provides an overview of the natively-compiled stored procedures and T-SQL

features, which are supported in in-memory OLTP. Finally, it discusses several questions related to the design of new

and the migration of existing systems to the in-memory OLTP architecture.

Native Compilation

As you already know, memory-optimized tables can be accessed from regular T-SQL code using the query interop

engine. This approach is very flexible. As long as you work within the supported feature set, the location of data

is transparent. The code does not need to know, nor does it need to worry about if it works with on-disk or with

memory-optimized tables.

Unfortunately, this flexibility comes at a cost. T-SQL is an interpreted and CPU-intensive language. Even a simple

T-SQL statement requires thousands, and sometimes millions, of CPU instructions to execute. Even though in-memory

data location dramatically speeds up data access and eliminates latching and locking contentions, the overhead of T-SQL

interpretation and execution sets limits the level of performance improvements achievable with in-memory OLTP.

In practice, it is possible to see a 2X-3X system throughput increase when memory-optimized data is accessed

through the interop engine. To improve performance even further, in-memory OLTP utilizes native compilation. As a

first step, it converts any row-data manipulation and access logic into C code, which is compiled into DLLs and loaded

into SQL Server process memory. Those DLLs (one per table) consist of native CPU instructions, and they execute

without any further code interpretation overhead of T-SQL statements.

Consider the simple situation where you need to read the value of a fixed-length column from a data row.

In the case of on-disk tables, SQL Server obtains the starting offset and length of the column from the system catalogs,

and it performs the required manipulations to convert the sequence of bytes to the required data type. With

memory-optimized tables, the DLL already knows what is the column offset and data type. SQL Server can read data

from a pre-defined offset in a row using a pointer of the correct data type without any further overhead involved.

As you can guess, this approach dramatically reduces the number of CPU instructions required for the operation.

On the flip side, this approach brings some limitations. You cannot change the format of a row after the DLL is

generated. The compiled code would not know anything about the changes. This problem is more complicated than it

seems, and simple recompilation of the DLL does not address it.

Again, consider the situation where you need to add another nullable column to a table. This is a metadata-level

operation for on-disk tables, which does not change the data in existing table rows. T-SQL would be able to detect that

column data is not present by analyzing the various data row properties at runtime.

Search WWH ::

Custom Search

Home