One of the features I’m looking forward to testing is SQL Server 2019’s new ability to make a more accurate “guess” at the estimated number of rows in table variables. The new feature is called “Table variable deferred compilation.” Will this new feature mean that the query optimizer can build a much better query plan?
I decided to test this feature. My test setup: SQL Server 2017 (latest update) and SQL Server 2019 CTP 2.1. I will make a table holding 2,000,000 rows, of which 10 match the predicate in the SELECT statement used later on. I will compare the two execution plans and the overall execution time. Let’s get started. Here is the setup: first, I created a table to hold the test values and used this table to fill the table variable.
CREATE TABLE [dbo].[TableVariableExample](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [total] [money] NULL,
    [navn] [char](10) NULL
) ON [PRIMARY]
GO
I used this script to populate the table.
DECLARE @i INT, @val MONEY
SET @i = 1
WHILE @i <= 2000000
BEGIN
    SELECT @val = CONVERT(MONEY, RAND() * 150)
    IF @i IN (10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000)
    BEGIN
        INSERT INTO [TableVariableExample] VALUES (@val, 'it-Craft')
    END
    ELSE
    BEGIN
        INSERT INTO [TableVariableExample] VALUES (@val, REPLICATE('c', 10))
    END
    SET @i = @i + 1
END
And now for the test, first on SQL Server 2019 CTP 2.1.
DECLARE @MyTableVariable TABLE (
    Id int,
    total money,
    navn char(10)
)

INSERT INTO @MyTableVariable
SELECT * FROM TableVariableExample

SET STATISTICS IO, TIME ON;

SELECT navn, total
FROM @MyTableVariable
WHERE navn = 'it-Craft'

SET STATISTICS IO, TIME OFF;
Now, let's look at the execution plan. Note: the estimated number of rows is 1414.21.
Let's run the same query on SQL Server 2017.
Note: the estimated number of rows is 1.
Let's compare the 2 results.
SQL Server 2019:
Estimated number of rows: 1414.21
Logical reads: 7,663
SQL Server 2017:
Estimated number of rows: 1
Logical reads: 7,663
OK, the actual number of rows is 10, so the new “Table variable deferred compilation” feature still seems to guess a bit. There are still no statistics on table variables, as we have with temp tables, but why 1414.21? Well, it turns out to be the square root of the total number of rows in the table.
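As a quick sanity check of that claim (my own calculation, not something from the documentation), the square root of the 2,000,000 rows matches the estimate:

```sql
-- SQRT(2000000) is roughly 1414.21, matching the estimated number of rows
SELECT SQRT(2000000.0) AS GuessedEstimate
```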
Is 1414.21 better than estimating 1 row? Well, in my example I don’t have evenly distributed data, but in a real-world example the 2019 estimate would probably be a much better guess than 1, and the overall performance I measured was better on SQL Server 2019. Now, let’s test how joins are handled with this new, better estimate. I will create a table with 25,000 rows and then join it with the table variable. I will use this script:
DROP TABLE IF EXISTS dbo.TableVariabletest

CREATE TABLE dbo.TableVariabletest(
    Id INT,
    Name CHAR(8)
)

DECLARE @i INT = 1
WHILE @i <= 25000
BEGIN
    INSERT INTO dbo.TableVariabletest
    VALUES ((SELECT ABS(CHECKSUM(NEWID())) % 2000000), 'MatchRow')
    SET @i = @i + 1
END
I will run this simple query first on SQL Server 2019 and then on SQL Server 2017.
SELECT Name
FROM @MyTableVariable MT
INNER JOIN TableVariabletest TV ON MT.Id = TV.Id
Execution plan on SQL Server 2019 – Execution time: 1193 ms
Execution plan on SQL Server 2017 – Execution time: 4411 ms
Look at the SQL Server 2017 version.
The table variable is used as the outer table in a nested loop join. That’s because the query optimizer expects 1 row, but in our example the table variable holds 2,000,000 rows, making this a poor choice of join method. In SQL Server 2019, we get a hash match join and an overall faster execution. That’s clearly an improvement in performance.
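For completeness: on versions before SQL Server 2019, a common workaround is a statement-level recompile, which lets the optimizer see the actual row count of the table variable at compile time (at the cost of recompiling the statement on every execution). A sketch against the tables above:

```sql
-- On SQL Server 2017 and earlier, OPTION (RECOMPILE) makes the optimizer
-- use the table variable's actual row count instead of estimating 1 row
SELECT Name
FROM @MyTableVariable MT
INNER JOIN TableVariabletest TV ON MT.Id = TV.Id
OPTION (RECOMPILE)
```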
Execution plans with table variables look better in SQL Server 2019, and with evenly distributed data I would expect better execution than in prior versions of SQL Server, but for now I would still prefer a temp table, where I have statistics and the ability to create an index.
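As a sketch of that alternative, the same test could be run with a temp table instead (the temp table and index names here are my own illustration, not part of the test above):

```sql
-- Temp tables get statistics and support indexes, so the optimizer
-- can estimate the 10 matching rows far more accurately
CREATE TABLE #MyTempTable (
    Id int,
    total money,
    navn char(10)
)

INSERT INTO #MyTempTable
SELECT * FROM TableVariableExample

CREATE INDEX IX_navn ON #MyTempTable (navn)

SELECT navn, total
FROM #MyTempTable
WHERE navn = 'it-Craft'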
Thanks for reading. If you have any questions, troubles with your SQL server, queries, or environment, please don’t hesitate to call me or one of my colleagues at Unit IT phone: 88 333 333