Merge vs Insert, update and delete

Scroll

I often see that when people learn about the MERGE statement, they use it a lot, and I do understand that the MERGE statement is easy to read and quite practical as you only need one statement to write an insert, update and delete a statement. The MERGE statement as described on docs.

Merge statement on docs

MERGE   
    [ TOP ( expression ) [ PERCENT ] ]   
    [ INTO ] <target_table> [ WITH ( <merge_hint> ) ] [ [ AS ] table_alias ]  
    USING <table_source>   
    ON <merge_search_condition>  
    [ WHEN MATCHED [ AND <clause_search_condition> ]  
        THEN <merge_matched> ] [ ...n ]  
    [ WHEN NOT MATCHED [ BY TARGET ] [ AND <clause_search_condition> ]  
        THEN <merge_not_matched> ]  
    [ WHEN NOT MATCHED BY SOURCE [ AND <clause_search_condition> ]  
        THEN <merge_matched> ] [ ...n ]  
    [ <output_clause> ]  
    [ OPTION ( <query_hint> [ ,...n ] ) ]      
;

But though it is nice to have one statement to handle it all, if we look at the performance tip on docs:

 “The conditional behavior described for the MERGE statement works best when the two tables have a complex mixture of matching characteristics. For example, inserting a row if it doesn’t exist, or updating a row if it matches. When simply updating one table based on the rows of another table, improve the performance and scalability with basic INSERT, UPDATE, and DELETE statements. For example:”

That is what I often see the merge statement used for, simple Updates and inserting statements. Lets try it out in a very small table. I will copy some data from the AdventureWorks database and make an update table with this script:

DROP TABLE IF EXISTS PersonCopy
DROP TABLE IF EXISTS PersonCopyUpdate
SELECT ROW_NUMBER() OVER (ORDER BY LastName) EmpId ,FirstName, LastName, E.JobTitle
INTO PersonCopy
FROM Person.Person P INNER JOIN HumanResources.Employee E On P.BusinessEntityID = E.BusinessEntityID
SELECT EmpId ,FirstName, LastName, JobTitle
INTO PersonCopyUpdate
FROM PersonCopy WHERE FirstName = 'Michael' AND 1 % 2 = 1
UPDATE PersonCopyUpdate
SET FirstName = 'Jack',
    LastName = 'Sparrow'
INSERT INTO PersonCopyUpdate VALUES( 999, 'Super', 'Man', 'Hero'), (666, 'The', 'Devil', 'Bad Guy')
GO
CREATE UNIQUE CLUSTERED  INDEX idx_PersonCopy ON PersonCopy (EmpId)
CREATE UNIQUE CLUSTERED  INDEX idx_PersonCopyUpdate ON PersonCopyUpdate (EmpId)

I will create an EmpId based on a simple ROW_NUMBER() function, this id is used for the matching, then I will copy some of the rows into the PersonCopyUpdate table, making some changes, and adding 2 new rows. Finally, I will create a Clustered unique index on both tables.
Now for this simple test, In the MERGE statement. I will use PersonCopy as the target and PersonCopyUpdate as the source, then I’ll use EmpId as the matching column. Then, when matched, I will update the FirstName and LastName from PersonCopyUpdate When not matched, insert and when the target rows don’t exist in the source table then delete, it looks like this:

MERGE PersonCopy AS TARGET
USING (SELECT EmpId, FirstName,LastName, Jobtitle FROM PersonCopyUpdate) AS SOURCE
ON (TARGET.EmpId = SOURCE.EmpId )
WHEN MATCHED 
    THEN UPDATE SET FirstName = SOURCE.FirstName,
                    LastName = SOURCE.LastName
WHEN NOT MATCHED
    THEN INSERT (EmpId, FirstName,LastName,Jobtitle) VALUES(SOURCE.EmpId, SOURCE.FirstName, SOURCE.LastName, SOURCE.Jobtitle)
WHEN NOT MATCHED BY SOURCE
    THEN DELETE

When running the statement I should end up with 10 x Jack Sparrow one superman and one The Devil:

And the statistics look like this:

Now let’s write an update, insert and delete a statement to do the exact same:

--Update matching Rows
UPDATE P
SET FirstName = S.FirstName,
    LastName = S.LastName
FROM PersonCopy P INNER JOIN PersonCopyUpdate S ON P.EmpId = S.EmpId
-- Insert new Rows
INSERT INTO PersonCopy
SELECT        EmpId,
       FirstName,
        LastName,
        JobTitle
FROM PersonCopyUpdate S
WHERE NOT EXISTS (
        SELECT NULL
        FROM PersonCopy P
        WHERE P.EmpId = S.EmpId
    )
--Delete Rows
DELETE p
FROM PersonCopy P 
WHERE NOT EXISTS (
        SELECT NULL
        FROM PersonCopyUpdate S
        WHERE S.EmpId = P.EmpId
    )

And then once again we should end up with 10 x Jack Sparrow, one Superman, and one The Devil:

And look at the statistics: (I only show the summarised totals. I used the fantastic website http://statisticsparser.com/)

Ok, the time is less than it was with the MERGE statement (I have nothing else running on this server to interfere with the time measures)

Ok let’s make it a whole lot bigger, lets put a lot more data into the tables:

CREATE TABLE PersonCopy(
 EmpId INT IDENTITY(1,1),
 FirstName NAME,
 LastName NAME,
 JobTitle NVARCHAR(75)
)
GO
INSERT INTO PersonCopy (FirstName, LastName, JobTitle)
SELECT FirstName, LastName, E.JobTitle
FROM Person.Person P INNER JOIN HumanResources.Employee E On P.BusinessEntityID = E.BusinessEntityID
GO 5000

And then take the top 100 into PersonCopyUpdate:

SELECT TOP 100 EmpId ,FirstName, LastName, JobTitle
INTO PersonCopyUpdate
FROM PersonCopy WHERE FirstName = 'Michael' AND 1 % 2 = 1

Now let’s run the same test as before, it should end up with 102 rows:

And the statistics this time

Now for the update, Insert and delete statements (run in that order) ending up with 102 rows as expected and with these stats:

Note that the time is 35 sec where the merge was 39 sec, but look at the logical reads, they total 203,008 on PersonCopy with the Insert, update Delete (totals) and 4,516,795 on PersonCopy in the MERGE statement.

But what if I change the order of Insert, update, and delete, and start with the delete? I have that option when I write the statements separately, here are the statistics:

And now we’re down to 28 sec, almost 11 sec faster than the merge statement.

When should you use MERGE? – It depends (I love that answer, you will often get it when you ask a question regarding SQL)

I have seen examples where MergM was almost as fast and had nearly the same logical reads as an insert used as a substitute for an insert.

You must test it, just remember MERGE is ideal for complex statements, as stated in the performance tip on docs, and is not ideal for simple Insert, update, and delete statements.

Thanks for reading. If you have any questions, troubles with your SQL Server, queries, or environment, please don’t hesitate to call me or one of my colleagues at Unit IT phone: 88 333 333.

Ring til Danmarks Mest Anbefalede Virksomhed 2020