SQL Merge - a standard SQL statement for more elegant queries

When we write code that corresponds to existing data we often write seperate statements that either UPDATE or INSERT records depending on whether certain conditions relating to the original data and new version of the data are met.

The SQL Merge statement is an official SQL standard (introduced in SQL: 2003 - the 5th version of the SQL standard) that enables UPDATE, INSERT and DELETE statements to be combined into one easy to use statement.

SQL Server 2008 is the first version of SQL Server to support the MERGE statement. Other major databases such as Oracle and DB2 also support MERGE.

Common scenario - UPSERT

A common scenario in any sales database is the updating of a sales order. A common scenario may be for a sales advisor to receive a telephone query from a customer and either amend delivery information, add items / remove items or update quanities of items that the order contained. A block of SQL code that either performs an UPDATE or INSERT statement depending on whether the row already exists is often known as an UPSERT.

Lets assume our orders are stored in an OrderSummary table and an OrderItem table which holds information on the products associated with the order. To update an order without using the MERGE statement we would need to do something similar to the following:

-- assumes a table variable @UpdatedItems that holds 
-- the updated OrderItems, also variables with values assigned
-- to them for for OrderID, FirstName, LastName and Telephone.

-- UPSERT the OrderSummary table
IF Exists(SELECT OrderID FROM OrderSummary WHERE OrderID = @OrderID)
BEGIN
	UPDATE OrderSummary 
	SET FirstName = @FirstName, LastName = @LastName, Telephone = @Telephone
	WHERE OrderID = @OrderID
END
ELSE
BEGIN
	INSERT INTO OrderSummary
	(OrderID, FirstName, LastName, Telephone)
	VALUES (@OrderID, @FirstName, @LastName, @Telephone)
END

-- UPDATE the products in the OrderItem table that are in the 
-- updated order and already exist in the Orderitem table

UPDATE oi SET Quantity = ui.Quantity, UnitCost = ui.UnitCost
FROM OrderItem oi INNER JOIN @UpdatedItems ui
ON oi.ProductID = ui.ProductID
WHERE oi.OrderID = @OrderID

-- INSERT new products into the OrderItem table that are in the 
-- updated order information but missing in the OrderItem table.

INSERT INTO OrderItem (OrderID, ProductID, Quantity, UnitCost)
VALUES (
	SELECT OrderID, ProductID, Quantity, UnitCost 
	FROM @UpdatedItems 
	WHERE OrderID = @OrderID 
		AND ProductID NOT IN 
              (SELECT ProductID FROM OrderItem WHERE OrderID = @OrderID)
	)

-- DELETE products from the OrderItem table that do not exist in 
-- the updated order information

DELETE FROM OrderItem WHERE ProductID NOT IN (SELECT ProductID FROM @UpdatedItems)

Now we have seen the type of solution we may try and solve using the MERGE statement lets examine its syntax before we rewrite the above example.

Syntax

MERGE target_table_name
USING source_table_name
ON match_condition
WHEN MATCHED THEN UPDATE SET Col1 = val1 [, Col2 = val2...]
WHEN [TARGET] NOT MATCHED THEN INSERT (Col1 [,Col2...] VALUES (Val1 [, Val2...])
WHEN SOURCE NOT MATCHED THEN DELETE;

A key feature of the MERGE statement is that it requires a semi-colon at the end of the statement to clearly delimit the end of the statement.

The target_table is the table that any UPDATE / INSERT / DELETE statements that are generated in the MATCHED / NOT MATCHED clauses affect. The source_table is the data that we recieve from the application that has amended the data.

The match_conditions are the join(s) and any comparison operators that are used to SELECT records.

  • If a row in the target table and source table match the match_condition(s), then MATCHED = true.
  • If a row from the source_table does not exist in the target table then NOT MATCHED = true.
  • When a row exists in the target but not in the source then SOURCE NOT MATCHED = true.

A more elegant solution using MERGE?

Using the MERGE statement we can update our above example to the following:

-- assumes a table variable @UpdatedItems that holds 
-- the updated OrderItems, also variables with values assigned
-- to them for for OrderID, FirstName, LastName and Telephone.

-- Merge order information into OrderSummary table
MERGE INTO OrderSummary As os
USING (SELECT @OrderID As OrderID, @FirstName As Firstname, 
@LastName As LastName, @Telephone As Telephone) As NewSummary
ON NewSummary.OrderID = os.OrderID
WHEN MATCHED THEN 
	UPDATE SET os.FirstName = NewSummary.FirstName,
		os.LastName = NewSummary.LastName,
		os.Telephone = NewSummary.Telephone
WHEN NOT MATCHED THEN
	INSERT (OrderID, FirstName, LastName, Telephone)
	VALUES (@OrderID, @FirstName, @LastName, @Telephone);

-- Merge order items into OrderItems table
MERGE INTO OrderItem As oi
USING @UpdatedItems ui
ON (oi.OrderID = ui.OrderID AND oi.ProductID = ui.ProductID)
WHEN MATCHED THEN
	UPDATE SET Quantity = ui.Quantity, UnitCost = ui.UnitCost
WHEN NOT MATCHED THEN
	INSERT (OrderID, ProductID, Quantity, UnitCost)
	VALUES (@OrderID, ui.ProductID, ui.Quantity, ui.UnitCost)
WHEN SOURCE NOT MATCHED THEN
	DELETE;

Adding extra conditions to the MATCH clauses

You can also add extra conditions to the MATCH clauses and run different code depending on the clause:

WHEN MATCHED AND ui.Quantity >= 10 THEN
	-- Update with 5% discount off the UnitCost for bulk buying reward
WHEN MATCHED AND ui.Quantify < 10 THEN
	-- Update with normal UnitCost

Conclusion

The introduction of the MERGE statement into T-SQL is a welcome step for many SQL Server users as it provides a clear, easy to understand syntax for a common scenario that used to require a number of statements to solve. It should be of great use in data warehousing ETL routines.

Because MERGE is a SQL: 2003 standard it also means that you can transfer your knowledge of this statement to other database systems such as Oracle and DB2.