Using SQL Cross Join - the report writers secret weapon
The SQL CROSS JOIN is an often overlooked part of SQL but it can prove invaluable if you are aware of which situations it can be useful in.
A Cross Join (also sometimes known as a Cartesian Join) results in every row of one table being joined to every row of another table. The CROSS JOIN does not establish a relationship between the two tables like an INNER JOIN or OUTER JOIN would by specifying which columns to join on.
The example below creates two basic tables and performs a cross
join on them:
CREATE TABLE Customer (
CustomerID int,
CustomerName varchar(50)
)
CREATE TABLE Product (
ProductID int,
ProductName varchar(50)
)
INSERT INTO Customer VALUES (1, 'Joe Bloggs')
INSERT INTO Customer VALUES (2, 'Bob Marley')
INSERT INTO Customer VALUES (3, 'Jimi Hendrix')
INSERT INTO Product VALUES (1, 'MP3 Player')
INSERT INTO Product VALUES (2, 'LCD TV')
SELECT * FROM Customer CROSS JOIN Product
=======
Results:
=======
CustomerID CustomerName ProductID ProductName
----------- --------------- ------------ ----------------------
1 Joe Bloggs 1 MP3 Player
1 Joe Bloggs 2 LCD TV
2 Bob Marley 1 MP3 Player
2 Bob Marley 2 LCD TV
3 Jimi Hendrix 1 MP3 Player
3 Jimi Hendrix 2 LCD TV
You can see from the above example that a cross join gets a record
from the first table and then creates a new row for every row in the
2nd table. It then does the same for the next record in the first
table etc.
When there is no WHERE filters applied to the CROSS JOIN the number
of records returned will be the number of records in Table1 x the
number of records in Table2. i.e. in the above example this is 3 x 2 =
6. With even moderate sized tables of a few thousand records you can
see this can quickly produce very large resultsets that take large
amounts of memory and can greatly reduce the performance of the
application. e.g. 2000 customers CROSS JOIN 2000 products with no
WHERE clause = 4 million records.
Using the WHERE Clause with the CROSS JOIN
We can use the WHERE clause with the CROSS JOIN to filter the
records returned:
SELECT * FROM Customer CROSS JOIN Product
WHERE Customer.CustomerID = 2
=======
Results:
=======
CustomerID CustomerName ProductID ProductName
----------- --------------- ------------ ----------------------
2 Bob Marley 1 MP3 Player
2 Bob Marley 2 LCD TV
Many text books cite that CROSS JOINS should be avoided due to the
very large recordsets they create and that the only reason for using
CROSS JOIN is to create some dummy data that can be used for testing
your queries with. There are however, much more useful and powerful
reasons for using CROSS JOIN.
Using CROSS JOIN as part of a report query
For an example of how we can utilise the CROSS JOIN when writing a
query to generate a report let us use the example that we want to
generate a report of how many items were sold for each customer and
each product. A common attempt to try this would be:
-- assumes a 'Sales' table with the following columns:
-- CustomerID, ProductID, Quantity
SELECT C.CustomerName, P.ProductName, SUM(S.Quantity) As NumberSold
FROM Customer C INNER JOIN Sales S
ON C.CustomerID = S.CustomerID INNER JOIN Product P
ON P.ProductID = S.ProductID
GROUP BY CustomerName, ProductName
ORDER BY CustomerName, ProductName
Remember we want evidence of each customer and product in our
resultset even if there have been no sales for that customer / product
combination. The problem with the above query is that if a customer or
product has no sales associated with it there would be no record of
the customer or product in the resultset. i.e. If NumberSold is 0 then
there is no evidence of the customer or product existing.
We can solve this problem by using a CROSS JOIN in combination with
another sub query. If we use a CROSS JOIN query as the basis of our
report we can guarantee that we will get every Customer / Product
combination in our resultset:
SELECT C.CustomerName, P.ProductName
FROM Customer C CROSS JOIN Product P
We now need to add the NumberSold column onto our resultset and we
can do this by using a derived table and joining it to our
original query by using a LEFT OUTER JOIN:
SELECT C.CustomerName, P.ProductName, S.NumberSold
FROM Customer C CROSS JOIN Product P
LEFT OUTER JOIN
(SELECT CustomerID, ProductID, SUM(Quantity) As NumberSold
FROM Sales
GROUP BY CustomerID, ProductID) S
ON C.CustomerID = S.CustomerID AND
P.ProductID = S.ProductID
ORDER BY C.CustomerName, P.ProductName
========
Results:
========
CustomerName ProductName NumberSold
------------- --------------- -----------
Bob Marley LCD TV 1
Bob Marley MP3 Player NULL
Jimi Hendrix LCD TV 5
Jimi Hendrix MP3 Player NULL
Joe Bloggs LCD TV NULL
Joe Bloggs MP3 Player 3
Great! This query does what we wanted but if the NumberSold was 0
then the LEFT JOIN gives us NULL in the NumberSold column. We can
convert NULL to 0 in our application or in SQL Server we can use the
ISNULL() function to convert NULL to 0. This is demonstrated below
where we amend the first line of our SELECT statement.
SELECT C.CustomerName, P.ProductName, IsNull(S.NumberSold, 0) As NumberSold
This join pattern of: Table1 CROSS JOIN Table2 LEFT OUTER JOIN
DerivedTable3 is the secret weapon of the report writer and can be
used to provide an efficient, elegant solution to the 'missing rows'
problem.
Adding row filters to our report
WHERE clauses can be applied in both the outer query (the query
with the CROSS JOIN) and the inner query (the aggregating part) to
filter the data.
For instance to filter the data by Customer or Product we would add
a WHERE clause to the outer query:
WHERE C.CustomerName LIKE 'J%'
To filter data by Sales characteristics such as DateOfSale or
SalesValue we would add a WHERE clause to the inner query:
-- assumes Sales table has a column called SalesValue
SELECT C.CustomerName, P.ProductName, IsNull(S.NumberSold, 0) As NumberSold
FROM Customer C CROSS JOIN Product P
LEFT OUTER JOIN
(SELECT CustomerID, ProductID, SUM(Quantity) As NumberSold
FROM Sales
WHERE SalesValue > 100.00
GROUP BY CustomerID, ProductID) S
ON C.CustomerID = S.CustomerID AND
P.ProductID = S.ProductID
WHERE C.CustomerName LIKE 'J%'
ORDER BY C.CustomerName, P.ProductName
Conclusion
In this article we have seen how the SQL CROSS JOIN works by
combining each row in one table with each row in another table. This
can produce very large resultsets that can bring a database to it
knees. To keep the size of the resultset down it is advisable to:
- Specify which fields to select from each table in the cross join
rather than using SELECT *
- Use a WHERE clause to reduce the number of rows returned
The CROSS JOIN is a powerful tool when used for creating reports
that require each row from one table to be displayed for one or more
rows of another table. The CROSS JOIN solution is:
Table1 CROSS JOIN Table2 LEFT JOIN DerivedTable3
(where the DerivedTable3 is an aggregating sub query.)
A WHERE Clause can be added to both the outer query and the inner
query to filter the data according to the needs of the report.
We hope you find this useful in helping you develop your reports.
Discuss this article in the forums
Tell a friend about this page