SQL Book

SQL Inner Join - examples and explanations

The nature of relational database design means that we will often have related data that is stored in different tables. To retrieve data from two or more tables in one query we use the SQL JOIN statement.

The JOIN clause tells the database how the data in the two tables is related so that it can return a correct representation of the related data.

The SQL INNER JOIN clause tells the database to only return rows where there is a match found between table1 and table2. An INNER JOIN is most often (but not always) created between the primary key column of one table and the foreign key column of another table.

An easy INNER JOIN example

Let us assume we have an Employee table and a TrainingTaken table. When an employee goes on a training course they have the details of the training recorded in the TrainingTaken table.

Some example data is shown below:

Employee

EmployeeID Name Telephone StartedEmployment
1 Bob Marley 0222 00000 10/01/2005
2 John Lennon 0222 00050 02/05/2003
3 Ralph Kimball 0222 03307 01/04/2004
4 Bill Gates 0222 03307 01/04/2004

TrainingTaken

TrainingTakenID EmployeeID TrainingTitle TrainingDate
1 2 Sales Training 10/01/07
2 1 Risk Management 05/02/07
3 2 First Aid 01/03/07
4 3 Sales Training 10/01/07

The Employee table has a primary key column called EmployeeID which relates to the foreign key column in the TrainingTaken table called EmployeeID.

Now that we know how these two tables relate to each other we can write a query that correctly 'joins' or 'matches' related data from these two tables. To do this we must specify in our INNER JOIN clause, the relationship between the EmployeeID column in the Employee table and the EmployeeID column in the TrainingTakentable.

Lets write a query that returns a list of employee names along with the title and date of any training they have been on.

SELECT Employee.Name, TrainingTaken.TrainingTitle, TrainingTaken.TrainingDate
FROM Employee 
	INNER JOIN TrainingTaken ON Employee.EmployeeID = TrainingTaken.EmployeeID

Returns:

Name TrainingTaken TrainingDate
Bob Marley Risk Management 05/02/07
John Lennon Sales Training 10/01/07
John Lennon First Aid 01/03/07
Ralph Kimball Sales Training 10/01/07

The above query demonstrates the INNER JOIN clause which specifies the two tables that we are using and then uses the ON keyword to define the relationship or 'joining points' between the two tables.

We can see that columns are identified by using TableName.ColumnName syntax so that the query knows which table to find the column we are referencing. This is because the same column name may be present in more than one table (e.g. the column name EmployeeID appears in both tables in our example).

The INNER JOIN clause in the example above can be rewritten in an alternative format (a JOIN condition) by defining the relationship between the two tables in the WHERE clause.

SELECT Employee.Name, TrainingTaken.TrainingTitle, TrainingTaken.TrainingDate
FROM Employee JOIN TrainingTaken
WHERE Employee.EmployeeID = TrainingTaken.EmployeeID

From the above example we can see that the JOIN keyword can be used as shorthand for INNER JOIN and the ON section of the INNER JOIN clause can be replaced by specifying the relationship between the two tables in the WHERE clause.

OUTER JOINS

It is important to note that the INNER JOIN Only returns data where a match is found. Therefore no information for Bill Gates was returned from this query as there is no record in the TrainingTaken table that contains Bill Gates EmployeeID.

If we wanted to return all rows from one table regardless of whether they had any related data in the table they were joining to we would need to use an OUTER JOIN.

Using table aliases for more readable queries

In order to make our queries more readable, quicker to write and debug we can use table aliases. This is where we give the table a shorthand name (after its definition in the FROM clause). We can then use this shorthand name in the rest of the query when referencing the table. Using table aliases we can rewrite our initial INNER JOIN example as follows:

SELECT E.Name, TT.TrainingTitle, TT.TrainingDate
FROM Employee E
	INNER JOIN TrainingTaken TT ON E.EmployeeID = TT.EmployeeID

In the above example we have given the Employee table an alias of E and the TrainingTaken table an alias of TT. You can see how this helps the readability of queries, especially if you have a query which had 3 or more tables joined together.

Joining more than 2 tables

We can create more than one join in a query as follows:

SELECT a.Column1, b.Column1, c.Column1
FROM TableA a INNER JOIN
	TableB b ON a.AID = b.AID INNER JOIN
	TableC c ON b.BID = c.BID
ORDER BY a.Column1, b.Column1

The same query could be rewritten as:

SELECT a.Column1, b.Column1, c.Column1
FROM TableA a JOIN TableB b JOIN TableC c
WHERE a.AID = b.AID 
	AND b.BID = c.BID
ORDER BY a.Column1, b.Column1

Conclusion

In this article we have demonstrated the concept of the INNER JOIN and how it is used to match data from related tables. We can see that the JOIN clause is an essential tool when querying well designed, relational databases. Without the JOIN clause we would not be able to define meaningful relationships between tables in our database.

We have seen how the join relationship can be specified either after the ON keyword in the JOIN clause or in the WHERE clause. We have also seen how table aliases can help make writing, reading and maintaining our SQL statements easier.

We have seen how the INNER JOIN only returns data from a table when a match is found in the other table specified in the JOIN. When data is required from one of the tables in the JOIN regardless of whether a match is found in the other table then an OUTER JOIN must be used.