3 More Joins You Should Be Familiar With

Share this article

There are many ways to JOIN data from two database tables and filter the information you require. Craig Buckler wrote a popular piece on understanding JOINs; namely INNER, LEFT, RIGHT, and FULL OUTER. This article is an extension of that one.

Let’s recap these real quick. Picture two tables, one for customers and one for books to establish a book loan database.


customers table

id firstname lastname book_id
1 Joe Blow 1
2 Jane Doe 2
3 Harry Crow 2
4 Jeffrey Snow 0

books table

id Title
1 Star Wars
2 Jurassic Park
3 Little Women
4 Tom Sawyer

The books table has one row for every book.

The customers table has one row for each customer who can only have one book on loan at a time. If they have no book on loan, the book_id would be 0 or an empty string.

This is a very simple example to make the JOINs as clear as possible to understand!

A LEFT JOIN here would be in the case you want to ask a question such as “show me all customers including any books on loan.

Left Join Venn Diagram

You can see in the image that ALL data in the left circle, or table, is included in the result set. Only data that overlaps from the books table is included from the right table. This means that with a LEFT JOIN, some data in the right table may be excluded.

A RIGHT JOIN would be like asking “show me all the books in my library, along with any customers that have borrowed them.

Right Join Venn Diagram

In this image you see that ALL data in the right table is included in the result set. Only data that overlaps from the customers table is included. This means that with a RIGHT JOIN, some data in the left table may be excluded.

An OUTER JOIN would be like asking “show me all loans and all books, regardless of connections between them.

Full Outer Join Venn Diagram

In this image, you can see that ALL data from both tables will be included, regardless of whether some data overlaps. When a query like this is done, you will have non-overlapping data in the result, and those fields will be set to NULL.

An INNER JOIN would be like asking “show only customers with a loan.

Inner Join Venn Diagram

Here you can see that data may be excluded from both the left and right tables. You won’t see any customers if they DON’T have a book out, and you won’t see any books if they are NOT loaned out!

This is the most common type of data and is the default behavior when using the keyword JOIN by itself. The added word “INNER” is not usually required.

What Does “Left” And “Right” Mean Anyway?

You can think of “left” and “right” as nothing more than source order. Look at this query:

SELECT *
FROM customers
LEFT JOIN books
ON customers.book_id = books.id

Notice that I mentioned the table customers before I used the JOIN keyword. This means customers is my “left” table. Another way of thinking about it is to ask which table is left of the JOIN keyword, and which is to the right of it.

A couple caveats:

  • Simply because one table is the “left” one, does not mean you are selecting ALL its records. This is the function of LEFT JOIN or RIGHT JOIN, not source order!
  • The table that is the left one will have its columns listed first in the result set unless you specifically choose columns in the SELECT section.
  • The order of tables after the ON keyword doesn’t matter, it would return the same results if swapped.

Enough Reviewing. What Are These New JOINs?

Well, they are not “new”, they are just a way to query additional questions. How would you find answers to questions with the words “don’t” or “not” in the query?

Show me all the customers who DON’T have books on loan.

Show me all the books that are NOT loaned out.

LEFT JOIN with Exclusion

Take a look at this graphic. Notice the difference from the above JOINs in the shaded area.

Left join with exclusion Venn Diagram

This looks like a LEFT JOIN, but no longer has the “overlap” data included. Why? This is the same as asking “show me customers who have no books on loan.” Perhaps you want to select all customers without a book and send them a newsletter with a special discount?

If you are clever you might think you can just search the customers table for a book_id of 0 to do the same thing. That would work in this example but most of the time it won’t; it depends how your tables are designed.

The query looks like this:

SELECT *
FROM customers
LEFT JOIN books
ON customers.book_id = books.id
WHERE books.id IS NULL

We’ve now included a WHERE clause. Any time you have a WHERE clause, you are excluding data, or filtering it, almost like a search. So why are we searching for books.id being NULL? Why would that be NULL? Let’s run the standard LEFT JOIN and see the data it returns, we’ll have the answer:

LEFT JOIN return data

id firstname lastname book_id id1 title
1 Joe Blow 1 1 Star Wars
2 Jane Doe 2 2 Jurassic Park
3 Harry Crow 2 2 Jurassic Park
4 Jeffrey Snow 0 null null

Can you see the issue? This is a LEFT JOIN which means ALL data from the customers table is included regardless of overlap with books. Jeffrey Snow is included but he does not have a book on loan, this is why the “id1” and “title” columns are set to NULL. When the tables are JOINed, there would not be a book title or book ID linked to him.

If we ask “show me all customers including which books they have”, you would want the above data because Jeffrey is a customer regardless if he has a book on loan or not. That would be the function of a LEFT JOIN in this case.

If we ask the question “show me customers with no books on loan”, it now makes perfect sense what to look for. We only need to select customers where we see NULL for the books.id column (when JOINed, would be labeled id1 since there are two columns named “id”). We do this with a standard WHERE clause if we add WHERE books.id IS NULL. Now the result is filtered to just this:

id firstname lastname book_id id1 title
4 Jeffrey Snow 0 null null

You now have all customers who do not have books on loan.

RIGHT JOIN with Exclusion

Let’s do the same with a RIGHT JOIN. Let’s find all the books which are NOT loaned out to anybody.

A normal RIGHT JOIN would return every book regardless of whether it is loaned, the result set would look like this:

id firstname lastname book_id id1 title
1 Joe Blow 1 1 Star Wars
2 Jane Doe 2 2 Jurassic Park
3 Harry Crow 2 2 Jurassic Park
null null null null 3 Little Women
null null null null 4 Tom Sawyer

This looks a bit different. First, you might notice that Jurassic Park is listed twice. This is because two people have the book on loan, and the database is returning a row for each match.

Notice all the corresponding columns from the customers table are NULL for Little Women and Tom Sawyer because no one borrowed those titles so there is no overlapping data.

If we want to select all the books that are not loaned out, we just use the WHERE clause to find “NULL” in the customers.id column.

SELECT *
FROM customers
RIGHT JOIN books
ON customers.book_id = books.id
WHERE customers.id IS NULL

The result should be predictable. We get only books that are not loaned out.

id firstname lastname book_id id1 title
null null null null 3 Little Women
null null null null 4 Tom Sawyer

OUTER JOIN with Exclusions

The last JOIN looks like this.

This JOIN is not very useful, but will essentially give you a list of both customers with no loan, and books which are not loaned, at the same time.

A strange JOIN like this may be useful in cases where you literally need to select data with no connection between tables. Perhaps you are hunting for orphan data or looking for inconsistencies in some old database you converted.

In fact, this kind of JOIN is so weird that you can’t even do it in MySQL, it doesn’t support OUTER JOIN. Regular SQL does, and the query would look like this (MSSQL not MySQL):

SELECT *
FROM customers
FULL OUTER JOIN books
ON customers.id = books.id
WHERE customers.id IS NULL
OR books.id IS NULL

The result of this query would return data looking something like this:

id firstname lastname book_id id1 title
1 Jeffrey Snow 0 null null
null null null null 3 Little Women
null null null null 4 Tom Sawyer

You can achieve an OUTER JOIN using the method Craig Buckler explained with a UNION, but that is not without potential issues. The top Google result for simulating a FULL OUTER JOIN in MySQL is from 2006 and can be found here. It can get somewhat complicated.

The main thing to notice in the above code is checking for NULL on both sides of the JOIN, because we want to exclude from both tables. Without checking both sides, we’d simply end up with one of the other JOINs just talked about.

Regardless of how bizarre or supported a query like this one is, I wanted to include it because it is a valid type of JOIN, if you can think of a reason to use it.

Other Thoughts

You should always use your WHERE clause against fields which cannot have NULL as their actual value! We always tested against the ID fields which cannot have NULL as a value. Imagine if our books table had an ISBN field that allowed NULL. If we tested for NULL using that field, it would include rows we may not want!

There is another JOIN called a CROSS JOIN which is also strange and unique. Imagine instead of just matching one user to one book, EVERY user were matched against EVERY book! Yes, this means if you have 20 books and 30 customers, a CROSS JOIN would result in 30*20 rows of data! For an example of how this could be useful, check out this article.

Note that in MySQL, the JOIN, INNER JOIN, and CROSS JOIN are syntactical equivalents and can replace each other. This is because JOIN and INNER JOIN do the same thing and must use the ON keyword to match columns. When using a CROSS JOIN, there is no ON keyword, as it is matching every row in table A to every row in table B.

Conclusion

I hope these extra few JOINs made sense to you. Think about using them any time you are asking for data between tables where something “doesn’t” match the other.

  • “Find all customers who have NOT ordered before”.
  • “Find all customers NOT in the blacklist”.
  • “Find all products that have NOT sold”.
  • “Find all customers NOT borrowing books”.
  • “Find all dogs that HAVEN’T been walked lately”.
  • “Find employees who have NEVER sent in a support ticket”.

In summary, depending on how your database and tables are defined, you may need to use the WHERE clause to check for NULL values to exclude the matches rather than include them as with normal JOIN behavior.

So… have you ever needed a cross join? Any other specific use cases you’d like to tell us about or want to us to cover? Let us know!

Frequently Asked Questions (FAQs) about SQL Joins

What are the different types of SQL Joins and how do they differ from each other?

SQL Joins are used to combine rows from two or more tables, based on a related column between them. There are four basic types of SQL Joins: INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN. INNER JOIN returns records that have matching values in both tables. LEFT JOIN returns all records from the left table, and the matched records from the right table. RIGHT JOIN returns all records from the right table, and the matched records from the left table. FULL JOIN returns all records when there is a match in either left or right table.

How do I use multiple joins in SQL?

Multiple joins can be used in SQL by simply adding more JOIN statements in your SQL query. Each JOIN statement must have its own ON clause that specifies the conditions for the join. The order of the JOIN statements in your query can affect the results, so it’s important to carefully consider the logic of your query when using multiple joins.

What is a three-table join in SQL and how does it work?

A three-table join in SQL is a type of join where three tables are combined based on a related column between them. This is done by using two JOIN statements in your SQL query. The first JOIN statement combines the first two tables, and the second JOIN statement combines the result of the first join with the third table. The order of the JOIN statements and the conditions specified in the ON clauses determine how the tables are combined.

What is the difference between a LEFT JOIN and a RIGHT JOIN in SQL?

The main difference between a LEFT JOIN and a RIGHT JOIN in SQL is the order in which the tables are joined and the results that are returned. A LEFT JOIN returns all the records from the left table and the matched records from the right table. If there is no match, the result is NULL on the right side. On the other hand, a RIGHT JOIN returns all the records from the right table and the matched records from the left table. If there is no match, the result is NULL on the left side.

How can I visualize SQL Joins?

Visualizing SQL Joins can be done using Venn diagrams. Each circle in the diagram represents a table. The area where the circles overlap represents the records that are returned by the join. For an INNER JOIN, only the overlapping area is shown. For a LEFT JOIN, all of the left circle and the overlapping area are shown. For a RIGHT JOIN, all of the right circle and the overlapping area are shown. For a FULL JOIN, the entire area of both circles is shown.

What is a FULL JOIN in SQL and when should I use it?

A FULL JOIN in SQL returns all records when there is a match in either the left or the right table. If there is no match, the result is NULL on either side. This type of join is useful when you want to retain all records from both tables, even if there is no match between the columns being joined.

Can I use SQL Joins with more than two tables?

Yes, you can use SQL Joins with more than two tables. This is done by using multiple JOIN statements in your SQL query. Each JOIN statement combines two tables, so to join three tables, you would use two JOIN statements, to join four tables, you would use three JOIN statements, and so on.

What is the performance impact of using multiple joins in SQL?

Using multiple joins in SQL can have a significant impact on query performance, especially if the tables being joined have a large number of records. Each join operation requires processing power and memory, so the more joins you have in your query, the more resources are required to execute the query. However, the performance impact can be mitigated by using indexes, optimizing your query, and using appropriate hardware.

How can I optimize my SQL queries that use joins?

There are several ways to optimize your SQL queries that use joins. One way is to use indexes on the columns that are being joined. This can significantly speed up the join operation. Another way is to limit the number of records that are returned by the query by using the WHERE clause. Also, the order of the JOIN statements in your query can affect performance, so it’s important to carefully consider the logic of your query.

What are some common mistakes to avoid when using SQL Joins?

Some common mistakes to avoid when using SQL Joins include forgetting to specify the ON clause, which can result in a Cartesian product, joining on non-indexed columns, which can slow down query performance, and using the wrong type of join, which can result in incorrect results. It’s also important to remember that the order of the JOIN statements in your query can affect the results, so it’s important to carefully consider the logic of your query.

Zack WallaceZack Wallace
View Author

Zack Wallace is a hobbyist programmer and private web developer in Northern Arizona and has been in the IT field for over 12 years.

BrunoSdatabasefilterfilteringjoinjoinsmssqlmysqlsql
Share this article
Read Next
Get the freshest news and resources for developers, designers and digital creators in your inbox each week