Because they bring together information from two tables, joins are perhaps the
most powerful feature of SQL. Database engines can have dozens of algorithms
just for this one key word. A lot of programming and algorithms are hidden
beneath this simple construct.
As with anything powerful, joins need to be used carefully—not sparingly, but
carefully. It is very easy to make mistakes using joins, especially the following two:
“Mistakenly” losing rows in the result set, and
“Mistakenly” adding unexpected additional rows.
Whenever joining tables, it is worth asking whether either of these could be
happening. These are subtle questions because the answer depends on the data
being processed, not on the syntax of the expression itself. There are examples of
both problems throughout the book.
This discussion is about what joins do rather than about the multitude of
algorithms for implementing them (although the algorithms are quite interesting
—to some people—they donot help us understand customers and data). The most
general type of join is the cross-join. The discussion then explains the more
common variants: look up joins, equijoins, nonequijoins, and outer joins.
Whenever joining two tables, ask yourself the following two questions:
1. Could one of the tables accidentally be losing rows because there are no
matches in the other table?
2. Could the result set unexpectedly have duplicate rows due to multiple
matches between the tables?
The answers require understanding the underlying data.