Showing posts with label join. Show all posts
Showing posts with label join. Show all posts

Tuesday, December 06, 2016

Actual Chats: RIGHT OUTER joins in the wild

Colleague:
I think I just encountered my first Right join
*shocked* 
Me:
huh. weird.
a rare Right Join has appeared!
Colleague:
catch it!
Me:
You use ON. It's super effective! 
Colleague:
haha
how often do you see them? 
Me:
I've used RIGHT join maybe twice in 10 years. 
Colleague:
lol
Me:
just makes things difficult to read.
Colleague:
yea it took me a while
it was at the bottom of the query
i kept wondering what was bringing in the extra rows till I found it

And that's why, whenever possible, avoid using RIGHT OUTER. It breaks the nice top-to-bottom flow of your complicated FROM clauses, forces you to page-split your brain around data sets, and generally makes it more difficult for the next DBA to understand.

For this overly-simplistic example, these two queries on the WideWorldImporters database are identical and have the same execution plans:
SELECT  s.CustomerName
, pp.FullName
, ap.FullName 
FROM Sales.Customers AS s
LEFT OUTER JOIN [Application].People AS pp
ON s.PrimaryContactPersonID = pp.PersonID
LEFT OUTER JOIN [Application].People AS ap
ON s.AlternateContactPersonID = ap.PersonID
INNER JOIN Sales.CustomerCategories AS sc
ON s.CustomerCategoryID = sc.CustomerCategoryID 
SELECT  s.CustomerName
, pp.FullName
, ap.FullName
FROM [Application].People AS pp
RIGHT OUTER JOIN Sales.Customers AS s
ON s.PrimaryContactPersonID = pp.PersonID
LEFT OUTER JOIN [Application].People AS ap
ON s.AlternateContactPersonID = ap.PersonID
INNER JOIN Sales.CustomerCategories AS sc
ON s.CustomerCategoryID = sc.CustomerCategoryID

The first query allows you to do the easily relational math as you read. The second has you flipping the dataset before moving your eye back up to the top.

There are certainly legitimate albeit complicated cases for using RIGHT OUTER, but from a readability and maintainability, I'd prefer to stick with LEFT OUTER.

Monday, October 19, 2009

Mini-Lecture: Why I hate "comma joins"

A colleague of mine who is not a DBA asked about comma joins in an email. I replied with the below:

From:
Sent: Monday, October 19, 2009 3:32 PM
To:
Subject: RE: SQL question

Happy to help. Thanks for the hyperbole. :)


There is no functional difference, however, the first method – using a “comma join” is undesirable for a few reasons.

1) Using “comma joins” is old code and only supported at the pleasure of Microsoft. Its considered bad syntax. Its considered lazy. JOINS are the standard for TSQL. It may not be supported in future versions of SQL server.

2) It mixes your JOIN logic with your SELECTION logic. Imagine you added WHERE EMPID = ‘10’ to the sample query. Both functionally and in business logic, the EMPID = ‘10’ and the A.EMPLID = C.EMPLID are two very different statements. Functionally, it will probably work (not 100%), yet it is far more difficult to understand and maintain.

3) It is really difficult to read. Imagine a much larger query that has a bunch of commas in the FROM and then a tangle of statements in the WHERE Clause. Using JOINs and ONs are far easier to read for the user that comes behind you.

4) Screw up your WHERE clause and you’ll end up with a Cartesian product. Row count explosion.


Attached is a lab to demonstrate the various types of joins from a functionality standpoint.



From:
Sent: Monday, October 19, 2009 3:23 PM
To:
Subject: SQL question


Hey man, I have a SQL question for you.

What is the difference between this:

FROM A, C
WHERE A.EMPLID = C.EMPLID

And this:

FROM A
INNER JOIN C on A.EMPLID = C.EMPLID

And the benefit of one over the other?


Hope all is well with the “Grand SQL Wizard”!!!


(Yeah, I'm not sure either if that last bit was sarcasm or hyperbole either. It certainly wasn't genuine or factual. :) )

Here's that handy .sql script I keep around for demonstrating basic JOIN principles and syntax. Note that all of the syntax in this script is valid, but a lot of it is unacceptable and improper - that is sort of the point. This lab usually goes along with a short, stern lecture :) (see above) on why proper JOIN syntax is preferable.


DECLARE @T1 TABLE
(
tkey int
)

DECLARE @T2 TABLE
(
tkey int
)

INSERT INTO @t1 VALUES (1)
INSERT INTO @t1 VALUES (2)
INSERT INTO @t1 VALUES (3)
INSERT INTO @t2 VALUES (3)
INSERT INTO @t2 VALUES (4)
INSERT INTO @t2 VALUES (5)

SELECT * FROM @T1 T1 INNER JOIN @T2 T2 ON T1.tkey = T2.tkey
SELECT * FROM @T1 T1 JOIN @T2 T2 ON T1.tkey = T2.tkey
SELECT * FROM @T1 T1 LEFT OUTER JOIN @T2 T2 ON T1.tkey = T2.tkey
SELECT * FROM @T1 T1 RIGHT OUTER JOIN @T2 T2 ON T1.tkey = T2.tkey
SELECT * FROM @T1 T1 CROSS JOIN @T2 T2
SELECT * FROM @T1 T1 FULL OUTER JOIN @T2 T2 ON T1.tkey = T2.tkey
SELECT * FROM @T1 T1 , @T2 T2
SELECT * FROM @T1 T1 , @T2 T2 WHERE T1.tkey = T2.tkey


"Programmers can be lazy." -Larry Wall