Not the prettiest headline so far. I do admit that. Nevertheless it is actually quite relevant. Here is the question to be answered: If I have multiple exists joins in my select statement and one of them is a not exists how can I be certain that I will get the expected result set back from the SQL server.
Let us start off by lining up 3 tables:
The structure should be fairly obvious. Trans has 2 fields that refers to the sub tables. What I would like is to write a select statement that gives me all records from Trans where the GroupId isn’t present in GroupTable and where the SubGroupId is found in SubGroupTable.
This could be done like this:
select trans exists join subGroupTable.SubGroupId where subGroupTable.SubGroupId == trans.SubGroupId not exists join grouptable where grouptable.GroupId == trans.GroupId
Another way could be by turning it a bit upside down like this:
select trans not exists join grouptable where grouptable.GroupId == trans.GroupId exists join subGroupTable.SubGroupId where subGroupTable.SubGroupId == trans.SubGroupId
What happens when we look at what the SQL server receives as the TSQL-statement is a bit different than expected:
select GroupId, SubGroupId from Trans as t where EXISTS (SELECT 'x' FROM GroupTable as g where g.GroupId = t.GroupId and not (exists (select 'x' from SubGroupTable as s where s.SubGroupId = t.SubGroupId)))
select GroupId, SubGroupId from Trans as t where not (EXISTS (SELECT 'x' FROM SubGroupTable as s where s.SubGroupId = t.SubGroupId and (exists (select 'x' from GroupTable as g where g.GroupId = t.GroupId))))
What to notice is that in the first statement the EXISTS and NOT EXISTS are considered as one and in the second statement we now have 2 EXISTS that is encapsulated by a NOT giving it a bit of a twist compared to what we might have expected.
The result set for the first statement is
A is existing in SubGroupTable which was ruled out by our NOT EXISTS
B is in SubGroupTable and not in GroupTable which both goes against our EXISTS/NOT EXISTS
C is in SubGroupTable A was
D is present in GroupTable and not in SubGroupTable
and for the second the result set is
A is in both GroupTable and SubGroupTable and we specified by having two EXISTS’s and adding these to a NOT that this was a no go.
B is returned although being in SubGroupTable it was not in GroupTable making the double EXISTS false and thereby pleasing the NOT
C is not present in neither GroupTable nor SubGroupTable and therefore matching the NOT
D is – like B – returned since it only exists in GroupTable and not i SubGroupTable and the EXISTS’s then returning false
I realise that this can be a bit tricky to get the head around but nevertheless very important to keep in mind when doing those large selects with lots of tables and EXISTS/NOT EXISTS joins.