Not the prettiest headline so far. I do admit that. Nevertheless it is actually quite relevant. Here is the question to be answered: If I have multiple exists joins in my select statement and one of them is a not exists how can I be certain that I will get the expected result set back from the SQL server.
Let us start off by lining up 3 tables:
The structure should be fairly obvious. Trans has 2 fields that refers to the sub tables. What I would like is to write a select statement that gives me all records from Trans where the GroupId is present in GroupTable and where the SubGroupId is not to be found in SubGroupTable.
This could be done like this:
select trans exists join subGroupTable.SubGroupId where subGroupTable.SubGroupId == trans.SubGroupId not exists join grouptable where grouptable.GroupId == trans.GroupId
Another way could be by turning it a bit upside down like this:
select trans not exists join grouptable where grouptable.GroupId == trans.GroupId exists join subGroupTable.SubGroupId where subGroupTable.SubGroupId == trans.SubGroupId
What happens when we look at what the SQL server receives as the TSQL-statement is a bit different than expected:
select GroupId, SubGroupId from Trans as t where EXISTS (SELECT 'x' FROM GroupTable as g where g.GroupId = t.GroupId and not (exists (select 'x' from SubGroupTable as s where s.SubGroupId = t.SubGroupId)))
select GroupId, SubGroupId from Trans as t where not (EXISTS (SELECT 'x' FROM SubGroupTable as s where s.SubGroupId = t.SubGroupId and (exists (select 'x' from GroupTable as g where g.GroupId = t.GroupId))))
What to notice is that in the first statement the EXISTS and NOT EXISTS are considered as one and in the second statement we now have 2 EXISTS that is encapsulated by a NOT giving it a bit of a twist compared to what we might have expected.
The result set for the first statement is
A is existing in SubGroupTable which was ruled out by our NOT EXISTS
B is in SubGroupTable and not in GroupTable which both goes against our EXISTS/NOT EXISTS
C is in SubGroupTable A was
D is present in GroupTable and not in SubGroupTable
and for the second the result set is
A is in both GroupTable and SubGroupTable and we specified by having two EXISTS’s and adding these to a NOT that this was a no go.
B is returned although being in SubGroupTable it was not in GroupTable making the double EXISTS false and thereby pleasing the NOT
C is not present in neither GroupTable nor SubGroupTable and therefore matching the NOT
D is – like B – returned since it only exists in GroupTable and not i SubGroupTable and the EXISTS’s then returning false
I realise that this can be a bit tricky to get the head around but nevertheless very important to keep in mind when doing those large selects with lots of tables and EXISTS/NOT EXISTS joins.