SQL: How to find the diff between two record sets
In this article, I will show you how to subtract one result set from another. The database I will use is SQLite which supports the EXCEPT…
In this article, I will show you how to subtract one result set from another. The database I will use is SQLite which supports the EXCEPT operator for this task, in other databases like Oracle the operation to achieve the same task is the MINUS operator which does not differ in usage from the EXCEPT, Let's start!
Assume that we have the table employees with the following data
sqlite> select * from employees;
Name Age Year
------- --- ----
Alice 28 2024
Bob 34 2024
Charlie 22 2024
David 45 2024
Eve 29 2024
Alice 28 2023
Bob 34 2023
Charlie 22 2023
David 45 2023
Jones 37 2023Our task is to create a set of employees for the year 2024 and one for the year 2023 and find the different records of each set.
Queries
Running the following query generates one result, which is correct the only difference of the set with the year 2024 with columns Name,Age vs the one of the year 2023 is The record with the Name “Eve”
sqlite> select Name,Age from employees where Year=2024
...> except
...> select Name,Age from employees where Year=2023;
Name Age
---- ---
Eve 29To find the opposite, the difference of the set with the year 2023 vs the set with the year 2024 you swap the order of select statements
sqlite> select Name,Age from employees where Year=2023
...> except
...> select Name,Age from employees where Year=2024;
Name Age
----- ---
Jones 37Wasn't that difficult right? :) Now let's see a trick! assume that they asked for a combined report that will contain the diff between Year 2024 — Year 2023 and Year 2023 — Year 2024, sound complicated? well, it is not! there is a cool command called UNION that can help us achieve the task
sqlite> SELECT * FROM (SELECT Name, Age FROM employees WHERE Year=2023
(x1...> EXCEPT
(x1...> SELECT Name, Age FROM employees WHERE Year=2024)
...> UNION
...> SELECT * FROM (SELECT Name, Age FROM employees WHERE Year=2024
(x1...> EXCEPT
(x1...> SELECT Name, Age FROM employees WHERE Year=2023);
Name Age
----- ---
Eve 29
Jones 37You might wonder, why we need to explicitly SELECT each set. The reason is that if we don't do this the UNION command gets confused in the order of execution.
Conclusion
The EXCEPT operator is a powerful operation that can be used in many cases, one of my favorites is validating two reports that supposedly should be the same, one thing to note is that the columns between the two sets must have the same number of columns and their corresponding data types be compatible. I hope you found this article useful! :)