Getting a list of unique values from a column is easy. But if the same type of values are written on multiple columns and we want to eliminate duplicates, this requires a little trick.
Example: Get all unique email addresses from the columns email1 and email2.
Getting distinct values from a single column
In a perfect world, all emails would be stored in one column. Getting an ordered list of unique emails would be simple:
SELECT DISTINCT email FROM user_contact ORDER BY email;
Getting distinct values from a UNION
But in the real world, production tables often have columns email and email2. This happens because initially only one email is allowed, but at some point a second email needs be added, and developers try to do it with a minimal change to the current database design.
The query
To get ordered unique emails from multiple columns, we need to do this:
(SELECT email FROM user)
UNION DISTINCT
(SELECT email2 AS email FROM user)
ORDER BY email;
How UNION works
The query works in this way:
- Lists of the emails from the two columns are retrieved by the two subqueries;
UNION DISTINCT
concatenates the two returned columns, forming a single resultset that consists of one column; at the same time it removes duplicate emails;ORDER BY
defined the order.