There are many cases when we want to select values that occur multiple times: customers that ordered the same product multiple times, users with the same email, and so on. We can do this with GROUP BY
and HAVING
.
Problem: Find users with the same password, because they should change it.
HAVING
Finding a password that occurs multiple times is easy: we group rows by password, we select the number of times each password occurs, and we use HAVING
to exclude the passwords that only occur one time.
SELECT password_hash, COUNT(*) AS count
FROM user
GROUP BY password_hash
HAVING count > 1;
JOIN with users
But the above example only returns the password hashes. This is not very useful. We also want to see which users use them. To obtain the usernames, we need to join this subquery to the user table.
SELECT u.username
FROM (
SELECT
password_hash,
COUNT(*) AS count
FROM user
GROUP BY password_hash
HAVING count > 1
) AS p
INNER JOIN user AS u
ON p.password_hash = u.password_hash;
GROUP_CONCAT()
An alternative is to use GROUP_CONCAT()
to obtain a comma-separated list of usernames that use the same password.
SELECT
password_hash,
COUNT(*) AS count,
GROUP_CONCAT(username)
FROM user
GROUP BY password_hash
HAVING count > 1;
This solution has some drawbacks:
- While it may be a bit more readable for humans (if they’re interested in knowing which users share the same password), it may be a bit less convenient for applications.
GROUP_CONCAT()
is not supported by all databases, and the syntax may vary. This will be the subject for a future article.