There are many cases when we want to select values that occur multiple times: customers that ordered the same product multiple times, users with the same email, and so on. We can do this with
GROUP BY and
Problem: Find users with the same password, because they should change it.
Finding a password that occurs multiple times is easy: we group rows by password, we select the number of times each password occurs, and we use
HAVING to exclude the passwords that only occur one time.
SELECT password_hash, COUNT(*) AS count FROM user GROUP BY password_hash HAVING count > 1;
JOIN with users
But the above example only returns the password hashes. This is not very useful. We also want to see which users use them. To obtain the usernames, we need to join this subquery to the user table.
SELECT u.username FROM ( SELECT password_hash, COUNT(*) AS count FROM user GROUP BY password_hash HAVING count > 1 ) AS p INNER JOIN user AS u ON p.password_hash = u.password_hash;
An alternative is to use
GROUP_CONCAT() to obtain a comma-separated list of usernames that use the same password.
SELECT password_hash, COUNT(*) AS count, GROUP_CONCAT(username) FROM user GROUP BY password_hash HAVING count > 1;
This solution has some drawbacks:
- While it may be a bit more readable for humans (if they’re interested in knowing which users share the same password), it may be a bit less convenient for applications.
GROUP_CONCAT()is not supported by all databases, and the syntax may vary. This will be the subject for a future article.