Select values that occur multiple times

There are many cases when we want to select values that occur multiple times: customers that ordered the same product multiple times, users with the same email, and so on. We can do this with GROUP BY and HAVING.

Problem: Find users with the same password, because they should change it.

HAVING

Finding a password that occurs multiple times is easy: we group rows by password, we select the number of times each password occurs, and we use HAVING to exclude the passwords that only occur one time.

SELECT password_hash, COUNT(*) AS count
    FROM user
    GROUP BY password_hash
    HAVING count > 1;

JOIN with users

But the above example only returns the password hashes. This is not very useful. We also want to see which users use them. To obtain the usernames, we need to join this subquery to the user table.

SELECT u.username
    FROM (
        SELECT
                password_hash,
                COUNT(*) AS count
            FROM user
            GROUP BY password_hash
            HAVING count > 1
    ) AS p
    INNER JOIN user AS u
        ON p.password_hash = u.password_hash;

GROUP_CONCAT()

An alternative is to use GROUP_CONCAT() to obtain a comma-separated list of usernames that use the same password.

SELECT
        password_hash,
        COUNT(*) AS count,
        GROUP_CONCAT(username)
    FROM user
    GROUP BY password_hash
    HAVING count > 1;

This solution has some drawbacks:

  • While it may be a bit more readable for humans (if they’re interested in knowing which users share the same password), it may be a bit less convenient for applications.
  • GROUP_CONCAT() is not supported by all databases, and the syntax may vary. This will be the subject for a future article.