SQL Server has at least three functions to create checksums or hashes:
HASHBYTES(). Here we discuss what differences exist between them, and how to choose the proper function for a specific use case.
Let’s explain the functions briefly.
Returns a checksum of a value, a list of values, or a whole row.
This function is recommended by the SQL Server documentation to create hash indexes. The following example results in a hash index on all the table rows:
ALTER TABLE user ADD row_checksum AS CHECKSUM(*); GO CREATE INDEX idx_row_checksum ON user (row_checksum); GO
Two lists of values return the same result if their types, values are the same, and they are specified in the same order. Otherwise, they will usually return different values. Similarly, changing a value will usually change its checksum.
This function does not accept these types:
NULLs are considered equal when they are of the same type.
Case-sensitivity of the strings depends on server configuration.
The dash character is ignored for
CHECKSUM(), but it is more precise. As a consequence, it is not as fast but when applied to different values it is more likely to return different outputs.
Documented reasons why the collisions are less likely include:
BINARY_CHECKSUM()is always case-sensitive;
CHECKSUM()ignores dashes, whereas
Also, it accepts
HASHBYTES() is more precise and less fast then BINARY_CHECKSUM().
It returns a hash using the specified algorithm (the argument is mandatory). Supported algorithms are:
It works with all data types.
NULL in the same way as
SELECT HASHBYTES('SHA2_512', secret) FROM dbo.user WHERE id = 24;
Choosing a function
Generic rules of thumb:
- Some data types restrict your choice.
- If you need to generate hashes that can be compared with hashes generated outside of SQL Server, use
- In other cases, it’s generally fine to start with
- If case-sensitivity and dashes matter for your specific case, or if the collisions are too many, switch to
- If the collisions are still too many, switch to
HASHBYTES()algorithms provide different collision probability and different performance, but this is beyond the scope of this article.
SQL Server documentation: