MurmurHash3_x86_32() expects a seed parameter. What value should I use and what does it do?
2 Answers
The seed parameter is a means for you to randomize the hash function. Use the same seed value for every call that belongs to the same use of the hash function (for example, every insert and lookup on one hash table). However, each run of your application (assuming it builds a new hash table) can use a different seed, e.g., a random value.
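As a rough sketch of that pattern in C: the seed is drawn once at startup and then reused for every call. The helper names `init_hash_seed` and `bucket_of` and the global `g_seed` are made up for this illustration, and reading `/dev/urandom` assumes a Unix-like system. The hash itself is a compact re-implementation with the same signature as the reference function, included only so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

static uint32_t rotl32(uint32_t x, int r) { return (x << r) | (x >> (32 - r)); }

/* Compact re-implementation for illustration; in a real project you would
 * link the reference MurmurHash3 source instead. */
static void MurmurHash3_x86_32(const void *key, int len, uint32_t seed, void *out)
{
    const uint8_t *data = (const uint8_t *)key;
    const int nblocks = len / 4;
    const uint32_t c1 = 0xcc9e2d51, c2 = 0x1b873593;
    uint32_t h1 = seed;                 /* the seed is the initial hash state */

    for (int i = 0; i < nblocks; i++) {
        uint32_t k1;
        memcpy(&k1, data + 4 * i, 4);   /* native byte order, like the reference */
        k1 *= c1; k1 = rotl32(k1, 15); k1 *= c2;
        h1 ^= k1; h1 = rotl32(h1, 13); h1 = h1 * 5 + 0xe6546b64;
    }

    const uint8_t *tail = data + 4 * nblocks;
    uint32_t k1 = 0;
    switch (len & 3) {
    case 3: k1 ^= (uint32_t)tail[2] << 16; /* fall through */
    case 2: k1 ^= (uint32_t)tail[1] << 8;  /* fall through */
    case 1: k1 ^= tail[0];
            k1 *= c1; k1 = rotl32(k1, 15); k1 *= c2; h1 ^= k1;
    }

    h1 ^= (uint32_t)len;                /* finalization mix */
    h1 ^= h1 >> 16; h1 *= 0x85ebca6b;
    h1 ^= h1 >> 13; h1 *= 0xc2b2ae35;
    h1 ^= h1 >> 16;
    memcpy(out, &h1, 4);
}

/* Hypothetical: one seed per process, chosen once at startup. */
static uint32_t g_seed;

static void init_hash_seed(void)
{
    FILE *f = fopen("/dev/urandom", "rb");   /* assumption: Unix-like OS */
    if (f == NULL || fread(&g_seed, sizeof g_seed, 1, f) != 1)
        g_seed = (uint32_t)time(NULL) ^ (uint32_t)clock();  /* weak fallback */
    if (f != NULL)
        fclose(f);
}

/* Hypothetical: every insert and lookup maps keys with the same seed. */
static uint32_t bucket_of(const char *key, uint32_t nbuckets)
{
    uint32_t h;
    assert(nbuckets > 0);
    MurmurHash3_x86_32(key, (int)strlen(key), g_seed, &h);
    return h % nbuckets;
}
```

Call `init_hash_seed()` once before the table is created. After that, `bucket_of("some key", 16)` gives the same bucket for the rest of the run, but usually a different one in the next run.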
Why is it provided?
One reason is that attackers may use the properties of a hash function to construct a denial-of-service attack. They could do this by feeding your hash function strings that all hash to the same value, which destroys the performance of your hash table. But if you use a different seed for each run of your program, the set of strings the attackers must use changes.
See: Effective DoS on web application platform
There's also a Twitter tag for #hashDoS
-
This is related to, but not exactly equivalent to, the idea of universal hashing: rather than having one hash function, you have an entire family (in this case, MurmurHash3 is the family, with each possible seed value giving you a particular function within that family). If you find that your input data happens to produce badly distributed hashes (e.g., because of an attack), you can choose a new random seed value and rehash the data; the data is unlikely to produce a bad distribution for your new seed value, so you beat the attack. Feb 11, 2012 at 15:42
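A minimal sketch of the "pick a new seed and rehash" idea from this comment, in C. The helpers `max_bucket_load` and `choose_seed` are hypothetical names for this illustration, `rand()` is only a stand-in for a proper random source, and the hash is a compact re-implementation with the reference signature so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static uint32_t rotl32(uint32_t x, int r) { return (x << r) | (x >> (32 - r)); }

/* Compact re-implementation for illustration only. */
static void MurmurHash3_x86_32(const void *key, int len, uint32_t seed, void *out)
{
    const uint8_t *data = (const uint8_t *)key;
    const int nblocks = len / 4;
    const uint32_t c1 = 0xcc9e2d51, c2 = 0x1b873593;
    uint32_t h1 = seed;

    for (int i = 0; i < nblocks; i++) {
        uint32_t k1;
        memcpy(&k1, data + 4 * i, 4);
        k1 *= c1; k1 = rotl32(k1, 15); k1 *= c2;
        h1 ^= k1; h1 = rotl32(h1, 13); h1 = h1 * 5 + 0xe6546b64;
    }

    const uint8_t *tail = data + 4 * nblocks;
    uint32_t k1 = 0;
    switch (len & 3) {
    case 3: k1 ^= (uint32_t)tail[2] << 16; /* fall through */
    case 2: k1 ^= (uint32_t)tail[1] << 8;  /* fall through */
    case 1: k1 ^= tail[0];
            k1 *= c1; k1 = rotl32(k1, 15); k1 *= c2; h1 ^= k1;
    }

    h1 ^= (uint32_t)len;
    h1 ^= h1 >> 16; h1 *= 0x85ebca6b;
    h1 ^= h1 >> 13; h1 *= 0xc2b2ae35;
    h1 ^= h1 >> 16;
    memcpy(out, &h1, 4);
}

/* Hypothetical: size of the fullest bucket when `keys` are hashed with `seed`. */
static size_t max_bucket_load(const char **keys, size_t n, size_t nbuckets, uint32_t seed)
{
    assert(nbuckets > 0);
    size_t *load = calloc(nbuckets, sizeof *load);
    assert(load != NULL);
    size_t worst = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t h;
        MurmurHash3_x86_32(keys[i], (int)strlen(keys[i]), seed, &h);
        size_t b = h % nbuckets;
        if (++load[b] > worst)
            worst = load[b];
    }
    free(load);
    return worst;
}

/* Hypothetical: try candidate seeds until no bucket holds more than `limit`
 * keys. Returns 1 and stores the seed on success, 0 if every try failed. */
static int choose_seed(const char **keys, size_t n, size_t nbuckets,
                       size_t limit, int tries, uint32_t *seed_out)
{
    for (int t = 0; t < tries; t++) {
        uint32_t seed = (uint32_t)rand();  /* assumption: demo-quality randomness */
        if (max_bucket_load(keys, n, nbuckets, seed) <= limit) {
            *seed_out = seed;
            return 1;
        }
    }
    return 0;
}
```

A table would call `choose_seed` when it notices an overlong chain, then reinsert all keys using the returned seed. The cost of that full pass is what the reply below is pointing at.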
-
@JohnBartholomew If the hash is used in a really large database, re-hashing everything may not be practical either. Jan 27, 2020 at 4:17
The value named seed here stands for a salt. Provide any random data that is private to your app, so that the hash function gives different results for the same data. This feature is used, for example, to make a digest of your data in order to detect modification of the original data by third parties. They can hardly reproduce the valid hash value unless they know the salt you used.
A salt (or seed) is also used to prevent hash collisions between different data. For example, your data blocks A and B might produce the same hash: h(A) == h(B). But you can avoid this collision if you provide some additional data. Collisions are quite rare, but sometimes a salt is a way to avoid them for a concrete set of data.
-
Actually, that is doubtful. What is the purpose of a salt for a non-cryptographic hash function? – Lol4t0 Feb 11, 2012 at 15:33
MurmurHash is a non-cryptographic hash function. It is not an appropriate choice for a secure message digest. Feb 11, 2012 at 15:34
-
'What is the purpose of a salt for a non-cryptographic hash function?' It's to prevent collision attacks against e.g. hash tables, which typically utilise fast, non-cryptographic hash functions. For real-world examples of why this can matter, look up 'hash flooding' or 'HashDoS'. Jun 23, 2023 at 4:29