Hask Tables what is the difference in each case If your has
Hask Tables, what is the difference in each case ?
If your hash table size, n, was equal to 20 (or 40, or 2000), why might the following hash function not be a good idea: h(k) = 4k mod n?Solution
A hash table is a data structure which can help to implement an associative array i.e it can map keys and values.It uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.
A critical statistic for a hash table is the load factor, defined as
load factor = n/k
where
As the load factor grows larger, the hash table becomes slower, and it may even fail to work (depending on the method used). The expected constant time property of a hash table assumes that the load factor is kept below some bound. For a fixed number of buckets, the time for a lookup grows with the number of entries and therefore the desired constant time is not achieved.
To answer the second parth as to why h(k) = 4k mod n is not a good idea we need to understand modular hashing.
With modular hashing, the hash function is simply h(k) = k mod n for some n (usually, the number of buckets). The value k is an integer hash code generated from the key. If n is a power of two (i.e., n=2p), then h(k) is just the p lowest-order bits of k.
Now we have, h(k) = 4k mod n which is an example of multiplicative hashing. This type of hashing is often misused.
In this, the hash index is computed as n * frac(ka). Here k is again an integer hash code, a is a real number and frac is the function that returns the fractional part of a real number. Multiplicative hashing sets the hash index from the fractional part of multiplying k by a large real number.
Multiplicative hashing works well for the same reason that linear congruential multipliers generate apparently random numbers—it\'s like generating a pseudo-random number with the hashcode as the seed. The multiplier a should be large and its binary representation should be a \"random\" mix of 1\'s and 0\'s. Multiplicative hashing is cheaper than modular hashing because multiplication is usually considerably faster than division (or mod). It also works well with a bucket array of sizem=2p, which is convenient.
In the fixed-point version, The division by 2q is crucial. The common mistake when doing multiplicative hashing is to forget to do it, and in fact you can find web pages highly ranked by Google that explain multiplicative hashing without this step.
