Monday, October 12, 2009

Hash Function

The idea of using the key from a large set K of keys to determine the address/location of a record into a smaller set L of table addresses leads to the form of a hash function H. A function that performs this job, such as

H (K) = L

is called a hash function or hashing function. Another way to describe hashing is as a key-to-address transformation in which the keys map to addresses in a list. This mapping transformation is shown in Figure 2. At the top of Figure 2 is a general representation of the hashing concept. The rest of the figure shows how three keys might hash to three different addresses in the list. There are various methods or functions used to map key to addresses, namely Modulo-Division Method, Midsquare Method, and Folding Method etc.

There are few criteria and purposes for hashing of keys. One is that hashing function should be very easy, simple and quick to compute. Second is to map a large possible set of keys to a smaller set of addresses in table. Usually the population of possible keys is much larger than the allocated table size. Thirdly we would like hashed addresses be distributed evenly on the smaller set so that there are a minimum number of collisions. Collision elimination is not guaranteed. Uneven distribution will cause clustering which is not desirable. Fourthly hashing allow direct access to a specific location in the table where the key of the record has been hashed. Its running time efficiency is O(1).

A hash table is a list of records in which keys are mapped by a hash function. It is similar to table, T, described in section 1.0, except here table entries T (K, I) are generated by a appropriate hash functions. A hash table generated by a perfect hash has no collisions. In this case usually all possible keys must be known before hand. This is also known as optimal hashing or perfect hashing.

No comments:

Post a Comment