© 1986 by British Computer Society
Use of Mean Distance Between Overflow Records to Compute Average Search Lengths in Hash Files with Open Addressing
Computer Science Department, University of Calgary, 250 University Drive, N.W. Calgary, Alberta, Canada, T2N 1N4
Average search lengths for hash files with open addressing have been computed using the well-known Poisson distribution for the number of addresses assigned x records, together with a new expression for the mean distance between records overflowing from a common home address. The method first computes the number of disk accesses required to retrieve at random the y records overflowing from any home address, using the mean distance between overflow records on the disk. The Poisson distribution is then used to obtain the total disk accesses required to retrieve all records in the file, from which the average search length (total accesses divided by total records) follows. The average search length values obtained agree closely with experimental results. Because it also dispenses with the complex mathematics of existing methods, the new method can be recommended for use in practical design situations. As a by-product, the method also predicts the mean distances between overflow records for different loading factors and address capacities.
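The bookkeeping described above can be illustrated with a minimal Python sketch. It assumes that the number of records hashed to a home address follows a Poisson distribution with mean equal to loading factor times address capacity, and that every record stored at its home address costs one disk access. The function and parameter names are illustrative only. In particular, `overflow_cost` is a hypothetical placeholder for the accesses needed to retrieve y overflow records; the expression for this quantity based on the mean distance between overflow records is not reproduced here.

```python
import math


def poisson_pmf(x, mean):
    """Probability that a home address is assigned exactly x records."""
    # Log form avoids overflow in mean**x and x! for large x (assumes mean > 0).
    return math.exp(-mean + x * math.log(mean) - math.lgamma(x + 1))


def expected_overflow_per_address(load_factor, capacity, x_max=200):
    """Expected number of records overflowing from one home address."""
    mean = load_factor * capacity
    return sum((x - capacity) * poisson_pmf(x, mean)
               for x in range(capacity + 1, x_max + 1))


def average_search_length(load_factor, capacity, overflow_cost, x_max=200):
    """Total expected accesses per address divided by expected records per address.

    overflow_cost(y) is a caller-supplied (hypothetical) function giving the
    total accesses needed to retrieve each of the y overflow records once.
    """
    mean = load_factor * capacity
    total_accesses = 0.0
    for x in range(x_max + 1):
        p = poisson_pmf(x, mean)
        at_home = min(x, capacity)      # records that fit in the home address
        y = x - at_home                 # records that overflow
        total_accesses += p * (at_home * 1 + overflow_cost(y))
    return total_accesses / mean
```

As a sanity check, if each overflow record were also found in a single access (`overflow_cost = lambda y: y`), the sketch returns an average search length of 1; any realistic cost model will give a larger value that grows with the loading factor.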
Received February 1985.