The Computer Journal 2005 48(2):168-179; doi:10.1093/comjnl/bxh074
© The Author 2005. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org

Perfect Hashing Schemes for Mining Association Rules

Chin-Chen Chang * and Chih-Yang Lin

Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan 621, Republic of China

Hashing schemes are widely used to improve the performance of association rule mining, as in the DHP algorithm, which uses a hash table to judge the validity of candidate itemsets according to the number of accesses to each of the table's buckets. However, because the hash table used in DHP suffers from collisions, generating the large itemsets at each level requires two database scans, which leads to poor performance. In this paper we propose perfect hashing schemes to avoid collisions in the hash table. The main idea is to employ a refined encoding scheme, which transforms large itemsets into large 2-itemsets and thereby makes the application of perfect hashing feasible. Our experimental results demonstrate that the new method is efficient (about three times faster than DHP) and scalable as the database size increases. We also propose a variant of the perfect hashing scheme with reduced memory requirements. The properties and performance of several perfect hashing schemes are also investigated and compared.
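
To illustrate why restricting attention to 2-itemsets makes collision-free hashing feasible, the following is a minimal sketch, not the authors' encoding scheme. It assumes items are already numbered 0 to n-1 and uses the standard triangular indexing of unordered pairs as the perfect hash; the names `pair_index`, `count_pairs` and `large_2_itemsets` are hypothetical. Because every pair owns its own bucket, each bucket count is an exact support count, so a single database scan suffices for this level.

```python
from itertools import combinations


def pair_index(i, j, n):
    """Map a pair 0 <= i < j < n to a unique slot in range(n*(n-1)//2).

    This is the usual triangular numbering of unordered pairs, used here
    as an assumed stand-in for a perfect hash function on 2-itemsets.
    """
    if not 0 <= i < j < n:
        raise ValueError("expected 0 <= i < j < n")
    # Pairs whose smaller item is k < i occupy (n - 1 - k) slots each.
    return i * (2 * n - i - 1) // 2 + (j - i - 1)


def count_pairs(transactions, n):
    """Scan the database once and return exact support counts per bucket."""
    counts = [0] * (n * (n - 1) // 2)
    for transaction in transactions:
        # Deduplicate and sort so that each pair is generated with i < j.
        for i, j in combinations(sorted(set(transaction)), 2):
            counts[pair_index(i, j, n)] += 1
    return counts


def large_2_itemsets(transactions, n, min_support):
    """Return {(i, j): support} for pairs meeting the minimum support."""
    counts = count_pairs(transactions, n)
    result = {}
    for i, j in combinations(range(n), 2):
        support = counts[pair_index(i, j, n)]
        if support >= min_support:
            result[(i, j)] = support
    return result
```

The table in this sketch has n(n-1)/2 buckets, which grows quadratically with the number of items; that cost is the kind of memory pressure a reduced-memory variant would need to address.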


Received 3 December 2003; revised 1 November 2004.

* Email: ccc{at}cs.ccu.edu.tw

