© 1990 by British Computer Society
Identification of Program Similarity in Large Populations
Department of Computer Science, University of New South Wales, PO Box 1, Kensington, NSW, Australia 2033
Various techniques for detecting similar programs in large classes have been proposed previously, but research in this area is hampered by the lack of a means for evaluating their performance. To address this deficiency, new concepts are introduced that permit the effectiveness of competing systems to be quantified and enable realistic comparisons to be made. Using these criteria, popular approaches to plagiarism detection based on counting program attributes are shown to be inadequate. A two-stage method of identifying similar pairs based on structural features is proposed, and the superior performance of this technique is established.
Received October 1988.