Plagiarism a la Mode: A Comparison of Automated Systems for Detecting Suspected Plagiarism
- 1 January 1996
- journal article
- Published by Oxford University Press (OUP) in The Computer Journal
- Vol. 39 (9) , 741-750
- https://doi.org/10.1093/comjnl/39.9.741
Abstract
Early automated systems for detecting plagiarism in student programs employed attribute counting techniques in their comparisons of program texts, while more recent systems use encoded structural information. Whale claims that the latter are more effective in their detection of plagiarisms than systems based on attribute counting. To explore the validity of these claims, a comparison is presented of five systems, two based on attribute counting and three using metrics based on structure. The major result of this study is that the systems based on structural information consistently equal or better the performance of systems based on attribute counting metrics. A second conclusion is that of the structure metric systems, one using approximate tokenization of input texts (YAP) is as effective as a system that undertakes a complete parse (Plague). Approximate tokenization offers a considerable reduction in the costs of porting to new languages. A distinction is also made between forms of plagiarism common among novice programmers and those employed by more experienced programmers.