PlaM-DeP: a modular platform for the development and evaluation of academic plagiarism detection algorithms

Hernán Fajardo Heras
Manuel Barrera Maura
Vladimir Robles Bykbaev
Cristian Timbi Sisalima
Eduardo Calle Ortiz

Abstract

In this paper we present a software platform model for developing and evaluating plagiarism detection algorithms. The platform is based on a scalable modular design and implements several services that automatically perform the following tasks: syntactic and semantic analysis through WordNet and Freeling; automatic text extraction from multiple file formats (PDF, Word, and text); web page content extraction (using search engines such as Google, Yandex, Yahoo, and Bing); and storage, loading, and use of plagiarism detection algorithms. These services allow a programmer to concentrate on the design of the algorithm and its mathematical/statistical basis. The platform was tested using several text queries (n-grams), and the current performance results are promising.
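
To illustrate what an n-gram text query can look like, the following is a minimal sketch in Python. It is not the platform's implementation: the function names `word_ngrams` and `jaccard_overlap`, the whitespace tokenization, and the choice of Jaccard overlap as a similarity score are all assumptions made for illustration only.

```python
from typing import List, Tuple


def word_ngrams(text: str, n: int = 3) -> List[Tuple[str, ...]]:
    """Split text on whitespace (an assumed, simplistic tokenizer) and
    return every run of n consecutive lowercase tokens."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def jaccard_overlap(a: str, b: str, n: int = 3) -> float:
    """Hypothetical similarity score: shared n-grams divided by all
    distinct n-grams of both texts. Returns 0.0 when either text is
    too short to yield any n-gram."""
    grams_a = set(word_ngrams(a, n))
    grams_b = set(word_ngrams(b, n))
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    suspicious = "the quick brown fox"
    source = "the quick brown dog"
    print(word_ngrams(suspicious, 3))
    print(jaccard_overlap(suspicious, source, 3))
```

In a setting like the one described, each n-gram could also serve as a search-engine query, with the retrieved page text compared against the suspicious document; that pairing is likewise an assumption here, not a detail stated in the abstract.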