Feature #1: Possible Matches

Description

We are given a set of documents, each submitted by a different individual. We suspect that some individuals may have copied from others. Given a plagiarised submission, we want to count the documents with which it has a potential match. Each document has been converted into a set of tokens based on its content. As mentioned previously, students could have added dummy statements between the copied content to avoid identification, so when matching the tokens of two students we must allow for dummy tokens that do not match. A potential match occurs if one token is a subsequence of the other token. It is ...
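The subsequence test described above can be sketched as follows. This is a minimal illustration, not the lesson's own solution. It assumes each token is a plain string, that dummy content shows up as extra characters in the plagiarised token, and that we only check whether a student's token is a subsequence of the plagiarised one. The names `is_subsequence` and `count_possible_matches` are hypothetical.

```python
from typing import Iterable


def is_subsequence(candidate: str, document: str) -> bool:
    """Return True if `candidate` can be formed from `document` by deleting
    zero or more characters without reordering the rest."""
    remaining = iter(document)
    # `ch in remaining` consumes the iterator up to and including the first
    # occurrence of ch, so later characters must appear after earlier ones.
    return all(ch in remaining for ch in candidate)


def count_possible_matches(plagiarised: str, tokens: Iterable[str]) -> int:
    """Count the tokens that are subsequences of the plagiarised token.

    Assumption: one token per student document (hypothetical input shape).
    """
    return sum(1 for token in tokens if is_subsequence(token, plagiarised))
```

For example, under these assumptions `count_possible_matches("abcde", ["a", "bb", "acd", "ace"])` counts `"a"`, `"acd"`, and `"ace"` as potential matches, while `"bb"` is rejected because `"abcde"` contains only one `b`. Each check takes time linear in the length of the plagiarised token.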