

AA with Tf-Idf weight using ML

Urmila Mahor & Aarti Kumar


Abstract

Authorship attribution is the process of automatically identifying a document's author by analysing their writing style. It has a long history and many applications. As the volume of online work increases, it plays a vital role in forensic science, plagiarism detection, disputes between authors, and research. It is particularly helpful when two people contest ownership of the same material. Authorship attribution is a classification problem. Its goal differs from that of text categorization, however: although it may use text categorization techniques for text pre-processing, it considers only the author's individual writing style. The success of the attribution task therefore depends heavily on the features chosen to characterize that style. To capture a writer's style, researchers have proposed several kinds of features, including lexical, syntactic, semantic, and content-based features, and several of these have been applied to categorize articles. In this study, we used support vector machines, parametric and nonparametric techniques, supervised and unsupervised techniques, and TF-IDF weighting with function words (FWs) and stylometric features. We ran a variety of experiments on English corpora, testing various feature sets extracted from the corpus with various classifiers, and then improved our success rate by combining the results. For the feature sets we evaluated, we identified the classifiers that produce reliable results. Experimentally, success rates change dramatically when feature sets are combined. However, the models that are tested by support vector classifiers (SVC) with BoWs, FWs, and Gaussian.

 

Index Terms- Stylometric Features, TW, BoWs, FWs, classification, Authorship Attribution.
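
As an illustration of the TF-IDF weighting over function words mentioned in the abstract, the following is a minimal Python sketch. It is not the authors' implementation, and the abstract does not state the exact formula used. The sketch assumes relative term frequency and a smoothed inverse document frequency, idf = ln((1 + N) / (1 + df)) + 1. The function-word list `FUNCTION_WORDS` and the helper names `tokenize` and `tfidf_vectors` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter

# Hypothetical function-word list; the paper's actual list is not given.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "but", "with"]


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    stripped = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [w for w in stripped if w]


def tfidf_vectors(docs, vocab=FUNCTION_WORDS):
    """Return one TF-IDF vector per document, restricted to `vocab`.

    Assumed weighting (not taken from the paper):
      tf  = count of the word / number of tokens in the document
      idf = ln((1 + N) / (1 + df)) + 1, with N documents and df the
            number of documents containing the word
    """
    counts = [Counter(tokenize(d)) for d in docs]
    lengths = [max(sum(c.values()), 1) for c in counts]  # avoid division by zero
    n_docs = len(docs)
    df = {w: sum(1 for c in counts if c[w] > 0) for w in vocab}
    idf = {w: math.log((1 + n_docs) / (1 + df[w])) + 1.0 for w in vocab}
    return [
        [c[w] / n * idf[w] for w in vocab]
        for c, n in zip(counts, lengths)
    ]


if __name__ == "__main__":
    sample = ["The cat and the dog.", "A cat of the night."]
    for vector in tfidf_vectors(sample):
        print([round(x, 3) for x in vector])
```

Each resulting vector could then be passed, alone or combined with other stylometric features, to a classifier such as a support vector classifier. The sketch stops at feature extraction because the classifier settings are not described here.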
