A Novel Approach of Clustering Documents: Minimizing Computational Complexities in Accessing Database Systems

IAJIT

Written by Ghadeer
Update: 30/06/2022

Mohammed Alghobiri

Department of Management Information Systems

King Khalid University, Saudi Arabia

maalghobiri@kku.edu.sa

Khalid Mohiuddin

Department of Management Information Systems

King Khalid University, Saudi Arabia

kalden@kku.edu.sa

Mohammed Abdul Khaleel

Department of Computer Science

King Khalid University, Saudi Arabia

mkhlel@kku.edu.sa

Mohammad Islam

Department of Management Information Systems

King Khalid University, Saudi Arabia

maleslam@kku.edu.sa

Samreen Shahwar

Department of Information Systems

King Khalid University, Saudi Arabia

smrin@kku.edu.sa

Osman Nasr

Department of Management Information Systems

King Khalid University, Saudi Arabia

oanassr@kku.edu.sa

Abstract: This study addresses the real-time issue of managing an academic program's documents in a university environment. In practice, classifying the documents in a corpus is challenging when the dataset is large, and the complexity increases when specific document management requirements must be met. This study presents a practical approach to grouping documents based on a content similarity measure. The approach analyzes the performance of state-of-the-art clustering algorithms and considers Hamiltonian graph properties and a distance function. The distance function measures (1) the content similarity between documents and (2) the distances between the produced clusters. The proposed algorithm improves cluster quality by applying Hamiltonian graph properties. A significant characteristic of the proposed function is that it determines the document types present in the corpus, so the number of clusters does not have to be assumed before the algorithm runs. The approach avoids the arbitrary initial choice of k centroids required by the k-means algorithm, reduces computational complexity, and overcomes some limitations of commonly used clustering algorithms. It offers information systems developers an effective way to organize documents when designing document management systems.
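To make the idea of grouping by content similarity without a preset cluster count concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a plain term-frequency representation, cosine distance as the distance function, and a hypothetical `max_distance` threshold; documents closer than the threshold are linked, and each connected group becomes a cluster. The Hamiltonian-graph refinement described in the abstract is not reproduced here, and all function names are illustrative.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term frequencies (assumed representation, not from the paper)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def cluster_by_threshold(docs, max_distance=0.6):
    """Link documents whose distance is within max_distance (hypothetical
    threshold) and return the connected groups as lists of document indices.
    The number of clusters is not supplied; it falls out of the data."""
    vectors = [term_vector(d) for d in docs]
    parent = list(range(len(docs)))  # union-find forest over document indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine_distance(vectors[i], vectors[j]) <= max_distance:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    sample = [
        "course syllabus database systems",
        "database systems course syllabus outline",
        "exam results spring term",
        "spring term exam results report",
    ]
    print(cluster_by_threshold(sample))
```

In this sketch the pairwise comparison is quadratic in the number of documents, so it illustrates only the "no initial k" property, not the reduction in computational complexity that the abstract reports.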

Keywords: Clustering algorithms, document categorization, document clustering, Hamiltonian graph, similarity measure.

Received July 11, 2020; accepted February 21, 2021
https://doi.org/10.34028/iajit/19/4/6

Published in July 2022, No. 4
