Mining Multiple Large Data Sources
Animesh Adhikari1, Pralhad Ramachandrarao2, Bhanu Prasad3, and Jhimli Adhikari4
1Department of Computer Science, S. P. Chowgule College, India
2Department of Computer Science and Technology, Goa University, India
3Department of Computer and Information Sciences, Florida A&M University, USA
4Department of Computer Science, Narayan Zantye College, India
Abstract: Effective data analysis using multiple databases requires highly accurate patterns. Local pattern analysis might extract low-quality patterns from multiple large databases, so multi-database mining based on local pattern analysis needs to be improved. We review existing specialized and generalized techniques for mining multiple large databases. We then formalize the idea of multi-database mining using local pattern analysis and propose a new generalized technique for mining multiple large databases. The proposed technique significantly improves the quality of synthesized global patterns. We conduct experiments on both real and synthetic databases to judge its effectiveness.
Keywords: Multi-database mining, pipelined feedback technique, synthesis of patterns.
Received December 21, 2008; accepted February 8, 2009