Privacy-Preserving Data Mining in Homogeneous Collaborative Clustering
Mohamed Ouda, Sameh Salem, Ihab Ali and El-Sayed Saad
Department of Communication Electronics and Computer
Engineering, Helwan University, Egypt
Abstract: Privacy has become an important concern in data mining. This paper proposes a novel privacy-preserving clustering algorithm for distributed environments. Each site clusters its data locally and transfers encrypted aggregate information, consisting of the cluster centroids and their sizes, to the master site. From this local information the master site reconstructs the global centroids, which are then sent back to all sites so that they can update their local centroids. The proposed algorithm is integrated with the Elliptic Curve Cryptography (ECC) public-key cryptosystem and Diffie-Hellman key exchange. The distributed encrypted scheme increases running time by no more than 15% relative to the distributed non-encrypted scheme, yet reduces running time by at least 48% relative to the centralized scheme on a dataset of the same size. Theoretical and experimental analysis shows that the proposed algorithm effectively solves the privacy-preserving clustering problem over distributed data.
Keywords: Privacy-preserving; secure multi-party computation; k-means clustering algorithm.
Received December 20, 2013; accepted April 4, 2013
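The aggregation step summarized in the abstract, in which the master site reconstructs global centroids from per-site centroids and cluster sizes, can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: clusters are aligned by index across sites, the global centroid is the size-weighted mean of the local centroids, and the encryption layer (ECC and Diffie-Hellman key exchange) is left out entirely. The name merge_centroids and the data layout are hypothetical.

```python
from typing import List, Tuple

Centroid = List[float]
# One (centroid, cluster_size) pair per cluster, as reported by a single site.
LocalSummary = List[Tuple[Centroid, int]]


def merge_centroids(summaries: List[LocalSummary]) -> List[Centroid]:
    """Combine per-site (centroid, size) pairs into global centroids.

    Assumption: cluster j at every site refers to the same global cluster,
    and the global centroid is the size-weighted mean of the local ones.
    """
    k = len(summaries[0])            # number of clusters
    dim = len(summaries[0][0][0])    # dimensionality of the data
    merged: List[Centroid] = []
    for j in range(k):
        total = sum(site[j][1] for site in summaries)
        if total == 0:
            # No site assigned any point to cluster j; keep the first
            # site's centroid unchanged (an arbitrary illustrative choice).
            merged.append(list(summaries[0][j][0]))
            continue
        merged.append([
            sum(site[j][0][d] * site[j][1] for site in summaries) / total
            for d in range(dim)
        ])
    return merged


# Two hypothetical sites, each reporting two clusters in two dimensions.
site_a: LocalSummary = [([0.0, 0.0], 2), ([10.0, 10.0], 1)]
site_b: LocalSummary = [([3.0, 3.0], 1), ([12.0, 12.0], 3)]
global_centroids = merge_centroids([site_a, site_b])
```

Only centroids and sizes cross site boundaries in this sketch, never the raw records, which is the property the scheme relies on. In the proposed scheme these summaries would additionally be encrypted before transfer.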