Privacy preserving regression modelling via distributed computation
- 22 August 2004
- proceedings article
- Published by Association for Computing Machinery (ACM)
- p. 677-682
- https://doi.org/10.1145/1014052.1014139
Abstract
Reluctance of data owners to share their possibly confidential or proprietary data with others who own related databases is a serious impediment to conducting a mutually beneficial data mining analysis. We address the case of vertically partitioned data -- multiple data owners/agencies each possess a few attributes of every data record. We focus on the case of the agencies wanting to conduct a linear regression analysis with complete records without disclosing values of their own attributes. This paper describes an algorithm that enables such agencies to compute the exact regression coefficients of the global regression equation and also perform some basic goodness-of-fit diagnostics while protecting the confidentiality of their data. In more general settings beyond the privacy scenario, this algorithm can also be viewed as a method for the distributed computation of regression analyses.
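To make the vertically partitioned setting concrete, the sketch below shows one way the global least-squares coefficients could be assembled from per-agency column blocks. It is not the algorithm described in the paper: the cross-agency products are computed in the clear here, so nothing in it protects confidentiality. The function names (`fit_partitioned`, `solve`), the two agencies, and the toy data are all assumptions made for illustration. The only fact it relies on is that the normal equations of the pooled regression decompose into within-agency and cross-agency blocks.

```python
# Illustrative only: NOT privacy preserving and NOT the paper's protocol.
# All names and data below are hypothetical.

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_partitioned(blocks, y):
    """Least-squares coefficients from column blocks, one block per agency.

    Builds X^T X block by block (X_i^T X_j for every pair of agencies)
    and X^T y (X_i^T y for every agency), then solves the normal equations.
    """
    y_col = [[v] for v in y]
    gram, rhs = [], []
    for block_i in blocks:
        block_i_t = transpose(block_i)
        row_blocks = [matmul(block_i_t, block_j) for block_j in blocks]
        for r in range(len(block_i_t)):
            gram.append([v for blk in row_blocks for v in blk[r]])
        rhs.extend(v[0] for v in matmul(block_i_t, y_col))
    return solve(gram, rhs)

# Hypothetical agency A holds an intercept column and attribute x1.
agency_a = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]]
# Hypothetical agency B holds attribute x2 for the same five records.
agency_b = [[2.0], [1.0], [4.0], [3.0], [6.0]]
# Response generated exactly as y = 1 + 2*x1 + 3*x2 (toy data, no noise).
y = [1.0 + 2.0 * a[1] + 3.0 * b[0] for a, b in zip(agency_a, agency_b)]

coef_partitioned = fit_partitioned([agency_a, agency_b], y)

# Reference fit on the pooled records, as if one owner held every column.
pooled = [a + b for a, b in zip(agency_a, agency_b)]
coef_pooled = fit_partitioned([pooled], y)
```

In this sketch the only quantities that involve more than one agency are the cross-products such as `agency_a` transposed times `agency_b`. A privacy-preserving scheme would need to obtain the coefficients without revealing the underlying attribute values, which is the problem the paper addresses.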
This publication has 7 references indexed in Scilit:
- Privacy-Preserving Multivariate Statistical Analysis: Linear Regression and Classification. Published by Society for Industrial & Applied Mathematics (SIAM), 2004
- Privacy-preserving k-means clustering over vertically partitioned data. Published by Association for Computing Machinery (ACM), 2003
- Tools for privacy preserving distributed data mining. ACM SIGKDD Explorations Newsletter, 2002
- Privacy preserving association rule mining in vertically partitioned data. Published by Association for Computing Machinery (ACM), 2002
- Secure multi-party computation problems and their applications. Published by Association for Computing Machinery (ACM), 2001
- Distributed Multivariate Regression Using Wavelet-Based Collective Data Mining. Journal of Parallel and Distributed Computing, 2001
- An efficient method for finding the minimum of a function of several variables without calculating derivatives. The Computer Journal, 1964