Colloquium: Towards Statistical Queries over Distributed Private User Data

Computer Science Colloquium

"Towards Statistical Queries over Distributed Private User Data"

Paul Francis, Max Planck Institute for Software Systems, Kaiserslautern, Germany

Davis Marksbury Building Theater

Abstract: Today the method du jour for statistical analysis of user behavior is to gather lots of user data, anonymize it (more or less), and then analyze that data. The need for statistical analysis drives many companies to gather large amounts of user data, often without the users' awareness. My research group at MPI-SWS has been exploring approaches for doing statistical analysis without gathering user data. Instead, user data is kept on user devices, and queries are pushed to these devices. The resulting answers are anonymized and fuzzed such that 1) no single party can associate data with individual users, and 2) the aggregate answers are differentially private. In this talk, I will describe a general approach that we will present at NSDI this year. I will outline the shortcomings of this approach, and follow with some enhancements that scale better in specific application domains, namely web analytics and behavioral advertising.

Hosts: Ken Calvert and Jim Griffioen