Then join this exciting data privacy competition, with up to $150,000 in prizes. Participants will propose mechanisms that protect personally identifiable information while preserving a dataset's utility for analysis, using differentially private synthetic data generation tools.
Our increasingly digital world turns almost all of our daily activities into data collection opportunities, from the obvious, such as filling in a web form, to connected cars, cell phones, and wearables. Dramatic increases in computing power and innovation over the last decade, along with increasingly automated data collection by both public and private organizations, make it possible to combine the data from all of these sources for valuable research and analysis.
Because these datasets contain sensitive information, and individuals risk being identified even in anonymized data, they can’t easily be made available to analysts and researchers. Differentially private synthetic data generation addresses this problem by producing new, artificial data that can serve as a practical replacement for the original sensitive data.
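To make the idea concrete, here is a minimal sketch of one simple way to produce differentially private synthetic data for a single numeric column: build a histogram, add Laplace noise to the counts, and sample new values from the noisy histogram. It is an illustration only, not the challenge's method or scoring setup. The function name `dp_synthetic_sample`, the example ages, and the bin edges are all hypothetical. The sketch assumes that neighboring datasets differ by adding or removing one record (so each count has sensitivity 1) and that the bin edges and the number of synthetic samples are fixed in advance without looking at the data.

```python
import numpy as np

def dp_synthetic_sample(values, bin_edges, epsilon, n_samples, rng):
    """Sample synthetic values from a Laplace-noised histogram (illustrative sketch)."""
    # bin_edges must be chosen without looking at the data (assumption of this sketch).
    counts, edges = np.histogram(values, bins=bin_edges)
    # Adding or removing one record changes exactly one count by 1,
    # so Laplace noise with scale 1/epsilon gives epsilon-differential privacy.
    noisy = counts + rng.laplace(loc=0.0, scale=1.0 / epsilon, size=counts.shape)
    # Everything below is post-processing of the noisy counts,
    # which does not weaken the privacy guarantee.
    noisy = np.clip(noisy, 0.0, None)
    if noisy.sum() == 0:
        probs = np.full(len(noisy), 1.0 / len(noisy))
    else:
        probs = noisy / noisy.sum()
    chosen = rng.choice(len(probs), size=n_samples, p=probs)
    # Draw each synthetic value uniformly within its chosen bin.
    return rng.uniform(edges[chosen], edges[chosen + 1])

# Hypothetical example: ages of 1,000 people, binned by decade.
rng = np.random.default_rng(0)
ages = rng.integers(18, 90, size=1000)
bin_edges = np.arange(0, 101, 10)
synthetic = dp_synthetic_sample(ages, bin_edges, epsilon=1.0, n_samples=500, rng=rng)
```

The synthetic values follow roughly the same distribution as the original ages, but no synthetic value is copied from any individual record. Real datasets have many correlated columns, which is where the design of a good algorithm becomes hard.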
The “Differential Privacy Synthetic Data Challenge” will consist of a sequence of three marathon matches run on the Topcoder platform. Contestants will be asked to design and implement their own synthetic data generation algorithms, mathematically prove that their algorithms satisfy differential privacy, and then enter them to compete against others’ algorithms on empirical accuracy over real data, with the prospect of advancing research in the field of differential privacy.
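For readers new to the field, the property such a proof has to establish can be stated briefly. The exact variant and privacy parameters used in the contest are not given here, so the following is the standard definition of pure ε-differential privacy, offered as background: a randomized algorithm M is ε-differentially private if, for every pair of datasets D and D' that differ in one individual's record and for every set S of possible outputs,

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S]
```

In words, whether or not any one person's data is included changes the probability of any outcome by at most a factor of e^ε, so smaller values of ε mean stronger privacy.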
Competitors may enter the contest at any point between November 2018 and April 2019. Topcoder will run three separate Marathon Matches and will carry registrations from earlier matches forward to later ones.
If you’re not a differential privacy expert, and you’d like to learn, we’ll have tutorials to help you catch up and compete!