Okay, that title is a bit of a lie, unless you are a first grader who happens to know a lot about data privacy or data science, in which case I’m happy to recommend some excellent bedtime reading.
Now that differential privacy is once again a trending topic, it’s become the subject of dinner table conversations. Our research team is used to fielding questions on privacy-enhancing technologies, but never like this. After a bit of a struggle, I think I have a solution. If you have been looking for a way to explain differential privacy to your first grader (as you undoubtedly have), then look no further!
Bobby has a crush on Aly. He wants to ask her to eat lunch with him. Of course, he won’t do it without first asking a statistically significant (err, uh, big) group of his closest friends what they think he should do. Bobby’s friends are horrified at the thought of having to give him advice. Bobby can be sensitive, and if it doesn’t work out, he won’t just be upset with the situation. He will be upset with his friends too.
They all meet up at the monkey bars to discuss the question — should Bobby ask Aly to eat lunch with him?
“Let’s just tell him what we think,” says Tom.
“What if I say yes and then Aly tells Bobby no? He’ll blame me for the rejection! Not saying which way I’d vote, but I don’t want Bobby upset with me,” complains Daryl.
“Upset with you!? I agree voting is a terrible idea. He thinks I like Aly. If I tell him not to, he’s going to accuse me of having ulterior motives!” shouts Charlie.
“I know!” says Carey, “what if we all privately vote and just report the total numbers of yes and no answers?”
“That won’t work. What if we all give the same answer? He’ll know how we voted!”
“Are we really all going to vote the same way?”
“Probably not, but I would guess most of us are leaning toward no. If he knows only a handful of us said yes, he won’t be able to let it go. We’ll either be interrogated endlessly, or he’s going to unfriend us all.”
“Or someone could crack and tell him how the rest of us voted!” Daryl says, glaring at Charlie.
“Can’t we just pick like eight people at random?”
“Yeah, and who’s going to sign up for that?! Plus, he’s going to ask us how we voted, and the math better work out,” says Tom, who’s playing with six-sided dice.
Carey says, “We all have to vote, but maybe we don’t all have to vote.”
“Lay off the glue, Carey!” shouts Tom.
“No, I’m serious,” Carey says, snatching one of Tom’s dice, “Mind if I borrow that?”
“Have it. I stole it anyway.”
Carey describes her plan, which essentially boils down to secretly rolling the die and answering yes if you roll 1 or 2, no if you roll 3 or 4, or speaking your mind to Bobby if you roll 5 or 6.
She explains, “Imagine we do this. The odds of being forced to answer yes are the same as the odds of being forced to answer no, so across the whole group the forced answers should roughly cancel out. That means if there are more real noes, the noes have it.”
“But only 1 in 3 of us will really vote!” protests Daryl.
“That’s the beauty of it! Most of us won’t really vote. Bobby can’t accuse you of failing to vote the way he wanted, because you probably didn’t vote at all. Better still, he has absolutely no way of knowing who the true voters are!” says Carey, who continues, “And if he accuses you of anything, you can just tell him that your vote was determined for you.”
The group agrees. Aly is spared, and everyone finds new friends.
For the adults in the room, differential privacy is a mathematical framework for summarizing sensitive data in a way that protects the privacy of the people in the data set. Carey’s die trick is a version of randomized response, a classic technique that fits this framework. In our first-grade scenario, the group votes collectively while each individual keeps plausible deniability about how they voted, freeing them from adverse personal consequences.
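For anyone who wants to see the arithmetic behind Carey’s die, here is a minimal Python sketch. It assumes exactly the rules from the story (1 or 2 forces a yes, 3 or 4 forces a no, 5 or 6 means an honest answer); the function names and the group size in the demo are made up for illustration. Because a reported yes shows up with probability 1/3 + p/3 when the true share of yeses is p, the true share can be estimated as 3 × (reported share) − 1. A friend who would honestly say yes reports yes with probability 2/3, while one who would say no reports yes with probability 1/3, so the privacy parameter epsilon works out to ln 2.

```python
import math
import random


def die_response(true_vote: bool, rng: random.Random) -> bool:
    """Report a vote using Carey's die rule (True means yes)."""
    roll = rng.randint(1, 6)
    if roll <= 2:        # 1 or 2: forced yes
        return True
    if roll <= 4:        # 3 or 4: forced no
        return False
    return true_vote     # 5 or 6: answer honestly


def estimate_yes_share(reports: list[bool]) -> float:
    """Undo the forced answers: reported share = 1/3 + p/3, so p = 3q - 1."""
    reported_share = sum(reports) / len(reports)
    return 3 * reported_share - 1


def privacy_epsilon() -> float:
    """Log of the worst-case ratio between report probabilities."""
    p_yes_if_truly_yes = 2 / 3   # forced yes (1/3) + honest yes (1/3)
    p_yes_if_truly_no = 1 / 3    # forced yes only
    return math.log(p_yes_if_truly_yes / p_yes_if_truly_no)


# Hypothetical playground: 3,000 friends, 30% of whom would honestly say yes.
rng = random.Random(0)
true_votes = [i < 900 for i in range(3000)]
reports = [die_response(vote, rng) for vote in true_votes]
print("estimated yes share:", round(estimate_yes_share(reports), 3))
print("epsilon:", round(privacy_epsilon(), 3))
```

With a group as small as Bobby’s actual circle of friends, the estimate is noisy and can even land below 0 or above 1. The cancellation Carey describes only holds on average, which is the price paid for everyone’s deniability.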
Our research team is continuously tracking the latest mathematical techniques for privacy-enhancing technologies. These are helpful whether you’re a data scientist, compliance professional, passer-by, or pushy first grader.