How Google tricks itself to protect Chrome user privacy

Google is testing a new approach to collecting data in its Chrome browser.
Stephen Shankland/CNET

It’s a sticky issue for software developers: how do you gather data about your product’s users without invading their privacy?

One solution, as embodied in a new Google open-source project called Rappor, is to have the software send data that you know is wrong.

That approach may seem counterintuitive, given how much effort data gatherers usually devote to screening out bad data. The key to Google’s approach, though, is a trick called randomized response that still lets the truth shine through, according to a blog post Thursday by Úlfar Erlingsson, a manager in Google’s security research division.

Rappor allows “the forest of client data to be studied, without permitting the possibility of looking at individual trees,” according to a draft paper on Rappor.

Google is testing the approach in its Chrome browser. It’s been gathering data on what sites people set as their browser’s default homepage so Google can get a better handle on malware that tries to change people’s homepages. About 14 million users are participating in that study, drawn from the larger population of people who’ve agreed to let Chrome send usage data back to Google.

Why gather user data?

Software companies for years have benefited by gathering data from those who use their products: What’s the top cause of crashes? What software features are popular or not? What effect does an interface change have? How many users still have that older operating system?

Typically, software gathers that data and sends it to the software developer, which, if it cares about privacy protection, has the responsibility of “anonymizing” it so identifying details are removed. With Rappor, which stands for randomized aggregatable privacy-preserving ordinal response, the data is muddled before it’s even sent to Google.

Here’s how it works, using the paper’s example of surveying a population for a piece of sensitive information: membership in the Communist party. A respondent first flips a coin. If it’s tails, they answer the question truthfully. If it’s heads, though, they say they are a member regardless of whether they actually are. That muddles the “yes” responses, and another flip can muddle the “no” responses.
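The coin-flip survey can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coin is fair, only the first flip (the forced “yes”) is modeled, and the function names, population size and 10 percent true rate are made up for the example rather than taken from Google’s code. Because half of all respondents are forced to say yes, the expected share of “yes” answers is 0.5 + 0.5 × (true rate), so an analyst can estimate the true rate as 2 × (observed share - 0.5).

```python
import random


def randomized_response(is_member, rng):
    """Return what one respondent reports after a single coin flip."""
    heads = rng.random() < 0.5
    if heads:
        return True       # heads: say "yes" regardless of the truth
    return is_member      # tails: answer truthfully


def estimate_true_rate(reports):
    """Undo the known noise: P(yes) = 0.5 + 0.5 * true_rate."""
    observed_yes = sum(reports) / len(reports)
    return 2 * (observed_yes - 0.5)


def simulate(population_size, true_rate, seed=0):
    """Survey a synthetic population and return the estimated true rate."""
    rng = random.Random(seed)
    truths = [rng.random() < true_rate for _ in range(population_size)]
    reports = [randomized_response(truth, rng) for truth in truths]
    return estimate_true_rate(reports)


if __name__ == "__main__":
    # Hypothetical survey: 200,000 respondents, 10 percent are members.
    print(round(simulate(200_000, 0.10), 3))
```

No single “yes” in the reports proves anything about the person who gave it, yet with a population this large the printed estimate should land close to 0.10. With only a few hundred respondents the same estimate swings much more widely.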

Random noise obscures individual data gathered with Google’s Rappor project. The more participants in a study, the more closely the Rappor responses (light green) match the original values (dark green).
Google

Statistical analysis, of course, can reveal what’s going on with the overall population as long as the tested population is big enough. In the case of Chrome, it’s vast: there are hundreds of millions of users, though many of them doubtless choose not to send usage data to Google.

Chrome homepage study

The Chrome homepage study reveals a bit more about extracting useful information from the raw data. With the 14 million users monitored, a particular homepage wasn’t visible in the statistics until at least 14,000 people were using it. And though the study found 8,616 different homepages being used, only 0.5 percent of them passed that threshold. That small number of Web pages — something less than 50 — were very commonly used, though, accounting for 85 percent of the choices people had made.

The randomized response technique has been around for decades, but one problem is that it can reveal personal information if the same person answers the same question repeatedly. The truth eventually shows through the random noise.

Google, though, says Rappor bypasses this problem. One of its accomplishments is “the elegant manner in which [it] protects the privacy of clients from whom data is collected repeatedly,” the paper said. Google describes the process, called “memoization,” in the paper, cautioning that even randomized data can show patterns over time.
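A simplified sketch shows why repeated questions are a problem and how remembering the first noisy answer helps. It is not Rappor’s actual implementation, which the paper describes in more detail; the class names, the question key and the fair-coin noise are assumptions made for illustration. One respondent draws fresh noise on every request, while the other stores its first randomized answer and replays it.

```python
import random


def noisy_answer(truth, rng):
    """Fair coin: heads forces a "yes", tails gives the true answer."""
    return True if rng.random() < 0.5 else truth


class FreshRespondent:
    """Draws new noise every time the same question is asked."""

    def __init__(self, truth, seed):
        self.truth = truth
        self.rng = random.Random(seed)

    def answer(self, question):
        return noisy_answer(self.truth, self.rng)


class MemoizingRespondent:
    """Randomizes once per question, then repeats that stored answer."""

    def __init__(self, truth, seed):
        self.truth = truth
        self.rng = random.Random(seed)
        self._memo = {}

    def answer(self, question):
        if question not in self._memo:
            self._memo[question] = noisy_answer(self.truth, self.rng)
        return self._memo[question]


def yes_fraction(respondent, question, repeats):
    """Ask the same question many times and return the share of "yes"."""
    answers = [respondent.answer(question) for _ in range(repeats)]
    return sum(answers) / repeats


if __name__ == "__main__":
    # Hypothetical non-member asked the same question 1,000 times.
    print(yes_fraction(FreshRespondent(False, seed=1), "member?", 1000))
    print(yes_fraction(MemoizingRespondent(False, seed=1), "member?", 1000))
```

The fresh respondent’s share of “yes” answers drifts toward one half for a non-member and stays at exactly 1.0 for a member, so an observer who asks often enough can tell the two apart. The memoizing respondent returns all “yes” or all “no”, which reveals no more than its first answer did.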

Since it’s an open-source project, anyone can build Rappor into their own software. Google is encouraging that: It “puts control over client’s data back into their own hands.”
