WebWatch

Google has data-retention change of heart

No more indefinite search logs?

Tags: google, search logs, data retention

By Elinor Mills

Published: 15 March 2007 08:40 GMT

Google is changing its data retention practices to make it harder to identify the specific computers used in searches.

Google's servers log information every time someone conducts a web search. The logs hold the keywords used, the IP address (the unique number assigned to that person's computer) and information from web cookies, which are small bits of data exchanged between a server and a web browser each time the browser accesses the server. Cookies are used to authenticate the user and maintain information such as the user's site preferences.

Currently, Google maintains the search data logs indefinitely. Under the new policy, which it expects to have fully implemented by the end of the year, the company will make the final eight bits of the IP address and the cookie data anonymous after somewhere between 18 and 24 months, unless legally required to retain the data for longer. The information on specific searches will remain indefinitely but it will be much harder to tie searches to specific individuals or computers.

The company said: "Logs anonymisation does not guarantee that the government will not be able to identify a specific computer or user but it does add another layer of privacy protection to our users' data."

The policy change will apply to future web search data as well as archived logs and all copies of the data stored on other servers, Google said. Users will be able to opt out of the practice and request that their search data be maintained indefinitely.

Privacy advocates in general said Google's policy change is a step in the right direction but not nearly enough to really protect web searchers from overzealous law enforcers. Keeping the search histories could enable investigators and governments to get to all sorts of personal information about people, they argue.

Marc Rotenberg, executive director of the Electronic Privacy Information Center, said: "I don't think the Google proposal is adequate. This period is too long and it's not in fact data destruction, it's more data de-identification, and that should be happening in 18 to 24 hours, not months. I'm not persuaded that this isn't still a ticking time-bomb for Google's search engine."

Richard M Smith, an internet security and privacy consultant at Boston Software Forensics, said Google should never be archiving the IP address and cookies on servers. "Google should not be in the spy business," he said. "By logging IP addresses and search strings they are running the largest intelligence operation in the world."

Anonymising the last eight bits of an IP address would still allow investigators to narrow a search down to one of 256 possible addresses, and so to a small group of computers or users.
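To see where the figure of 256 comes from, here is a minimal sketch in Python. It assumes an IPv4 address and assumes that "anonymising" means setting the final eight bits to zero; Google has not said exactly how it will obscure them. The function names and the sample address (taken from a range reserved for documentation) are illustrative only.

```python
import ipaddress
from typing import List


def anonymise_last_octet(ip: str) -> str:
    """Zero the final eight bits of an IPv4 address (assumed method)."""
    addr = ipaddress.IPv4Address(ip)
    # Keep the first 24 bits and clear the last 8.
    masked = int(addr) & 0xFFFFFF00
    return str(ipaddress.IPv4Address(masked))


def candidate_addresses(anonymised: str) -> List[str]:
    """List every address that collapses to the same anonymised value."""
    network = ipaddress.IPv4Network(f"{anonymised}/24")
    return [str(address) for address in network]


logged = "192.0.2.77"  # hypothetical logged address
stored = anonymise_last_octet(logged)
candidates = candidate_addresses(stored)
print(stored, len(candidates))
```

Eight bits can take 2 to the power of 8, or 256, different values, so each anonymised entry could have come from any of 256 addresses, the original among them.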

Ari Schwartz, deputy director for the Center for Democracy and Technology, said: "For most average consumers that is pretty much anonymous," because many people connect to the internet through large companies that dynamically assign IP addresses, making it even harder to determine exactly which person conducted a search. "It is a risk but it is better than what we have today."

Kevin Bankston, staff attorney at the Electronic Frontier Foundation, said he would like to see Google scrub the entire IP address within six months but praised the company for making this "positive first step".

He added: "We hope other online service providers will heed this example and work to minimise the amount of data they keep about their customers."

Google said it can't anonymise the entire IP address, delete it altogether or anonymise any of it sooner than 18 months because it needs the data to analyse usage patterns and diagnose system problems. For example, it uses the information for fraud detection and prevention and to combat denial of service attacks which can temporarily cripple or shut down servers.

Nicole Wong, deputy general counsel for Google, said: "Knowing what country a user is coming from helps us figure out whether or not we are delivering the right search."

Elinor Mills writes for CNET News.com
