
Googlebot unearths confidential info for all to see...
By Elinor Mills
Published: 2 February 2006 08:40 GMT
Dell apparently learnt the hard way this week that companies must take care that information they store on the internet and want to keep hidden is not automatically added to a search engine index for everyone on the web to see.
Specifications for future Dell laptops were accessible via Google's search site before the content was pulled from a Dell file transfer protocol site and from Google's cache.
Google, like the other major search engines, runs an automated system that sends software robots called "spiders" out to crawl the web and find sites to add to the index of websites it maintains. Because the spiders follow links running from one website to others, they pick up sites on their own without webmasters having to manually submit them to search engines.
Webmasters can also provide the URL, or web address, for pages they want crawled, and they can submit detailed site maps to Google, according to Google's "information for webmasters" pages.
Webmasters who want to keep some or all of their site private from the Googlebot can put a standard document called "robots.txt" at the root of the server that instructs the crawler not to download content. If content that has already been indexed needs to be removed urgently, the webmaster can submit a request via Google's automatic URL removal system but must first provide an email address and password.
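As a rough illustration of how such a file works, the sketch below defines a sample robots.txt and checks it with the parser in Python's standard library. The rules, the example.com URLs and the is_allowed helper are assumptions made for the example, not Dell's or Google's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, as it might be served from the root of a site.
# Googlebot is kept out of /private/ only; every other crawler is kept out entirely.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("Googlebot", "http://example.com/private/specs.pdf"),
        ("Googlebot", "http://example.com/index.html"),
        ("OtherBot", "http://example.com/index.html"),
    ]:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Compliance is voluntary: the file asks well-behaved crawlers to stay away but does not stop anyone from requesting the content directly, so it is no substitute for access controls on genuinely confidential files.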
Content that has been removed from a site can still be viewed through Google's cache, which is a "snapshot" and archive of each page crawled. Webmasters can prevent pages from being cached by inserting specific code on them.
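The article does not name the code in question. A common mechanism, assumed here, is a robots meta tag carrying the "noarchive" directive in the page's head. The sketch below uses a hypothetical helper, blocks_caching, to check whether a page includes such a tag; the sample page is invented for the example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> or <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, noarchive".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def blocks_caching(html):
    """Return True if the page asks crawlers not to keep a cached copy."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


# Hypothetical page that opts out of caching.
PAGE = (
    '<html><head><meta name="robots" content="noarchive"></head>'
    "<body>Laptop specifications</body></html>"
)

if __name__ == "__main__":
    print(blocks_caching(PAGE))
```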
Elinor Mills writes for CNET News.com