WebWatch

Google bots are crawling in a new way

On the hunt for HTML forms…

By Stephen Shankland

Published: 16 April 2008 08:55 GMT

Google's search bots, which scour the web constantly for new pages, have begun a new, more active phase of their indexing jobs.

In a blog post last week, Jayant Madhavan and Alon Halevy of Google's crawling and indexing team said the company has begun an experiment in which its indexing software enters text in website forms to see what previously undiscovered pages may appear.

The post said: "In the past few months, we have been exploring some HTML forms to try to discover new web pages and URLs that we otherwise couldn't find and index for users who search on Google. This experiment is part of Google's broader effort to increase its coverage of the web. In fact, HTML forms have long been thought to be the gateway to large volumes of data beyond the normal scope of search engines."

The new Google indexing practice involves only "high quality" websites and doesn't run on sites that use 'robots.txt' files or other standard mechanisms to ward off indexing software.
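As a rough illustration of how a crawler can honour a 'robots.txt' file before touching a form, here is a minimal Python sketch built on the standard library's `urllib.robotparser`. It is not Google's code: the `may_crawl` helper, the `ExampleBot` user agent and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper for illustration; a real crawler would download
    robots.txt from the site instead of receiving it as a string.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumed example rules: every crawler is barred from the /search path.
EXAMPLE_RULES = "User-agent: *\nDisallow: /search\n"

if __name__ == "__main__":
    print(may_crawl(EXAMPLE_RULES, "ExampleBot", "http://example.com/search?q=cars"))
    print(may_crawl(EXAMPLE_RULES, "ExampleBot", "http://example.com/about"))
```

A crawler following this pattern would run the check on the URL a form submission produces and skip the submission whenever the check fails.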

To decide what words to "type" into the forms, the indexing software samples from among words on the web page with the form, Google said.
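The sampling idea can be sketched in a few lines of Python. This is only an illustration of picking candidate form inputs from the words on a page, not Google's method: the `VisibleTextParser` class, the `candidate_queries` function and its `k`, `min_len` and `seed` parameters are assumptions made for the example, and uniform random sampling is a stand-in for whatever selection Google actually uses.

```python
import random
import re
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects page text, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def candidate_queries(page_html, k=3, min_len=4, seed=None):
    """Return up to k distinct words sampled from the page's text.

    Hypothetical helper: words shorter than min_len are dropped and the
    rest are sampled uniformly at random.
    """
    parser = VisibleTextParser()
    parser.feed(page_html)
    parser.close()
    words = re.findall(r"[A-Za-z]+", " ".join(parser.chunks).lower())
    vocabulary = sorted({w for w in words if len(w) >= min_len})
    rng = random.Random(seed)
    return rng.sample(vocabulary, min(k, len(vocabulary)))


if __name__ == "__main__":
    page = (
        "<html><body><h1>Used car search</h1>"
        "<form><input name='q'></form>"
        "<script>var hidden = 1;</script>"
        "<p>Find sedans and trucks</p></body></html>"
    )
    # Each sampled word would be one candidate value to submit in the form.
    print(candidate_queries(page, k=3, seed=1))
```

Each returned word stands for one trial submission; a crawler would then keep only the submissions that lead to pages it has not seen before.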

The technology appears to be related to that of Transformic, a company Google acquired, according to a blog post by Anand Rajaraman, who was involved with the technology earlier in his career while working for Halevy.
