Where would we be without 'em?

Our experience of the Internet is often facilitated through the use of search engines and search directories. Before they were invented, people’s Net experiences were confined to plowing through sites they already knew of in the hopes of finding a useful link, or finding what they wanted through word of mouth.

As author Paul Gilster puts it in Digital Literacy, "How could the world beat a path to your door when the path was uncharted, uncatalogued, and could be discovered only serendipitously?"

This may have been adequate in the early days of the Internet, but as the Net continued to grow exponentially, it became necessary to develop a means of locating desired content.

At first, search services were rudimentary, but in the course of a few years they have grown quite sophisticated.

Not to mention popular. Search services are now among the most frequented sites on the Web with millions of hits every day.

Although search engines and search directories are different things (less so every day), I will adopt the common usage and call all of them search engines.

 

Archie and Veronica

The history of search engines seems to be the story of university student projects evolving into commercial enterprises and revolutionizing the field as they went.

 

Certainly, that is the story of Archie, one of the first attempts at organizing information on the Net. Created in 1990 by Alan Emtage, a McGill University student, Archie archived what at the time was the most popular repository of Internet files, Anonymous FTP sites.

Archie is short for "Archives"; the name was trimmed to conform to the UNIX convention of short names.

 

What Archie did for FTP sites, Veronica did for Gopherspace. Veronica was created in 1993 at the University of Nevada. Jughead was a similar index of Gopherspace.

 

Robots

Archie and Veronica were for the most part indexed manually. The first real search engine, in the sense of a completely automated indexing system, was MIT student Matthew Gray’s World Wide Web Wanderer.

The Wanderer robot was intended to track the growth of the Web, and initially it counted only web servers. Soon after its launch, it captured URLs as well. This list formed the first database of websites, called Wandex.

Robots at this time were quite controversial. For one thing, they consumed a lot of network bandwidth, and they indexed sites so rapidly that it was not uncommon for them to crash servers.

In the Glossary for Information Retrieval Scott Weiss describes a robot as:

[a] program that scans the web looking for URLs. It is started at a particular web page, and then accesses all the links from it. In this manner, it traverses the graph formed by the WWW. It can record information about those servers for the creation of an index or search facility.

Most search engines are created using robots. The problem with them is, if not written properly, they can make a large number of hits on a server in a short space of time, causing the system’s performance to decay.
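The traversal Weiss describes can be illustrated with a minimal sketch. This is not the Wanderer's code or any real engine's; the names `crawl`, `fetch_links`, `delay_seconds`, and the `TOY_WEB` link table are all hypothetical, and the "web" here is an in-memory dictionary so that the example touches no network. The optional delay stands in for the kind of pacing a well-behaved robot would need to avoid hammering a server.

```python
from collections import deque
import time

def crawl(start_url, fetch_links, delay_seconds=0.0, max_pages=100):
    """Breadth-first walk of the link graph, starting at start_url.

    fetch_links is a caller-supplied function that returns the list of
    URLs linked from a page (hypothetical; a real robot would download
    and parse the page here).
    """
    seen = {start_url}          # URLs already queued, so none is visited twice
    queue = deque([start_url])  # URLs waiting to be visited
    visited = []                # visit order; this would feed an index

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
        if delay_seconds:
            # Pause between requests so the robot does not flood a server.
            time.sleep(delay_seconds)
    return visited

# A toy four-page web, invented for illustration only.
TOY_WEB = {
    "http://a.example/": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://a.example/", "http://d.example/"],
    "http://c.example/": ["http://d.example/"],
    "http://d.example/": [],
}

def toy_fetch_links(url):
    return TOY_WEB.get(url, [])

if __name__ == "__main__":
    for page in crawl("http://a.example/", toy_fetch_links):
        print(page)
```

The `max_pages` cap and the delay are the two simplest guards against the runaway behaviour described above; without them, a robot keeps requesting pages as fast as the server can answer.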

The First Web Directory

In response to the problems with automated indexing of the Web, Martijn Koster created Aliweb in October 1993. The name stands for Archie-Like Indexing of the Web. This was the first attempt to create a directory for just the Web.

Instead of relying on a robot, Aliweb had webmasters submit a file containing their URL and their own description of the site. This allowed for a more accurate, detailed listing.

Unfortunately, the application file was difficult to fill out, so many websites were never listed with Aliweb.
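To make the idea of a webmaster-submitted listing concrete, here is a rough sketch of how a directory might read such files and build a keyword index from the descriptions. The `Key: value` layout, the field names, and the functions `parse_site_record` and `build_index` are assumptions made for illustration; they are not Aliweb's actual file format or code.

```python
def parse_site_record(text):
    """Parse a simple 'Key: value' listing into a dictionary.

    The format is hypothetical: one field per line, key before the
    first colon. Blank or malformed lines are skipped.
    """
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)  # split once, so URLs keep their colons
        record[key.strip().lower()] = value.strip()
    return record

def build_index(records):
    """Map each word in a site's description to the URLs that use it."""
    index = {}
    for record in records:
        url = record.get("url")
        if not url:
            continue  # a listing without a URL cannot be indexed
        for word in record.get("description", "").lower().split():
            word = word.strip(".,;:!?")
            if word:
                index.setdefault(word, set()).add(url)
    return index

# A made-up submission, for illustration only.
SAMPLE = """\
URL: http://www.example.com/
Title: Example Bakery
Description: Bread and pastry recipes.
"""

if __name__ == "__main__":
    site = parse_site_record(SAMPLE)
    index = build_index([site])
    print(sorted(index))
```

Because the description is written by the site's owner, the index reflects what the owner says the site is about, which is the accuracy advantage described above. It also shows the weakness: nothing gets indexed unless someone fills out and submits the file.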

 


 

by Glen Farrelly, July 1999