Deep Web

The World Wide Web has become so big that search engines can’t index it all; in fact, they find only a small proportion of it. There’s also lots of stuff out there, mostly in databases, that can’t be reached at all by the conventional search technologies in use since the Web began. The firm BrightPlanet has estimated that this deep Web (a term it seems to have invented) contains 7,500 terabytes of data, compared with about 19 terabytes of data on what it calls the surface Web, numbers impossible to visualise in other than the vaguest way. Even if these figures are overestimates, they still suggest that there is a lot of material out there that would be useful if only one could find it. The firm also points out that the deep data is usually of excellent quality, and that most of it is publicly accessible without charge. Now we have to find a way of getting at it.

BrightPlanet estimates that this so-called “deep Web” could be 500 times larger than the surface Web that most search engines try to cover.

NewsScan Daily, Jan. 2001

The FAA database is part of the invisible Web, sometimes called the “deep Web” — a vast repository of information hidden in databases that general-purpose search engines don’t reach.

The Industry Standard, Sep. 2000