Web site design and development specialist Cyber Media

It’s information, Jim, but not as we know IT!

Thursday, July 1, 2004

OK, so for years careers professionals have been able to turn to a trusted, well-thumbed collection of paper-based resources, supplemented by some well-chosen software programs, for up-to-date information. All is safe and sound: no resources find their way into use without being thoroughly evaluated first. However, for many of today’s youngsters, if it can’t be found on the web then it doesn’t exist, and books are things only old people use. So what can we make of information found on the Internet? And how can we help people find our information? Well, let’s do a little exploring to see if we can find out more ... and where better to start than with a look at “Search Engines”?

Search engines don't really search the Web directly when you ask them to. Instead, they search their own database, which holds information about web pages that have already been indexed. When you click on a link in the search results, however, you jump to the current version of the page (which may have changed since it was indexed, or may no longer be there).

Search engine databases are typically populated automatically, by programs called “spiders”. These programs find pages by following the links in the pages already held in the database; the search engine then ranks the pages according to its own unique method, or algorithm. If no other page links to a web page, search engine spiders cannot find it. So the only way a brand new page, one that no other page has ever linked to, can get into a search engine is for its web address (URL) to be sent to the search engine company with a request that the new page be included. Most search engine companies offer ways to do this.
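To make the link-following idea concrete, here is a minimal sketch in Python. It does not touch the real Internet: the tiny "web" below, its addresses and the function name `crawl` are all invented for illustration, and real spiders are far more elaborate. It simply shows why a page that nothing links to stays invisible until its address is submitted.

```python
from collections import deque

# Hypothetical miniature "web": each address maps to the links found on that page.
WEB = {
    "directory.example/": ["a.example/", "b.example/"],
    "a.example/": ["b.example/", "c.example/"],
    "b.example/": [],
    "c.example/": ["a.example/"],
    "new.example/": ["a.example/"],  # links out, but no other page links to it
}


def crawl(seeds, web=WEB):
    """Return the set of pages reachable by following links from the seed addresses."""
    found = set()
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        if url in found or url not in web:
            continue  # already visited, or the page does not exist
        found.add(url)
        queue.extend(web[url])  # follow every link on this page later
    return found


# Starting from the directory alone, the spider never reaches new.example/.
pages_without_submission = crawl(["directory.example/"])

# Submitting the new address adds it to the starting points, so it is found.
pages_with_submission = crawl(["directory.example/", "new.example/"])
```

In this sketch, "submitting a URL" is modelled as nothing more than adding the address to the list of starting points.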

After spiders find pages, they pass them on to another computer program for indexing. This program identifies the text, links, and other content in the page and stores it in the search engine's database so that the database can subsequently be searched, and the page will then be found if your search matches its content.
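The indexing step can be sketched in the same spirit. The example below is an assumption-laden toy, not how any particular search engine works: the page texts, the addresses and the function names `build_index` and `search` are made up. It records which pages contain which words, then answers a query from that record rather than from the pages themselves.

```python
import re
from collections import defaultdict

# Hypothetical page texts, as a spider might have handed them over.
PAGES = {
    "a.example/": "Careers advice for school leavers",
    "b.example/": "Advice on choosing a university course",
    "c.example/": "Local bus timetables",
}


def words_in(text):
    """Split text into lower-case words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page addresses whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in words_in(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = words_in(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(PAGES)
advice_pages = search(index, "advice")           # pages mentioning "advice"
careers_pages = search(index, "careers advice")  # pages mentioning both words
```

Because the search only consults the stored index, a page that has changed or vanished since it was indexed can still appear in the results.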

Internet directories are often mistaken for search engines, because users can search the sites listed in them, but directories are actually databases of hand-selected, human-reviewed sites that have been arranged into a hierarchy of topical categories.

Directories serve as excellent starting points for navigating the Internet. Even search engines themselves view directories as valuable starting points, sending their spiders to the directories to get started on their journey through the Internet. By starting at a directory, a search engine is able to find high-quality, hand-selected sites to add to its database. The search engine then "follows" the links on those sites to find a second set of sites, and so on as its spiders work their way through the Internet.

Two leading sites worth starting with are www.google.co.uk (a search engine) and www.yahoo.co.uk (an Internet directory).

So, we understand how search engines work, but can information providers help search engines find the right information for us? Well, the answer is yes, and there is a move towards standards that support this approach. The answer is being found in metadata, commonly defined as data about data, or information about information. A growing number of information providers are adopting the basic metadata standard known as Dublin Core, maintained by the Dublin Core Metadata Initiative. (The original workshop for the Initiative was held in Dublin, Ohio, USA in 1995, hence the term "Dublin Core" in its name; see http://dublincore.org.) The UK Government has taken this as a starting point and incorporated it into the e-GIF (e-Government Interoperability Framework) and associated standards for use on all government web sites. Standardising the metadata embedded in documents stored on the web allows these documents to be catalogued more easily by search engines, and thus found using well-understood search words and phrases.
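As an illustration of what embedded metadata can look like, Dublin Core elements are commonly written into a web page as `meta` tags whose names begin with "DC.". The sketch below reads such tags using Python's standard `html.parser` module. The sample page, its creator and subject values, and the names `DublinCoreParser` and `extract_dublin_core` are assumptions made up for this example; it is not a description of how any search engine or the e-GIF standards process metadata.

```python
from html.parser import HTMLParser

# Hypothetical page header carrying Dublin Core metadata (values are illustrative).
SAMPLE_PAGE = """
<html>
<head>
<title>Search engines explained</title>
<meta name="DC.title" content="Search engines explained">
<meta name="DC.creator" content="Example Careers Service">
<meta name="DC.date" content="2004-07-01">
<meta name="DC.subject" content="search engines; metadata">
<meta name="keywords" content="not a Dublin Core element">
</head>
<body><p>Page text goes here.</p></body>
</html>
"""


class DublinCoreParser(HTMLParser):
    """Collect the content of every <meta> tag whose name starts with 'DC.'."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.lower().startswith("dc."):
            # Store under the element name without the "DC." prefix.
            self.metadata[name[3:].lower()] = attributes.get("content") or ""


def extract_dublin_core(html):
    """Return a dictionary of Dublin Core elements found in the page."""
    parser = DublinCoreParser()
    parser.feed(html)
    return parser.metadata


record = extract_dublin_core(SAMPLE_PAGE)
```

A cataloguing program that knows to look for the same element names on every page can file documents by title, creator, date and subject without having to guess them from the page text.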

Most people today are sufficiently informed to know that just because you saw it on TV doesn’t necessarily mean that it’s true. Any information found on the Internet should be treated with even more caution. Users will soon find some sites which they trust due to branding and experience – many Connexions and IAG sites would be included here, such as http://www.connexions-norfolk.co.uk or http://www.staffsiag.com. Several resources are also available which will point towards reliable sites – see the discussion in “Weaving through the Web” elsewhere in this issue of Career Guidance Today. After that it is important to look for consensus in the information found and to venture forth with much caution.

So maybe, if it’s worth finding, it can be found on the web, but it just might take some searching out.

Copyright ©2007 Cyber Media Solutions Ltd
All rights reserved
