Google's new Custom Search Engine aims to improve the relevance of search results while attracting more users to its services. Users can create search engines dedicated to a particular topic by restricting results to a list of specified web sites, or by giving certain sites priority. They can also annotate sites with descriptive labels, collaborate with others to fine-tune a list, and share the revenue from context-sensitive advertising in the search results. Another idea is for organisations to use it as a search engine for their own web sites.
The concept helps to solve two of the core problems of search. The first is ambiguity.
A search for WPF, for example, could bring back hits for the World Puzzle Federation as well as Windows Presentation Foundation; if it is the latter I am looking for, a search engine dedicated to programming can find it. The second issue is that the web has millions of dud sites and blogs set up solely to profit from advertising clicks. A custom white list can avoid these.
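To make the white-list idea concrete, here is a minimal sketch in Python of how such filtering might work. It is not Google's implementation: the site lists, the URLs and the function names are all invented for illustration. It drops any result whose host is not on the list and moves results from priority sites to the top.

```python
from urllib.parse import urlparse

# Hypothetical site lists for a programming-focused search engine.
ALLOWED_SITES = {"docs.example.com", "forum.example.org"}
PRIORITY_SITES = {"docs.example.com"}


def host_of(url):
    """Return the lower-cased host name of a URL, or '' if it has none."""
    return (urlparse(url).hostname or "").lower()


def custom_search(results, allowed=ALLOWED_SITES, priority=PRIORITY_SITES):
    """Keep only white-listed results, with priority sites listed first."""
    kept = [url for url in results if host_of(url) in allowed]
    # sorted() is stable, so the original ranking is preserved within
    # the priority group and within the remaining group.
    return sorted(kept, key=lambda url: host_of(url) not in priority)


raw_results = [
    "http://puzzles.example.net/wpf",      # wrong meaning of WPF
    "http://forum.example.org/wpf-help",   # on the white list
    "http://ad-farm.example.biz/wpf",      # dud site, not on the list
    "http://docs.example.com/wpf",         # on the white list, priority
]
filtered = custom_search(raw_results)
```

In this sketch `filtered` holds only the two white-listed pages, with the priority site first. The approach filters and reorders what a general search returns; it does nothing to understand the pages themselves.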
The downside is the time needed to find the right custom search engine. Some of us prefer to just type into Google, hit search and put up with some irrelevant results. And a custom search engine is at best merely a way of filtering results.
In theory, a better answer to the search conundrum is the semantic web. This enables a web site or any online database to describe its content using RDF (Resource Description Framework), which uniquely identifies and describes resources using properties and other assertions.
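As a rough illustration of what such descriptions look like, an RDF statement can be pictured as a triple of subject, property and value. The sketch below models triples as plain Python tuples rather than using a real RDF library such as Jena. The two property URIs come from the Dublin Core vocabulary, but the pages, the data and the helper functions are assumptions made up for this example.

```python
# Dublin Core property URIs; the pages described below are invented.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"

# Each statement is a (subject, property, value) triple.
triples = {
    ("http://example.org/wpf-guide", DC_TITLE, "Getting started with WPF"),
    ("http://example.org/wpf-guide", DC_SUBJECT, "Windows Presentation Foundation"),
    ("http://example.org/puzzle-news", DC_TITLE, "WPF championship results"),
    ("http://example.org/puzzle-news", DC_SUBJECT, "World Puzzle Federation"),
}


def match(store, subject=None, prop=None, value=None):
    """Return the triples that fit the pattern; None matches anything."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (prop is None or t[1] == prop)
        and (value is None or t[2] == value)
    )


def pages_about(store, topic):
    """List the resources whose declared subject is exactly `topic`."""
    return [s for s, _, _ in match(store, prop=DC_SUBJECT, value=topic)]


hits = pages_about(triples, "Windows Presentation Foundation")
```

Both page titles contain "WPF", so a text search cannot tell them apart, whereas the query on the declared subject returns only the programming page.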
HP's research lab is talking up the merits of RDF, having achieved more than 125,000 downloads of its free Jena library for RDF applications. Martin Merry, leader of semantic web research at HP Labs, explained: "We were trying to make the semantic web take off. There wasn't anything available for people wanting to experiment with semantic web technology."
Yet Merry admits that it is still early days for the semantic web. RDF is still being used "more in a specialised form inside enterprises", he said.
A truly semantic web would be significantly more powerful than today's internet, not just for search but for integrating and aggregating data. But the lesson of Google is that clever algorithms to improve simple text search have more impact than attempts to introduce semantic mark-up. The limited progress that has been made is in bottom-up movements like tagging, microformats and RSS, one version of which (RSS 1.0) is based on RDF, rather than in top-down efforts to persuade organisations to publish RDF data. The semantic web remains some way off.









