Wednesday, August 26, 2009

Search Market to Get Another Engine

HP, along with the Indian Institute of Technology, Bombay, is working on an engine to make online search more meaningful

Last year, Hewlett-Packard (HP) Labs initiated open research grants to dozens of universities worldwide. One such grant went to the Computer Science Department of the Indian Institute of Technology, Bombay (IIT-B).

Professor Soumen Chakrabarti and his group at IIT-B used this grant to work on a new search engine which would trawl the web to provide relevant answers to queries. Their efforts are yielding results.

The IIT-B team has already created billions of annotation links between a corpus of 500 million web pages and millions of entities known to Wikipedia. The data is being processed on 42 high-end HP servers with over 350 gigabytes of RAM and over 150 terabytes of disk storage, donated by Yahoo. HP Labs and Microsoft Research have provided additional research funding.

To be successful, any search engine needs a robust mechanism for indexing web pages, and the number of pages on the internet at any given time is enormous. Google, for instance, has over 8 billion pages and over 1.1 billion images indexed. A search engine also needs an efficient crawler, which connects to servers across the World Wide Web to fetch those pages.

In the case of the HP-IIT-B engine, the mainstays are annotation, the indexing of annotations alongside ordinary text, and support for a query language that can combine categories, annotations, quantities and regular text in creative ways, typically ending with evidence aggregation. The key to moving up the search value chain, according to Chakrabarti, has three parts: add semi-structured knowledge to the unstructured corpus in the form of type, entity, category and relationship annotations; index these annotations along with the text; and open up search application programming interfaces (APIs) and query languages to probe these indices and aggregate the resulting knowledge.
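
To make the idea concrete, here is a minimal sketch in Python of indexing entity annotations alongside ordinary text and then aggregating evidence across pages. It is an illustration only, not the IIT-B system: the class `AnnotatedIndex`, its methods, the sample pages and the entity and type labels are all hypothetical, and real annotation, ranking and query languages are far richer than this.

```python
from collections import Counter, defaultdict


class AnnotatedIndex:
    """Toy inverted index holding words and entity annotations side by side.

    Hypothetical illustration; not the HP-IIT-B engine's design.
    """

    def __init__(self):
        # Posting lists: ("text", word) or ("type", entity_type) -> page ids.
        self.postings = defaultdict(set)
        # Page id -> set of (entity, entity_type) annotations on that page.
        self.page_entities = defaultdict(set)

    def add_page(self, page_id, text, annotations):
        """Index a page's words together with its (entity, type) annotations."""
        for word in text.lower().split():
            self.postings[("text", word)].add(page_id)
        for entity, entity_type in annotations:
            self.postings[("type", entity_type)].add(page_id)
            self.page_entities[page_id].add((entity, entity_type))

    def search(self, words, entity_type):
        """Find entities of the given type on pages containing all the words.

        Evidence aggregation here is simply a count of supporting pages.
        """
        keys = [("text", w.lower()) for w in words] + [("type", entity_type)]
        page_sets = [self.postings.get(key, set()) for key in keys]
        matching_pages = set.intersection(*page_sets)

        votes = Counter()
        for page_id in matching_pages:
            for entity, etype in self.page_entities[page_id]:
                if etype == entity_type:
                    votes[entity] += 1
        return votes.most_common()


# Made-up sample pages with made-up annotations.
index = AnnotatedIndex()
index.add_page(1, "Einstein developed the theory of relativity",
               [("Albert_Einstein", "physicist")])
index.add_page(2, "relativity was proposed by Einstein in 1905",
               [("Albert_Einstein", "physicist")])
index.add_page(3, "Bohr debated relativity and quantum theory",
               [("Niels_Bohr", "physicist")])
index.add_page(4, "Mumbai is a city in India",
               [("Mumbai", "city")])

# Query: which physicists appear on pages that mention "relativity"?
results = index.search(["relativity"], "physicist")
print(results)
```

The query mixes a plain word with a type constraint, and the answer is a ranked list of entities rather than a list of pages. Counting supporting pages is the simplest possible form of evidence aggregation; it is used here only to show the shape of the idea.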

For more information, see http://www.business-standard.com/india/news/search-market-to-get-another-engine/368255/
