IETE Technical Review
Vol 23, No 5, September-October 2006, pp 313-319

Natural-Language-based Generation of Search Results in a Meaning-based Multilingual Search Engine

YOGESH P AWATE, JAGGER BODAS, SACHIN DESHPANDE
Department of Computer Engineering, Vidyalankar Institute of Technology,
Wadala (E), Mumbai 400037, India.
email: yogeshpga@gmail.com,yogesh.awate@iitb.ac.in;
jagger_bodas@yahoo.com; sach_desh@rediffmail.com

AND

PUSHPAK BHATTACHARYYA
Department of Computer Science and Engineering,
Indian Institute of Technology, Bombay, Mumbai 400076, India.
email: pb@cse.iitb.ac.in

Most current search engines are keyword based. They do not consider the meaning of the query posed to them and can therefore be ineffective. In contrast, Agro Explorer [1], a multilingual, meaning-based search engine for the agricultural domain, first extracts the meaning of the query and then searches on the basis of that meaning. Search can therefore be carried out even if the language of the query differs from the language of the documents. The meaning is represented as Universal Networking Language (UNL) expressions, and the search is carried out by UNL expression matching. The relevant documents retrieved are in UNL form. The Deconverter converts these documents into the language of the user’s choice using a lexicon and a rule base. In this paper, we discuss the design of the Deconverter that we developed for Agro Explorer, with Marathi as the target language. The deconversion proceeds through the following four stages:

a) Syntax Planning; b) Lexical Replacement; c) Case Mark Insertion; d) Morph Generation.

1. INTRODUCTION

The Internet has revolutionized our lives. However, because most of the information on the Internet is in English, it is effectively unavailable to the rural masses who are not proficient in English; the language barrier is one of the primary reasons for this exclusion. To break the language barrier, Agro Explorer, a language-independent search engine with a multilingual information access facility, is being developed at the Indian Institute of Technology, Bombay. Its search can be carried out even when the language of the query differs from the language of the documents to be searched. Thus users can pose the search query in their native language, e.g. Marathi, and will be presented with the retrieved documents (which may originally be in English) in that same language.

 

___________________________________
Paper No 45-E; Copyright © 2006 by the IETE.

 

Simple keyword-based search is not effective for achieving this multilingual search capability. Hence the search is carried out on a representation of meaning, expressed in an interlingua called Universal Networking Language (UNL) expressions.
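To make the idea of searching on meaning concrete, the following is a minimal sketch and not the matching procedure of Agro Explorer itself. It assumes that a UNL expression can be simplified to a set of binary relations of the form label(head, dependent). The `Relation` type, the `match_score` function, the sample universal words and the document names are all hypothetical and serve only as illustration.

```python
from __future__ import annotations

from typing import NamedTuple


class Relation(NamedTuple):
    """One binary relation of a simplified UNL-style expression."""
    label: str      # relation label, e.g. "agt" (agent) or "obj" (object)
    head: str       # universal word the relation starts from
    dependent: str  # universal word the relation points to


def match_score(query: set[Relation], document: set[Relation]) -> float:
    """Return the fraction of query relations that also occur in the document.

    This overlap measure is an assumption made for illustration only.
    """
    if not query:
        return 0.0
    return len(query & document) / len(query)


# Hypothetical relations for a query meaning "farmers grow rice".
# Because the relations are language independent, the same set would be
# produced whether the query was posed in Marathi or in English.
query = {
    Relation("agt", "grow(icl>do)", "farmer(icl>person)"),
    Relation("obj", "grow(icl>do)", "rice(icl>crop)"),
}

# Hypothetical documents, already stored in the same relational form.
documents = {
    "doc_rice": {
        Relation("agt", "grow(icl>do)", "farmer(icl>person)"),
        Relation("obj", "grow(icl>do)", "rice(icl>crop)"),
        Relation("plc", "grow(icl>do)", "field(icl>place)"),
    },
    "doc_wheat": {
        Relation("agt", "grow(icl>do)", "farmer(icl>person)"),
        Relation("obj", "grow(icl>do)", "wheat(icl>crop)"),
    },
}

# Rank documents by how much of the query meaning they cover.
ranked = sorted(
    documents,
    key=lambda name: match_score(query, documents[name]),
    reverse=True,
)
```

In this sketch the comparison operates only on relations and universal words, never on the surface words of any particular language, which is why a query and a document written in different languages can still match.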

2. REVIEW OF LITERATURE

A. Existing Multilingual, Meaning Based Search


Only a few search engines, such as oingo.com, excite.com and simpli.com, provide meaning-based search. Clush [2] is a new engine that produces clustered search results from millions of web pages, giving the user dynamically categorized data that cannot be duplicated.
