Natural Language Search

With the increasing amount of information available on the Internet, one of the most challenging tasks is to provide search interfaces that are easy to use and do not require users to learn a specific syntax. Developing efficient and appropriate search functions remains a challenge in the field of database and information systems. Consider, for example, tourism information systems, where intuitive search functionality plays a crucial role in economic success. Querying an information system in natural language is especially appealing in the tourism domain because users differ widely in their computer literacy. Hardly any computer scientist or technically interested person has problems understanding the Boolean logic underlying conventional web search engines. Unfortunately, a growing majority of the people using search engines do.

Hence, in cooperation with Tiscover, the largest Austrian web-based tourism platform, we have developed a query interface that exploits the intuitiveness of natural language. The prototype system lets users search for accommodations throughout Austria via queries posed in natural language, such as “I am looking for a double room in a hotel in Innsbruck having a sauna and a swimming pool. It should furthermore provide a baby sitting service”. After the language of the query is determined automatically (currently English or German), the relevant terms are extracted from the query and semantically tagged. This is done by means of a domain ontology, which contains the concepts, the terms representing them, and the relations between them. Then the tagged query elements are analyzed and an SQL query reflecting the user's natural language query is created. Finally, the results are presented to the user. Furthermore, we have carried out a field trial in which the interface was promoted on and linked from the Tiscover homepage. The analysis of the real-world queries gathered during the field trial shows how users formulate queries when their imagination is not limited by conventional search interfaces with structured forms consisting of check boxes, radio buttons and special-purpose text fields. The results of this field trial have therefore been valuable indicators of the direction in which the web-based tourism information system should be extended to better serve its customers. In addition, a controlled user study was carried out, with very positive feedback from the participants.
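The processing steps described above (language identification, ontology-based term tagging, SQL generation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny ontology, the stop-word based language guess, and the table and column names (accommodations, facilities) are hypothetical and do not reflect the actual prototype or the Tiscover database schema.

```python
import re

# Hypothetical toy ontology: surface term -> (concept, canonical value).
ONTOLOGY = {
    "en": {
        "double room": ("room_type", "double"),
        "hotel": ("accommodation_type", "hotel"),
        "innsbruck": ("city", "Innsbruck"),
        "sauna": ("facility", "sauna"),
        "swimming pool": ("facility", "pool"),
        "baby sitting": ("facility", "babysitting"),
    },
    "de": {
        "doppelzimmer": ("room_type", "double"),
        "hotel": ("accommodation_type", "hotel"),
        "innsbruck": ("city", "Innsbruck"),
        "sauna": ("facility", "sauna"),
    },
}

# Hypothetical function-word lists used for a crude language guess.
LANG_HINTS = {
    "en": {"i", "am", "a", "the", "and", "for", "it", "with"},
    "de": {"ich", "ein", "und", "der", "die", "das", "mit", "suche"},
}


def detect_language(query):
    """Pick the language whose function words occur most often in the query."""
    words = set(re.findall(r"\w+", query.lower()))
    return max(LANG_HINTS, key=lambda lang: len(words & LANG_HINTS[lang]))


def tag_terms(query, lang):
    """Return the (concept, value) tags of all ontology terms found in the query."""
    text = query.lower()
    tags = []
    # Check longer terms first so multi-word terms are handled before single words.
    for term in sorted(ONTOLOGY[lang], key=len, reverse=True):
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            tags.append(ONTOLOGY[lang][term])
    return tags


def build_sql(tags):
    """Translate tags into a parameterized SQL query over a hypothetical schema."""
    clauses, params = [], []
    for concept, value in tags:
        if concept == "facility":
            # Facilities are assumed to live in a separate table.
            clauses.append(
                "id IN (SELECT accommodation_id FROM facilities WHERE name = ?)"
            )
        else:
            # Concept names come from the ontology, never from user input.
            clauses.append(f"{concept} = ?")
        params.append(value)
    where = " AND ".join(clauses) if clauses else "1 = 1"
    return f"SELECT name FROM accommodations WHERE {where}", params


query = "I am looking for a double room in a hotel in Innsbruck having a sauna"
language = detect_language(query)
sql, params = build_sql(tag_terms(query, language))
print(language, sql, params)
```

Passing the values as query parameters rather than pasting them into the SQL string keeps user-supplied text out of the statement itself, which is a sensible default for any interface of this kind.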

Regarding knowledge representation, the results of the user studies have motivated us to conduct research in the field of information retrieval using associative networks and spreading activation. We wanted to show that this alternative method of storing and processing domain knowledge, i.e. concepts and their associations, has the potential to, first, provide a more flexible means of concept modeling, second, reduce the linguistic difficulties encountered during the development of the first prototype, and third, incorporate geographic data better. This approach has driven our research towards recommendation systems, because the system no longer retrieves only the exactly matching information items (accommodations in our case) from the database, but ranks the best matching items and presents them to the user. Furthermore, using associative networks and spreading activation has opened a convenient way of using personal preferences as well as seasonal factors to bias search results.
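To make the idea of ranking by spreading activation concrete, here is a minimal sketch. It assumes a simple variant in which activation is passed along weighted edges and damped by a constant decay factor at every step; the network, the edge weights, the decay value and the accommodation names are all hypothetical and are not taken from the actual system.

```python
def spread_activation(graph, seeds, decay=0.5, threshold=0.01, max_steps=3):
    """Propagate activation from seed concepts through a weighted network.

    graph maps each node to a dict {neighbor: association weight}.
    seeds maps query concepts to their initial activation.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = {}
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                passed = energy * weight * decay
                if passed >= threshold:  # ignore negligible contributions
                    next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + passed
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + energy
        frontier = next_frontier
    return activation


def rank_items(activation, items):
    """Order the given items by the activation they accumulated, highest first."""
    return sorted(items, key=lambda item: activation.get(item, 0.0), reverse=True)


# Hypothetical associative network: concepts point to related concepts and items.
NETWORK = {
    "sauna": {"wellness": 0.8, "Hotel A": 1.0},
    "wellness": {"Hotel B": 0.9},
    "Innsbruck": {"Hotel A": 1.0, "Hotel B": 1.0},
}

# A stronger seed value could encode a personal preference or a seasonal factor.
scores = spread_activation(NETWORK, {"sauna": 1.0, "Innsbruck": 1.0})
print(rank_items(scores, ["Hotel A", "Hotel B"]))
```

In this toy network Hotel B has no sauna, yet it still receives activation through the related concept "wellness" and through its location, so it appears in the ranking behind Hotel A instead of being excluded as it would be by an exact-match query.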

The time-consuming task of ontology creation and enhancement is another issue we address. We use text mining techniques and neural networks for semantic term clustering to provide a map of the vocabulary of a specific domain. The map is intended as a tool that supports ontology engineers in their task, as opposed to automatic ontology learning, e.g. in the context of the Semantic Web. It presents spatially organized clusters of words that have been automatically mined from free-form texts of the domain. The higher the semantic similarity of two words, the closer they are located to each other on the map. Thus, we use the map metaphor to go beyond the commonly used concordance lists.
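A word map of this kind can be sketched as follows. The text does not name the type of neural network, so this example assumes a small self-organizing map trained on co-occurrence vectors; the training schedule, the grid size and the example sentences are hypothetical choices made only for illustration.

```python
import math
import random


def context_vectors(sentences, window=2):
    """Describe each word by the counts of the words occurring near it."""
    vocab = sorted({w for s in sentences for w in s.split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = {w: [0.0] * len(vocab) for w in vocab}
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            for j in range(max(0, i - window), min(len(words), i + window + 1)):
                if j != i:
                    vectors[word][index[words[j]]] += 1.0
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_som(vectors, rows=3, cols=3, epochs=200, seed=0):
    """Train a rows x cols self-organizing map; returns {(row, col): weights}."""
    rng = random.Random(seed)
    data = list(vectors.values())
    dim = len(data[0])
    grid = {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs            # goes from 1 towards 0
        rate = 0.5 * frac                      # shrinking learning rate
        radius = 1.0 + max(rows, cols) / 2 * frac  # shrinking neighborhood
        for vec in rng.sample(data, len(data)):
            # Best matching unit: the map unit whose weights are closest.
            bmu = min(grid, key=lambda unit: sq_dist(grid[unit], vec))
            for unit, weights in grid.items():
                d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                if d2 <= radius ** 2:
                    influence = math.exp(-d2 / (2 * radius ** 2))
                    for k in range(dim):
                        weights[k] += rate * influence * (vec[k] - weights[k])
    return grid


def place_words(vectors, grid):
    """Assign every word to the map unit that matches its vector best."""
    return {word: min(grid, key=lambda unit: sq_dist(grid[unit], vec))
            for word, vec in vectors.items()}


SENTENCES = ["hotel with sauna", "hotel with pool", "farm near lake"]
word_vectors = context_vectors(SENTENCES)
word_map = place_words(word_vectors, train_som(word_vectors))
print(word_map)
```

In the toy corpus "sauna" and "pool" occur in identical contexts, so their vectors are equal and both words are placed on the same map unit. With real domain texts the vectors differ gradually, and the grid positions give the ontology engineer a spatial overview of which terms are used in similar ways.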

Contact

If you are interested in commercial search and text analytics solutions and consulting, send me an e-mail (m.dittenbach@max-recall.com) or call max-recall (http://www.max-recall.com) at +43 720 978603. I am also happy to answer e-mails regarding my research topics.

Location: Vienna, Austria