Authors: Viktoria Dorfer, Stephan M. Winkler, Thomas Kern, Gerald Petz, Patrizia Faschang
The amount of data available in the life sciences is growing exponentially; intelligent information search strategies are therefore required to find relevant information as quickly and accurately as possible. In this paper we propose a document keyword clustering approach: on the basis of a given set of documents, we identify groups of keywords found in those documents. Once these clusters have been identified, the complexity of the database can be handled much more easily: future user queries can be extended with terms found in the same clusters as the terms originally specified by the user. We present a framework for representing and evaluating keyword clusters on a given data set, as well as a simple evolutionary algorithm (based on an evolution strategy) that is intended to find optimal keyword clusters. In the empirical section of this paper we document first results obtained using a data set published at the TREC-9 conference.
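The query extension step can be illustrated with a minimal sketch. It assumes that keyword clusters are already available as plain sets of strings; the function name `expand_query` and the example clusters are hypothetical and chosen for illustration only, not taken from the framework or the TREC-9 data.

```python
def expand_query(query_terms, clusters):
    """Extend a query with all keywords that share a cluster with a query term.

    query_terms: iterable of keywords entered by the user.
    clusters: iterable of sets of keywords (assumed to be precomputed).
    """
    # Remove duplicate query terms while keeping the user's order.
    original = list(dict.fromkeys(query_terms))
    original_set = set(original)

    expanded = list(original)
    seen = set(original)
    for cluster in clusters:
        # Skip clusters that contain none of the original query terms.
        if original_set.isdisjoint(cluster):
            continue
        # Sort so that the result is deterministic.
        for keyword in sorted(cluster):
            if keyword not in seen:
                seen.add(keyword)
                expanded.append(keyword)
    return expanded


# Hypothetical, hand-written clusters for illustration only.
clusters = [
    {"insulin", "diabetes", "glucose"},
    {"asthma", "bronchitis"},
]

print(expand_query(["diabetes"], clusters))
# prints ['diabetes', 'glucose', 'insulin']
```

Only the terms originally given by the user trigger an extension here, so a keyword added from one cluster does not pull in further clusters. This is a design choice of the sketch, not a statement about the proposed framework.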