Authors: Stephan Winkler, Michael Affenzeller, Gabriel Kronberger, Michael Kommenda, Stefan Wagner, Witold Jacak, Herbert Stekel
In this paper we describe the use of evolutionary algorithms for selecting relevant features in the context of tumor marker modeling. Our aim is to identify mathematical models that classify values of the tumor markers AFP and CA 15-3 from available patient parameters, using data provided by the General Hospital Linz. We discuss how evolutionary algorithms can be used to find optimal sets of variables, and we define fitness functions for evaluating feature sets that take into account both the number of selected features and the resulting classification accuracy. In the empirical section we document results achieved with an evolution strategy combined with several machine learning algorithms (linear regression, k-nearest-neighbor modeling, and artificial neural networks), each applied with cross-validation to evaluate the sets of selected features. Finally, we compare the identified sets of relevant variables and the achieved classification rates.
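As an illustration only, the following Python sketch shows one way a fitness function that balances feature-set size against classification accuracy could be combined with a simple (1+λ) evolution strategy over binary feature masks. The weighting parameter `alpha`, the linear weighted sum, the bit-flip mutation, and the `toy_accuracy` evaluator (a hypothetical stand-in for the cross-validated accuracy of a real classifier) are assumptions made for this example; they are not taken from the paper and do not reproduce its exact fitness functions or settings.

```python
import random
from typing import Callable, List, Tuple

Mask = List[bool]  # mask[i] is True when feature i is selected


def feature_set_fitness(mask: Mask, accuracy: float, alpha: float = 0.2) -> float:
    """Fitness to be minimized (assumed form): weighted sum of the
    fraction of selected features and the classification error."""
    if not any(mask):
        return 1.0  # an empty feature set gets the worst possible score
    size_term = sum(mask) / len(mask)
    error_term = 1.0 - accuracy
    return alpha * size_term + (1.0 - alpha) * error_term


def evolve(
    n_features: int,
    evaluate_accuracy: Callable[[Mask], float],
    generations: int = 50,
    offspring: int = 10,
    rate: float = 0.1,
    seed: int = 0,
) -> Tuple[Mask, float]:
    """Simple (1+lambda) evolution strategy with bit-flip mutation."""
    rng = random.Random(seed)

    def score(mask: Mask) -> float:
        # Skip the (possibly expensive) evaluation for empty feature sets.
        accuracy = evaluate_accuracy(mask) if any(mask) else 0.0
        return feature_set_fitness(mask, accuracy)

    parent = [rng.random() < 0.5 for _ in range(n_features)]
    parent_fit = score(parent)
    for _ in range(generations):
        # Each child flips every bit of the parent independently with probability `rate`.
        children = [
            [(not bit) if rng.random() < rate else bit for bit in parent]
            for _ in range(offspring)
        ]
        best_child = min(children, key=score)
        child_fit = score(best_child)
        if child_fit <= parent_fit:  # keep the parent unless a child is at least as good
            parent, parent_fit = best_child, child_fit
    return parent, parent_fit


def toy_accuracy(mask: Mask) -> float:
    """Hypothetical evaluator: only features 0 and 2 carry information.
    In practice this would be the cross-validated accuracy of a classifier
    trained on the selected features."""
    relevant = (0, 2)
    return 0.5 + 0.2 * sum(mask[i] for i in relevant)


if __name__ == "__main__":
    best_mask, best_fitness = evolve(6, toy_accuracy)
    print(best_mask, best_fitness)
```

In a realistic setting, `toy_accuracy` would be replaced by a function that trains one of the modeling methods on the selected columns and returns its cross-validated classification rate; `alpha` then controls how strongly smaller feature sets are preferred over small gains in accuracy.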