
referred to as shape compactness and is defined by Equation (4).

where c is the shape compactness, A is the area and p is the perimeter of the nucleus. Objects with a P2A value (c) greater than 0.97 or less than 0.15 were assumed to be debris.

Texture analysis: Texture is an important feature for differentiating between nuclei and debris. Image texture is a set of metrics designed to quantify the perceived texture of an image [63]. Within a Pap smear, the distribution of average nuclear stain intensity is much narrower than the stain intensity variation among debris objects [16]. This fact was used as the basis for removing debris based on image intensities and colour information using Zernike moments (ZM) [64]. Zernike moments are useful in a variety of pattern recognition applications, are known to be robust to noise, and have good reconstruction power. In this work, the ZM as presented by Malm et al. [16] was used: the moment of order $n$ with repetition $l$ of a function $f(r, \theta)$, in polar coordinates inside a disk centered in a square image $I(x, y)$ of size $m \times m$, is given by Equation (5).

$$A_{nl} = \frac{n+1}{\pi} \sum_{x} \sum_{y} v_{nl}^{*}(r, \theta)\, I(x, y), \qquad (5)$$

where $v_{nl}^{*}(r, \theta)$ denotes the complex conjugate of the Zernike polynomial $v_{nl}(r, \theta)$. To produce a texture measure, the magnitudes of $A_{nl}$ centered at each pixel in the texture image are averaged [16].
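As an illustration of Equation (5), the following Python sketch computes a single Zernike moment of a square patch. It is not the implementation used in this work: the standard radial polynomial, the mapping of the patch onto the unit disk, the pixel-area scaling of the discrete sum, and the names `radial_poly` and `zernike_moment` are all assumptions made for the example.

```python
import math
import numpy as np

def radial_poly(n, l, r):
    """Standard Zernike radial polynomial R_nl(r) (assumed definition)."""
    l = abs(l)
    out = np.zeros_like(r, dtype=float)
    for s in range((n - l) // 2 + 1):
        coef = ((-1) ** s * math.factorial(n - s)
                / (math.factorial(s)
                   * math.factorial((n + l) // 2 - s)
                   * math.factorial((n - l) // 2 - s)))
        out += coef * r ** (n - 2 * s)
    return out

def zernike_moment(patch, n, l):
    """Zernike moment A_nl of a square m x m patch mapped onto the unit disk."""
    if abs(l) > n or (n - abs(l)) % 2 != 0:
        raise ValueError("need |l| <= n and n - |l| even")
    m = patch.shape[0]
    coords = np.linspace(-1.0, 1.0, m)
    x, y = np.meshgrid(coords, coords)
    r = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = r <= 1.0                      # only pixels inside the disk contribute
    # Complex conjugate of v_nl(r, theta) = R_nl(r) * exp(i * l * theta)
    v_conj = radial_poly(n, l, r) * np.exp(-1j * l * theta)
    pixel_area = (2.0 / (m - 1)) ** 2      # assumed scaling of the discrete sum
    total = np.sum(v_conj[inside] * patch[inside])
    return (n + 1) / math.pi * total * pixel_area

# Usage: the magnitude |A_nl| is what would be averaged into a texture measure.
rng = np.random.default_rng(0)
patch = rng.random((33, 33))
magnitude = abs(zernike_moment(patch, n=4, l=2))
```

The magnitude is unchanged when the patch is rotated by 90 degrees, which is the property that makes it usable as a texture descriptor.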

2.5. Feature extraction

Feature extraction converts the image into a format that classification algorithms can process. The success of the classification algorithm depends greatly on the correctness of the features extracted from the image. The cells in the Pap smears in the dataset used are split into seven classes based on characteristics such as size, area, shape and brightness of the nucleus and cytoplasm. The features extracted from the images included morphology features previously used by others [46,65]. These features include: nucleus area, cytoplasm area, nucleus to cytoplasm ratio, nucleus gray level, cytoplasm gray level, nucleus shortest diameter, nucleus longest diameter, nucleus elongation, nucleus roundness, cytoplasm shortest diameter, cytoplasm longest diameter, cytoplasm elongation, cytoplasm roundness, nucleus perimeter, cytoplasm perimeter, nucleus relative position, maxima in nucleus, minima in nucleus, maxima in cytoplasm, and minima in cytoplasm. Due to the biological significance of the nucleus in cancer classification, three geometric features (solidity, compactness and eccentricity) and six textural features (mean, standard deviation, variance, smoothness, energy and entropy) were also extracted from the nucleus, resulting in 29 features in total. A method based on prior knowledge was implemented in MATLAB to extract these features from the segmented images using pixel-level information and mathematical functions.
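The six nucleus texture features named above can be sketched in Python as follows. This is only an illustration, not the MATLAB implementation used in this work: the histogram-based definitions of smoothness, energy and entropy, the 8-bit gray-level assumption, and the function name `texture_features` are assumptions made for the example.

```python
import math
import numpy as np

def texture_features(gray, mask, levels=256):
    """First-order texture statistics of the pixels selected by a boolean mask.

    Assumes `gray` holds integer gray levels in [0, levels - 1].
    """
    pixels = gray[mask].astype(np.int64)
    hist = np.bincount(pixels, minlength=levels).astype(float)
    p = hist / hist.sum()                  # normalised gray-level histogram
    z = np.arange(len(p))
    mean = float(np.sum(z * p))
    variance = float(np.sum((z - mean) ** 2 * p))
    # Smoothness uses the variance normalised to [0, 1] (assumed definition).
    norm_var = variance / (levels - 1) ** 2
    nonzero = p[p > 0]
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "variance": variance,
        "smoothness": 1.0 - 1.0 / (1.0 + norm_var),
        "energy": float(np.sum(p ** 2)),
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),
    }

# Usage on a toy "nucleus": half the pixels are black, half are white.
gray = np.zeros((4, 4), dtype=np.uint8)
gray[:, 2:] = 255
mask = np.ones((4, 4), dtype=bool)
features = texture_features(gray, mask)
```

In practice `mask` would be the segmented nucleus region, so that only nucleus pixels contribute to the statistics.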

2.6. Feature selection

Feature selection (also called variable or attribute selection) is the process of selecting the subset of the extracted features that provides the best classification results. Some of the extracted features might contain noise, while others may not be utilized by the chosen classifier. Hence, an optimum set of features has to be determined, in principle by trying all combinations. However, the number of possible combinations grows exponentially with the number of features (the 29 features extracted here give 2^29 - 1, or roughly 5.4 x 10^8, non-empty subsets), which makes an exhaustive search computationally expensive. Feature selection algorithms are broadly classified into filter, wrapper and embedded methods [66].

The method presented in this paper combines simulated annealing with a wrapper approach. This approach has been proposed elsewhere [65], but in this paper, the performance of the feature selection is evaluated using a double-strategy random forest algorithm [67]. Simulated annealing is a probabilistic technique for approximating the global optimum of a given function, which makes it well suited to searching for the optimum set of features. The search for the optimum set is guided by a fitness value [68]. When simulated annealing has completed, all of the different subsets of features are compared, and the fittest (that is, the one that performs the best) is selected. The fitness value was obtained with a wrapper in which k-fold cross-validation was used to estimate the error of the classification algorithm. The implementation of simulated annealing is shown in Fig. 8.

Fig. 8. The simulated annealing algorithm.
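A minimal Python sketch of simulated annealing over feature subsets is given below. It does not reproduce the algorithm in Fig. 8: the single-bit-flip neighbourhood, the geometric cooling schedule, the parameter values and the names `simulated_annealing` and `error_fn` are assumptions chosen for illustration. The error function is passed in as a black box, as a wrapper would provide it.

```python
import math
import random

def simulated_annealing(n_features, error_fn, t0=1.0, cooling=0.95,
                        steps=500, seed=0):
    """Search boolean feature masks for a low value of error_fn(mask)."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    if not any(current):                       # never evaluate an empty subset
        current[rng.randrange(n_features)] = True
    current_err = error_fn(current)
    best, best_err = current[:], current_err
    t = t0
    for _ in range(steps):
        candidate = current[:]
        i = rng.randrange(n_features)
        candidate[i] = not candidate[i]        # neighbour: flip one feature
        if any(candidate):
            delta = error_fn(candidate) - current_err
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / t), which shrinks as t cools.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_err = candidate, current_err + delta
                if current_err < best_err:
                    best, best_err = current[:], current_err
        t *= cooling                           # geometric cooling (assumed)
    return best, best_err

# Usage with a toy error: fraction of features that differ from a known target.
target = [True, False, True, True, False, False]
def toy_error(mask):
    return sum(a != b for a, b in zip(mask, target)) / len(target)

best_mask, best_error = simulated_annealing(len(target), toy_error)
```

Early on, the high temperature lets the search accept worse subsets and escape local minima; as the temperature falls, the search becomes increasingly greedy.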

The wrapper method treats the selection of a set of features as a search problem [66]. Different combinations of the extracted features are prepared, evaluated and compared to other combinations. A predictive model is used to evaluate each combination of features and to assign it a score based on model accuracy. The fitness error given by the wrapper is used as the error (F) by the simulated annealing algorithm shown in Fig. 8. A fuzzy c-means algorithm was wrapped into a black box, from which an estimated error was obtained for the various feature combinations. Fuzzy c-means is a clustering algorithm that uses membership coefficients to describe the degree to which each data point belongs to each cluster. The error estimate was obtained by k-fold cross-validation, as shown in Fig. 9.
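The wrapper idea can be sketched in Python as a function that maps a feature mask to a k-fold cross-validated error. Note that this sketch swaps in a simple nearest-centroid classifier as a stand-in for the fuzzy c-means black box; the stand-in, the shuffling scheme and the names `nearest_centroid_predict` and `kfold_error` are assumptions made for the example, not the method used in this work.

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, X_test):
    """Stand-in classifier: assign each test sample to the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

def kfold_error(X, y, mask, k=5, seed=0):
    """Mean k-fold misclassification rate using only the features in `mask`."""
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return 1.0                              # empty subset: worst error
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        pred = nearest_centroid_predict(X[train][:, cols], y[train],
                                        X[test][:, cols])
        errors.append(np.mean(pred != y[test]))
    return float(np.mean(errors))

# Usage: feature 0 separates the two classes, feature 1 is pure noise.
y = np.array([0, 1] * 20)
noise = np.random.default_rng(1).normal(size=y.size)
X = np.column_stack([y * 10.0, noise])
err_informative = kfold_error(X, y, [True, False])
err_noise = kfold_error(X, y, [False, True])
```

A search procedure such as simulated annealing would call `kfold_error` with each candidate mask and use the returned value as its error F.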