Metodoloski zvezki, Vol. 15, No. 2, 2018, 1-20

Internal Evaluation Criteria for Categorical Data in Hierarchical Clustering: Optimal Number of Clusters Determination

Zdenek Sulc1, Jana Cibulkova2, Jiri Prochazka3, Hana Rezankova4

Abstract

The paper compares 11 internal evaluation criteria for hierarchical clustering of categorical data with respect to the determination of the correct number of clusters. The criteria are divided into three groups based on the way they treat cluster quality. The variability-based criteria use the within-cluster variability, the likelihood-based criteria maximize the likelihood function, and the distance-based criteria use the distances within and between clusters. The aim is to determine which evaluation criteria perform well and under what conditions. Different analysis settings, such as the used method of hierarchical clustering, and various dataset properties, such as the number of variables or the minimal between-cluster distances, are examined. The experiment is conducted on 810 generated datasets, where the evaluation criteria are assessed regarding the determination of the optimal number of clusters and the mean absolute errors. The results indicate that the likelihood-based BIC1 and the variability-based BK criteria perform relatively well in determining the optimal number of clusters and that some criteria, usually the distance-based ones, should be avoided.

1 Department of Statistics and Probability, University of Economics, Prague, Czech Republic; zdenek.sulc@vse.cz
2 Department of Statistics and Probability, University of Economics, Prague, Czech Republic; jana.cibulkova@vse.cz
3 Department of Statistics and Probability, University of Economics, Prague, Czech Republic; jiri.prochazka@vse.cz
4 Department of Statistics and Probability, University of Economics, Prague, Czech Republic; hana.rezankova@vse.cz

1 Introduction

Cluster analysis is a multivariate statistical method that reveals an underlying structure of data by identifying homogeneous groups (clusters) of objects. Homogeneity is understood as the possession of a certain relevant property by the majority of objects in the group (de Souto et al., 2012). The term cluster analysis covers many clustering algorithms, each of which can be combined with several similarity measures. Different algorithms can lead to different object assignments, and thus, a comparison of several object assignments using one or more evaluation criteria is often welcome. For cluster partition evaluation, external or internal evaluation criteria are commonly used. The external criteria, see, e.g., de Souto et al. (2012), are based on comparing a cluster assignment to an a priori known class variable. Outside simulation studies, where the values of a class variable are known, they are therefore not suitable for clustering evaluation. The internal criteria, see, e.g., Liu et al. (2010), Milligan and Cooper (1985), and Vendramin et al. (2010), use the intrinsic properties of a dataset. Hence, they are more suitable for unsupervised methods. Moreover, the internal criteria can be further divided into those that try to determine the optimal number of clusters and those that judge the quality of a particular cluster solution, see Arbelaitz et al. (2013). Some of the criteria were developed for both these tasks. The studies on evaluation criteria, some of which have been mentioned in the previous paragraph, deal with quantitative data, and the majority of these criteria cannot be used for categorical data. There are two main reasons for that.
First, many evaluation criteria for quantitative data use mathematical operations on the values of the raw data matrix, which is not possible with categorical values. Thus, the evaluation criteria for categorical data can only be based on calculations within a dissimilarity matrix, which is numeric also for categorical data. Second, the concepts commonly used for cluster evaluation of quantitative data, such as the variance as an expression of variability, cannot be used directly; they have to be adjusted for categorical data or substituted by an appropriate categorical alternative.

Clustering of categorical data is not as reliable as clustering of quantitative data. The reason is that categorical variables provide lower variability than quantitative ones, which does not allow groups in the data to be distinguished as precisely. However, there are many situations when clustering of purely categorical data is necessary (medicine, psychology, marketing), and for such cases, one should have a few reliable criteria to evaluate the obtained clusters. There is a lack of papers comparing and assessing the evaluation criteria intended for categorical data. Although there are many papers that use internal evaluation criteria for categorical data, see Bontemps and Toussile (2013), where the BIC and AIC criteria for categorical data are used, or Rezankova et al. (2011), where the variability-based criteria are applied, we found none which compares internal evaluation criteria for categorical data. Thus, this paper tries to fill this gap by presenting and comparing different approaches which researchers can use to evaluate their categorical clustering outputs.

This paper compares selected internal evaluation criteria for categorical data which are intended for determining the optimal number of clusters in hierarchical cluster analysis (HCA). The criteria are evaluated under different cluster analysis settings (similarity measures, linkage methods) and for different generated dataset properties (numbers of variables, categories, and clusters) regarding their ability to identify the optimal number of clusters. This can be formulated as two main aims. The first one is to evaluate the performance of the examined evaluation criteria regarding the determination of the correct number of clusters. The second one is to determine which properties of the clustered datasets are associated with the outcomes of the examined evaluation criteria. To achieve this aim, logistic regression is used.

In the experiment, 11 internal evaluation criteria are compared and assessed. The criteria are grouped according to the principles they use: variability, likelihood, and distance. The variability-based criteria express the cluster quality by the low within-cluster variability, the likelihood-based criteria assess it by the low values of a likelihood function approximation for categorical data, and the distance-based criteria express it by the low within-cluster distances. The experiment is performed on 810 generated datasets with known cluster assignments and with certain properties under control, such as the numbers of clusters or variables.

The paper is organized as follows. Section 2 presents the examined internal evaluation criteria for categorical data. Section 3 focuses on the selected similarity measures and methods of HCA. Section 4 describes the dataset generation process and the experiment settings.
The results are presented in Section 5, and the outcomes of the research are summarized in the Conclusion.

2 Internal Evaluation Criteria

Since the clusters in a dataset should ideally be distinct and their objects similar, internal evaluation criteria are usually constructed with the aim to satisfy the assumptions of compactness and separation of the created clusters (Liu et al., 2010; Zhao et al., 2002). Whereas compactness measures the similarity of the objects within clusters, separation measures the distinctness between the clusters. The evaluation criteria presented in this section measure the compactness and separation based on the principles of either a variability, a likelihood or a distance.

2.1 Evaluation Criteria Based on the Variability

The variability-based evaluation criteria are usually based on the compactness principle, which is expressed by the low within-cluster variability of the created clusters. In this subsection, three internal evaluation criteria based on this principle are presented, because they performed well in Sulc (2016) and Yang (2012), namely the pseudo F index based on the mutability (PSFM), the pseudo F index based on the entropy (PSFE) and the BK index.

The PSFM index (Rezankova et al., 2011) is based on the within-cluster variability expressed by the mutability (the Gini coefficient), see Gini (1912), which appears in Light and Margolin (1971). It can be written as

$$\mathrm{PSFM}(k) = \frac{(n-k)\left(\mathrm{WCM}(1) - \mathrm{WCM}(k)\right)}{(k-1)\,\mathrm{WCM}(k)}, \tag{2.1}$$

where $k$ represents the total number of clusters and $n$ is the number of objects in a dataset. $\mathrm{WCM}(1)$ and $\mathrm{WCM}(k)$ represent the within-cluster variability in the whole dataset and in the $k$-cluster solution, moving in a range from zero (no variability) to one (maximal variability). $\mathrm{WCM}(k)$ is computed as

$$\mathrm{WCM}(k) = \sum_{g=1}^{k} \sum_{c=1}^{m} \frac{n_g}{n \cdot m}\, G_{gc},$$

where $n_g$ is the number of objects in the $g$-th cluster ($g = 1, 2, \ldots, k$), $m$ is the total number of variables, and $G_{gc}$ is the mutability of the $c$-th variable ($c = 1, 2, \ldots, m$) in the $g$-th cluster expressed as

$$G_{gc} = 1 - \sum_{u=1}^{K_c} \left(\frac{n_{gcu}}{n_g}\right)^2,$$

where $n_{gcu}$ is the number of objects in the $g$-th cluster with the $u$-th category ($u = 1, \ldots, K_c$) of the $c$-th variable and $K_c$ is the number of categories of the $c$-th variable.

The PSFE index (Rezankova et al., 2011) is constructed analogously to (2.1), with the difference that instead of $\mathrm{WCM}(k)$, the variability $\mathrm{WCE}(k)$ based on the entropy is used. $\mathrm{WCE}(k)$ can be expressed as

$$\mathrm{WCE}(k) = \sum_{g=1}^{k} \sum_{c=1}^{m} \frac{n_g}{n \cdot m}\, H_{gc},$$

where $H_{gc}$ is the entropy of the $c$-th variable in the $g$-th cluster according to the formula

$$H_{gc} = -\sum_{u=1}^{K_c} \frac{n_{gcu}}{n_g} \ln \frac{n_{gcu}}{n_g}.$$

Both the PSFM and PSFE indices indicate the optimal number of clusters by their maximal value over several examined cluster solutions. In such a cluster solution, the highest decrease in the within-cluster variability occurs.

The BK index (Chen and Liu, 2009) is defined as the second-order difference of the incremental entropy of the dataset with $k$ clusters

$$BK(k) = \Delta^2 I(k) = \left(I(k-1) - I(k)\right) - \left(I(k) - I(k+1)\right),$$

where $I(k)$ is the incremental expected entropy in the $k$-cluster solution with the formula $I(k) = H_E(k) - H_E(k+1)$, where $H_E(k)$ is the expected entropy of the $k$-cluster solution expressed as

$$H_E(k) = \sum_{g=1}^{k} \frac{n_g}{n} \sum_{c=1}^{m} H_{gc}.$$

The optimal number of clusters is indicated by the maximal value of $BK(k)$ over the examined cluster solutions with $k \geq 2$.
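To make the computation of the variability-based criteria concrete, the following R code is a minimal sketch of WCM(k) and the PSFM index (2.1); it is an illustration written for this exposition, not the authors' implementation, and it assumes a data frame X of factors and a cluster membership vector cl as inputs.

```r
# Minimal sketch of WCM(k) and PSFM(k); X is a data frame of factors,
# cl is a vector of cluster memberships (assumed inputs).
wcm <- function(X, cl) {
  X <- as.data.frame(X)
  n <- nrow(X); m <- ncol(X)
  total <- 0
  for (g in unique(cl)) {
    Xg <- X[cl == g, , drop = FALSE]
    ng <- nrow(Xg)
    # mutability (Gini coefficient) G_gc of each variable in cluster g
    G <- sapply(Xg, function(x) 1 - sum((table(x) / ng)^2))
    total <- total + ng / (n * m) * sum(G)
  }
  total
}

psfm <- function(X, cl) {
  n <- nrow(X); k <- length(unique(cl))
  wcm1 <- wcm(X, rep(1, n))   # WCM(1): variability of the whole dataset
  wcmk <- wcm(X, cl)          # WCM(k): within-cluster variability
  (n - k) * (wcm1 - wcmk) / ((k - 1) * wcmk)
}
```

The PSFE index is obtained analogously by replacing the Gini term with the entropy term H_gc defined above.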
2.2 Evaluation Criteria Based on the Likelihood

The likelihood-based evaluation criteria assess the cluster quality by the values of a likelihood function approximation for categorical data. Four such criteria are examined in this paper: the BIC and AIC criteria, each in two variants, where BIC1 and AIC1 are based on the entropy and BIC2 and AIC2 on the mutability.

2.3 Evaluation Criteria Based on the Distances

The distance-based criteria usually utilize both principles of compactness and separation. To satisfy the compactness principle, clusters should be of a small size, and to satisfy the separation principle, their distance to the other clusters should be sufficiently high. The internal distance-based criteria were selected from the NbClust R package (Charrad et al., 2014). In this package, there are 30 criteria of this type, but 25 of them require the raw data matrix for their computation, which makes them unsuitable for categorical data. Thus, the five internal evaluation criteria which need only a dissimilarity matrix for their computation were considered, namely the Dunn index, the silhouette index, the McClain index, the c-index and the Frey index. Since the Frey index cannot be calculated for every dataset (depending on dataset properties), the remaining four criteria are examined in this paper.

The Dunn index (DU) (Dunn, 1974) assumes that clusters in a dataset are compact and well separated; it maximizes the inter-cluster distance while minimizing the intra-cluster distance, see Yang (2012). For the cluster solution with $k$ clusters, it can be expressed by the formula

$$DU(k) = \frac{\min_{1 \le g < h \le k} D(C_g, C_h)}{\max_{1 \le v \le k} \mathrm{diam}(C_v)},$$

where $D(C_g, C_h)$ is the distance between the $g$-th and $h$-th clusters (expressed by a given linkage method), and $\mathrm{diam}(C_v)$ is the maximal distance (expressed by a given similarity measure) between two objects in the $v$-th cluster. The Dunn index takes values from zero to infinity. The highest value indicates the optimal cluster solution.

The silhouette index (SI) (Rousseeuw, 1987), also known as the average silhouette width, can be written as

$$SI(k) = \frac{1}{n} \sum_{i=1}^{n} \frac{b(i) - a(i)}{\max\left(a(i), b(i)\right)},$$

where $a(i)$ is the average dissimilarity of the $i$-th object to the other objects in the same cluster, and $b(i)$ is the minimal average dissimilarity of the $i$-th object to the objects in any cluster not containing the $i$-th object. The silhouette index takes values from -1 to 1. Values close to one indicate well-separated clusters, values close to minus one suggest badly separated clusters, and values close to zero indicate that the objects in the dataset are often located on the border of two natural clusters. The value zero also indicates single-object clusters.

The McClain index (MC) (McClain and Rao, 1975) is defined as a ratio of the within-cluster and the between-cluster distances

$$MC(k) = \frac{S_w / n_w}{S_b / n_b} = \frac{S_w\, n_b}{S_b\, n_w},$$

where $n_w$ is the number of pairs of objects in the same cluster, and $n_b$ is the number of pairs of objects not belonging to the same cluster. $S_w$ is the sum of the within-cluster distances for the $n_w$ pairs of objects, and $S_b$ is the sum of the between-cluster distances for the $n_b$ pairs of objects. The lowest value of the index indicates the optimal number of clusters.

The c-index (CI) (Hubert and Levin, 1976) is defined as

$$CI(k) = \frac{S_w - S_{\min}}{S_{\max} - S_{\min}}.$$

The statistics $n_w$ and $S_w$ are defined in the same way as for the McClain index. $S_{\min}$ and $S_{\max}$ are the sums of the $n_w$ lowest and highest distances, respectively, across all the pairs of objects. The CI criterion takes values from zero to one, and the optimal number of clusters is attained at its minimum.
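Because these four criteria need only a dissimilarity matrix, NbClust can compute them directly from a "dist" object. The following call is a hedged sketch assuming a precomputed dissimilarity matrix d; it mirrors the setup described above rather than reproducing the authors' exact invocation.

```r
library(NbClust)

# d is assumed to be a precomputed dissimilarity matrix of class "dist".
# With diss supplied and distance = NULL, NbClust evaluates the criterion
# on the dissimilarities alone, as required for categorical data.
res <- NbClust(data = NULL, diss = d, distance = NULL,
               min.nc = 2, max.nc = 10, method = "average",
               index = "silhouette")   # also: "dunn", "mcclain", "cindex"
res$Best.nc   # number of clusters indicated by the criterion
```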
3 Experimental Background

This section describes the steps that need to be set before performing a comparison of the evaluation criteria, namely the data generation process, the choice of similarity measures, the selection of HCA methods, and the used assessment criteria.

3.1 Data Generation Process

The datasets for the experiment were generated with the aim to cover a wide range of possible situations. Thus, 81 different dataset settings were used, see Figure 1, which describes the data generation process. The datasets were generated with two to four natural clusters. Three minimal between-cluster distances2 (0.1, 0.3, 0.5) were used, representing intersecting, partly intersecting and almost non-intersecting clusters. Next, the datasets were generated with three different numbers of variables (4, 7, 10) covering the typical range in clustering of categorical datasets. Based on empirical experience, three ranges of categories (2-4, 5-7, 8-10) were chosen, representing small, medium and large numbers of categories. The numbers of objects in the generated datasets were not firmly set; they varied from 300 to 700 cases. To ensure the robustness of the obtained results, each dataset setting combination was replicated ten times. In total, this makes 810 generated datasets that were used for the analysis.

2 Based on the sepVal parameter in the clusterGeneration R package. For instance, sepVal = 0.1 represents the low between-cluster distance, where most of the clusters intersect, whereas sepVal = 0.5 depicts the high between-cluster distance, where the clusters do not intersect in most of the datasets.

Figure 1: Data generation scheme (number of replications: 10; number of clusters: 2, 3, 4; distance between clusters: 0.1, 0.3, 0.5; number of variables: 4, 7, 10; number of categories: 2-4, 5-7, 8-10)

To perform the generation process, an R function named gen_object, which was developed and described in Sulc (2016), is used. The function depends on the clusterGeneration (Qiu and Joe, 2015) and arules (Hahsler et al., 2017) R packages. The generation is based on a two-step approach. In the first step, a quantitative dataset with a multidimensional correlation structure reflecting the given properties (between-cluster distances, the numbers of clusters and variables, and the range of categories) is created. In the second step, the dataset is categorized according to the desired number of categories for each variable. The categorization process creates equal-width intervals from the quantitative values of a given variable, differing in the numbers of categories. In comparison to an equal-frequency approach, the equal-width approach creates more natural-looking datasets; moreover, if the categories differ in frequency counts, favorable properties of certain similarity measures can be exploited.

Figure 2 demonstrates two different dataset generation settings using the clusplot() function in the cluster R package (Maechler et al., 2018), which displays the clusters in two-dimensional space, i.e., with some loss of data variability. The displayed datasets express 74% and 73.4% of their original variability, respectively. Both datasets contain three natural clusters, which differ by their minimal between-cluster distance. In the left plot, where the distance 0.1 was used, the clusters largely overlap, whereas in the right one, they are well separated.

Figure 2: Two generated datasets with different properties. Left: Dataset 013 (variables = 4, categories = 5-7, clusters = 3, distance = 0.1); the two displayed components explain 74% of the point variability. Right: Dataset 070; the two displayed components explain 73.4% of the point variability.
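For readers without access to the supplementary material, the following R sketch illustrates the two-step generation approach for a single dataset. It calls clusterGeneration::genRandomClust directly and is only an approximation of gen_object, whose interface may differ; the numbers of categories per variable (n_cat) are illustrative assumptions.

```r
library(clusterGeneration)

set.seed(42)
# Step 1: a quantitative dataset with 3 clusters, 4 variables, and a
# minimal between-cluster distance (sepVal) of 0.1; cluster sizes of
# 100-240 objects yield roughly 300-700 cases in total.
gen <- genRandomClust(numClust = 3, sepVal = 0.1, numNonNoisy = 4,
                      clustszind = 2, rangeN = c(100, 240),
                      numReplicate = 1, outputDatFlag = FALSE,
                      outputLogFlag = FALSE, outputEmpirical = FALSE,
                      outputInfo = FALSE)
quant <- as.data.frame(gen$datList[[1]])
truth <- gen$memList[[1]]   # known cluster memberships

# Step 2: categorize each variable into equal-width intervals.
n_cat <- c(5, 6, 7, 5)      # illustrative numbers of categories
categ <- as.data.frame(mapply(function(x, k) cut(x, breaks = k, labels = FALSE),
                              quant, n_cat))
```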
3.2 A Choice of Similarity Measures

Five similarity measures for nominal data, namely SM, ES, IOF, LIN and VE, were chosen for the experiment. The SM (simple matching) measure represents the standard approach to determining similarity in datasets characterized by categorical variables. In Sulc (2016), it was found that the cluster partitions produced by this measure are the same as the partitions created by the majority of similarity measures for binary-coded data based on the four frequencies in the 2 x 2 contingency table, such as the Jaccard coefficient or the Sokal and Sneath measures. This finding also corresponds to Todeschini et al. (2012), where 44 similarity measures for binary-coded data were examined, and it was discovered that many of them were monotonically dependent and the rest of them were close to monotonic dependence (the Spearman's rho ranged from 0.97 to 0.99). Thus, the outputs for the SM measure also represent these similarity measures for binary-coded data.

Next, four similarity measures for nominal data which provided the best clusters in Sulc (2016) were used. Each of them treats the similarity between two categories differently. The ES measure (Eskin et al., 2002) is based on the number of categories of the $c$-th variable, whereas the IOF measure (Sparck-Jones, 1972) uses the absolute frequencies of the observed categories $x_{ic}$ and $x_{jc}$. The LIN measure (Lin, 1998) uses the relative frequencies instead. The VE measure (Sulc, 2016) is based on the variability of the $c$-th variable expressed by the entropy. All of the measures can be applied directly to the categorical data matrix $X = [x_{ic}]$, where $i = 1, 2, \ldots, n$ ($n$ is the total number of objects) and $c = 1, 2, \ldots, m$ ($m$ is the total number of variables). The number of categories of the $c$-th variable is denoted as $K_c$, an absolute frequency as $f$, and a relative frequency as $p$. An overview of the measures can be found in Table 1, where the column $S_c(x_{ic} = x_{jc})$ presents the similarity computation (or just a value) for matches of categories in the $c$-th variable for the $i$-th and $j$-th objects, and the column $S_c(x_{ic} \neq x_{jc})$ for mismatches of these categories.

Table 1: Calculation of the used similarity measures for categorical data

Measure  S_c(x_ic = x_jc)                        S_c(x_ic ≠ x_jc)                     S(x_i, x_j)  D(x_i, x_j)
SM       1                                       0                                    Eq. (3.1)    Eq. (3.3)
ES       1                                       K_c^2 / (K_c^2 + 2)                  Eq. (3.1)    Eq. (3.4)
IOF      1                                       1 / (1 + ln f(x_ic) · ln f(x_jc))    Eq. (3.1)    Eq. (3.4)
LIN      2 ln p(x_ic)                            2 ln(p(x_ic) + p(x_jc))              Eq. (3.2)    Eq. (3.4)
VE       -(1/ln K_c) Σ_{u=1}^{K_c} p_u ln p_u    0                                    Eq. (3.1)    Eq. (3.3)

At the second level, the total similarity $S(x_i, x_j)$ between the objects $x_i$ and $x_j$ is determined. For the majority of the examined similarity measures, it is calculated as the arithmetic mean

$$S(x_i, x_j) = \frac{\sum_{c=1}^{m} S_c(x_{ic}, x_{jc})}{m}. \tag{3.1}$$

For the LIN measure, the total similarity is expressed as

$$S(x_i, x_j) = \frac{\sum_{c=1}^{m} S_c(x_{ic}, x_{jc})}{\sum_{c=1}^{m} \left(\ln p(x_{ic}) + \ln p(x_{jc})\right)}. \tag{3.2}$$

To compute a proximity matrix, which is required by the majority of software solutions, it is necessary to compute the dissimilarities $D(x_i, x_j)$ between all pairs of objects; these can simply be obtained from the similarities. The dissimilarities are calculated in two ways. For the similarity measures which take values from zero to one, it is

$$D(x_i, x_j) = 1 - S(x_i, x_j), \tag{3.3}$$

and for the similarity measures which can exceed the value one, it is

$$D(x_i, x_j) = \frac{1}{S(x_i, x_j)} - 1. \tag{3.4}$$

The $S(x_i, x_j)$ and $D(x_i, x_j)$ columns in Table 1 show which similarity measures use the particular formulas.
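As a small illustration of Eq. (3.1) and Eq. (3.3), the following R sketch builds the SM dissimilarity matrix for a categorical data frame. It is a naive O(n^2 m) implementation written for exposition, not the code used in the experiment.

```r
# Simple matching (SM) dissimilarity: per-variable matches averaged via
# Eq. (3.1), turned into dissimilarities via Eq. (3.3).
sm_dissim <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)
  D <- matrix(0, n, n)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      s <- mean(X[i, ] == X[j, ])   # Eq. (3.1): share of matching variables
      D[i, j] <- D[j, i] <- 1 - s   # Eq. (3.3)
    }
  }
  as.dist(D)
}

d <- sm_dissim(categ)   # categ: a categorical data frame, e.g., from Sec. 3.1
```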
3.3 Methods of Cluster Analysis

To determine the between-cluster distances, three methods of HCA are examined in this paper: the complete linkage, the average linkage, and the single linkage methods. They are commonly used in hierarchical clustering of categorical data (see, e.g., Lu and Liang, 2008; Morlini and Zani, 2012).

The complete linkage method treats the dissimilarity between two clusters as the dissimilarity between the two farthest objects from the different clusters. This between-cluster distance usually produces compact clusters with approximately equal diameters. It can be expressed by the formula

$$D(C_g, C_h) = \max_{x_i \in C_g,\, x_j \in C_h} D(x_i, x_j).$$

The average linkage takes the average pairwise dissimilarity between objects in two different clusters. The obtained clusters are often similar to the ones obtained by the complete linkage. Its formula can be expressed as

$$D(C_g, C_h) = \frac{1}{n_g n_h} \sum_{x_i \in C_g} \sum_{x_j \in C_h} D(x_i, x_j),$$

where $n_g$ and $n_h$ are the numbers of objects in the $g$-th and $h$-th clusters, respectively.

The single linkage uses the dissimilarity between the two closest objects from two different clusters. The formula of this method can be expressed as

$$D(C_g, C_h) = \min_{x_i \in C_g,\, x_j \in C_h} D(x_i, x_j).$$

3.4 Evaluation Criteria Assessment

The quality of the evaluation criteria will be assessed by the accuracy statistic (AC) and by the mean absolute error (MAE). AC is defined as the percentage of the correctly determined numbers of clusters, expressed as

$$AC = \frac{\sum_{t=1}^{T} I(k_t = K_t)}{T} \cdot 100\,\%,$$

where $k_t$ is the optimal number of clusters based on a given evaluation criterion for the $t$-th dataset, $K_t$ is the known number of clusters for the $t$-th dataset, $T$ is the number of datasets, and $I$ is the indicator function which takes the value one in the case of $k_t = K_t$ and the value zero otherwise. MAE is defined as the average of the absolute differences between the optimal numbers of clusters based on a given evaluation criterion and the known numbers of clusters. Low values indicate good stability of an evaluation criterion in its performance and vice versa. MAE is expressed by the formula

$$MAE = \frac{\sum_{t=1}^{T} |k_t - K_t|}{T}.$$

4 Experiment

The experiment consists of two main parts. In the first one, the examined evaluation criteria are assessed regarding their ability to determine the optimal number of clusters. In the second part, the properties of datasets which significantly associate with the performance of the evaluation criteria are identified. The analysis was performed on 810 generated datasets whose generation process was explained in Section 3.1. To each of these datasets, a series of HCAs for two to ten clusters with the five examined similarity measures presented in Section 3.2 was applied. The complete, average and single linkage methods of HCA described in Section 3.3 were used. The optimal numbers of clusters based on the 11 evaluation criteria presented in Section 2 are then assessed by the AC and MAE statistics. In the supplementary material to this paper, a script run_evaluation.R containing the whole evaluation process can be found.
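To convey the shape of that evaluation process, the following R sketch scores the two- to ten-cluster solutions of one dataset with one criterion (the silhouette index) and then computes AC and MAE over a collection of datasets. It is a simplified illustration under assumed inputs (a dissimilarity matrix d and vectors k_det, k_true of determined and known cluster numbers), not the run_evaluation.R script itself.

```r
library(cluster)

# One dataset: hierarchical clustering on a dissimilarity matrix d,
# scoring the 2- to 10-cluster solutions with the silhouette index.
hc <- hclust(d, method = "average")
ks <- 2:10
si <- sapply(ks, function(k) {
  cl <- cutree(hc, k)
  mean(silhouette(cl, d)[, "sil_width"])
})
k_opt <- ks[which.max(si)]   # optimal number of clusters by SI

# Over all T datasets: k_det and k_true are assumed vectors holding the
# determined and the known numbers of clusters, one element per dataset.
AC  <- mean(k_det == k_true) * 100   # percentage of correct determinations
MAE <- mean(abs(k_det - k_true))    # mean absolute error
```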
4.1 Evaluation of the Optimal Number of Clusters Determination

In this subsection, the determination of the optimal number of clusters by the 11 examined evaluation criteria is assessed with respect to the used similarity measures and the inherent properties of the datasets. The assessment is based on the principle that evenly distributed values of the AC statistic (percentages of the correctly assigned clusters) for a particular evaluation criterion over an examined factor (e.g., a similarity measure) indicate that this factor is not associated with the evaluation criterion. Conversely, substantial differences in the AC statistic over the factor values indicate an association between the factor and the criterion. The detailed analysis is limited to the results for the average linkage since it provided the best results of all the examined linkage methods. The most important outputs for the complete and single linkages are placed in the Appendix, and they are briefly discussed at the end of this section.

Table 2: AC and MAE statistics broken down by five similarity measures

                       AC                               MAE
Crit.   SM    ES    IOF   LIN   VE       SM    ES    IOF   LIN   VE
PSFE    37.8  37.8  37.5  36.9  38.3     1.02  0.98  0.97  1.01  0.93
PSFM    39.1  38.3  39.3  37.3  39.6     1.01  0.98  0.92  1.00  0.91
BK      46.8  40.9  45.3  46.3  44.7     0.74  0.84  0.76  0.73  0.76
BIC1    12.7  38.0  46.0  49.9  44.7     2.95  1.53  1.22  1.04  1.15
BIC2    15.3  16.5  19.9  24.0  19.5     2.37  2.56  2.51  2.18  2.44
DU      29.4  25.3  22.1  23.6  28.0     1.30  1.60  2.12  1.97  1.47
SI      42.4  33.0  41.0  42.3  39.9     1.03  1.59  0.91  1.09  1.17
CI      1.1   4.6   5.6   7.5   0.8      4.80  4.36  4.12  3.48  4.73
MC      33.1  32.5  33.8  32.7  33.1     1.02  1.06  1.05  1.05  1.01

Table 2 shows the values of the AC and MAE statistics for the examined internal criteria (Crit.), calculated as the averages over all datasets and broken down by the five used similarity measures. The criteria AIC1 and AIC2 are not displayed in the output because they provided the same outputs as the more commonly used BIC1 and BIC2 criteria. When looking at the AC values, which express the percentages of correctly determined numbers of clusters, it is clear that the internal evaluation criteria for categorical data are not nearly as successful as their counterparts for quantitative data, see, e.g., Vendramin et al. (2010), where the accuracy is around 80%. This is caused by the fact that clusters in purely categorical data are much more difficult to recognize, since categorical data have low discriminability compared to quantitative data. Nevertheless, since nine cluster solutions (two to ten clusters) were investigated, a random guess succeeds in 11.1% of cases, and thus, all the criteria except for CI perform better than that.

The overall best performance among the examined criteria was attained by the BK index, whose AC was around 45%. The other two variability-based evaluation criteria, PSFM and PSFE, also provided stable but somewhat worse results, with AC slightly under 40%. All three variability-based criteria also have very low MAEs (mostly lower than one); thus, they are very stable in their results. Regarding the likelihood-based evaluation criteria, the BIC1 criterion provided good accuracy (for categorical data), but only with the measures LIN, IOF and VE, with MAEs slightly over one. The BIC2 criterion performed poorly with all the similarity measures. This suggests that the entropy-based variants of this type of criteria (BIC1, AIC1) should be preferred. The distance-based evaluation criteria perform rather poorly. The only exception is the silhouette index, whose AC is around 40% (apart from the ES measure). From Table 2, it is also apparent that the MAE values are inversely related to the AC scores: the criteria with better accuracy have lower mean absolute errors. The well-performing evaluation criteria have MAEs around one or lower. Overall, the examined evaluation criteria provided the best ACs and MAEs with the LIN measure. Therefore, this similarity measure is used in the more detailed analysis and for the comparison with the reference SM measure in the rest of the paper.

Table 3 displays the average ACs of the examined evaluation criteria using the LIN measure.
Table 3: Average AC values of the examined evaluation criteria using the LIN measure