Informatica 23 (1999) 501-506 501 Generalized Blockmodeling Vladimir Batagelj University of Ljubljana, Faculty of Mathematics and Physics Jadranska 19, 1000 Ljubljana, Slovenia E-mail: vladimir.batagelj@uni-lj.si AND Anuška Ferligoj University of Ljubljana, Faculty of Social Sciences Kardeljeva pi. 5, 1 000 Ljubljana, Slovenia E-mail: anuska.ferligoj@uni-lj.si AND Patrick Doreian University of Pittsburgh, Department of Sociology PA 15260, Pittsburgh, USA E-mail: pitpat+@pitt.edu Keywords: clustering, blockmodeling, social network, pre-specified blockmodel, local optimization. Edited by: Cene Bavec and Matjaž Gams Received: October 3, 1999 Revised: November 20, 1999 Accepted: December 4, 1999 The goal of blockmodeling is to reduce a large, potentially incoherent network to a smaller comprehensible structure that can be interpreted more readily. In the paper we present an overview of basic ideas and developments in this area. 1 Basic Notions 1.1 Network Let E = {Xi, X2, ■ ■ ■ , Xn} be a finite set of units. The units are related by binary relations Rt C E x E, t — 1,... , r, r > 1 which determine a network M = (E,Ri,R2,... ,Rr) In the following we restrict our discussion to a single relation R described by a corresponding binary matrix R = [■rij]nxn where r - = i 1 XiRXi n \ 0 otherwise In some applications ry can be a nonnegative real number expressing the strength of the relation R between units Xt and Xj. 1.1.1 Example: Student Government In Table 1 and Figure 1 the Student Government network is presented. It consists of communication interactions among twelve members and advisors of the Student Government at the University in Ljubljana (Hlebec, 1993). The results of the measurement are not real interactions among actors but cognition about communication interactions. Data were collected with face to face interviews, conducted in May 1992. Figure 1: Network graph: Student Government - discussion, recall 502 Informática 23 (1999) 501-506 V. Batagelj et al. Table 1 : Student Government matrix m p m m m m m m a a a 1 2 3 4 5 6 7 8 9 10 11 minister 1 1 0 1 1 0 0 1 0 0 0 0 0 p.minister 2 0 0 0 0 0 0 0 1 0 0 0 minister 2 3 1 1 0 1 0 1 1 1 0 0 0 minister 3 4 0 0 0 0 0 0 1 1 0 0 0 minister 4 5 0 1 0 1 0 1 1 1 0 0 0 minister 5 6 0 1 0 1 1 0 1 1 0 0 0 minister 6 7 0 0 0 1 0 0 0 1 1 0 1 minister 7 8 0 1 0 1 0 0 1 0 0 0 1 adviser 1 9 0 0 0 1 0 0 1 1 0 0 1 adviser 2 10 1 0 1 I 1 0 0 0 0 0 0 adviser 3 11 0 0 0 0 0 1 0 1 1 0 0 Communication flow among actors was identified by the following question: Of the members and advisors of the Student Government, whom do you (most often) talk with? The content of the communication flow was limited to the matters of the Student Government. The time frame was also defined: the question was referred to the six months period. One respondent refused to cooperate in the experiment. As he was not considered in the analysis, the network consists of eleven actors. 1.2 Cluster amd Clustering One of the main procedural goals of blockmodeling is to identify, in a given network, clusters (classes) of units that share structural characteristics defined in terms of R. The units within a cluster have the same or similar connection patterns to other units. They form a clustering C = {Ci,Ci,... ,Ck} which is a partition of the set E: (J{ Cj = E and i ^ j Ci n Cj = 0. Each partition determines an equivalence relation (and vice versa). if C> a? V =>o Figure 2: Blockmodeling scheme. complete row-dominant col-dominant regular row-regular col-regular row-functional col-functional 1.3 Block A clustering C partitions also the relation R into blocks R(Ci, Cj) = RnCiX C) Each such block consists of units belonging to clusters Ci and Cj and all arcs leading from cluster C» to cluster Cj. If i = j, a block R(C{, Ci) is called a diagonal block. 1.4 Blockmodel amd Blockmodeling The goal of blockmodeling is to reduce a large, potentially incoherent network to a smaller comprehensible structure that can be interpreted more readily. Blockmodeling, as an empirical procedure, is based on the idea that units in a network can be grouped according to the extent to which they Figure 3 : Types of connection between two sets; the left set is the ego-set. are equivalent, according to some meaningful definition of equivalence. A blockmodel consists of structures obtained by identifying all units from the same cluster of the clustering C. For an exact definition of a blockmodel (see Figure 2) we have to be precise also about which blocks produce an arc in the reduced graph and which do not, and of what type. Some types of connections are presented in Figure 3. A block is symmetric if V(X,F) eCiX Cj : (XRY & YRX) Note that for nondiagonal blocks this condition involves a pair of blocks R(Ci, Cj) and R(Cj,Ci). GENERALIZED BLOCKMODELING Informatica 23 (1999) 501-506 503 Table 2: Block types and matrices. 1 1 1 1 1 1 0 0 1 1 1 1 0 1 0 1 1 1 1 1 0 0 1 0 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 1 1 1 0 Cl c2 Cl complete regular c2 null complete The reduced graph can be presented by relational matrix, called also image matrix (see Table 2). A clustering and the induced blockmodel of the Student Government is presented in Figure 4. position t G U is C(t) = = {X 6 E : fi(X) = t) Therefore C(n) = {C(t) :teU} is a partition (clustering) of the set of units E. A blockmodel is an ordered sextuple M = (U,K,T,Q, 7T, a) where: - U is a set of positions (types of units); - K Ç U x U is a set of connections', - T is a set of predicates used to describe the types of connections between different clusters in a network. We assume that nul 6 T. - a mapping it : K -ï T\ {nul} assigns predicates to connections; - Q is a set of averaging rules. A mapping a : K -» Q determines rules for computing values of connections. A (surjective) mapping fi : E U determines a blockmodel M of network M iff it satisfies the conditions: V(i,tu) S K : 7r(t,w){C{t),C(w)) and V(f,tu) eUxU\K: nul {C(t),C(w)). 2.1 Equivalences Let « be an equivalence relation over E and [X] — {Ye E : X « Y}. We say that « is compatible with T over a network Af iff \/X,YE E3T &T :T{[X],[Y]). It is easy to verify that the notion of compatibility for T = {nul, reg} reduces to the usual definition of regular equivalence (White and Reitz 1983). Similarly, compatibility for T = {nul, com} reduces to structural equivalence (Lorrain and White 19 71 ). For a compatible equivalence « the mapping fi : X h-> [X] determines a blockmodel with U = E/ 3 Optimization Figure 4: Blockmodeling example. 2 Blockmodeling - Formalization Let U be a set of positions or images of clusters of units. Let fi : E U denote a mapping which maps each unit to its position. The cluster of units C{t) with the same 3.1 Criterion Function The problem of establishing a partition of units in a network in terms of a selected type of equivalence is a special case of clustering problem that can be formulated as an optimization problem: determine the clustering C* for which P(C*) =minP(C) 504 Informática 23 (1999) 501-506 Table 3: Characterizations of Types of Blocks. null nul all 0* complete com all r row-regular rre each row is 1-covered col-regular ere each column is 1 -covered row-dominant rdo 3 all 1 row* col-dominant cdo 3 all 1 column* regular reg 1 -covered rows and 1-covered columns non-null one 3 at least one 1 * except may be diagonal where C is a clustering of a given set of units E, $ is the set of all feasible clusterings and P : $ IR the criterion function. One of the possible ways of constructing a criterion function that directly reflects the considered equivalence is to measure the fit of a clustering to an ideal one with perfect relations within each cluster and between clusters according to the considered equivalence. Given a set of types of connection T we can introduce the set of ideal blocks for a given type T £ T by B{Ci,Cj\T) = {bccjx Cj : T(B)} Using Table 3 we can efficiently test whether the block R(Ci,Cj) is of the type T; and define the deviation 6(Cj,Cj-,T) of a block R(Ci,Cj) from the nearest ideal block. For example 8(Ci,Cs;reg) = \Ci\ • ~ <*) + \Cj\■ (|C,| - n) where Cj is the number of non-zero columns, and r\ is the number of non-zero rows in the block R(Ci,Cj). For details see (Batagelj 1997). For the proposed types all deviations are sensitive S(Ci, Cj; T) = 0 T{R(CU Cj)). Therefore a block R{Ci, Cj) is of a type T exactly when the corresponding deviation 8(Ci,Cj; T) is 0. In the deviation S we can also incorporate values of lines v. Based on deviation S(Ci, Cj; T) we introduce the block-error e(Ci,Cj\T) of R(Ci,Cj) for type T. An example of block-error is e(Ci,Cj;T) = w(T)5(Ci,Cj-,T) where w(T) > 0 is a weight of type T. We extend the block-error to the set of feasible types T by defining e{Ci, Cj-, T) = min e{Cu Cj;T) and n{fi{Ci),n(Cj)) = argmin T€Te{CuCj\T) V. Batagelj et al. To make -k well-defined, we order (priorities) the set T and select the first type from T which minimizes e. We combine block-errors into a total error - blockmodeling criterion function P(C(t*)\T)= £ e(C(t), C(w)\T). (t,w)euxu For criterion function P we have P(C(fi)) = 0 <3- n is an exact blockmodeling The obtained optimization problem can be solved by local optimization. Once a partitioning fi and types of connection 7r are determined, we can also compute the values of connections by using averaging rules. 3.2 Local Optimization For solving the blockmodeling problem we use a local optimization procedure (relocation algorithm): Determine the initial clustering C; repeat: if in the neighborhood of the current clustering C there exists a clustering C1 such that P(C') < P(C) then move to clustering C' . The neighborhood in this local optimization procedure is determined by the following two transformations: - moving a unit Xk from cluster Cp to cluster Cq (transition); - interchanging units Xu and Xv from different clusters Cp and Cq (transposition). 3.3 Benefits from Optimization Approach - ordinary / inductive blockmodeling: Given a network Af and set of types of connection T, determine M, i.e., n, 7r and a; - evaluation of the quality of a model, comparing different models, analyzing the evolution of a network (Sampson data, Doreian and Mrvar 1996): Given a network Af, a model M, and blockmodeling /¿, compute the corresponding criterion function; - model fitting / deductive blockmodeling: Given a network Af, set of types T, and a model M, determine /i which minimizes the criterion function (Batagelj, Ferligoj, Doreian, 1998). - we can fit the network to a partial model and analyze the residual afterward; - we can also introduce different constraints on the model, for example: units X and Y are of the same type; or, types of units X and Y are not connected; GENERALIZED BLOCKMODELING Informatica 23 (1999) 501-506 505 4 Pre-Specified Blockmodels Figure 5: Symmetric acyclic blockmodel of Student Government. The pre-specified blockmodeling starts with a blockmodel specified, in terms of substance, prior to an analysis. Given a network, a set of ideal blocks is selected, a reduced model is formulated, and partitions are established by minimizing the criterion function. The pre-specified blockmodeling is supported by the program MODEL 2 (Batagelj, 1996). As an example of pre-specified blockmodel we present in Figure 5 a symmetric acyclic blockmodel of Student Government. The obtained clustering in 4 clusters is almost exact - acyclic model with symmetric clusters. The only error is produced by the arc (a3, m5). 5 Final Remarks Acknowledgment: This work was supported by the Ministry of Science and Technology of Slovenia, Project Jl-8532. References [1] Batagelj, V. (1991): STRAN - STRucture ANalysis. Manual, Ljubljana. [2] Batagelj, V. (1997): Notes on Blockmodeling. Social Networks, 19, 143-155. Also in: Abstracts and Short Versions of Papers, 3rd European Conference on Social Network Analysis, München, 1993: DJI, 1-9. [3] Batagelj, V., P. Doreian, and A. Ferligoj (1992): An optimizational approach to regular equivalence. Social Networks, 14:121-135. [4] Batagelj, V., A. Ferligoj, and P. Doreian (1992): Direct and indirect methods for structural equivalence. Social Networks, 14:63-90. [5] Batagelj, V., Ferligoj, A., and Doreian, P. (1998): Fitting Pre-Specified Blockmodels, in Data Science, Classification, and Related Methods, Eds., C. Hayashi, N. Ohsumi, K. Yajima, Y. Tanaka, H. H. Bock, and Y. Baba, Springer-Verlag, Tokyo, p.p. 199206. [6] Borgatti, S.P. and M.G. Everett (1989): The class of all regular equivalences: Algebraic structure and computation. Social Networks, 11:65-88. [7] Doreian, P., V. Batagelj, and A. Ferligoj (1994): Partitioning Networks on Generalized Concepts of Equivalence. Journal of Mathematical Sociology, 19/1:127. [8] Doreian, P., V. Batagelj, and A. Ferligoj (1998): Symmetric-Acyclic Decompositions of Networks. To appear in Journal of Classification. [9] Doreian, P. and A. Mrvar (1996) A Partitioning Approach to Structural Balance. Social Networks 18:149-168. The current, local optimization based, programs for gener- [10] Faust, K. (1988): Comparison of methods for posi- alized blockmodeling can deal only with networks with at tional analysis: Structural and general equivalences, most some hundreds of units. What to do with larger net- Social Networks, 10:313-341. works is an open question. For some specialized problems also procedures for (very) large networks can be developed [11] A. Ferligoj, V. Batagelj, and Doreian, P. (1994): On (Doreian, Batagelj, Ferligoj, 1998). Connecting Network Analysis and Cluster Analysis. Another interesting problem is the development of In Contributions to Mathematical Psychology, Psy- blockmodeling of valued networks. chometrics, and Methodology (G.H. Fischer, D. Lam- MODEL 2 and related programs and data can be ob- in8 Eds-)> New York: Springer. tamed from [ 12] Lorrain, F. and H.C. White (1971): Structural equiv- http://vlado.fmf .uni-lj .si/ alence of individuals in social networks. Journal of pub/networks/stran/ Mathematical Sociology, 1:49-80. 506 Informática 23 (1999) 501-506 V. Batagelj et al. [13] Hlebec, V. (1993): Recall versus recognition: Comparison of two alternative procedures for collecting social network data. Developments in Statistics and Methodology. (A. Ferligoj and A. Kramberger, editors) Metodološki zvezki 9, Ljubljana: FDV, 121-128. [14] White, D.R. and K.P. Reitz (1983): Graph and semigroup homomorphisms on networks of relations. Social Networks, 5:193-234.