Metodološki zvezki, Vol. 17, No. 2, 2020, 49–66
blockmodeling: An R package for generalized blockmodeling
Miha Matjašič, Marjan Cugmas, Aleš Žiberna∗
University of Ljubljana, Faculty of Social Sciences, Ljubljana, Slovenia
Abstract
This paper presents the R package blockmodeling, which is primarily meant as an
implementation of generalized blockmodeling (more broadly, blockmodeling) for valued networks
where the values of the ties are assumed to be measured on at least an interval scale.
Blockmodeling is one of the most commonly used approaches in the analysis of (social) networks,
which deals with the analysis of relationships or connections between the units studied (e.g.,
people, organizations, journals etc.). The R package blockmodeling implements several
approaches for the generalized blockmodeling of binary and valued networks. Generalized
blockmodeling is commonly used to cluster nodes in a network with regard to the structure
of their links. The theoretical foundations of generalized blockmodeling for binary and
valued networks are summarized in the paper, while the use of the R package blockmodeling is
illustrated by applying it to an empirical data set.
1. Introduction
The aim of this paper is to present the R package blockmodeling, which is primarily
meant as an implementation of generalized blockmodeling for valued networks.
A network is defined by the set of nodes (also called vertices, units or actors) and the set
of links among the nodes. These two sets determine a graph which describes the network's
structure. In the social sciences, for example, the nodes often represent individuals and the
links among them represent a selected (social) relationship among individuals. Additional data
can be assigned to the nodes (e.g., gender or age) and links (e.g., the number of contacts) to
describe their properties (also called attributes) (Batagelj et al., 2004).
Since real-world networks may be large and complex, researchers try to simplify them
to smaller and more understandable structures that are easier to interpret. A common way of
accomplishing this goal is a blockmodeling approach, which partitions the nodes of a network
and determines the ties among the (obtained) clusters of nodes (Batagelj et al., 2004). In the
social sciences, blockmodeling is also a very important explanatory tool for studying social
roles because it is assumed that the way a cluster of nodes is embedded in the network
structure is closely associated with the nodes' social role(s) (Borgatti and Everett, 1992).
∗ Corresponding author
Email addresses: miha.matjasic@fdv.uni-lj.si (Miha Matjašič),
marjan.cugmas@fdv.uni-lj.si (Marjan Cugmas), ales.ziberna@fdv.uni-lj.si (Aleš Žiberna)
While blockmodeling may entail several different methods, the focus of this paper is on
generalized blockmodeling of binary and valued networks using the blockmodeling package
for the R programming language (R Core Team, 2018).
The structure of this paper is as follows: In Section 2, we describe blockmodeling. In
Section 3, we describe the R package blockmodeling and in Section 4, we provide examples
of the package use, while in Section 5, we summarise the main functionalities of the
package.
2. Blockmodeling
Blockmodeling is a set of approaches for partitioning nodes into clusters (also called
positions) and simultaneously partitioning the links into blocks which are defined by the
obtained clusters (Lorrain and White, 1971; Batagelj et al., 2004; Žiberna, 2007). A block
is a submatrix showing the links between nodes from the same or different clusters.
The concept of blockmodeling is presented in Figure 1, where the illustrative valued
network is shown in both matrix form and as a graph (Figure 1a). The units are ordered in
rows and columns according to their names (n1, n2, n3, ...). The units are then partitioned
by considering the weights such that those with similar patterns of links are partitioned into
the same clusters. The network is then represented in matrix form in accordance with the
clusters obtained, so that units from the same clusters are placed next to each other and
different clusters are separated by blue lines (Figure 1b). Nodes of the same cluster are
coloured using the same colour in the corresponding graphic visualization. Then nodes from
the same clusters are shrunk and represented as nodes of a blockmodel, which is shown as both
a matrix and a graph (Figure 1c). The block densities are provided in the matrix. These
summarize the strength of the relationship within and between the clusters. It can be seen
that one core cluster and two cohesive clusters were identified. The core cluster is linked
with both cohesive clusters (and vice versa) whereas the cohesive clusters are not linked to
each other.
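As a rough illustration of how block densities of the kind reported in Figure 1c arise, the following minimal R sketch computes the mean tie value of each block for a small made-up valued network. The matrix M, the partition clu and the helper block_means are hypothetical and not taken from the paper; within the package, the function funByBlocks serves this purpose.

```r
# Toy valued network with 6 nodes (illustrative values, not from the paper).
M <- matrix(c(
  0, 5, 4, 1, 0, 0,
  5, 0, 6, 0, 1, 0,
  4, 6, 0, 0, 0, 1,
  1, 0, 0, 0, 3, 2,
  0, 1, 0, 3, 0, 4,
  0, 0, 1, 2, 4, 0
), nrow = 6, byrow = TRUE,
dimnames = list(paste0("n", 1:6), paste0("n", 1:6)))

# Hypothetical partition: n1-n3 form cluster 1, n4-n6 form cluster 2.
clu <- c(1, 1, 1, 2, 2, 2)

# Mean tie value of every block, ignoring self-ties on the diagonal.
block_means <- function(M, clu) {
  diag(M) <- NA
  k <- sort(unique(clu))
  out <- matrix(NA_real_, length(k), length(k),
                dimnames = list(as.character(k), as.character(k)))
  for (r in seq_along(k)) {
    for (s in seq_along(k)) {
      out[r, s] <- mean(M[clu == k[r], clu == k[s]], na.rm = TRUE)
    }
  }
  out
}

dens <- block_means(M, clu)
print(round(dens, 2))
```

In this toy example the two diagonal blocks have high means and the off-diagonal blocks low means, i.e. two cohesive clusters.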
The nodes are clustered according to some notion of equivalence (Wasserman and Faust,
1994). The most commonly used are structural equivalence (Lorrain and White, 1971) and
regular equivalence (White and Reitz, 1983), both originally defined for binary networks
(Žiberna, 2007). Two nodes are structurally equivalent if they are identically linked to the
rest of the network (and to themselves), while the nodes are regularly equivalent if they are
connected in the same way to equivalent others. Regular equivalence is a generalization
of structural equivalence. While analysing valued networks, regular equivalence should be
replaced by f-regular equivalence, where f refers to any function, such as sum, max or mean
(Žiberna, 2007).
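The definition of structural equivalence can be made concrete with a short check in base R. This is only a sketch under one common reading of the definition: two nodes must have identical ties to and from all other nodes, identical self-ties, and identical ties to each other. The helper is_struct_equiv and the toy matrix are hypothetical.

```r
# TRUE if nodes i and j are structurally equivalent in adjacency matrix M.
is_struct_equiv <- function(M, i, j) {
  others <- setdiff(seq_len(nrow(M)), c(i, j))
  all(M[i, others] == M[j, others]) &&    # same outgoing ties to all others
    all(M[others, i] == M[others, j]) &&  # same incoming ties from all others
    M[i, i] == M[j, j] &&                 # same self-ties
    M[i, j] == M[j, i]                    # same ties to each other
}

# Toy binary network (illustrative only).
M <- matrix(c(
  0, 1, 1, 0,
  1, 0, 1, 0,
  0, 0, 0, 1,
  1, 1, 0, 0
), nrow = 4, byrow = TRUE)

eq12 <- is_struct_equiv(M, 1, 2)  # nodes 1 and 2 share all ties
eq13 <- is_struct_equiv(M, 1, 3)  # nodes 1 and 3 do not
print(c(eq12, eq13))
```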
In practice, structural equivalence is probably the most commonly used type of equivalence
(Žnidaršič, Ferligoj and Doreian, 2012). At the same time, regular equivalence has
never achieved widespread use (Žiberna, 2013), especially because it is rarely present in
empirical data (Boyd and Jonas, 2001) and very sensitive to small changes in the network
(Žnidaršič, Ferligoj and Doreian, 2012). Concerns have also been voiced about regular
equivalence's applicability to social theory (Boyd, 2002).
In terms of generalized blockmodeling, a chosen type of equivalence defines the possible
block types (and vice versa, i.e., the allowed block types in generalized blockmodeling imply
the type of equivalence). For example, when binary networks are analysed, and structural
equivalence is used, only null (ideally there are no links) and complete (ideally there are all
possible links) blocks are possible, while with regular equivalence null, complete and regular
Table 1: Characterisations of ideal blocks (Žiberna, 2007)

| Ideal block name | Description for binary blockmodeling | Description for valued blockmodeling | Description for homogeneity blockmodeling |
|---|---|---|---|
| null | all 0 (a) | all 0 (b) | all 0 (c) |
| complete | all 1 (d) | all values at least m (d) | all equal (e) |
| row-dominant | an all 1 row exists (d) | a row where all values are at least m exists (d) | a row exists where values are all equal (c) |
| col-dominant | an all 1 column exists (d) | a column where all values are at least m exists (d) | a column exists where all values are equal (c) |
| row(-f)-regular | at least one 1 in each row exists (d) | the f over each row is at least m (d) | f over all rows equal |
| column(-f)-regular | at least one 1 in each column exists | the f over each column is at least m | f over all columns equal |
| (f-)regular | at least one 1 in each row and each column exists | the f over each row and each column is at least m | f over all rows and all columns separately equal |
| row-functional | exactly one 1 in each row exists | exactly one tie with value at least m in each row exists, all other 0 | max over all rows equal, all other values 0 |
| column-functional | exactly one 1 in each column exists | exactly one tie with value at least m in each column exists, all other 0 | max over all columns equal, all other values 0 |

(a) An exception may be cells on the diagonal; their values should all be equal to 1.
(b) An exception may be cells on the diagonal; their values should all be at least m.
(c) An exception may be cells on the diagonal; their values should be equal.
(d) An exception may be cells on the diagonal; their values should all be equal to 0.
(e) Cells on the diagonal may be treated separately: their values should all be equal, however they can be
different from the values of the off-diagonal cells.
of the direct blockmodeling approach. Use of the blockmodeling package is demonstrated
for (direct) generalized blockmodeling only.
Some non-generalized direct blockmodeling approaches (for non-signed and signed networks)
are implemented in the dBlockmodeling package (Brusco, 2020), while generalized
blockmodeling for binary networks and some direct approaches for signed networks are
implemented in Pajek (Batagelj et al., 2004).
2.1.1. Conventional blockmodeling
Conventional blockmodeling (Doreian et al., 2005) is an indirect approach involving
two steps: (i) obtaining a dissimilarity matrix on the nodes using a dissimilarity measure
which is consistent with the type of equivalence selected (e.g., corrected Euclidean distance
(Batagelj, Ferligoj and Doreian, 1992) for structural equivalence); and (ii) clustering the
nodes with a hierarchical clustering method (e.g. Ward's agglomerative clustering method
(Ward, 1963)), based on the dissimilarity matrix obtained. Since the second step is well
supported by other R packages, the blockmodeling package only provides functions for
computing (dis)similarity matrices according to structural equivalence (sedist function) and
regular equivalence (REGE function and other functions).
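The two steps of the indirect approach can be sketched with base R alone. Note the assumption: instead of the corrected Euclidean distance (which the package's sedist function is meant to provide), this sketch uses the plain Euclidean distance between the concatenated row and column profiles of each node as a simple stand-in. The toy network is made up for illustration.

```r
# Toy binary network: nodes 1-3 cite each other, nodes 4-6 only cite nodes 1-3.
M <- matrix(0, 6, 6)
M[1:3, 1:3] <- 1
M[4:6, 1:3] <- 1
diag(M) <- 0

# Step (i): dissimilarities between nodes. Each node is described by its
# outgoing ties (row) and incoming ties (column). Plain Euclidean distance
# is an assumption made here, not the corrected measure from the paper.
profiles <- cbind(M, t(M))
d <- dist(profiles, method = "euclidean")

# Step (ii): Ward's hierarchical clustering, cut into two clusters.
hc <- hclust(d, method = "ward.D2")
clu <- cutree(hc, k = 2)
print(clu)
```

For real analyses, the dissimilarity in step (i) should match the chosen equivalence, as described above.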
2.1.2. Generalized blockmodeling
With generalized blockmodeling, a blockmodel is directly obtained from the network
data by optimizing a criterion function, typically with a relocation algorithm (Batagelj et al.,
1992). Different types of equivalences and/or block types can be specified.
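To make the notion of a criterion function more tangible, the following sketch counts inconsistencies for binary blockmodeling with structural equivalence: each block is compared with an ideal null block (every 1 is an inconsistency) and an ideal complete block (every 0 is an inconsistency), and the smaller count is kept. As a simplifying assumption, diagonal cells are ignored entirely, which differs from the special diagonal treatment noted in Table 1. The function binary_se_error and the toy data are hypothetical; in the package, the criterion function is computed by critFunC.

```r
# Total inconsistency of a partition under binary structural equivalence.
binary_se_error <- function(M, clu) {
  diag(M) <- NA                      # simplification: self-ties are ignored
  k <- sort(unique(clu))
  total <- 0
  for (r in k) {
    for (s in k) {
      block <- M[clu == r, clu == s]
      ones  <- sum(block == 1, na.rm = TRUE)  # deviations from a null block
      zeros <- sum(block == 0, na.rm = TRUE)  # deviations from a complete block
      total <- total + min(ones, zeros)       # best-fitting allowed block type
    }
  }
  total
}

# Toy core-periphery network with one inconsistent tie (3 -> 4).
M <- matrix(0, 5, 5)
M[1:2, ] <- 1
M[, 1:2] <- 1
diag(M) <- 0
M[3, 4] <- 1

err_good <- binary_se_error(M, c(1, 1, 2, 2, 2))  # core = {1, 2}
err_bad  <- binary_se_error(M, c(1, 2, 1, 2, 2))  # a poorer partition
print(c(err_good, err_bad))
```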
Generalized blockmodeling holds several advantages over conventional blockmodeling
(Doreian, 2006; Doreian et al., 2005; Batagelj et al., 2006): (i) since the direct approach
already includes the criterion function in the process of optimizing partitions, at least a
locally optimal solution will be obtained with the generalized approach; (ii) the partitions
obtained by generalized blockmodeling frequently outperform those obtained with the
conventional approach (at least in the case of structural and regular equivalence); (iii)
conventional blockmodeling has mainly been used in an inductive way, meaning that researchers
have accepted what was delineated through the clustering procedure. Yet, researchers often
possess some prior knowledge about the global network structure that can be included in the
blockmodel's specification.
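Point (i) can be illustrated with a very small relocation procedure. The sketch below is not the algorithm implemented in the package; it is a simplified greedy variant, assumed here for illustration, that moves one node at a time to another cluster whenever the move lowers a binary structural-equivalence criterion (diagonal ignored), and stops when no single move helps. All names are hypothetical.

```r
# Criterion: binary blockmodeling, structural equivalence, diagonal ignored.
crit <- function(M, clu, k) {
  diag(M) <- NA
  total <- 0
  for (r in seq_len(k)) {
    for (s in seq_len(k)) {
      block <- M[clu == r, clu == s]
      total <- total + min(sum(block == 1, na.rm = TRUE),
                           sum(block == 0, na.rm = TRUE))
    }
  }
  total
}

# Greedy relocation: accept any single-node move that lowers the criterion.
relocate <- function(M, clu, k) {
  best <- crit(M, clu, k)
  improved <- TRUE
  while (improved) {
    improved <- FALSE
    for (i in seq_along(clu)) {
      for (target in setdiff(seq_len(k), clu[i])) {
        if (sum(clu == clu[i]) == 1) next  # never empty a cluster
        cand <- clu
        cand[i] <- target
        val <- crit(M, cand, k)
        if (val < best) {
          clu <- cand
          best <- val
          improved <- TRUE
        }
      }
    }
  }
  list(clu = clu, err = best)
}

# Toy core-periphery network with one inconsistent tie (3 -> 4).
M <- matrix(0, 5, 5)
M[1:2, ] <- 1
M[, 1:2] <- 1
diag(M) <- 0
M[3, 4] <- 1

start <- c(1, 2, 1, 2, 2)
res <- relocate(M, start, k = 2)
print(res)
```

Because such a search only guarantees a local optimum, it is sensible to repeat it from many starting partitions.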
Examples of generalized blockmodeling use are found in Doreian et al. (2005), Mrvar
and Doreian (2009), Cugmas, Ferligoj, and Kronegger (2016) and Cugmas et al. (2020).
These examples include social relations in working settings, classroom networks, political
unit networks, scientific collaboration and citation networks, sport networks and other types
of networks.
In this paper, the following types of generalized blockmodeling are considered:
(i) Generalized binary blockmodeling, which is intended for analysing binary networks.
The binary blockmodeling concept is presented thoroughly in Doreian et al. (2005).
(ii) Generalized valued blockmodeling, which was developed because earlier researchers
were converting valued networks into binary networks and analysing them as binary
networks. The binarization was accomplished by recoding values above (or equal to) a
certain threshold (often 1) into 1s and the others into 0s (see Doreian et al., 2005), which
however caused the loss of a considerable amount of information. The valued
blockmodeling approach reduces the amount of information lost, although some loss may
still occur.
Valued blockmodeling may be seen as an extension of binary blockmodeling. It
extends the equivalence relations and thereby the definitions of possible block types by
replacing the stipulations for 1 with analogous stipulations for the value m (the
minimal value that characterizes the tie between a unit and either a cluster or another unit
such that this tie satisfies the condition of the block). Therefore, the criterion function
used in the valued blockmodeling measures block inconsistencies as the deviation of
appropriate values from either 0 or m (Žiberna, 2007).
(iii) Generalized homogeneity blockmodeling, which is based on the idea that blocks should
be as homogeneous as possible with respect to some property. Accordingly, the
inconsistencies of an empirical block with respect to its ideal block are measured by the
within-block variability of appropriate values.
One of the two variability criteria can be used: the sum of the squared deviations from
the mean or the sum of absolute deviations from the median (Žiberna, 2007).
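The block inconsistencies described in (ii) and (iii) can be written down in a few lines of base R. This is a minimal sketch for null and complete blocks only, ignoring any special treatment of diagonal cells; the function names and the toy values are hypothetical. For valued blockmodeling, a null block is penalized by how far values lie above 0 and a complete block by how far values fall short of m; for homogeneity blockmodeling, the penalty is the within-block variability.

```r
# Valued blockmodeling: deviations from 0 (null) or shortfall below m (complete).
null_val     <- function(b) sum(b)
complete_val <- function(b, m) sum(pmax(m - b, 0))

# Homogeneity blockmodeling: within-block variability.
ss_dev <- function(b) sum((b - mean(b))^2)      # sum of squared deviations
ad_dev <- function(b) sum(abs(b - median(b)))   # sum of absolute deviations

# Toy tie values of one block and an assumed threshold m.
block <- c(0, 2, 10, 12)
m <- 8

res <- c(null     = null_val(block),
         complete = complete_val(block, m),
         ss       = ss_dev(block),
         ad       = ad_dev(block))
print(res)
```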
2.2. Prespecified blockmodeling
A researcher can take prior knowledge concerning the ties among the clusters into account
while conducting blockmodeling (Doreian et al., 2005). This may be done by specifying not
only the number of clusters and allowed block types (the same for all blocks), but also by
specifying the allowed block types for each block separately. Typically, only one block type
is specified as allowed for at least some blocks.
3. Package description
A stable version of the R package blockmodeling¹ is available from the Comprehensive
R Archive Network (CRAN) at https://cran.r-project.org/web/packages/blockmodeling
while test versions are available from R-Forge at
https://r-forge.r-project.org/R/?group_id=203. The package has been around since 2007 and is
currently written in the programming languages R, C and Fortran. In this paper, version 1.0.0
is used.
The package supports generalized and indirect blockmodeling. For generalized
blockmodeling, one-mode, two-mode and multilevel networks (also linked networks (Žiberna,
2020)) with one or more relations are supported. However, for the purpose of clarity and
simplicity, this paper is limited to generalized blockmodeling of one-mode single relational
networks.
To obtain a generalized blockmodeling solution, a researcher might want to use the function
optRandomParC, which optimizes a specified number of randomly generated partitions
based on the criterion function selected (to optimize a single partition, a researcher can use
the function optParC, which optimizes only the one supplied partition, although this is not
recommended because the search may end in a local minimum). The main arguments of the
function are:
• M: an adjacency matrix representing the (usually valued) network.
• k: the number of clusters.
¹ The blockmodeling package leverages functions from a variety of other packages. Key computations use
stats (R Core Team, 2019a), methods (R Core Team, 2019b), Matrix (Bates and Maechler, 2019), parallel (R
Core Team, 2019c) and others.
• approach: the chosen generalized blockmodeling approach; 'bin' for generalized
binary blockmodeling, 'val' for generalized valued blockmodeling and 'hom' for
homogeneity blockmodeling.
• regFun: the function f that specifies regular block types (e.g., max-regular block when
regFun = 'max'). The function is only relevant when f-regular blocks are specified
by the argument blocks.
• blocks: a vector with the names of allowed block types. At least two must be specified
for binary and valued blockmodeling. Possible types are: null ('nul'), complete
('com'), regular ('reg'), column-(function) regular ('cre') and row-(function) regular
block ('rre'). In the case of binary and valued blockmodeling, a researcher can
also specify column-dominant block ('cfn') and row-dominant block ('rfn') and
with valued blockmodeling a researcher can also specify average block ('avg'). The
option "do not care" ('dnc') is also available. When pre-specification is used, the
argument is stated in the form of an array, as shown in the example section.
• rep: the number of different starting partitions.
• nCores: the number of physical CPU cores to be used. All available physical CPU cores
but one are used when nCores = 0.
• preSpecM: the value m; it must be specified only in the case of generalized valued
blockmodeling.
To calculate only the value of a criterion function, a researcher can use the function
critFunC. The same arguments apply as in the case of the function optRandomParC,
except that k is replaced by a partition (a vector) clu and the arguments rep and nCores are
omitted.
Once a blockmodel and partition have been obtained by either optParC or optRandomParC,
a researcher can use the function IM to extract a blockmodel, the function clu to
extract an obtained partition or the function err to extract the value of a criterion function.
The package contains some other handy functions such as funByBlocks (which computes
the value of a function (mean by default) over blocks of a matrix defined by a partition),
plotMat (which plots a network in matrix form by considering the corresponding partition)
and functions for computing the adjusted and original Rand Index (e.g. crand2). A plot
method that internally calls plotMat is available for S3 classes returned by optParC and
optRandomParC.
4. Demonstration of the package use
The use of various generalized blockmodeling approaches is illustrated using the Baker
citation network data (Baker, 1992). Here, the nodes represent journals from the field of
social work (the 20 journals listed in Table 2). There is an arc from journal i to journal j
if journal i cited journal j. The values on the arcs correspond to the number of citations in
1985.
The data can be loaded from the package blockmodeling using data('baker'). The
diagonal values, representing the number of citations by papers from the same journals, are
replaced with 0s (diag(baker) <- 0). The network can be visualized with the function
plotMat. Since the partition is not yet obtained, the clu argument is not set.
Table 2: Journals in Social Work Citation Network
Label  Journal
AMH    Administration in Mental Health
ASW    Administration in Social Work
BJSW   British Journal of Social Work
CAN    Child Abuse and Neglect
CCQ    Child Care Quarterly
CW     Child Welfare
CYSR   Children and Youth Services Review
CSWJ   Clinical Social Work Journal
FR     Family Relations
IJSW   Indian Journal of Social Work
JGSW   Journal of Gerontological Social Work
JSP    Journal of Social Policy
JSWE   Journal of Social Work Education
PW     Public Welfare
SCW    Social Casework
SSR    Social Services Review
SW     Social Work
SWG    Social Work with Groups
SWHC   Social Work in Health Care
SWRA   Social Work Research and Abstracts
plotMat(baker, main = 'Baker Network Data',
mar = c(1, 1, 3, 1), title.line = 2)
Figure 2 is obtained with the function plotMat. To make the plot easier to read, the cell
values are automatically multiplied by a factor (in this case 0.1) which (by default) places
their absolute values in the range [0, 100). The factor by which the values are multiplied is
automatically selected and reported, as noted below the plot.
It is immediately apparent from Figure 2 that the network is relatively sparse, meaning
the journals did not tend to cite each other. However, the highest number of citations extends
from SCW to SW and from SSR to SW. The latter journal is also the one which cited the
highest number of other journals.
4.1. Binary blockmodeling
To analyse valued networks by using generalized blockmodeling for binary networks, a
researcher must first binarize the valued network. This can be done in several ways, such as
keeping all of the arcs with values greater than 0.
bakerBinar <- baker
bakerBinar[bakerBinar > 0] <- 1
In all of the following examples, the function optRandomParC is used. The number of
clusters is set to 2 or 3. The number of clusters is chosen arbitrarily by examining multiple
partitions with different numbers of clusters (2 or 3 seem to be the most appropriate) (also see
Doreian et al., 2005). In each example, 1000 randomly generated partitions are optimized and
multiple cores are used. For binary blockmodeling, the approach argument must be set to 'bin'.
Figure 2: Baker network data in matrix form (plot title: Baker Network Data; all values in
cells were multiplied by 0.1)
4.1.1. Structural equivalence
If structural equivalence is used, only null and complete block types are possible.
Therefore, a vector c('nul', 'com') is provided to blocks (the structural equivalence is set by
the vector of the allowed block types).
resBinStr <- optRandomParC(M = bakerBinar, k = 3, rep = 1000,
nCores = 0, blocks = c('nul', 'com'), approach = 'bin')
The number of errors (inconsistencies) of the blockmodel then obtained is 47 (accessed
via the function err). The obtained partition can be accessed with the function clu while
the blockmodel can be seen in the form of an image matrix using the function IM. The image
matrix specifies the block types by blocks. The function IM shows the image matrix obtained
with blockmodeling (not the specified one).
IM(resBinStr)
[,1] [,2] [,3]
[1,] "nul" "nul" "com"
[2,] "nul" "com" "com"
[3,] "nul" "com" "com"
The image matrix shows the journals in cluster 2 and cluster 3 cite each other both within
and between the clusters. Journals in cluster 1 do not cite each other in general, but they cite
journals in cluster 3. Cluster 3 can be identified as the most central cluster while cluster 1 as
a peripheral cluster because in this cluster the journals are generally not cited much by other
journals.
The block densities can be calculated with the function funByBlocks and visualized
with the function plotMat. Finally, the empirical network can be visualized in matrix
form and in line with the blockmodeling solution that is obtained. When using the function
plotMat, the obtained partition has to be provided to the function by the argument
clu. The latter is not necessary when using the function plot (an S3 method exists for the
optMorePar class that is returned by the optRandomParC function), as shown below. The
clusters of journals obtained are separated by lines in Figure 3.
plot(resBinStr, main = 'A Baker Network Data',
mar = c(1, 2, 3, 1), title.line = 2)
Figure 3: Matrix representation of the network of journals partitioned into 3 clusters using
binary blockmodeling with structural equivalence and the corresponding block densities
(panels: Baker Network Data, Block densities; all values in cells were multiplied by 10)
It can be seen in Figure 3 (left) that the block densities are lowest in the null blocks, as
expected. Among the null blocks, however, the density is highest in the block belonging to
the link (citing) from cluster 3 to cluster 1, which reflects a tendency for reciprocity.
The most central cluster (cluster 3) only consists of two journals, SCW and SW, while
cluster 2 contains the following journals: ASW, CW, JSWE, SSR and SWRA. All other
journals are located in the peripheral cluster.
4.1.2. Regular equivalence
Here regular equivalence is used and the number of clusters is set to 2. The regular
equivalence is specified in the function optRandomParC by adding a regular block type
among the possible block types.
resBinReg <- optRandomParC(M = bakerBinar, k = 2, rep = 1000,
nCores = 0, blocks = c('nul', 'com', 'reg'),
approach = 'bin')
The partitioned matrix in Figure 4 shows that a small cluster (cluster 1) of journals exists
that are not cited by any journal. The three journals in this cluster are AMH, IJSW and JSP.
The first two journals cited SW while JSP cited SSR (all of the cited journals are in cluster
2). These citations represent inconsistent links (err(resBinReg)).
The similarity of the obtained partitions can be measured with the Adjusted Rand Index
(Rand, 1971; Hubert and Arabie, 1985), where the expected value is 0 in the case of two
random partitions and the maximum value of the measure is 1 (in the event of two identical
partitions).
crand2(clu1 = clu(resBinStr), clu2 = clu(resBinReg))
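For readers who want to see what such an index computes, the following base R sketch implements the Adjusted Rand Index in the Hubert and Arabie form from the contingency table of two partitions. It is a hypothetical stand-alone helper written for illustration (named ari here), not the package's crand2 function, and the two example partitions are made up.

```r
# Adjusted Rand Index of two partitions given as label vectors.
ari <- function(a, b) {
  tab <- table(a, b)                         # contingency table of the labels
  n <- sum(tab)
  sum_ij <- sum(choose(tab, 2))              # pairs together in both partitions
  sum_a  <- sum(choose(rowSums(tab), 2))     # pairs together in partition a
  sum_b  <- sum(choose(colSums(tab), 2))     # pairs together in partition b
  expected  <- sum_a * sum_b / choose(n, 2)  # expected value under chance
  max_index <- (sum_a + sum_b) / 2
  (sum_ij - expected) / (max_index - expected)
}

same <- ari(c(1, 1, 2, 2), c(2, 2, 1, 1))  # identical up to relabeling
diff <- ari(c(1, 1, 2, 2), c(1, 2, 1, 2))  # partitions that cut across each other
print(c(same, diff))
```

Values near or below 0 indicate agreement no better than chance.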
Figure 4: The network of journals partitioned into 2 clusters using binary blockmodeling with
regular equivalence and the corresponding block densities (panels: Baker Network Data, Block
densities; all values in cells were multiplied by 100)
The value −0.12 confirms what is seen when comparing Figure 3 and Figure 4, i.e. that
the partitions obtained (by using structural equivalence vs. regular equivalence) are very
different.
4.2. Valued blockmodeling
The main dilemma in valued blockmodeling is how to determine the most appropriate
value of m. The best approach is to choose a value based on prior knowledge about how high
the value of a tie should be for it to be considered as strong or relevant. In the absence of
such prior knowledge, a researcher may refer to one of the guidelines provided by Žiberna
(2007) or select the most appropriate m based on the distribution of all tie values (Figure 5).
Figure 5: Distribution of the number of citations among the journals (x-axis: Number of
citations; y-axis: Frequency)
Here, m is set to the median value (only values greater than 0 are taken into account), which
is 13.
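The rule used here (median of the non-zero tie values) is easy to express in base R. The sketch below applies it to a small made-up matrix, so the resulting value differs from the 13 reported for the Baker data; with the package data loaded, the same two lines could be applied to baker instead of the toy matrix M.

```r
# Toy valued network (illustrative values only).
M <- matrix(c(0,  3,  0, 20,
              5,  0, 13,  0,
              0, 40,  0,  8,
              1,  0,  0,  0), nrow = 4, byrow = TRUE)

tie_values <- M[M > 0]     # keep only existing ties
m <- median(tie_values)    # candidate value for the parameter m
print(m)
```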
4.2.1. Structural equivalence
To apply blockmodeling of valued networks, a researcher must set approach = 'val'
and specify the value m by setting the argument preSpecM. In addition, the allowed block
types (indirectly the type of equivalence) and number of clusters must be specified. The
number of clusters is set to three.
resValStr <- optRandomParC(M = baker, k = 3, rep = 1000,
preSpecM = 13, approach = 'val', blocks = c('nul', 'com'),
nCores = 0)
It can be seen in Figure 6 that cluster 1 and cluster 3 form a very clear symmetric
core-periphery global network structure since the journals in cluster 1 (core; JSWE, SCW, SSR
and SW) mutually cited each other and also cited those in cluster 3 (periphery; ASW, CW,
CSWJ, SWG, SWHC and SWRA). Another internally non-linked cluster (cluster 2) of journals
exists. Some journals in this cluster cited the journals in the core cluster.
Figure 6: The network of journals partitioned into 3 clusters using valued blockmodeling (m =
13) with structural equivalence and the corresponding block densities (panels: Baker Network
Data, Block densities; all values in cells of the network panel were multiplied by 0.1)
4.2.2. Regular equivalence
In the case of valued blockmodeling with regular equivalence, a researcher must select
the function f to specify the type of f-regular blocks. This is set to max by default, although
it can also be set to sum, mean or other functions. Here, the max-regular block type is to
be allowed and therefore the argument regFun within the function optRandomParC is set to
'max'. A regular block type is added to the vector of allowed block types. The number of
clusters is arbitrarily set to two.
resValReg <- optRandomParC(M = baker, k = 2, rep = 1000,
preSpecM = 13, approach = 'val',
blocks = c('nul', 'com', 'reg'), nCores = 0, regFun = 'max')
The blockmodel (image matrix) that is obtained is the same as that obtained by binary
blockmodeling with regular equivalence, but the cluster sizes and obtained partitions are
different, with an Adjusted Rand Index of 0.1. There are more links in null blocks
(compared to binary blockmodeling with structural equivalence), but the corresponding link
values are relatively low. Consequently, the size of the cluster with the journals that are less
cited (and cite less) is bigger.
Figure 7: The network of journals partitioned into 2 clusters using valued blockmodeling (m =
13) with max-regular equivalence and the corresponding block densities (panels: Baker Network
Data, Block densities; all values in cells of the network panel were multiplied by 0.1)
4.3. Homogeneity blockmodeling
The homogeneity blockmodeling approach's advantage over the valued blockmodeling
approach is that no parameters (such as the binarization threshold or parameter m) need to
be set. Therefore, it is very well suited as a preliminary or the main approach to valued
networks when no prior knowledge about these values is available. Homogeneity
blockmodeling emphasizes the similarity of tie strengths within blocks over the pattern of ties.
4.3.1. Structural equivalence
To use homogeneity blockmodeling, the approach argument must be set to 'hom'. To
apply sum of squares homogeneity blockmodeling, the homFun argument must be set to
'ss' while, to apply absolute deviation blockmodeling, the argument must be set to 'ad' in
the optRandomParC function.
Because the computation of inconsistencies is very similar for sum of squares and absolute
deviations blockmodeling, only the application of the first approach is shown here.
resHomSSStr <- optRandomParC(M = baker, k = 2, rep = 1000,
approach = 'hom', homFun = 'ss', blocks = c('nul', 'com'),
nCores = 0)
Usually, the image matrix is not of interest in the case of homogeneity blockmodeling
because the null blocks are a special case of complete blocks and thus only classified as null
when the mean of the block is exactly 0, which rarely happens in practice. Instead, blocks
with low block means are interpreted as null blocks (see Žiberna (2013) for another way of
identifying null blocks).
The results shown in Figure 8 suggest the global network structure of the journal citation
network can be characterized as a symmetric core-periphery structure. Here, the core
cluster is cluster 2 because the corresponding journals (SCW, SSR and SW) not only cited
each other, but also cited and were cited by other journals (according to the block densities,
the peripheral journals cited the core journals more often than the other way around). All
other journals are located in the peripheral cluster, with very few citations found within the
cluster.
Figure 8: The network of journals partitioned into 2 clusters using homogeneity blockmodeling
(sum of squares) with structural equivalence and the corresponding block densities (panels:
Baker Network Data, Block densities; all values in cells of the network panel were multiplied
by 0.1)
4.3.2. Regular equivalence
To apply blockmodeling with homogeneity regular equivalence, the regular block type
must be added to the vector of possible block types in the function optRandomParC and the
function f must be defined, e.g. 'max', as the argument regFun.
resHomSSReg <- optRandomParC(M = baker, k = 2, rep = 1000,
approach = 'hom', blocks = c('nul', 'com', 'reg'),
regFun = 'max', nCores = 0)
Given that the obtained partition and blockmodel are the same as those obtained in the
case of structural equivalence, they are not interpreted here.
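To give an intuition for what the regular block type with regFun = 'max' looks for, the following base R sketch computes the row and column maxima of a single hypothetical block. Informally, a block is close to max-regular when the row maxima are similar to each other and the column maxima are similar to each other, even if most cells are 0. The block values and the helper ss are assumptions made for illustration; the sketch does not reproduce the exact criterion function used by the package.

```r
# Hypothetical block: ties from 4 units of one cluster to 3 units of another
block <- matrix(c(
  5, 0, 0,
  0, 0, 6,
  0, 5, 0,
  0, 6, 0
), nrow = 4, byrow = TRUE)

rowMax <- apply(block, 1, max)  # strongest outgoing tie of each row unit
colMax <- apply(block, 2, max)  # strongest incoming tie of each column unit

# Sum of squared deviations from the mean (helper defined for this sketch)
ss <- function(x) sum((x - mean(x))^2)

variability <- c(rowMaxima = ss(rowMax),
                 colMaxima = ss(colMax),
                 allCells  = ss(as.vector(block)))
print(round(variability, 2))
```

The cell values themselves vary strongly, so the block is far from a complete block with homogeneous values, while the row and column maxima vary very little, which is the pattern a max-regular block describes.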
4.4. Pre-specified blockmodeling
A blockmodel can be fully or partially specified (see the subsection Prespecified
blockmodeling). The following gives an example of the use of pre-specified blockmodels.
In the case of a journal citation network, a researcher might possess prior knowledge that
the global network structure is symmetric core-periphery, i.e., there are some journals (the
core) which are cited by most journals, while other journals (the periphery) cite journals in
the core and not those in their own cluster. Therefore, the pre-specified blockmodel may be
represented by the following image matrix:
preImageReg <- rbind(c('com', 'reg'), c('reg', 'nul'))
Here, the blocks connecting the core and the periphery are of the regular type. Alternatively,
a researcher can assume that these blocks are of either the regular or the complete type. When
this is the case, the image matrix must be specified as an array.
preImageRegCom <- array(NA, dim = c(2, 2, 2))
preImageRegCom[1,,] <- rbind(c('com', 'reg'), c('reg', 'nul'))
preImageRegCom[2,,] <- rbind(c('com', 'com'), c('com', 'nul'))
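As a quick check of such an array, the allowed block types can be collapsed into one readable matrix with base R. This is only a sketch: the object name allowed is introduced here for illustration, and the reading of the first array dimension as the index of the alternative block types follows from the assignments above.

```r
# Rebuild the array so that the example is self-contained
preImageRegCom <- array(NA, dim = c(2, 2, 2))
preImageRegCom[1, , ] <- rbind(c('com', 'reg'), c('reg', 'nul'))
preImageRegCom[2, , ] <- rbind(c('com', 'com'), c('com', 'nul'))

# For every position of the image matrix, list the distinct allowed types
allowed <- apply(preImageRegCom, c(2, 3),
                 function(types) paste(unique(types), collapse = ' or '))
print(allowed)
```

The diagonal positions allow a single type each ('com' for the core, 'nul' for the periphery), while both off-diagonal positions allow 'reg' or 'com'.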
To apply pre-specified blockmodeling, the above matrix or array must be provided as
the blocks argument of the function optRandomParC. To apply valued blockmodeling
with m = 13, the approach and preSpecM arguments must be set to 'val' and 13,
respectively.
resValPre <- optRandomParC(M = baker, k = 2, rep = 1000,
                           preSpecM = 13, approach = 'val', blocks = preImageRegCom,
                           nCores = 0)
The obtained image matrix (blockmodel) is the following,
IM(resValPre)
[,1] [,2]
[1,] "com" "reg"
[2,] "reg" "nul"
indicating that the journals CW, SCW, SSR and SW are all part of a closely connected
core (cluster 1), while the other journals are classified in the periphery (cluster 2). The core
and the periphery are connected with max-regular links, and the density (Figure 9) is higher
within the block that links the periphery to the core than within the block that links the core to
the periphery.
[Figure 9 image: matrix plot "Baker Network Data" with rows and columns ordered by cluster (all values in cells were multiplied by 0.1), and a "Block densities" panel showing the values 49, 12, 5 and 1.]
Figure 9: The network of journals partitioned into 2 clusters using homogeneity blockmodeling
(sum of squares) max-regular equivalence and the corresponding block densities
5. Conclusion
Generalized blockmodeling is an approach for finding clusters of equivalent units in a
network and for determining the ties among these units. As such, it is used to study global
network structures and the (social) positions of the units. While generalized blockmodeling
is also implemented in the Pajek software (Batagelj et al., 2004), the implementation of
generalized blockmodeling in the blockmodeling package for the R programming language,
which is presented in this paper, is the only one that also supports the blockmodeling of
valued networks and the generalized blockmodeling of more complex networks (e.g.,
multilevel, multi-relational). In addition, it supports some other blockmodeling approaches
(the indirect approach) besides generalized blockmodeling.
This paper demonstrates the use of the blockmodeling package for generalized
blockmodeling of binary and valued one-mode networks on a real network data set, namely
Baker's (1992) data set on citations among journals. Based on the examples given,
it is clear that blockmodeling solutions can vary across different blockmodeling approaches,
underlining the fact that prior knowledge concerning the analysed networks is crucial, not
only for the choice of the most appropriate blockmodeling approach, but also when it comes
to interpreting the results obtained.
Ultimately, this paper, together with the package documentation, can serve as a basis for
analysing more complex networks and further explorations of the package's capabilities.
Acknowledgment
This research was financially supported by the Slovenian Research Agency
(http://www.arrs.si) within the research program P5-0168 and the research project J7-8279
(Blockmodeling multilevel and temporal networks).
References
[1] Baker, D. (1992): A structural analysis of the social work journal network: 1985–1986.
Journal of Social Service Research, 15, 153–167.
[2] Batagelj, V., Bock, H., Ferligoj, A., and Žiberna, A. (2006): Data science and
classification. Berlin: Springer.
[3] Batagelj, V., Doreian, P., Ferligoj, A., and Kejžar, N. (2014): Understanding large
temporal networks and spatial networks: Exploration, pattern searching, visualization
and network evolution. New York, NY: John Wiley & Sons.
[4] Batagelj, V., Ferligoj, A., and Doreian, P. (1992): Direct and indirect methods for
structural equivalence. Social Networks, 14, 63–90.
[5] Batagelj, V., Mrvar, A., Ferligoj, A., and Doreian, P. (2004): Generalized
blockmodeling with Pajek. Metodološki zvezki, 1, 455–467.
[6] Bates, D. and Maechler, M. (2019): Matrix: Sparse and dense matrix classes and
methods. R package version 1.2-17. Retrieved from
https://cran.r-project.org/web/packages/Matrix/index.html.
[7] Borgatti, S. P. and Everett, M. G. (1992): Notions of position in social network analysis.
Sociological Methodology, 22, 1–35.
[8] Boyd, J. P. (2002): Finding and testing regular equivalence. Social Networks, 24,
315–331.
[9] Boyd, J. P. and Jonas, K. J. (2001): Are social equivalences ever regular? Permutation
and exact tests. Social Networks, 23, 87–123.
[10] Brusco, M. (2020): dBlockmodeling: Deterministic blockmodeling of signed,
one-mode and two-mode networks. R package version 0.2.0. Retrieved from
https://CRAN.R-project.org/package=dBlockmodeling.
[11] Cugmas, M., DeLay, D., Žiberna, A., and Ferligoj, A. (2020): Symmetric
core-cohesive blockmodel in preschool children’s interaction networks. PLOS ONE, 15,
e0226801.
[12] Cugmas, M., Ferligoj, A., and Kronegger, L. (2016): The stability of co-authorship
structures. Scientometrics, 106, 163–186.
[13] Doreian, P. (2006): Some open problem sets for generalized blockmodeling. In V.
Batagelj, H.-H. Bock, A. Ferligoj, A. Žiberna (eds.): Data science and classification,
119–130. Berlin: Springer.
[14] Doreian, P., Batagelj, V., and Ferligoj, A. (2005): Generalized blockmodeling.
Structural analysis in the social sciences. New York, NY: Cambridge University Press.
[15] Funke, T. and Becker, T. (2019): Stochastic block models: A comparison of variants
and inference methods. PLOS ONE, 14, e0215296.
[16] Holland, P. W., Laskey, K. B., and Leinhardt, S. (1983): Stochastic blockmodels: First
steps. Social Networks, 5, 109–137.
[17] Hubert, L. and Arabie, P. (1985): Comparing partitions. Journal of Classification, 2,
193–218.
[18] INRA, L. J. B. (2015): blockmodels: Latent and stochastic block model estimation by
a ‘V-EM’ algorithm. R package version 1.1.1. Retrieved from
https://CRAN.R-project.org/package=blockmodels.
[19] Lorrain, F. and White, H. C. (1971): Structural equivalence of individuals in social
networks. Journal of Mathematical Sociology, 1, 49–80.
[20] Matias, C. and Miele, V. (2020): dynsbm: Dynamic stochastic block models. R package
version 0.7. Retrieved from https://CRAN.R-project.org/package=dynsbm.
[21] Mrvar, A. and Doreian, P. (2009): Partitioning signed two-mode networks. Journal of
Mathematical Sociology, 33, 196–221.
[22] Peixoto, T. (2020): Bayesian stochastic blockmodeling. In P. Doreian, V. Batagelj, A.
Ferligoj (eds.): Advances in network clustering and blockmodeling, 289–332. New
York, NY: John Wiley & Sons.
[23] R Core Team. (2018): R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria. Retrieved from
https://www.R-project.org.
[24] R Core Team. (2019a): R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria. Retrieved from
https://www.R-project.org.
[25] R Core Team. (2019b): R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria. Retrieved from
https://www.R-project.org.
[26] R Core Team. (2019c): R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria. Retrieved from
https://www.R-project.org.
[27] Rand, W. M. (1971): Objective criteria for the evaluation of clustering methods.
Journal of the American Statistical Association, 66, 846–850.
[28] Snijders, T. and Nowicki, K. (1997): Estimation and prediction for stochastic
blockmodels for graphs with latent block structure. Journal of Classification, 14, 75–100.
[29] Ward, J. H. (1963): Hierarchical grouping to optimize an objective function. Journal
of the American Statistical Association, 58, 236–244.
[30] Wasserman, S. and Faust, K. (1994): Social network analysis: Methods and
applications. Cambridge: Cambridge University Press.
[31] White, D. R. and Reitz, K. P. (1983): Graph and semigroup homomorphisms on
networks of relations. Social Networks, 5, 193–234.
[32] Žiberna, A. (2007): Generalized blockmodeling of valued networks. Social Networks,
29, 105–126.
[33] Žiberna, A. (2013): Generalized blockmodeling of sparse networks. Metodološki
zvezki, 10, 99–119.
[34] Žiberna, A. (2020): Blockmodeling linked networks. In P. Doreian, V. Batagelj, A.
Ferligoj (eds.): Advances in network clustering and blockmodeling, 267–287. New
York, NY: John Wiley & Sons.
[35] Žnidaršič, A., Ferligoj, A., and Doreian, P. (2012): Non-response in social networks:
The impact of different non-response treatments on the stability of blockmodels. Social
Networks, 34, 438–450.